This Lingbuzz preprint by Baroni is a nice read if you’re interested in linguistically oriented deep net analysis. I did feel it’s a bit hampered by the near-exclusive equation of linguistic theory with generative/Chomskyan aps. (I know it makes a point of claiming a “very broad notion of theoretical linguistics”, but it doesn’t really demonstrate this, and throughout the implicit notion of theory is near-exclusively aligned with GG and its associated concerns of competence, poverty of the stimulus, et cetera).
For instance, it notes (citing Lappin) that theoretical linguistics “played no role” in deep learning for NLP, but while this may hold for generative grammar (GG), linguistic theorizing was much broader than that right at the start of connectionism and RNNs, e.g. in Elman 1991.
In fact, just look at the bibliography of Elman’s classic RNN work and tell us again how exactly theoretical linguistics “played no role” — Bates & Macwhinney, Chomsky, Fillmore, Fodor, Givon, Hopper & Thompson, Lakoff, Langacker, they’re all there. Elman’s bibliography is a virtual Who is Who of big tent linguistics at the start of the 1990s. The only way to give any content to Lappin’s claim (and by extension, Baroni’s generalization) is to give the notion of “theoretical linguistics” the narrowest conceivable reading.
However, Baroni’s point may generalize: perhaps modern-day usage-based, functional, and cognitive approaches to ling theory aren’t drawing as heavily on current NLP/ML/DL work as they could either. Might a lack of reciprocity play a role? After all, the well known ahistoricism and lack of interdisciplinary engagement of NLP today does not exactly invite productive exchange. (Though some of us try.)
The theory=Chomsky equation also makes it appearance at the end, where Baroni muses about incorporating storage, retrieval, gating and attention in theories of language. Outside the confines of Chomskyan linguistics folks have long been working on precisely such things. One might think work by Joan Bybee, Maryellen MacDonald, Morten Christiansen, and others might merit a mention!
In sum, Baroni’s piece provides an informative if partial review of recent work and includes bold proposals (e.g., deep nets as algorithmic linguistic theories), worth reading if you’re interested in a particular kind of linguistics. Consider pairing it with this well-aged bottle of Elman 1991!
- Baroni, M. (2021, June). On the proper role of linguistically-oriented deep net analysis in linguistic theorizing. LingBuzz. Retrieved from https://ling.auf.net/lingbuzz/006031
- Bybee, J. L. (2010). Language, Usage, and Cognition. Cambridge: Cambridge University Press.
- Christiansen, M. H., & Chater, N. (2017). Towards an integrated science of language. Nature Human Behaviour, 1, s41562-017-0163–017. doi: 10.1038/s41562-017-0163
- Elman, J. L. (1991). Distributed Representations, Simple Recurrent Networks, And Grammatical Structure. Machine Learning, 7, 195–225. doi: 10.1023/A:1022699029236
- Lappin, S. (2021). Deep learning and linguistic representation. Boca Raton: CRC Press.
- MacDonald, M. C., & Christiansen, M. H. (2002). Reassessing working memory: Comment on Just and Carpenter (1992) and Waters and Caplan (1996). Psychological Review, 109(1), 35–54. doi: 10.1037/0033-295X.109.1.35