Over-reliance on English hinders cognitive science

Been reading this paper by @blasi_lang @JoHenrich @EvangeliaAdamou Kemmerer & @asifa_majid and can recommend it — Figure 1 is likely to end up in many lecture slides http://doi.org/10.1016/j.tics.2022.09.015

Naturally I was interested in what the paper says about conversation. The claim about indirectness in Yoruba and other languages is sourced to a very nice piece by Felix Ameka and Marina Terkourafi.

The paper also devotes some attention to the importance of linguistic diversity in computer science and NLP — a key theme in the new language diversity track at #acl2022nlp, where another paper by Blasi and colleagues stood out. (The relevance of cross-linguistically diverse corpora for NLP was also a focus in this ACL paper of ours, where we argue such data is crucial for diversity-aware modelling of dialogue and conversational AI.)

I do have a nitpick about Blasi &al’s backchannel claim. They note many languages have minimal forms (citing a study of ours that provides evidence on this for 32 languages) and add, “However, listeners of Ruruuli … repeat whole words said by the speaker” — seeming to imply they rarely produce such minimal forms and (tend to) repeat words instead. Or at least I’m guessing that would be most people’s reading of this claim.

The source given for this idea is Zellers 2021. However, that study paints a very different picture: ~87% of relevant utterances (1325 out of 1517) consist of minimal forms like the ‘nonlexical’ hmm and the ‘short lexical’ eeh ‘yes’, against <9% featuring repetition, as Zellers’ own table shows.
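For concreteness, a trivial sketch of the arithmetic behind those percentages, using only the counts cited above (the category labels are simplified):

```python
# Proportions in Zellers (2021), computed from the counts cited above.
total = 1517      # relevant utterances
minimal = 1325    # ‘nonlexical’ forms (hmm) plus ‘short lexical’ forms (eeh ‘yes’)

print(f"minimal forms: {minimal / total:.1%}")                   # -> 87.3%
print(f"all other strategies: {(total - minimal) / total:.1%}")  # repetition is <9% of total
```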

I don’t think anyone has done the relevant comparison for other languages yet, but it seems safe to say that Ruruuli/Lunyala does in fact mostly use “the minimal mm-hmm”, and that repetition, while certainly worthy of more research, is a minority strategy for backchanneling in the language.

Despite this shortcoming, the relevance of cross-linguistic diversity in this domain can be supported by a different observation: the relative frequency and points of occurrence of ‘backchannels’ do seem to differ across languages — as shown in our ACL paper for English versus Korean. And the work on repetition is fascinating in itself — it is certainly possible that repetition is used in a wider range of interactional practices in some languages, with possible effects on transmission & language structure, as suggested in work by Sonja Gipper.

Originally tweeted by @dingemansemark@scholar.social (@DingemanseMark) on October 17, 2022.

A serendipitous wormhole into the history of Ethnomethodology and Conversation Analysis (EMCA)

A serendipitous wormhole into #EMCA history. I picked up Sudnow’s piano course online and have been diligently working through the lessons. Guess what he says some time into the audio-recorded version of his 1988 Chicago weekend seminar (see lines 7-11).

[Chicago, 1988. Audio recording of David Sudnow’s weekend seminar]

We learn too quickly and cannot afford to contaminate a movement by making a mistake.

People who type a lot have had this experience. You type a word and you make a mistake.

I have been involved, uh of late, in: a great deal of correspondence in connection with uh a deceased friend’s archives of scholarly work and what should be done with that and his name is Harvey. And about two months ago or three months ago when the correspondence started I made a mistake when I ( ) taped his name once and I wrote H A S R V E Y, >jst a mistake<.

I must’ve written his name uh two hundred times in the last few months in connection with all the letters and the various things they were doing. Every single time I do that I get H A S R V E Y and I have to go back and correct the S. I put it in the one time and my hands learned a new way of spelling Harvey. I call ‘m Harvey but my hands call ‘m Hasrvey.

And they learned it that one time. Right then and there, the old Harvey got replaced and a new Harvey, spelled H A S R V E Y got put in. So we learn very fast.

Folks who know #EMCA history will notice this is right at the height of the activity of the Harvey Sacks Memorial Association, when Sudnow, Jefferson, Schegloff, and others were exchanging letters on Sacks’ Nachlass, intellectual priority in CA, and so on

We have here a rare first-person record of the activity that Gail Jefferson obliquely referred to in her acknowledgement to the posthumously published Sacks lectures (“With thanks to David Sudnow who kick-started the editing process when it had stalled”), and much more explicitly in a 1988 letter (paraphrased in Button et al. 2022).

Historical interest aside, I like how the telling demonstrates Sudnow’s gift for first-person observation — a powerful combination of ethnomethodology and phenomenology that is also on display in his books, Pilgrim in the Microworld and Ways of the Hand #EMCA

Originally tweeted by @dingemansemark@scholar.social (@DingemanseMark) on October 6, 2022.

Linguistic roots of connectionism

This Lingbuzz preprint by Baroni is a nice read if you’re interested in linguistically oriented deep net analysis. I did feel it’s a bit hampered by the near-exclusive equation of linguistic theory with generative/Chomskyan approaches. (I know it makes a point of claiming a “very broad notion of theoretical linguistics”, but it doesn’t really demonstrate this, and throughout, the implicit notion of theory is near-exclusively aligned with GG and its associated concerns of competence, poverty of the stimulus, et cetera.)

For instance, it notes (citing Lappin) that theoretical linguistics “played no role” in deep learning for NLP, but while this may hold for generative grammar (GG), linguistic theorizing was much broader than that right at the start of connectionism and RNNs, e.g. in Elman 1991.

In fact, just look at the bibliography of Elman’s classic RNN work and tell us again how exactly theoretical linguistics “played no role” — Bates & MacWhinney, Chomsky, Fillmore, Fodor, Givón, Hopper & Thompson, Lakoff, Langacker: they’re all there. Elman’s bibliography is a virtual Who’s Who of big-tent linguistics at the start of the 1990s. The only way to give any content to Lappin’s claim (and by extension, Baroni’s generalization) is to give the notion of “theoretical linguistics” the narrowest conceivable reading.

However, Baroni’s point may generalize: perhaps modern-day usage-based, functional, and cognitive approaches to linguistic theory aren’t drawing as heavily on current NLP/ML/DL work as they could either. Might a lack of reciprocity play a role? After all, the well-known ahistoricism and lack of interdisciplinary engagement of NLP today does not exactly invite productive exchange. (Though some of us try.)

The theory=Chomsky equation also makes its appearance at the end, where Baroni muses about incorporating storage, retrieval, gating, and attention in theories of language. Outside the confines of Chomskyan linguistics, folks have long been working on precisely such things: work by Joan Bybee, Maryellen MacDonald, Morten Christiansen, and others would merit a mention!

In sum, Baroni’s piece provides an informative if partial review of recent work and includes bold proposals (e.g., deep nets as algorithmic linguistic theories), worth reading if you’re interested in a particular kind of linguistics. Consider pairing it with this well-aged bottle of Elman 1991!

References

  • Baroni, M. (2021, June). On the proper role of linguistically-oriented deep net analysis in linguistic theorizing. LingBuzz. Retrieved from https://ling.auf.net/lingbuzz/006031
  • Bybee, J. L. (2010). Language, Usage, and Cognition. Cambridge: Cambridge University Press.
  • Christiansen, M. H., & Chater, N. (2017). Towards an integrated science of language. Nature Human Behaviour, 1, 0163. doi: 10.1038/s41562-017-0163
  • Elman, J. L. (1991). Distributed representations, simple recurrent networks, and grammatical structure. Machine Learning, 7, 195–225. doi: 10.1023/A:1022699029236
  • Lappin, S. (2021). Deep learning and linguistic representation. Boca Raton: CRC Press.
  • MacDonald, M. C., & Christiansen, M. H. (2002). Reassessing working memory: Comment on Just and Carpenter (1992) and Waters and Caplan (1996). Psychological Review, 109(1), 35–54. doi: 10.1037/0033-295X.109.1.35

Originally tweeted by @dingemansemark@scholar.social (@DingemanseMark) on June 17, 2021.

New paper: Interjections (Oxford Handbook of Word Classes)

📣New! “Interjections”, a contribution to the Oxford Handbook of Word Classes. One of its aims: to rejuvenate work on interjections by shifting focus from stock examples (ouch, yuck) to real workhorses like mm-hm, huh? and the like. Abstract:

No class of words has better claims to universality than interjections. At the same time, no category has more variable content than this one, traditionally the catch-all basket for linguistic items that bear a complicated relation to sentential syntax. Interjections are a mirror reflecting methodological and theoretical assumptions more than a coherent linguistic category that affords unitary treatment. This chapter focuses on linguistic items that typically function as free-standing utterances, and on some of the conceptual, methodological, and theoretical questions generated by such items. A key move is to study these items in the context of conversational sequences, rather than as a mere sideshow to sentences. This makes visible how some of the most frequent interjections streamline everyday language use and scaffold complex language. Approaching interjections in terms of their sequential positions and interactional functions has the potential to reveal and explain patterns of universality and diversity in interjections.

Anyone who writes about interjections has first to cut through a tangle of assumptions about marginality, primitivity, and insignificance. I think these assumptions are incoherent: linguistics without interjections is like chemistry without the noble gases.

Re-centering interjections is possible now because there’s plenty of cool new interactional work by folks like Emily Hofstetter, Elliott Hoey, Nick Williams, Kristian Skedsmo, Johanna Mesch, and many others.

A fairly standard take in linguistics is that interjections are basically public emissions of private emotions — a view that is remarkably close to folk notions about the category. However, corpus data suggests that interjections expressive of emotions are actually not all that frequent — interactional and interpersonal uses are much more prominent (yet they are the least studied). This is why re-centering is important.

In line with this, part of the chapter focuses on some of the most frequent interjections out there: continuers, minimal particles that acknowledge a turn is underway and more is anticipated (like B’s m̀:hm seen at lines 53 and 57)

I am always impressed by the high-precision placement of these items and by their neat form-function fit, a pliable template for signalling various degrees of alignment and affiliation, with closed lips signifying ‘keep going, I won’t take the floor’.

Traditional linguistic tools are ill-suited to the nature of interjections. Perhaps this is why most grammars do little more than list a handful of them and note how they don’t fit the phonological system. Fortunately, interactional linguistics and conversation analysis offer robust methodological tools, ready to be used in descriptive & comparative work. One aim of this piece is to point folks to some concrete places to start.

In the NWO project Elementary Particles of Conversation, we undertake the comparative study of these kinds of items and their consequences for language; this chapter aims to contribute towards that goal by fostering more empirical & theoretical work.

Some further goals I set myself for this piece: 1) foreground empirical work rather than traditional research agendas; 2) elevate new work by junior & minoritized scholars; 3) treat matters in modality-inclusive & modality-agnostic ways.

I found that often, these goals converge & point to exciting new directions. For instance, including sign language data (as in this example from Norwegian Sign Language by Kristian Skedsmo, but also work by Johanna Mesch) shows the prospects of a cross-modal typology of interjections.

Originally tweeted by @dingemansemark@scholar.social (@DingemanseMark) on June 30, 2021.

On gatekeeping in general linguistics

An exercise. Take 1️⃣ this paper on ‘Language disintegration under conditions of formal thought disorder’ and 2️⃣ this Henner and Robinson preprint on ‘Imagining a Crip Linguistics’.

Now tell us in earnest that only one of these contains “theoretical implications that shed light on the nature of language and the language faculty”. (That was the phrasing a handling editor at Glossa used to desk-reject Henner’s submission.)

The point here is not to hate on a published paper (though to be honest I think that paper is flawed at the very least because of its unexamined deficit-based view of autism). The point is also not to argue that a preprint should be published as is. It is to argue that desk-rejecting that 2nd paper as “mainly about language use” is incorrect, far from theoretically neutral, and problematic for a journal of general linguistics.

As Emily Carrigan wrote on twitter,

The difference is that paper 1 takes a disability-as-deficit approach, which is currently the status quo in linguistics/psychology/education, whereas paper 2 asks us to consider an alternative interpretation, at which point people aligned with the status quo shut down.

Figuring out the myriad ways in which the second paper interrogates, uproots, and respecifies the theoretical premises of the first is left as an exercise to the reader.

Originally tweeted by @dingemansemark@scholar.social (@DingemanseMark) on July 9, 2021.

Large language models and the unstoppable tide of uninformation

Large language models make it entirely trivial to generate endless amounts of seemingly plausible text. There’s no need to be cynical to see the virtual inevitability of unending waves of algorithmically tuned AI-generated uninformation: the market forces are in place and they will be relentless.

I say uninformation against the backdrop of Bateson’s tongue-in-cheek definition of information as ‘a difference that makes a difference’. If we don’t know (or can’t tell) the difference anymore, we are literally un-informed.

It is likely that a company like OpenAI sees some of this and that they’re keeping, for instance, time-stamped samples of AI-hallucinated content to enable some degree of textual provenance — but given how hard it is to deal with content farms already I think there’s little reason to be optimistic.

This has an important consequence. The web makes up a large chunk of the data feeding GPT-3 and kin. Posting the output of large language models online creates a feedback loop that cannot improve quality (unless we have mechanisms for textual provenance) and so will lead to uninformation feeding on uninformation.
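A toy simulation makes the dynamic concrete. Every number below is an assumption for illustration, not an estimate; the point is only that when model output is posted faster than human text is written, the synthetic share of the next scrape grows relentlessly:

```python
# Toy model of synthetic text accumulating in a web-scraped training corpus.
# All quantities are assumptions for illustration, not estimates.
human_pages = 1_000_000            # initial stock of human-written pages
synthetic_pages = 0
new_human_per_cycle = 50_000       # human pages added per training cycle
new_synthetic_per_cycle = 200_000  # model output posted per training cycle

for cycle in range(1, 11):
    human_pages += new_human_per_cycle
    synthetic_pages += new_synthetic_per_cycle
    share = synthetic_pages / (human_pages + synthetic_pages)
    print(f"cycle {cycle:2d}: synthetic share of corpus = {share:.1%}")
```

Without provenance tagging, nothing in this loop distinguishes the synthetic share from the human one, so quality can only drift.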

All ingredients for an information heat death are on hand. True human-generated and human-curated information —of the kind produced, for instance, by academics in painstaking observations and publications— will become more scarce, and therefore more valuable. Counterintuitively, there was never a better time to be a scholar.

  • Bateson, G. (1979). Mind and Nature. New York: E.P. Dutton.

Concrete reasons to be skeptical about an ‘Abstract Wikipedia’

Computer scientist Denny Vrandečić has proposed the interesting idea of an Abstract Wikipedia, an initiative that would use “structured data from Wikidata to create a multilingual, machine-driven knowledge platform” (source). As a linguist, I am skeptical. I first recorded my skepticism about this on twitter, where I wrote:

For a limited class of ‘brute facts’ this may help; but the structural and semantic differences between, say, Chukchi and Farsi surely exceed those between C# and F# languages… and that’s even before considering cultural and communicative context

To this, Vrandečić responded with a challenge: “Can you find a concrete example in Wikipedia, where a sentence or short paragraph wouldn’t be expressible in two languages of your choice? I would be curious to work through that.”

I decided to follow up on this, using an example from Vrandečić himself: the word “mayor”. This is seemingly a clear enough, easily definable term. As in, “The current mayor of San Francisco is London Breed” (source). So, what would the maximally abstract, language-agnostic content of this concept be, and how would we use it to autotranslate content into other languages?

Staying within Germanic languages, let’s start easy, with German. Do you pick Bürgermeister or Oberbürgermeister? That depends, among other things, on the kind of analogical mapping you want to do (and there are multiple possibilities, none of them neutral). Or take Swedish, where the cognate term borgmästare was actually abandoned in the 1970s. Perhaps this one’s easy: bilingual as they are, Swedes may well just use “mayor” for the mayor of San Francisco — but that’s boring, and issues would still arise with historical articles.

Moving to Slavic, how about Polish? We’ll probably use a calque from German (‘burmistrz’) but the semantic space is again being warped by alternatives, and partial incommensurability is demonstrated by key terms remaining untranslated in this academic paper on the topic.

Colonialism has an ugly habit of erasing cultural institutions and indigenous voices and vocabulary — and even then the result is rarely simple translational equivalence, as seen in the use of Spanish alcalde (judge/mayor/{…}) in the Americas (Schwaller 2015).

Moving further afield, let’s take Samoan, where the office of pulenu’u is a weird mix of locally organized administration and colonial-era divide-and-conquer policies. “Mayor” might be translated as pulenu’u, but it would definitely have a semantic accent. On the very useful notion of “semantic accent”, see Werner 1993. It is this kind of careful anthropological linguistic work that most strongly brings home, to me, the (partial) incommensurability of the worlds we build with words.

I have here focused on languages for which there at least appears to be an available (if not fully equivalent) translation, but of course there will also be those that simply haven’t lexicalised the concept — think of languages spoken by egalitarian hunter-gatherers. One might say that they could surely adopt the concept & term from another language and get it. Sure. And there’s the rub: ontologies are rarely neutral. The English term “mayor” is supported by and realized in its own linguistically & culturally relative ontology.
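The cases above can be condensed into a deliberately naive sketch of what a language-agnostic MAYOR slot runs into at rendering time. This is entirely my illustration, not Abstract Wikipedia’s actual design, and every mapping in the toy lexicon is an assumption:

```python
# Deliberately naive sketch: rendering one "abstract" fact per language.
# This toy design and lexicon are my illustration, not Abstract Wikipedia's.
fact = {"concept": "MAYOR", "city": "San Francisco", "holder": "London Breed"}

# Even a toy lexicon can't stay language-agnostic: each entry smuggles in
# decisions the abstract layer was supposed to avoid (see discussion above).
lexicon = {
    "en": "mayor",
    "de": "Bürgermeister",   # or Oberbürgermeister? depends on the analogy chosen
    "sv": None,              # borgmästare was abandoned in the 1970s
    "pl": "burmistrz",       # a calque, with a warped semantic space
    "sm": "pulenu'u",        # available, but with a heavy semantic accent
}

for lang, term in lexicon.items():
    if term is None:
        print(f"{lang}: no current lexicalisation; fall back to what?")
    else:
        print(f"{lang}: the {term} of {fact['city']} is {fact['holder']}")
```

Even in this flattest possible form, the interesting decisions all live in the comments, which is exactly the problem.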

While I’ve mostly taken the English -> other language direction here (which may seem easier because of globalization, cultural diffusion, calqueing, etc.), clearly the problems are at least as bad if you try going the other direction, starting from other culturally relative notions.

TL;DR: Wikidata may help us autofill some slots, but social ontologies are never language-agnostic — so the project risks perpetuating rather than transcending the worldviews most prevalent in current Wikipedia databases: broadly speaking, global north, Anglo, western worldviews. I think Wikidata is perhaps promising for brute physical facts like the periodic table and biochemistry. But the social facts we live with —from politics to personhood and kinship to currency— are never fully language-independent, so any single ontology will be biased & incomplete.

If even a seemingly innocuous term like “mayor” is subject to this kind of warping of semantic spaces (if it’s available at all), that doesn’t bode well for many other concepts. Which is why, even if I like the idea, I’m skeptical about a concrete Abstract Wikipedia.

Liminal signs

I have a new paper out as part of a special issue filled to the brim with things on the border of language if not beyond it. There are seven empirical articles on response cries, “moans”, clicks, sighs, sniffs, & whistles, flanked by an intro (by editors Leelo Keevallik and Richard Ogden) and a commentary (by me). It was truly a privilege to sit down and spend time with this collection of papers to write a commentary; and quite the challenge to formulate a coherent take on phenomena so diverse in form and function, and so neglected in the language sciences.

Why are these things neglected? As I note in my commentary, there are at least three reasons: we’ve not been able to capture them until recently; some quarters of linguistics have been actively uninterested in them; and, most intriguingly, they may be designed to be overlooked, or at least overlookable.

One challenge I set myself was to come up with a characterisation of these items that doesn’t focus on what they are not. “Non”-labels like non-lexical, non-linguistic, non-conventional, non-phonemic, non-committal, et cetera buy into the framing that these things are not language, and imply that they have no qualities of their own worth mentioning. However, there is at least one thing that unites them: their in-betweenness. Are they lexical or not? Conventional or not? Phonemic or not? Intentional or not? They seem to skirt these issues — and derive interactional utility from that very ambiguity. Hence: liminal signs.

Many liminal signs originate in bodily conduct with non-interactional functions: sighing, sniffing, moaning, etc. This lends them an air of plausible deniability and makes them off the record. It also makes them awesome cases of exaptation and ritualisation. Speaking of which: when Darwin wrote about whistles and clicks, he had to rely on anecdotal reports from around the world. The papers in this issue showcase the power of sequential analysis to bring to light the workings of liminal signs in interaction.

Inspired by Harvey Sacks, the commentary also aims to highlight the methodological and conceptual contributions of this special issue — from transcriptional innovations like >.nh< to interdisciplinary connections. As Sacks wrote:

[I]t would be nice if things were ripe so that any question you wanted to ask, you could ask. But there are all sorts of problems that we know in the history of any field that can’t be asked at a given time. They don’t have the technology, they don’t have the conceptual apparatus, etc. We just have to live with that, and find what we can ask and what we can handle.

(Spring 1966 Lecture, in Sacks, 1992, vol. I:427)

The papers in this issue are part of a wave of new research into multimodal talk-in-interaction that is making remarkable progress in just what the study of talk-in-interaction can handle.

Looking for something to read? Dip into this special issue and prepare to have your sense of the boundaries of language subtly shifted — one sniff, click, or whistle at a time. My commentary (short and open access) is here:

Dingemanse, M. (2020). Between Sound and Speech: Liminal Signs in Interaction. Research on Language and Social Interaction, 53(1), 188–196. doi: 10.1080/08351813.2020.1712967

Semantic primitives and conceptual decomposition

Thought-provoking discussion on semantic primitives and conceptual decomposition this morning at @in_interaction, led by Guillermo Montero-Melis. We went from Wittgenstein & Osgood via Rosch & Lakoff to Kemp & Tenenbaum and recent work by Mitchell, Binder, and others.

The paper that drew most attention was Binder et al. 2016’s ‘Toward a brain-based conceptual semantic representation’ (link). They present 65 semantic features (operationalised as human-rateable scales) with experiential roots & neurobiological plausibility — organisable into broad domains like vision, somatic, motor, spatial, temporal, causal, social, emotion, attention.

I thought of a very different corner of the language sciences, where Wierzbicka, Goddard et al. have for decades refined Natural Semantic Metalanguage (NSM), a program for semantic decomposition in terms of a set of 65-odd semantic primitives — as here. On the face of it, the approaches could not be more different. NSM aims for human-readable reductive paraphrases of meanings in terms of semantic primitives thought to be lexicalised in all languages (a controversial claim: http://jstor.org/stable/4489617). Binder et al., on the other hand, aim to quantify meanings by placing them in a 65-dimensional space of experientially & neurobiologically motivated notions, some of which, to NSM adepts, may look too Anglo-specific and technical — here’s their take on ‘egg’, ‘bike’, ‘agreement’:

Contrast this with “eggs” in NSM (an explication published in http://doi.org/10.22363/2312-9182-2018-22-3-539-559…): readable if a bit verbose, and phrased mostly in terms of just 65 words (excepting things marked [m], which are ‘semantic molecules’ that themselves need explications — long story):

This would be worth a more detailed piece at some point but for now I simply want to note one thing: that these *wildly* different approaches to semantic primitives —at different levels of analysis & with different explanatory aims— still show some interesting convergences.

In particular, both postulate the relevance of broad domains like space, time, emotion, movement, body, speech, et cetera, though NSM is designed as a metalanguage (e.g., needing logical operators) while Binder et al. go more experiential and low-level (e.g., temperature). NSM scholars could have a field day with some of the Binder et al. features (e.g., Ekmanesque emotions, whose universal status is not uncontroversial), but they might also take a cue from the solid neurobiological evidence for the importance of, say, sensorimotor features in meaning.

TL;DR semantic primitives are fascinating & recent work suggests exciting directions for the study of conceptual semantics. I wish NSM open-sourced its explications; I’ve already been playing around with the Binder et al. open data from here.

[Figure: ‘big’ and ‘theory’ in terms of Binder et al. features]
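For anyone who wants to poke at profiles like these: a minimal sketch, assuming the ratings have been exported to a CSV with one row per word and one column per feature (the file name and layout are my assumptions, not the published format):

```python
# Minimal sketch: profile words by their top-rated Binder et al. features.
# "binder_features.csv" and its layout are assumptions for illustration.
import csv

def load_ratings(path):
    """Read word-by-feature ratings into {word: {feature: value}}."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        features = next(reader)[1:]          # first column holds the word
        return {row[0]: dict(zip(features, map(float, row[1:])))
                for row in reader}

ratings = load_ratings("binder_features.csv")

# Show the five highest-rated features for each word
for word in ("big", "theory"):
    top = sorted(ratings[word].items(), key=lambda kv: -kv[1])[:5]
    print(word, "->", ", ".join(f"{feat} ({val:.1f})" for feat, val in top))
```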