Deep learning, image generation, and the rise of bias automation machines

DALL-E, a new image generation system by OpenAI, does impressive visualizations of biased datasets. I like how the first example that OpenAI used to present DALL-E to the world is a meme-like koala dunking a basketball leading into an array of old white men — representing at one blow the past and future of representation and generation.

It’s easy to be impressed by cherry-picked examples of DALL•E 2 output, but if the training data is web-scraped image+text data (of course it is) the ethical questions and consequences should command much more of our attention, as argued here by Abeba Birhane and Vinay Uday Prabhu.

Suave imagery makes it easy to miss what #dalle2 really excels at: automating bias. Consider what DALL•E 2 produces for the prompt “a data scientist creating artificial general intelligence”:

When the male bias was pointed out to AI lead developer Boris Power, he countered that “it generates a woman if you ask for a woman”. Ah yes, what more could we ask for? The irony is so thicc on this one that we should be happy to have ample #dalle2 generated techbros to roll eyes at. It inspired me to make a meme. Feel free to use this meme to express your utter delight at the dexterousness of DALL-E, cream of the crop of image generation!

The systematic erasure of human labour

It is not surprising that glamour magazines like Cosmopolitan, self-appointed suppliers of suave imagery, are the first to fall for the gimmicks of image generation. As Karen Cheng, who created the cover, found out after thousands of tries, it generates a woman if you ask for “a female astronaut with an athletic feminine body walking with swagger”.

I also love this triptych because of the evidence of human curation in the creator’s tweet (“after thousands of options, none felt quite right…”) — and the glib erasure of exactly that curation in the subtitle of the magazine cover: “and it only took 20 seconds to make”.

The erasure of human labour holds for just about every stage of the processing-to-production pipeline of today’s image generation models: from data collection to output curation. Believing in the magic of AI can only happen because of this systematic erasure.

Based on a thread originally tweeted by @dingemansemark@scholar.social (@DingemanseMark) on April 7, 2022.

Sometimes precision gained is freedom lost

Part of the struggle of writing in a non-native language is that it can be hard to intuit the strength of one’s writing. Perhaps this is why it is especially gratifying when generous readers lift out precisely those lines that took hard work to streamline — belated thanks!

Interestingly, the German translation for Tech Review needed double the number of words to make the same point: “Ein Mehr an Präzision bedeutet manchmal ein Weniger an Freiheit.” I’m still wondering whether that makes it more precise or less.

  • Dingemanse, M. (2020, August). Why language remains the most flexible brain-to-brain interface. Aeon. doi: 10.5281/zenodo.4014750

Talk, tradition, templates: a meta-note on building scientific arguments

Chartres cathedral (Gazette Des Beaux-Arts, 1869)

Reading Suchman’s classic Human-machine reconfigurations: plans and situated actions, I am struck by its account of the performative and interactional achievement of the construction of gothic cathedrals, as studied by David Turnbull. In brief, the intriguing point is that no blueprints or technical drawings or even sketches are known to have existed for any of the early gothic cathedrals, like that of Chartres. Instead, Turnbull proposes, their construction was massively iterative and interactional, requiring —he says— three main ingredients: “talk, tradition, templates”. Each of these is well summarized by Suchman. This sounds like an account worth reading; indeed perhaps also worth emulating or building on. In the context of the language sciences, an analogue readily suggests itself. Aren’t languages rather like cathedrals — immense, cumulative, complex outcomes of iterative human practice?

Okay nice. At such a point you can go (at least) two ways. You can take the analogy and run with it, taking Turnbull’s nicely alliterative triad and asking, what are “talk, traditions, and templates” for the case of language? It would be a nice enough paper. The benefit would be that you make it recognizably similar and so if the earlier analysis made an impact, perhaps some of its success may rub off on yours. The risk is that you’re buying into a triadic structure devised for a particular rhetorical purpose in the context of one particular scientific project.

Going meta

The second way is to ‘go meta’ and ask, if this triad is a useful device to neatly and mnemonically explain something as complex as gothic cathedrals, what is the kind of rhetorical structure we need to make a point that is as compelling as this (in both form and content) for the domain we are interested in (say, language)? See, and I like that second move a lot more. Because you’ve learnt from someone else’s work, but on a fairly abstract level, without necessarily reifying the particular distinctions or terms they brought to bear on their phenomenon.

While writing these notes I realise that in my reading and reviewing practice, I also tend to judge scientific work on these grounds (among others). Does it work with (‘apply’) reified distinctions in an unexamined way, or does it go a level up and truly build on others’ work? Does it treat citations perfunctorily and take frameworks as given, or does it reveal deep reading and critical engagement with the subject matter? The second approach, to me, is not only more interesting — it is also more likely to be novel, to hold water, to make a real contribution.

Spandrels

Spandrel: “a necessarily triangular space where a round dome meets two rounded arches at right angles” (from Gould 1997)

Sticking with architecture, let me draw attention to a case where the collateral effects of terminological choices have been quite visible. Gould and Lewontin’s (1979) original critique of the adaptationist programme leaned heavily on the notion of spandrel. The term itself was less crucial than the conceptual contribution they wanted to make, which was to highlight the need for biological science to enrich the default adaptationist paradigm by taking into account other possible sources of structuration. By many measures, they succeeded in making this contribution. The Spandrels paper is a modern classic and has been hailed as a rhetorical masterpiece.

And yet a significant part of the literature in the wake of this very influential proposal did get hung up on this terminological choice, asking questions like whether the San Marco pendentives do or do not qualify as spandrels in the architectural sense, and questioning the prominence or importance of spandrels in the first place (Gould 1997). This is especially ironic since this kind of rhetorical move was foreseen in the original piece:

In natural history, all possible things happen sometimes; you generally do not support your favoured phenomenon by declaring rivals impossible in theory. Rather, you acknowledge the rival, but circumscribe its domain of action so narrowly that it cannot have any importance in the affairs of nature. Then, you often congratulate yourself for being such an undogmatic and ecumenical chap [sic].

Folks building on the notion of spandrel, but also folks critiquing it, can do so either in superficial or in deep ways. Choice of terminology matters, but so does the conceptual use you’re making of others’ choices of terms and distinctions, and the good or bad faith you may be bringing to them. Bringing the argument full circle, perhaps what I’m saying is that in building and evaluating scientific arguments, it can be useful to make a distinction between the conceptual foundations and structural features that make up an argument, and the ‘terminological spandrels’ that may add to its general appeal, or detract from it as the case may be.

A final corollary of this is that for productive scholarly discourse, especially across disciplinary boundaries, it is probably a good idea not to let oneself get carried away by terminological choices, and instead to prioritize the structure and content of arguments — in one’s own work as well as in that of others.

References cited

  • Gould, S. J., & Lewontin, R. C. (1979). The spandrels of San Marco and the Panglossian paradigm: A critique of the adaptationist programme. Proceedings of the Royal Society of London. Series B. Biological Sciences, 205(1161), 581–598. doi: 10.1098/rspb.1979.0086
  • Gould, S. J. (1997). The exaptive excellence of spandrels as a term and prototype. Proceedings of the National Academy of Sciences, 94(20), 10750–10755. doi: 10.1073/pnas.94.20.10750
  • Suchman, L. A. (2007). Human-machine reconfigurations: Plans and situated actions (2nd ed.). Cambridge: Cambridge University Press.
  • Turnbull, D. (1993). The Ad Hoc Collective Work of Building Gothic Cathedrals with Templates, String, and Geometry. Science, Technology, & Human Values, 18(3), 315–340. doi: 10.1177/016224399301800304

Over-reliance on English hinders cognitive science

Been reading this paper by @blasi_lang @JoHenrich @EvangeliaAdamou Kemmerer & @asifa_majid and can recommend it — Figure 1 is likely to end up in many lecture slides http://doi.org/10.1016/j.tics.2022.09.015

Naturally I was interested in what the paper says about conversation. The claim about indirectness in Yoruba and other languages is sourced to a very nice piece by Felix Ameka and Marina Terkourafi.

The paper also devotes some attention to the importance of linguistic diversity in computer science and NLP — a key theme in the new language diversity track at #acl2022nlp, where another paper by Blasi and colleagues stood out. (The relevance of cross-linguistically diverse corpora for NLP was also a focus in this ACL paper of ours, where we argue such data is crucial for diversity-aware modelling of dialogue and conversational AI.)

I do have a nitpick about Blasi et al.’s backchannel claim. They note many languages have minimal forms (citing a study of ours that provides evidence on this for 32 languages) and add, “However, listeners of Ruruuli … repeat whole words said by the speaker” — seeming to imply that Ruruuli listeners rarely produce such minimal forms and (tend to) repeat words instead. Or at least I’m guessing that would be most people’s reading of this claim.

The source given for this idea is Zellers 2021. However, that source actually paints a very different picture: in fact, ~87% of relevant utterances (1325 out of 1517) do consist of minimal forms like the ‘nonlexical’ hmm and the ‘short lexical’ eeh ‘yes’, against <9% featuring repetition, as shown in the relevant table in Zellers (2021).

I don’t think anyone has done the relevant comparison for other languages yet, but it seems safe to say that Ruruuli/Lunyala does in fact mostly use “the minimal mm-hmm”, and that repetition, while certainly worthy of more research, is one of the minority strategies for backchanneling in the language.

Despite this shortcoming, the relevance of cross-linguistic diversity in this domain can be supported by a different observation: the relative frequency and points of occurrence of ‘backchannels’ do seem to differ across languages — as shown in our ACL paper for English versus Korean. And the work on repetition is fascinating in itself — it is certainly possible that repetition is used in a wider range of interactional practices in some languages, with possible effects on transmission & lg structure as suggested in work by Sonja Gipper.

Originally tweeted by @dingemansemark@scholar.social (@DingemanseMark) on October 17, 2022.

A serendipitous wormhole into the history of Ethnomethodology and Conversation Analysis (EMCA)

A serendipitous wormhole into #EMCA history. I picked up Sudnow’s piano course online and have been diligently working through the lessons. Guess what he says some time into the audio-recorded version of his 1988 Chicago weekend seminar (see lines 7-11)

[Chicago, 1988. Audio recording of David Sudnow’s weekend seminar]

We learn too quickly and cannot afford to contaminate a movement by making a mistake.

People who type a lot have had this experience. You type a word and you make a mistake.

I have been involved, uh of late, in: a great deal of correspondence in connection with uh a deceased friend’s archives of scholarly work and what should be done with that and his name is Harvey. And about two months ago or three months ago when the correspondence started I made a mistake when I ( ) taped his name once and I wrote H A S R V E Y, >jst a mistake<.

I must’ve written his name uh two hundred times in the last few months in connection with all the letters and the various things they were doing. Every single time I do that I get H A S R V E Y and I have to go back and correct the S. I put it in the one time and my hands learned a new way of spelling Harvey. I call ‘m Harvey but my hands call ‘m Hasrvey.

And they learned it that one time. Right then and there, the old Harvey got replaced and a new Harvey, spelled H A S R V E Y got put in. So we learn very fast.

Folks who know #EMCA history will notice this is right at the height of the activity of the Harvey Sacks Memorial Association, when Sudnow, Jefferson, Schegloff, and others were exchanging letters on Sacks’ Nachlass, intellectual priority in CA, and so on

We have here a rare first person record of the activity that Gail Jefferson obliquely referred to in her acknowledgement to the posthumously published Sacks lectures (“With thanks to David Sudnow who kick-started the editing process when it had stalled”), and much more explicitly in a 1988 letter (paraphrased in Button et al. 2022).

Historical interest aside, I like how the telling demonstrates Sudnow’s gift for first-person observation — a powerful combination of ethnomethodology and phenomenology that is also on display in his books, Pilgrim in the Microworld and Ways of the Hand #EMCA

Originally tweeted by @dingemansemark@scholar.social (@DingemanseMark) on October 6, 2022.

The perils of edited volumes

Ten years ago, fresh out of my PhD, I completed three papers. One I submitted to a regular journal; it came out in 2012. One was for a special issue; it took until 2017 to appear. One was for an edited volume; the volume is yet to appear.

These may be extreme cases, but I think they reflect quite well the relative risks for early career researchers (in linguistics & perhaps more widely) of submitting to regular journals vs special issues vs edited volumes.

Avoiding the latter is not always possible; in linguistics, handbooks still have an audience. If I could advise my 2012 self, I’d say: 1. always preprint your work; 2. privilege online-first & open access venues; 3. use #RightsRetention statements to keep control over your work.

A natural experiment

Anyway, these three papers also provide an interesting natural experiment on the role of availability for reach and impact. The first, Advances in the cross-linguistic study of ideophones, now has >400 cites according to Google Scholar, improbably making it one of the most cited papers in its journal. This paper has done amazingly well.

The second, Expressiveness and system integration, has >50 cites and was scooped by a paper on Japanese that I wrote with Kimi Akita. We wrote that second paper two years after the first, but it appeared one year before it, if you still follow the chronology. As linguistics papers go, I don’t think it has done all that badly, especially considering that its impact was stunted by being in editorial purgatory for 4 years.

The third, “The language of perception in Siwu”, has only been seen by three people and cited by one of them (not me). I am not sure if or when it will see the light of day.

Some ACL2022 papers of interest

Too much going on at #acl2022nlp for live-tweeting, but I’ll do a wee thread on 3 papers I found thought-provoking: one on robustness probing by @jmderiu et al.; one on underclaiming by @sleepinyourhat; and one on bots for psychotherapy by Das et al.

Deriu et al. stress-test automated metrics for evaluating conversational dialogue systems. They use Blenderbot to identify local maxima in trained metrics and so identify blatantly nonsensical response types that reliably lead to high scores https://aclanthology.org/2022.acl-short.85/

As they write, "there are no known remedies to this problem". My conjecture (also see Goodhart's law): any automated metric will be affected by this as long as we're training on form alone. It's a thought-provoking paper, go read it

Next! Bowman https://aclanthology.org/2022.acl-long.516 acknowledges the harms of hype but focuses on the inverse: overclaiming the scope of work on limitations (='underclaiming'). I think his argument underestimates the enormous asymmetry of these cases and therefore may overclaim the harms?

I did wonder whether @sleepinyourhat is playing 4D chess here by writing a paper that's likely to attract citations from work that may have an incentive to overclaim the harms of underclaiming 🤯😂 #acl2022nlp

Third is Das et al. https://aclanthology.org/2022.bionlp-1.27 who propose to expose psychologically vulnerable people to conversational bots trained on Reddit, which frankly is every bit as bad an idea as it sounds (the words "ethics" and "risk" do not occur in the paper 🤷) #acl2022nlp #bionlp

There’s been loads more interesting and intriguing work at #acl2022nlp and I have particularly enjoyed the many talks in the theme track sessions on linguistic diversity. Check out the hundreds of papers (8831 pages) in the @aclanthology here: https://aclanthology.org/events/acl-2022

Okay because @KLM has decided to cancel my flight and delay the next one, some quick notes from the liminality of Dublin Airport on a few more #acl2022nlp papers I found interesting, revealing, or thought-provoking

Ung et al. (Facebook AI Research) train chatbots to say sorry in nicer ways, though without addressing the underlying problems that make them say offensive things in the 1st place. I thought this was both interesting and revealing of FB's priorities. Paper: https://aclanthology.org/2022.acl-long.447

Room for improvement: throughout, Ung et al. remove "stop words" — but as conversation analysts can tell you, turn prefaces like uh, um, well, etc. often signal interactionally delicate matters, i.e. precisely the stuff they're hoping to track here 😬
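To make the worry concrete, here is a minimal sketch of the problem. The stop-word list below is an illustrative toy of the common kind, not the actual list Ung et al. use; the point is only that such lists routinely sweep up turn prefaces:

```python
# Illustrative toy stop-word list (NOT the list used by Ung et al.).
# Note that it includes turn prefaces like "well" and "uh".
STOP_WORDS = {"uh", "um", "well", "oh", "so", "the", "a", "to", "i", "mean"}

def strip_stop_words(utterance):
    """Remove stop words, as naive text-cleaning pipelines often do."""
    return " ".join(w for w in utterance.split() if w.lower() not in STOP_WORDS)

# "Well, uh..." often prefaces a dispreferred or delicate response;
# stop-word filtering erases exactly that signal.
print(strip_stop_words("well uh that came out kind of wrong"))
# → "that came out kind of wrong"
```

The filtered utterance reads like a blunt assessment; the hedged, delicate delivery is gone.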

Further, feedback is seen as strictly individual — whereas in normal human interaction it (also) reinforces *social* norms. Consider: those offended may not always have the social capital, privilege or energy to speak out ➡️ FBs bots will blithely continue to offend them 🤷

Originally tweeted by @dingemansemark@scholar.social (@DingemanseMark) on May 25, 2022.

‘From text to talk’, ACL 2022 paper

(this post originated as a twitter thread)

📣New! From text to talk: Harnessing conversational corpora for humane and diversity-aware language technology — very happy to see this position paper w/ @a_liesenfeld accepted to #acl2022nlp — Preprint 📜: http://doi.org/10.31219/osf.io/m43zh

Screenshot of cover page of article. Abstract: "Informal social interaction is the primordial home of human language. Linguistically diverse conversational corpora are an important and largely untapped resource for computational linguistics and language technology. Through the efforts of a worldwide language documentation movement, such corpora are increasingly becoming available. We show how interactional data from 63 languages (26 families) harbours insights about turn-taking, timing, sequential structure and social action, with implications for language technology, natural language understanding, and the design of conversational interfaces. Harnessing linguistically diverse conversational corpora will provide the empirical foundations for flexible, localizable, humane language technologies of the future."

This paper is one of multiple coming out of our @NWO_SSH Vidi project 'Elementary Particles of Conversation' and presents a broad-ranging overview of our approach, which combines comparison, computation and conversation

More NLP work on diverse languages is direly needed. In this #acl2022nlp position paper we identify a type of data that will be critical to the field's future and yet remains largely untapped: linguistically diverse conversational corpora. There's more of it than you might think!

World map showing the location of 63 spoken languages included in the curated collection considered in the paper: 1 Arapaho 2 Cora 3 English 4 Otomi 5 Ulwa 6 Kichwa 7 Siona 8 Tehuelche 9 Br. Portuguese 10 Kakabe 11 Minderico 12 Spanish 13 Siwu 14 Catalan 15 French 16 Dutch 17 Akpes 18 Hausa 19 Danish 20 Zaar 21 Baa 22 German 23 Italian 24 Sakun 25 Czech 26 Croatian 27 Limassa 28 ǂAkhoe 29 Saami 30 Laal 31 Polish 32 N|uu 33 Hungarian 34 Juba Creole 35 Arabic 36 Siputhi 37 Farsi 38 Chitkuli 39 Gutob 40 Nganasan 41 Yakkha 42 Anal 43 Zauzou 44 Kerinci 45 Duoxu 46 S. Qiang 47 Nasal 48 Sambas 49 Kelabit 50 Mandarin 51 Totoli 52 Kula 53 Jejueo 54 Korean 55 Pagu 56 Ambel 57 Gunwinggu 58 Japanese 59 Wooi 60 Yali 61 Heyo 62 Yélî Dnye 63 Vamale.

Large conversational corpora are still rare & ᴡᴇɪʀᴅ* but granularity matters: even an hour of conversation easily means 1000s of turns with fine details on timing, joint action, incremental planning, & other aspects of interactional infrastructure (*Henrich et al. 2010)

Language resources (corpora) and their size in relation to global language diversity. >7000 languages, >180 with some form of corpus resources, ~70 with conversational corpora of casual talk.

We argue for a move from monologic text to interactive, dialogical, incremental talk. One simple reason this matters: the text corpora that feed most language models & inform many theories woefully underrepresent the very stuff that streamlines & scaffolds human interaction

Diagram showing the words and expressions most distinctive of talk (compared to text): interjections like hhuh, hm, mhm, wow, um, yeah, etc.

Text is atemporal, depersonalized, concatenated, monologic — it yields readily to our transformers, tokenizers, taggers, and classifiers. Talk is temporal, personal, sequentially contingent, dialogical. As Wittgenstein would say, it's a whole different ball game

Take turn-taking. Building on prior work, we find that across unrelated lgs people seem to aim for rapid transitions on the order of 0~200ms, resulting in plenty of small gaps and overlaps — one big reason most voice UIs today feel stilted and out of sync

The timing of turn transitions in dyadic interactions in 24 languages around the world, replicating earlier findings and extending the evidence for the interplay of universals and cultural variation in turn-taking (n = number of turn transitions per corpus). Positive values represent gaps between turns; negative values represent overlaps. Across languages, the mean transition time is 59ms, and 46% of turns are produced in (slight) terminal overlap with a prior turn
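The transition-time measure behind these figures is straightforward to compute from time-aligned turn records. A minimal sketch with invented timestamps (not the paper's data or code):

```python
# Transition time at a speaker change: onset of the incoming turn minus
# offset of the outgoing turn. Positive values are gaps, negative overlaps.
# Invented example data: (speaker, start_ms, end_ms), in temporal order.
turns = [
    ("A", 0, 1200),
    ("B", 1150, 2000),   # starts 50 ms before A finishes: overlap
    ("A", 2100, 3000),   # starts 100 ms after B finishes: gap
    ("B", 3000, 3800),   # latched: zero transition time
]

def transition_times(turns):
    """Transition times (ms) at each change of speaker."""
    return [nxt[1] - prev[2]
            for prev, nxt in zip(turns, turns[1:])
            if prev[0] != nxt[0]]

ts = transition_times(turns)
print(ts)                                   # [-50, 100, 0]
print(sum(ts) / len(ts))                    # mean transition time
print(sum(t < 0 for t in ts) / len(ts))     # share of turns starting in overlap
```

With real corpora the same per-transition numbers feed the density plots and per-language means reported above.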

This calls for incremental architectures (as folks like @davidschlangen @GabrielSkantze @KarolaPitsch have long pointed out). Here, cross-linguistically diverse conversational corpora can help to enable local calibration & to identify features that may inform TRP projection

Turns come in sequences. It's alluring to see exchanges as slot-filling exercises (e.g. Q→A), but most conversations are way more open-ended and fluid. Promisingly, some broad activity type distinctions can be made visible in language-agnostic ways & are worth examining

Two types of conversational activity in 6 unrelated languages, showing the viability of identifying broad activity types using ebbs and flows in amount of talk contributed (time in ms). Panel A: a 'piano roll' display of turns by two participants as they unfold over time. Tellings (‘chunks’) are characterized by highly skewed relative contributions, with one participant serving as teller and the other taking on a recipient role (roles may switch, as in the Japanese example). Panel B. In ‘chat’ segments, turns and speaking time are distributed more evenly. Panel C. Shifts from one state to another are interactionally managed by participants.

This bottom-up view invites us to think about languages less in terms of tokens with transition probabilities, and more as tools for flexible coordination games. Look closely at just a minute of quotidian conversation in any language (as #EMCA does) and you cannot unsee this

Even seemingly similar patterns can harbour diversity. While English & Korean both use a minimal continuer form mhm/응, we find that response tokens are about twice as frequent in the latter (and more often overlapped), with implications for parsers & interaction design

Finally, we touch on J.R. Firth — not his NLP-famous dictum on distributional semantics, but his lesser known thoughts on conversation, which according to him holds "the key to a better understanding of what language really is and how it works" (1935, p. 71)

Quote from Firth (1935): "Neither linguists nor psychologists have begun the study of conversation; but it is here we shall find the key to a better understanding of what language really is and how it works"

As Firth observed, talk is more orderly and ritualized than most people think. We rightly celebrate the amazing creativity of language, but tend to overlook the extent to which it is scaffolded by recurrent turn formats — which, we find, may make up ~20% of turns at talk

A look at conversational data shows that many turns are not one-offs: at least 28% of the utterances in our sample (436 367 out of 1 532 915 across 63 languages) occur more than once, and over 21% (329 548) occur more than 20 times. Many of these recurring turn formats are interjections and other pragmatic devices that help manage the flow of interaction and calibrate understanding
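These recurrence figures come down to simple token counting. A hedged sketch with a toy corpus standing in for the paper's 1.5M utterances:

```python
from collections import Counter

# Toy utterance forms; stand-ins for the paper's 1,532,915 utterances.
utterances = ["mhm", "yeah", "mhm", "what?", "yeah", "mhm",
              "I went to the market yesterday", "yeah", "huh?"]

def share_recurring(utterances, min_count=2):
    """Share of utterance tokens whose exact form occurs >= min_count times."""
    counts = Counter(utterances)
    recurring = sum(n for n in counts.values() if n >= min_count)
    return recurring / len(utterances)

# 'mhm' and 'yeah' each occur three times, so 6 of 9 tokens recur.
print(share_recurring(utterances))          # ≈ 0.67
```

The long narrative turn occurs once and never recurs; the minimal forms do all the recurring, just as in the corpus data.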

Do recurrent turn formats follow the kind of rank/frequency distribution we know from tokenised words? We find that across 22 languages, it seems they do — further evidence they serve as interactional tools (making them a prime example of Zipf's notion of tools-for-jobs)
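One simple way to check a Zipf-like rank/frequency relation is to ask whether log frequency falls off linearly with log rank, with slope near -1. A sketch with made-up, roughly Zipfian turn-format frequencies (not the paper's counts):

```python
import math

# Made-up frequencies of recurrent turn formats, ranked most to least frequent.
freqs = [1000, 480, 340, 250, 190, 160, 140, 120, 110, 100]

# Zipf's law: freq ~ C / rank**a, i.e. log(freq) is linear in log(rank)
# with slope -a. Estimate the slope by ordinary least squares.
xs = [math.log(r) for r in range(1, len(freqs) + 1)]
ys = [math.log(f) for f in freqs]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
print(round(slope, 2))    # close to -1 for a Zipf-like distribution
```

On real data one would of course also inspect the fit, not just the slope; but a log-log slope near -1 is the classic Zipfian signature.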

We ignore these turn formats at our own peril. Text erases them; tokenisation obscures them; dialog managers stumble over them; ASR misses them — and yet we can't have a conversation for even thirty seconds without them. Firth was not wrong in saying they're 🔑


I've been slow-threading my way through some of our empirical results and will be adding a bunch more tweets on the implications. If you're hopping on, this is work with the amazing @a_liesenfeld, preprinted at http://doi.org/10.31219/osf.io/m43zh & to be presented at #acl2022nlp soon

So, implications! How can linguistically diverse conversational corpora help us do better language science and build more inclusive language technologies? We present three principles: aim for ecological validity, represent interactional infrastructure, design for diversity

Ecological validity is a hard sell because incentives in #nlproc and #ML —size, speed, SOTA-chasing— work against a move from text to talk. However, terabytes of text cannot replace the intricacies of interpersonal interaction. Data curation is key, we say with @annargrs

Pivoting from text to talk means taking conversational infrastructure seriously, also at the level of data structures & representations. Flattened text is radically different from the texture of turn-organized, time-bound, tailor-made talk — it takes two to tango


A user study (Hoegen et al. 2019) provides a fascinating view of what happens when interactional infrastructure is overlooked. People run into overlap when talking with a conversational agent; the paper proposes this may be solved by filtering out "stop words and interjections"

This seems pretty much the wrong way round to us. Filtering out interjections to avoid overlap is like removing all pedestrian crossings to give free rein to self-driving cars. It's robbing people of self-determination & agency just because technology can't cope

Our 3rd recommendation is to design for diversity. As the case studies show, we cannot assume that well-studied languages tell us everything we need to know. Extending the empirical and conceptual foundations of #nlproc and language technologies will be critical for progress

To escape the reign of the resourceful few, use linguistically diverse data and anticipate a combination of universal and language-specific design principles. This not only ensures broad empirical coverage and enables new discoveries; it also benefits diversity and inclusion, as it enables language technology development that serves the needs of diverse communities and makes technology more inclusive, more humane and more convivial for a larger range of possible users (Munn, 2018; Voinea, 2018). Localizing user interface elements is only a first step; diversity in how and when basic interactional structures are deployed must ultimately be reflected in the design of conversational user interfaces. In the rush for better language technology we should avoid being driven into the arms of only the resourceful few.

Voice user interfaces are ubiquitous, yet still feel stilted; text-based LMs have many applications, yet can't sustain meaningful interaction; and crosslinguistic data & perspectives are in short supply. Our #acl2022nlp paper sits right at the intersection of these challenges

Still of video showing opening page of paper, which is available here: https://osf.io/m43zh

Cleaning up the youtube autocaptions for our #acl2022nlp preview, it is really uncanny how accurate it is at *not ever transcribing* interjections like "m-hm", "huh?" — a neat illustration of our point that ASR often misses these words

Revisiting this thread to record the official link to the paper in the ACL Anthology (for those of you who like official page numbers):

https://aclanthology.org/2022.acl-long.385/

If you're going to be at ACL you'll find our talk on Underline, but here's a public version of the 12min pre-recorded talk with corrected captions for accessibility — w/ @a_liesenfeld #ACL2022 #nlproc #ACL2022nlp

Sweet: this line from our paper's conclusions was highlighted by @thamar_solorio as a key take-away message at the #acl2022nlp Next Big Ideas plenary session. Here's to more room for linguistic agency and diversity in NLP

By the way, one of the more puzzling #acl2022nlp reviewer comments we got was precisely about that line (among others), and featured a serious charge that @a_liesenfeld and I now often lob at each other: 🚨 "figurative language in evidence" 🚨

Originally tweeted by @dingemansemark@scholar.social (@DingemanseMark) on March 23, 2022.

Why it is useful to distinguish iconicity from indexicality

Every once in a while I come across work that conflates iconicity and indexicality, or lumps them together under a broad label of motivation (often in opposition to ‘arbitrariness’). Even if I tend to advocate for treating terminology lightly, I think there are many cases where it does pay off to maintain this distinction, and collapsing it comes at a cost.

Not distinguishing iconicity and indexicality means losing the ability to explain how and why some linguistic resources differ in markedness & morphosyntactic behaviour, as I point out for the analogous issue of ideophones vs interjections here. A related case is transparent compounds, which naïve raters (under some instructions) also rate as highly iconic, yet for which it helps to be able to articulate how they differ from the kind of form-meaning resemblance usually targeted by the technical term iconicity.

There are also deeper evolutionary implications you’d lose sight of without the distinction. If an ancestral pain vocalization underlies interjections like ‘ow’, that makes for a different causal story than cross-linguistic similarities that can be ascribed to (possibly convergent) iconic mappings. So to explain why today’s languages are the way they are, a distinction like this comes in useful.

But for my money, the most interesting questions lie in where iconic vs indexical motivations overlap and where they diverge, and how this influences learning, processing, and cultural evolution. We can’t see those questions if we lump the notions together, nor when we dichotomize them.

This short post originated as a twitter thread.

New paper: Trilled /r/ is associated with roughness

Very happy to see this paper out! We combine comparative, lexical, historical, and psycholinguistic evidence for an in-depth look at a pervasive form of cross-modal iconicity.

For me, this goes back to ~2011, when I wondered why Siwu ideophones for roughness like wòsòròò, safaraa and dɛkpɛrɛɛ (all with trilled /r:/) felt so… rough. So something clicked when Bodo Winter told me about an intriguing link between /r/ & roughness in English in 2015

Many email threads, conversations, github commits and submissions & revisions later, we have this beast of a paper where we look at /r/~rough in sensory adjectives in English & Hungarian, trace it across hundreds of languages worldwide, and even peer back some six millennia into Proto-Indo-European.

It’s been such a pleasure to be part of this endeavour alongside Bodo Winter, Márton Sóskuthy and Marcus Perlman. Do check out Bodo’s excellent summary in the thread linked above. And find the paper —open access!— here:

https://www.nature.com/articles/s41598-021-04311-7

Oh, by the way, one crunchy factoid about this paper (which Marcus Perlman pointed out to us) is that the r-for-rough link persists in present-day English varieties where /r/ is no longer trilled — and that it can be awakened, like a sleeping beauty, as in this ad for Ruffles chips.

Originally tweeted by @dingemansemark@scholar.social (@DingemanseMark) on January 21, 2022.