Semantic primitives and conceptual decomposition

Thought-provoking discussion on semantic primitives and conceptual decomposition this morning at @in_interaction, led by Guillermo Montero-Melis. We went from Wittgenstein & Osgood via Rosch & Lakoff to Kemp & Tenenbaum and recent work by Mitchell, Binder, and others.

The paper that drew most attention was Binder et al. 2016’s ‘Toward a brain-based conceptual semantic representation’ (link). They present 65 semantic features (operationalised as human-rateable scales) with experiential roots & neurobiological plausibility — organisable into broad domains like vision, somatic, motor, spatial, temporal, causal, social, emotion, attention.

I thought of a very different corner of the language sciences, where Wierzbicka, Goddard et al. have for decades refined Natural Semantic Metalanguage, a program for semantic decomposition in terms of a set of 65-odd semantic primitives, as here. On the face of it, the approaches could not be more different: NSM aims for human-readable reductive paraphrases of meanings in terms of semantic primitives thought to be lexicalised in all languages (a controversial claim: http://jstor.org/stable/4489617); on the other hand, Binder et al. aim to quantify meanings by placing them in a 65-dimensional space of experientially & neurobiologically motivated notions, some of which, to NSM adepts, may look too Anglo-specific and technical. Here’s their take on ‘egg’, ‘bike’, ‘agreement’:

Contrast this with “eggs” in NSM (an explication published in http://doi.org/10.22363/2312-9182-2018-22-3-539-559…): readable if a bit verbose, and phrased mostly in terms of just 65 words (excepting things marked [m], which are ‘semantic molecules’ that themselves need explications — long story):

This would be worth a more detailed piece at some point but for now I simply want to note one thing: that these *wildly* different approaches to semantic primitives —at different levels of analysis & with different explanatory aims— still show some interesting convergences.

In particular, both postulate the relevance of broad domains like space, time, emotion, movement, body, speech, et cetera, though NSM is designed as a metalanguage (e.g., needing logical operators) while Binder et al. go more experiential and low-level (e.g. temperature). NSM scholars could have a field day with some of the Binder et al. features (e.g. Ekmanesque emotions, whose universal status is not uncontroversial) but they might also take a cue from solid neurobiological evidence for the importance of, say, sensorimotor features in meaning.

TL;DR semantic primitives are fascinating & recent work suggests exciting directions for the study of conceptual semantics. I wish NSM open-sourced its explications; I’ve already been playing around with the Binder et al. open data from here.

‘big’ and ‘theory’ in terms of Binder et al. features
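To make the idea of a feature space concrete, here is a minimal Python sketch of how words rated on a set of scales can be compared. The feature names and all numbers are invented for illustration (they are not taken from the Binder et al. data, which has 65 features), and cosine similarity is just one common way to compare such vectors, not necessarily what the authors use.

```python
import math

# Hypothetical feature names and ratings, made up for this sketch only.
FEATURES = ["vision", "motor", "temperature", "social", "emotion"]
TOY_RATINGS = {
    "egg": [5.0, 2.0, 1.5, 0.5, 0.5],
    "bike": [5.0, 5.5, 0.5, 1.0, 1.0],
    "agreement": [0.5, 0.5, 0.0, 5.5, 3.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(word, ratings):
    """Return the other word whose rating vector is closest to `word`."""
    target = ratings[word]
    scored = [
        (cosine(target, vector), other)
        for other, vector in ratings.items()
        if other != word
    ]
    return max(scored)[1]


if __name__ == "__main__":
    for word in TOY_RATINGS:
        print(word, "is closest to", most_similar(word, TOY_RATINGS))
```

With these toy values the two concrete objects end up closer to each other than either is to the abstract noun, which is the kind of pattern a feature space like this is meant to capture.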

Playful iconicity: Having fun with words

What do words like waddle, slobber, tingle, oink, and zigzag have in common? These words sound funny, but they are also iconic, with forms that resemble aspects of their meanings. In a new paper we investigate the link between funniness and iconicity in 70,000 English words.

“This is play”

The starting point is a theory about metacommunication: some words (or signs) are more striking than others in terms of their form, which means they draw more attention to themselves and signal “this is play”. We think this explains the first finding of the paper: words that people rate as highly funny are also often rated as highly iconic.

Relations between funniness and iconicity after controlling for word frequency, in: A words with human ratings; B words with human funniness ratings and imputed iconicity ratings; C words for which we only have imputed ratings.
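One common way to control for a third variable such as word frequency is a partial correlation: regress both sets of ratings on (log) frequency and correlate what is left over. The sketch below assumes that approach, which may differ from the exact analysis in the paper; the function names and the toy numbers are made up for illustration.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def residuals(ys, zs):
    """What is left of ys after removing a straight-line fit on zs."""
    mz, my = mean(zs), mean(ys)
    slope = sum((z - mz) * (y - my) for z, y in zip(zs, ys)) / sum(
        (z - mz) ** 2 for z in zs
    )
    return [(y - my) - slope * (z - mz) for y, z in zip(ys, zs)]


def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after controlling both for zs."""
    return pearson(residuals(xs, zs), residuals(ys, zs))


# Invented toy data: both ratings share a component that frequency masks.
LOG_FREQ = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
SHARED = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
FUNNINESS = [2.0 * f + s for f, s in zip(LOG_FREQ, SHARED)]
ICONICITY = [-1.0 * f + s for f, s in zip(LOG_FREQ, SHARED)]

if __name__ == "__main__":
    print("raw:", pearson(FUNNINESS, ICONICITY))
    print("partial:", partial_correlation(FUNNINESS, ICONICITY, LOG_FREQ))
```

In this toy case the raw correlation is negative because frequency pushes the two ratings in opposite directions; once frequency is partialled out, the shared component shows up.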

To test how general this finding is we developed a way to predict funniness and iconicity ratings for new words. Based on semantic relationships between millions of English words, we trained an algorithm to predict the iconicity (or funniness) of words that have already been rated by people, and then asked that algorithm to predict iconicity (or funniness) for new words.

For example, say the new word is ‘waggle’. First the algorithm learned that ‘waggle’ occurs in similar contexts to ‘wiggle’ and ‘wobble’. Then it learned that ‘wiggle’ and ‘wobble’ were rated as highly iconic by participants. As a result, it predicts that ‘waggle’ will be highly iconic too. Applying this method to ~70,000 words, we find that the relation between funniness and iconicity holds even for predicted ratings.
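The ‘waggle’ logic can be sketched in a few lines of Python. This is a toy version that swaps in a plain k-nearest-neighbour average over word vectors, which is not necessarily the model used in the paper; the two-dimensional vectors and the ratings are invented for illustration.

```python
import math

# Invented 2-d "embeddings"; real word vectors have hundreds of dimensions.
VECTORS = {
    "wiggle": (0.9, 0.1),
    "wobble": (0.8, 0.2),
    "table": (0.1, 0.9),
    "window": (0.2, 0.8),
    "waggle": (0.85, 0.15),  # the new, unrated word
}
# Invented iconicity ratings for the words people have already rated.
RATED = {"wiggle": 4.0, "wobble": 3.5, "table": 0.5, "window": 0.7}


def cosine(u, v):
    """Cosine similarity: high when two words occur in similar contexts."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict_rating(word, vectors, rated, k=2):
    """Mean rating of the k rated words most similar to `word`."""
    neighbours = sorted(
        ((cosine(vectors[word], vectors[other]), other)
         for other in rated if other != word),
        reverse=True,
    )[:k]
    return sum(rated[other] for _, other in neighbours) / len(neighbours)


if __name__ == "__main__":
    print("waggle:", predict_rating("waggle", VECTORS, RATED))
```

Because ‘waggle’ sits closest to ‘wiggle’ and ‘wobble’ in this toy space, its predicted rating is the average of theirs.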

But what is it about some words that makes them both funny and iconic? Analysing the words that people rated as most funny and iconic, we found a number of recurring features: complex sequences of sounds at the start like str- and cl- or at the end like -nk and -mp, and an ending -le in verbs that contributes an element of movement and playfulness (as in ‘waggle’ and ‘wobble’).

These structural features, we propose, act as metacommunicative signals that help words stand out as playful, performative, and even poetic. They occur disproportionally in highly rated words. When we combined these cues in an overall structural markedness score, we found structural markedness predicts the iconicity and funniness ratings much better than other measures.

The relation between structural markedness and A funniness ratings, B iconicity ratings, and C funniness and iconicity together. Each dot represents 14 or 15 words. Solid lines and shading represent a loess function of cumulative markedness with 95% confidence intervals. Other lines show relative prevalence of complex onsets, codas, and verbal diminutives.
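As a rough illustration of how such cues can be combined, here is a small Python sketch that counts the three cues described above. It is an assumption-laden approximation: it works on spelling rather than on sounds (so a digraph like ‘th’ counts as two consonants), the regular expressions and function names are mine, and part of speech has to be supplied by hand.

```python
import re

CONSONANT = "[bcdfghjklmnpqrstvwxz]"  # letters only; 'y' is treated as a vowel


def complex_onset(word):
    """Two or more consonant letters at the start, as in str- or cl-."""
    return re.match(CONSONANT + "{2,}", word) is not None


def complex_coda(word):
    """Two or more consonant letters at the end, as in -nk or -mp."""
    return re.search(CONSONANT + "{2,}$", word) is not None


def verbal_diminutive(word, is_verb):
    """A verb ending in consonant + le, as in 'waggle' or 'wobble'."""
    return is_verb and re.search(CONSONANT + "le$", word) is not None


def markedness(word, is_verb=False):
    """Cumulative score from 0 to 3: one point per cue that is present."""
    word = word.lower()
    return sum([
        complex_onset(word),
        complex_coda(word),
        verbal_diminutive(word, is_verb),
    ])


if __name__ == "__main__":
    for word, is_verb in [("clank", False), ("waggle", True), ("cat", False)]:
        print(word, markedness(word, is_verb))
```

A cumulative count like this is deliberately crude; its appeal is that each point corresponds to a cue you can inspect by looking at the word.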

So our three main findings are:

  1. Words that are rated as highly iconic also tend to be rated as highly funny (in the few thousand words for which we have such ratings)
  2. This relation holds even for ratings predicted based on semantic relationships (in ~65,000 words for which we have done this)
  3. The highly rated words tend to have special forms: they sound different from other words, which invites people to treat them as playful and performative

Making sense of apparent exceptions

We also found some other things. First, funniness and iconicity ratings do not always go hand in hand. There are highly iconic words like ‘roar’ and ‘scratch’ that people don’t feel are funny because they have to do with negative events. There are also words that are rated as very funny like ‘blonde’ and ‘buttocks’ mainly because they tend to be used in jokes; these are not rated as iconic and they are not relevant for our theory.

Another thing we found is that human ratings are far from perfect. As it turns out, for the data we used, the people who rated words for how much they “sound like what they mean” gave high ratings to words like ‘whoosh’ (where the sound of the word resembles aspects of its meaning) but also to words like ‘bedroom’ (which are built by combining meaningful parts).

Only words of the first type are really iconic; the others are merely analysable. Our theory holds only for the first, which means that the 10-15% of analysable words with high iconicity ratings are probably diluting the effects we find. Indeed, when we control for this issue by looking only at words of one piece, the relation between iconicity and funniness comes out a little stronger.

We included this analysis not just to show the subtleties of the effects, but also because we believe lexical ratings (whether done by people or by machines) should never be taken at face value. Now that there are so many types of ratings available, it’s tempting to just throw together a bunch of them and have a look at correlations. But to avoid cherry-picking or reporting false positives, it is important to start with a theoretical question, and to always control the findings with other methods.

Having fun with linguistics

While the study is based on English, its questions are inspired by work on ideophones, highly evocative words found in many languages around the world. And the theory put forward in the paper is general enough to help account for many other examples of playful language described in the literature, and to guide future investigations of the relation between playfulness and iconicity in spoken and signed languages.

Our study also contributes to broadening the perspective of linguistics. While anecdotal reports about perceptions of funniness and iconicity abound, our study is the first to investigate this relation on a large scale in English, and perhaps in any language. That this hasn’t been done before is partly because linguistics has long preferred to focus on “serious” matters. However, we argue that there is nothing frivolous about studying playful language.

Cybernetician Gregory Bateson argued that the very notion of play represents a fundamental transition in the evolution of communication. This is because play requires a form of metacommunication, a way of saying “What I do now is special”. Human language has perfected such forms of metacommunication, and in our paper we trace its influence in the very texture of the lexicon.

To enable others to build on our work we’ve made sure it is open science all the way: all primary data as well as our new predicted iconicity and funniness ratings are publicly available. We also share the Python code for our prediction algorithm and the R code for all of the analyses and figures. And last but not least, the paper itself is also published open access.

  • All data and code is in our GitHub repository
  • Dingemanse, M., & Thompson, B. (2020). Playful iconicity: Structural markedness underlies the relation between funniness and iconicity. Language and Cognition, 1-22. doi:10.1017/langcog.2019.49

Rethinking Marginality: panel on interjections & interaction at IPRA

We’re convening a panel at the 16th International Pragmatics Conference in Hong Kong next week. This doubles as the inaugural workshop of my VIDI project Elementary Particles of Conversation. The workshop ties into the overall theme of the conference, which is “Pragmatics at the Margins”. Have a look at the panel programme & abstracts (PDF), or check out the overview below:

Tuesday June 11, room TU107, 13:30-17:00 (including break)

  • 13:30 Intro | Negotiating mutual understanding in multimodal interaction: a comparative and experimental approach (Marlou Rasenberg & Mark Dingemanse)
  • 14:00 Interjection as coordination device: feedback relevance spaces (Christine Howes & Arash Eshghi)
  • 14:30 Probabilistic Pragmatic Inference of Communicative Feedback Meaning (Hendrik Buschmeier & Stefan Kopp)
  • 15:00 Break (30 min)
  • 15:30 Turn structure & interjections (Christoph Rühlemann)
  • 16:00 Hebrew clicks: From the periphery of language to the heart of grammar (Yotam Ben Moshe & Yael Maschler)
  • 16:30 Interjections in Action (Isabel Ward & Nigel Ward)

Here’s the panel session abstract:

Rethinking Marginality: Interjections as the beating heart of language

Mark Dingemanse & Marlou Rasenberg
Radboud University & Max Planck Institute for Psycholinguistics

Oxford linguist Max Müller once pontificated that “Language begins where interjections end”. Work in pragmatics turns this view on its head by studying language in its natural habitat of face-to-face interaction, where interjections help us every moment to calibrate understanding and use complex language efficiently. A guiding hypothesis for this panel is that at least some interjections are highly adaptive communicative tools, culturally evolved for the job of keeping our social interactional machinery in good repair (Yngve, 1970; Dingemanse 2017). Far from being marginal grunts, words like ‘oh!’, ‘mm’, ‘um’ and ‘huh?’ play central roles in the most sophisticated uses of language. As metacommunicative signals, they are one of the places where theories of mind and pragmatic reasoning come to the surface, and they afford human language a degree of flexibility, robustness and error-tolerance unmatched in other known communication systems.

This session brings together new research on the centrality of pragmatic interjections in language, with a special focus on items and interactional practices that play crucial roles in managing the back and forth of everyday interaction. These phenomena have been studied in disparate disciplines, as seen by the proliferation of available labels, including back channels, discourse markers, phatic interjections, collateral signals, response tokens and non-lexical conversational sounds. In this lies both a challenge and an opportunity. The challenge is to formulate a unified perspective that can provide conceptual foundations and ensure cumulative progress. The opportunity lies in the disciplinary diversity, which provides us with complementary methods that can deliver converging evidence on open questions.

Topics covered in the session include: the central roles of ‘marginal’ items in the pragmatics of human interaction; their linguistic status as lexical or nonlexical items; their multimodal composition, as items combining verbal and visual cues; their semiotic status, combining indexical, iconic and symbolic properties; their cross-linguistic attestations, including patterns of universality and diversity; the paths of semantic and pragmatic change leading to and from them; and their implementation in models of language processing, dialogue systems and conversational agents.

The role of serendipity in shaping fundamental research

After much postponement, writing the final report for my NWO Veni grant (2015-2018) turned out to be an unexpected pleasure. It made me realise a couple of things — key among them the role of serendipity in shaping fundamental research.

The project was called “Towards a science of linguistic depiction”. Looking at the publications that came out of it and at imperfect indicators like citations, it’s made some useful contributions.

So did everything go according to plan? Not at all. I had to cancel fieldwork because of little ones, had to abandon some lofty plans for knowledge utilisation, and had papers variously rejected for being ‘vague’, ‘without content’ or worse.

For instance, the paper I’m most proud of was rejected twice, including by Language, where it languished 6 months in review only to have reviewer 2 call it ‘charming’ before slamming it (if you’re an ECR, it’s probably best to avoid Language: not worth the risky wait). That charming, twice-rejected paper ultimately appeared in Journal of Linguistics, where it seems to be doing pretty well, with a place in the top 10 most cited JL papers of the last 5 years and in the top 10 most downloaded in that journal in 2018.

Three happy coincidences

Back to the project. While we have better answers now to the research questions posed, and even the ‘utilisation’ goals were reached, how I got there was very different from what was originally planned, with a major role for serendipity. Three happy coincidences stand out.

First, right at the start of the project (following multiple failed attempts) me and Tessa van Leeuwen got an NWO grant to do a “Groot Nationaal Onderzoek” w/ thousands of participants: a wonderful opportunity that utterly derailed all other outreach plans I’d made. We’re now working on several papers from that project; if you read Dutch, check out this piece we wrote for Onze Taal on language, cross-modal perception and synaesthesia: “Taal als samenspel van de zintuigen” (PDF).

Second, at MPI, I had the opportunity to supervise a cool PhD project by UCL graduate Gwilym Lockwood that perfectly matched the Veni project. This is privilege kicking in: outside the Max Planck system a fully funded PhD like this would have been unlikely. It’s also luck: Gwilym turned out to be a productive, self-propelled student who knew what he wanted and found a cool data science job right after his PhD. Check out his 2017 thesis —Talking Sense— here.

Third, a pleasant and productive synergy developed with other Nijmegen VI-laureates Gerardo Ortega (Veni) and Asli Özyürek (Vici): the Iconicity Focus Group, a vibrant iconicity-focused research community in Nijmegen with lots of inspiring lectures and meetings. This culminated in an exciting co-sponsored international workshop on ‘Types of iconicity in language use, development and processing’ #IFG17, funded by our three NWO grants and supported by MPI, Donders and the Centre for Language Studies at Radboud University.

All of these things massively enhanced the project and made it possible to meet goals set in the distant past of clueless yet hopeful grant-writing. Add cool student projects, visiting scholars, and a concurrent ‘iconicity boom’ and it’s clear that success can’t really be planned.

Serendipity

Looking back at the project as a whole, it’s nice to be able to point to a coherent set of publications, and gratifying to see the growing impact of work on ideophones & depiction. But in the report I just submitted to NWO, I made sure to also acknowledge serendipity. Here’s the final paragraph of my report:

I am extremely grateful to NWO for the opportunity and the freedom to carry out basic research using a Veni grant. As I have noted at several points above, while the goals have been met, the project has also seen considerable changes and has benefited from unexpected synergies. My overall conclusion is that serendipity plays an important role in shaping fundamental research and its applications.

The process reminded me of what the Hitchhiker’s Guide to the Galaxy says about the art of flying: “The knack lies in learning how to throw yourself at the ground and miss”. For that, you need to be “lucky enough to have your attention momentarily distracted at the crucial moment”. Certainly some of the most fun and serendipitous collaborations have come from me being distracted by too many new ideas at once — often when I was about to hit the ground (or a deadline). Make of that what you will.

Okay, time to wrap up. Lessons learned:

  1. Things will go differently than planned, and that’s okay (be curious).
  2. Reviewer 2 doesn’t get it, and that’s okay (keep trying).
  3. It’s more fun when you do it together (collaborate).

And that’s all folks. Enjoy the holidays. Take some time off, you deserve it. So long and thanks for all the funding!

What is ‘non-lexical’? Notes on non-lexical vocalisations, II

New! Some of this is now published here (open access, free for all!):

Dingemanse, Mark. 2020. Between Sound and Speech: Liminal Signs in Interaction. Research on Language and Social Interaction 53(1), 188–196. doi:10.1080/08351813.2020.1712967 (PDF)

TL;DR: Non-lexical is a term people use for things that seem borderline linguistic, like sniffs, coughs, and grunts. However, it’s rarely a great idea to define things in terms of what they are not. In fact we can think of many of these things as liminal signs: signs that can but need not be used communicatively because they occupy the borderland between sound and speech.

This is part II of my notes on the “Ideophones and non-lexical vocalisations” workshop. Part I is here.

Order at all points

One of the nice features of the workshop was the “rapid data session” format, which enabled analysts to make available one or two data extracts (often with audio, video and transcripts) for repeated inspection, allowing everyone in the audience to study them and make observations or ask questions. In this way we discussed data featuring vocalisations including ermmmnrrrnuh::, ʔouiʔ, ha:i: (sighed), du du ka du du du ka, k’hohhh, zuppum, hop-paa, and many more.

But there is method to the madness. For instance, talking on the topic of “How to audibly not say something with clicks”, Richard Ogden (York) showed how English speakers use various click sounds for double entendres, collusions and in general things that are treated as best left off the record. He also made a convincing case for a systematic, conventionalised contrast between lateral and central click sounds, which maps onto a contrast in social actions. Despite English not being generally known as a click language, English speakers have no trouble mastering this contrast and use it in everyday interaction (some details are in Richard’s 2013 paper on clicks).

When speech sounds are distinctive in this way, linguists often use that as evidence to argue for phonemic status: the contrastive sounds earn their place in the phonology of the language. These conversational clicks form an interesting test case. Is a single systematic contrast, or even a small number of similarly contrasting items, sufficient for admission to the phoneme inventory, or is there some kind of threshold we use to determine this?

I think it is fair to point out that the majority of English words don’t feature contrastive click phonemes, and so this could be a reason to say they are not part of English phonology. But such frequency-based arguments can be slippery. Given that phonemes show a Zipfian distribution, we expect there to be relatively rare phonemes. Are clicks simply one extreme of this continuum? I can’t bring myself to agree with this either, if only because their distribution (in terms of places where they occur) seems quite different.

Most importantly, in English, these click sounds don’t seem to be contrastive within words the way p/t or k/g are, but instead are contrastive as stand-alone items. So on a generous reading of ‘word’, these items are words, or at least lexical items, or at least conventionalised linguistic items. Which brings me to the key question I was left with after the workshop.

What, if anything, is “non-lexical”?

Throughout the workshop we faced the challenge of how exactly to refer to the various things we studied. One term used widely, mostly for want of a better one, was non-lexical vocalisations. While it may be the best we have currently, there are several issues with it.

First, it’s never great to define by negation. Is not being lexical the key feature setting apart these vocalisations as a phenomenon? What would lexical vocalisations be, anyway? We have the term ‘word’, so using an alternative like ‘vocalisation’ already implies some relevant difference between the items in focus and run-of-the-mill words like ‘cat’ or ‘mat’. And as we saw above for the clicks, a case could be made that even these phonologically outlandish items have some recognised (or at least recognisable) status as conventionalised items in a larger system of practices, i.e., a lexicon.

Second, calling them “non-lexical” implies that the lexicality of these items is somehow lacking or in doubt. True, these items are unlikely to be found in traditional lexicons; but the arbitrary constraints of printed dictionaries will never be a reliable guide for linguistic questions. Anyway this doesn’t help if we want to argue (as several of us did during the workshop) that the shape of these items can to an important degree be conventionalised, or that they may draw on partly conventionalised inventories of depictive practices, or that they are used in systematic ways, or that they form paradigmatic relations within larger systems of practices. All of these point to a conventionalised, and therefore possibly lexicalised, status of these things.

Depictions and displays

Before we worry about lexicality, it’s worth asking whether there is a unified phenomenon here in need of a single label like “non-lexical vocalisations”, or whether there are multiple distinct phenomena. I think there may be at least two clear groups of phenomena worth distinguishing:

1. Vocal depictions (≈ Clark’s ‘demonstrations’, Güldemann’s ‘mimesis’)

These are vocalisations typically presented as depictions of sensory scenes that enable others to perceive for themselves the scene depicted. Examples include ideophones, creative vocal imitations of sounds, movements and other sensory scenes. In Peircean terms, their mode of signification is primarily iconic. For example, a vocalisation like wop pa da PUM can iconically depict aspects of the temporal and kinetic dynamics of a sequence of dance moves (Keevallik 2010). Like all signs, vocal depictions may also have symbolic and indexical properties.

While most English speakers won’t feel that wop or pa da PUM are words, one could make a case for a degree of conventionalisation in particular communities of practice. For instance, dancers or musicians who work closely together likely converge on a small set of vocalisations they use in this way (Sundberg’s 1994 syllabling). From here it’s not far at all to the larger inventories of conventionalised vocal depictions we call ideophones. Indeed one place where we find ideophones is precisely in situations where there is a premium on sharing and calibrating sensory perceptions and achieving bodily coordination, as Elena Mihas (2013) has shown for ideophones in Ashéninka Perené. (Some of these uses of ideophones are reviewed in a forthcoming article for the Oxford Research Encyclopedia of Linguistics; preprint here.) So I see vocal depictions as an overarching category that includes creative vocalisations as well as conventionalised ideophones, and everything in between.

2. Vocal displays (≈ Goffman’s ‘response cries’, Kockelman’s ‘interjections’ — I’m not sure whether ‘display’ is the best term here)

These are vocalisations typically produced as indexical signs of emotion, effort, evaluation. They are presented not so much as depictions of events but as responses to events. Examples include strain grunts, pain cries, yawns, interjections of disgust, vocal signs of cognitive effort, etc. For Goffman these would present themselves more as “giving off” than “giving” information, though of course precisely this opens up the possibility for people to produce or treat them as doing other things ostensibly off record. In Peircean terms, their mode of signification is primarily indexical. For instance, the phonetic form of a strain grunt does not itself present a resemblance to its ascribed meaning of ‘effort’; it can be seen to indexically show that effort. Like all signs, vocal displays may also have symbolic and iconic properties.

I’m trying to be careful here in saying that vocal displays are “typically produced as indexical signs”. An inbreath or a click sound can be ‘merely’ an index of the physical process of preparing to speak, involuntarily produced; but that it regularly occurs in this indexical relationship means that we can also use it in a more controlled way to display imminent speakership, and therefore do interactional work. Likewise, something like um can be ‘merely’ an index of the cognitive process of starting to formulate a turn but not being ready to speak yet; but that it regularly occurs in this function makes it possible for us to do interactional work with it, for instance, buy ourselves time at interactionally fraught moments (Clark & Fox Tree 2002).

(Non)lexicality is an orthogonal issue

The two groups, vocal depictions and vocal displays, are united at least in being commonly treated as marginalia in the subjective sense (Dingemanse 2017). Further, vocal depictions and vocal displays are both more ‘showing’ than ‘telling’, though for different reasons: depictions because they iconically create a likeness (Donald 1998), displays because they indexically provide evidence of some inner feeling or state (Wharton 2003). Both groups also appear to allow a degree of gradience that seems to be less typical for more descriptive vocabulary: depictions because modifications in form analogically correspond to modifications in meaning, and displays because they are productively combined with a wide range of prosodic resources in the service of showing stance and streamlining interaction. All of these things may justify grouping them together as “vocalisations”. But I wouldn’t want to call them “non-lexical” across the board.

The reason is that lexicality is an orthogonal matter. Lexicality is a graded property (something can be more or less lexical) and it runs through both groups: in both, we have fully conventionalised lexical items like ideophones or the word “um”; and items that are less clearly conventionalised and linguistically integrated, like the vocal depiction “pa da PUM” or a vocal display like an inbreath. And there are going to be lots of intermediate forms as well.

There are yet other things that have been called “nonlexical” or variations thereof, that may or may not be groupable with either of these two broad categories. For instance, Nigel Ward has an interesting line of work on continuers, backchannels and the like, which he calls “nonlexical conversational sounds” (Ward 2006). Despite an interesting degree of formal gradience, I think the claim of nonlexicality here is premature, and may be too strong. Likewise, Schegloff has described the interjection Huh?, used to initiate repair, as a “virtually pre-lexical grunt” (Schegloff 1997). Comparative interactional linguistic research has since shown that many languages have an interjection of this kind, and while it may not be the most prototypical lexical item, it certainly is a word rather than a grunt: it is integrated in terms of phonology and interrogative prosody, and its cross-linguistic commonalities notwithstanding, the actual realisations show enough language-specificity that they have to be learned.

Some of these items may be close to the vocal displays above, a link that is alluring because they don’t sound like many other words. But I would hesitate to identify them with response cries, exclamations or grunts; as I have argued elsewhere, perhaps their peculiar shapes are not so much because they originate as involuntary grunts, but because they are optimally adapted to the exigencies of conversation (as we have argued in detail for “Huh?”). That topic is at the core of my newest research project on Elementary particles of conversation. More about that on some other occasion.

References

  • Akita, Kimi, and Mark Dingemanse. 2019. “Ideophones (Mimetics, Expressives).” In: Oxford Research Encyclopedia of Linguistics. Preprint: https://ling.auf.net/lingbuzz/004347
  • Clark, Herbert H., and Jean E. Fox Tree. 2002. “Using Uh and Um in Spontaneous Speaking.” Cognition 84: 73–111.
  • Dingemanse, Mark. 2014. “Making New Ideophones in Siwu: Creative Depiction in Conversation.” Pragmatics and Society 5 (3): 384–405. https://doi.org/10.1075/ps.5.3.04din.
  • Dingemanse, Mark. 2017. “On the Margins of Language: Ideophones, Interjections and Dependencies in Linguistic Theory.” In Dependencies in Language, edited by N. J. Enfield, 195–202. Berlin: Language Science Press. https://doi.org/10.5281/zenodo.573781.
  • Donald, Merlin. 1998. “Mimesis and the Executive Suite: Missing Links in Language Evolution.” In Approaches to the Evolution of Language: Social and Cognitive Bases, edited by James R. Hurford, Michael Studdert-Kennedy, and Chris Knight, 44–67. Cambridge: Cambridge University Press.
  • Goffman, Erving. 1978. “Response Cries.” Language 54 (4): 787–815.
  • Keevallik, Leelo. 2010. “Bodily Quoting in Dance Correction.” Research on Language & Social Interaction 43 (4): 401–26. https://doi.org/10.1080/08351813.2010.518065.
  • Keevallik, Leelo. 2014. “Turn Organization and Bodily-Vocal Demonstrations.” Journal of Pragmatics, A body of resources – CA studies of social conduct, 65 (May): 103–20. https://doi.org/10.1016/j.pragma.2014.01.008.
  • Kockelman, Paul. 2003. “The Meanings of Interjections in Q’eqchi’ Maya: From Emotive Reaction to Social and Discursive Action.” Current Anthropology 44 (4): 467–97.
  • Mihas, Elena. 2013. “Composite Ideophone-Gesture Utterances in the Ashéninka Perené ‘Community of Practice’, an Amazonian Arawak Society from Central-Eastern Peru.” Gesture 13 (1): 28–62. https://doi.org/10.1075/gest.13.1.02mih.
  • Ogden, Richard. 2013. “Clicks and Percussives in English Conversation.” Journal of the International Phonetic Association 43 (3): 299–320. https://doi.org/10.1017/S0025100313000224.
  • Schegloff, Emanuel A. 1997. “Practices and Actions: Boundary Cases of Other-Initiated Repair.” Discourse Processes 23 (3): 499–545. https://doi.org/10.1080/01638539709545001.
  • Ward, Nigel. 2006. “Non-Lexical Conversational Sounds in American English.” Pragmatics & Cognition 14: 129–82. https://doi.org/10.1075/pc.14.1.08war.
  • Wharton, Tim. 2003. “Interjections, Language, and the ‘Showing/Saying’ Continuum.” Pragmatics & Cognition 11: 39–91. https://doi.org/10.1075/pc.11.1.04wha.

A variety of vocal depictions: Notes on non-lexical vocalisations, I

Last week I was happy to present my work at a workshop on Ideophones and nonlexical vocalisations in Linköping, Sweden, organised by Leelo Keevallik and Emily Hofstetter. This was the kick-off for a new project on “Non-lexical vocalisations”. It was my first time in Linköping and it was great getting to know the vibrant community of interaction researchers from across departments. Also, I kind of fell in love with the Key Huset building and its light-flooded, wood-toned spaces.

The workshop was thought-provoking in many ways. This is the first of two posts in which I share some of my notes. It’s a personal take, not at all intended as a comprehensive summary, if only because I had to leave early to pick up my daughter from daycare back in Nijmegen and therefore missed the last third of the workshop, which (judging from Emily Hofstetter’s live tweeting) was just as interesting as the first two thirds. A central concern of the larger project hosting the workshop is to “problematise the traditional boundaries of linguistics”. This is something I’m sympathetic to, if only because my own work on ideophones and interjections has made me acutely aware of the subjectiveness of our notions of what is marginal and what is core in language.

Rara versus marginalia

In thinking about marginality, I find it useful to distinguish two ways in which things may be peripheral: rara and marginalia (see Dingemanse 2017). Rara are truly rare linguistic phenomena that are interesting precisely because they are so out of the ordinary: things like click phonemes, nominal tense, or affixation by place of articulation. Marginalia are common phenomena that just don’t happen to be part of the traditional interests of linguistics: things like gesture, ideophones, or indeed “non-lexical” vocalisations.

The crucial difference between rara and marginalia lies in the subjectivity of the latter. We can objectively tell whether something is truly rare or exceptional. But many classifications of things as peripheral or marginal are much more subjective. What we think of as marginal is determined by our data, methods, and theories; and in addition to that, by our own linguistic experience and language ideologies. There is nothing wrong with declaring some things peripheral to your current interests: time is limited and we all have to make choices. But it is always useful to be aware of how you come to such choices, and to reflect on whether your interests (or methods, theories, ideologies) might benefit from a bit of recalibration.

Many of the phenomena in focus during the workshop were not rara but marginalia in this subjective sense: they occur all the time in language use and might tell us interesting things about language structure — but they’ve been mostly treated as marginal to the concerns of mainstream linguistics. However, the tide may be turning for at least some marginalia: work on ideophones is clearly on the rise, and initiatives such as Martina Wiltschko’s Eh lab at UBC and this new nonlexical vocalizations project at Linköping University show there is significant interest in this area.

Vocal depiction is rampant

One thing that struck me during the workshop is how common it is to use the voice to depict meaning, often in contexts where other means of communication may be much less efficient or effective. Whether it’s during lindy hop learning sessions (as in Leelo Keevallik’s work) or band practice (as in Agnes Löfgren’s data), in professional choreography rehearsals (as in Johanna Skubisz’s work) or in everyday interaction in Siwu (as in my work on ideophones), people use vocal depictions —often in multimodal ensembles— to evoke perceptual experiences and coordinate bodily behaviour.

One thing all kinds of vocal depictions have in common is that they show rather than tell. It is incredibly hard to tell a dancer to execute a movement in a certain way; it is much easier to show it, either by means of a bodily demonstration or by means of gestural and vocal depictions. Or to take an example from my own research in Ghana, it is quite hard to explain how you can visually tell a real batch of gunpowder from a counterfeit one, but if you manage to depict its particular sheen using gestures and an ideophone like kɛlɛŋkɛlɛŋkɛlɛŋ (as in example 11.11 here), you can go a long way.

Depictions construe a likeness or a replica of some sensory scene (Clark 2016), making aspects of it more directly accessible and manipulable than would be the case if the scene were merely described in arbitrary words. This is what makes them useful in a wide range of communicative contexts. In my own work on creative vocal depictions (PDF) I mentioned settings as diverse as storytelling, joint work in animation studios, and interaction in music and dance lessons. During the workshop we saw further examples from band practice, choreography rehearsals, multilingual conversations, and doctor-patient interaction. This diversity of contexts brings home the versatility of depiction as a communicative practice.

Versions of the ‘same’ thing are analytical rich points

Some of the richest opportunities for analysis come from cases where the interaction provides multiple versions of some behaviour designed to represent ostensibly the same scene. For instance, in Agnes Löfgren’s extract from a band rehearsal, we heard a bass player convey (to the drummer) a particular rhythmic structure he had in mind for this piece. The bass player produced at least four versions of ostensibly the same content. The versions can be seen as escalations or upgrades, in part shaped by the drummer’s responses which ranged from ‘isn’t that what I’m doing now’ to ‘alright okay’ to ‘I don’t see it yet’ to ‘like it actually gets kind of cool’:

  1. a prose description (‘so it’s like you play fou- a four against our three’)
  2. a depiction in syllables (du du ka du du ka du ka) with the foot doubling as bass drum
  3. a short rhythmic phrase played on the bass, soon abandoned
  4. an actual demonstration on the drum set

Cases like this raise many intriguing questions, some inspired by Clark & Gerrig’s (1990) classic work on quotations as demonstrations. How do we decide between modalities (or combinations of modalities) in designing depictions? What determines the ordering of strategies seen in successive pursuits? What is the role of recipient design in choosing one over another strategy? How do we select the aspects of a scene that we are going to depict, and how do we map these to the depictive means at hand? How is the design of our depictions shaped and constrained by the affordances of meaning and modality? And so on.

We saw more examples in Leelo Keevallik’s lindy hop data. In one memorable case, a lindy hop learner asks a question about a possibly problematic element of a dance move, referring to it using the creative vocal depiction “zup↑pum↑”. The teachers decide to show rather than tell by actually executing the moves, and in synchrony with this they produce vocalisations that depict some of the rhythmic and kinetic aspects of the dance, including a piece that structurally is recognisable (for us analysts as well as, presumably, for the learner asking the question) as the relevant referent of “zup↑pum↑”. Also during the dance, the other teacher produces ‘nonlexical’ syllables like chigi digi digi in sync with the beat and with his movements, and after completing the dance, adds, “So yeah, it’s just a nice little jigijigijigi”, simultaneously depicting some of the kinetic aspects of the dance in voice and hands.

Versions of ostensibly the same thing are crucial because they give us more material to work with if we want to understand the link between the depiction and the depicted scene — often a challenge not just for the analyst but also for the recipient in interaction. Versions give us analytical purchase in two key ways: they show multiple iterations of ostensibly the same action, and if we’re lucky, they also give us multiple takes on the material by the recipient, providing crucial interactional evidence of the success or failure of depictive stretches of behaviour.

One type of useful interactional evidence is when different participants provide takes on ostensibly the same scene that demonstrate (rather than just claim) their understanding or expertise. With ideophones, I have found that when one participant produces an ideophone evoking a scene (e.g., munyɛmunyɛ ‘sparkling’), in second position another participant may then produce another ideophone (e.g., gelegele ‘shiny’) as if to say, I agree with you, and here is how I see it. This is where vocal depictions in interaction touch on matters of epistemics and authority.

A key challenge when working with creative depictions is that it can be hard for the analyst to even know what they are supposed to depict. Here, another type of interactional evidence can be particularly useful: when a recipient formulates their understanding of the depiction. In my talk at the workshop I discussed a case from my study on creative vocal depictions where one person’s creative ideophone kpaw is followed by the other’s interpretation in next turn: “the gun didn’t go off”:

  1. A:  lopɛ↑kpaw↑
         I fired ↑kpaw↑
  2.      (1.2)
  3. B:  kùdu leiba inɔ̀
         the gunpowder didn’t go through
  4. A:  kùdu leiba- kɔ
         the gunpowder didn’t go- gee!

What B does in line 3 is take A’s creative depiction and formulate an understanding of it in descriptive terms. This is analytically very useful, because it saves us the trouble of speculating what the depiction was supposed to evoke. B’s interpretation is ratified by A when he repeats it and continues the telling.

It is kind of wonderful that we can create and interpret vocal depictions just like that. What cases like this show is that interactional evidence can help us crack some of the most intriguing questions about creative vocal depictions. Their interpretation is scaffolded by context, supported by people’s familiarity with (conventional) depictive strategies, and ratified in interaction by these kinds of understandings.

(An interesting boundary case comes from Hannah Pelikan’s work on interaction with a Nao robot. She recorded games of charades. Nao would produce a pre-programmed ‘depiction’ (e.g. playing a plane sound and visually imitating wings with arms) and a participant would produce a verbal guess, which was then treated as right or wrong by Nao depending on a pre-programmed set of answers. Hannah’s data shows that people are pretty graceful even when perfectly reasonable guesses are dismissed by Nao, and rapidly adapt to the limited agency displayed by the robot. What’s potentially interesting here is that we could get multiple takes on what is guaranteed to be the exact same depiction: holding one side of the equation still, as it were, to see what the other, more flexible human side makes of it. However, due to the restricted format of the charades game, usually there was only one guess and no opportunities for redress.)

In closing

One thing that is so fascinating about marginalia is the combination of relatively common occurrence with a striking lack of systematic attention from linguists and interaction researchers. It means that there are lots of things still to find out about some of the most fundamental aspects of how we use language, and how language is shaped by and for social interaction. In the next installment I’ll explore some other themes from the workshop, focusing on the question: what does it mean to call something “non-lexical”?

References

  • Clark, Herbert H., and Richard J. Gerrig. 1990. “Quotations as Demonstrations.” Language 66 (4): 764–805.
  • Clark, Herbert H. 2016. “Depicting as a Method of Communication.” Psychological Review 123 (3): 324–47. https://doi.org/10.1037/rev0000026.
  • Dingemanse, Mark. 2014. “Making New Ideophones in Siwu: Creative Depiction in Conversation.” Pragmatics and Society 5 (3): 384–405. https://doi.org/10.1075/ps.5.3.04din.
  • Dingemanse, Mark. 2017. “On the Margins of Language: Ideophones, Interjections and Dependencies in Linguistic Theory.” In Dependencies in Language, edited by N. J. Enfield, 195–202. Berlin: Language Science Press. https://doi.org/10.5281/zenodo.573781.
  • Keevallik, Leelo. 2010. “Bodily Quoting in Dance Correction.” Research on Language & Social Interaction 43 (4): 401–26. https://doi.org/10.1080/08351813.2010.518065.
  • Keevallik, Leelo. 2014. “Turn Organization and Bodily-Vocal Demonstrations.” Journal of Pragmatics, A body of resources – CA studies of social conduct, 65 (May): 103–20. https://doi.org/10.1016/j.pragma.2014.01.008.

Sign names and theories of naming

Every time I learn new name signs —e.g. during my UCL visit hosted by @gab_hodge— I’m struck by how they call into question Searle’s (spoken English-based) arguments about how proper names work. Many sign names appear to be descriptive (or at least originate as descriptions).

Moreover, often one gets the ‘baptismal story’ along with learning the name sign, meaning the motivation is kept alive by users. There is some literature on this, e.g. Mindess 1990 has some interesting data on the connection of name signs to identities and descriptive features.

And there are surveys like Meadow 1977 and Supalla 1990 that show important sociolinguistic differences in how names originate, how some of them are more arbitrary than others, and how these issues are wrapped up with matters of Deaf culture.

In general, I think the philosophy of proper names (and philosophy of language more generally) could benefit a lot from shedding its spoken language bias and learning from naming practices and name signs in sign languages — these literatures have barely touched each other.

This originated as a thread on Twitter.

New paper: Redrawing the margins of language

Just out in Glossa, the premier open access journal of general linguistics:

Dingemanse, Mark. 2018. “Redrawing the Margins of Language: Lessons from Research on Ideophones.” Glossa: A Journal of General Linguistics 3 (1): 1–30. doi:10.5334/gjgl.444. (download PDF)

In this paper I take up the theme of marginality (as distinct from rarity) from my 2017 essay, and take it in a different direction. I argue that the narrative of marginalisation, while historically justified, no longer suffices for ideophones, and that it obscures some of the insights from 150 years’ worth of research on this phenomenon. The paper is openly available so I won’t summarise it fully here; instead I’ll draw out a few of the lessons I learned while writing it.

How things get marginalised

As many have pointed out, ideophones have long been treated as marginal in linguistics. But how does something come to be seen as marginal? For ideophones, I found there are two basic strategies: assimilation and exceptionalism. In assimilation, we explain away a phenomenon by assuming it’s the same as something already familiar (and marginal anyway), giving us a reason to neglect it. In the case of ideophones, this is often done by shelving them away as interjections or as onomatopoeia. Exceptionalism is the reverse: we stress the utter difference of a phenomenon and thereby place it outside the bounds of normal linguistic inquiry — another reason to neglect it (or leave its investigation to scholars happy to work on ‘exotic’ topics).

One of the best examples of how exceptionalism works is Vidal, who in an introduction to a Yoruba dictionary wrote that he considered ideophones a “singularly unique feature” of the language, and continued, “therefore I shall not waste time in comparing it with the adverbial systems, whatever they may be, of other African languages” (Vidal 1852). Ironically, exceptionalism often arises out of a wish to stress the significance of something; but it may have the same effect as assimilation, namely to shield it from broader investigation. A goal of my paper is to walk the fine line between assimilation and exceptionalism: show what’s special about ideophones without losing sight of how they fit into the bigger picture.

Ideophones are a major word class in many languages

If you haven’t worked on or don’t speak a language with a well-developed ideophone system it can be hard to appreciate the sheer scale of ideophone inventories. Here’s a remarkable fact: in some of the most well-documented languages, ideophones are a major word class on the same order of magnitude as nouns or verbs. Would you be able to take a grammar seriously if it didn’t treat verbs? If you encounter a grammar of a Bantu language, or of Basque, Korean or Japanese, that doesn’t treat ideophones in detail, you should look at it with the same suspicion.

Language   Reported magnitude of ideophone inventory
Basque     “more than 4,500” (Ibarretxe-Antuñano 2006: 150)
Gbeya      “over 3,000” (Samarin 1971: 161)
Japanese   “4,500” (Ono 2007)
Korean     “several thousands” (Sohn 2001: 96)
Semai      “same order of magnitude” as nouns and verbs (Diffloth 1976: 249)
Turkish    “one to two thousand” (Jendraschek 2001: 39)
Zulu       “3,000” (von Staden 1977: 200)

Stress-testing theories

If ideophones indeed are a major word class in some languages, one consequence is that it becomes more urgent to include them in our theorising. What good is a theory of phonological features that can’t deal with the phonosemantic mappings or phonotactic markedness of a major word class? Or a theory of morphology that can’t deal with templatic phenomena? Or a theory of words that can’t deal with gradience in form and meaning? In the 1970s and 1980s, the time of the first ‘cross-linguistic encounter’, ideophones played an important role in theory formation in many areas of general linguistics. Their role was often one of ‘stress-testing’ theories: ideophones provided the kind of boundary phenomena that could make or break generalisations.

For instance, ideophones played a crucial role in McCarthy’s (1983) new theory of nonconcatenative morphology. As he noted, “these exotic phenomena pervade the world’s languages with a regularity and complexity that makes them both essential and ideal for testing any theory of morphology”. By the way, that ideophones could be described as “exotic phenomena” and as “pervading the world’s languages with regularity” in one sentence is a perfect illustration of the viewpoint dependence of notions of marginality.

Forgotten classics

Digging up old work on ideophones is very rewarding. It turns out luminaries like Vidal, Junod, and Westermann had lots of interesting stuff to say. One problem is that their work often comes in languages other than English — for instance, Junod wrote in French and Westermann in German. Since it bothered me that so few people had access to their pioneering work, my review presents some of their most insightful comments in the hope that others will benefit from them as well.

I’m particularly fond of Westermann, whose two classic papers on iconic mappings in West-African ideophones I made available for download before. These papers as well as his grammars and dictionaries of Ewe radiate a deep knowledge of the language, and his comments show how he worked closely with native speakers to really understand what ideophones do and how they work.

Diverse voices

Speaking of native speakers, one thing that is striking when you take any reasonably comprehensive bibliography of ideophone studies is the number of contributions by scholars who are also native speakers. It is hard to find other linguistic phenomena that have benefited so much from work by linguists with native speaker sensibilities. Especially in the last decades, this has shaped the course of developments in ideophone studies in important ways.

Here’s why this is important. As we have seen, marginality is to a large degree subjective: what you consider marginal depends on your methodological focus, your theoretical framework, your disciplinary upbringing, but also, importantly, your own native language(s). Scholars with native speaker sensibilities can provide an insider perspective that others may lack. It has been pointed out that having contributions from both native and non-native scholars is one of the most productive ways to do language science (Ameka 2006). Ideophone studies provide a good model for this.

In short

As ideophones are increasingly being brought into the fold of the language sciences, they make visible our scholarly biases; they help us innovate methods and theories; and they keep giving us reasons to look at language with fresh eyes.

More in the paper: Dingemanse, Mark. 2018. “Redrawing the Margins of Language: Lessons from Research on Ideophones.” Glossa: A Journal of General Linguistics 3 (1): 1–30. doi:10.5334/gjgl.444. (download PDF)

When publication lag turns predictions into postdictions

In late 2011, I defended my PhD thesis and submitted two papers on ideophones. One to Language and Linguistics Compass, where it was reviewed, revised and accepted in May 2012. It appeared in late 2012 and against all odds (for a topic so obscure) went on to become the #1 most cited article in that journal of the last 5 years. Around the same time, I submitted another paper to a special issue of STUF – Language typology and universals, where like the first, it was reviewed, revised and accepted in May 2012. That paper finally appeared in… wait for it… August 2017 (!). A preprint has been available for a while, but in linguistics, people generally avoid citing those so it hasn’t really had much of a chance. Anyway, here it finally is!

Old! New! Dingemanse, Mark. 2017. “Expressiveness and system integration. On the typology of ideophones, with special reference to Siwu.” STUF – Language Typology and Universals 70 (2): 363–84. doi:10.1515/stuf-2017-0018 (PDF).

Postdiction? Prereplication?

This has led to the interesting situation that some predictions made in this paper have become postdictions:

The generality of these proposals predicts that the morphosyntax of ideophones in other languages should pattern in similar ways, at least with respect to grammatical integration and expressiveness. (p. 378)

Indeed, a replication of the main result appeared before the paper itself (Dingemanse & Akita 2016), making it what, a precognitive replication? Pre-replication? Anyway, here’s the call for replication that was the original impetus for my collaboration with Kimi Akita:

We know now that most languages have multiple constructions in which ideophones can be used, and these constructions will in all likelihood differ from each other along the lines sketched here (as well as in other ways). Cataloguing such differences on the basis of evidence from naturally occurring data will contribute to the description of the morphosyntax of ideophone systems in individual languages and will make it possible to refine and replicate the findings here crosslinguistically. (p. 379)

I’m glad to see this paper finally out. Fortunately, it contains some stuff that wasn’t preempted by later papers that appeared earlier. For instance, there are observations on frequency, borrowing, and ideophonisation and deideophonisation that would be worth following up in larger corpora and in other languages. Have a read!

Firth on the analysis of conversation (1935): sequence and social accountability

Here are some insights from J.R. Firth in 1935 that offer an interesting early outlook on language use in social interaction. Firth (1890-1960) was an expert in phonetics and prosody, but always stressed the importance of the larger context in which words and utterances occurred. In this piece, he turns to conversation as a source of insight about language:

Neither linguists nor psychologists have begun the study of conversation; but it is here we shall find the key to a better understanding of what language really is and how it works.

Firth’s observations appear in the course of a methodological commentary on the problem of polysemy in lexicography and in language learning. His proposal is to let context contribute to a solution. As he notes, while “situations are infinitely various”, still “Speech is not the “boundless chaos” Johnson thought it was.” (p. 66). He continues:

Conversation is much more of a roughly prescribed ritual than most people think. Once someone speaks to you, you are in a relatively determined context and you are not free just to say what you please. We are born individuals, but to satisfy our needs we have to become social persons, and every social person is a bundle of rôles or personae

As Firth observes, in conversation, you are not free to say what you please. Instead, what has been said before shapes and constrains your options, and what you say similarly shapes and constrains what happens further on. When conversation analysts talk about how any turn is both context-shaped and context-renewing (in Heritage’s apt formulation), this is essentially what they mean. Further, an important aspect of constraints on what is said derives from the need to manage social roles and personae: Goffman avant la lettre. These points together moreover bring into view a notion of social accountability.

Further on in the paper, Firth foreshadows notions like sequential structure and conditional relevance, which have come to occupy a key place in conversation analysis:

The moment a conversation is started, whatever is said is a determining condition for what, in any reasonable expectation, may follow. What you say raises the threshold against most of the language of your companion, and leaves only a limited opening for a certain likely range of responses. This sort of thing is an aspect of what I have called contextual elimination. There is a positive force in what you say in a given situation, and there is also the negative force of elimination both in the events and circumstances of the situation and in the words employed, which are of course events in the situation.

Again, the words “reasonable expectation” implicitly invoke a notion of accountability. Here Firth goes further into the idea of prior speech providing ‘determining conditions’ for what is sayable next. Take a polar question: it expects, invites (or as conversation analysts say, makes relevant) a limited range of answers, one type of which is preferred. The ‘limited opening for a certain likely range of responses’ is a proto-version of what conversation analysts have come to call conditional relevance and preference.

Firth’s observations on the structuring of conversation go beyond simple behavioristic conceptions like response probability and ‘behavior under the control of some stimulus’ (Skinner). His discussion captures the role of social accountability as well as the probabilistic aspects inherent in language use. His notion of ‘contextual elimination’ captures the sense in which one’s contributions to conversation shape and constrain what happens downstream without uniquely determining it.

While this paper is widely cited in corpus linguistic circles and in the Firth/Halliday tradition, Firth’s observations on conversation have rarely received attention, and there is as far as I know no direct historical connection between them and later insights developed in the field of conversation analysis, which started a few decades later in California with Sacks, Schegloff and Jefferson. So this is likely a case of scholars reaching the same kind of conclusions independently — a powerful reminder of what can happen if we don’t assume conversation is messy and irregular, and instead sit down and take conversation for what it is: the primary ecology of language use, and one of the best places to gain new insights about the nature of language.

Firth, J. R. 1935. “The Technique of Semantics.” Transactions of the Philological Society 34 (1): 36–73. doi:10.1111/j.1467-968X.1935.tb01254.x.