Phonosemantics, Chinese characters, and coerced iconicity

The light descending (from the sun, moon and stars.) To be watched as component in ideograms indicating spirits, rites, ceremonies.

The linguistic blogosphere featured some posts recently on the topic of phonosymbolism, phonosemantics, and Chinese characters. It started with a post by Victor Mair over at Language Log, outlining several approaches to “etymologizing” Chinese characters. A follow-up by David Branner highlighted some of the problems with simplistic notions of phonosymbolism. Here I add some texture to the conversation by discussing the views of Ezra Pound, making a comparison to form-meaning mappings in ideophones, and introducing the notion of coerced iconicity.

The posts by Mair and Branner address a popular but quite mistaken notion: the idea that Chinese characters are like little pictures whose meaning can be “read off” from the strokes. The academic best known for debunking this misconception was John DeFrancis, in his 1984 book The Chinese Language: Fact and Fantasy. He showed that the bulk of Chinese characters are phono-semantic compounds in which one element indicates (at most) a general category of meaning and the other suggests the pronunciation.

The pictorial view

The roots of the “pictorial” view of Chinese characters in the Western world no doubt go far back. One of the driving forces behind it in the first half of the 20th century was the poet Ezra Pound. Pound is a fascinating figure, famed for his influence as a Modernist, Imagist poet and literary critic (and controversial for some of his political views). I have recently described Pound’s ideas about Chinese ideograms:

Over the years, Pound developed a fascination with the poetic affordances of logographic writing systems, especially Chinese. This fascination originated with his discovery of a theory of the Chinese character by Ernest Fenollosa [published in an edition by Pound in 1920], who argued that Chinese writing reflects etymology (‘true sense’) in a way that phonetic writing does not. In Pound’s idealist view of etymology (Li 1986), this rendered the Chinese character vastly superior to Western phonetic script in terms of picture-making. Soon enough however, scholarly studies of logographic writing systems showed that Chinese characters are semantic-phonetic compounds rather than transparent pictures, and Pound’s idyllic conception of Chinese characters as evocative ideograms was severely and justly criticized (Kennedy 1958; cf. DeFrancis 1984).

(Dingemanse 2011:44-45)

In my paper (titled Ezra Pound among the Mawu and published in Semblance and Signification), Pound’s ideas serve as a cautionary tale. I argue that there is a parallel between Pound’s overeager “iconicization” of Chinese characters and the tendency of many linguists to ascribe iconicity to ideophones. One important point of the paper is that there are limits to the iconic representational powers of speech, and that there is reason to be careful in ascribing iconicity to ideophones (pp. 45-46).

Ideophones are not the unproblematically imitative words that many people have made them out to be. There is something about them that makes us want to believe this, no doubt — just like there is something about Chinese characters that makes us want to believe the pictorial story. In my analysis of ideophones, this something is not iconicity, but first and foremost their depictive nature — the fact that they are presented as, or to use a more apt metaphor, framed as depictions.

Three types of form-meaning mappings

It may be useful to describe the development of my own thinking about these matters. Back in 2007 my reading of the ideophone literature suggested that ideophones are simply sound-symbolic words. Over time, with my inventory of Siwu ideophones steadily growing and my grasp of the semiotics of depiction in speech slowly evolving, I came to question simplistic notions of sound symbolism and iconicity in ideophones.

It became clear to me, for instance, that in a language with thousands of ideophones, it would be very difficult for all ideophones to be iconic to the same degree or in the same way. So there had to be different types of iconicity — different ways in which ideophones could evoke sensory imagery. My paper addressed this matter empirically by surveying the Siwu ideophone inventory. The result of this survey was a description of three basic, non-exclusive types of form-meaning mappings in ideophones.

Coerced iconicity

While working on this I also realized that even if we allow for different types of iconic mappings, certain ideophones do not actually seem to be that transparently iconic. How does one iconically map colours, internal sensations, or cognitive states? Is iconicity really the point of ideophones like Siwu fùrùfùrù ‘seeing things in a blur’ or Japanese iya iya ‘with a heavy heart’? It seems unlikely. Have ideophone enthusiasts (native speakers as well as linguists) simply been over-eager in iconicizing ideophones? Doing an Ezra Pound in the domain of sound? If so, it is important to figure out why the form of ideophones is so often identified with their meaning. I argue that it is their depictive nature:

Depiction, rather than iconicity, is what invites people to treat the ideophone as a performance of sensory imagery. An analogy may help to explain this point. Consider the category of objects called paintings. Paintings vary quite widely in the degree to which they are iconic (i.e. show a perceived resemblance to what they depict). And yet there is a distinct interpretive frame we bring to all of them: we tend to view them as depictions rather than read them as texts (Gombrich 2002[1960]; Walton 1973). In a similar way, we may think of ideophones as setting up a depictive interpretive frame, inviting the listener onto the scene and invoking images of being there.
(…)
If we want to invoke iconicity here at all, we should call it COERCED ICONICITY. The depictive nature of the ideophone coerces us into treating the word as an adequate rendition of the depicted event.

(Dingemanse 2011:51)

Coerced iconicity may be a useful concept in discussions of supposed iconicity because it describes a mechanism familiar to us all and realistically locates it in the eye of the beholder. In Peircean terms, it locates iconicity in the interpretants of eager observers rather than solely in properties of the sign-object relationship. Why was it difficult for Pound to resist associating meaning with the shape of Chinese characters? Why does the pictorial view of Chinese characters, thoroughly debunked as it is, keep coming back? One reason may be that there are a number of truly pictorial characters that feed the imagination and make all Chinese characters look like pictures, especially to the untrained eye. This coerces people into treating all characters as pictorial renditions. Why do speakers treat all ideophones as perfectly adequate depictions of sensory imagery? Perhaps all that is needed is a critical mass of transparently iconic ideophones (using the three principles I described), and for the remainder, the framing devices of performative prosody and expressive morphology may be enough to coerce people into treating them as good depictions.

Explanatory leakage

Sapir famously said that all grammars leak. Much the same holds for any grand theory of how linguistic signs, spoken as well as written words, are motivated. (This is the source of my unease with the “big picture” theory of Chinese phonosymbolism by Howell that Mair outlines in his post.) All linguistic systems are the messy, fuzzy products of a long-term interaction of human communicative needs, intersubjective language use, modality-specific features, and the mindless opportunism of evolution (among other factors). In the case of the form and meaning of ideophones, there are many forces tugging at them and shaping them. Although many people like to think of ideophones as prototypically “iconic” words, on reflection, it is clear that the story leaks. Yes, there are clearly iconic structures in ideophones that help guide the imagination, perhaps somewhat like the lines and shading in a naturalistic painting. But some ideophones (many in some languages?) may be more like abstract paintings: depictions that are invested with meaning by eager observers, not necessarily on the basis of information contained within their form.

Often a certain amount of explanatory leakage is more exciting than a neat account. Seeking regularity all the way leads to oversimplification. In some possible world, all Chinese characters are neat pictograms, the Chinese language is phonosemantic in nature, and all ideophones are nice imitative words. This world is not ours, however; and isn’t it far more interesting to investigate the manifold ways in which humans can do cross-modal mappings of form to meaning, and to describe the different processes by which they discern motivation in what to the analyst may look like arbitrary gibberish? Gibberish. Hmm, let me frame that word for you so that you can experience some coerced iconicity on the way out. Gibberish.

References

  1. DeFrancis, John. 1984. The Chinese Language: Fact and Fantasy. Honolulu: University of Hawaii Press.
  2. Dingemanse, Mark. 2011. ‘Ezra Pound among the Mawu: Ideophones and Iconicity in Siwu’. In Semblance and Signification, edited by Pascal Michelucci, Olga Fischer, and Christina Ljungberg, 39-54. Iconicity in Language and Literature 10. Amsterdam: John Benjamins.
  3. Dingemanse, Mark. (in press) “Advances in the cross-linguistic study of ideophones.” Language and Linguistics Compass.
  4. Fenollosa, Ernest. 1920. The Chinese Written Character as a Medium for Poetry. Edited by Ezra Pound. London: Stanley Nott.
  5. Kennedy, G. 1958. ‘Fenollosa, Pound, and the Chinese Character’. Yale Literary Magazine 126, no. 5: 24–36.
  6. Li, Victor P. H. 1986. ‘Philology and Power: Ezra Pound and the Regulation of Language’. boundary 2 15, no. 1/2: 187-210.
  7. Pound, Ezra. 1947. The Unwobbling Pivot and the Great Digest. New York: New Directions.

Now available: The Meaning and Use of Ideophones in Siwu

Yesterday I successfully defended my PhD thesis at the Radboud University Nijmegen. I was promoted to doctor cum laude.

This means that I can now make the thesis officially available to anyone interested. You can find it at thesis.ideophone.org, where you can also inspect the online supplementary materials, listen to audio clips, and check out photos. Or just download the PDF directly. Enjoy!

Also check out these press releases related to the thesis and the defense:

Transcription mode in ELAN

A new version of ELAN, the widely used tool for time-aligned annotation of linguistic data, was released today by the developers, Han Sloetjes and Aarthy Somasundaram. One of its major features is a whole new user interface for high-speed transcription. This interface is the outcome of a process of user consultation and usability testing at the MPI for Psycholinguistics led by Mark Dingemanse, Jeremy Hammond, and Simeon Floyd in close collaboration with the ELAN developers. In this post we outline the most important features of Transcription mode.

Transcription mode

Transcription mode is designed to increase the speed and efficiency of transcription work. The interface is keyboard-driven and minimizes UI actions. All annotations of a certain tier type are displayed in a vertical list for easy visual access. Transcription mode reduces transcription work to the bare essentials: listen, type, listen, type, listen, type.

Note. Transcription mode presupposes that the initial segmentation of the recording is already done. The rationale for this is that the most efficient workflow for transcribing large amounts of linguistic data is a two-step process: first segmenting the recording into turns and attributing each turn to the appropriate speaker (this can be done in Annotation mode or in the special-purpose Segmentation mode), and then transcribing and translating these turns.

1. Setting it up

You can reach Transcription mode via the Options > Transcription mode menu. If you go to Transcription mode for the first time, a Settings dialog will come up. Here you can select the tier types to be used for up to three columns. Note that you select tier types, not individual tiers. This is because Transcription mode displays all annotations on all tiers of a certain type in a vertical column.

For the purposes of this description we will assume that the user is working with a file that has four main tier types: po (practical orthography), tl (literal translation), tf (free translation), and vb (visible behaviour). Our example file contains tiers of these types for two participants, and the overall tier structure looks like this (tier name first, followed by its tier type):

  • A_po po (practical orthography)
    • A_tl tl (literal translation)
    • A_tf tf (free translation)
  • A_vb vb (visible behaviour)
  • B_po
    • B_tl
    • B_tf
  • B_vb

In our example, we choose the type po (practical orthography) as the first column. We can leave it at that if we just want to work on the transcript. Or we can display up to two additional columns next to the primary one. In our example, we’ll add the literal translation type.

For the second and third columns we can only select tier types that are time-aligned with the first using the stereotype “Symbolic Association”. In our example, we can select tl (literal translation) and/or tf (free translation) as second and third columns. We cannot choose the tier type vb (visible behaviour) here, because it is not time-aligned with our primary column.

Having selected the tier types we want, we click “Apply”. Now the chosen tier types are displayed in vertical columns, and the two largest differences from the default Annotation mode become visible: (i) all annotations are displayed vertically (top to bottom) rather than horizontally (left to right), and (ii) columns display all annotations of a certain type. For instance, the po (practical orthography) column displays turns from both speakers A and B.
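To make the column idea concrete, here is a minimal Python sketch of how one column could be assembled from annotations on several tiers of the same type. It is not ELAN code: the `Annotation` class, the `build_column` function, the sample values, and the ordering by start time are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    tier: str        # tier name, e.g. "A_po" (hypothetical data model)
    tier_type: str   # tier type, e.g. "po"
    start_ms: int
    end_ms: int
    value: str


def build_column(annotations, tier_type):
    """Collect all annotations of one tier type, across all tiers.

    Ordering by start time is an assumption of this sketch.
    """
    selected = [a for a in annotations if a.tier_type == tier_type]
    return sorted(selected, key=lambda a: (a.start_ms, a.tier))


# Invented sample data for two participants, A and B.
annotations = [
    Annotation("A_po", "po", 0, 1200, "first turn by A"),
    Annotation("B_po", "po", 1300, 2100, "first turn by B"),
    Annotation("A_vb", "vb", 200, 900, "A nods"),
    Annotation("A_po", "po", 2200, 3000, "second turn by A"),
]

po_column = build_column(annotations, "po")
for row, a in enumerate(po_column, start=1):
    print(row, a.tier, a.value)
```

The point of the sketch is only that a column is defined by tier type, so turns from speakers A and B end up interleaved in one list while the vb annotation is left out.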

Note. Transcription mode presupposes that you use linguistic types to differentiate the types of information in your tiers. Thus the linguistic type of your free translation tier should be different from the linguistic type of your main transcription. This is necessary for any serious corpus work anyway — for instance, ELAN’s multiple layer search also relies on this. If you haven’t been using linguistic types yet, consider investing the time to bring your files up to speed. This will not only let you use Transcription mode, it will also allow complex corpus searches and in general make your data more structured. The best way to enforce proper use of linguistic types across your files is to use a template.
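If you want to check how linguistic types are used in a file, a small script can help. The sketch below assumes that an .eaf file is XML in which each TIER element carries TIER_ID, LINGUISTIC_TYPE_REF and (for dependent tiers) PARENT_REF attributes, and each LINGUISTIC_TYPE element carries LINGUISTIC_TYPE_ID and an optional CONSTRAINTS attribute. The fragment is hand-written and heavily trimmed, and the function names are hypothetical, so verify the attribute names against one of your own files before relying on it.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hand-written, trimmed fragment mirroring the example tier structure.
# A real .eaf file contains much more (header, time slots, annotations).
EAF_SNIPPET = """
<ANNOTATION_DOCUMENT>
  <TIER TIER_ID="A_po" LINGUISTIC_TYPE_REF="po" PARTICIPANT="A"/>
  <TIER TIER_ID="A_tl" LINGUISTIC_TYPE_REF="tl" PARENT_REF="A_po" PARTICIPANT="A"/>
  <TIER TIER_ID="A_tf" LINGUISTIC_TYPE_REF="tf" PARENT_REF="A_po" PARTICIPANT="A"/>
  <TIER TIER_ID="A_vb" LINGUISTIC_TYPE_REF="vb" PARTICIPANT="A"/>
  <TIER TIER_ID="B_po" LINGUISTIC_TYPE_REF="po" PARTICIPANT="B"/>
  <TIER TIER_ID="B_tl" LINGUISTIC_TYPE_REF="tl" PARENT_REF="B_po" PARTICIPANT="B"/>
  <TIER TIER_ID="B_tf" LINGUISTIC_TYPE_REF="tf" PARENT_REF="B_po" PARTICIPANT="B"/>
  <TIER TIER_ID="B_vb" LINGUISTIC_TYPE_REF="vb" PARTICIPANT="B"/>
  <LINGUISTIC_TYPE LINGUISTIC_TYPE_ID="po" TIME_ALIGNABLE="true"/>
  <LINGUISTIC_TYPE LINGUISTIC_TYPE_ID="tl" TIME_ALIGNABLE="false" CONSTRAINTS="Symbolic_Association"/>
  <LINGUISTIC_TYPE LINGUISTIC_TYPE_ID="tf" TIME_ALIGNABLE="false" CONSTRAINTS="Symbolic_Association"/>
  <LINGUISTIC_TYPE LINGUISTIC_TYPE_ID="vb" TIME_ALIGNABLE="true"/>
</ANNOTATION_DOCUMENT>
"""


def tiers_by_type(root):
    """Group tier names by the linguistic type they refer to."""
    grouped = defaultdict(list)
    for tier in root.iter("TIER"):
        grouped[tier.get("LINGUISTIC_TYPE_REF")].append(tier.get("TIER_ID"))
    return dict(grouped)


def extra_column_types(root, primary_type):
    """List types that depend on the primary type via Symbolic Association."""
    type_of_tier = {t.get("TIER_ID"): t.get("LINGUISTIC_TYPE_REF")
                    for t in root.iter("TIER")}
    symbolic = {lt.get("LINGUISTIC_TYPE_ID")
                for lt in root.iter("LINGUISTIC_TYPE")
                if lt.get("CONSTRAINTS") == "Symbolic_Association"}
    eligible = set()
    for tier in root.iter("TIER"):
        parent = tier.get("PARENT_REF")
        own_type = tier.get("LINGUISTIC_TYPE_REF")
        if parent and type_of_tier.get(parent) == primary_type and own_type in symbolic:
            eligible.add(own_type)
    return sorted(eligible)


root = ET.fromstring(EAF_SNIPPET.strip())
print(tiers_by_type(root))
print(extra_column_types(root, "po"))
```

For a real file you would replace `ET.fromstring(...)` with `ET.parse(path).getroot()`. If the transcription and translation tiers turn up under the same type in the first printout, that file is not yet ready for Transcription mode.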

2. Using Transcription mode

Transcription mode is built for high-speed transcription work. It plays automatically so that you can start typing right away. You can hit TAB to replay, and if you finish typing you hit ENTER, which brings you to the next annotation, which is played automatically so that you can start typing right away… and so on. Transcription mode boils down the transcription process to the two most essential actions: listening and typing. Once you’ve set it up, you don’t need to worry about anything else.

You can use Transcription mode to do initial transcription of a segmented recording. For this you would use the simplest, one-column setup. You can also use it to work on translations if you already have transcriptions. For this you would display both tier types side by side. And of course you can do the transcription and translation work in one go. For this you would use the two-column setup and check the option to “Navigate across columns”.

The basic philosophy of Transcription mode is to make things as easy as possible for the transcriber. This is why it displays annotations in a table rather than on a timeline, plays automatically on selection, and moves to the next annotation without requiring additional clicks or key presses. It will also silently create child annotations if they don’t exist yet — merely clicking an empty cell (or moving there using the keyboard) creates an annotation and opens it for immediate editing. The user just has to make sure the relevant tiers exist for all participants, and Transcription mode takes care of the rest.

3. Essential shortcuts

Typing and playing back

  • ENTER saves the current annotation, moves to the next annotation, and plays this new annotation if the automatic playback option is selected. [Three for the price of one!]
  • TAB plays the current annotation. It acts as a play/pause key, so you can press it again to pause playback, and again to continue playing.
  • SHIFT+TAB plays back the current annotation from the start.

Moving around

  • UP and DOWN arrows move up and down within a column.
  • ALT+LEFT and ALT+RIGHT arrows move left and right across columns (and just because we know you’d try this, ALT+UP and ALT+DOWN also move up and down within a column).
  • Remember that ENTER automatically moves to the next annotation. The Navigate across columns setting controls whether you go down within a column or you move across columns (from left to right).
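The effect of the Navigate across columns setting on ENTER can be sketched as a small function. This is an illustration only, not ELAN's implementation: the name `next_cell`, the zero-based (row, column) coordinates, and the behaviour at the end of a row (wrapping to the first column of the next row) are assumptions.

```python
def next_cell(row, col, n_rows, n_cols, navigate_across_columns=False):
    """Return the (row, col) that ENTER moves to, or None at the end of the table."""
    if navigate_across_columns:
        if col + 1 < n_cols:
            return (row, col + 1)   # move right within the same row
        row, col = row + 1, 0       # assumed: wrap to the start of the next row
    else:
        row += 1                    # stay in the column, move one row down
    return (row, col) if row < n_rows else None


# Two-column setup (e.g. po and tl) with three rows:
print(next_cell(0, 0, n_rows=3, n_cols=2))                                # down the column
print(next_cell(0, 0, n_rows=3, n_cols=2, navigate_across_columns=True))  # across the row
```

With the setting unchecked you work through one column from top to bottom; with it checked you finish the transcription and translation of one turn before moving on to the next.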

Using the mouse

  • Clicking on any annotation activates it for editing. The cursor will be placed close to where you clicked and you can edit right away.
  • You can also use the mouse to select part of the waveform for playback. TAB will play/pause the selection.
  • Right-clicking on an annotation gives you the option to jump to Annotation mode. This allows you to finely manipulate annotation boundaries and then return to Transcription mode.

4. The settings explained

Below the video signal and above the waveform you find the normal playback buttons (Play, Play selection, Clear selection). (Though recall that you can simply use TAB for quick playback of the full annotation or the selection.) In this area there are two further options:

  • screen layout. This option determines whether the media and settings are displayed on the left side or on the right side of the screen. Clicking it flips the screen layout. Default: video, audio and settings on the left.
  • loop mode. This option is found to the right of the play/pause buttons. When checked, a selected annotation will loop continuously until a new annotation is selected. Default: unchecked.

Below the waveform you find a couple of further options that you can use to customize the Transcription mode experience.

  • automatic playback of media. This controls whether the annotation is automatically played or not when you arrive at it. Default: checked.
  • show tier names. This controls whether tier names are shown within the list or not. If unchecked, colour coding distinguishes different tiers/participants, and hovering over an annotation will tell you the tier/participant name. (There is an additional choice to show colours in the cells themselves or only in the line number column.) Default: checked.
  • navigate across columns. This controls what annotation you move to after hitting ENTER. If unchecked, you move only within a column (from top to bottom). If checked, you move across columns (from left to right). Default: unchecked.
  • always scroll the current annotation to the center. This mode keeps your current annotation always in the middle of the screen. Default: unchecked.

5. Layout and viewing

The layout of the transcription mode is designed to replicate the best aspects of a word processor and a spreadsheet – all the while allowing you access to the time-aligned video and audio signals.

  • The video column, which also includes the options and waveform, can be placed on the right or left. Press the screen layout button to toggle the video/settings column from left to right. The video can also be detached for viewing independent of the main window, for instance on a secondary monitor.
  • All of the columns are resizable: just mouse click and drag the boundaries to fit your desired widths.
  • You can order columns as you please. So once you have established your types you can then reorder them on the screen simply by dragging them to the desired location.
  • You can zoom in on the video signal in order to focus on a particular part (also available in Annotation mode). This works best with HD footage.
  • Font size of the columns is variable: you can change this in the settings dialog box.

6. Have (quick and easy) fun

We hope you enjoy this addition to the ELAN toolset. It is designed to cut down on the many hours it takes to do detailed transcriptions, and we feel that you will find it an indispensable part of your workflow.

We welcome your comments and feedback!

(This post was co-written by Mark Dingemanse & Jeremy Hammond and appears in slightly different form in the help function of ELAN.)

The LSA Language Anthology survey: some additional data

The LSA asks its members in a survey to choose the most important papers in Language, 1925-2000. Have you ever wondered what might be the most cited ones?

The Linguistic Society of America (LSA) is currently doing a member survey to collect suggestions for an anthology of the most influential and significant articles published in Language. From the survey:

For each volume of the Anthology, we are seeking input on those articles which represent the best scholarship published during that particular period. By “best,” we mean the most influential, the most cited, the most visited in JSTOR, and those considered a must-read for students and scholars of the discipline.

The survey includes some data that is normally hard to come by: most viewed articles from the JSTOR archive of Language. Since I think we can learn some useful things by crowdsourcing this data, I have put it in a publicly editable Google spreadsheet called Language Anthology data.

Looking over the lists, a number of interesting observations could be made. One thing that strikes me is the relatively low number of generativist studies, or conversely, the large number of non-generativist studies. There is more to say, but I don’t want to interpret too much. (Supply your own interpretations in the comments below!)

More data

As a first addition to the spreadsheet, I thought it would be interesting to get some idea of the number of citations of the items listed in the LSA survey. Do articles that are ‘viewed’ or ‘downloaded’ actually get cited? As a rough approximation, I use the “cited by” number from Google Scholar (if anyone has better data for all of the items, feel free to add a column in the spreadsheet!). The citation chart below is generated from the data in the spreadsheet:

Some interesting things emerge here. First, there is the Sacks, Schegloff and Jefferson (1974) paper on turn-taking, which with a whopping 5638 citations must surely be the most cited article in the history of the journal Language. There was an interesting discussion on Funknet the other day (prompted by a question of Fritz Newmeyer) about outside views of linguistics as a discipline. In a contribution to this discussion, Brian MacWhinney noted the following:

Finally, I wish that I could refer to Conversation Analysis as a part of linguistics. I know that I can’t really get away with this, although personally I think it is a part. In any case, I see a lot of interest and respect for CA from areas as diverse as marketing, sociology, politics, aphasiology, and so on.

Seeing statistics like this (and noting that the fourth most cited article on the list, Schegloff et al. 1977 with 1698 citations, is another CA article), the question whether or not CA should be considered part of some (apparently narrowly construed) discipline doesn’t really matter. Clearly, scholars inside and outside of “linguistics” have no trouble finding the results of CA worthwhile enough to cite. That said, I do agree with MacWhinney: if there is to be a true science of human language (a reasonable gloss for “linguistics” I would say), it seems clear to me that the analysis of conversation should form an integral part of it.

Curiously, the next most cited article, Dowty 1991, has a little over two thousand citations, leaving an enormous gap. Here’s something to think about: What will these charts look like for the upcoming decades? What kinds of approaches to language are going to fill that gap in the next fifty years? My bet is on more data-driven approaches: CA, corpus linguistics, large-scale typological studies based on fine-grained datasets, and the like.

An outlier on the other side of the spectrum is Michael Shapiro’s “Sound and meaning in Shakespeare’s sonnets”, which according to Google Scholar is cited by only 3 works. One wonders how it ends up in the JSTOR top 25 downloads, and why it is not getting cited!

Some missing items

Now the LSA survey would not be a member survey if they did not ask their members to supply their own candidate articles for inclusion in the Language Anthology. I have a few of my own (Friedrich’s “Shape in grammar”; Jakobson’s “Grammatical parallelism”; Fromkin’s “Anomalous utterances”; Clark & Gerrig’s “Quotations as demonstrations”; Evans & Wilkins’s “In the mind’s ear”), but perhaps it is more interesting to look at some other widely cited articles that didn’t make it through the LSA selection process.

Here is a Google Scholar search that will produce articles from the period 1925-2000 in Language, sorted (roughly, as with all data in Google Scholar) by number of citations. On that list we find the following widely cited articles that are not present in the LSA survey (though I suspect that many of them should also be in the top 25 downloads from JSTOR):

There are some great and important articles in this list that would in my opinion be worth anthologizing. What are your choices?

Appendix: the lists

Although the data is in the Google spreadsheet, I include the lists from the LSA survey below in a proper bibliographic format. All of this is simply pulled from Zotero (which made it a breeze to get the papers from JSTOR), and users of Zotero will see a little “folder” icon that they can use to import the references into their library. Additionally, the papers are in the public Zotero group “Linguistics”: see the collections 1976-2000 and 1925-1975, respectively.

LSA 1. JSTOR’s most viewed/downloaded Language papers, 1976-2000.

(I’m not sure how this list is ordered; I’m presenting it here as it is presented in the survey.)

  1. Hale, Ken, Michael Krauss, Lucille J. Watahomigie, Akira Y. Yamamoto, Colette Craig, LaVerne Masayesva Jeanne & Nora C. England. 1992. Endangered Languages. Language 68(1). 1-42.  
  2. Birdsong, David. 1992. Ultimate Attainment in Second Language Acquisition. Language 68(4). 706-755.  
  3. Pfaff, Carol W. 1979. Constraints on Language Mixing: Intrasentential Code-Switching and Borrowing in Spanish/English. Language 55(2). 291-318.  
  4. Hopper, Paul J. & Sandra A. Thompson. 1984. The Discourse Basis for Lexical Categories in Universal Grammar. Language 60(4). 703-752.  
  5. Hopper, Paul J. & Sandra A. Thompson. 1980. Transitivity in Grammar and Discourse. Language 56(2). 251-299.  
  6. Dowty, David. 1991. Thematic Proto-Roles and Argument Selection. Language 67(3). 547-619.  
  7. Schegloff, Emanuel A., Gail Jefferson & Harvey Sacks. 1977. The Preference for Self-Correction in the Organization of Repair in Conversation. Language 53(2). 361-382.  
  8. Ladefoged, Peter. 1992. Another View of Endangered Languages. Language 68(4). 809-811. doi:10.2307/416854.  
  9. Lichtenberk, Frantisek. 1991. Semantic Change and Heterosemy in Grammaticalization. Language 67(3). 475-509.  
  10. Meier, Richard P. & Elissa L. Newport. 1990. Out of the Hands of Babes: On a Possible Sign Advantage in Language Acquisition. Language 66(1). 1-23.  

LSA 2. Nine more articles from the top 25 downloads, 1976-2000

Then there is “a list of nine additional articles falling within the top 25 downloaded articles from 1976-2000”.

  1. Nunberg, Geoffrey, Ivan A. Sag & Thomas Wasow. 1994. Idioms. Language 70(3). 491-538.  
  2. Tannen, Deborah. 1982. Oral and Literate Strategies in Spoken and Written Narratives. Language 58(1). 1-21.  
  3. Chambers, J. K. 1992. Dialect Acquisition. Language 68(4). 673-705.  
  4. Shapiro, Michael. 1998. Sound and Meaning in Shakespeare’s Sonnets. Language 74(1). 81-103.  
  5. Traugott, Elizabeth Closs. 1989. On the Rise of Epistemic Meanings in English: An Example of Subjectification in Semantic Change. Language 65(1). 31-55.  
  6. Kay, Paul & Chad K. McDaniel. 1978. The Linguistic Significance of the Meanings of Basic Color Terms. Language 54(3). 610-646.  
  7. Bybee, Joan L. & Dan I. Slobin. 1982. Rules and Schemas in the Development and Use of the English Past Tense. Language 58(2). 265-289.  
  8. McWhorter, John H. 1998. Identifying the Creole Prototype: Vindicating a Typological Class. Language 74(4). 788-818.  
  9. Gundel, Jeanette K., Nancy Hedberg & Ron Zacharski. 1993. Cognitive Status and the Form of Referring Expressions in Discourse. Language 69(2). 274-307.  

LSA 3. Six articles from the period 1925-1975

It is unclear why there wouldn’t also be a top 10 or even top 25 list of articles for the 1925-1975 period, but anyway, the survey gives only the following six, in this order:

  1. Sacks, Harvey, Emanuel A. Schegloff & Gail Jefferson. 1974. A Simplest Systematics for the Organization of Turn-Taking for Conversation. Language 50(4). 696-735.  
  2. Curtiss, Susan, Victoria Fromkin, Stephen Krashen, David Rigler & Marilyn Rigler. 1974. The Linguistic Development of Genie. Language 50(3). 528-554.  
  3. Chomsky, Noam. 1959. A Review of B.F. Skinner’s Verbal Behavior. Language 35(1). 26-58.  
  4. Haugen, Einar. 1950. The Analysis of Linguistic Borrowing. Language 26(2). 210-231.  
  5. Hays, David G. 1964. Dependency Theory: A Formalism and Some Observations. Language 40(4). 511-525.  
  6. Ferguson, Charles A. & Carol B. Farwell. 1975. Words and Sounds in Early Language Acquisition. Language 51(2). 419-439.  

Now online: fieldmanuals.mpi.nl


We’ve been working on this for quite some time, and we’re excited to go live now: the L&C Field Manuals and Stimulus Materials. This is a website providing access to many of the field manuals produced over the years by the Language and Cognition Group at the Max Planck Institute for Psycholinguistics. As the front page explains:

This site contains a bonanza of material for the field elicitation of semantics and the field collection of verbal behaviour. These are unique resources that have been compiled over nearly twenty years of investigation of under-studied languages by the Language & Cognition Group at the Max Planck Institute for Psycholinguistics. During this period we collectively pioneered the field of semantic typology.

Many entries from these manuals have been circulating informally for years and they have been used by field workers all over the globe. With this archive we offer a centralized, easy to use resource. We’ve started by making available the most recent couple of years. Over the coming months, we will be uploading older manuals and materials, but you can start by checking out the wealth of materials already there — from guidelines on Building a Corpus of Multimodal Interaction in your Field Site to our cross-cultural Synaesthesia Pilot, and from the recent Language of Perception project to the classic Put Project: The Cross-Linguistic Encoding of Placement Events.

Transcribing linguistic data: bottlenecks and one way to speed up

Transient Languages & Cultures published a nice post by Peter Austin last month on the question of how much time it takes to transcribe linguistic data. Working under tight time constraints during some recent fieldtrips, I found one way to speed the process up. It still takes an awful lot of time, but here goes.

In my experience, two very important bottlenecks in transcription, especially of conversational material, are (1) initial recognition (what exactly was said?), and (2) writing it down (how quickly can this be written down in the orthography you have chosen?). In my field situation (a Siwu-speaking village in eastern Ghana with few literate and even fewer computer-literate people), I don’t have someone (yet) who could do the actual transcribing, which is usually done directly in ELAN, so I am responsible for bottleneck #2 (getting it into the computer).

As regards the first bottleneck, for a native speaker it is much, much easier to ‘hear’ a fuller form of conversational speech than for a non-native speaker, so it makes sense to get that kind of help for bottleneck #1. Initially therefore, I would sit down with a consultant, play a conversation utterance by utterance (I would have done the segmentation in ELAN beforehand), and have the consultant repeat the speech while I wrote it down. For bottleneck #2 reasons this often meant replaying the same utterance multiple times. I soon realized that this was a waste of time for my consultant, who would patiently repeat two to four times what he already got right the very first time. Essentially I was using him as a tape recorder, rewinding and replaying his careful repetitions to make up for my deficiencies in short-term memory and typing speed! So I looked for a way to speed up the process.

But is it grammar?

Finally, some commentaries on the Evans & Levinson paper are trickling through the blogosphere. Nigel Duffield’s “Roll up for the mystery tour” is one. Unfortunately, the comments on that post are closed. I have a question, so let me just post it here, where the comments are open.

The commentary is entertainingly written. Basically, it agrees with E&L’s rallying cry for the need to describe and recognize diversity; but it argues that, due to certain misconceptions about UG on the part of E&L, “Universal Grammar … walks free from the courtroom.” The first point that E&L get wrong about UG according to Duffield is the status of the notion ‘subject’. He has an interesting quote from McCloskey to support this point; and he also points out that, ironically, the topic/agent/pivot distinction championed by E&L is in fact ‘commonly accepted, if differently formalized’ in (some) UG quarters.

But it is really the second point, about the ontology of UG, that piqued my interest. Duffield argues that “UG is a theory of the initial state, which Chomsky now terms FL (Faculty of Language), not of any particular endstate grammar (LEnglish, LJiwarli, LPiraha, etc.,).” But that is not all; he adds, “The problem is not merely that UG is not claimed to be a property of final state grammars, but that it need not even be definitional of these grammars.” (emphasis mine, MD). And then comes the crucial point, for me as well as for Duffield:

the crucial point here is that facts about attained, endstate grammars bear only tangentially on theories of UG. Baldly stated, the absence of Language Universals—granting for the sake of argument that these are a ‘myth’—does not imply the absence of UG.

To be honest, this baffled me. Not so much because I disagree, but rather because there is so little left to agree or disagree with! I have wondered before (in an admittedly tongue-in-cheek post on the ‘grammar of the gaps’) about the gradual shift of UG to ever more abstract territories. Compare, for example, the switch-like parameters of the Principles & Parameters approach (some of which clearly refer to concrete (endstate) grammatical phenomena) with the most recent claims that recursion and some form of Merge should be sufficient for the FLN (Hauser, Chomsky & Fitch 2002). Duffield himself quotes Chomsky to the effect that ‘It is a coherent and perhaps correct proposal that the language faculty constructs a grammar only in conjunction with other faculties of mind.’ (Chomsky 1975:41). In my earlier post, I mention the question posed to Adele Goldberg by Jan-Wouter Zwart at the Nijmegen Lectures 2007. Zwart, in search of common ground between generative grammar and construction grammar, asked ‘Is it conceivable that underlying the structure of constructions are abstract principles of a simple kind, rooted in universal properties of human cognition?’ Goldberg’s answer was affirmative, but as I note in my discussion, the statement is sufficiently general to engender agreement from almost everyone.

But is it grammar?

The big question such abstract conceptions of UG raise for me is this: but is it grammar? That is, if it is indeed the case, as Duffield holds, that UG is not a property of endstate grammars; that it is not even definitional of these grammars; and moreover that ‘attained, endstate grammars bear only tangentially on UG’, what exactly is UG supposed to be, how do we go about empirically validating the UG hypothesis, and why are we calling it “universal grammar”?

My worry goes deeper than the apparent misnomer (though I do wonder whether the theory is not in need of a new name, if ‘endstate’ grammars have so little to do with it; but then I’m probably overlooking useful connotations of ‘grammar’ for the initial state). I find it difficult to see (1) how such an abstract concept could be isolated from more general cognitive abilities, and (2) why one would want to isolate it a priori. To isolate it —i.e. to show that UG is the ‘language faculty’ in some relevant sense— would one not need to show that its core business is language? (But how would one go about that if it doesn’t necessarily show in ‘endstate’ grammars?) And would one not need to show, conversely, that more general cognitive abilities cannot take care of language — in other words, that UG is necessary to explain (aspects of) language usage and language structure?

Excavating UG

A final issue, prompted by another statement from Duffield’s commentary:

No matter how deep one digs into mature grammatical systems, there is no logical reason to expect that one will excavate UG in any recognizable form, any more than one should discover universal principles of embryology through an in-depth study of mature organisms.

This does raise the question of what linguistics as a science is looking at. Following the analogy here (which is always a dangerous thing to do, but then, it is perhaps a dangerous analogy), Duffield seems to say that UG is to language structure what universal principles of embryology are to mature organisms. That would imply that UG is not about language structure (since digging into language structure is not going to yield UG ‘in any recognizable form’) but about the early ontogenesis of language in acquisition. I would agree that acquisition is of central importance (though again, I don’t see why we shouldn’t try and see how far we get with (1) domain-general cognitive abilities and (2) a socially grounded approach, before assuming there has to be a non-trivial ‘language organ’). But what of the flood of generativist literature that does dig deep into ‘mature grammatical systems’, purporting to learn things about UG (presumably in recognizable form)? Does this literature bear more than a tangential relation to the notion of UG espoused here?

Now, I expect there are interesting answers to these questions. No doubt I have overlooked some important ramifications, and perhaps I, too, have mistaken the ontology of UG. If so, set me straight! Comments are open.

References

  1. Chomsky, Noam. 1975. Reflections on Language. 1st ed. New York: Pantheon Books.
  2. Evans, Nicholas, and Stephen C. Levinson. 2009. The Myth of Language Universals: Language diversity and its importance for cognitive science. Behavioral and Brain Sciences 32: 429-492. DOI: 10.1017/S0140525X0999094X
  3. Hauser, Marc D., Noam Chomsky, and W. Tecumseh Fitch. 2002. The Faculty of Language: What Is It, Who Has It, and How Did It Evolve? Science 298: 1569-1579.

The clay tablet tradition of African comparative linguistics

Found this gem in a review of Paul de Wolf’s (1971) The Noun Class System of Proto-Benue-Congo:

This work falls within the ‘clay tablet’ tradition of African comparative linguistics, and, like other things in the same tradition (Meinhof, Greenberg), it has the properties of being inscrutable and yet at the same time, in broad outline, convincing. The two together make an infuriating whole. (Kelly 1973:716)

Kelly goes on to list some good things and some major problems about the book; unfortunately, the problems are much bigger than the good things in his opinion. His final paragraph is also worth quoting for the subtle (and not so subtle) critique ingeniously giftwrapped in a counterfactual:

Anyone interested in African comparative linguistics need not regret 50 shillings spent on this monograph, which represents a good deal of painstaking work, more than actually appears between the covers. Used in conjunction with the previous publications of the Benue Congo section of the West African Linguistic Society, it provides a mass of data together with some attempt at a historical overview. But the price is not, alas, 50 shillings. It is £7.55 at this time of writing.

References

  1. De Wolf, Paul P. 1971. The Noun Class System of Proto-Benue-Congo. The Hague: Mouton.
  2. Kelly, John. 1973. Review of The Noun Class System of Proto-Benue-Congo. Bulletin of the School of Oriental and African Studies, University of London 36(3). 716-718.

A short review of Talking Voices (2nd ed)

Language in Society just published a book note by me on the second edition of Deborah Tannen’s well-known book Talking Voices. Here is the pdf.

In the review I am slightly critical of this classic for three reasons. First, for a second edition of a work that appeared two decades ago, it is very thin on updates and revisions. Second, it still focuses on the acoustic signal only (thereby overlooking a wealth of work on gesture and multimodal interaction that has appeared since the first edition). Third, despite its general claims, Talking Voices limits itself mainly to various Anglophone ways of speaking (excepting some Greek examples). The Greek examples (which derive from an interesting 1983 paper) actually point to the relevance of a widespread linguistic resource that happens not to be very common in either the Greek or the Anglophone cultures discussed: ideophony. I argue that ideophones are immediately relevant to ‘repetition, dialogue, and imagery’ (the subtitle of TV), and thus to core themes of Tannen’s work (see also Nuckolls 1992, 1996).

Here is the conclusion:

The strength of Tannen’s book lies in its insightful analysis of the auditory side of conversation. Yet talking voices have always been embedded in richly contextualized multimodal speech events. As spontaneous and pervasive involvement strategies, both iconic gestures and ideophones should be of central importance to the analysis of conversational discourse. Unfortunately, someone who picks up this second edition is pretty much left in the dark about the prevalence of these phenomena in everyday face-to-face interaction all over the world.

Should Tannen have looked at gesture and ideophones? Of course every researcher has to make general choices and every published piece of scientific work is by definition incomplete. So I don’t think there’s an issue of ‘should have’ — but I do think it is unfortunate for the 2nd edition to miss out on these phenomena, because they would have offered many interesting and helpful illustrations of the book’s themes.

References

  1. Dingemanse, Mark. 2010. Review of Tannen, Deborah, Talking Voices: Repetition, Dialogue, and Imagery in Conversational Discourse (2nd ed.). Language in Society 39, no. 1: 139-140.
  2. Nuckolls, Janis B. 1992. Sound Symbolic Involvement. Journal of Linguistic Anthropology 2, no. 1: 51-80.
  3. Nuckolls, Janis B. 1996. Sounds Like Life: Sound-Symbolic Grammar, Performance, and Cognition in Pastaza Quechua. New York: Oxford University Press.
  4. Tannen, Deborah. 1983. “I Take Out the Rock-Dok!”: How Greek Women Tell about Being Molested (and Create Involvement). Anthropological Linguistics 25, no. 3: 359-374.
  5. Tannen, Deborah. 2007. Talking Voices: Repetition, Dialogue, and Imagery in Conversational Discourse. 2nd ed. Studies in Interactional Sociolinguistics 25. Cambridge: Cambridge University Press.

Basquekpafu

The Basque word for their language is Euskara or Euskera, earlier Heuskara. The first part of this word is the Togo R. word for “Akpafu”, Likpe be-fu “Akpafu”, Bowili o-vu-ne “Akpafumann”, Santrokofi o-fu “Akpafumann”, Akpafu ka-wu, ka-’u “Akpafu”. The early initial Basque h is from k, as can be seen from ka-wu, ka’u. The a has changed to e in this lexeme. The consonant between e and u has been lost. Basque lacks the semivowel w, which drops out here in Akpafu ka’u. See Lafon (1960 : 92) for confirmation from placenames etc.: Ausci, Aoiz, Auch.

The second part of the word, ka or ke is a word for “speak”, Niger-Congo gue “voice, language”, Ewe, Ga gbe “voice”, Agni guere “language, speech”, Yoruba i-gbe “loud cry”, Gbari e-gwe, e-gbe “mouth”. The e is for original a in this word. Niger-Congo e is secondary. Compare Niger-Congo ka, ke, k’e “to speak”, which is related. The final syllable -ra is the Niger-Congo article. No clearer proof could be found that the Basques were originally the Akpafu!

Thus says Mr. GJK Campbell-Dunn “M.A. (NZ), M.A. (Camb.) Ph.D.” in a most interesting document titled “Basque as Niger-Congo”. (Just to remind you, Akpafu is another name for Siwu, the language I’ve been doing fieldwork on over the last three years.) I mentioned this story over a year ago in the comments of an excellent post over at Glossographia titled Debunking and de-Basque-ing, but I never got around to posting about it here. In his post, Stephen Chrisomalis notes that “There is probably no culture or language that has attracted more pseudoscientific attention than Basque.”

I’m not intent on debunking Campbell-Dunn’s story here; I think the quotation above stands just fine on its own. But I do want to draw attention to the irony of this particular case. There you are, author of such groundbreaking works as The African Origins of Classical Civilisation, Maori: The African Evidence, and Who were the Minoans?: an African answer. You now want to solve the Basque enigma once and for all, and since the general thrust of your work is to link everything to Africa one way or another you set out to discover that Basque is in fact a Niger-Congo language. A look at the rich lexical material in Westermann (1927) provides ample inspiration. Let’s pick one of the Togo Remnant Languages, you think — after all, Basque is sort of remnant too. Akpafu. Euskara. Hey, why not. Let’s just see what we can do… no-one’s going to notice, right?

Well, I noticed. And I just want to say it loud and clear: Graham Campbell-Dunn’s work is crackpot science. Don’t believe it; don’t even read it. Siwu and Euskara are fascinating languages that deserve serious research. But they are most certainly not related. Although… come closer, I have to tell you a secret…

References

  1. Ibarretxe-Antuñano, Iraide. 2006. Sound Symbolism and Motion in Basque. Lincom Europa.
  2. Westermann, Diedrich. 1927. Die Westlichen Sudansprachen Und Ihre Beziehungen Zum Bantu. Berlin: In kommission bei W. de Gruyter & co.