I have been blogging at The Ideophone since 2007, and not all of it has been as ephemeral as my PhD promotor once feared. Over the past week I have worked with Rogue Scholar to archive selected content from The Ideophone and make it more durably accessible. This post documents the process and some of the choices made.
Interjections are, in Felix Ameka’s memorable formulation, “the universal yet neglected part of speech” (1992). They are rarely the subject of historical, typological or comparative research in linguistics, and they are notably underrepresented in descriptive grammars. As grammars are the main source of data for typologists, this is of course a perfect example of a self-reinforcing feedback loop. How can we break this trend?
There is a minor industry in speech science and NLP devoted to detecting and removing disfluencies. In some of our recent work we’re showing this adversely impacts voice user interfaces. Here I review a case where the hemming and hawing is the point — and where removing it adversely impacts our ability to make sense of what people do in interaction.
The last time I blindly accepted an invitation to speak was in 2012, when I was invited to an exclusive round table on the future of linguistics. As a fresh postdoc I was honoured and bedazzled. When the programme was circulated, I got a friendly email from a colleague asking me how I’d ended up there, and whether I thought the future of linguistics was to be all male. Turns out the round table was exclusive in more than one sense.
This is the second part in a two-part series of peer commentary on a recent preprint.
One of the benefits of today’s preprint culture is that it is possible to provide constructive critique of pending work before it is out. This post is written in that spirit.
In a recent BBS paper, Clark & Fischer propose that people see social robots as interactive depictions and that this explains some aspects of people’s behaviour towards them. We note that they leave unexamined the notion of “social” in social robots: the question of how technologies like this become enmeshed in human sociality.
A lot of our recent work revolves around working with conversational data, and one thing that’s struck me is that there are no easy ways to create compelling visualizations. In the Elementary Particles of Conversations project we’re aiming to change that. Here’s a sneak peek.
It’s easy to forget amidst a rising tide of synthetic text, but language is not actually about strings of words, and language scientists would do well not to chain themselves to models that presume so. For apt and timely commentary we turn to Bronislaw Malinowski.
We don’t generally see PhD dissertations as an exciting genre to read, and that is wholly our loss. As the publishing landscape of academia is fast being homogenised, the thesis is one of the last places where we have a chance to see the unalloyed brilliance of up and coming researchers. Let me show you using three examples of remarkable theses I have come across in the past years.
Sketches, visualizations and other forms of externalizing cognition play a prominent role in the work of just about any scientist. It’s why we love using blackboards, whiteboards, notebooks and scraps of paper. I rarely rave about tools (to each their own, etc.) but here I write about the Remarkable, an e-paper device that has changed my habits for the better in several ways: I’ve been reading more, taking more notes, writing more, and also doodling and sketching more. I would describe it as a distraction-free piece of technology with just the right affordances.
Over two years ago I wrote about the unstoppable tide of uninformation that follows the rise of large language models. With ChatGPT and other models bringing large-scale text generation to the masses, I want to register a dystopian prediction: this enables a whole new form of monetization.
The construction of gothic cathedrals like Chartres was governed not by blueprints but by “talk, tradition, and templates” — at least that is what Turnbull has compellingly argued. When you come across such a neatly alliterative triad, there are two ways you can go. You can adopt the terms in an unexamined way and rely on their alliterative power. Or you can go meta and think critically about what it takes to make a point that is as compelling as this in both form and content. See, and I like that second move a lot more.
DALL-E, a new image generation system by OpenAI, does impressive visualizations of biased datasets. A widely circulated PR animation features a meme-like koala dunking a baseball leading into an array of old white men, representing at one blow the past and future of representation and generation. This post jumps from reflections on techbros to the erasure of human labour in Cosmopolitan’s rushed “first AI magazine cover”.
Few historical maps of Ghana’s Volta and Oti regions have been invested with so much political and sociohistorical meaning as Hans Gruner’s 1913 map of the Togo Plateau. Gruner, stationed for over twenty years at Misahöhe in present-day Togo, was a long-time colonial administrator known as much for his ethnographical and historical knowledge of the area as for being a petty tyrant with a powerful grip on ‘his’ district. The map is obscure and hard to find, and I make available some digital versions here.
There is a considerable halo-effect attached to JIFs, whereby an article that ends up in a high IF journal (whether by sheer brilliance or simply knowing the right editor, or both) is treated, unread, with a level of veneration normally reserved for Wunderkinder. Usually this is done by people totally oblivious to network effects, gatekeeping and institutional biases.
Worth reading: a group of young medics protests against the marketing contest that, according to them, narrative CVs can degenerate into. It is the latest contribution to the Erkennen & Waarderen (Recognition & Rewards) debate. But nothing is what it seems. On evidence-based CVs, quality & quantification.
A preprint claims that “ideas from theoretical linguistics have played no role in [NLP]”. Outside the confines of Chomskyan linguistics folks have long been working on incorporating storage, retrieval, gating and attention in theories of language, with direct relevance to computational models. The only way to give any content to the claim is by giving the notion “theoretical linguistics” the narrowest conceivable reading.
I have other things to do but one day I’ll enlarge on the insidious effects of elevating this cursed little histogram of “Research output per year” as the single most important bit of information about academics at thousands of universities that use Elsevier Pure. Consider this rant my notes for that occasion.
With Times Higher Education writing about citation gaming and hyperprolific authors (surely not unrelated) I hope we can save some of our attention for what Uta Frith and others have called slow science. On that note, consider this: Team science is (often) slow science.