Sounding out ideas on language, vivid sensory words, and iconicity

Done well: WALS Online

Note: An updated version of this review was published in eLanguage on July 15th, 2008.

A common dashboard sticker in Ghanaian taxis has it that “If it must be done, it must be done well”, where ‘done well’ cleverly doubles as a brand name. This is largely irrelevant except by way of introducing WALS Online, the web version of the World Atlas of Language Structures, which really has been done well.

The massive 2005 volume and the somewhat bumpy interface of the interactive maps on the accompanying CD-ROM have been transformed into a slick web interface with all sorts of clever stuff going on behind the scenes. At a time when an increasing number of print sources are thrown online simply in the form of scans or huge PDF files, it is refreshing to see what true adaptation to the medium of hypertext can bring us. One consequence of this is that WALS Online, rather than a reprint or a second edition of WALS 2005, has become a separate publication, edited by the same team but published by the Max Planck Digital Library.

Features and languages

WALS Online is a website consisting of five main parts. The first part, Features, functions as an index to the 142 maps and chapters of the original edition. The opening page of each feature is merely a configuration screen from which one can navigate to the chapter text or map, change the indicators used on the map, or select another feature for combined display. The chapter text is beautifully laid out, with an eye for good web typography. A minor issue is that after using the atlas for some time, the configuration screen starts to feel like an unnecessary barrier between the index and the texts and maps. It might have been better to make the content more directly accessible from the main index of features.

The second part, Languages, provides multiple interfaces to the languages that comprise the WALS dataset. Languages are indexed by name, by language family, and by country. Moreover, under ‘Choose by region’, a nice implementation of Google Maps enables the user to display the languages from any rectangular piece of the earth’s surface simply by dragging sliders on a world map (or by manually inputting latitudes and longitudes). For unclear reasons the display is limited to the first 100 results of the query; this limitation seems a bit out of place since several other types of requests (listing all Niger-Congo languages, for example) easily result in more than 100 languages being displayed on the map. Things missing here are (1) numbers of languages — it would be trivial to display the number of languages included in WALS from a given country, family, or region on the relevant pages — and (2) a big overview map of all the languages included.

References and authors

The third part is a database of all 5584 references perused in extracting the feature values for the individual languages. This part of the website is extensively cross-referenced from both the Feature and the Language pages. My only gripe with it is that apart from the cross-referencing, the sole interface offered to explore this part of the website is a search screen. Even if a list of all the references would be too long, it would have been nice if references could at least be browsed by language family and by language.1 The search screen, additionally, isn’t very generous in the hints it provides, blurting ‘You have not provided enough search criteria!’ at various creative search attempts. But that’s just me wanting too much, probably; for general purposes, the search works very well and the display of search results and individual citations is also perfectly fine. (More on that below.)

One consequence of putting all the references in a central database is that users are dependent on the quality of the cross-referencing job. On the whole, this job seems to have been done very well. Nonetheless, it was not too difficult to find an unlinked reference included in the database (Darwin 1878, mentioned but not linked in chapter 142) and, worse, an unlinked reference which is wholly absent from the database (Zeshan 2004 in chapter 140). Hopefully these glitches can be corrected in due course.

The fourth part is simply an index of all the authors that coded features and wrote the chapter texts, with links to the features. No biographical information is provided. The fifth part is called Newsblog. The link leads to messages in the category ‘News’ on a weblog that at the same time functions as a place where comments pertaining to individual Features/Chapters can be left. To that end, every feature page includes a link ‘discuss’ which leads to a post on the blog. This is an innovative way of soliciting comments. There are two more links in the main navigation bar: one leads to a contact page, and one leads to an online Help feature.

So much for the interface. What about the stuff behind the scenes?

Under the hood

Behind the scenes of any web application is interesting stuff that average users need not worry about, but that is the foundation of usability and extensibility. First a small gold nugget: a downloadable KML file is provided for each page that includes a map. This is a nice touch which really characterizes the great attention to detail that makes using WALS Online such a pleasant experience.
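To give an idea of what such a KML file boils down to, here is a minimal Python sketch that writes one Placemark per language. It is not how WALS Online generates its files; the function name `build_kml`, the OGC KML 2.2 namespace, and the sample language name and coordinates are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the OGC KML 2.2 namespace; the files served by WALS Online may differ.
KML_NS = "http://www.opengis.net/kml/2.2"


def build_kml(languages):
    """Return a KML document (as a string) with one Placemark per language.

    `languages` is an iterable of (name, latitude, longitude) tuples.
    """
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lat, lon in languages:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML lists longitude first, then latitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Hypothetical sample data, not taken from the WALS dataset.
print(build_kml([("Example language A", 6.5, 0.5)]))
```

A file like this can be opened directly in tools such as Google Earth, which is what makes the per-page download handy.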

The reference database is an example of how an online bibliography should be done.2 Not only is it fully searchable, but every single citation can also be exported to various formats. A very neat feature, invisible to most users, is the embedding of bibliographic data (in the COinS format) on individual reference pages. This allows OpenURL resolvers to look up the citation online or in specified databases. It also allows users of clever research tools like Zotero (review here) to directly save the citation to their library. A logical extension of this feature would have been to provide a COinS span for each individual feature/chapter, to make those as easily citable as the references from the database.
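For readers curious what a COinS span looks like, the following Python sketch builds one for a book-like publication. The function `coins_span` and its parameters are hypothetical, and the selection of fields is my own; the only fixed parts are the class name Z3988 and the OpenURL key/value pairs packed into the title attribute. This is a sketch of the general technique, not the code WALS Online uses.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(title, editor_last, year, publisher):
    """Return an empty COinS <span> describing a book-like publication."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),
        ("rft.btitle", title),
        ("rft.aulast", editor_last),
        ("rft.date", str(year)),
        ("rft.pub", publisher),
    ]
    # URL-encode the key/value pairs, then HTML-escape the result so that
    # the "&" separators are valid inside an attribute value.
    context_object = urlencode(fields)
    return f'<span class="Z3988" title="{escape(context_object, quote=True)}"></span>'


print(coins_span(
    "The World Atlas of Language Structures Online",
    "Haspelmath",
    2008,
    "Max Planck Digital Library",
))
```

Because the span is empty, it adds nothing visible to the page; only tools that look for the Z3988 class will pick it up.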

Another nice feature which may go unnoticed (though not unused) by many visitors is the URL layout, i.e. the shape of the web addresses of the individual pages. Most of the pages have nice, clean URLs without meaningless clutter like index.php?&page=bla&id#=etc. Even individual citations in the reference database have their own URLs (e.g. http://wals.info/refdb/record/Ameka-1991). This is a big plus, since web-savvy users tend to think of the URL as another interface to the website (see URL as UI).

HTML and CSS

(Most readers will want to skip this section, which nags about unimportant details.) Amidst all the greatness, I could identify one missed opportunity. (X)HTML provides the possibility to ‘tag’ language data with the corresponding ISO code of the language. This is a nice feature which, if more widely and rigorously used, would bring us closer to usable web corpora. Regrettably, WALS Online does not code its language data like that. Instead, non-semantic class attributes are used in the CSS to specify widespread Unicode fonts. That part of the CSS, for that matter, is hugely redundant and could have been done a lot more economically and transparently.
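As a rough illustration of what such tagging amounts to, here is a small Python sketch that wraps a piece of language data in an element carrying a lang attribute. The helper `tag_language` is hypothetical and the example text is a placeholder; ‘ee’ is the two-letter ISO 639-1 code for Ewe.

```python
from html import escape


def tag_language(text, lang_code, element="span", xhtml=False):
    """Wrap `text` in an element that declares its language.

    With xhtml=True the xml:lang attribute is added alongside lang.
    """
    code = escape(lang_code, quote=True)
    attrs = f'lang="{code}"'
    if xhtml:
        attrs += f' xml:lang="{code}"'
    return f"<{element} {attrs}>{escape(text)}</{element}>"


# Placeholder text; a real page would carry the actual example sentence.
print(tag_language("example sentence goes here", "ee"))
print(tag_language("example sentence goes here", "ee", xhtml=True))
```

Once the data is tagged this way, a stylesheet or a corpus crawler can pick out all material in a given language without guessing from class names.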

Talking about transparency, most pages are coded in reasonably clean HTML 4.01 Transitional markup. There are several markup problems, however: empty elements resulting from sloppy template coding (e.g. an empty ul ‘altnames’ when no alternate name is available, as in the row ‘Routledge’ on the page for Ossetic); invalid attributes deriving either from the underlying TurboGears application or from AJAX-related stuff (like div ex:role="viewPanel"); and simple coding oversights such as id attributes that start with a digit, even though an id must always start with a letter. The ‘Alternate names’ markup, by the way, is very messy: it uses a table for unclear reasons, and within that table sits a redundant unordered list. And while we’re at it, it is also a bit unsatisfactory that h3 elements carry a class attribute heading-2. But I probably shouldn’t be nitpicking like that; overall, the HTML is perfectly functional and good for what it needs to do: deliver a compelling interface to the users of WALS Online.
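The id rule is easy to check mechanically. The Python sketch below scans a snippet of markup and reports id values that do not fit the HTML 4.01 pattern (a letter, followed by letters, digits, hyphens, underscores, colons, or periods). The class and function names are my own, and the sample markup is made up rather than copied from WALS Online.

```python
import re
from html.parser import HTMLParser

# HTML 4.01 ID tokens: a letter, then letters, digits, "-", "_", ":" or ".".
ID_TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9_:.\-]*")


class IdChecker(HTMLParser):
    """Collect (tag, id) pairs whose id value breaks the HTML 4.01 rule."""

    def __init__(self):
        super().__init__()
        self.invalid_ids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and not ID_TOKEN.fullmatch(value or ""):
                self.invalid_ids.append((tag, value))


def find_invalid_ids(markup):
    """Return a list of (tag, id) pairs with invalid id values."""
    checker = IdChecker()
    checker.feed(markup)
    checker.close()
    return checker.invalid_ids


# Made-up sample markup for illustration.
sample = '<div id="3col"><p id="intro">text</p><ul id="altnames"></ul></div>'
print(find_invalid_ids(sample))
```

A validator will of course catch this and much more; the point is only that the oversight is cheap to detect before pages go live.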

Part of this compelling interface is also the use of AJAX in many places to minimize whole-page reloads and to streamline the UI experience. AJAX is used, for example, in providing real-time search suggestions; in retrieving the languages for a given feature value; and in pulling together the feature list on the main Features page.3

Conclusion

In conclusion, then, I simply want to reiterate what I started with: WALS Online is a formidable linguistic resource done well. It bears all the hallmarks of a well-executed web application that is here to stay for years to come. The blurb on the book version read ‘I suspect that many linguists will not be able to resist curling up with this massive volume on rainy days just for the fun facts.’ I suspect the same holds for this online version. Why not make yourself a nice cup of tea and enjoy the World Atlas of Language Structures Online?

(Hat tip: Lev Michael at Greater Blogazonia)

References

  1. Haspelmath, Martin, Matthew S. Dryer, David Gil, and Bernard Comrie, eds. 2005. The World Atlas of Language Structures. Oxford/New York: Oxford University Press.
  2. Haspelmath, Martin, Matthew S. Dryer, David Gil, and Bernard Comrie, eds. 2008. The World Atlas of Language Structures Online. Max Planck Digital Library.

Notes

  1. Links to the references are provided on individual language pages, but this is not the same as browsing a list of references for a given language.
  2. It is not fully free from errors, though; for example, the editor of the 2004 volume Coordinating Constructions is cited in one reference as “Haspelmath, M.” and in another as “Haspelmath, Martin”. You’ll see what I mean when you search for ‘Haspelmath’.
  3. That one is a bit over the top BTW, as there is now a visible delay before the list of features is displayed on the main Features page. It would have been better to just pull the list together with the HTML as the list is relatively static anyway. If it were simply coded in a table, the browser would be displaying the first rows before the user could say AJAX.

7 responses to “Done well: WALS Online”

  1. thanx for the report – almost saves us the work of writing a better help! i’ll see to include the improvements you mentioned; but the big overview map of all languages may have to wait: at the moment most browsers’ javascript implementations have a hard time dealing with 2500 markers on a google map.

  2. oh, and some background about the reference linking: the links from chapter texts to references have so far been inserted automatically, which isn’t perfect as you noticed. it’s even worse for the linking of languages, genera and family. that’s why we plan on polishing the chapter texts manually over time (see the milestone).

    another problem regarding the reference linking: so far, in the reference database we have only the references cited for individual value assignments from the cd-rom. the set of references cited in the chapter texts is different from this, although largely overlapping. this is why you find references cited in chapter texts which are not present in the reference database.

    it’s actually even more complicated: since the two sets of references (from book and cd-rom) are different, it may even happen that the same citation keys (e.g. “Dryer 1998b”) refer to different references.

    so essentially, it isn’t perfect yet. but we hope to improve and are thankful for bug reports.

  3. and once more (i’m reading the post in stages):

    what do you mean by “tagging language data”? tagging the glosses/examples in the chapter texts? i was actually looking for some sort of markup standard for these, but didn’t find any. could you propose something?

  4. Robert, thanks for your comments. By ‘tagging language data’ I simply meant using the lang attribute where possible (or its XML cousin xml:lang if you are coding XHTML). In some places (e.g. the references, but strangely enough not the feature or language pages) I’ve seen a default lang="en" on the body element, but it would be good, I think, to go a little further and tag example sentences in the chapters with the proper language attributes. It’s as simple as tacking a lang="ISOcode" attribute on the HTML element containing the stuff.

    One project which does this consistently is Verba Africana, for example in the Ewe story Headless Crabs. For more details on that example, see the Verba Africana technical notes. (I did the graphics, code, and I18N for that project.)

    A good reference is the W3C I18N FAQ. That page also notes that IE, unfortunately and stupidly, does not support styling using the :lang CSS pseudo-selector, rendering a redundant use of generic CSS classes necessary (if IE would’ve supported it, the CSS really could have been dramatically simplified).

    BTW, another question about the CSS: what’s with all the .Px CSS selectors? A lot of the CSS rules targeting them are exactly the same. Looks like someone forgot that one can target a CSS declaration at several selectors at once, as in .P3, .P4, .P8 { font-family:'Charis SIL','Lucida Sans Unicode','Lucida Grande','Arial Unicode MS'; text-align:justify }

  5. Also, frequent use of !important in CSS declarations usually indicates that the cascading powers of CSS are not really being harnessed to any serious extent. This in turn means loss of future flexibility and ease of use.

    The more I look at it, the more I think that the HTML/CSS really could use a major revamping. That, or it is going to be a pain in the ass in the future.

  6. ok. i see. your comments target the chapter texts, mainly. we actually do plan on overhauling these (Testing HTML descriptions).

    the way they are right now is because we use openoffice to convert word documents to html. some editing tasks are still easier to do in openoffice. but once we stop doing this, we can actually improve the html by hand. we will have to do this anyway, because the automatic linking of references and languages is far from perfect.

    so we’ll try to improve the chapter texts over time, which will also be a chance to tag the languages.
