A while back some low-quality citations showed up on my Google Scholar profile. (Yes, I’m narcissistic enough to keep track of where my work is cited.) They had titles like “CHAPTER 2 draft — email email@example.com” and it was hard to find any actual bibliographic metadata. Google Scholar had apparently scraped random PDFs uploaded to Academia.edu and decided the citations in them were worth counting even in the absence of proper metadata. I shared this on Twitter and promptly forgot about it.
Then I got an email from someone at Academia.edu asking me to say a bit more about my concerns with poor metadata. I decided to put them in a blog post. I’m afraid it turned into a bit of a rant about how Academia.edu seems built not so much for sharing scientific information as for playing to our vanity. Sorry about that. Let’s start with the poor metadata issue, which turns out to be rather pervasive.
- Academia.edu doesn’t currently expose any metadata using standard formats like RDF, unAPI, or embedded metadata tags. The result is that reference managers like Zotero cannot easily detect the papers, and users of such tools cannot easily add them to their libraries. (See exposing metadata for background.) For those hoping to cite works uploaded there, it makes life more difficult than it needs to be. For users of Academia.edu, this hurts citability. (Yes, I know Academia.edu touts a 69% citation advantage. See here for a discussion of some concerns about that study. My point here is simply that whoever wants to cite papers found on Academia.edu currently has to get the metadata from elsewhere.)
- Relatedly, Academia.edu doesn’t currently comply with Google Scholar’s guidelines for exposing metadata. (See the inclusion guidelines for background.) The result is that Google Scholar resorts to scraping only the most superficial info available, often straight from the PDFs, which is usually inferior or underspecified. This is the number one reason for the junk citations in Google Scholar. It means many things will be overcounted (for papers not detected as duplicates) and other things will be counted but not identified correctly. For users of Academia.edu, this hurts both the findability and the citability of their work. For users of Google Scholar, it adds more noise to an already noisy system.
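To make the two points above concrete: Google Scholar’s inclusion guidelines ask repositories to embed bibliographic meta tags (the Highwire Press `citation_*` tags) in the HTML of each paper’s landing page, and reference managers like Zotero read the same tags. A minimal sketch of what a compliant paper page would carry in its `<head>` might look like this (the tag names are the real ones from the guidelines; the bibliographic values are invented placeholders):

```html
<!-- Illustrative only: real tag names, made-up paper details -->
<head>
  <meta name="citation_title" content="An Example Paper Title">
  <meta name="citation_author" content="Doe, Jane">
  <meta name="citation_author" content="Roe, Richard">
  <meta name="citation_publication_date" content="2014/05">
  <meta name="citation_journal_title" content="Journal of Examples">
  <meta name="citation_volume" content="12">
  <meta name="citation_firstpage" content="100">
  <meta name="citation_lastpage" content="115">
  <meta name="citation_pdf_url" content="https://example.org/papers/doe2014.pdf">
</head>
```

A repository that emits tags like these gets its papers indexed with correct titles, authors, and venues instead of whatever Scholar can scrape from the PDF.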
- Academia.edu seems built for single-authored papers, and its handling of multi-authored papers is very poor. There is no system for duplicate detection and resolution. Only the original uploader can add or edit bibliographic metadata once an item is added. It is too easy for multiple authors to upload the same paper with slight differences in bibliographic metadata. It is too hard to clean up the mess and make sure there is only one good version of record. This affects people’s profiles and has undesirable knock-on effects for the points above.
- The process of adding papers is geared towards enriching Academia.edu content rather than towards promoting the sharing of correct and complete scientific information. After it gets your PDF (and that’s a requirement for adding a paper), there are very few opportunities for providing robust metadata, and the primary upload interface cares more about fluff like ‘research interests’ than about getting the basic bibliographic metadata right. It easily gets author order wrong and offers users only manual ways to fix this. There is no way to import via DOI or PMID, or even to record these identifiers — a fatal lack of concern for interoperability which is quite surprising. Essentially, a user interface should make it easy for people to get things right and hard to get them wrong. The current interface for adding papers does the exact opposite (see annotated screenshot).
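Import-by-DOI is not hard to build, which is what makes its absence so striking. A minimal sketch of what it involves, assuming nothing beyond the Python standard library: DOI registration agencies support content negotiation on doi.org, so requesting a DOI’s URL with an `Accept: application/vnd.citationstyles.csl+json` header returns structured bibliographic metadata (the function and pattern names here are my own, purely illustrative):

```python
import json
import re
import urllib.request

# A DOI is "10." + a 4-9 digit registrant code + "/" + a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def normalize_doi(raw):
    """Strip common URL/prefix forms so users can paste a DOI in any shape."""
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if raw.lower().startswith(prefix):
            raw = raw[len(prefix):]
    return raw


def fetch_metadata(doi):
    """Resolve a DOI to CSL-JSON bibliographic metadata via doi.org
    content negotiation (title, authors, venue, pages, date, ...)."""
    doi = normalize_doi(doi)
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a valid DOI: {doi!r}")
    req = urllib.request.Request(
        f"https://doi.org/{doi}",
        headers={"Accept": "application/vnd.citationstyles.csl+json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With something like this behind an upload form, a user pastes one identifier and the title, author order, journal, and page numbers all arrive correct and machine-readable — no manual re-typing, no wrong author order.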
- It is surprisingly (and needlessly) hard to add the really important information. More details can only be added after importing papers, which means most users won’t do it. As far as I can see, the only way to do it is to go back to your publication list, hover over the Edit button, and find other fields to edit. Even here, there appears to be no place for identifiers like DOI or PMID! Page numbers and so on are hidden in an “Other” field. Any user interface designer will tell you that stuff buried this deeply might as well be left out: only a negligible number of users will find and use it. (This is reminiscent of the backward way in which Facebook designs its privacy UIs, but puzzlingly, there seems to be little reason for Academia.edu to make it so hard to get things right.) Anyway, if you’ve succeeded in adding some of this metadata, congratulations on completing a futile exercise. The information you painstakingly entered is exposed nowhere, so it cannot be reused or exported except, again, manually. Quite unbelievable in the age of APIs and interoperability!
My conclusion from these points: Academia.edu seems not to care about promoting the curation and sharing of correct, high-quality metadata for scientific publications. One might counter that this is not the goal of the network, and that the content of the papers is what’s most important anyway. But peer-reviewed publications are still the main vehicle for advancing scientific results, and citations are still the main currency of cumulative science. So getting bibliographic metadata right is key to promoting science as a cumulative enterprise. Nor should this be hard in the era of DOIs and PMIDs, which makes it all the more surprising that Academia.edu ignores them.
But plenty of people like it!
If things are so poor, why do relatively few people complain, and why are so many users seemingly happy with the service? There are several reasons. Not every academic has access to a personal website or an open academic repository, and Academia.edu presents itself as one of the easiest options for making their work visible online (never mind the fact that it doesn’t actually make it easily citable). It may be a way to keep up with colleagues. Also, I’ve heard people are happy with its “sessions” as a way to get interactive feedback on a paper. But there’s one important reason that I haven’t seen commented upon often: Academia.edu plays to our vanity. Many elements of its design are built to satisfy and amplify our craving for external validation.
Judging from their position in the navigation menu, analytics are one of the most important elements of Academia.edu. Upload papers, tag them with research interests, and they generate views. Follow people and they’ll follow you back, generating profile views. Tomorrow your paper may be in the top 5%! Next week you might be crowned part of the top 1%. Look, your paper was just read by someone from Vienna! Your work is being read in 27 countries! You’re being followed by someone you barely know! All these things are nicely presented in spiffy graphs — evidently a part of Academia.edu to which a lot of design resources have been devoted.
And note that some of the design here is cynical. The only two time windows offered are 30 days and 60 days, inviting you to come back at least this often to keep up with the stats (yes, you can download a CSV for more, but once again that is one of those power-user features that will rarely be used). Views are promoted over actual downloads, while bounce rates (basically, the share of visitors who leave without sticking around after arriving on a page) are concealed. The most important metadata for papers (again, just taking the design as a measure of what Academia.edu promotes as important) is this mostly meaningless view count — not where a paper was published, nor how to cite it, just how many people had a look.
Academia.edu nudges us towards narcissism
Does this mean everybody on Academia.edu is a narcissist? Of course not. My point is not about users; it is about the design of the service. User interface design is not innocent: as a recent Medium essay noted, technology hijacks people’s minds, constraining our options and nudging us in ways that often elude our awareness. Not everybody on Academia.edu is a narcissist, but many aspects of its design make it easy to become one.
On balance, I feel Academia.edu doesn’t really take me seriously as an academic. It takes my work to make a profit (for instance by putting advertisements around it), totally botches the metadata and tries to appease me by offering stats and social rankings that promote constant comparison. And nothing in its design suggests a regard for getting even the most basic bibliographic information about my scientific work right — even though that would be one way to turn page views into citations. This is one of the reasons the only paper I’ve uploaded there is one pointing people to where they can find all my papers freely and without hassle.
To end on a slightly more optimistic note: at least the poor metadata problem can be solved. As far as I can see, nothing in Academia.edu’s business model turns on proliferating poor and incomplete metadata. The citation advantage it likes to claim could significantly increase if it started exposing metadata in ways that are compatible with widely used tools like Google Scholar and Zotero. It still won’t be a service I’m keen on using, but I do hold out hope that it will become better at promoting cumulative science rather than playing to our vanity.