Utopia Documents: pulling scientific data into the PDF for interactive exploration

November 14th, 2010
by jodi

What if data were accessible within the document itself?

Utopia Documents is a free PDF viewer which recognizes certain enhanced figures, and fetches the underlying data. This allows readers to view and experiment with the tables, graphs, molecular structures, and sequences in situ.


You can download Utopia Documents for Mac and Windows to view enhanced papers, such as those published in The Semantic Biochemical Journal.

These screencasts were made from pages 9 and 10 of the PDF of a paper by the Manchester-based Utopia team: T. K. Attwood, D. B. Kell, P. McDermott, J. Marsh, S. R. Pettifer, and D. Thorne. Calling international rescue: knowledge lost in literature and data landslide! Biochemical Journal, Dec 2009. doi:10.1042/BJ20091474.

In an interview at the Guardian, Utopia’s Phillip McDermott says:

“Utopia Documents links scientific research papers to the data and to the community. It enables publishers to enhance their publications with additional material, interactive graphs and models. It allows the reader to access a wealth of data resources directly from the paper they are viewing, make private notes and start public conversations. It does all this on normal PDFs, and never alters the original file. We are targeting the PDF, since they still have around 80% readership over online viewing.

“Semantics, loose-coupling, fingerprinting and linked-data are the key ingredients. All the data is described using ontologies, and a plug-in system allows third parties to integrate their database or tool within a few lines of script. We use fingerprinting to allow us to recognise what paper a user is reading, and to spot duplicates. All annotations are held remotely, so that wherever you view a paper, the result is the same.”
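
The interview doesn't say how the fingerprinting works, so here is only a rough sketch of the general idea, not Utopia's actual method. It assumes the text has already been extracted from each PDF, and the function names `fingerprint` and `find_duplicates` are my own inventions for illustration. The point is that an identifier computed from the content, rather than from the file name or location, lets two copies of the same paper be recognised as one.

```python
import hashlib
import re


def fingerprint(text: str) -> str:
    """Return a hex digest that identifies a paper by its normalized text.

    Hypothetical helper: Utopia's real fingerprinting is not described
    in the interview and may work quite differently.
    """
    # Lowercase and collapse every run of non-alphanumeric characters to a
    # single space, so case, punctuation, and line breaks don't matter.
    normalized = re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(papers: dict[str, str]) -> dict[str, list[str]]:
    """Group file names whose extracted text shares a fingerprint."""
    groups: dict[str, list[str]] = {}
    for name, text in papers.items():
        groups.setdefault(fingerprint(text), []).append(name)
    # Keep only fingerprints that more than one file maps to.
    return {fp: names for fp, names in groups.items() if len(names) > 1}
```

A fingerprint like this could also serve as the key under which remotely held annotations are stored, which would fit the claim that the result is the same wherever you view a paper. An exact hash is the crudest possible choice, though: it breaks as soon as two copies differ by a single word, so a real system would presumably need something more tolerant.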

I’d still like to see a demo of the commenting functionality.

I’d also be particularly interested in the publisher perspective, about the production work that goes into creating the enhancements. Portland Press’s October news announces that they’ve been promoting Utopia at the Charleston conference and SSP, with an upcoming appearance at the STM Innovations Seminar.

Utopia came to my attention via Steve Pettifer’s mention.

Posted in future of publishing, information ecosystem, library and information science, scholarly communication, semantic web, social semantic web | Comments (4)

  • Jakob says:

    Nice. We could have had this kind of document with “data accessible within the document” for years. Maybe this time it gets more widely adopted with PDFs, but you could also have it with HTML, or XML, or RDF or any format (remember Microsoft OLE/COM documents?). We could already have had such documents, even text-based ones, in the 1980s. The concept was proposed by Ted Nelson long before the rise of personal computers. It is not a matter of technology, but a matter of habits, culture, and businesses.

  • Jodi says:

    “It is not a matter of technology, but a matter of habits, culture, and businesses.” Absolutely!

    Essentially, this is using interactive hypertext in a PDF. The technology has to be sufficiently well-understood (maybe ‘mundane’) in order for the social/organizational changes to be easy.

  • The currently downloadable version should already have the comment facility in place. You’ll need to make yourself an account in order to write comments though.

    To add a comment, highlight the region, and select ‘Comment’ from the menu that appears. At the moment comments are either ‘private’ (only you see them) or ‘public’ (everyone in the world sees them!), so the model is rather crude from that perspective… we’re working on a more sophisticated sharing version.

  • Jodi says:

    Ah, I see: ‘Copy’, ‘Comment’, ‘Explore’. I guess I expected a menu item for commenting. Even if it were greyed out when nothing was highlighted, that would make it more discoverable.

    One small UI thing I noticed: when I ‘Explore’, if I don’t have the inspector open, it’s not clear that anything has happened.