Code4Lib 2012 talk proposals are out

November 21st, 2011

Code4Lib2012 talk proposals are now on the wiki. This year there are 72 proposals for 20-25 slots. I pulled out the talks mentioning semantics (linked data, semantic web, microdata, RDF) for my own convenience (and maybe yours).

Property Graphs And TinkerPop Applications in Digital Libraries

  • Brian Tingle, California Digital Library

TinkerPop is an open source software development group focusing on technologies in the graph database space.
This talk will provide a general introduction to the TinkerPop Graph Stack and the property graph model it uses. The introduction will include code examples and explanations of the property graph models used by the Social Networks in Archival Context project and show how the historical social graph is exposed as a JSON/REST API implemented by a TinkerPop rexster Kibble that contains the application’s graph theory logic. Other graph database applications possible with TinkerPop, such as RDF support and citation analysis, will also be discussed.

HTML5 Microdata and

  • Jason Ronallo, North Carolina State University Libraries

When the big search engines announced support for HTML5 microdata and the vocabularies, the balance of power for semantic markup in HTML shifted.

  • What is microdata?
  • Where does microdata fit with regards to other approaches like RDFa and microformats?
  • Where do libraries stand in the worldview of and what can they do about it?
  • How can implementing microdata and optimize your sites for search engines?
  • What tools are available?

“Linked-Data-Ready” Software for Libraries

  • Jennifer Bowen, University of Rochester River Campus Libraries

Linked data is poised to replace MARC as the basis for the new library bibliographic framework. For libraries to benefit from linked data, they must learn about it, experiment with it, demonstrate its usefulness, and take a leadership role in its deployment.

The eXtensible Catalog Organization (XCO) offers open-source software for libraries that is “linked-data-ready.” XC software prepares MARC and Dublin Core metadata for exposure to the semantic web, incorporating FRBR Group 1 entities and registered vocabularies for RDA elements and roles. This presentation will include a software demonstration, proposed software architecture for creation and management of linked data, a vision for how libraries can migrate from MARC to linked data, and an update on XCO progress toward linked data goals.

Your Catalog in Linked Data

  • Tom Johnson, Oregon State University Libraries

Linked Library Data activity over the last year has seen bibliographic data sets and vocabularies proliferating from traditional library sources. We’ve reached a point where regular libraries don’t have to go it alone to be on the Semantic Web. There is a quickly growing pool of things we can actually “link to”, and everyone’s existing data can be immediately enriched by participating.

This is a quick and dirty road to getting your catalog onto the Linked Data web. The talk will take you from start to finish, using Free Software tools to establish a namespace, put up a SPARQL endpoint, make a simple data model, convert MARC records to RDF, and link the results to major existing data sets (skipping conveniently over pesky processing time). A small amount of “why linked data?” content will be covered, but the primary goal is to leave you able to reproduce the process and start linking your catalog into the web of data. Appropriate documentation will be on the web.

NoSQL Bibliographic Records: Implementing a Native FRBR Datastore with Redis

  • Jeremy Nelson, Colorado College,

In October, the Library of Congress issued a news release, “A Bibliographic Framework for the Digital Age”, outlining a list of requirements for a New Bibliographic Framework Environment. Responding to this challenge, this talk will demonstrate a Redis FRBR datastore proof-of-concept that, with a lightweight Python-based interface, can meet these requirements.

Because FRBR is an Entity-Relationship model, it is easily implemented as key-value within the primitive data structures provided by Redis. Redis’ flexibility makes it easy to associate arbitrary metadata and vocabularies, like MARC, METS, VRA or MODS, with FRBR entities and inter-operate with legacy and emerging standards and practices like RDA Vocabularies and LinkedData.

ALL TEH METADATAS! or How we use RDF to keep all of the digital object metadata formats thrown at us.

  • Declan Fleming, University of California, San Diego

What’s the right metadata standard to use for a digital repository? There isn’t just one standard that fits documents, videos, newspapers, audio files, local data, etc. And there is no standard to rule them all. So what do you do? At UC San Diego Libraries, we went down a conceptual level and attempted to hold every piece of metadata and give each holding place some context, hopefully in a common namespace. RDF has proven to be the ideal solution, and allows us to work with MODS, PREMIS, MIX, and just about anything else we’ve tried. It also opens up the potential for data re-use and authority control as other metadata owners start thinking about and expressing their data in the same way. I’ll talk about our workflow, which takes metadata from a stew of various sources (CSV dumps, spreadsheet data of varying richness, MARC data, and MODS data), has it normalized into METS by our Metadata Specialists, who create an assembly plan, and then ingests it into our digital asset management system. The result is a beautiful graph of RDF triples, with metadata poised to be expressed as HTML, RSS, METS, and XML, that opens linked data possibilities we are just starting to explore.

UDFR: Building a Registry using Open-Source Semantic Software

  • Stephen Abrams, Associate Director, UC3, California Digital Library
  • Lisa Dawn Colvin, UDFR Project Manager, California Digital Library

Fundamental to effective long-term preservation analysis, planning, and intervention is the deep understanding of the diverse digital formats used to represent content. The Unified Digital Format Registry project (UDFR) will provide an open source platform for an online, semantically-enabled registry of significant format representation information.

We will give an introduction to the UDFR tool and its use within a preservation process.

We will also discuss our experiences of integrating disparate data sources and models into RDF: describing our iterative data modeling process and decisions around integrating vocabularies, data sources and provenance representation.

Finally, we will share how we extended an existing open-source semantic wiki tool, OntoWiki, to create the registry.

saveMLAK: How Librarians, Curators, Archivists and Library Engineers Work Together with Semantic MediaWiki after the Great Earthquake of Japan

  • Yuka Egusa, Senior Researcher of National Institute of Educational Policy Research
  • Makoto Okamoto, Chief Editor of Academic Resource Guide (ARG)

On March 11th, 2011, the biggest earthquake and tsunami in history struck a large area of the northeastern region of Japan. Many people have worked together to save people in the area. For the library community, a wiki named "savelibrary" was launched the day after the earthquake to share information on damage and rescues. Later, museum curators, archivists, and people from community learning centers started similar projects. In April we joined a project, "saveMLAK", and launched a wiki site using Semantic MediaWiki under

As of November 2011, 269 contributors have posted information on over 13,000 cultural organizations since the launch. The gathered information is organized by wiki categories for each type of facility, such as library, museum, school, etc. We have held eight edit-a-thons to encourage people to contribute to the wiki.

We will report on our activity, on how the libraries and museums were damaged and have recovered through great effort, and on how we can pursue a new style of collaboration with the MLAK community, the wiki, and other volunteer communities in a crisis.

Extended deadline for STLR 2011

April 29th, 2011

We’ve extended the STLR 2011 deadline due to several requests; submissions are now due May 8th.

JCDL workshops are split over two half-days, and we are lucky enough to have *two* keynote speakers: Bernhard Haslhofer of the University of Vienna and Cathy Marshall of Microsoft Research.

Consider submitting!

The 1st Workshop on Semantic Web Technologies for Libraries and Readers

STLR 2011

June 16 (PM) & 17 (AM) 2011
Co-located with the ACM/IEEE Joint Conference on Digital Libraries (JCDL) 2011 Ottawa, Canada

While Semantic Web technologies are successfully being applied to library catalogs and digital libraries, the semantic enhancement of books and other electronic media is ripe for further exploration. Connections between envisioned and emerging scholarly objects (which are doubtless social and semantic) and the digital libraries in which these items will be housed, encountered, and explored have yet to be made and implemented. Likewise, mobile reading brings new opportunities for personalized, context-aware interactions between reader and material, enriched by information such as location, time of day and access history.

This full-day workshop, motivated by the idea that reading is mobile, interactive, social, and material, will be focused on semantically enhancing electronic media as well as on the mobile and social aspects of the Semantic Web for electronic media, libraries and their users. It aims to bring together practitioners and developers involved in semantically enhancing electronic media (including documents, books, research objects, multimedia materials and digital libraries) as well as academics researching more formal aspects of the interactions between such resources and their users. We also particularly invite entrepreneurs and developers interested in enhancing electronic media using Semantic Web technologies with a user-centered approach.

We invite the submission of papers, demonstrations and posters which describe implementations or original research that are related (but are not limited) to the following areas of interest:

  • Strategies for semantic publishing (technical, social, and economic)
  • Approaches for consuming semantic representations of digital documents and electronic media
  • Open and shared semantic bookmarks and annotations for mobile and device-independent use
  • User-centered approaches for semantically annotating reading lists and/or library catalogues
  • Applications of Semantic Web technologies for building personal or context-aware media libraries
  • Approaches for interacting with context-aware electronic media (e.g. location-aware storytelling, context-sensitive mobile applications, use of geolocation, personalization, etc.)
  • Applications for media recommendations and filtering using Semantic Web technologies
  • Applications integrating natural language processing with approaches for semantic annotation of reading materials
  • Applications leveraging the interoperability of semantic annotations for aggregation and crowd-sourcing
  • Approaches for discipline-specific or task-specific information sharing and collaboration
  • Social semantic approaches for using, publishing, and filtering scholarly objects and personal electronic media


*EXTENDED* Paper submission deadline: May 8th 2011
Acceptance notification: June 1st 2011
Camera-ready version: June 8th 2011



Each submission will be independently reviewed by 2-3 program committee members.


  • Alison Callahan, Dept of Biology, Carleton University, Ottawa, Canada
  • Dr. Michel Dumontier, Dept of Biology, Carleton University, Ottawa, Canada
  • Jodi Schneider, DERI, NUI Galway, Ireland
  • Dr. Lars Svensson, German National Library


Please use PDF format for all submissions. Semantically annotated versions of submissions, and submissions in novel digital formats, are encouraged and will be accepted in addition to a PDF version.

All submissions must adhere to the following page limits:
Full length papers: maximum 8 pages
Demonstrations: 2 pages
Posters: 1 page

Use the ACM template for formatting:

Submit using EasyChair:

CiTO in the wild

October 18th, 2010

CiTO has escaped the lab and can now be used either directly in the CiteULike interface or with CiteULike machine tags. Go Citation Typing Ontology!

In the CiteULike Interface

To add a CiTO relationship between articles using the CiteULike interface, both articles must be in your own library. You’ll see a “Citations (CiTO)” section after your tags. Click on edit and set the current article as the target.

First set the CiTO target

Then navigate around your own library to find a related article. Now you can add a CiTO tag.

Adding a CiTO tag in CiteULike

There are a lot of choices. Choose just one. :)

CiTO Object Properties now appear in the dropdown

Congratulations, you’ve added a CiTO relationship! Now mousing over the CiTO section will show details on the related article.

Mouse over the resulting CiTO tag to get details of the related article

Machine Tags

Machine tags take fewer clicks but a little more know-how. They can be added just like any other tag, as long as you know the secret formula: cito--(insert a CiTO Object Property here from this list)--(insert article permalink numbers here). Here are two more concrete examples.

First, we can keep a list of articles citing a paper. For example, tagging an article


says “this article CiTO:cites article 137511”. Article 137511 can be found at, aka JChemPaint – Using the Collaborative Forces of the Internet to Develop a Free Editor for 2D Chemical Structures. Then we can get the list of (hand-tagged) citations to the article. Look—a community-generated reverse citation index!

Second, we can indicate specific relationships between articles, whether or not they cite each other. For example, tagging an article


says “this item CiTO:usesmethodin item 42338”. Item 42338 is found at, aka The Chemistry Development Kit (CDK): An Open-Source Java Library for Chemo- and Bioinformatics.
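
For anyone who wants to generate or read these tags in bulk, here is a minimal Python sketch of the formula. It is only an illustration: the function names, the regular expression, and the small article numbers in the usage note are my own assumptions, and nothing here reflects how CiteULike itself stores or normalizes tags.

```python
import re
from collections import defaultdict

# Pattern for the machine-tag formula described above:
# cito--<CiTO object property>--<article permalink number>
# (The accepted character classes are an assumption, not a CiteULike rule.)
MACHINE_TAG = re.compile(r"^cito--([A-Za-z]+)--(\d+)$")


def make_cito_tag(prop, article_id):
    """Build a machine tag string from a property name and an article number."""
    return f"cito--{prop}--{article_id}"


def parse_cito_tag(tag):
    """Return (property, article_id) for a CiTO machine tag, or None otherwise."""
    match = MACHINE_TAG.match(tag)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def reverse_citation_index(tagged_articles):
    """Map each cited article number to the articles hand-tagged as citing it.

    tagged_articles: dict mapping an article number to its list of tag strings.
    """
    index = defaultdict(list)
    for source_id, tags in tagged_articles.items():
        for tag in tags:
            parsed = parse_cito_tag(tag)
            if parsed is not None and parsed[0] == "cites":
                index[parsed[1]].append(source_id)
    return dict(index)
```

With a hypothetical library such as `{1: ["cito--cites--137511"], 2: ["cito--cites--137511"]}`, `reverse_citation_index` groups articles 1 and 2 under 137511, which is the hand-built reverse citation index idea in miniature.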


Automation and improved annotation interfaces will make CiTO more useful. CiTO:cites and CiTO:isCitedBy could be used to mark up existing relationships in digital libraries such as ACM Digital Library and CiteSeer, and could enhance collections like Google Books and Mendeley, to make human navigation and automated use easier. To capture more sophisticated relationships, David Shotton has hopes of authors marking up citations before submitting papers; if it’s required, anything is possible. Data curators and article commentators may observe contradictions between papers, or methodology reuses; in these cases CiTO could be layered with an annotation ontology such as AO in order to make the provenance of such assertions clear.
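
To make the markup idea a little more concrete, here is a rough Python sketch that writes a citation and its inverse as N-Triples lines. The namespace shown is the SPAR one as I understand it (please check it against the current CiTO documentation), and the article URIs in the usage note are placeholders, not real identifiers.

```python
# Assumed CiTO namespace under SPAR; verify before relying on it.
CITO = "http://purl.org/spar/cito/"


def cito_triples(citing_uri, cited_uri):
    """Return N-Triples lines for cito:cites and the inverse cito:isCitedBy."""
    forward = f"<{citing_uri}> <{CITO}cites> <{cited_uri}> ."
    inverse = f"<{cited_uri}> <{CITO}isCitedBy> <{citing_uri}> ."
    return [forward, inverse]
```

Calling `cito_triples("http://example.org/article/a", "http://example.org/article/b")` yields two lines, one per direction; a real data provider would likely store one direction and infer the other.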

CiTO could put pressure on existing publishers and information providers to improve their data services, perform more data cleanup, or expose bibliographies in open formats. Improved tools will be needed, as well as communities that are willing to add data by hand, and algorithms for inferring deep citation relationships.

One remaining challenge is aggregation of CiTO relationships between bibliographic data providers; article identifiers such as DOI are unfortunately not universal, and the bibliographic environment is messy, with many types of items, from books to theses to white papers to articles to reports. CiTO and related ontologies will help explicitly show the bibliographic web and relationships between these items, on the web of (meta)data.

Further Details

CiTO is part of an ecosystem of citations called Semantic Publishing and Referencing Ontologies (SPAR); see also the JISC Open Citation Project which is taking bibliographic data to the Web, and the JISC Open Bibliography Project. For those familiar with Shotton’s earlier writing on CiTO, note that SPAR breaks out some parts of the earlier formulation of this ontology.

W3C Library Linked Data Incubator Group starting

May 25th, 2010

The W3C has announced an incubator activity around Library Linked Data. I’ll be one of DERI’s participants in the group.

Its mission? To help increase global interoperability of library data on the Web, and to bring together people from archives, museums, publishing, etc. to talk about metadata. See the charter for more details.

Interested in joining? If you’re at a W3C member organization, ask your Advisory Committee Representative to appoint you. Or, get appointed as an invited expert by contacting one of the chairs (Tom Baker, Emmanuelle Bermes, Antoine Isaac); their contact info is available from the participants’ list.

Or, you can follow along on the incubator group’s public mailing list. (For organizing, the Sem lib mailing list was used.)

The first teleconference will be Thursday, 3 June at 1500 UTC.

