Annotation summaries: standardization needed

August 4th, 2011
by jodi

I’m finding an iPad amazing for reading PDFs: it’s like instant printing, with no paper to carry around (printouts are heavy, and they get wet). And with software like iAnnotatePDF and GoodReader, I can annotate with just a bit more effort than with pen and paper.

iAnnotate (video review) is the killer app that convinced me to buy an iPad. But it has a killer flaw: I couldn’t keep my reading organized with it.

Hence I started looking into reference managers that would work well on the iPad: allowing annotation, making it easy to keep PDFs organized, and ensuring that annotations are kept in a sensible place.

Sente fulfills many of my requirements. Sync seems to work effortlessly — well exceeding my experience with other products. The annotation process is reasonably smooth but so far I haven’t found a way to export annotations directly.

This is a bit problematic because PDF editors don’t seem to play nicely with each other’s annotations. For instance, iAnnotate and GoodReader both export summaries of annotations made in their own software. You get something very useful and readable like this:

Page 1, Highlight (Yellow):
Content: “The scientific use of Twitter has received some attention in previous work: [4] and [5] have performed several automatic analyses of tweets collected for different conference hashtags, including for example time series and lists of most active twitterers. [3] and [9] have furthermore carried out manual analyses of tweet contents for conference tweet datasets to determine, what conference participants are tweeting about. [10] are develop ing automatic methods for extracting semantic information from conference tweets. [6] have focused on tweets published by a set of manually identified scientists and have investigated their citation behavior.”

Page 1, Highlight (Yellow):
Content: “citations and references are two sides of the same coin.”
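
A summary in this shape is regular enough to turn into structured records. Here is a minimal sketch in Python, assuming every annotation starts with a "Page N, Type (Color):" header as in the sample above, optionally followed by body lines with or without a "Content:" prefix. The names `Annotation` and `parse_summary` are my own for illustration and are not part of either app.

```python
import re
from dataclasses import dataclass

# Header lines look like "Page 1, Highlight (Yellow):" in the sample above.
HEADER = re.compile(r"^Page (\d+), (\w+) \((.+)\):\s*$")


@dataclass
class Annotation:
    page: int
    kind: str
    color: str
    text: str = ""


def parse_summary(summary: str) -> list[Annotation]:
    """Parse an exported summary into a list of Annotation records."""
    annotations: list[Annotation] = []
    for line in summary.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            annotations.append(
                Annotation(int(match.group(1)), match.group(2), match.group(3))
            )
        elif annotations:
            # Body line: drop the optional "Content: " prefix and any
            # surrounding quotation marks, then append to the current record.
            body = line.removeprefix("Content: ").strip("“”\"")
            current = annotations[-1]
            current.text = f"{current.text} {body}".strip()
    return annotations
```

A header with no body simply yields a record whose `text` is empty, which makes missing highlight text easy to spot.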

But when you annotate in one program and get notes from another program, things get messier.

For PDFs annotated externally, iAnnotate lists highlights without their text and only grabs text from the notes, like this:

Page 1, Highlight (Custom Color: #fdf7bc):

Page 2, Highlight (Custom Color: #fdf7bc):

Page 2, Note (Custom Color: #fdffaa):
Not sure why this stands out from other lists by individuals.

GoodReader plays a bit nicer with annotations from other programs: it breaks annotations made by other programs at line boundaries. This makes summaries a little difficult to read, but at least there’s some content:

Highlight (color #FDF7BC):
first of all it will have to start with the general problem in

Highlight (color #FDF7BC):
analyzing scientific impact of Twitter:

Highlight (color #FDF7BC):
[6] define

Highlight (color #FDF7BC):
tweet to a peer-U
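
One rough workaround is to stitch the fragments back together. The Python sketch below assumes the "Highlight (color #RRGGBB):" header shape shown above and joins consecutive highlights that share a color. That is only a heuristic: two genuinely separate highlights in the same color would be merged as well, so the result still needs a human eye. The function name `merge_fragments` is made up for this example.

```python
import re

# Header lines look like "Highlight (color #FDF7BC):" in the sample above.
HEADER = re.compile(r"^Highlight \(color (#[0-9A-Fa-f]{6})\):\s*$")


def merge_fragments(summary: str) -> list[tuple[str, str]]:
    """Return (color, text) pairs, joining consecutive same-color highlights."""
    merged: list[list[str]] = []
    color = None
    start_new = False
    for line in summary.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            color = match.group(1).upper()
            # Start a new entry only when the color changes.
            start_new = not merged or merged[-1][0] != color
        elif color is not None:
            if start_new:
                merged.append([color, line])
                start_new = False
            else:
                merged[-1][1] += " " + line
    return [(c, t) for c, t in merged]
```

Run on the four fragments above, this returns a single yellow entry containing all four pieces in order, which is easier to skim but no longer shows where each original highlight began or ended.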

I’m currently looking into the state of standardization for annotation summaries.

I’d be very interested to hear about how you detect metadata and annotation differences in PDFs. As examples, I’ve marked up a recent WebSci poster, with some annotations from GoodReader, from iAnnotatePDF, and from Sente.

Posted in books and reading, iOS: iPad, iPhone, etc. | Comments (1)

  • Leonard Rosenthol says:

    There is already a standard for PDF annotations – it’s called FDF and it’s documented in the PDF standard (ISO 32000-1:2008) in section 12.7.7. It is supported by all of the desktop PDF viewers (Adobe, Nuance, FoxIt) EXCEPT Apple Preview.

    Leonard Rosenthol
    PDF Architect
    Adobe Systems