Forking conversations, forking documents

August 7th, 2011
by jodi

When the topic of discussion changes, how do you indicate that? Tender Support seems clunky in some ways, but their forking mechanism helps conversations stay focused on their topic:

Forking with Tender Support

Lately forking has also been on my mind as the Library Linked Data group edits and reorganizes our draft report: wiki history and version control are helpful, but insufficient. What I miss most is a “fork” feature, where you could temporarily take ownership of a copy (socially, this indicates that something is a possibility, rather than the consensus; technically, it indicates provenance, would allow “show all forks of this”, and might help in merging changes back). Perhaps naming and tagging particular history items in MediaWiki could help address this, but I think really I want something like git.

I’ve seen a few examples of writing and editing prose with git; I’d like to get a better understanding of the best practices for making collaborative changes in texts with distributed version control systems. Surely somebody’s written up manuals on this?

Posted in argumentative discussions, library and information science, PhD diary, random thoughts | Comments (2)

Annotation summaries: standardization needed

August 4th, 2011
by jodi

I’m finding an iPad amazing for reading PDFs: it’s like instant printing, without stacks of paper to carry around (they’re heavy, and they get wet). And with software like iAnnotatePDF and GoodReader, I can annotate with just a bit more effort than while using pen and paper.

iAnnotate (video review) is the killer app that convinced me to buy an iPad. But it has a killer flaw: I couldn’t keep my reading organized with it.

Hence I started looking into reference managers that would work well on the iPad, allowing annotation, making it easy to keep PDFs organized, and ensuring that annotations were kept in a sensible place.

Sente fulfills many of my requirements. Sync seems to work effortlessly — well exceeding my experience with other products. The annotation process is reasonably smooth but so far I haven’t found a way to export annotations directly.

This is a bit problematic because PDF editors don’t seem to play nice with each other’s annotations. For instance, iAnnotate and GoodReader both export annotations for their own software. You get something very useful and readable like this:

Page 1, Highlight (Yellow):
Content: “The scientific use of Twitter has received some attention in previous work: [4] and [5] have performed several automatic analyses of tweets collected for different conference hashtags, including for example time series and lists of most active twitterers. [3] and [9] have furthermore carried out manual analyses of tweet contents for conference tweet datasets to determine, what conference participants are tweeting about. [10] are develop ing automatic methods for extracting semantic information from conference tweets. [6] have focused on tweets published by a set of manually identified scientists and have investigated their citation behavior.”

Page 1, Highlight (Yellow):
Content: “citations and references are two sides of the same coin.”
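A summary in this shape is regular enough to read back into structured records. The following is a minimal sketch, assuming the export always uses a `Page N, Kind (Color):` header followed by optional body lines; the `Annotation` class and `parse_summary` function are names made up for illustration, not part of iAnnotate or GoodReader.

```python
import re
from dataclasses import dataclass

# Header lines look like "Page 1, Highlight (Yellow):" in the sample above.
HEADER = re.compile(r"^Page (\d+), (\w+) \((.+)\):$")


@dataclass
class Annotation:
    page: int
    kind: str       # e.g. "Highlight" or "Note"
    color: str      # e.g. "Yellow" or "Custom Color: #fdf7bc"
    text: str = ""  # stays empty when the export carries no text


def parse_summary(summary: str) -> list[Annotation]:
    """Turn an exported annotation summary into a list of records."""
    annotations: list[Annotation] = []
    for raw in summary.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            page, kind, color = match.groups()
            annotations.append(Annotation(int(page), kind, color))
        elif annotations:
            # Body line: drop the optional "Content: " label and curly quotes.
            body = line.removeprefix("Content: ").strip("“”")
            last = annotations[-1]
            last.text = f"{last.text} {body}".strip()
    return annotations
```

Records with an empty `text` are then easy to count, which gives a rough measure of how much a given export lost.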

But when you annotate in one program and get notes from another program, things get messier.

For PDFs annotated externally, iAnnotate lists highlights without their content and only grabs text from the notes, like this:

Page 1, Highlight (Custom Color: #fdf7bc):

Page 2, Highlight (Custom Color: #fdf7bc):

Page 2, Note (Custom Color: #fdffaa):
Not sure why this stands out from other lists by individuals.

GoodReader plays a bit nicer with annotations from other programs: it breaks annotations made by other programs at line boundaries. This makes summaries a little difficult to read, but at least there’s some content:

Highlight (color #FDF7BC):
first of all it will have to start with the general problem in

Highlight (color #FDF7BC):
analyzing scientific impact of Twitter:

Highlight (color #FDF7BC):
[6] define

Highlight (color #FDF7BC):
tweet to a peer-U
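One way to make such a summary more readable is to stitch the fragments back together. This is only a heuristic sketch: it assumes that consecutive highlights with the same color belong to one passage, which will wrongly join two separate highlights that happen to follow each other. The function name `merge_fragments` is hypothetical.

```python
import re

# Header lines look like "Highlight (color #FDF7BC):" in the sample above.
HEADER = re.compile(r"^Highlight \(color (#[0-9A-Fa-f]{6})\):$")


def merge_fragments(summary: str) -> list[tuple[str, str]]:
    """Join consecutive same-color highlight fragments into (color, text) pairs."""
    merged: list[tuple[str, str]] = []
    color = None
    for raw in summary.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            color = match.group(1).upper()
        elif color is not None:
            if merged and merged[-1][0] == color:
                # Same color as the previous fragment: treat it as a continuation.
                merged[-1] = (color, merged[-1][1] + " " + line)
            else:
                merged.append((color, line))
    return merged
```

Run on the four fragments above, this yields a single entry, since they all share one color. Truncated words such as the final fragment stay truncated: the text the exporter dropped cannot be recovered this way.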

I’m currently checking into the standardization around annotation summaries.

I’d be very interested to hear about how you detect metadata and annotation differences in PDFs. As examples, I’ve marked up a recent WebSci poster, with some annotations from GoodReader, from iAnnotatePDF, and from Sente.

Posted in books and reading, iOS: iPad, iPhone, etc. | Comments (1)

091labs again!

August 4th, 2011
by jodi

Yesterday, our local hackerspace/makerspace re-opened!

For a while now, Fiacre O’Duinn has been talking about the shared purpose between libraries and these spaces:

The ideas that fuel hackerspaces, such as cooperation, resource and information sharing, self-directed education, and a diversity of views are concepts that are central to our profession’s ethos.

Not to mention the cool tech (3D printers, laser engravers, tool lending libraries, …) we’d like to see in libraries in the not-so-distant future.

It’s a conversation I hope to pick up with Willow & others (thx!) at CCCamp.

Posted in library and information science, random thoughts | Comments (0)

Citation management means different things to different people

August 3rd, 2011
by jodi

I got to talking with a mathematician friend about citation management. We came to the conclusion that “manage PDFs” is my primary goal while “get out good citations” is his primary goal. I thought it would be interesting to look at his requirements.

His ideal program would

  1. Organize the PDFs (Papers does this, when it doesn’t botch the author names and the title) preferably in the file system, so I can use Dropbox
  2. Get BibTeX entries from MathSciNet, ACM, etc. EXACTLY AS THEY ARE
  3. Have some decent way to organize notes by “project” or something

He doesn’t care about:

  1. Typing \cite
  2. A “unified” bibliographic database
  3. Social bibliographies (though I am not against them; it is just not a burning issue)

He says:

I guess the point is that, if I am writing something and I know I want to cite it, and I know there is a “official” BibTeX for it, I just need a way to get that more quickly than:

  1. Type the URL
  2. Click on “Proxy this” in my bookmarks bar
  3. Search for the paper
  4. Copy/paste the BibTeX
  5. Edit the cite key to something mnemonic
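The last step is the easiest to automate. As a rough sketch, assuming the mnemonic convention is “surname initials plus two-digit year” (inferred from the `AR78` key in the entries below; his actual convention isn’t stated), a hypothetical helper could look like this:

```python
def mnemonic_key(author_field: str, year: int) -> str:
    """Build a key like AR78 from a BibTeX author field and a year."""
    initials = []
    for author in author_field.split(" and "):
        author = author.strip()
        if "," in author:
            surname = author.split(",")[0].strip()  # "Asimow, L." -> "Asimow"
        else:
            surname = author.split()[-1]            # "L Asimow" -> "Asimow"
        initials.append(surname[0].upper())
    return "".join(initials) + f"{year % 100:02d}"
```

Both `mnemonic_key("Asimow, L. and Roth, B.", 1978)` and `mnemonic_key("L Asimow and B Roth", 1978)` give `AR78`. Keys built this way will collide for prolific author pairs, so a real tool would need a tie-breaker.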

He followed up with an example of the awful, lossy markup Papers produces, which loses information including the ISSN and DOI; he prefers the minimalist BibTeX. (oops!; he adds “I understated how bad papers is. The real papers entry (top) not only has screwy names, but junk instead of the full journal name. The papers cite key is meaningless noise too (but mathscinet is meaningful noise).”) To get around this, he does the same search/download “a million times”.

Papers2 BibTeX:
@article{AR78,
author = {L Asimow and B Roth},
journal = {Trans. Amer. Math. Soc.},
title = {The rigidity of graphs},
pages = {279--289},
volume = {245},
year = {1978},
}

The AMS version of the same BibTeX:
@article {AR78,
    AUTHOR = {Asimow, L. and Roth, B.},
     TITLE = {The rigidity of graphs},
   JOURNAL = {Trans. Amer. Math. Soc.},
  FJOURNAL = {Transactions of the American Mathematical Society},
    VOLUME = {245},
      YEAR = {1978},
     PAGES = {279--289},
      ISSN = {0002-9947},
     CODEN = {TAMTAM},
   MRCLASS = {57M15 (05C10 52A40 53B50 73K05)},
  MRNUMBER = {511410 (80i:57004a)},
MRREVIEWER = {G. Laman},
       DOI = {10.2307/1998867},
       URL = {http://dx.doi.org/10.2307/1998867},
}

I’ve just discovered that BibDesk’s ((See also A short review of BibDesk from MacResearch)) ‘minimize’ does what he wants: its output is quite close to the Papers2 version:

@article{AR78,
	Author = {Asimow, L. and Roth, B.},
	Journal = {Trans. Amer. Math. Soc.},
	Pages = {279--289},
	Title = {The rigidity of graphs},
	Volume = {245},
	Year = {1978}}
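To make the idea concrete, here is a small sketch of what such a “minimize” step might do: keep a short list of core fields and drop everything else. This is not BibDesk’s implementation. The `CORE_FIELDS` list is an assumption based on the output above, and the regular expressions assume brace-delimited values without nested braces.

```python
import re

# Fields to keep; an assumption modelled on the minimized entry above.
CORE_FIELDS = ("author", "journal", "pages", "title", "volume", "year")

ENTRY = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")  # "@article {AR78,"
FIELD = re.compile(r"(\w+)\s*=\s*\{([^{}]*)\}")     # "NAME = {value}"


def minimize(entry: str, keep=CORE_FIELDS) -> str:
    """Return a copy of a BibTeX entry containing only the fields in `keep`."""
    head = ENTRY.search(entry)
    if head is None:
        raise ValueError("not a BibTeX entry")
    kind, key = head.groups()
    # Field names are case-insensitive in BibTeX, so normalize to lowercase.
    fields = {name.lower(): value for name, value in FIELD.findall(entry)}
    lines = [f"@{kind}{{{key},"]
    for name in keep:
        if name in fields:
            lines.append(f"  {name} = {{{fields[name]}}},")
    lines.append("}")
    return "\n".join(lines)
```

Applied to the AMS entry, this drops ISSN, CODEN, the MR fields, DOI, and URL while leaving the author names exactly as MathSciNet wrote them.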

I’d still like to understand the impact the non-minimal BibTeX is having; could be bad citation styles are causing part of the problem.

While we have different needs for citation management, we’re both annoyed by the default filenames many publishers use – like fulltext.pdf and sdarticle.pdf. But I’ll tolerate these, as long as I can get to it from a database index with a nice frontend.

We of course moved on to discussing how research needs an iTunes or, as Geoff Bilder has called it, an iPapers.

This blog post brought to you by Google chat and the number 3.

Posted in books and reading, information ecosystem, library and information science, scholarly communication | Comments (0)

Sente, a first look

August 1st, 2011
by jodi

Today I’ve been testing out Sente, on the theory that it might help me organize the PDFs I’m annotating on my iPad.

The desktop application is geared to Mac users who really care about bibliographies, with several fantastic features.

I like Sente’s statuses; read/unread and Recently Modified and Recently Added are automatically tracked, and you can rate items. I especially like the workflow statuses, which match some of my common tasks:

  • Get Full Text
  • Discuss Further
  • Cite
  • Do Not Cite

“Sort by citation” is surprisingly illuminating: I didn’t realize how many papers from “Discourse Studies” I’d been looking at recently.

Another great feature that could be easily and fruitfully added to most other bibliographic managers: title case and exact case lists (I am *so* sick of seeing lowercased ‘wikipedia’ in bibliographies!), which you can very easily customize.
Sente also has a journal dictionary: You can assign the abbreviations and ISSNs (authority control, yippee!)!

Their visual display could use an update (thankfully it’s on the way) and I find their icons confusing (maybe ‘pencil’ for ‘note’ is sensible, but what in the world about ‘four dots in a diamond shape’ says ‘abstract’ to you?)

I tested the Zotero import. As I wrote Sente’s developers, there are some issues:

In testing it out on my large (5000+ item) Zotero library I see that:

  1. HTML attachments are not copied into the Sente library
  2. Image attachments are not copied into the Sente library
  3. Text note attachments are not copied into the Sente library
  4. Subcollections are not preserved

Since then, I’ve noticed that the keywords don’t get imported. Further, the date added and “date modified” fields are not preserved, but instead now reflect the import date and time (as I noted on twitter). But I do like their duplicate detection. Along with promising to consolidate matched items, they provide a report about the discarded matches. For instance:

Rule “DOI rule” flagged these two references as possible duplicates:
Vilar, P., & Žumer, M. (2008). Perceptions and importance of user friendliness of IR systems according to users’ individual characteristics and academic discipline. Journal of the American Society for Information Science & Technology, 59(12), 1995-2007. doi:Article
Quick-Response Barcodes. (2008). Library Technology Reports, 44(5), 46-47. doi:Article
However, the match was rejected because the references differ in: Article Title, pages, Publication Title, URL, Volume, Issue.
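The report suggests a two-stage check: one rule proposes a pair as possible duplicates, then a field-by-field comparison confirms or rejects it. Here is a minimal sketch of that logic; it is a guess at the behaviour from the report’s wording, not Sente’s actual code, and the record keys are made up for illustration.

```python
def check_duplicate(a: dict, b: dict, match_on: str = "doi",
                    compare=("title", "pages", "journal", "volume", "issue")):
    """Return (flagged, differing_fields) for two reference records."""
    value_a, value_b = a.get(match_on), b.get(match_on)
    # Stage 1: the rule fires when both records share a non-empty value.
    flagged = bool(value_a) and value_a == value_b
    if not flagged:
        return False, []
    # Stage 2: list the fields that disagree; any difference rejects the match.
    differing = [field for field in compare if a.get(field) != b.get(field)]
    return True, differing
```

In the example above, both records carry the junk DOI value “Article”, so the rule flags them; the differing title, pages, volume, and issue then reject the match. Validating the DOI’s shape before stage 1 would avoid the false alarm altogether.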

I have played briefly with Sente’s free iPad viewer, but not yet with their paid ($19.99) app which allows annotation. Based on reviews (why no permalinks, Apple?), “Export seems to be an option but crucially, import is not.” However, if Sente’s annotation is enough, there’s hope, since the description of Sync for the planned 6.5 release (via this) is *very* promising: “As you read a PDF on your iPad on the bus ride home, highlighting passages and taking notes, the highlighting and notes appear in all copies by the time you arrive home.”

By Sente user standards, I am far from a power user: the biggest databases seem to be about 10 times mine. This could be an improvement from Zotero, where my library speed can’t quite keep up some days. I’d be *very* interested to hear from enthusiastic Sente users. Switching seems quite feasible, and it’s probably worth checking out their iPad app.

The main obvious concerns I have are about notetaking and portability. Notetaking of offline/non-fulltext items is important but doesn’t seem to have been a particular focus of development. Portability is incredibly important: I need to ensure that export (and ideally import) brings along files and notes as well as PDFs.

I’ve been thinking of direct, in-file PDF annotation as the best possible way to ensure that my annotations outlive my reference manager. Should I rethink that? So far (according to their draft manual as above): “Highlighting created in Sente 6.2 is not stored in the PDF itself — it is stored in the library database. This change has several very positive effects, notably on syncing.” Let me know what you think in the comments!

Posted in books and reading, library and information science, reviews, scholarly communication | Comments (2)

Annotating PDFs on an iPad: GoodReader and iAnnotatePDF

July 31st, 2011
by jodi

Colleagues were interested in my recommendations for iPad annotation: GoodReader and iAnnotatePDF. Here’s a brief comparison.

Both save Acrobat-compatible annotations, which can be exported out as text (for instance to see everything you’ve highlighted yellow), offer synching, and multiple styles of annotation. The exact annotation workflow and navigation differ somewhat.

GoodReader’s main strength is the ability to easily pinpoint the exact boundaries of an annotation: a circular magnifying ‘loupe’ window automatically pops up. GoodReader also warns you when scanned images don’t have text behind them (offering to OCR them would be a welcome, though challenging enhancement: it would be enough to put them into an OCR-queue you could have Acrobat Pro watch and act on). One weakness (for me at least) is that to get the toolmenu, you must tap in the middle of the screen. My fingers seem to expect it to pop up when you tap on the right-hand side of the screen: sometimes that advances the page, but sometimes that just changes the view on the current page. Further, I find its small black-and-white icons somewhat confusing.

I prefer iAnnotatePDF, especially because it saves annotations by default, has customizable navigation, and clearer icons. Its key strength is that annotations are auto-saved, with ‘undo’, ‘delete’, and ‘edit’ functions. Further, the annotation type is maintained between annotations, until you (say) put down the highlighter by clicking an x. This is a small weakness since I find that to switch pages I have to close the annotation tool I’m currently using. Another weakness is that there’s a limited time window for editing existing annotations: just after they are created, annotations can be adjusted, for instance to move the boundaries of text highlights and underlines. Yet after this period has expired, annotations can be deleted, but locations cannot be adjusted (as far as I can tell). Another weakness is that interacting with image-only PDFs can be confusing; without any text, some functions (text highlight, text underline, …) just don’t work, without any warning or notice.

I would be interested in hearing comparisons of the syncing functionality, as well as comparisons to PDFExpert.

Criterion | GoodReader | iAnnotatePDF
Pageview | default is snap to page (double-spreads show left-to-right) | flow (can see parts of 2 pages at once, top-to-bottom)
Saving annotations | Must save each annotation | Annotations automatically save
Navigation | tap left/right to navigate forward/back; scroll only shows the same page | tap, slide, or swipe to navigate (customizable)
Toolbar | tap in the middle | tap on the right
Icons | black & white, some are obscure | medium-sized color, some are clearly understandable

Posted in books and reading, iOS: iPad, iPhone, etc. | Comments (1)

How do you organize papers on your iPad?

July 31st, 2011
by jodi

You read papers, right? How do you store and organize them? I’m looking for advice on a workflow for annotating PDFs and syncing between devices.

I’m striking out on iPad apps for organizing scholarly papers. Papers2 doesn’t pull annotated copies back. Mendeley lite doesn’t even let me log in ((In Mendeley v1.3.1 (build 19) when I enter my login details, the only option is ‘close’. After closing, Mendeley reports “Not logged in”. Yes, I’ve double-checked my password!)). Zotero, which has been my main reference manager for at least 5 years, doesn’t offer an iPad app.

For annotation, I like iAnnotatePDF and GoodReader (and I’m getting ready to try PDFExpert). What I don’t know is how to have manageable filenames, when the documents originate in another iPad app, instead of on the desktop.

The only ideas I have left involve either spending more time with filemanagers or relying on the synching inside the annotation tools.

Reference managers/PDF managers:

  1. Try Sente
  2. See what DevonThink Pro Office can do, maybe with Zotero export others have worked on. Surely that’s overkill?

Synching from annotation tools:

iAnnotate or GoodReader can “watch” folders. Main challenge is going to be coming up with a sufficiently small collection of PDFs to sync back and forth to the iPad.

  1. Stick with Zotero, maybe with files renamed from Zotfile, then use iAnnotatePDF’s “watch folder” feature to keep in sync.
  2. Stick with Papers2, manually manage the file synch for everything I’ve annotated, then watch its data directory with iAnnotatePDF as above.
  3. Try Mendeley, watch its data directory with iAnnotatePDF as above.

Thoughts and suggestions? What would you do?

Posted in books and reading, iOS: iPad, iPhone, etc. | Comments (1)

Papers2 does not integrate with external iPad applications in the way I expected

July 31st, 2011
by jodi

Papers2 does not integrate with external iPad applications in the way I expected. I use iPad applications like GoodReader, iAnnotatePDF, and PDFExpert to read and annotate papers.

The functionality I expected was:

  • Export from Papers to an external PDF annotation application
  • When I reopen Papers, the annotated PDF is shown in my library

However, here is what happens:

  • Export from Papers to an external PDF annotation application. It renames the file, using a random string as the filename.
  • When I reopen Papers, only the original (unannotated) PDF is in my library.
  • Alternately when I export from the external application, the annotated file is imported as a *new* PDF, unconnected to the original, with a random string used for the filename.

I started using Papers because managing filenames in iAnnotate wasn’t working: I couldn’t figure out which files were which. So this is absolutely key for me.

==

This is a bug report to Papers2, copied here since bug reports are private. Any workarounds or suggestions for alternate annotation/reference management workflows would be very welcome.

This annotation environment completely failed to meet my expectations: I expected to ‘Open In’ an annotation application; in fact there’s just ‘Export’ and ‘Import’, meaning that the annotated file isn’t automatically stored in the Papers2 library.

Posted in books and reading, iOS: iPad, iPhone, etc. | Comments (1)

GetSatisfaction’s “feedback-as-you-type”

July 24th, 2011
by jodi

GetSatisfaction does so many things right. Smart, immediate feedback is one example.

A couple weeks ago, I noticed this message while adding a post:
“EASE UP ON THE ALL CAPS IN YOUR TITLE. It looks like you’re shouting”
Feedback from GetSatisfaction: STOP SHOUTING

This is great in several ways:

  1. It’s immediate.
  2. It makes a single, clear, personalized ((i.e. specific to the situation)) suggestion.
  3. It uses a familiar analogy (“shouting”) — helping to explain the perceived problem.
  4. It’s not enforced: this nudges the poster, but leaves them to make up their own mind.
  5. It hints at humor/puts the shoe on the other foot (by USING CAPS FOR THE START OF THE MESSAGE).
  6. It’s not overwhelming.
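A nudge like this is cheap to build. The sketch below is my own guess at the logic, not GetSatisfaction’s code: it measures the share of uppercase letters in a title and returns a suggestion, never an error. The `threshold` and `min_letters` values are arbitrary assumptions.

```python
def caps_nudge(title: str, threshold: float = 0.7, min_letters: int = 8):
    """Return a gentle suggestion if a title is mostly uppercase, else None."""
    letters = [ch for ch in title if ch.isalpha()]
    # Skip short titles so acronyms like "NASA" don't trigger the nudge.
    if len(letters) < min_letters:
        return None
    share = sum(ch.isupper() for ch in letters) / len(letters)
    if share >= threshold:
        return "EASE UP ON THE ALL CAPS IN YOUR TITLE. It looks like you’re shouting"
    return None
```

Returning a message rather than raising an error matches point 4: the caller shows the hint next to the field and still accepts the post.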

Like their mood feedback it’s lightweight and appears to be effective.

Figuring out appropriate ways of presenting people with the “right” feedback at the right time will be important for a lot of the work I’m doing!

Posted in PhD diary, random thoughts, social web | Comments (0)

Reading Ontologically?

July 24th, 2011
by jodi

What are the right ontologies for reading? And what kind of ontology support would let books recombine themselves, on the fly, in novel ways?

Today keyword searches within books and book collections are commonplace, highlighting a word in your ebook reader can bring up a definition, and dictionaries grab recent examples of word use from microblogs. ((In 2003, Gregory Crane wrote that “Already the books in a digital library are beginning to read one another and to confer among themselves before creating a new synthetic document for review by their human readers.” When I first read it in 2006, that article seemed incredibly visionary to me. Yet these commonplace “syntheses” no longer seem extraordinary to me.)) But can’t we do more? What kind of synthesis do we need (and what is possible) for supporting readers of literature, classics, and humanities texts?

Current approaches seem to aim at analysis (e.g. getting an overview of the literary works of a period with “distant reading”/”macroanalysis”) and at creating flexible critical editions (e.g. structural, sometimes overlapping markup, as in TEI-based editions and projects like Wendell Piez’ Sonneteer ((currently offline, but brilliant; do check back, meanwhile see also his Digital Humanities 2010 talk notes)).) I would call these “sensemaking” approaches rather than tools for reading.

I was intrigued by the Bible Ontology ((It’s a bit disingenuous to advertise their work as an ontology: in fact they have applied the ontology, rather than just creating it.)) because of their tagline: “ever wanted to read and study the Bible Ontologically?” Yet I don’t really know what they mean by reading ontologically ((even though I’ve given a talk about supporting reading with ontologies!)).

Of course, they have recorded various pieces of data. For instance, for Rebekah, we see her children, siblings, birthplace, book and chapters she figures in, etc.: http://bibleontology.com/page/Rebekah. ((The most meaningful of their terms is the bop:isRelatedInEvent, perhaps since these events, like Isaac_blesses_Jacob, would require more analysis to discern.))

Rebekah, from bibleontology.com

They offer a SPARQL endpoint, so you can query. For instance, to find all the married women ((Gender is not recorded so we can’t (yet) ask for all the women overall, though I’ve just asked about this.)) (live query result):

PREFIX bop: <http://bibleontology.com/property/>
select ?s ?o where {?s bop:isWifeOf ?o }
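For readers new to SPARQL: the query asks for every subject/object pair linked by the `bop:isWifeOf` property. The same pattern match can be sketched over a tiny in-memory list of triples. The two triples below are hand-written for illustration and are not copied from the site’s data; `select_pairs` is a made-up helper, not part of any SPARQL library.

```python
# Namespace taken from the PREFIX line of the query above.
BOP = "http://bibleontology.com/property/"


def select_pairs(triples, predicate):
    """Mimic `select ?s ?o where {?s <predicate> ?o}` over (s, p, o) tuples."""
    return [(s, o) for s, p, o in triples if p == predicate]


# Toy data (assumed, for illustration only).
triples = [
    ("Rebekah", BOP + "isWifeOf", "Isaac"),
    ("Rebekah", BOP + "isRelatedInEvent", "Isaac_blesses_Jacob"),
]

wives = select_pairs(triples, BOP + "isWifeOf")
```

Here `wives` holds only the pair `("Rebekah", "Isaac")`; the event triple is filtered out because its predicate differs.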

Intense and long-term work has gone into Bible concordances, scholarship, etc., so it seems like a great use case for “reading ontologically”. With theologians and others looking at the site, using the SPARQL endpoint, etc., perhaps someone will be able to tell me what that means!

Posted in books and reading, future of publishing, semantic web | Comments (0)