Archive for the ‘books and reading’ Category
Papers2 does not integrate with external iPad applications in the way I expected. I use iPad applications like GoodReader, iAnnotatePDF, and PDFExpert to read and annotate papers.
The functionality I expected was:
- Export from Papers to an external PDF annotation application
- When I reopen Papers, the annotated PDF is shown in my library
However, here is what happens:
- Export from Papers to an external PDF annotation application. It renames the file, using a random string as the filename.
- When I reopen Papers, only the original (unannotated) PDF is in my library.
- Alternatively, when I export from the external application, the annotated file is imported as a *new* PDF, unconnected to the original, with a random string used for the filename.
I started using Papers because managing filenames in iAnnotate wasn’t working: I couldn’t figure out which files were which. So this is absolutely key for me.
==
This is a bug report to Papers2, copied here since bug reports are private. Any workarounds or suggestions for alternate annotation/reference management workflows would be very welcome.
This annotation environment completely failed to meet my expectations: I expected to ‘Open In’ an annotation application; in fact there’s just ‘Export’ and ‘Import’, meaning that the annotated file isn’t automatically stored in the Papers2 library.
Tags: annotation, bibliographic management, bug reports, iPad, mekentosj.com, Papers, Papers2, reference management
Posted in books and reading, iOS: iPad, iPhone, etc. | Comments (1)
What are the right ontologies for reading? And what kind of ontology support would let books recombine themselves, on the fly, in novel ways?
Today keyword searches within books and book collections are commonplace, highlighting a word in your ebook reader can bring up a definition, and dictionaries grab recent examples of word use from microblogs. But can’t we do more? What kind of synthesis do we need (and what is possible) for supporting readers of literature, classics, and humanities texts?
Current approaches seem to aim at analysis (e.g. getting an overview of the literary works of a period with “distant reading”/“macroanalysis”) and at creating flexible critical editions (e.g. structural, sometimes overlapping markup, as in TEI-based editions and projects like Wendell Piez’ Sonneteer). I would call these “sensemaking” approaches rather than tools for reading.
I was intrigued by the Bible Ontology because of their tagline: “ever wanted to read and study the Bible Ontologically?” Yet I don’t really know what they mean by reading ontologically.
Of course, they have recorded various pieces of data. For instance, for Rebekah, we see her children, siblings, birthplace, book and chapters she figures in, etc.: http://bibleontology.com/page/Rebekah.

They offer a SPARQL endpoint, so you can query. For instance, to find all the married women (live query result):
PREFIX bop: <http://bibleontology.com/property/>
select ?s ?o where {?s bop:isWifeOf ?o }
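For anyone who would rather script this than paste it into a web form, here is a minimal Python sketch of sending the same query and reading back the pairs. It assumes the endpoint follows the standard SPARQL protocol (query passed as a `query` parameter, JSON results requested via the `Accept` header). The endpoint URL below is a placeholder I made up, not something stated on the site, and the function names are my own.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: placeholder endpoint URL; substitute the address the site documents.
ENDPOINT = "http://bibleontology.com/sparql"

# The query from the post, unchanged.
QUERY = """\
PREFIX bop: <http://bibleontology.com/property/>
select ?s ?o where {?s bop:isWifeOf ?o }
"""


def build_request(endpoint, query):
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def parse_pairs(results_json):
    """Turn a SPARQL JSON result document into a list of (?s, ?o) value pairs."""
    data = json.loads(results_json)
    return [
        (row["s"]["value"], row["o"]["value"])
        for row in data["results"]["bindings"]
    ]


def run_query(endpoint=ENDPOINT, query=QUERY):
    """Send the query over the network and return the parsed pairs."""
    with urlopen(build_request(endpoint, query), timeout=30) as response:
        return parse_pairs(response.read().decode("utf-8"))
```

Calling `run_query()` needs network access and a correct endpoint address. `build_request` and `parse_pairs` can be tried offline.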
Intense and long-term work has gone into Bible concordances, scholarship, etc., so it seems like a great use case for “reading ontologically”. With theologians and others looking at the site, using the SPARQL endpoint, etc., perhaps someone will be able to tell me what that means!
Tags: Bible, Bible Ontology, ontologies, ontologies for reading, reading ontologically, SPARQL
Posted in books and reading, future of publishing, semantic web | Comments (0)
Douglas Knox touches on the future of “distant reading” with Google Books.
For rights management reasons and also for material engineering reasons, the research architecture will move the computation to the data. That is, the vision of the future here is not one in which major data providers give access to data in big downloadable chunks for reuse and querying in other contexts, but one in which researchers’ queries are somehow formalized in code that the data provider’s servers will run on the researcher’s behalf, presumably also producing economically sized result sets.
There are also some implicit research goals for people working in cyberinfrastructure, digital humanities support, and text mining who aim to support humanities scholars:
Whatever we mean by “computation,” that is, can’t be locked up in an interface that tightly binds computation and data. Readers already need (and for the most part do not have) our own agents and our own data, our own algorithms for testing, validating, calibrating, and recording our interaction with the black boxes of external infrastructure.
This kind of blackbox infrastructure contrasts with “using technology critically and experimentally, fiddling with knobs to see what happens, and adjusting based on what they find,” when a scholar is “free to write short scripts and see results in quick cycles of exploration”.
I’m pulling these out of context — from Douglas’ post on the Digital Humanities 2011 conference.
Tags: dh11, digital humanities, distant reading, Google Books, macroanalysis
Posted in books and reading, information ecosystem | Comments (0)
Holding on to old business models is not the way to endear yourself to customers.
But unfortunately this is also, simultaneously, a bad time to be a reader. Because the dinosaurs still don’t get it. Ten years of object lessons from the music industry, and they still don’t get it. We have learned, painfully, that media consumers—be they listeners, watchers, or readers—want one of two things:
- DRM-free works for a reasonable price
- or, unlimited single-payment subscription to streaming/DRMed works
Give them either of those things, and they’ll happily pay. Look at iTunes. Look at Netflix. But give them neither, and they’ll pirate. So what are publishers doing?
- Refusing to sell DRM-free books. My debut novel will be re-e-published by the Friday Project imprint of HarperCollins UK later this year; both its editor and I would like it to be published without DRM; and yet I doubt we will be able to make that happen.
- crippling library e-books
- and not offering anything even remotely like a subscription service.
– Jon Evans, When Dinosaurs Ruled the Books, via James Bridle’s Stop Press
Eric Hellman is one of the pioneers of tomorrow’s ebook business models: his company, Gluejar, uses a crowdfunding model to re-release books under Creative Commons licenses. Authors and publishers are paid; fans pay for the books they’re most interested in; and everyone can read and distribute the resulting “unglued” ebooks. Everybody wins.
Tags: business models, content, DRM, ebooks, publishing, subscriptions
Posted in books and reading, future of publishing, information ecosystem | Comments (0)
To support reading, think about diversity of reading styles.
A study of “How examiners assess research theses” mentions the diversity:
[F]our examples give a good indication of the range of ‘reading styles’:
- A (Hum/Male/17) sets aside time to read the thesis. He checks who is in the references to see that the writers are there who should be there. Then he reads slowly, from the beginning like a book, but taking copious notes.
- B (Sc/Male/22) reads the thesis from cover to cover first without doing anything else. For the first read he is just trying to gain a general impression of what the thesis is about and whether it is a good thesis—that is, are the results worthwhile. He can also tell how much work has actually been done. After the first read he then ‘sits on it’ for a while. During the second reading he starts making notes and reading more critically. If it is an area with which he is not very familiar, he might read some of the references. He marks typographical errors, mistakes in calculations, etc., and makes a list of them. He also checks several of the references just to be sure they have been used appropriately.
- C (SocSc/Female/27) reads the abstract first and then the introduction and the conclusion, as well as the table of contents to see how the thesis is structured; and she familiarises herself with appendices so that she knows where everything is. Then she starts reading through; generally the literature review, and methodology, in the first weekend, and the findings, analysis and conclusions in the second weekend. The intervening week allows time for ideas to mull over in her mind. On the third weekend she writes the report.
- D (SocSc/Male/15) reads the thesis from cover to cover without marking it. He then schedules time to mark it, in about three sittings, again working from beginning to end. At this stage he ‘takes it apart’. Then he reads the whole thesis again.
from [1] Mullins, G. & Kiley, M. (2002), It’s a PhD, not a Nobel Prize: how experienced examiners assess research theses, Studies in Higher Education, 27, 4, pp.369-386. DOI:10.1080/0307507022000011507
Parenthetical comments are (discipline/gender/interview number). Thanks to the NUIG Postgrad Research Society for suggesting this paper.
Posted in books and reading, higher education, PhD diary, scholarly communication | Comments (0)
Apple’s press release about its “new subscription services” seems at first innocuous, and the well-crafted quote from Steve Jobs has been widely reposted:
“when Apple brings a new subscriber to the app, Apple earns a 30 percent share; when the publisher brings an existing or new subscriber to the app, the publisher keeps 100 percent and Apple earns nothing.” Yet analysts reading between the lines have been less than pleased.
Bad for publishers
The problems for publishers? (See also “Steve Jobs to pubs: Our way or highway”)
- Apple takes a 30% cut of all in-app purchases
- Apps may not bypass in-app purchase: apps may not link to an external website (such as Amazon) that allows customers to buy content or subscriptions.
- Content available for purchase in the app cannot be cheaper elsewhere.
- The customer’s demographic information resides with Apple, not with the publisher. Customers must opt in to share their name, email, and ZIP code with the publisher, though Apple will of course have this information.
- Limited reaction time; changes will be finalized by June 30th.
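To make the 30% cut concrete, here is a rough back-of-the-envelope sketch in Python. It only models the split described in the quote above (Apple keeps 30% of sales it brings in, the publisher keeps 100% of its own). The subscription price and subscriber counts are made-up numbers for illustration, and the function names are my own.

```python
from decimal import Decimal

APPLE_SHARE = Decimal("0.30")  # the 30% cut described in the post


def publisher_revenue(price, via_apple, via_publisher, share=APPLE_SHARE):
    """Publisher's take: 70% of Apple-acquired sales, 100% of its own."""
    return price * (1 - share) * via_apple + price * via_publisher


def break_even_price(price, share=APPLE_SHARE):
    """In-app price at which one Apple-acquired sale nets the old price."""
    return (price / (1 - share)).quantize(Decimal("0.01"))


# Hypothetical figures: a 10.00 subscription, 300 subscribers who signed up
# in the app and 700 who signed up on the publisher's own site.
price = Decimal("10.00")
print(publisher_revenue(price, 300, 700))  # 9100 instead of 10000
print(break_even_price(price))             # 14.29
```

With these invented numbers the publisher gives up 900 of 10,000 in revenue, and would need to charge about 14.29 to net 10.00 on an in-app sale. Because content in the app cannot be cheaper elsewhere, that higher price would also have to be charged outside the app.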
Bad for customers?
And there are problems for customers, too.
- Reduction of content available in apps (likely for the near-term).
- More complex, clunky purchase workflows (possible).
Publishers may sell material only outside of apps, from their own website, to avoid paying 30% to Apple. Will we see a proliferation of publisher-run stores?
- Price increases to cover Apple’s commission (likely).
If enacted, these must apply to all customers, not just iOS device users.
- Increased lockdown of content in the future (probably).
Apple already prevents some iBooks customers from reading books they bought and paid for, using extra DRM that affects some jailbroken devices. This is despite the fact that jailbreaking is explicitly legal in the United States, and that carrier unlock and SIM-free phones are not available in the U.S.
More HTML5 apps?
The upside? Device-independent HTML5 apps may see wider adoption. HTML5 mobile apps work well on iOS, on other mobile platforms, and on laptops and desktops.
For ebooks, HTML5 means Ibis Reader and Book.ish. For publishers looking to break free of Apple, yet satisfy customers, Ibis Reader may be a particularly good choice: this year they are focusing on licensing Ibis Reader, as Liza Daly’s Threepress announced in a savvy and well-timed post, anticipating Apple’s announcement. Having been a beta tester of Ibis Reader, I can recommend it!
If you know of other HTML5 ebook apps, please leave them in the comments.
Tags: agency model, Apple, book.ish, business models, content, HTML5, ibis reader, ibisreader, iBooks, iOS, iOS: iPad, iPhone, etc., iPad, jailbreaking, middleman, subscriptions, walled garden
Posted in books and reading, future of publishing, information ecosystem, iOS: iPad, iPhone, etc. | Comments (0)
Nicole Henning suggests that academic libraries and scholarly presses work together to create the ultimate mobile app for scholarly ereading. I think about the requirements a bit differently, in terms of the functions the app must support.
The main functions are obtaining materials, reading them, organizing them, keeping them, and sharing them.
For obtaining materials, the key new requirement is to simplify authentication: handle campus authentication systems and personal subscriptions. Multiple credentialed identities should be supported. A secondary consideration is that RSS feeds (e.g. for journal tables of contents) should be supported.
For reading materials, the key requirement is to support multiple formats in the same application. I don’t know of a web app or mobile app that supports PDF, EPUB, and HTML. Reading interfaces matter: look to Stanza and Ibis Reader for best-in-class examples.
For organizing materials, the key is synergy between the user’s data and existing data. Allow tags, folders, and multiple collections. But also leverage existing publisher and library metadata. Keep it flexible, allowing the user to modify metadata for personal use (e.g. for consistency or personal terminology) and to optionally submit corrections.
For keeping materials, import, export, and sync content from the user’s chosen cloud-based storage and WebDAV servers. No other device (e.g. laptop or desktop) should be needed.
For sharing materials, support lightweight micropublishing on social networks and email; networks should be extensible and user-customizable. Sync to or integrate with citation managers and social cataloging/reading list management systems.
Regardless of the ultimate system, I’d stress that device independence is important, meaning that an HTML5 website would probably be the place to start: look to Ibis Reader as a model.
Tags: beyondthePDF, mobile, scholarly publishing
Posted in books and reading, future of publishing, information ecosystem, library and information science, scholarly communication | Comments (5)
Today, in many countries around the world, new works become public property: January 1st every year is Public Domain Day. Material in the public domain can be used, remixed and shared freely — without violating copyright and without asking permission.
However, in the United States, not a single new work entered the public domain today. Americans must wait 8 more years: Under United States copyright law, nothing more will be added to the public domain until January 1, 2019.
Until the 1970s, the maximum copyright term was 56 years. Under that law, Americans would have been able to truly celebrate Public Domain Day:
- All works published in 1954 would be entering the public domain today.
- Up to 85% of all copyrighted works from 1982 would be entering the public domain today (Copyright Office and Duke).
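The two years in that list follow from simple arithmetic, sketched below in Python. It rests on assumptions not spelled out in the post: that the 56-year maximum was a 28-year initial term plus an optional 28-year renewal (my understanding of the pre-1978 law), that terms are counted in whole calendar years, and that "today" is January 1, 2011, which is what "8 more years" until 2019 implies. The function name is my own.

```python
INITIAL_TERM = 28  # assumed initial term under the pre-1978 law, in years
RENEWAL_TERM = 28  # assumed renewal term; 28 + 28 gives the 56-year maximum


def public_domain_year(published, renewed):
    """First calendar year in which the work would be in the public domain.

    Simplification: the term is counted in whole calendar years, and the
    work enters the public domain on January 1 of the following year.
    """
    term = INITIAL_TERM + (RENEWAL_TERM if renewed else 0)
    return published + term + 1


print(public_domain_year(1954, renewed=True))   # 2011
print(public_domain_year(1982, renewed=False))  # 2011
```

So a 1954 work that was renewed and a 1982 work that was never renewed would both arrive in the public domain on the same day. The "up to 85%" figure would then be the share of works whose copyright was not renewed.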
Instead, only works published before 1923 are conclusively in the public domain in the U.S. today. What about post-1923 publications? In the United States, it’s complicated.
For more information on Public Domain Day and the United States, Duke’s Center for the Study of the Public Domain has a series of useful pages.
Tags: copyright, copyright law, public domain
Posted in books and reading, information ecosystem, intellectual freedom, library and information science | Comments (0)
What if you could rerun computational experiments from within a scientific paper?
The GenePattern add-on for Word for Windows integrates reusable genomic experiment pipelines into Microsoft Word. Readers can rerun the original or modified experiments from within the document by connecting to a GenePattern server.

Rerunning a pipeline inside Word
I don’t run Windows, so I took this screenshot from a video produced at the Broad Institute of MIT and Harvard, where GenePattern is developed.
Readers without Word for Windows can also access the experimental pipelines by exporting them from the document: just run a GenePatternDocumentExtractor command from a GenePattern server. The GenePattern public server was very easy to access and start using. Here’s what the GenePatternDocumentExtractor command looks like:

Running GenePatternDocumentExtractor at the GenePattern public server
Unfortunately, the jobs I ran didn’t extract any pipelines from the Institute’s sample DOC. I’ve sent in an inquiry (either I’m doing something wrong or there’s a bug; either way it’s useful). I was very impressed that I could make my jobs public, then refer to them by URL in my email, to make clear what exactly I did.
The GenePattern add-on for Word is another find from the beyondthepdf list. Its development was funded by Microsoft. See also Accessible Reproducible Research by Jill P. Mesirov (Science, 327:415, 2010; doi:10.1126/science.1179653), which describes the underlying philosophy: have a Reproducible Research System (RRS) made up of an environment for doing computational work (the Reproducible Research Environment or RRE) and an authoring environment (the Reproducible Research Publisher or RRP) which links back to the research system.
Tags: beyondthePDF, GenePattern, Microsoft Word, Reproducible Research Environment, Reproducible Research Publisher, Reproducible Research System, Word for Windows
Posted in books and reading, future of publishing, information ecosystem, scholarly communication, Uncategorized | Comments (1)