Archive for the ‘books and reading’ Category

Happy Public Domain Day!

January 2nd, 2011

Today, in many countries around the world, new works become public property: January 1st every year is Public Domain Day. Material in the public domain can be used, remixed and shared freely — without violating copyright and without asking permission.

However, in the United States, not a single new work entered the public domain today. Americans must wait 8 more years: Under United States copyright law, nothing more will be added to the public domain until January 1, 2019.

Until the 1970s, the maximum copyright term was 56 years. Under that law, Americans would have been able to truly celebrate Public Domain Day:

  1. All works published in 1954 would be entering the public domain today.
  2. Up to 85% of all copyrighted works from 1982 would be entering the public domain today (per the Copyright Office and Duke).

Instead, only works published before 1923 are conclusively in the public domain in the U.S. today. What about post-1923 publications? It’s complicated ((609 pages’ worth of complicated)).
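For the curious, the underlying date arithmetic is simple. Here’s a minimal sketch in Python, assuming a term measured from the publication year with year-end expiration (a simplification: actual US terms turn on renewal, notice, and other details):

    # Sketch: a work published in pub_year, protected for term_years
    # (rounded to year end), enters the public domain the following Jan 1.
    def pd_entry_year(pub_year, term_years):
        return pub_year + term_years + 1

    print(pd_entry_year(1954, 56))  # 2011, under the old 56-year maximum
    print(pd_entry_year(1923, 95))  # 2019, under the current 95-year term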

For more information on Public Domain Day and the United States, Duke’s Center for the Study of the Public Domain has a series of useful pages.

Posted in books and reading, information ecosystem, intellectual freedom, library and information science | Comments (0)

Accessing genomics workflows from Word documents with GenePattern

November 14th, 2010

What if you could rerun computational experiments from within a scientific paper?

The GenePattern add-on for Word for Windows integrates reusable genomic experiment pipelines into Microsoft Word. Readers can rerun the original or modified experiments from within the document by connecting to a GenePattern server.

Rerunning a pipeline inside Word

I don’t run Windows, so I took this screenshot from a video produced at the Broad Institute of MIT and Harvard, where GenePattern is developed.

Readers without Word for Windows can also access the experimental pipelines by exporting them from the document: just run the GenePatternDocumentExtractor command on a GenePattern server. The GenePattern public server was very easy to access and start using. Here’s what the GenePatternDocumentExtractor command looks like:

Running GenePatternDocumentExtractor at the GenePattern public server
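If you’d rather script the extraction than click through the web interface, here’s a rough sketch using the (much newer) genepattern-python client. The server URL and credentials are placeholders, and the input parameter name is a guess; the loop prints the module’s real parameter names:

    import gp  # the genepattern-python client

    # Connect to a GenePattern server (URL and credentials are placeholders)
    server = gp.GPServer('https://cloud.genepattern.org/gp', 'myuser', 'mypassword')

    # Load the extractor module and list its actual parameter names
    task = gp.GPTask(server, 'GenePatternDocumentExtractor')
    task.param_load()
    for param in task.params:
        print(param.get_name())

    # Submit a job; 'document.file' is a guessed parameter name
    spec = task.make_job_spec()
    spec.set_parameter('document.file', 'http://example.org/paper-with-pipeline.doc')
    job = server.run_job(spec, wait_until_done=True)

    # Any extracted pipelines show up as output files
    for output in job.get_output_files():
        print(output.get_url())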

Unfortunately, the jobs I ran didn’t extract any pipelines from the Institute’s sample DOC. I’ve sent in an inquiry (either I’m doing something wrong or there’s a bug; either way, reporting it is useful). I was very impressed that I could make my jobs public and then refer to them by URL in my email, to make clear exactly what I did.

The GenePattern add-on for Word is another find from the Beyond the PDF list. Its development was funded by Microsoft. See also Accessible Reproducible Research by Jill P. Mesirov (Science 327:415, 2010, doi:10.1126/science.1179653), which describes the underlying philosophy: a Reproducible Research System (RRS) made up of an environment for doing computational work (the Reproducible Research Environment, or RRE) and an authoring environment (the Reproducible Research Publisher, or RRP) that links back to the research system.

Posted in books and reading, future of publishing, information ecosystem, scholarly communication | Comments (1)

A Model-View-Controller perspective of scholarly articles

November 13th, 2010

A scholarly paper is not a PDF. A PDF is merely one view of a scholarly paper. To push ‘beyond the PDF’, we need design patterns that allow us to segregate the user interface of the paper (whether it is displayed as an aggregation of triples, a list of assertions, a PDF, an ePub, HTML, …) from the thing itself.

Towards this end, Steve Pettifer has a Model-View-Controller perspective on scholarly articles, which he shared in a post on the Beyond the PDF listserv, where discussions are leading up to a workshop in January. I am awe-struck: I wish I’d thought of this way of separating the structure and explaining it.

I think a lot of the disagreement about the role of the PDF can be put down to trying to overload its function: to try to imbue it with the qualities of both ‘model’ and ‘view’. … One of the things that software architects (and I suspect designers in general) have learned over the years is that if you try to give something functions that it shouldn’t have, you end up with a mess; if you can separate out the concerns, you get a much more elegant and robust solution.

My personal take on this is that we should keep these things very separate, and that if we do this, then many of the problems we’ve been discussing become more clearly defined (and I hope, many of the apparent contradictions, resolved).

So… a PDF (or come to that, an e-book version or a html page) is merely a *view* of an article. The article itself (the ‘model’) is a completely different (and perhaps more abstract) thing. Views can be tailored for a particular purpose, whether that’s for machine processing, human reading, human browsing, etc etc.

[paragraph break inserted]

The relationship between the views and their underlying model is managed by the concept of a ‘controller’. For example, if we represent an article’s model in XML or RDF (its text, illustrations, association nanopublications, annotations and whatever else we like), then that model can be transformed in to any number of views. In the case of converting XML into human-readable XHTML, there are many stable and mature technologies (XSLT etc). In the case of doing the same with PDF, the traditional controller is something that generates PDFs.

[paragraph break inserted]

The thing that’s been (somewhat) lacking so far is the two-way communication between view and model (via controller) that’s necessary to prevent the views from ossifying and becoming out of date (i.e. there’s no easy way to see that comments have been added to the HTML version of an article’s view if you happen to be reading the PDF version, so the view here can rapidly diverge from its underlying model).

[paragraph break inserted, link added]

Our Utopia software is an attempt to provide this two-way controller for PDFs. I believe that once you have this bidirectional relationship between view and model, then the actual detailed affordances of the individual views (i.e. what can a PDF do well / badly, what can HTML do well / badly) become less important. They are all merely means to channeling the content of an article to its destination (whether that’s human or machine).

The good thing about having this ‘model view controller’ take on the problem is that only the model needs to be pinned down completely …

Perhaps separating out our concerns in this way — that is, treating the PDF as one possible representation of an article — might help focus our criticisms of the current state of affairs? I fear at the moment we are conflating the issues to some degree.

– Steve Pettifer in a Beyond the PDF listserv post

I’m particularly interested in hearing if this perspective, using the MVC model, makes sense to others.
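For the programmers reading along, here’s a toy sketch in Python of the separation Pettifer describes: one model, many views, and a controller that carries annotations back into the model so no view ossifies. The names are invented for illustration, and this has nothing to do with Utopia’s actual internals.

    # Model: the article itself, independent of any presentation
    class Article:
        def __init__(self, title, body):
            self.title = title
            self.body = body
            self.comments = []  # annotations live in the model, not in a view

    # Views: different renderings of the same underlying model
    def html_view(article):
        notes = ''.join(f'<aside>{c}</aside>' for c in article.comments)
        return f'<h1>{article.title}</h1><p>{article.body}</p>{notes}'

    def text_view(article):
        return '\n\n'.join([article.title, article.body] + article.comments)

    # Controller: mediates in both directions between views and model
    class Controller:
        def __init__(self, article):
            self.article = article

        def render(self, view):
            return view(self.article)

        def annotate(self, comment):
            # A comment added while reading any view updates the model,
            # so the HTML and plain-text views both see it immediately.
            self.article.comments.append(comment)

    c = Controller(Article('On Scholarly Articles', 'A paper is not a PDF.'))
    c.annotate('Agreed!')
    print(c.render(html_view))
    print(c.render(text_view))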

Posted in books and reading, future of publishing, information ecosystem, library and information science, scholarly communication, social semantic web | Comments (9)

Ebook pricing fail (Springer edition)

November 1st, 2010

I wrote Springer to ask about buying an ebook that’s not in our university subscriptions. They sell the print copy at €62.95, but the electronic copy comes to €425 when bought chapter by chapter.

Publishers: this is short-sighted (not to mention frustrating) — especially when your customers are looking for a portable copy of a book they already own!

———- Forwarded message ———-
From: Springerlink, Support, Springer DE
Date: Fri, Oct 29, 2010 at 8:46 PM
Subject: WG: ebook pricing

Dear Jodi,

Thank you for your message.

On SpringerLink you can purchase online single journal articles and book chapters, but no complete ebooks.
eBooks are sold by Springer in topical eBook packages only.

with kind regards,
SpringerLink Support Team
eProduct Management & Innovation | SpringerLink Operations
support.springerlink@springer.com | + 49 (06221) 4878 743
www.springerlink.com

—–Original Message—–
From: Jodi Schneider
Sent: Thursday, October 28, 2010 5:09 PM
To: MetaPress Support
Subject: ebook pricing

Hi,

I’m interested in buying a copy of [redacted] as an ebook:
http://www.springerlink.com/content/[redacted]

This book has 17 chapters, which seem to be priced at 25 EUR each = 425 EUR.

But I could buy a print version, new at springer.com for 62.95 EUR:
http://www.springer.com/mathematics/book/[redacted]

Can you help me get the ebook at this price?
Thanks!
-Jodi

Posted in books and reading, future of publishing | Comments (3)

CiTO in the wild

October 18th, 2010

CiTO has escaped the lab and can now be used either directly in the CiteULike interface or with CiteULike machine tags. Go Citation Typing Ontology!

In the CiteULike Interface

To add a CiTO relationship between articles using the CiteULike interface, both articles must be in your own library. You’ll see a “Citations (CiTO)” section after your tags. Click on edit and set the current article as the target.

First set the CiTO target

Then navigate around your own library to find a related article. Now you can add a CiTO tag.

Adding a CiTO tag in CiteULike

There are a lot of choices. Choose just one. :)

CiTO Object Properties appear in the dropdown

Congratulations, you’ve added a CiTO relationship! Now mousing over the CiTO section will show details on the related article.

Mouse over the resulting CiTO tag to get details of the related article

Machine Tags

Machine tags take fewer clicks but a little more know-how. They can be added just like any other tag, as long as you know the secret formula: cito--(insert a CiTO Object Property from this list)--(insert the article permalink number here). Here are two concrete examples.

First, we can keep a list of articles citing a paper. For example, tagging an article

cito--cites--137511

says “this article CiTO:cites article 137511”. Article 137511 can be found at http://www.citeulike.org/article/137511, aka JChemPaint – Using the Collaborative Forces of the Internet to Develop a Free Editor for 2D Chemical Structures. Then we can get the list of (hand-tagged) citations to the article. Look — a community-generated reverse citation index!

Second, we can indicate specific relationships between articles, whether or not they cite each other. For example, tagging an article

cito--usesmethodin--423382

says “this item CiTO:usesmethodin item 423382”. Item 423382 is found at http://www.citeulike.org/article/423382, aka The Chemistry Development Kit (CDK): An Open-Source Java Library for Chemo- and Bioinformatics.
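Since the formula is so regular, it’s easy to script. Here’s a minimal sketch in Python for building and parsing these machine tags (the helper names are mine):

    import re

    # Build a CiteULike machine tag: cito--<object property>--<article id>
    def make_cito_tag(cito_property, article_id):
        return f'cito--{cito_property.lower()}--{article_id}'

    # Parse one back apart; returns None for ordinary (non-CiTO) tags
    CITO_TAG = re.compile(r'^cito--([a-z]+)--(\d+)$')

    def parse_cito_tag(tag):
        match = CITO_TAG.match(tag)
        if not match:
            return None
        prop, article_id = match.groups()
        return prop, article_id, f'http://www.citeulike.org/article/{article_id}'

    print(make_cito_tag('cites', 137511))            # cito--cites--137511
    print(parse_cito_tag('cito--usesmethodin--423382'))
    # ('usesmethodin', '423382', 'http://www.citeulike.org/article/423382')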

Upshot

Automation and improved annotation interfaces will make CiTO more useful. CiTO:cites and CiTO:isCitedBy could be used to mark up existing relationships in digital libraries such as the ACM Digital Library and CiteSeer, and could enhance collections like Google Books and Mendeley, making human navigation and automated use easier. To capture more sophisticated relationships, David Shotton hopes that authors will mark up citations before submitting papers; if it’s required, anything is possible. Data curators and article commentators may observe contradictions between papers, or reuses of methodology; in these cases CiTO could be layered with an annotation ontology such as AO in order to make the provenance of such assertions clear.

CiTO could put pressure on existing publishers and information providers to improve their data services, perform more data cleanup, or expose bibliographies in open formats. Improved tools will be needed, as well as communities willing to add data by hand, and algorithms for inferring deep citation relationships.

One remaining challenge is aggregating CiTO relationships across bibliographic data providers; article identifiers such as the DOI are unfortunately not universal, and the bibliographic environment is messy, with many types of items, from books to theses to white papers to articles to reports. CiTO and related ontologies will help make the bibliographic web, and the relationships between these items, explicit on the web of (meta)data.

Further Details

CiTO is part of an ecosystem of ontologies called the Semantic Publishing and Referencing Ontologies (SPAR); see also the JISC Open Citation Project, which is taking bibliographic data to the Web, and the JISC Open Bibliography Project. For those familiar with Shotton’s earlier writing on CiTO, note that SPAR breaks out some parts of the earlier formulation of that ontology.

Posted in argumentative discussions, books and reading, information ecosystem, library and information science, PhD diary, scholarly communication, semantic web | Comments (3)

Funding Models for Books

July 17th, 2010

Paying per copy for books “developed in response to the invention of the printing press”; a Readercon panel discussed some alternatives.

Existing alternatives, as noted in Cecilia Tan’s summary of the panel:

  • the donation model
  • the Kickstarter model
  • the “ransom” model
  • the subscription or membership model
  • the “perks” model
  • the merchandising model
  • the collectibles model
  • the company or support grant model
  • the voting model
  • the hits/pageviews model

Any synergies with Kevin Kelly’s Better than Free?

via HTLit’s Readercon overview

Posted in books and reading, future of publishing | Comments (0)

Book as experience? Or book as storage/retrieval mechanism?

June 24th, 2010

Here’s a research question for historians of the book (and maybe book futurists, too):

What’s the key aspect of the book?

  1. the cognitive experience
  2. the information storage and retrieval it enabled (e.g. features such as tables of contents and indexes within a book itself; reproducibility of ‘exact’ copies; wider distribution and ownership of books; the ability to have multiple books on the shelf; etc.)?

That arises from Steven Berlin Johnson:

[W]as the intellectual revolution post-Gutenberg driven by the mental experience of long-form reading? Or was it driven by the ability to share information asynchronously, and transmit that information easily around the globe? I think it is a mix of the two, but Nick, taking his cues from McLuhan, places almost all of his emphasis on the cognitive effects of deep focus reading. There’s no real way to prove it, but I think there’s a very strong case to be made that the information storage-and-retrieval advances made possible by the book were more important to the Enlightenment and the modern age than the contemplative mode of the literary mind. And if that’s true, then the Web should be seen as a continuation of the Gutenberg galaxy, not a betrayal of it.

from a post where Steven Berlin Johnson summarizes his own New York Times essay, Yes, People Still Read, but Now It’s Social, responding to Nick Carr’s book The Shallows: What the Internet Is Doing to Our Brains. I assume Carr’s current position is well represented by his 2008 article in The Atlantic, Is Google Making Us Stupid? What the Internet Is Doing to Our Brains.

Posted in books and reading, information ecosystem | Comments (0)

Locative texts

June 13th, 2010

A post at HLit got me thinking about locative hypertexts, which are meant to be read in a particular place.

On Monday, Liza Daly shared an ePub demo that pulls in the reader’s location and makes decisions about the character’s actions based on movement. Think of it as a choose-your-own-adventure novel crossed with a geo-aware travel guide. It’s a brief proof-of-concept, and the most exciting part is that the code is free for the taking under the very permissive (GPL- and commercial-compatible) MIT License. Thanks, Liza and Threepress, for lowering the barriers to experimentation with ebooks!
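The core trick is simple enough to sketch. Here’s a rough Python version of the idea: story fragments that unlock only when the reader is near the right spot. The coordinates and text are made up, and Daly’s actual demo is JavaScript inside an ePub, not Python.

    import math

    # Each node unlocks only within radius_m metres of (lat, lon).
    # Coordinates and text are invented for illustration.
    STORY = [
        {'lat': 42.3588, 'lon': -71.0707, 'radius_m': 150,
         'text': 'On Beacon Hill, the trail begins...'},
        {'lat': 42.3601, 'lon': -71.0589, 'radius_m': 150,
         'text': 'By the old State House, a second clue...'},
    ]

    def distance_m(lat1, lon1, lat2, lon2):
        # Haversine great-circle distance in metres
        r = 6371000.0
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp = math.radians(lat2 - lat1)
        dl = math.radians(lon2 - lon1)
        a = (math.sin(dp / 2) ** 2
             + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
        return 2 * r * math.asin(math.sqrt(a))

    def fragments_here(lat, lon):
        # Story fragments the reader has unlocked at this position
        return [node['text'] for node in STORY
                if distance_m(lat, lon, node['lat'], node['lon']) <= node['radius_m']]

    print(fragments_here(42.3589, -71.0706))  # near the first node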

‘Locative hypertexts’ also bring to mind GPS-based guidebooks, as envisioned in the 2007 Editus video ‘Possible ou probable…?’ ((Editus’ copy of the video)).

Tim McCormick summarizes:

In the 9-minute video, we get mouth-watering, partly tongue-in-cheek scenes of continental Europe’s quality-of-life — fantastic trains & pedestrian streetscapes, independent bookstores, delicious food, world-class museums, weekend getaway to Bruges, etc. — as the movie follows a couple through a riotous few days of E-book high living.

On their fabulously svelte, Kindle 2-like devices, they

  • read and purchase novels
  • enjoy reading on the beach
  • get multimedia museum guides
  • navigate foreign cities with ease
  • stay in multimedia contact with friends and family
  • collaborate with colleagues on shared virtual desktops while at sidewalk cafes
  • see many hi-resolution Breughel paintings online and off that I’m dying to see myself

etc.

Multimedia guidebooks ((e.g. the Lonely Planet city guide series for iPhone)) are approaching this vision. Combine them with turn-by-turn directions (which also already exist), and connectivity and privacy become the largest remaining obstacles.

So then what about location-based storytelling? I got to thinking about the iPhone apps I’ve already encountered, which are intended for use in particular places:

  • Walking Cinema: Murder on Beacon Hill – a murder mystery/travel series based in Boston (available as an iPhone app and podcast).
  • Museum of the Phantom City: Other Futures – a multimedia map/alternate history of NYC architecture, described as a way to “see the city that could have been”. It maps never-built structures envisioned by Buckminster Fuller, Gaudi, and others – ideally while you’re “standing on the projects’ intended sites”.
  • Museum of London: Streetmuseum – a true history of London in photos, meant for use on the streets
  • Historic Earth – historical maps that could make interesting settings for locative historical storytelling

Posted in books and reading, future of publishing, information ecosystem, iOS: iPad, iPhone, etc. | Comments (0)

Weeding

October 28th, 2009

I have always admired what Knuth says about email:

Email is a wonderful thing for people whose role in life is to be on top of things. But not for me; my role is to be on the bottom of things. What I do takes long hours of studying and uninterruptible concentration. I try to learn certain areas of computer science exhaustively; then I try to digest that knowledge into a form that is accessible to people who don’t have time for such study.

(emphasis added) – Knuth versus Email

I’m not giving up email (heaven forbid!). And none of us are Knuth!

But I am weeding: culling away most listservs and feeds that aren’t core to the semantic web or the social web, or to my personal life (or joy). I’ll be keeping an eye on Twitter, of course.


cc licensed flickr photo “weedy mulch on the start of compost pile” by Liz Henry

This (hopefully) frees up more attention for figuring out what’s going on in the semantic web and social web communities, and for my literature review. But it means I’m going to have to accept lagging behind a little in everything else.

I have a love for being in the know and finding interesting links. That makes it hard to stay on the bottom of things! For now, this energy and mindset get directed to my literature review (which is very much like finding the coolest new thing, only that thing may be from 1985, nearly wholly forgotten, and not cool or new to anyone except oneself).

The good news is that when studying the social web, total disconnection generally isn’t desirable!

Posted in books and reading, PhD diary | Comments (0)

Google Books settlement: a monopoly waiting to happen

October 10th, 2009

Will Google Books create a monopoly? Some ((“Several European nations, including France and Germany, have expressed concern that the proposed settlement gives Google a monopoly in content. Since the settlement was the result of a class action against Google, it applies only to Google. Other companies would not be free to digitise books under the same terms.” (bolding mine) – Nigel Kendall, Times (UK) Online, Google Book Search: why it matters )) people think ((“Google’s five-year head start and its relationships with libraries and publishers give it an effective monopoly: No competitor will be able to come after it on the same scale. Nor is technology going to lower the cost of entry. Scanning will always be an expensive, labor-intensive project.” (bolding mine) – Geoffrey Nunberg, Chronicle of Higher Education, Google’s Book Search: A Disaster for Scholars (pardon the paywall))) so. Brin claims it won’t:

If Google Books is successful, others will follow. And they will have an easier path: this agreement creates a books rights registry that will encourage rights holders to come forward and will provide a convenient way for other projects to obtain permissions.

– Sergey Brin, New York Times, A Library To Last Forever

Brin is wrong: the proposed Google Books settlement will not smooth the way for other digitization projects. It rolls out a red carpet for Google while leaving everyone else at risk of copyright infringement.

The safe harbor provisions apply only to Google. Anyone else who wants to use one of these books would face the draconian penalties of statutory copyright infringement if it turned out the book was actually still copyrighted. Even with all this effort, one will not be able to say with certainty that a book is in the public domain. To do that would require a legislative change – and not a negotiated settlement.

– Peter Hirtle, LibraryLawBlog: The Google Book Settlement and the Public Domain.

Monopoly is not the only risk. Others include ((Of course there are lots of benefits, too!)) reader privacy, access to culture, and suitability for bulk users and some research uses (metadata, etc.). Too bad Brin isn’t acknowledging them!

Don’t know what all the fuss is with Google Books and the proposed settlement? Wired has a good outline from April.

Posted in books and reading, future of publishing, information ecosystem, intellectual freedom, library and information science | Comments (1)