Jason Priem has a wonderful slidedeck on how to smoothly transition from today’s practices in scientific communication to the future. Here is my reading of the argument given in Jason’s slides:
Communicating science is a central and essential part of doing science, and we have always used the best technology available.
Yet currently, there are several problems with journals, the primary form of scholarly communication.
Journal publication is
Slow
Closed
Hard to innovate
and has
Restrictive format: function follows form
Inconsistent quality control
These problems are fixable, if we realize that journals serve four traditional functions:
Registration
Archiving
Dissemination
Certification
By decoupling these functions into an a la carte publishing menu, we can fix the scholarly communication system. Decoupled scholarly outlets already exist. Jason mentions some outlets (I would say these mainly serve registration functions, and maybe dissemination ones too):
ArXiv
Math Overflow
SSRN
Faculty of 1000 Research
the blag-o-sphere
Jason doesn’t mention them here, but we could add to this list systems for data publishing, e-science workflows, and open notebook science; these may fulfil registration and archiving functions. Also, among existing archiving systems, we could add journal archiving services: LOCKSS is the main player I’m familiar with.
Jason’s argument is well worth reading in full; it’s a well-articulated case for decoupling journal functions, with some detailed descriptions of altmetrics. The core argument is very solid, and of wide interest: Unlike previous articulations for “pre-publication peer review”, this argument will make sense to everyone who believes in big data, I think. There are other formats: video of the talk1 and a draft article called “Decoupling the scholarly journal”2.
by Jason Priem and Bradley M. Hemminger, under review for the Frontiers in Computational Neuroscience special issue “Beyond open access: visions for open evaluation of scientific papers by post-publication peer review” [↩]
One thing I can say about Kindle: error reporting is easier.
You report problems in context, by selecting the offending text. No need to explain where - just what the problem is.
Feedback receipt is confirmed, along with the next steps for how it will be used.
By contrast, to report problems to academic publishers, you often must fill out an elaborate form (e.g. Springer or Elsevier). Digging up contact information often requires going to another page (e.g. ACM). Some make you *both* go to another page to leave feedback and then fill out a form (e.g. EBSCO). Do any academic publishers keep the context of what journal article or book chapter you’re reporting a problem with? (If so, I’ve never noticed!)
I tried out the beta of a new commercial tool, The Altmetric Explorer, from Altmetric.com. They are building on the success and ideas of the academic and non-profit community (but are not formally associated with Altmetrics.org). The Altmetric Explorer gives overviews of articles and journals by their social media mentions. You can filter by publisher, journal, subject, source, etc. The Altmetric Explorer is in closed beta, but you can try the basic functionality on articles with their open tool, the PLoS Impact explorer.
The default view shows the articles mentioned most frequently in all sources, from all journals. Various filters are available.
Rolling over the donut shows which sources (Twitter, blogs, ...) an article was mentioned in.
Sparklines can be used to compare journals.
A 'people' tab lets you look at individual messages. Rolling over the photo or avatar shows the poster's profile.
Altmetric.com seems largely aimed at publishers2. This may add promotional noise, not unlike coercive citation, if it is used as an evaluation metric as they suggest:3
Want to see which journals have improved their profile in social media or with a particular news outlet?
Their API is currently free for non-commercial use. Altmetric.com have been crawling Twitter since July 2011 and are focusing on papers with PubMed, arXiv, and DOI identifiers. They also get data from Facebook, Google+, and blogs, but they don’t disclose how. (I assume that blogs using ResearchBlogging code are crawled, for instance.)
J. Priem, D. Taraborelli, P. Groth, C. Neylon (2010), Altmetrics: A manifesto, (v.1.0), 26 October 2010. http://altmetrics.org/manifesto [↩]
“Altmetric sustains itself by selling more detailed data and analysis tools to publishers, institutions and academic societies.”, says the bookmarklet page, to explain why that is free [↩]
‘This quote from an editor as a condition for publication highlights the problem: “you cite Leukemia [once in 42 references]. Consequently, we kindly ask you to add references of articles published in Leukemia to your present article”’-from the abstract of Science. 2012 Feb 3;335(6068):542-3. Scientific publications. Coercive citation in academic publishing. Wilhite AW, Fong EA. summary on Science Daily. [↩]
EPUB is just HTML + CSS in a tasty ZIP package. Let’s have more of it!
That’s the message of this 3-minute spiel I gave David Weinberger when he interviewed me at LOD-LAM back in June. The resulting video is on YouTube.
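To make the “HTML + CSS in a ZIP package” claim concrete, here is a minimal sketch in Python using only the standard library. The file names under OEBPS/, the title, the placeholder identifier and the page content are all illustrative assumptions. The result is not guaranteed to pass a validator: a real book also needs a navigation document (toc.ncx in EPUB 2), which is left out to keep the sketch short.

```python
import io
import zipfile

# Everything below is an illustrative placeholder, not taken from a real book.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Hello EPUB</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
  </metadata>
  <manifest>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="style.css" media-type="text/css"/>
  </manifest>
  <spine>
    <itemref idref="chapter1"/>
  </spine>
</package>
"""

CHAPTER_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Chapter 1</title>
    <link rel="stylesheet" type="text/css" href="style.css"/>
  </head>
  <body><h1>Hello</h1><p>Just HTML and CSS.</p></body>
</html>
"""

STYLE_CSS = "body { font-family: serif; margin: 1em; }\n"


def build_epub_bytes():
    """Return the bytes of a tiny EPUB-style ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        # The mimetype entry goes first and is stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        # The rest is ordinary XML, XHTML and CSS, compressed as usual.
        for name, text in [
            ("META-INF/container.xml", CONTAINER_XML),
            ("OEBPS/content.opf", CONTENT_OPF),
            ("OEBPS/chapter1.xhtml", CHAPTER_XHTML),
            ("OEBPS/style.css", STYLE_CSS),
        ]:
            zf.writestr(name, text, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()
```

Writing the returned bytes to a file such as hello.epub and unzipping it again shows the point: apart from a little packaging metadata, there is nothing inside but a web page and its stylesheet.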
Sometimes people are important to you not for who they are, but for what they do. Michael S. Hart, the founder of Project Gutenberg, is one such person. While I never met him, Michael’s work has definitely impacted my life: The last book I finished1, like most of my fiction reading over the past 3 years, was a public domain ebook. I love the illustrations.
The first personal computer: KENBAK-1 (1971)
In 1971, the idea of pleasure reading on screens must have been novel. The personal computer had just been invented; a KENBAK-1 would set you back $750 — equivalent to $4200 in 2011 dollars2.
Xerox Sigma V-SDS mainframe
Project Gutenberg’s first text, the U.S. Declaration of Independence, was keyed into a mainframe about one month after Unix was first released34. That mainframe, a Xerox Sigma V, was one of the first 15 computers on the Internet (well, technically, ARPANET)5. Project Gutenberg is an echo of the generosity of some UIUC sysadmins: The first digital library began as a gift back to the world in appreciation of access to that computer.
This is my kind of performance art, from this year’s Printer’s Ball. Got pictures, anybody?
Busted Books: The Great Soaking. Performance by Davis Schneiderman. Attendees are invited to use an artisan-constructed dunk tank to soak either a book or a Kindle, depending upon the dunker’s feelings regarding the printed word and e-readers. With this simple choice, this physical act, readers can finally stop theorizing about the future of the book and do something about it.
How well do you know Wikipedia? Get to know it a little better by looking at how your favorite article changes over time. To inspire you, here are two examples.
Jon Udell’s screencast about ‘Heavy Metal Umlaut’ is a classic, looking back (in 2005) at the first two years of that article. It points out the accumulation of information, vandalism (and its swift reversion), formatting changes, and issues around the verifiability of facts.
In a recent article for the Awl1, Emily Morris sifts through 2,303 edits of ‘Lolita’ to pull out nitpicking revision comments, interesting diffs, and statistics.
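If you would rather sift through revisions with a script than by hand, the MediaWiki Action API exposes an article’s history through action=query with prop=revisions. The sketch below only builds the request URL and tallies edits per user from a response; it makes no network call, and the sample response (including the user names) is made up for illustration.

```python
import json
from collections import Counter
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def revisions_url(title, limit=50):
    """Build a URL asking for the most recent revisions of one article."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp|user|comment",
        "rvlimit": limit,
        "format": "json",
        "formatversion": 2,  # in this format, "pages" is a list
    }
    return API + "?" + urlencode(params)


def edits_per_user(response_json):
    """Count revisions per user in a formatversion=2 response."""
    data = json.loads(response_json)
    counts = Counter()
    for page in data["query"]["pages"]:
        for rev in page.get("revisions", []):
            counts[rev.get("user", "(hidden)")] += 1
    return counts


# Made-up sample response, shaped like the API's JSON output.
SAMPLE_RESPONSE = json.dumps({
    "query": {"pages": [{
        "title": "Heavy metal umlaut",
        "revisions": [
            {"user": "ExampleUserA", "timestamp": "2005-01-01T00:00:00Z", "comment": "copyedit"},
            {"user": "ExampleUserB", "timestamp": "2005-01-02T00:00:00Z", "comment": "revert vandalism"},
            {"user": "ExampleUserA", "timestamp": "2005-01-03T00:00:00Z", "comment": "add reference"},
        ],
    }]}
})

url = revisions_url("Heavy metal umlaut")
counts = edits_per_user(SAMPLE_RESPONSE)
```

Fetching the URL (with urllib.request, say) and passing the response body to edits_per_user gives a first rough picture of who has been shaping an article; the revision comments are where the nitpicking lives.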
The Awl is *woefully* distracting. I urge you not to follow any links. (Thanks a lot Louis!) [↩]
What are the right ontologies for reading? And what kind of ontology support would let books recombine themselves, on the fly, in novel ways?
Today, keyword searches within books and book collections are commonplace, highlighting a word in your ebook reader can bring up a definition, and dictionaries grab recent examples of word use from microblogs.1 But can’t we do more? What kind of synthesis do we need (and what is possible) for supporting readers of literature, classics, and humanities texts?
Current approaches seem to aim at analysis (e.g. getting an overview of the literary works of a period with “distant reading”/“macroanalysis”) and at creating flexible critical editions (e.g. structural, sometimes overlapping markup, as in TEI-based editions and projects like Wendell Piez’ Sonneteer2.) I would call these “sensemaking” approaches rather than tools for reading.
I was intrigued by the Bible Ontology3 because of their tagline: “ever wanted to read and study the Bible Ontologically?” Yet I don’t really know what they mean by reading ontologically4.
Of course, they have recorded various pieces of data. For instance, for Rebekah, we see her children, siblings, birthplace, book and chapters she figures in, etc.: http://bibleontology.com/page/Rebekah.5
They offer a SPARQL endpoint, so you can query. For instance, to find all the married women6 (live query result):
PREFIX bop: <http://bibleontology.com/property/>
select ?s ?o where {?s bop:isWifeOf ?o }
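To run that query from a script rather than a web form, something like the following Python sketch might do. It assumes the endpoint follows the standard SPARQL protocol (the query goes in a `query` parameter) and can return results in the standard SPARQL JSON format. The endpoint URL is a guess, and the sample result is made up for illustration, not actual output from the site.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint URL: check the site for the real one.
ENDPOINT = "http://bibleontology.com/sparql"

QUERY = """PREFIX bop: <http://bibleontology.com/property/>
SELECT ?s ?o WHERE { ?s bop:isWifeOf ?o }"""


def build_query_url(endpoint, query):
    """Build a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


def wife_husband_pairs(results_json):
    """Extract (wife, husband) URI pairs from SPARQL JSON results."""
    data = json.loads(results_json)
    return [
        (binding["s"]["value"], binding["o"]["value"])
        for binding in data["results"]["bindings"]
    ]


# Made-up sample in the SPARQL JSON results format; the resource URIs
# are illustrative, not copied from the endpoint.
SAMPLE_RESULTS = json.dumps({
    "head": {"vars": ["s", "o"]},
    "results": {"bindings": [{
        "s": {"type": "uri", "value": "http://bibleontology.com/resource/Rebekah"},
        "o": {"type": "uri", "value": "http://bibleontology.com/resource/Isaac"},
    }]},
})

url = build_query_url(ENDPOINT, QUERY)
pairs = wife_husband_pairs(SAMPLE_RESULTS)
```

When fetching the URL, asking for JSON usually means sending an Accept header of application/sparql-results+json, though some endpoints expect a format parameter instead.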
Intense and long-term work has gone into Bible concordances, scholarship, etc., so it seems like a great use case for “reading ontologically”. With theologians and others looking at the site, using the SPARQL endpoint, etc., perhaps someone will be able to tell me what that means!
In 2003, Gregory Crane wrote that “Already the books in a digital library are beginning to read one another and to confer among themselves before creating a new synthetic document for review by their human readers.” When I first read it in 2006, that article seemed incredibly visionary to me. Yet these commonplace “syntheses” no longer seem extraordinary to me. [↩]
The most meaningful of their terms is the bop:isRelatedInEvent, perhaps since these events, like Isaac_blesses_Jacob, would require more analysis to discern. [↩]
Gender is not recorded so we can’t (yet) ask for all the women overall, though I’ve just asked about this. [↩]
The long-term freedom of the Internet may depend, in part, on convincing the big players of the content industry to modernize their business models.
Motivated by “protecting” the content industry, the U.S. Congress is discussing proposed legislation that could be used to seize domain names and force websites (even search engines) to remove links.
The only way to convince Washington to drop this issue for good is to show that artists and musicians can get paid on the Internet.
Currently they are not seeing any evidence of this. The Congresswoman believes that new technology needs to be developed to let artists get paid. I believe she is entirely wrong about this; see below.
The arguments that have been raised by tech companies and civil liberties groups in Washington all center around free speech; there is nothing wrong with that but it is not a viable strategy in the long run because the issue is going to keep coming back.
Arvind’s response is that the technology needed is already here. That’s old news to technologists, but the technology sector needs to educate Congress, who may not have the time and skills to get this information by themselves.
Holding on to old business models is not the way to endear yourself to customers.
But unfortunately this is also, simultaneously, a bad time to be a reader. Because the dinosaurs still don’t get it. Ten years of object lessons from the music industry, and they still don’t get it. We have learned, painfully, that media consumers—be they listeners, watchers, or readers—want one of two things:
DRM-free works for a reasonable price
or, unlimited single-payment subscription to streaming/DRMed works
Give them either of those things, and they’ll happily pay. Look at iTunes. Look at Netflix. But give them neither, and they’ll pirate. So what are publishers doing?
Refusing to sell DRM-free books. My debut novel will be re-e-published by the Friday Project imprint of HarperCollins UK later this year; both its editor and I would like it to be published without DRM; and yet I doubt we will be able to make that happen.
Eric Hellman is one of the pioneers of tomorrow’s ebook business models: his company, Gluejar, uses a crowdfunding model to re-release books under Creative Commons licenses. Authors and publishers are paid; fans pay for the books they’re most interested in; and everyone can read and distribute the resulting “unglued” ebooks. Everybody wins.