Support EPUB!

November 7th, 2011
by jodi

EPUB is just HTML + CSS in a tasty ZIP package. Let’s have more of it!

That’s the message of this 3-minute spiel I gave David Weinberger when he interviewed me at LOD-LAM back in June. The resulting video is on YouTube and below.
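To make the "HTML + CSS in a ZIP" point concrete, here is a minimal Python sketch. It builds a toy archive in memory and sorts its entries by type using only the standard `zipfile` module. The file names and contents are made up for illustration, and the toy archive is not a valid EPUB: a real one also carries a `META-INF/container.xml` and a package document.

```python
import io
import zipfile


def build_toy_archive() -> bytes:
    """Build a tiny EPUB-like ZIP in memory (hypothetical contents, not a valid EPUB)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # EPUB expects the 'mimetype' entry to come first, stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        # The book content itself is ordinary (X)HTML and CSS.
        zf.writestr("OEBPS/chapter1.xhtml", "<html><body><p>Hello</p></body></html>")
        zf.writestr("OEBPS/style.css", "p { margin: 0; }")
    return buf.getvalue()


def group_entries(data: bytes) -> dict:
    """Group the archive's entry names into html, css, and other."""
    groups = {"html": [], "css": [], "other": []}
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for name in zf.namelist():
            lower = name.lower()
            if lower.endswith((".xhtml", ".html", ".htm")):
                groups["html"].append(name)
            elif lower.endswith(".css"):
                groups["css"].append(name)
            else:
                groups["other"].append(name)
    return groups


if __name__ == "__main__":
    print(group_entries(build_toy_archive()))
```

Pointing `group_entries` at the bytes of a real `.epub` file should work the same way, since any ZIP reader can open one.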

Posted in books and reading, future of publishing, information ecosystem, library and information science | Comments (0)

Ways to use the crowd

November 6th, 2011
by jodi

Lora Aroyo gave a talk at DERI on Friday, based on her “Crowdsourcing community science” slide deck. She was in town for Smita’s viva. The deck should interest anybody in digital cultural heritage.

The slide on “Ways to use the crowd” seemed particularly useful to me:

  • tagging & classification
  • editing & transcribing
  • contextualising
  • acquisition
  • co-curation
  • crowdfunding

Posted in information ecosystem, PhD diary, social web | Comments (0)

Web of data for books?

November 5th, 2011
by jodi

If you were building a user interface for the Web of data, for books, it just might look like Small Demons.

Unfortunately, you can’t see much without logging in, so go get yourself a beta account. (I’ve already complained about asking for a birthday. My new one is 29 Feb 1904; you can help me celebrate in 2012!)

Their data on Ireland is pretty sketchy so far. They do offer to help you buy Guinness on Amazon though. :)

Posted in books and reading, library and information science, semantic web, social semantic web | Comments (0)

Frank van Harmelen’s laws of information

November 1st, 2011
by jodi

What are the laws of information? Frank van Harmelen proposes seven laws of information science in his keynote to the Semantic Web community at ISWC2011. ((He presents them as “computer science laws” underlying the Semantic Web; yet they are laws about knowledge. This makes them candidate laws of information science, in my terminology.))

  1. Factual knowledge is a graph. ((“The vast majority of our factual knowledge consists of simple relationships between things, represented as a ground instance of a binary predicate. And lots of these relations between things together form a giant graph.”))
  2. Terminological knowledge is a hierarchy.
  3. Terminological knowledge is much smaller ((by 1-2 orders of magnitude)) than the factual knowledge.
  4. Terminological knowledge is of low complexity. ((This is seen in “the unreasonable effectiveness of low-expressive KR”: “the information universe is apparently structured in such a way that the double exponential worst case complexity bounds don’t hit us in practice.”))
  5. Heterogeneity is unavoidable. ((But heterogeneity is solvable through mostly social, cultural, and economic means (algorithms contribute a little bit). ))
  6. Publication should be distributed; computation should be centralized for speed: “The Web is not a database, and I don’t think it ever will be.”
  7. Knowledge is layered.
What do you think? If they are laws, can they be proven/disproven?
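To make the first law a bit more tangible, here is a small Python sketch of what "facts as ground instances of binary predicates" looks like once you treat it as a graph. The facts, predicate names, and function names are my own illustration, not anything from the keynote.

```python
from collections import defaultdict

# Illustrative facts: each is a ground instance of a binary predicate,
# written as (subject, predicate, object).
FACTS = [
    ("Guinness", "brewed_in", "Dublin"),
    ("Dublin", "capital_of", "Ireland"),
    ("Ireland", "member_of", "European_Union"),
]


def build_graph(facts):
    """Turn triples into a directed, labeled graph: subject -> [(predicate, object)]."""
    graph = defaultdict(list)
    for subject, predicate, obj in facts:
        graph[subject].append((predicate, obj))
    return dict(graph)


def reachable(graph, start):
    """Return every node reachable from `start` by following edges (depth-first)."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for _predicate, obj in graph.get(node, []):
            if obj not in seen:
                seen.add(obj)
                stack.append(obj)
    return seen


if __name__ == "__main__":
    graph = build_graph(FACTS)
    print(sorted(reachable(graph, "Guinness")))
```

Three simple relationships already chain together: starting from one node you can walk to everything connected to it, which is the "giant graph" idea in miniature.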

Semantic Web vocabularies in the Tower of Babel

I wish every presentation came with this sort of summary: slides and transcript, presented in a linear fashion. But these laws deserve more attention and discussion–especially from information scientists. So I needed something even punchier to share (prioritized thanks to Karen).

Posted in computer science, information ecosystem, library and information science, PhD diary, semantic web | Comments (0)

OH: Informal argumentation

October 31st, 2011
by jodi

Yesterday I overheard two guys talking in the grocery store:

I am more of a John Lennon than you are.

The response?

My hair has more volume, therefore I am.

A brief, informal argument. Halloween-themed, I presume.

Posted in argumentative discussions, PhD diary | Comments (1)

Quantified Self Europe, two talks proposed

October 12th, 2011
by jodi

Thanksgiving weekend doesn’t really register in Europe. But this year it will for me: I’m going to Amsterdam for Quantified Self Europe, since I’m lucky enough to have a scholarship covering conference fees.

Today I proposed two talks:

  1. Weight and exercise tracking (which I’ve been doing in various forms for 19 months, currently using a Philips DirectLife exercise monitor and a normal scale, with the data collected using the Hacker’s Diet). Mainly, these are less integrated than they could be, and I’d like to advocate interoperability, APIs, and uniform formats, while hopefully getting some ideas from the audience about quick hacks to improve my current system.
  2. Lifetracking, privacy & the surveillance society. This brings together two themes: First, how individuals’ lifetracking can be seen as a re-enactment of privacy, with changed ideas of what that means (e.g. panopticon, sousveillance, etc.). Second, the increased awareness about the wealth of personal data held by corporations (e.g. German politician Malte Spitz sued to get 6 months of his telecom data). The boundary between public life and private life is continually shifting as communication technology and social norms evolve; this talk investigates how lifetracking and the quantified self movement push the privacy/publicity boundaries in multiple ways. QS increases the public audience for data-driven stories of private lives while also highlighting the need for individuals to control access to and the disposition of their own personal data.

Ironically, self-surveillance was an academic interest of mine before it became a personal one: Back in 2009, Nathan Yau and I wrote a paper for the ASIST Bulletin about self-surveillance (PDF) [less pretty in HTML]. It helped interest me in the Semantic Web, too: putting data in standard formats would make it easier to make data-driven visualizations, so lifetracking and the quantified self movement are a great use case for the (social) Semantic Web. QS also shows how privacy cuts both ways and could provide an early-adopter audience for the kind of fine-grained privacy tools a colleague is developing.

(A first reply to Nic’s encouragement)

Posted in semantic web, social semantic web | Comments (0)

The Legacy of Michael S. Hart

September 16th, 2011
by jodi

ship sinking into a whirlpool near the Lone Tower

Sometimes people are important to you not for who they are, but for what they do. Michael S. Hart, the founder of Project Gutenberg, is one such person. While I never met him, Michael’s work has definitely impacted my life: The last book I finished ((The Book of Dragons, by Edith Nesbit: highly recommended, especially if you like silly explanations or fairy tales with morals.)), like most of my fiction reading over the past 3 years, was a public domain ebook. I love the illustrations.

The first personal computer: KENBAK-1 (1971)

In 1971, the idea of pleasure reading on screens must have been novel. The personal computer had just been invented; a KENBAK-1 would set you back $750 — equivalent to $4200 in 2011 dollars ((CPI Inflation Calculator)).

Xerox Sigma V-SDS mainframe

Project Gutenberg’s first text, the U.S. Declaration of Independence, was keyed into a mainframe about one month after Unix was first released ((Computer history timeline 1960-1980)) ((Project Gutenberg Digital Library Seeks To Spur Literacy: Library hopes to offer 1 million electronic books in 100 languages, 2007-07-20, Jeffrey Thomas)). That mainframe, a Xerox Sigma V, was one of the first 15 computers on the Internet (well, technically, ARPANET) ((Amazingly, this predated NCSA. You can see the building–Thomas Siebel–hosting the node thanks to a UIUC Communication Technology and Society class assignment)). Project Gutenberg is an echo of the generosity of some UIUC sysadmins: The first digital library began as a gift back to the world in appreciation of access to that computer.

Thanks, Michael.

Originally via @muttinmall

Posted in books and reading, future of publishing, information ecosystem, library and information science | Comments (0)

They really know how to throw a party in Chicago…

September 14th, 2011
by jodi

This is my kind of performance art, from this year’s Printer’s Ball. Got pictures, anybody?

Busted Books: The Great Soaking. Performance by Davis Schneiderman. Attendees are invited to use an artisan-constructed dunk tank to soak either a book or a Kindle, depending upon the dunker’s feelings regarding the printed word and e-readers. With this simple choice, this physical act, readers can finally stop theorizing about the future of the book and do something about it.

Posted in books and reading, future of publishing, random thoughts | Comments (0)

Reading Group talk: Using Controlled Natural Language and First Order Logic to improve e-consultation discussion forums

September 7th, 2011
by jodi

Today the DERI Reading Group starts up again for the fall. I’m talking about three papers from the IMPACT project.

This post started as a set of links for my colleagues; scroll down for slides and video.

  1. Adam Wyner and Tom van Engers. A Framework for Enriched, Controlled On-line Discussion Forums for e-Government Policy-making. EGOVIS 2010. AcaWiki Summary
  2. Adam Wyner, Tom van Engers, and Kiavash Bahreini. From Policy-making Statements to First-order Logic. Electronic Government and Electronic Participation 2010. AcaWiki Summary
  3. Adam Wyner and Tom van Engers. Towards Web-based Mass Argumentation in Natural Language. (long version of this EKAW 2010 poster). AcaWiki Summary

Reading Group talk: Using Controlled Natural Language and First Order Logic to improve e-consultation discussion forums from Jodi Schneider on Vimeo.



Posted in argumentative discussions, PhD diary, social semantic web | Comments (1)

Understanding Wikipedia through the evolution of a single page

August 26th, 2011
by jodi

“The only constant is change.” – Heraclitus

How well do you know Wikipedia? Get to know it a little better by looking at how your favorite article changes over time. To inspire you, here are two examples.

Jon Udell’s screencast about ‘Heavy Metal Umlaut’ is a classic, looking back (in 2005) at the first two years of that article. It points out the accumulation of information, vandalism (and its swift reversion), formatting changes, and issues around the verifiability of facts.

In a recent article for the Awl ((The Awl is *woefully* distracting. I urge you not to follow any links. (Thanks a lot Louis!) )), Emily Morris sifts through 2,303 edits of ‘Lolita’ to pull out nitpicking revision comments, interesting diffs, and statistics.

Posted in books and reading, future of publishing, information ecosystem, library and information science, social web | Comments (0)