Archive for the ‘information ecosystem’ Category

Karen Coyle on Library Linked Data: let’s create data not records

January 12th, 2012

There have been some interesting posts on BIBFRAME recently (I’ve noted a few of them).

Karen Coyle also pointed to her recent blog post on transforming bibliographic data into RDF. As she says, for a real library linked data environment,

we need to be creating data, not records, and that we need to create the data first, then build records with it for those applications where records are needed.

Posted in information ecosystem, library and information science, semantic web | Comments (1)

Quantified Self Europe, Saturday morning.

November 26th, 2011

What is this Quantified Self stuff, anyway? Here’s a brief intro (prettier PDF version) Nathan Yau and I wrote.

This weekend I’m in Amsterdam for Quantified Self Europe. So far this morning I’ve met Arduino hackers, seen several talks about monitoring heart rate (continuously, cool, or even with swimming goggles) and lung capacity. Oh, and given a talk about Exercise and Weight tracking.

There’s lots of blogging/photoblogging going on. Twitter hashtag (formerly #QSelfEurope) is #QS2011.

Posted in information ecosystem, random thoughts, social web | Comments (0)

Time-based comments

November 14th, 2011

I’ve been digging SoundCloud lately.

Today I noticed time-based comments in their tracks. It’s a bit disorienting to have comments pop up as you’re listening. Maybe after adjusting, there’s a pleasant sense of having a conversation going on around you. Definitely feels like you’ve got company!

Comments pop up as the track plays

Avatars appear below the track to indicate that there are comments, and you can scroll over avatars to read comments. You can also hide the comments if you prefer.

Entering a comment from the timeline


Comments are indicated by avatar icons in the full view.

Avatar icons appear in the overview

Example track due to Duncan.

Posted in argumentative discussions, information ecosystem, PhD diary, social web | Comments (0)

YouTube “I dislike this” button

November 14th, 2011

A few weeks ago, I noticed something new on YouTube: an “I dislike this” button.

I wonder how long that’s been there?

When I talk about online argumentation, a frequent comment is “too bad there’s only +1 and Like; we need more expressivity”.

See related discussions:

Posted in argumentative discussions, information ecosystem, PhD diary, social web | Comments (1)

Support EPUB!

November 7th, 2011

EPUB is just HTML + CSS in a tasty ZIP package. Let’s have more of it!

That’s the message of this 3-minute spiel I gave David Weinberger when he interviewed me at LOD-LAM back in June. The resulting video is on YouTube and below.
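To make the "HTML + CSS in a ZIP" point concrete, here is a rough Python sketch that packs one XHTML chapter and one stylesheet into a ZIP archive and lists what is inside. The file names, the chapter text, and the helper names `build_epub_like_zip` and `list_parts` are made up for illustration. The result is not a complete, valid EPUB: a real one also needs `META-INF/container.xml` and a package document, which are left out here.

```python
import io
import zipfile

# Hypothetical contents, chosen only for illustration.
PARTS = {
    "OEBPS/chapter1.xhtml": (
        '<html xmlns="http://www.w3.org/1999/xhtml">'
        '<head><link rel="stylesheet" href="style.css"/></head>'
        "<body><h1>Chapter 1</h1><p>Hello, reader.</p></body></html>"
    ),
    "OEBPS/style.css": "h1 { font-family: serif; }",
}


def build_epub_like_zip():
    """Return the bytes of a ZIP archive laid out like the start of an EPUB."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # EPUB expects the mimetype entry first, stored without compression.
        archive.writestr(
            "mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED
        )
        # The content itself is ordinary XHTML and CSS.
        for name, text in PARTS.items():
            archive.writestr(name, text, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def list_parts(data):
    """List the entry names inside the archive, in the order they were written."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


for entry in list_parts(build_epub_like_zip()):
    print(entry)
```

Running `list_parts` against a real `.epub` file's bytes should show the same kind of listing: a `mimetype` entry followed by markup, stylesheets, and metadata files.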

Posted in books and reading, future of publishing, information ecosystem, library and information science | Comments (0)

Ways to use the crowd

November 6th, 2011

Lora Aroyo gave a talk at DERI on Friday, based on her “Crowdsourcing community science” slide deck. She was in town for Smita’s viva. This is a deck of interest to anybody in digital cultural heritage.

The slide on “Ways to use the crowd” seemed particularly useful to me:

  • tagging & classification
  • editing & transcribing
  • contextualising
  • acquisition
  • co-curation
  • crowdfunding

Posted in information ecosystem, PhD diary, social web | Comments (0)

Frank van Harmelen’s laws of information

November 1st, 2011

What are the laws of information? Frank van Harmelen proposes seven laws of information science in his keynote to the Semantic Web community at ISWC2011 [1].

  1. Factual knowledge is a graph [2].
  2. Terminological knowledge is a hierarchy.
  3. Terminological knowledge is much smaller [3] than the factual knowledge.
  4. Terminological knowledge is of low complexity [4].
  5. Heterogeneity is unavoidable [5].
  6. Publication should be distributed, computation should be centralized for speed: “The Web is not a database, and I don’t think it ever will be.”
  7. Knowledge is layered.

What do you think? If they are laws, can they be proven/disproven?
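To get a feel for the first law, here is a small Python sketch of how ground instances of binary predicates add up to a graph. The triples and the helper names `build_graph` and `reachable` are my own illustrative assumptions, not anything from the keynote.

```python
from collections import defaultdict

# Hypothetical facts, each a ground instance of a binary predicate:
# (subject, predicate, object).
TRIPLES = [
    ("FrankVanHarmelen", "gaveKeynoteAt", "ISWC2011"),
    ("ISWC2011", "addressedTo", "SemanticWebCommunity"),
    ("FrankVanHarmelen", "proposed", "LawsOfInformation"),
]


def build_graph(triples):
    """Index triples by subject, so each fact becomes a labelled edge."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def reachable(graph, start):
    """Collect every node that can be reached from start by following edges."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for _predicate, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


graph = build_graph(TRIPLES)
print(sorted(reachable(graph, "FrankVanHarmelen")))
```

Each triple on its own is a single simple relationship; it is only when subjects and objects are shared between triples that the "giant graph" appears.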

Semantic Web vocabularies in the Tower of Babel

I wish every presentation came with this sort of summary: slides and transcript, presented in a linear fashion. But these laws deserve more attention and discussion–especially from information scientists. So I needed something even punchier to share (prioritized thanks to Karen).

  1. He presents them as “computer science laws” underlying the Semantic Web; yet they are laws about knowledge. This makes them candidate laws of information science, in my terminology.
  2. “The vast majority of our factual knowledge consists of simple relationships between things, represented as an ground instance of a binary predicate. And lots of these relations between things together form a giant graph.”
  3. by 1-2 orders of magnitude
  4. This is seen in “the unreasonable effectiveness of low-expressive KR”: “the information universe is apparently structured in such a way that the double exponential worst case complexity bounds don’t hit us in practice.”
  5. But heterogeneity is solvable through mostly social, cultural, and economic means (algorithms contribute a little bit).

Posted in computer science, information ecosystem, library and information science, PhD diary, semantic web | Comments (0)

The Legacy of Michael S. Hart

September 16th, 2011

ship sinking into a whirlpool near the Lone Tower

Sometimes people are important to you not for who they are, but for what they do. Michael S. Hart, the founder of Project Gutenberg, is one such person. While I never met him, Michael’s work has definitely impacted my life: the last book I finished [1], like most of my fiction reading over the past 3 years, was a public domain ebook. I love the illustrations.

KENBAK-1 from 1971

The first personal computer: KENBAK-1 (1971)

In 1971, the idea of pleasure reading on screens must have been novel. The personal computer had just been invented; a KENBAK-1 would set you back $750, equivalent to $4200 in 2011 dollars [2].

Xerox Sigma V-SDS mainframe

Project Gutenberg’s first text, the U.S. Declaration of Independence, was keyed into a mainframe about one month after Unix was first released [3][4]. That mainframe, a Xerox Sigma V, was one of the first 15 computers on the Internet (well, technically, ARPANET) [5]. Project Gutenberg is an echo of the generosity of some UIUC sysadmins: the first digital library began as a gift back to the world in appreciation of access to that computer.

Thanks, Michael.

Originally via @muttinmall

  1. The Book of Dragons, by Edith Nesbit: highly recommended, especially if you like silly explanations or fairy tales with morals.
  2. CPI Inflation Calculator
  3. Computer history timeline 1960-1980
  4. Project Gutenberg Digital Library Seeks To Spur Literacy: Library hopes to offer 1 million electronic books in 100 languages, 2007-07-20, Jeffrey Thomas
  5. Amazingly, this predated NCSA. You can see the building–Thomas Siebel–hosting the node thanks to a UIUC Communication Technology and Society class assignment

Posted in books and reading, future of publishing, information ecosystem, library and information science | Comments (0)

Understanding Wikipedia through the evolution of a single page

August 26th, 2011

“The only constant is change.” – Heraclitus

How well do you know Wikipedia? Get to know it a little better by looking at how your favorite article changes over time. To inspire you, here are two examples.

Jon Udell’s screencast about ‘Heavy Metal Umlaut’ is a classic, looking back (in 2005) at the first two years of that article. It points out the accumulation of information, vandalism (and its swift reversion), formatting changes, and issues around the verifiability of facts.

In a recent article for the Awl [1], Emily Morris sifts through 2,303 edits of ‘Lolita’ to pull out nitpicking revision comments, interesting diffs, and statistics.

  1. The Awl is *woefully* distracting. I urge you not to follow any links. (Thanks a lot Louis!)

Posted in books and reading, future of publishing, information ecosystem, library and information science, social web | Comments (0)

Citation management means different things to different people

August 3rd, 2011

I got to talking with a mathematician friend about citation management. We came to the conclusion that “manage PDFs” is my primary goal while “get out good citations” is his primary goal. I thought it would be interesting to look at his requirements.

His ideal program would

  1. Organize the PDFs (Papers does this, when it doesn’t botch the author names and the title) preferably in the file system, so I can use Dropbox
  2. Get BibTeX entries from MathSciNet, ACM, etc. EXACTLY AS THEY ARE
  3. Have some decent way to organize notes by “project” or something

He doesn’t care about:

  1. Typing \cite
  2. A “unified” bibliographic database
  3. Social bibliographies (though I am not against them; it is just not a burning issue)

He says:

I guess the point is that, if I am writing something and I know I want to cite it, and I know there is an “official” BibTeX for it, I just need a way to get that more quickly than:

  1. Type the URL
  2. Click on “Proxy this” in my bookmarks bar
  3. Search for the paper
  4. Copy/paste the BibTeX
  5. Edit the cite key to something mnemonic

He followed up with an example of the “awful”, lossy markup Papers produces, which loses information including the ISSN and DOI; he prefers the minimalist BibTeX. (Oops! He adds: “I understated how bad papers is. The real papers entry (top) not only has screwy names, but junk instead of the full journal name. The papers cite key is meaningless noise too (but mathscinet is meaningful noise).”) To get around this, he does the same search/download “a million times”.

Papers2 BibTeX:
@article{AR78,
author = {L Asimow and B Roth},
journal = {Trans. Amer. Math. Soc.},
title = {The rigidity of graphs},
pages = {279--289},
volume = {245},
year = {1978},
}

The AMS version of the same BibTeX:
@article {AR78,
    AUTHOR = {Asimow, L. and Roth, B.},
     TITLE = {The rigidity of graphs},
   JOURNAL = {Trans. Amer. Math. Soc.},
  FJOURNAL = {Transactions of the American Mathematical Society},
    VOLUME = {245},
      YEAR = {1978},
     PAGES = {279--289},
      ISSN = {0002-9947},
     CODEN = {TAMTAM},
   MRCLASS = {57M15 (05C10 52A40 53B50 73K05)},
  MRNUMBER = {511410 (80i:57004a)},
MRREVIEWER = {G. Laman},
       DOI = {10.2307/1998867},
       URL = {http://dx.doi.org/10.2307/1998867},
}

I’ve just discovered that BibDesk’s [1] ‘minimize’ does what he wants: its output is quite close to the Papers2 version:

@article{AR78,
	Author = {Asimow, L. and Roth, B.},
	Journal = {Trans. Amer. Math. Soc.},
	Pages = {279--289},
	Title = {The rigidity of graphs},
	Volume = {245},
	Year = {1978}}
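
For the curious, here is a rough Python sketch of what a ‘minimize’ step could look like. It is not BibDesk’s implementation: the function name `minimize_bibtex`, the `CORE_FIELDS` set, and the regular expressions are my own assumptions, and the parsing only copes with brace-delimited values that contain no nested braces.

```python
import re

# Fields kept by this sketch; the set is an assumption, not BibDesk's actual list.
CORE_FIELDS = ("author", "journal", "pages", "title", "volume", "year")

# Matches "@type{key," and "NAME = {value}" (values without nested braces only).
HEADER_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
FIELD_RE = re.compile(r"(\w+)\s*=\s*\{([^{}]*)\}")

AMS_ENTRY = """@article {AR78,
    AUTHOR = {Asimow, L. and Roth, B.},
     TITLE = {The rigidity of graphs},
   JOURNAL = {Trans. Amer. Math. Soc.},
  FJOURNAL = {Transactions of the American Mathematical Society},
    VOLUME = {245},
      YEAR = {1978},
     PAGES = {279--289},
      ISSN = {0002-9947},
       DOI = {10.2307/1998867},
}"""


def minimize_bibtex(entry, keep=CORE_FIELDS):
    """Return the entry with only the fields named in keep, sorted by name."""
    header = HEADER_RE.search(entry)
    if header is None:
        raise ValueError("not a BibTeX entry")
    entry_type = header.group(1).lower()
    cite_key = header.group(2)
    # Field names are case-insensitive in BibTeX, so normalise to lowercase.
    body = entry[header.end():]
    fields = {name.lower(): value.strip() for name, value in FIELD_RE.findall(body)}
    lines = [
        f"\t{name.capitalize()} = {{{fields[name]}}}"
        for name in sorted(keep)
        if name in fields
    ]
    return f"@{entry_type}{{{cite_key},\n" + ",\n".join(lines) + "}"


print(minimize_bibtex(AMS_ENTRY))
```

Fed the (shortened) AMS entry, the sketch drops FJOURNAL, ISSN, and DOI and keeps the six core fields, which is exactly the information loss my friend objects to.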

I’d still like to understand the impact the non-minimal BibTeX is having; it could be that bad citation styles are causing part of the problem.

While we have different needs for citation management, we’re both annoyed by the default filenames many publishers use – like fulltext.pdf and sdarticle.pdf. But I’ll tolerate these, as long as I can get to the PDF from a database index with a nice frontend.

We of course moved on to discussing how research needs an iTunes or, as Geoff Bilder has called it, an iPapers.

This blog post brought to you by Google chat and the number 3.

  1. See also A short review of BibDesk from MacResearch

Posted in books and reading, information ecosystem, library and information science, scholarly communication | Comments (0)