Archive for October, 2009


October 28th, 2009

I have always admired what Knuth says about email:

Email is a wonderful thing for people whose role in life is to be on top of things. But not for me; my role is to be on the bottom of things. What I do takes long hours of studying and uninterruptible concentration. I try to learn certain areas of computer science exhaustively; then I try to digest that knowledge into a form that is accessible to people who don’t have time for such study.

(emphasis added) – [Knuth versus Email]

I’m not giving up email (heaven forbid!). And none of us are Knuth!

But I am weeding: culling away most listservs and feeds that aren’t core to semantic web or social web, or to my personal life (or joy). I’ll be keeping an eye on twitter, of course.

cc licensed flickr photo “weedy mulch on the start of compost pile” by Liz Henry

This (hopefully) frees up more attention for figuring out what’s going on in the semantic web and social web community, and for my literature review. But it means I’m going to have to accept lagging behind a little in everything else.

I have a love for being-in-the-know and finding interesting links. That makes it hard to stay on the bottom of things! For now, this energy and mindset get directed to my literature review (which is very much like finding the coolest new thing, only that thing may be from 1985, and nearly wholly forgotten, and not cool or new to anyone except oneself).

The good news is that when studying the social web, total disconnection generally isn’t desirable!

Posted in books and reading, PhD diary | Comments (0)

When an abstract is not a summary: check the audience

October 27th, 2009

I’ve been arguing with Jim Pitman about how abstracts are different from summaries. The audience, I think, determines whether a text is suitable to be used as a summary.

This seems like a good example:

Lumley, J., Gimson, R., & Rees, O. (2007). Endless documents: a publication as a continual function. In Proceedings of the 2007 ACM symposium on Document engineering (pp. 174-176). Winnipeg, Manitoba, Canada: ACM. doi: 10.1145/1284420.1284463

Variable data can be considered as functions of their bindings to values. The Document Description Framework (DDF) treats documents in this manner, using XSLT semantics to describe document functionality and a variety of related mechanisms to support layout, reference and so forth. But the result of evaluation of a function could itself be a function: can variable data documents behave likewise? We show that documents can be treated as simple continuations within that framework with minor modifications. We demonstrate this on a perpetual diary.

This is a really interesting article from a team at HP Bristol (UK). They seem to be talking about the benefit of publishing as you go along (e.g. blogs or medical records). They call these “continual documents”.

I picked it up [1] because the abstract seemed bizarre, but the topic seemed interesting. “Continual documents” struck me as “continual functions”. And the mention of XSLT hinted at transforming a document using its underlying structure.
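One way to read the abstract's central idea (evaluating a document yields both a page and another document-function waiting for the next value) is the minimal Python sketch below. It is an illustration only: the paper works in XSLT within DDF, and the names `make_diary` and `bind` are hypothetical, not taken from the article.

```python
from typing import Callable, List, Optional, Tuple

# A "document" here is a rendered page plus a function that accepts the next
# binding and returns the next document. This is an assumed, simplified
# reading of the abstract, not the DDF mechanism itself.
Document = Tuple[str, Callable[[str], "Document"]]


def make_diary(entries: Optional[List[str]] = None) -> Document:
    """Return (rendered page, continuation) for a diary with the given entries."""
    current = list(entries) if entries else []
    rendered = "\n".join(
        f"Day {i}: {text}" for i, text in enumerate(current, start=1)
    )

    def bind(next_entry: str) -> Document:
        # Binding a new value does not mutate the old document;
        # it produces a new document that can itself be continued.
        return make_diary(current + [next_entry])

    return rendered, bind


page, diary = make_diary()
page, diary = diary("first entry")
page, diary = diary("second entry")
print(page)
```

Each call returns a finished page and a function that is still "open", which is roughly what a perpetual diary needs: there is always a publishable document, and always a way to extend it.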

Surely, I thought, this abstract couldn’t describe its contents. After glancing through it, I’m not sure: this abstract may well summarize the contents of the article. But for me, the abstract really didn’t serve as a summary: I don’t know the field, so the terminology (e.g. document engineering, Document Description Framework [2]) didn’t clue me in.

This difference gets at what AcaWiki is trying to do: provide a place for people to discuss/summarize research articles, in the way that Wikipedia is a place to discuss/summarize topics. Neither is a place for research but both are places for experts to share knowledge, for would-be-experts to describe what they know, and for non-experts to glean a deeper sense of the world than they might have had otherwise.

  1. I came across a conference on ‘document engineering’ [ACM digital library, may have a paywall] while sifting through articles for my literature review. ‘Document engineering’ includes lots of stuff that’s out of scope. Some material, on structural markup, may be relevant to online argumentation.
  2. One interesting line stands out: “In DDF documents most program elements are <xslt:template/> trees.”

Posted in argumentative discussions, PhD diary, random thoughts | Comments (1)

New Beginnings

October 22nd, 2009

This week I’m beginning my Ph.D. in Galway, Ireland at DERI.

Things move very quickly here. Unlike a U.S. Ph.D. student, I start with a supervisor (Alexandre Passant), an academic mentor (John Breslin), and a ‘professor in discipline’ (Stefan Decker; I’m not quite sure yet what that role entails). Before arriving, I also put in a thesis proposal, which Alex drafted and I merely tweaked:

My thesis will investigate the use of Semantic Web technologies to represent argumentative discussions in online communities (how people have discussions on blogs, wikis, etc. and how they agree, disagree, etc.) and to make these discussions machine-readable and interoperable.

Posted in argumentative discussions, PhD diary, social semantic web | Comments (4)

Gone to Galway

October 21st, 2009

Sure, you expect photos of lush green landscapes (and you might get some out of me eventually).

But first, a more practical sign that I’d arrived:

Google Calendar asks: "Change time zone to Dublin?"

Posted in random thoughts | Comments (0)

Google Books settlement: a monopoly waiting to happen

October 10th, 2009

Will Google Books create a monopoly? Some [1] people think [2] so. Brin claims it won’t:

If Google Books is successful, others will follow. And they will have an easier path: this agreement creates a books rights registry that will encourage rights holders to come forward and will provide a convenient way for other projects to obtain permissions.

– Sergey Brin, New York Times, A Library To Last Forever

Brin is wrong: the proposed Google Books settlement will not smooth the way for other digitization projects. It creates a red carpet for Google while leaving everyone else at risk of copyright infringement.

The safe harbor provisions apply only to Google. Anyone else who wants to use one of these books would face the draconian penalties of statutory copyright infringement if it turned out the book was actually still copyrighted. Even with all this effort, one will not be able to say with certainty that a book is in the public domain. To do that would require a legislative change – and not a negotiated settlement.

– Peter Hirtle, LibraryLawBlog: The Google Book Settlement and the Public Domain.

Monopoly is not the only risk. Others include [3] reader privacy, access to culture, and suitability for bulk users and some research users (metadata, etc.). Too bad Brin isn’t acknowledging that!

Don’t know what all the fuss is with Google Books and the proposed settlement? Wired has a good outline from April.

  1. “Several European nations, including France and Germany, have expressed concern that the proposed settlement gives Google a monopoly in content. Since the settlement was the result of a class action against Google, it applies only to Google. Other companies would not be free to digitise books under the same terms.” (bolding mine) – Nigel Kendall, Times (UK) Online, Google Book Search: why it matters
  2. “Google’s five-year head start and its relationships with libraries and publishers give it an effective monopoly: No competitor will be able to come after it on the same scale. Nor is technology going to lower the cost of entry. Scanning will always be an expensive, labor-intensive project.” (bolding mine) – Geoffrey Nunberg, Chronicle of Higher Education, Google’s Book Search: A Disaster for Scholars (pardon the paywall)
  3. Of course there are lots of benefits, too!

Posted in books and reading, future of publishing, information ecosystem, intellectual freedom, library and information science | Comments (1)