What types of data do social networks have? See Schneier’s Taxonomy.

November 20th, 2009
by jodi

Rights to data may depend, says Bruce Schneier, on what type of data it is and who provided it. He provides a useful enumeration:

1. Service data. Service data is the data you need to give to a social networking site in order to use it. It might include your legal name, your age, and your credit card number.

2. Disclosed data. This is what you post on your own pages: blog entries, photographs, messages, comments, and so on.

3. Entrusted data. This is what you post on other people’s pages. It’s basically the same stuff as disclosed data, but the difference is that you don’t have control over the data — someone else does.

4. Incidental data. Incidental data is data the other people post about you. Again, it’s basically the same stuff as disclosed data, but the difference is that 1) you don’t have control over it, and 2) you didn’t create it in the first place.

5. Behavioral data. This is data that the site collects about your habits by recording what you do and who you do it with.
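To make the five categories concrete, here is a minimal sketch in Python. The `classify` function and its two parameters (`created_by`, `location`) are my own illustrative assumptions, not anything Schneier proposes; they only encode the distinctions in the list above (who provided the data, and where it was posted).

```python
from enum import Enum


class DataType(Enum):
    """The five categories from Schneier's list."""
    SERVICE = "service"
    DISCLOSED = "disclosed"
    ENTRUSTED = "entrusted"
    INCIDENTAL = "incidental"
    BEHAVIORAL = "behavioral"


def classify(created_by: str, location: str = "") -> DataType:
    """Map a hypothetical record onto one of the five categories.

    created_by: "you", "other", or "site" (assumed labels).
    location:   "signup", "own_page", or "other_page" (assumed labels);
                only consulted when created_by is "you".
    """
    if created_by == "site":
        # Collected by the site from recording what you do.
        return DataType.BEHAVIORAL
    if created_by == "other":
        # Posted about you by someone else.
        return DataType.INCIDENTAL
    if created_by == "you":
        if location == "signup":
            return DataType.SERVICE
        if location == "own_page":
            return DataType.DISCLOSED
        if location == "other_page":
            return DataType.ENTRUSTED
    raise ValueError(f"cannot classify: {created_by!r}, {location!r}")
```

The sketch forces every record into exactly one category, which is a simplification.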

See Schneier’s post for discussion. Via a pointer on Rob Styles’ blog, in turn via Rob’s tweet.

Have you come across other taxonomies for social networking data?

Here’s a simple but far less expressive way to characterize data on social networks. Is it “about you” or “from you”? Either the first, the second, neither, or both. “Aboutness”, however, is ontologically challenging. Any use for this?

Collaboration/shared control isn’t considered in this taxonomy. For instance, “entrusted data” doesn’t capture the notion of “shared data” in a collaborative system such as wave, a wiki, or perhaps even email.

For behavioral data in libraries, see also “intentional data”, as used by Lorcan Dempsey, back to 2005 (and many times since) [for instance, in discussion with “emergent knowledge”]. I prefer “behavioral data” since much data about intention is by no means deliberate/intentional!

Posted in social web | Comments (3)

Wave: mostly a rant

November 15th, 2009
by jodi

I’ve been on Google Wave for about a week and a half. So far I only have things to complain about.

I watched, with rapt fascination, the hour-long intro video ((oops–make that 1:20!)) back in May. Though video is usually something I consume for entertainment, not information.

So it may be that my hopes were too high. Given that Wave is a ‘preview’ (is that one level below beta?), there’s still hope for the future.

Things I don’t like about wave:

  • The new stuff isn’t always at the bottom, and ‘diff’ is a video
  • I have to add my own contacts all over again, and they’ve got new “email” addresses
  • Closed system–so I can’t communicate with just anybody
  • Feels very slow
  • Need to click to edit–yet I’m still always creating errant blank notes
  • I can’t tell what I can edit and what I can’t
  • What’s the etiquette? ((For instance, I *am* going to delete blips and extraneous comments to make things easier to follow. In a wiki this would be expected. In my own inbox it’s up to me. But in a public listserv conversation it’s verboten, except perhaps for spam deletion.))
  • Doesn’t separate content and discussion
  • Waves with lots of people get really long really quickly
  • Other maintenance–like, I guess I’m supposed to add a picture for myself?
  • The ‘inbox’ is really a list of things I’m paying attention to. ‘inbox’ seems a misnomer.
  • I can’t subscribe to wave alerts via email (e.g. if I haven’t logged in in some amount of time, remind me by email that I might want to)
  • Those damn arrows! I DON’T WANT TO SCROLL!!!!!
  • I want a list of bots, and to add a bot by clicking a button.
  • I want a ‘make this public’ button, rather than having to scramble for an email address to add.

For more information about wave, check Google’s About pages, Wikipedia’s overview, or the in-progress wiki aiming to be The Complete Guide to Google Wave. At the moment, I’ve still got a few invites to give away, if you’d like to try it out for yourself.

Overall, I’m struck by the length and lack of summarization in Wave. One of the reasons I keep using gmail is that it (often but not always) helps me to keep track of the conversation. Wave doesn’t do that right now: the ‘preview’ or subject line just pulls from the first blip. (Even just pulling from the latest blip would help!)

I have a few active collaborations in Wave (SIOC, the ‘unofficial code4lib conference wave’, and a small advertising/new media conversation we’re testing moving from email). Perhaps as time goes on I’ll have a better understanding of what it’s good for in practice. Meanwhile, I welcome pointers to others’ experiences, especially easy-to-digest tips about how you’re using Wave!

Posted in information ecosystem, random thoughts | Comments (0)

Litl, the explicitly social webbook, for the living room

November 14th, 2009
by jodi

The litl is a $695 ‘webbook’ with a 2-year money-back guarantee. via Scott Janousek ((Scott is a Boston-based flash developer I first discovered when following the chumby. Scott has started his own blog devoted to the litl, which, like the chumby, uses widgets. More details from his regular blog.))


cc licensed flickr photo shared by litl

The company is selling the litl as the no-fuss way to get online at home. It reminds me of the olpc more than anything I’ve seen:

  • “practically sunlight readable” screen
  • explicitly social (more below)
  • has a handle
  • converts to an easel
  • its own new, linux-based Litl OS
  • keyboard changes and simplification: “We’ve eliminated the inscrutable function keys and buttons with weird symbols. We also took out the cap locks key, which everyone uses only by mistake.” They’ve also added a ‘Litl button’ to get back to the home screen.
  • everything is always full screen ((well, except that you get 12 widgets on the homescreen))
  • 3 pounds
  • sturdy: only moving part is a small fan

It’s also something of an ambient information device, with focus on viewing rather than typing, and ‘distracted interaction’.

Like the chumby, the litl:

  • invites others to build widgets
  • advertises itself as a clock
  • has channels (which can be synced with the other litls)
  • unusual navigation (in litl’s case: a roller-wheel and remote control)
  • has upgraded packaging

Marketing is snazzy, with a lot of thought put into packaging, including card illustrations by David Macaulay (flickr) (company blog post).


cc licensed flickr photo shared by litl

Litl has a strong social media presence. For instance, they advertise their minimalist packaging with a company-made unboxing video:

litl webbook unboxing from litl on Vimeo.

Aside from the company website, the most detailed information is from Wade Roush‘s xconomy Boston review.

In place of a desktop, the Webbook has a home screen that displays up to 12 boxes that Chuang calls “Web cards.” Some represent Web pages, others represent RSS feeds, and still others represent widgets or “channels” that are the Webbook’s closest thing to native applications—for example, there’s an egg timer widget for use in the kitchen and a Weather Channel widget that shows the temperature outdoors.

The litl is explicitly social: “By linking multiple litls, you can synchronize channels automatically.”
A ‘share’ button also pushes the current content to another Litl.

Convenience features

Posted in random thoughts | Comments (0)

Joining the W3C

November 14th, 2009
by jodi

DERI is a W3C member, so one of the perks of studying here is getting nominated for W3C membership. Yesterday I got my W3C account. While I’ve yet to explore the Member area, I’ve been thoroughly briefed on the dissemination and confidentiality policies.

18 months ago, I wrote about the W3C for Wendell Piez‘s Document Processing class. This particular assignment was to research a standard or standards organization, and to prepare a wiki page summarizing it for our colleagues. I’ve shared this below. Among other things, it shows what (little) I know about the W3C to date.


March 15, 2008 (with markup revisions)

Who are they? What does the acronym stand for?

W3C, the World Wide Web Consortium, is an international membership organization founded in 1994. Their mission: “To lead the World Wide Web to its full potential by developing protocols and guidelines that ensure long-term growth for the Web.”

How are they organized? Who pays for their operations?

The Members—over 400 organizations in 40 countries—pay the bills for the W3C. The W3C also has a Team of 68 full-time staff headed by founding director Tim Berners-Lee (http://www.w3.org/Consortium/about-w3c.html). Management and Oversight functions are provided by an Advisory Committee (a representative body of the Members), an Advisory Board, a Technical Architecture Group, and Host Institutions. Further details about Membership and Oversight are available below.

What does the W3C do? What are their most important standards and what sorts of standards do they create?

The W3C “develops open specifications (de facto standards) to enhance the interoperability of web-related products.” Webstandards.org

The W3C’s best-known standards (‘Recommendations’) are HTML, XML, CSS, and xHTML. XSL and XSLT are also W3C Recommendations. Many XML-related technologies are also Recommendations of the W3C. See the full list of W3C Recommendations.

How do they go about creating these standards?

Specifications originate from Working Groups and pass through a multi-step process in order to become a W3C Recommendation:

  • Working Draft (WD)
  • Candidate Recommendation (CR)
  • Proposed Recommendation (PR)
  • Recommendation (REC)

During this process, Working Groups are expected to publish a new draft at least every 3 months under the heartbeat rule. Frequent publication allows for early, frequent feedback, and the Working Group is expected to attend to issues raised by the community. Public comment, as well as the endorsement of W3C Members and the W3C Director, are an important part of the Recommendation Track.
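The track and the heartbeat rule can be sketched as a tiny state machine. This is only an illustration using the four stages listed above; the names `Maturity`, `next_stage`, and `heartbeat_overdue` are hypothetical, and "3 months" is approximated as 90 days, which is my assumption rather than the W3C's definition.

```python
from datetime import date
from enum import IntEnum
from typing import Optional


class Maturity(IntEnum):
    """The four Recommendation Track stages, in order."""
    WD = 1   # Working Draft
    CR = 2   # Candidate Recommendation
    PR = 3   # Proposed Recommendation
    REC = 4  # Recommendation


def next_stage(stage: Maturity) -> Optional[Maturity]:
    """Return the following stage, or None once a document is a REC."""
    if stage is Maturity.REC:
        return None
    return Maturity(stage + 1)


# Assumption: "every 3 months" approximated as 90 days.
HEARTBEAT_DAYS = 90


def heartbeat_overdue(last_published: date, today: date) -> bool:
    """True if no draft has been published within the heartbeat window."""
    return (today - last_published).days > HEARTBEAT_DAYS
```

The sketch ignores everything that makes the real process interesting (reviews, endorsements, documents moving backwards), so treat it as a memory aid rather than a model.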

Recommendation Track

“The W3C Recommendation Track process is designed to maximize consensus about the content of a technical report, to ensure high technical and editorial quality, and to earn endorsement by W3C and the broader community.” The Organizational Process document provides details about the documents produced on the Recommendation Track:

Working Draft (WD)
A Working Draft is a document that W3C has published for review by the community, including W3C Members, the public, and other technical organizations.
Candidate Recommendation (CR)
A Candidate Recommendation is a document that W3C believes has been widely reviewed and satisfies the Working Group’s technical requirements. W3C publishes a Candidate Recommendation to gather implementation experience.
Proposed Recommendation (PR)
A Proposed Recommendation is a mature technical report that, after wide review for technical soundness and implementability, W3C has sent to the W3C Advisory Committee for final endorsement.
W3C Recommendation (REC)
A W3C Recommendation is a specification or set of guidelines that, after extensive consensus-building, has received the endorsement of W3C Members and the Director. W3C recommends the wide deployment of its Recommendations. Note: W3C Recommendations are similar to the standards published by other organizations.

Other publications

Recommendations may also become Rescinded Recommendations, and Working Groups may decide to abort work and publish a Working Group Note annotating their work to date (alternatives to Recommendations).

The W3C also publicizes selected content from their constituents, without endorsing this content. See Team Submissions and Member Submissions. These submissions are not standards.

W3C’s work is organized into 21 areas, called Activities. Activity summaries and homepages are a rich source of information about ongoing work of the W3C.

Who does this standards work?

Working Groups do much of the standards work of the W3C. Working Groups primarily consist of Member representatives and Team representatives; Invited Experts may also participate. Designated individuals work as Chair and Team Contact for these small groups “(typically fewer than 15 people)”. Typically, Working Groups last for 6 months to 2 years. Working Groups “typically produce deliverables (e.g., Recommendation Track technical reports, software, test suites, and reviews of the deliverables of other groups).” Working Groups are constituted when the W3C Director, currently Tim Berners-Lee, issues a Call for Participation, providing the Advisory Committee with a Charter of the group’s mission, duration, and deliverables. They are expected to invite and respond to public commentary about standards in progress.

The W3C also has Interest Groups and Coordination Groups. Interest Groups are larger bodies without deliverables formed around a technical interest. For some Interest Groups, participation on a public mailing list is the only criterion for participation. Like Working Groups, Interest Groups are originally created by a Call for Participation and Charter from the Director.

Coordination Groups “manage dependencies”. Coordination Groups consist of a Chair, the Chair of each coordinated group (to promote effective communication among the groups), invited experts (e.g., liaisons to groups inside or outside W3C), and Team representatives (including the Team Contact).

How do they promulgate their standards? What leverage, if any, do they have over their users or potential users?

The multi-step community process of the Recommendation Track publicizes draft standards and invites input from the community. So to some extent, W3C Recommendations are promulgated during development. Furthermore, many Members are large corporations, particularly technology companies which may wield influence over the adoption of standards.

While W3C doesn’t issue certifications for compliant implementations, in some cases they do sponsor compliance testing tools, for instance, HTML Conformance Testing. However, in some cases, standards may be only partially implemented or extended. HTML 4 provides an example: browsers such as Internet Explorer and Safari don’t exactly implement this Recommendation. Rather, they approximate the standard, by implementing the protocols designed by the companies that build them.

W3C Activities may have associated education groups such as the Semantic Web Education and Outreach (SWEO) Interest Group which may promote the W3C, its technologies, and the associated Recommendations.

W3C Membership

Membership is open, subject to approval by the W3C, and involves paying a fee to the organization. Rates are tiered, depending on the income level of the country where the organization is headquartered, as classified by the World Bank. (In 2008, the fees range from $953/year to $63,500/year. U.S. organizations pay $6,350 or $63,500, depending on their non-profit status and income.)

A list of current Members is available. Representatives of Member organizations make up the Advisory Committee, composed of one representative from each Member organization.

Oversight and management of the W3C

Oversight and management functions are handled by various groups: Advisory Committee, Advisory Board, Technical Architecture Group, and Host Institutions.

Advisory Committee

Handles: Meets twice a year about overall direction of the W3C
Composition: 1 participant per Member Organization

Advisory Board (AB)

Handles: Business oversight, including Member concerns and matters of “strategy, management, legal matters, process, and conflict resolution”
Composition: 10 participants: 9 elected participants and a Chair (currently Tim Berners-Lee)

Technical Architecture Group (TAG)

Handles: Technical oversight and stewardship of the Web architecture, especially consensus-building and collaboration relating to Web architecture.
Composition: 9 participants: 5 elected by Advisory Committee, 3 appointed by Director, 1 Chair (currently Tim Berners-Lee)

Host Institutions

Handles: Signing contracts, oversight of “Team salaries, detailed budgeting, and other business decisions”.
Composition: The W3C has 3 host institutions: MIT in Cambridge, MA, U.S.A., the European Research Consortium for Informatics and Mathematics (ERCIM) in France, and Keio University in Japan.

Note: The W3C is not legally incorporated. Instead, the host organizations (which are not members of the W3C) enter into contracts for the W3C. Membership documents, for example, are currently executed by each host organization.

Getting involved with the W3C as an individual

Posted in PhD diary | Comments (0)

Scholarly Streams

November 10th, 2009
by jodi

Streams aren’t new. Funding for streams, though, that’s new.

MediaCommons has just announced funding from the NEH to create “digital portfolios”:
“Given this proliferation, what we need as scholars may be less a system that will manage our communication for us than a system that will allow us to manage our communication, a system that recognizes that the key aspect of scholarly communication into the future may be less the distribution of the products of our research than the management of the social networks through which our research is distributed.” [emphasis mine] MediaCommons as Digital Scholarly Network: Unveiling the Profile System. Via @kfitz.

So scholars don’t have to roll their own, ((Personally I’m all for rolling your own. At least in theory. The first lifestream I ever noticed was code4lib’ber Mark Matienzo’s self-hosted planet, which aggregates his blog posts (both personal and work), tweets, youtube uploads, delicious bookmarks, and last.fm scrobbles. Brilliant, but thus far I’ve been too shy & lazy to follow suit.)) or depend on dubiously-funded startups. ((FriendFeed popularized lifestreams. When Facebook bought FriendFeed back in August, my networks of librarians and scientists had several discussions of alternatives for scientists and other scholars.))

While the announcement implies “less is more”, Kathleen’s sample profile strikes me as a lifestream. Streams themselves are more “more” than “less”. (‘Firehose’ comes to mind.) So streams alone aren’t going to solve scholarly communication. But streams can be sliced and diced any number of ways. First the data. Then, if there’s interest, maybe some services.
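For what "sliced and diced" might look like in practice, here is a rough Python sketch. It is not how MediaCommons or any planet software works; `Item`, `merge_streams`, and `slice_by_source` are hypothetical names, and it assumes each source already provides its items sorted oldest-first.

```python
import heapq
from datetime import datetime
from typing import Iterable, Iterator, NamedTuple


class Item(NamedTuple):
    """One entry in a lifestream (hypothetical shape)."""
    when: datetime
    source: str   # e.g. "blog", "twitter", "delicious"
    text: str


def merge_streams(*streams: Iterable[Item]) -> Iterator[Item]:
    """Merge several streams, each already sorted oldest-first, into one."""
    return heapq.merge(*streams, key=lambda item: item.when)


def slice_by_source(stream: Iterable[Item], sources: Iterable[str]) -> Iterator[Item]:
    """Keep only the items that come from the named sources."""
    wanted = set(sources)
    return (item for item in stream if item.source in wanted)
```

Merging gives the firehose; slicing is where "less" comes back in, for example `slice_by_source(merge_streams(blog, tweets), ["blog"])` to follow only someone's longer writing.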

Posted in information ecosystem, scholarly communication | Comments (2)

Weeding

October 28th, 2009
by jodi

I have always admired what Knuth says about email:

Email is a wonderful thing for people whose role in life is to be on top of things. But not for me; my role is to be on the bottom of things. What I do takes long hours of studying and uninterruptible concentration. I try to learn certain areas of computer science exhaustively; then I try to digest that knowledge into a form that is accessible to people who don’t have time for such study.

(emphasis added) – [Knuth versus Email]

I’m not giving up email (heaven forbid!). And none of us are Knuth!

But I am weeding: culling away most listservs and feeds that aren’t core to semantic web or social web, or to my personal life (or joy). I’ll be keeping an eye on twitter, of course.


cc licensed flickr photo “weedy mulch on the start of compost pile” by Liz Henry

This (hopefully) brings me more attention for figuring out what’s going on in the semantic web and social web community, and for my literature review. But it means I’m going to have to accept lagging behind a little in everything else.

I have a love for being-in-the-know and finding interesting links. That makes it hard to stay on the bottom of things! For now, this energy and mindset get directed to my literature review (which is very much like finding the coolest new thing, only that thing may be from 1985, and nearly wholly forgotten, and not cool or new to anyone except for oneself.)

The good news is that when studying the social web, total disconnection generally isn’t desirable!

Posted in books and reading, PhD diary | Comments (0)

When an abstract is not a summary: check the audience

October 27th, 2009
by jodi

I’ve been arguing with Jim Pitman about how abstracts are different from summaries. The audience, I think, determines whether a text is suitable to be used as a summary.

This seems like a good example:

Lumley, J., Gimson, R., & Rees, O. (2007). Endless documents: a publication as a continual function. In Proceedings of the 2007 ACM symposium on Document engineering (pp. 174-176). Winnipeg, Manitoba, Canada: ACM. doi: 10.1145/1284420.1284463

Variable data can be considered as functions of their bindings to values. The Document Description Framework (DDF) treats documents in this manner, using XSLT semantics to describe document functionality and a variety of related mechanisms to support layout, reference and so forth. But the result of evaluation of a function could itself be a function: can variable data documents behave likewise? We show that documents can be treated as simple continuations within that framework with minor modifications. We demonstrate this on a perpetual diary.

This is a really interesting article from a team at HP Bristol (UK). They seem to be talking about the benefit of publishing as you go along (i.e. blogs or medical records). They call these “continual documents”.
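My own reading of "the result of evaluation of a function could itself be a function" can be sketched in a few lines of Python. This is not DDF or XSLT, and it is not taken from the paper; the `diary` function and the "Day N" rendering are assumptions I made up to illustrate a document that, each time it is evaluated with a new entry, returns both the rendered pages so far and a new function waiting for the next entry.

```python
from typing import Callable, List, Tuple


def diary(pages: Tuple[str, ...] = ()) -> Callable[[str], Tuple[List[str], Callable]]:
    """Return a function that accepts the next diary entry.

    Calling that function yields (rendered_document, next_function),
    so the document never has to be "finished".
    """
    def add_entry(entry: str) -> Tuple[List[str], Callable]:
        new_pages = pages + (entry,)
        # Render everything written so far (a stand-in for real layout).
        rendered = [f"Day {n}: {text}" for n, text in enumerate(new_pages, start=1)]
        return rendered, diary(new_pages)

    return add_entry
```

Each call returns a complete, publishable document plus the means to extend it, which is roughly how I picture a perpetual diary.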

I picked it up ((I came across a conference on ‘document engineering’ [ACM digital library, may have a paywall] while sifting through articles for my literature review. ‘Document engineering’ includes lots of stuff that’s out of scope. Some material, on structural markup, may be relevant to online argumentation.)) because the abstract seemed bizarre, but the topic seemed interesting. “Continual documents” struck me as “continual functions”. And the mention of XSLT hinted at transforming a document using its underlying structure.

Surely, I thought, this abstract couldn’t describe its contents. After glancing through it, I’m not sure: This abstract may well summarize the contents of the article. But for me, the abstract really didn’t serve as a summary: I don’t know the field, so the terminology (e.g. document engineering, Document Description Framework ((One interesting line stands out: “In DDF documents most program elements are <xslt:template/> trees.”))) didn’t clue me in.

This difference gets at what AcaWiki is trying to do: provide a place for people to discuss/summarize research articles, in the way that Wikipedia is a place to discuss/summarize topics. Neither is a place for research but both are places for experts to share knowledge, for would-be-experts to describe what they know, and for non-experts to glean a deeper sense of the world than they might have had otherwise.

Posted in argumentative discussions, PhD diary, random thoughts | Comments (1)

New Beginnings

October 22nd, 2009
by jodi

This week I’m beginning my Ph.D. in Galway, Ireland at DERI.

Things move very quickly here. Unlike a U.S. Ph.D. student, I start with a supervisor (Alexandre Passant), an academic mentor (John Breslin), and a ‘professor in discipline’ (not quite sure yet what that entails) (Stefan Decker). Before arriving, I also put in a thesis proposal, which Alex drafted and I merely tweaked:

My thesis will investigate the use of Semantic Web technologies to represent argumentative discussions in online communities—how people have discussions on blogs, wikis, etc. and how they agree, disagree, etc.—and to make these discussions machine-readable and interoperable.

Posted in argumentative discussions, PhD diary, social semantic web | Comments (4)

Gone to Galway

October 21st, 2009
by jodi

Sure, you expect photos of lush green landscapes (and you might get some out of me eventually).

But first, a more practical sign that I’d arrived:

Google Calendar asks: "Change time zone to Dublin?"

Posted in random thoughts | Comments (0)

Google Books settlement: a monopoly waiting to happen

October 10th, 2009
by jodi

Will Google Books create a monopoly? Some ((“Several European nations, including France and Germany, have expressed concern that the proposed settlement gives Google a monopoly in content. Since the settlement was the result of a class action against Google, it applies only to Google. Other companies would not be free to digitise books under the same terms.” (bolding mine) – Nigel Kendall, Times (UK) Online, Google Book Search: why it matters )) people think ((“Google’s five-year head start and its relationships with libraries and publishers give it an effective monopoly: No competitor will be able to come after it on the same scale. Nor is technology going to lower the cost of entry. Scanning will always be an expensive, labor-intensive project.” (bolding mine) – Geoffrey Nunberg, Chronicle of Higher Education, Google’s Book Search: A Disaster for Scholars (pardon the paywall))) so. Brin claims it won’t:

If Google Books is successful, others will follow. And they will have an easier path: this agreement creates a books rights registry that will encourage rights holders to come forward and will provide a convenient way for other projects to obtain permissions.

-Sergey Brin, New York Times, A Library To Last Forever

Brin is wrong: the proposed Google Books settlement will not smooth the way for other digitization projects. It creates a red carpet for Google while leaving everyone else at risk of copyright infringement.

The safe harbor provisions apply only to Google. Anyone else who wants to use one of these books would face the draconian penalties of statutory copyright infringement if it turned out the book was actually still copyrighted. Even with all this effort, one will not be able to say with certainty that a book is in the public domain. To do that would require a legislative change – and not a negotiated settlement.

– Peter Hirtle, LibraryLawBlog: The Google Book Settlement and the Public Domain.

Monopoly is not the only risk. Others include ((Of course there are lots of benefits, too!)) reader privacy, access to culture, suitability for bulk and some research users (metadata, etc.). Too bad Brin isn’t acknowledging that!

Don’t know what all the fuss is with Google Books and the proposed settlement? Wired has a good outline from April.

Posted in books and reading, future of publishing, information ecosystem, intellectual freedom, library and information science | Comments (1)