
Archive for the ‘PhD diary’ Category

Enabling a Social Semantic Web for Argumentation (defining my Ph.D. research problem)

July 23rd, 2010

I’m working on online argumentation: Making it easier to have discussions, get to consensus, and understand disagreements across websites.

Here are the 3 key questions and the most closely related work that I’ve identified in the first 9 months of my Ph.D.

Read on, if you want to know more. Then let me know what you think! Suggestions will be especially helpful since I’m writing my first year Ph.D. report, which will set the direction for my second year at DERI.

Enabling a Social Semantic Web for Argumentation

Argumentative discussions occur informally throughout the Web, but there is currently no way to bring together all of the discussions on a given topic along with an indication of who is agreeing and who is disagreeing. Substantial human analysis is therefore required to integrate opinions and expertise to, for instance, determine the best policies and procedures to mitigate global warming, or the recommended treatment for a given disease. New techniques for gathering and organising the Social Web using ontologies such as FOAF and SIOC show promise for creating a Social Semantic Web for argumentation.

I am currently investigating three main research questions to establish the Social Semantic Web for argumentation:

1. How can we best define argumentation for the Social Semantic Web, to isolate the essential problems? We wish to enable reasoning with inconsistent knowledge, to integrate disparate knowledge, and identify consensus and disputes. Similar questions and techniques come up in related but distinct areas, such as sentiment analysis, dialogue mapping, dispute resolution, question-answering and e-government participation.
2. What sort of modular framework for argumentation can support distributed, emergent argumentation: a World Wide Argumentation Web? Some Web 2.0 tools, such as Debatepedia, LivingVote, and Debategraph, provide integrated environments for explicit argumentation. But our goal is for individuals to be able to use their own preferred tools, in a social environment, while understanding what else is being discussed.
3. How can we manage the tension between informality and ease of expression on the one hand and formal semantics and retrievability/reusability on the other hand? Minimal integration of informal arguments requires two pieces of information: a statement of the issue or proposition, and an indication of polarity (agreement or disagreement). How can we gather this information without adding cognitive overhead for users?
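
As a rough illustration of the third question, the two pieces of information named above (an issue statement and a polarity) could be captured in a record like the one below. This is only a sketch: the Stance class, its field names, the tally helper, and the example authors and URLs are all hypothetical, and none of them come from FOAF, SIOC, or any existing tool.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Polarity(Enum):
    AGREE = "agree"
    DISAGREE = "disagree"


@dataclass(frozen=True)
class Stance:
    """One person's position on one issue (hypothetical minimal record)."""
    issue: str          # statement of the issue or proposition
    polarity: Polarity  # agreement or disagreement
    author: str         # who holds the position (illustrative identifier)
    source: str         # where it was expressed (illustrative URL)


def tally(stances, issue):
    """Count agreement and disagreement on a single issue across sources."""
    counts = Counter(s.polarity for s in stances if s.issue == issue)
    return {p: counts.get(p, 0) for p in Polarity}


# Made-up stances gathered from three different (example) websites.
stances = [
    Stance("Carbon taxes reduce emissions", Polarity.AGREE, "alice", "http://example.org/blog/1"),
    Stance("Carbon taxes reduce emissions", Polarity.DISAGREE, "bob", "http://example.com/forum/7"),
    Stance("Carbon taxes reduce emissions", Polarity.AGREE, "carol", "http://example.net/wiki/tax"),
]
summary = tally(stances, "Carbon taxes reduce emissions")
print(summary[Polarity.AGREE], summary[Polarity.DISAGREE])
```

The hard part, of course, is not the record itself but filling it in without asking users to do extra work; the sketch assumes the issue text has already been matched exactly across sites.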

Related Work

Ennals et al. ask: ‘What is disputed on the Web?’ (Ennals 2010b). They use annotation and NLP techniques to develop a prototype system for highlighting disputed claims in Web documents (Ennals 2010a). Cabanac et al. (2010) find that two algorithms for identifying the level of controversy about an issue were up to 84% accurate (compared to human perception), on a corpus of 13 arguments. These are useful prototypes of what could be done; Ennals’ prototype is indeed a Web-scale system, but disputed claims are not arguments.

Rahwan et al. (2007) present a pilot Semantic Web-based system, ArgDF, in which users can create arguments, and query to find networks of arguments. ArgDF is backed by the AIF-RDF ontology, and uses Semantic Web standards. Rahwan (2008) surveys current Web 2.0 tools, pointing out that integration between these tools is lacking, and that only very shallow argument structures are supported; ArgDF and AIF-RDF are explained as an improvement. What is lacking is uptake in end-user orientated (e.g. Web 2.0) tools.

The Web 2.0 aspect of the problem is explored in several papers, including Buckingham Shum (2008), which presents Cohere, a Web 2.0-style argumentation system supporting existing (non-Semantic Web) argumentation standards, and Groza et al. (2009), which proposes an abstract framework for modeling argumentation. These are either minimally implemented frameworks or stand-alone systems which do not yet support the distributed, emergent argumentation envisioned, as further elucidated by Buckingham Shum (2010).

References with links to preprints

S. Buckingham Shum, “Cohere: Towards Web 2.0 Argumentation,” Computational Models of Argument – Proceedings of COMMA 2008, IOS Press, 2008.
S. Buckingham Shum, AIF Use Case: Iraq Debate, Glenshee, Scotland, UK: 2010. http://projects.kmi.open.ac.uk/hyperdiscourse/docs/AIF-UseCase-v2.pdf
G. Cabanac, M. Chevalier, C. Chrisment, and C. Julien, “Social validation of collective annotations: Definition and experiment,” Journal of the American Society for Information Science and Technology, vol. 61, 2010, pp. 271-287.
R. Ennals, B. Trushkowsky, and J.M. Agosta, “Highlighting Disputed Claims on the Web,” WICOW 2010, Raleigh, North Carolina: 2010.
R. Ennals, D. Byler, J.M. Agosta, and B. Rosario, “What is Disputed on the Web?,” WWW 2010, Raleigh, North Carolina: 2010.
T. Groza, S. Handschuh, J.G. Breslin, and S. Decker, “An Abstract Framework for Modeling Argumentation in Virtual Communities,” International Journal of Virtual Communities and Social Networking, vol. 1, Sep. 2009, pp. 35-47.
I. Rahwan, “Mass argumentation and the semantic web,” Web Semantics: Science, Services and Agents on the World Wide Web, vol. 6, Feb. 2008, pp. 29-37.
I. Rahwan, F. Zablith, and C. Reed, “Laying the foundations for a World Wide Argument Web,” Artificial Intelligence, vol. 171, Jul. 2007, pp. 897-921.

Tags: DERI, first year report
Posted in argumentative discussions, PhD diary, semantic web, social semantic web, social web | Comments (1)

W3C Library Linked Data Incubator Group starting

May 25th, 2010

The W3C has announced an incubator activity around Library Linked Data. I’ll be one of DERI’s participants in the group.

Its mission? To help increase global interoperability of library data on the Web, and to bring together people from archives, museums, publishing, etc. to talk about metadata. See the charter for more details.

Interested in joining? If you’re at a W3C member organization, ask your Advisory Committee Representative to appoint you. Or, get appointed as an invited expert by contacting one of the chairs (Tom Baker, Emmanuelle Bermes, Antoine Isaac); their contact info is available from the participants’ list.

Or, you can follow along on the incubator group’s public mailing list. (For organizing, the Sem lib mailing list was used.)

The first teleconference will be Thursday, 3 June at 1500 UTC.

Tags: library linked data, semantic libraries, W3C, W3C library linked data incubator group
Posted in library and information science, PhD diary, semantic web | Comments (0)

Starving the subconscious

November 30th, 2009

Your brain builds something from whatever mental flotsam and jetsam is in your head. Perhaps it’s a useful thing, an answer to a question you didn’t know you needed. Perhaps it’s just an interesting combination of thoughts put into a story. It’s dreaming, but you’re awake.

-[Rands]

…when you have a real important problem you don’t let anything else get the center of your attention – you keep your thoughts on the problem. Keep your subconscious starved so it has to work on your problem, so you can sleep peacefully and get the answer in the morning, free.

-[Richard Hamming]

Metaresearch?

Tags: metaresearch, subconscious
Posted in PhD diary, random thoughts | Comments (0)

Joining the W3C

November 14th, 2009

DERI is a W3C member, so one of the perks of studying here is getting nominated for W3C membership. Yesterday I got my W3C account. While I’ve yet to explore the Member area, I’ve been thoroughly briefed on the dissemination and confidentiality policies.

18 months ago, I wrote about the W3C for Wendell Piez’s Document Processing class. This particular assignment was to research a standard or standards organization, and to prepare a wiki page summarizing it for our colleagues. I’ve shared this below. Among other things, it shows what (little) I know about the W3C to date.

March 15, 2008 (with markup revisions)

Who are they? What does the acronym stand for?

W3C, the World Wide Web Consortium, is an international membership organization founded in 1994. Their mission: “To lead the World Wide Web to its full potential by developing protocols and guidelines that ensure long-term growth for the Web.”

How are they organized? Who pays for their operations?

The Members—over 400 organizations in 40 countries—pay the bills for the W3C. The W3C also has a Team of 68 full-time staff headed by founding director Tim Berners-Lee (http://www.w3.org/Consortium/about-w3c.html). Management and Oversight functions are provided by an Advisory Committee (a representative body of the Members), an Advisory Board, a Technical Architecture Group, and Host Institutions. Further details about Membership and Oversight are available below.

What does the W3C do? What are their most important standards and what sorts of standards do they create?

The W3C “develops open specifications (de facto standards) to enhance the interoperability of web-related products.” (Webstandards.org)

The W3C’s best-known standards (‘Recommendations’) are HTML, XML, CSS, and XHTML. XSL and XSLT are also W3C Recommendations, as are many other XML-related technologies. See the full list of W3C Recommendations.

How do they go about creating these standards?

Specifications originate from Working Groups and pass through a multi-step process in order to become a W3C Recommendation:

Working Draft (WD)
Candidate Recommendation (CR)
Proposed Recommendation (PR)
Recommendation (REC)

During this process, Working Groups are expected to publish a new draft at least every 3 months under the heartbeat rule. Frequent publication allows for early, frequent feedback, and the Working Group is expected to attend to issues raised by the community. Public comment and the endorsement of W3C Members and the W3C Director are important parts of the Recommendation Track.
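
The four stages and the heartbeat rule described above can be sketched in a few lines of Python. This is a simplified illustration, not anything published by the W3C: the Stage enum, next_stage, and heartbeat_overdue are hypothetical names, the track is modeled as strictly linear, and "every 3 months" is approximated as 92 days.

```python
from datetime import date, timedelta
from enum import IntEnum


class Stage(IntEnum):
    WD = 1   # Working Draft
    CR = 2   # Candidate Recommendation
    PR = 3   # Proposed Recommendation
    REC = 4  # Recommendation


def next_stage(stage):
    """Return the following stage, or None once a document is a Recommendation."""
    return Stage(stage + 1) if stage < Stage.REC else None


# Assumption: "at least every 3 months" is approximated as 92 days.
HEARTBEAT = timedelta(days=92)


def heartbeat_overdue(last_published, today):
    """True if more than one heartbeat interval has passed since the last draft."""
    return today - last_published > HEARTBEAT


print(next_stage(Stage.WD).name)
print(heartbeat_overdue(date(2008, 1, 1), date(2008, 6, 1)))
```

A strictly linear model leaves out the alternatives mentioned under "Other publications" below, such as Rescinded Recommendations and Working Group Notes.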

Recommendation Track

“The W3C Recommendation Track process is designed to maximize consensus about the content of a technical report, to ensure high technical and editorial quality, and to earn endorsement by W3C and the broader community.” The Organizational Process document provides details about the documents produced on the Recommendation Track:

“Working Draft (WD)
A Working Draft is a document that W3C has published for review by the community, including W3C Members, the public, and other technical organizations.
Candidate Recommendation (CR)
A Candidate Recommendation is a document that W3C believes has been widely reviewed and satisfies the Working Group’s technical requirements. W3C publishes a Candidate Recommendation to gather implementation experience.
Proposed Recommendation (PR)
A Proposed Recommendation is a mature technical report that, after wide review for technical soundness and implementability, W3C has sent to the W3C Advisory Committee for final endorsement.
W3C Recommendation (REC)
A W3C Recommendation is a specification or set of guidelines that, after extensive consensus-building, has received the endorsement of W3C Members and the Director. W3C recommends the wide deployment of its Recommendations. Note: W3C Recommendations are similar to the standards published by other organizations.”

Other publications

Recommendations may also become Rescinded Recommendations, and Working Groups may decide to abort work and publish a Working Group Note annotating their work to date (alternatives to Recommendations).

The W3C also publicizes selected content from their constituents, without endorsing this content. See Team Submissions and Member Submissions. These submissions are not standards.

W3C’s work is organized into 21 areas, called Activities. Activity summaries and homepages are a rich source of information about ongoing work of the W3C.

Who does this standards work?

Working Groups do much of the standards work of the W3C. Working Groups primarily consist of Member representatives and Team representatives; Invited Experts may also participate. Designated individuals work as Chair and Team Contact for these small groups “(typically fewer than 15 people)”. Typically, Working Groups last for 6 months to 2 years. Working Groups “typically produce deliverables (e.g., Recommendation Track technical reports, software, test suites, and reviews of the deliverables of other groups).” Working Groups are constituted when the W3C Director, currently Tim Berners-Lee, issues a Call for Participation, providing the Advisory Committee with a Charter of the group’s mission, duration, and deliverables. They are expected to invite and respond to public commentary about standards in progress.

The W3C also has Interest Groups and Coordination Groups. Interest Groups are larger bodies without deliverables formed around a technical interest. For some Interest Groups, participation on a public mailing list is the only criterion for participation. Like Working Groups, Interest Groups are originally created by a Call for Participation and Charter from the Director.

Coordination Groups “manage dependencies”. Coordination Groups consist of a Chair, the Chair of each coordinated group (to promote effective communication among the groups), invited experts (e.g., liaisons to groups inside or outside W3C), and Team representatives (including the Team Contact).

How do they promulgate their standards? What leverage, if any, do they have over their users or potential users?

The multi-step community process of the Recommendation Track publicizes draft standards and invites input from the community. So to some extent, W3C Recommendations are promulgated during development. Furthermore, many Members are large corporations, particularly technology companies which may wield influence over the adoption of standards.

While the W3C doesn’t issue certifications for compliant implementations, in some cases they do sponsor compliance testing tools, for instance, HTML Conformance Testing. However, standards may be only partially implemented or extended. HTML 4 provides an example: browsers such as Internet Explorer and Safari don’t exactly implement this Recommendation. Rather, they approximate the standard, by implementing the protocols designed by the companies that build them.

W3C Activities may have associated education groups such as the Semantic Web Education and Outreach (SWEO) Interest Group which may promote the W3C, its technologies, and the associated Recommendations.

W3C Membership

Membership is open, subject to approval by the W3C, and involves paying a fee to the organization. Rates are tiered, depending on the income level of the country where the organization is headquartered, as classified by the World Bank. (In 2008, the fees range from $953/year to $63,500/year. U.S. organizations pay $6,350 or $63,500, depending on their non-profit status and income.)

A list of current Members is available. Representatives of Member organizations make up the Advisory Committee, composed of one representative from each Member organization.

Oversight and management of the W3C

Oversight and management functions are handled by various groups: Advisory Committee, Advisory Board, Technical Architecture Group, and Host Institutions.

Advisory Committee

Handles: Meets twice a year about overall direction of the W3C
Composition: 1 participant per Member Organization

Advisory Board (AB)

Handles: Business oversight, including Member concerns and matters of “strategy, management, legal matters, process, and conflict resolution”
Composition: 10 participants: 9 elected participants and a Chair (currently Tim Berners-Lee)

Technical Architecture Group (TAG)

Handles: Technical oversight and stewardship of the Web architecture, especially consensus-building and collaboration relating to Web architecture.
Composition: 9 participants: 5 elected by the Advisory Committee, 3 appointed by the Director, 1 Chair (currently Tim Berners-Lee)

Host Institutions

Handles: Signing contracts, oversight of “Team salaries, detailed budgeting, and other business decisions”.
Composition: The W3C has 3 host institutions: MIT in Cambridge, MA, U.S.A., the European Research Consortium for Informatics and Mathematics (ERCIM) in France, and Keio University in Japan.

Note: The W3C is not legally incorporated. Instead, the host organizations (which are not members of the W3C) enter into contracts for the W3C. Membership documents, for example, are currently executed by each host organization.

Getting involved with the W3C as an individual

Tags: DERI, W3C
Posted in PhD diary | Comments (0)

Weeding

October 28th, 2009

I have always admired what Knuth says about email:

Email is a wonderful thing for people whose role in life is to be on top of things. But not for me; my role is to be on the bottom of things. What I do takes long hours of studying and uninterruptible concentration. I try to learn certain areas of computer science exhaustively; then I try to digest that knowledge into a form that is accessible to people who don’t have time for such study.

(emphasis added) – [Knuth versus Email]

I’m not giving up email (heaven forbid!). And none of us are Knuth!

But I am weeding: culling most listservs and feeds that aren’t core to semantic web or social web, or to my personal life (or joy). I’ll be keeping an eye on Twitter, of course.

cc licensed flickr photo “weedy mulch on the start of compost pile” by Liz Henry

This (hopefully) brings me more attention for figuring out what’s going on in the semantic web and social web community, and for my literature review. But it means I’m going to have to accept lagging behind a little in everything else.

I have a love for being-in-the-know and finding interesting links. That makes it hard to stay on the bottom of things! For now, this energy and mindset get directed to my literature review (which is very much like finding the coolest new thing, only that thing may be from 1985, and nearly wholly forgotten, and not cool or new to anyone except for oneself.)

The good news is that when studying the social web, total disconnection generally isn’t desirable!

Tags: attention, culling, Donald Knuth, literature reviews
Posted in books and reading, PhD diary | Comments (0)

When an abstract is not a summary: check the audience

October 27th, 2009

I’ve been arguing with Jim Pitman about how abstracts are different from summaries. The audience, I think, determines whether a text is suitable to be used as a summary.

This seems like a good example:

Lumley, J., Gimson, R., & Rees, O. (2007). Endless documents: a publication as a continual function. In Proceedings of the 2007 ACM symposium on Document engineering (pp. 174-176). Winnipeg, Manitoba, Canada: ACM. doi: 10.1145/1284420.1284463

Variable data can be considered as functions of their bindings to values. The Document Description Framework (DDF) treats documents in this manner, using XSLT semantics to describe document functionality and a variety of related mechanisms to support layout, reference and so forth. But the result of evaluation of a function could itself be a function: can variable data documents behave likewise? We show that documents can be treated as simple continuations within that framework with minor modifications. We demonstrate this on a perpetual diary.

This is a really interesting article from a team at HP Bristol (UK). They seem to be talking about the benefit of publishing as you go along (i.e. blogs or medical records). They call these “continual documents”.
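
One way to picture the idea in the abstract, that evaluating a document function can itself yield a function, is a diary that returns both the text rendered so far and a new function waiting for the next entry. The sketch below is plain Python, not DDF or XSLT; diary and add_entry are hypothetical names, and it assumes this reading of the abstract's "perpetual diary" is roughly right.

```python
def diary(entries=()):
    """Return the rendered diary so far and a function that accepts the next entry."""
    rendered = "\n".join(f"{day}: {text}" for day, text in entries)

    def add_entry(day, text):
        # Binding the next value yields a new document, which is again
        # a (rendered text, continuation) pair.
        return diary(entries + ((day, text),))

    return rendered, add_entry


# Each call publishes the document as it stands and hands back
# the function that will produce the next version.
page, more = diary()
page, more = more("Mon", "Started literature review")
page, more = more("Tue", "Read about document engineering")
print(page)
```

The document is never finished: every evaluation produces a readable page plus the means to extend it.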

I picked it up ((I came across a conference on ‘document engineering’ [ACM digital library, may have a paywall] while sifting through articles for my literature review. ‘Document engineering’ includes lots of stuff that’s out of scope. Some material, on structural markup, may be relevant to online argumentation.)) because the abstract seemed bizarre, but the topic seemed interesting. “Continual documents” struck me as “continual functions”. And the mention of XSLT hinted at transforming a document using its underlying structure.

Surely, I thought, this abstract couldn’t describe its contents. After glancing through it, I’m not sure: This abstract may well summarize the contents of the article. But for me, the abstract really didn’t serve as a summary: I don’t know the field, so the terminology (e.g. document engineering, Document Description Framework ((One interesting line stands out: “In DDF documents most program elements are <xslt:template/> trees.”))) didn’t clue me in.

This difference gets at what AcaWiki is trying to do: provide a place for people to discuss/summarize research articles, in the way that Wikipedia is a place to discuss/summarize topics. Neither is a place for research but both are places for experts to share knowledge, for would-be-experts to describe what they know, and for non-experts to glean a deeper sense of the world than they might have had otherwise.

Tags: acawiki
Posted in argumentative discussions, PhD diary, random thoughts | Comments (1)


jodischneider.com/blog

reading, technology, stray thoughts
