Archive for the ‘semantic web’ Category

Enabling a Social Semantic Web for Argumentation (defining my Ph.D. research problem)

July 23rd, 2010

I’m working on online argumentation: Making it easier to have discussions, get to consensus, and understand disagreements across websites.

Here are the 3 key questions and the most closely related work that I’ve identified in the first 9 months of my Ph.D.

Read on if you want to know more. Then let me know what you think! Suggestions will be especially helpful since I’m writing my first-year Ph.D. report, which will set the direction for my second year at DERI.


Enabling a Social Semantic Web for Argumentation

Argumentative discussions occur informally throughout the Web; however, there is currently no way of bringing together all of the discussions on a given topic along with an indication of who is agreeing and who is disagreeing. Thus, substantial human analysis is required to integrate opinions and expertise to, for instance, determine the best policies and procedures to mitigate global warming, or the recommended treatment for a given disease. New techniques for gathering and organising the Social Web using ontologies such as FOAF and SIOC show promise for creating a Social Semantic Web for argumentation.

I am currently investigating three main research questions to establish the Social Semantic Web for argumentation:

  1. How can we best define argumentation for the Social Semantic Web, to isolate the essential problems? We wish to enable reasoning with inconsistent knowledge, to integrate disparate knowledge, and to identify consensus and disputes. Similar questions and techniques come up in related but distinct areas, such as sentiment analysis, dialogue mapping, dispute resolution, question-answering and e-government participation.
  2. What sort of modular framework for argumentation can support distributed, emergent argumentation — a World Wide Argumentation Web? Some Web 2.0 tools, such as Debatepedia, LivingVote, and Debategraph, provide integrated environments for explicit argumentation. But our goal is for individuals to be able to use their own preferred tools — in a social environment — while understanding what else is being discussed.
  3. How can we manage the tension between informality and ease of expression on the one hand and formal semantics and retrievability/reusability on the other hand? Minimal integration of informal arguments requires two pieces of information: a statement of the issue or proposition, and an indication of polarity (agreement or disagreement). How can we gather this information without adding cognitive overhead for users?
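
To make the third question concrete, here is a minimal sketch of the two pieces of information named above (a proposition and a polarity), plus a tally that flags each proposition as a consensus or a dispute. The `Stance` record, its `author` and `source` fields, and the function names are hypothetical and chosen only for illustration. Matching propositions by exact string is a deliberate simplification; working out when two informal statements are about the same issue is part of the research problem.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Stance:
    """One informal contribution to a discussion (hypothetical record)."""
    proposition: str  # statement of the issue or proposition
    agrees: bool      # polarity: True = agreement, False = disagreement
    author: str       # assumed extra field: who expressed the stance
    source: str       # assumed extra field: site where it was posted


def summarize(stances):
    """Count agreement and disagreement per proposition, across all sources."""
    tally = defaultdict(lambda: {"agree": 0, "disagree": 0})
    for stance in stances:
        key = "agree" if stance.agrees else "disagree"
        tally[stance.proposition][key] += 1
    return dict(tally)


def status(counts):
    """Label a proposition 'disputed' if both polarities occur, else 'consensus'.

    'consensus' here covers unanimous agreement as well as unanimous
    disagreement; a real system would want a finer-grained measure.
    """
    if counts["agree"] and counts["disagree"]:
        return "disputed"
    return "consensus"
```

Feeding in stances gathered from several sites gives one tally per proposition, which is the minimal "who agrees, who disagrees" view described in the questions above.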

Related Work

Ennals et al. ask: ‘What is disputed on the Web?’ (Ennals 2010b). They use annotation and NLP techniques to develop a prototype system for highlighting disputed claims in Web documents (Ennals 2010a). Cabanac et al. (2010) find that two algorithms for identifying the level of controversy about an issue were up to 84% accurate (compared to human perception) on a corpus of 13 arguments. These are useful prototypes of what could be done; Ennals et al.’s prototype is indeed a Web-scale system, but disputed claims are not arguments.

Rahwan et al. (2007) present a pilot Semantic Web-based system, ArgDF, in which users can create arguments and query to find networks of arguments. ArgDF is backed by the AIF-RDF ontology and uses Semantic Web standards. Rahwan (2008) surveys current Web 2.0 tools, pointing out that integration between these tools is lacking and that only very shallow argument structures are supported; ArgDF and AIF-RDF are explained as an improvement. What is lacking is uptake in end-user orientated (e.g. Web 2.0) tools.

The Web 2.0 aspect of the problem is explored in several papers, including Buckingham Shum (2008), which presents Cohere, a Web 2.0-style argumentation system supporting existing (non-Semantic Web) argumentation standards, and Groza et al. (2009), which proposes an abstract framework for modeling argumentation. These are either minimally implemented frameworks or stand-alone systems which do not yet support the distributed, emergent argumentation envisioned, as further elucidated by Buckingham Shum (2010).

References with links to preprints

  1. S. Buckingham Shum, “Cohere: Towards Web 2.0 Argumentation,” Computational Models of Argument – Proceedings of COMMA 2008, IOS Press, 2008.
  2. S. Buckingham Shum, AIF Use Case: Iraq Debate, Glenshee, Scotland, UK: 2010. http://projects.kmi.open.ac.uk/hyperdiscourse/docs/AIF-UseCase-v2.pdf
  3. G. Cabanac, M. Chevalier, C. Chrisment, and C. Julien, “Social validation of collective annotations: Definition and experiment,” Journal of the American Society for Information Science and Technology, vol. 61, 2010, pp. 271-287.
  4. R. Ennals, B. Trushkowsky, and J.M. Agosta, “Highlighting Disputed Claims on the Web,” WICOW 2010, Raleigh, North Carolina: 2010.
  5. R. Ennals, D. Byler, J.M. Agosta, and B. Rosario, “What is Disputed on the Web?,” WWW 2010, Raleigh, North Carolina: 2010.
  6. T. Groza, S. Handschuh, J.G. Breslin, and S. Decker, “An Abstract Framework for Modeling Argumentation in Virtual Communities,” International Journal of Virtual Communities and Social Networking, vol. 1, Sep. 2009, pp. 35-47. 
  7. I. Rahwan, “Mass argumentation and the semantic web,” Web Semantics: Science, Services and Agents on the World Wide Web, vol. 6, Feb. 2008, pp. 29-37.
  8. I. Rahwan, F. Zablith, and C. Reed, “Laying the foundations for a World Wide Argument Web,” Artificial Intelligence, vol. 171, Jul. 2007, pp. 897-921.

Posted in PhD diary, argumentative discussions, semantic web, social semantic web, social web | Comments (1)

DERI “Research Explained” video series

July 15th, 2010

Word has gotten out about DERI’s “Research Explained” video series, which I’m narrating. These videos explain DERI’s Semantic Web research to a broad audience, so far in three areas: mobile/social sensing, expert finding, and semantic search.

James Lyng, Julie Letierce, Brendan Smith, and Dr. Brian Wall produce these videos in collaboration with DERI scientists. Drawings are by Eoghan Hynes and James Lyng.

[Image: screenshot from “Semantic Search Explained” on YouTube]

Watch the series at DERI Galway’s YouTube channel.

My voiceover role came thanks to Julie’s instigation, since I had narrated a screencast for our colleague Peyman Nasirifard’s Conterprise project.

Posted in scholarly communication, semantic web | Comments (0)

Enhancing MediaWiki Talk pages with Semantics for Better Coordination – a proposal (SemWiki 2010 short paper)

May 31st, 2010

Today Alex is presenting our short paper (PDF) at SemWiki. We describe an extension to SIOC for MediaWiki Talk pages: the SIOC WikiTalk ontology, http://rdfs.org/sioc/wikitalk.

The general line of research is to provide tools for arguing, convincing others, and understanding the status of a debate, in social media, with Semantic Web technologies. This is work in progress and suggestions and other feedback are most welcome.

Besides the short paper (PDF), the slides are also available for download (slides PDF).

Posted in PhD diary, semantic web, social semantic web | Comments (0)

W3C Library Linked Data Incubator Group starting

May 25th, 2010

The W3C has announced an incubator activity around Library Linked Data. I’ll be one of DERI’s participants in the group.

Its mission? To help increase global interoperability of library data on the Web, and to bring together people from archives, museums, publishing, etc. to talk about metadata. See the charter for more details.

Interested in joining? If you’re at a W3C member organization, ask your Advisory Committee Representative to appoint you. Or, get appointed as an invited expert by contacting one of the chairs (Tom Baker, Emmanuelle Bermes, Antoine Isaac); their contact info is available from the participants’ list.

Or, you can follow along on the incubator group’s public mailing list. (The Sem lib mailing list was used for organizing the group.)

The first teleconference will be Thursday, 3 June at 1500 UTC.

Posted in PhD diary, library and information science, semantic web | Comments (0)

How metadata could pay for newspapers

February 13th, 2010

What if newspapers published not just stories but databases? Dan Conover’s vision for the future of newspapers is inspired in part by his first reporting job, for NATO:

When we spotted something interesting, we recorded it in a highly structured way that could be accurately and quickly communicated over a two-way radio, to be transcribed by specialists at our border camp and relayed to intelligence analysts in Brussels.

The story, says Conover, is only one aspect of reporting. The other part? Gathering structured metadata, which could be stored in a database or expressed as linked data.[1]

Newspapers already have classification systems and professional taxonomists. The New York Times’ classification system, in use since 1851, now aggregates stories from the archives in Times Topics, a website and API.[2]

What if, in addition to these classifications, each story had even more structured metadata?

Capturing metadata ranges from automatic to manual. Some automatic capture is already standard (timestamps) or could be (saving GPS coordinates from a photo), and some information that needs manual capture (like the number of alarms of a fire) is already reported.
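
As a small illustration of the automatic end of that range, here is a sketch of normalizing a reporter's local timestamp to UTC ("Zulu" time) so that it can be stored alongside a story. The function name `to_zulu` and the output format are assumptions made for this example, not part of any newsroom system.

```python
from datetime import datetime, timezone


def to_zulu(local_dt: datetime) -> str:
    """Convert a timezone-aware datetime to a UTC ("Zulu") timestamp string."""
    if local_dt.tzinfo is None:
        # A naive datetime has no offset, so converting it would be a guess.
        raise ValueError("expected a timezone-aware datetime")
    return local_dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

Rejecting naive datetimes is a design choice here: metadata captured silently in the background is only useful if its time zone is unambiguous.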

Dan compares the “old way” with his “new way”:

The old way:

Dan the reporter covers a house fire in 2005. He gives the street address, the date and time, who was victimized, who put it out, how extensive the fire was and what investigators think might have caused it. He files the story, sits with an editor as it’s reviewed, then goes home. Later, he takes a phone call from another editor. This editor wants to know the value of the property damaged in the fire, but nobody has done that estimate yet, so the editor adds a statement to that effect. The story is published and stored in an electronic archive, where it is searchable by keyword.

The new way:

Dan the reporter covers a house fire in 2010. In addition to a street address, he records a six-digit grid coordinate that isn’t intended for publication. His word-processing program captures the date and time he writes in his story and converts it to a Zulu time signature, which is also appended to the file.

As he records the names of the victimized and the departments involved in putting out the fire, he highlights each first reference for computer comparison. If the proper name he highlights has never been mentioned by the organization, Dan’s newswriting word processor prompts him to compare the subject to a list of near-matches and either associate the name with an existing digital file or approve the creation of a new one.

When Dan codes the story subject as “fire,” his word processor gives him a new series of fields to complete. How many alarms? Official cause? Forest fire (y/n)? Official damage estimate? Addresses of other properties damaged by the fire? And so on. Every answer he can’t provide is coded “Pending.”

Later, Dan sits with an editor as his story is reviewed, but a second editor decides not to call him at home because he sees the answer to the damage-estimate question in the file’s metadata. The story is published and archived electronically, along with extensive metadata that now exists in a relational database. New information (the name of victims, for instance) automatically generates new files, which are retained by the news organization’s database but not published.

And those information fields Dan coded as “Pending?” Dan and his editors will be prompted to provide that structured information later — and the prompting will continue until the data set is completed.

- Dan Conover in The “Lack of Vision” thing? Well, here’s a hopeful vision for you

And that data set? It might even be saleable, even though each individual story had perhaps been given away for free. Dan highlights some possibilities, and entire industries have grown around repackaging free and non-free data (e.g. U.S. Census data, phone book data). I think of mashups such as Everyblock and hyperlocal news sites like outside.in.
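
Two steps in Dan's "new way" are easy to sketch: filling subject-specific fields with “Pending” until someone answers them, and comparing a newly highlighted name against names already on file. The sketch below is only an illustration under stated assumptions. The field names in `FIRE_FIELDS`, the helper names, and the 0.8 similarity cutoff are all hypothetical, and the name comparison uses the standard library's `difflib`, which is a stand-in for whatever matching a real newsroom system would use.

```python
import difflib

PENDING = "Pending"

# Hypothetical follow-up fields for stories coded with the subject "fire".
FIRE_FIELDS = [
    "alarms",
    "official_cause",
    "forest_fire",
    "damage_estimate",
    "other_addresses",
]


def new_fire_record(**known):
    """Build a metadata record; every field not supplied is coded "Pending"."""
    unexpected = set(known) - set(FIRE_FIELDS)
    if unexpected:
        raise KeyError(f"unknown fields: {sorted(unexpected)}")
    return {field: known.get(field, PENDING) for field in FIRE_FIELDS}


def pending_fields(record):
    """List the fields the reporter and editors should still be prompted for."""
    return [field for field, value in record.items() if value == PENDING]


def near_matches(name, known_names, cutoff=0.8):
    """Return up to three existing names that closely resemble `name`.

    An empty list suggests the name is new and a fresh file could be created.
    """
    return difflib.get_close_matches(name, known_names, n=3, cutoff=cutoff)
```

Prompting then amounts to calling `pending_fields` on each stored record until it returns an empty list, which matches the idea that reminders continue until the data set is complete.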

  1. Some news organizations, like the New York Times (see Linked Open Data) and the BBC (overview, tech blog) are already embracing linked data.
  2. I delved into Times Topics’ taxonomy and vocabulary in an earlier post.

Posted in future of publishing, information ecosystem, semantic web | Comments (0)