Archive for the ‘social semantic web’ Category

The Social Semantic Web – a message for scholarly publishers

November 15th, 2010

I always appreciate how Geoffrey Bilder can manage to talk about the Social Semantic Web and early modern print in (nearly) the same breath. See for yourself in the presentation he gave to scholarly publishers at the International Society of Managing and Technical Editors last month.

Geoff’s presentation is outlined, to a large extent, in an interview he gave 18 months ago (search “key messages” to find the good bits). I hope to blog further about these, because Geoff has so many good things to say, which deserve unpacking!

I especially love the timeline from slide 159, which shows that we’re just past the incunabula age of the Internet:

The Early Modern Internet

We're still in the Early Modern era of the Internet. Compare to the history of print.

Posted in future of publishing, information ecosystem, PhD diary, scholarly communication, semantic web, social semantic web, social web | Comments (3)

Utopia Documents: pulling scientific data into the PDF for interactive exploration

November 14th, 2010

What if data were accessible within the document itself?

Utopia Documents is a free PDF viewer which recognizes certain enhanced figures, and fetches the underlying data. This allows readers to view and experiment with the tables, graphs, molecular structures, and sequences in situ.


You can download Utopia Documents for Mac and Windows to view enhanced papers, such as those published in The Semantic Biochemical Journal.

These screencasts were made from pages 9 and 10 of the PDF of a paper by the Manchester-based Utopia team: T. K. Attwood, D. B. Kell, P. McDermott, J. Marsh, S. R. Pettifer, and D. Thorne. Calling international rescue: knowledge lost in literature and data landslide! Biochemical Journal, Dec 2009. doi:10.1042/BJ20091474.

In an interview at the Guardian, Utopia’s Phillip McDermott says:

“Utopia Documents links scientific research papers to the data and to the community. It enables publishers to enhance their publications with additional material, interactive graphs and models. It allows the reader to access a wealth of data resources directly from the paper they are viewing, make private notes and start public conversations. It does all this on normal PDFs, and never alters the original file. We are targeting the PDF, since it still has around 80% readership over online viewing.

“Semantics, loose-coupling, fingerprinting and linked-data are the key ingredients. All the data is described using ontologies, and a plug-in system allows third parties to integrate their database or tool within a few lines of script. We use fingerprinting to allow us to recognise what paper a user is reading, and to spot duplicates. All annotations are held remotely, so that wherever you view a paper, the result is the same.”

I’d still like to see a demo of the commenting functionality.

I’d also be particularly interested in the publisher perspective, about the production work that goes into creating the enhancements. Portland Press’s October news announces that they’ve been promoting Utopia at the Charleston conference and SSP, with an upcoming appearance at the STM Innovations Seminar.

Utopia came to my attention via Steve Pettifer’s mention.

Posted in future of publishing, information ecosystem, library and information science, scholarly communication, semantic web, social semantic web | Comments (4)

A Model-View-Controller perspective of scholarly articles

November 13th, 2010

A scholarly paper is not a PDF. A PDF is merely one view of a scholarly paper. To push ‘beyond the PDF’, we need design patterns that allow us to segregate the user interface of the paper (whether it is displayed as an aggregation of triples, a list of assertions, a PDF, an ePub, HTML, …) from the thing itself.

Towards this end, Steve Pettifer has a Model-View-Controller perspective on scholarly articles, which he shared in a post on the Beyond the PDF listserv, where discussions are leading up to a workshop in January. I am awe-struck: I wish I’d thought of this way of separating the structure and explaining it.

I think a lot of the disagreement about the role of the PDF can be put down to trying to overload its function: to try to imbue it with the qualities of both ‘model’ and ‘view’. … One of the things that software architects (and I suspect designers in general) have learned over the years is that if you try to give something functions that it shouldn’t have, you end up with a mess; if you can separate out the concerns, you get a much more elegant and robust solution.

My personal take on this is that we should keep these things very separate, and that if we do this, then many of the problems we’ve been discussing become more clearly defined (and I hope, many of the apparent contradictions, resolved).

So… a PDF (or come to that, an e-book version or a html page) is merely a *view* of an article. The article itself (the ‘model’) is a completely different (and perhaps more abstract) thing. Views can be tailored for a particular purpose, whether that’s for machine processing, human reading, human browsing, etc etc.

[paragraph break inserted]

The relationship between the views and their underlying model is managed by the concept of a ‘controller’. For example, if we represent an article’s model in XML or RDF (its text, illustrations, associated nanopublications, annotations and whatever else we like), then that model can be transformed into any number of views. In the case of converting XML into human-readable XHTML, there are many stable and mature technologies (XSLT etc). In the case of doing the same with PDF, the traditional controller is something that generates PDFs.

[paragraph break inserted]

The thing that’s been (somewhat) lacking so far is the two-way communication between view and model (via controller) that’s necessary to prevent the views from ossifying and becoming out of date (i.e. there’s no easy way to see that comments have been added to the HTML version of an article’s view if you happen to be reading the PDF version, so the view here can rapidly diverge from its underlying model).

[paragraph break inserted, link added]

Our Utopia software is an attempt to provide this two-way controller for PDFs. I believe that once you have this bidirectional relationship between view and model, then the actual detailed affordances of the individual views (i.e. what can a PDF do well / badly, what can HTML do well / badly) become less important. They are all merely means of channeling the content of an article to its destination (whether that’s human or machine).

The good thing about having this ‘model view controller’ take on the problem is that only the model needs to be pinned down completely …

Perhaps separating out our concerns in this way — that is, treating the PDF as one possible representation of an article — might help focus our criticisms of the current state of affairs? I fear at the moment we are conflating the issues to some degree.

– Steve Pettifer in a Beyond the PDF listserv post
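To make the model/view separation concrete, here is a minimal sketch in Python (my own illustration, not Utopia’s actual code): one abstract article “model” with no presentation baked in, and two “view” functions that render it differently. The article data is invented for the example.

```python
# One abstract article "model": structured content, no presentation.
article = {
    "title": "Calling international rescue",
    "sections": [("Introduction", "Knowledge is being lost in the literature.")],
}

def html_view(model):
    """Render the model as a (very rough) HTML view."""
    parts = [f"<h1>{model['title']}</h1>"]
    for heading, body in model["sections"]:
        parts.append(f"<h2>{heading}</h2><p>{body}</p>")
    return "".join(parts)

def text_view(model):
    """Render the same model as a plain-text view."""
    lines = [model["title"].upper()]
    for heading, body in model["sections"]:
        lines.append(f"{heading}: {body}")
    return "\n".join(lines)

# Both views derive from the same model; a change to the model
# propagates to every view the next time it is rendered.
print(html_view(article))
print(text_view(article))
```

The point of the pattern is exactly what Pettifer says: only the model (the dict) has to be pinned down; views can multiply freely.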

I’m particularly interested in hearing if this perspective, using the MVC model, makes sense to others.

Posted in books and reading, future of publishing, information ecosystem, library and information science, scholarly communication, social semantic web | Comments (9)

Enabling a Social Semantic Web for Argumentation (defining my Ph.D. research problem)

July 23rd, 2010

I’m working on online argumentation: Making it easier to have discussions, get to consensus, and understand disagreements across websites.

Here are the 3 key questions and the most closely related work that I’ve identified in the first 9 months of my Ph.D.

Read on, if you want to know more. Then let me know what you think! Suggestions will be especially helpful since I’m writing my first year Ph.D. report, which will set the direction for my second year at DERI.


Enabling a Social Semantic Web for Argumentation

Argumentative discussions occur informally throughout the Web; however, there is currently no way of bringing together all of the discussions on a given topic along with an indication of who is agreeing and who is disagreeing. Thus substantial human analysis is required to integrate opinions and expertise to, for instance, determine the best policies and procedures to mitigate global warming, or the recommended treatment for a given disease. New techniques for gathering and organising the Social Web using ontologies such as FOAF and SIOC show promise for creating a Social Semantic Web for argumentation.

I am currently investigating three main research questions to establish the Social Semantic Web for argumentation:

  1. How can we best define argumentation for the Social Semantic Web, to isolate the essential problems? We wish to enable reasoning with inconsistent knowledge, to integrate disparate knowledge, and to identify consensus and disputes. Similar questions and techniques come up in related but distinct areas, such as sentiment analysis, dialogue mapping, dispute resolution, question-answering and e-government participation.
  2. What sort of modular framework for argumentation can support distributed, emergent argumentation — a World Wide Argumentation Web? Some Web 2.0 tools, such as Debatepedia, LivingVote, and Debategraph, provide integrated environments for explicit argumentation. But our goal is for individuals to be able to use their own preferred tools — in a social environment — while understanding what else is being discussed.
  3. How can we manage the tension between informality and ease of expression on the one hand and formal semantics and retrievability/reusability on the other hand? Minimal integration of informal arguments requires two pieces of information: a statement of the issue or proposition, and an indication of polarity (agreement or disagreement). How can we gather this information without adding cognitive overhead for users?
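The “minimal integration” idea in question 3 can be sketched in a few lines of Python. This is a hypothetical illustration of my own (the names and data are invented, not a published ontology): each contribution records only an issue statement and a polarity, and even that is enough to aggregate agreement across sources.

```python
from collections import Counter

# Each contribution: (source, statement of the issue, polarity).
# This is the minimal two-piece representation discussed above.
opinions = [
    ("alice", "A carbon tax reduces emissions", "agree"),
    ("bob",   "A carbon tax reduces emissions", "disagree"),
    ("carol", "A carbon tax reduces emissions", "agree"),
]

def tally(opinions, issue):
    """Count agreement and disagreement on one issue across sources."""
    counts = Counter(pol for _, stmt, pol in opinions if stmt == issue)
    return counts["agree"], counts["disagree"]

print(tally(opinions, "A carbon tax reduces emissions"))  # → (2, 1)
```

The open research question is not the aggregation (trivial, as above) but how to capture the issue statement and polarity from informal discussion without adding cognitive overhead.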

Related Work

Ennals et al. ask: ‘What is disputed on the Web?’ (Ennals 2010b). They use annotation and NLP techniques to develop a prototype system for highlighting disputed claims in Web documents (Ennals 2010a). Cabanac et al. find that two algorithms for identifying the level of controversy about an issue were up to 84% accurate (compared to human perception), on a corpus of 13 arguments. These are useful prototypes of what could be done; Ennals’s prototype is indeed a Web-scale system, but disputed claims are not arguments.

Rahwan et al. (2007) present a pilot Semantic Web-based system, ArgDF, in which users can create arguments and query to find networks of arguments. ArgDF is backed by the AIF-RDF ontology, and uses Semantic Web standards. Rahwan (2008) surveys current Web 2.0 tools, pointing out that integration between these tools is lacking, and that only very shallow argument structures are supported; ArgDF and AIF-RDF are explained as an improvement. What is lacking is uptake in end-user oriented (e.g. Web 2.0) tools.

The Web 2.0 aspect of the problem is explored in several papers, including Buckingham Shum (2008), which presents Cohere, a Web 2.0-style argumentation system supporting existing (non-Semantic Web) argumentation standards, and Groza et al. (2009), which proposes an abstract framework for modeling argumentation. These are either minimally implemented frameworks or stand-alone systems which do not yet support the distributed, emergent argumentation envisioned, as further elucidated by Buckingham Shum (2010).

References with links to preprints

  1. S. Buckingham Shum, “Cohere: Towards Web 2.0 Argumentation,” Computational Models of Argument – Proceedings of COMMA 2008, IOS Press, 2008.
  2. S. Buckingham Shum, AIF Use Case: Iraq Debate, Glenshee, Scotland, UK: 2010. http://projects.kmi.open.ac.uk/hyperdiscourse/docs/AIF-UseCase-v2.pdf
  3. G. Cabanac, M. Chevalier, C. Chrisment, and C. Julien, “Social validation of collective annotations: Definition and experiment,” Journal of the American Society for Information Science and Technology, vol. 61, 2010, pp. 271-287.
  4. R. Ennals, B. Trushkowsky, and J.M. Agosta, “Highlighting Disputed Claims on the Web,” WICOW 2010, Raleigh, North Carolina: 2010.
  5. R. Ennals, D. Byler, J.M. Agosta, and B. Rosario, “What is Disputed on the Web?,” WWW 2010, Raleigh, North Carolina: 2010.
  6. T. Groza, S. Handschuh, J.G. Breslin, and S. Decker, “An Abstract Framework for Modeling Argumentation in Virtual Communities,” International Journal of Virtual Communities and Social Networking, vol. 1, Sep. 2009, pp. 35-47. 
  7. I. Rahwan, “Mass argumentation and the semantic web,” Web Semantics: Science, Services and Agents on the World Wide Web, vol. 6, Feb. 2008, pp. 29-37.
  8. I. Rahwan, F. Zablith, and C. Reed, “Laying the foundations for a World Wide Argument Web,” Artificial Intelligence, vol. 171, Jul. 2007, pp. 897-921.

Posted in argumentative discussions, PhD diary, semantic web, social semantic web, social web | Comments (1)

Enhancing MediaWiki Talk pages with Semantics for Better Coordination – a proposal (SemWiki 2010 short paper)

May 31st, 2010

Today Alex is presenting our short paper (PDF) at SemWiki. We describe an extension to SIOC for MediaWiki Talk pages: the SIOC WikiTalk ontology, http://rdfs.org/sioc/wikitalk.

The general line of research is to provide tools for arguing, convincing others, and understanding the status of a debate, in social media, with Semantic Web technologies. This is work in progress and suggestions and other feedback are most welcome.
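As a flavour of what representing a Talk-page discussion in RDF looks like, here is a small stdlib-only sketch that serializes a reply relation as N-Triples. The core SIOC terms used (sioc:Post, sioc:reply_of) are real; the example URIs are invented, and I am deliberately not quoting specific SIOC WikiTalk class names here, so treat the data as illustrative only (in practice one would use a library such as rdflib).

```python
# Namespaces for the vocabularies used in the sketch.
SIOC = "http://rdfs.org/sioc/ns#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Two comments on a hypothetical Talk page, the second replying to the first.
triples = [
    ("http://example.org/wiki/Talk:Foo#c2", RDF + "type", SIOC + "Post"),
    ("http://example.org/wiki/Talk:Foo#c2", SIOC + "reply_of",
     "http://example.org/wiki/Talk:Foo#c1"),
]

def to_ntriples(triples):
    """Serialize (subject, predicate, object) URI triples as N-Triples."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)

print(to_ntriples(triples))
```

Once the reply structure is machine-readable like this, the debate status questions above (who replied to whom, where the disagreements cluster) become queryable.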

Besides the short paper (PDF), these slides are also downloadable (slides PDF).

Posted in PhD diary, semantic web, social semantic web | Comments (0)

New Beginnings

October 22nd, 2009

This week I’m beginning my Ph.D. in Galway, Ireland at DERI.

Things move very quickly here. Unlike a U.S. Ph.D. student, I start with a supervisor (Alexandre Passant), an academic mentor (John Breslin), and a ‘professor in discipline’ (not quite sure yet what that entails) (Stefan Decker). Before arriving, I also put in a thesis proposal, which Alex drafted and I merely tweaked:

My thesis will investigate the use of Semantic Web technologies to represent argumentative discussions in online communities—how people have discussions on blogs, wikis, etc. and how they agree, disagree, etc.—and to make these discussions machine-readable and interoperable.

Posted in argumentative discussions, PhD diary, social semantic web | Comments (4)