Evidence Informatics

January 20th, 2015

I sent off my revised abstract to ECA Lisbon 2015, the European Conference on Argumentation. Evidence informatics, in 75 words:

Reasoning and decision-making are common throughout human activity. Increasingly, human reasoning is mediated by information technology, either to support collective action at a distance, or to support individual decision-making and sense-making.

We will describe the nascent field of “evidence informatics”, which considers how to structure reasoning and evidence. Comparing and contrasting evidence support tools in different disciplines will help determine reusable underlying principles, shared between fields such as legal informatics, evidence-based policy, and cognitive ergonomics.

Genre defined, a quote from John Swales

October 21st, 2014

A genre comprises a class of communicative events, the members of which share some set of communicative purposes. These purposes are recognized by the expert members of the parent discourse community and thereby constitute the rationale for the genre. This rationale shapes the schematic structure of the discourse and influences and constrains choice of content and style. Communicative purpose is both a privileged criterion and one that operates to keep the scope of a genre as here conceived narrowly focused on comparable rhetorical action. In addition to purpose, exemplars of a genre exhibit various patterns of similarity in terms of structure, style, content and intended audience. If all high probability expectations are realized, the exemplar will be viewed as prototypical by the parent discourse community. The genre names inherited and produced by discourse communities and imported by others constitute valuable ethnographic communication, but typically need further validation.1

  1. Genre defined, from John M. Swales, page 58, Chapter 3 “The concept of genre” in Genre Analysis: English in academic and research settings. Cambridge University Press 1990. Reprinted with other selections in The Discourse Studies Reader: Main currents in theory and analysis (see pages 305-316).

Rating the evidence, citation by citation?

September 4th, 2014

Publishers from HighWire Press are experimenting with a plugin called SocialCite. This is intended to rate the evidence, citation by citation. Like this:

[Image: SocialCite at PNAS, HighWire Press]

So far a few publishers (including PNAS) have implemented it as a pilot. The Journal of Bone and Joint Surgery is apparently leading this effort; I’d be really interested in speaking with them further.

Find out more about SocialCite from their website or the slide deck from their debut at the HighWire Press meeting.

I’m *very* curious to hear what people think of this; it really surprised me.

Ph.D. viva – public talk

October 1st, 2013

Here are the slides from the public part of my Ph.D. viva (thesis defense), on “Enabling reuse of arguments and opinions in open collaboration systems”. There is also a downloadable PDF version of the slides. Or see the thesis/dissertation itself and its data (index page) (note added 2016-04-22).

Video to follow: thanks to Hugo Hromic for streaming & recording that!

Title: “Enabling reuse of arguments and opinions in open collaboration systems”

Abstract: The World Wide Web enables large-scale collaboration, even between groups of individuals previously unknown to one another. These collaborations produce tangible outputs, such as encyclopedias (Wikipedia), electronic books (Distributed Proofreaders), maps (OpenStreetMap) and open source software packages (Firefox). In such open collaboration systems, decisions are made through open online discussions in which anyone can participate, and those decisions are based on the written arguments and opinions that individuals contribute, sometimes in large volumes.

Sense-making and coordination is an important component of collaboration, but it is particularly challenging when individuals disagree. When large volumes of opinions and arguments are expressed, popular or emotive choices can be identified through coarse approaches such as sampling, sentiment, or voting. But these do not identify the reasons for disagreement, which may be needed in order to reach decisions. For example, about 500 discussions each week in Wikipedia concern whether a particular topic should be covered in the encyclopedia. Discussions may involve comments from 2-200 people, and some topics are contentious.

This thesis addresses the problem of analyzing, integrating, and reconciling arguments and opinions in goal-oriented online discussions. We emphasize the structure of arguments by providing a new, reconfigurable Web interface. Our interface improves the perceived usefulness, perceived ease of use, and information completeness, thus providing meaningful support for the discussion.

The thesis addresses the following three research questions:
– What are the opportunities and requirements for providing argumentation support?
– Which arguments are used in open collaboration systems?
– How can we structure and display opinions and arguments to support their use and reuse?

Ph.D. defense, Tuesday October 1st

September 30th, 2013


Discourse community

August 6th, 2013

“Discourse community” seems to be a good summary for a concept that I need. But I’m not happy with how to define it. One summary for my purposes might be: “a group with shared goals, a mechanism for communication, certain patterns of discussion, and enough members who have relevant expertise in the topic and how to argue about it”. That loses some of the richness but also manages the complexity of the full definition.

According to Swales,1 a discourse community:

  • has a broadly agreed set of common public goals.
  • has mechanisms of intercommunication among its members.
  • uses its participatory mechanisms primarily to provide information and feedback.
  • utilizes and hence possesses one or more genres in the communicative furtherance of its aims.
  • in addition to owning genres, has acquired some specific lexis.
  • has a threshold level of members with a suitable degree of relevant content and discoursal expertise.

For a more recent summary, read Borg2 (who however seems to advocate the term “Community of Practice”, which seems to me far less well-defined, perhaps since I’ve not gone looking for a definition).

In his book, Swales presents a really interesting case study of a discourse community, a stamp collection society, and explains the rhetorical (genre-fit) mistakes he made in his first forays into it.

I’d expect this definition to be more famous than it is. For common use, it has some flaws: technical terms such as ‘genres’ and ‘lexis’ should be described, as should ‘discoursal expertise’ and perhaps ‘participatory mechanisms’.

  1. Swales, John. Genre analysis: English in academic and research settings. Cambridge University Press, 1990.
  2. Borg, Erik. “Discourse community.” ELT Journal 57.4 (2003): 398-400.

Defining open collaboration systems

July 27th, 2013

“Open collaboration systems” has lately become part of the (working) title for my thesis. I had tried talking about “purposeful online conversations” when scoping my work. I had in mind online conversations where people argued in order to find common ground and take action. By contrast, I explained, while people argue in many online venues, characterizing those arguments is challenging: what shall we say (in general) about the arguments on Twitter, or in comments to blog posts? But the phrase “purposeful online conversations” seemed to mean little to anyone but me.

I latched onto a new definition to use in my thesis: “open collaboration systems”. In an open collaboration system, “people form ties with others and create things together.”1,2

Forte and Lampe define a “prototypical open collaboration system” as

an online environment that

  1. supports the collective production of an artifact
  2. through a technologically mediated collaboration platform
  3. that presents a low barrier to entry and exit, and
  4. supports the emergence of persistent but malleable social structures.3

I’m chagrined to say that it hadn’t occurred to me to quote the definition and then slightly redefine it. That is, until today, when I chanced upon Andrew West’s thesis, “Damage detection and mitigation in open collaboration applications”4, about his large body of work on vandalism in Wikipedia and Stiki, the robust tool for vandalism reversion that he developed. It is very interesting since, as the title suggests, he creates a variant definition, Open Collaboration Applications (OCAs), in which he liberally applies the “low barrier to entry and exit” criterion to exclude moderation (for instance GitHub, which requires “proactive moderation” from repository owners, is excluded in his definition). He also stresses collective production more than most. But most interestingly to me, West very explicitly excludes voting-oriented collaborative filtering, based on the independence of the action taken by each voter.5

  1. Forte, Andrea, and Cliff Lampe. “Defining, Understanding, and Supporting Open Collaboration: Lessons From the Literature.” American Behavioral Scientist 57.5 (2013): 535-547. doi:10.1177/0002764212469362
  2. The article is their introduction to a special issue of American Behavioral Scientist, ABS 57(5), published May 2013. As a side note, the definition seems to have arisen out of need; I’m grateful. The original CFP for the issue explained what they were looking for more generally: “By open collaboration we mean the development of novel social structures supported by technologies including wikis and other content management systems that allow people to share and build content.”
  3. Forte, Andrea, and Cliff Lampe. “Defining, Understanding, and Supporting Open Collaboration: Lessons From the Literature.” American Behavioral Scientist 57.5 (2013): 535-547. doi:10.1177/0002764212469362
  4. West, Andrew. “Damage detection and mitigation in open collaboration applications.” Ph.D. thesis, University of Pennsylvania, May 2013.
  5. To clarify his modified definition of Open Collaboration Applications (OCAs), West says (in part):

    We proceed by discussing familiar examples that are not OCAs. Append only and monotonically growing content/discussion repositories fail to qualify because they are not collectively produced at any granularity. This includes applications like YouTube, Flickr, forums, and blog/article comments regardless of the fact their content is user generated (these are aggregated independent artifacts). Collaborative filtering applications like Reddit, Digg, and Slashdot are also insufficient. Therein, community voting determines the acceptance and/or prominence of individual content items (“posts”) towards composing a public facing artifact. These fail in two dimensions: (1) Voting is an append only action, and (2) supposing participants could fully “edit” the ordering, this presentation is nonetheless a meta-artifact of independent posts – failing the atomicity constraint.


QOTD: Heinlein’s truth-telling language, Speedtalk

July 14th, 2013

Inventing languages is a pastime of both philosophers and science fiction storytellers. It spotlights the relationships between language and thought, and between language and culture.1

Yesterday I ran across Heinlein’s truth-telling language, Speedtalk. A few lines were really striking:
“In the syntax of Speedtalk the paradox of the Spanish Barber could not even be expressed, save as a self-evident error.”
“The advantage for achieving truth, or something more nearly like truth, was similar to the advantage of keeping account books in Arabic numerals rather than Roman.”

Here’s a longer quote:

But Speedtalk was not “shorthand” Basic English. “Normal” languages, having their roots in days of superstition and ignorance, have in them inherently and unescapably wrong structures of mistaken ideas about the universe. One can think logically in English only by extreme effort, so bad it is as a mental tool. For example, the verb “to be” in English has twenty-one distinct meanings, every single one of which is false-to-fact.

A symbolic structure, invented instead of accepted without question, can be made similar in structure to the real-world to which it refers. The structure of Speedtalk did not contain the hidden errors of English; it was structured as much like the real world as the New Men could make it. For example, it did not contain the unreal distinction between nouns and verbs found in most other languages. The world—the continuum known to science and including all human activity—does not contain “noun things” and “verb things”; it contains space-time events and relationships between them. The advantage for achieving truth, or something more nearly like truth, was similar to the advantage of keeping account books in Arabic numerals rather than Roman.
All other languages made scientific, multi-valued logic almost impossible to achieve; in Speedtalk it was as difficult not to be logical. Compare the pellucid Boolean logic with the obscurities of the Aristotelean logic it supplanted.

Paradoxes are verbal, do not exist in the real world—and Speedtalk did not have such built into it. Who shaves the Spanish Barber? Answer: follow him around and see. In the syntax of Speedtalk the paradox of the Spanish Barber could not even be expressed, save as a self-evident error.

Gulf, as printed in Assignment in Eternity – Robert A. Heinlein – Baen edition

This seemed to me to echo Leibniz’s symbolic language in its “truth-telling” aspects, perhaps since I wrote a few months ago about Leibniz!2
Leibniz was perhaps the first philosopher to write about a special language for expressing truth or making arguments evident.3,4

  1. Apparently Wikipedia keeps a list of constructed languages and has nearby discussion on the purpose of some of these.
  2. For thesis Chapter 1, forthcoming; thanks to some comments from Adam Wyner. Ironically, my BA thesis was on Leibniz’s monads, but if I’d ever read the “Let us calculate” lines, I certainly didn’t have them in mind when thinking of argumentation!
  3. For more, see Roger Bishop Jones on Leibniz and the Automation of Reason.
  4. For references to the original, trace a discussion on the listserv historia-matematica, started by Robert Tragesser 1999-05-23, [HM] Leibniz’s “let us calculate”?, with responses over several months. Michael Detlefsen gives references to several of Leibniz’s writings, and a followup question about which quote is most widely known (1999-07-17, started by “L. M. Picard” with the subject [HM] Leibniz’s “let us calculate”) yields a very useful response from Siegmund Probst, quoting several variants with detailed references.

QOTD: Hybrid forums

May 26th, 2013

Interesting term I came across today: hybrid forum, via a tweet by Fabien Gandon.

“Hybrid forums”, according to Michel Callon and colleagues, are:

forums because they are open spaces where groups can come together to discuss technical options involving the collective, hybrid because the groups involved and the spokespersons claiming to represent them are heterogeneous, including experts, politicians, technicians, and laypersons who consider themselves involved. They are also hybrid because the questions and problems taken up are addressed at different levels in a variety of domains, from ethics to economics and including physiology, nuclear physics, and electromagnetism.

– Michel Callon, Pierre Lascoumes, and Yannick Barthe, “Hybrid Forums”, Chapter 1 in Acting in an Uncertain World: An Essay on Technical Democracy, translated by Graham Burchell. MIT Press, 2009. First published by Editions du Seuil in France as Agir dans un monde incertain: Essai sur la democratie technique.

In their heterogeneity, there is a relation to the “wicked problem”1, where “Stakeholders have radically different world views and different frames for understanding the problem.”2

In their openness and heterogeneity, there is also a relation to the open (peer) production community (around which I am currently framing my dissertation work).

  1. A starting motivation for much work in human argumentation.
  2. Wikipedia, Wicked Problem, Background and context section.

Four types of evidence

May 17th, 2013

A great image, “Four types of evidence”, appears in a recent paper on probabilistic argumentation schemes1. The delineation of 4 types of evidence2 serves the larger goal of the paper, which is to describe how to combine evidence of different types.

Four Types of Evidence, from Tang et al. ArgMAS2013

The four types of evidence depicted are:

  1. Consonant Evidence – the sets can be arranged in a nested series, each wholly contained in the next
  2. Consistent Evidence – all sets share at least one common element (their intersection is nonempty)
  3. Disjoint Evidence – no two sets overlap (the sets are pairwise disjoint)
  4. Arbitrary Evidence – none of the three preceding situations holds (i.e. there is no consensus but some agreement)
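
The four categories above can be checked mechanically if each piece of evidence is treated as a plain set of possibilities. What follows is only a rough sketch in Python under that assumption. It is not taken from Tang et al. or from the Sandia technical report; the function name classify_evidence and the example sets are hypothetical. Because nested non-empty sets always share an element, the consonant test has to run before the consistent test.

```python
from itertools import combinations


def classify_evidence(sets):
    """Label a collection of non-empty sets with one of the four evidence types.

    Hypothetical helper for illustration only; it treats each piece of
    evidence as a plain set of possibilities.
    """
    sets = [frozenset(s) for s in sets]
    if len(sets) < 2 or any(not s for s in sets):
        raise ValueError("need at least two non-empty sets")

    # Consonant: ordered by size, every set is contained in the next one.
    by_size = sorted(sets, key=len)
    if all(a <= b for a, b in zip(by_size, by_size[1:])):
        return "consonant"

    # Consistent: at least one element is common to all sets.
    if frozenset.intersection(*sets):
        return "consistent"

    # Disjoint: no pair of sets overlaps.
    if all(a.isdisjoint(b) for a, b in combinations(sets, 2)):
        return "disjoint"

    # Arbitrary: some sets overlap, but nothing is shared by all of them.
    return "arbitrary"


examples = {
    "nested": [{1}, {1, 2}, {1, 2, 3}],
    "shared element": [{1, 2}, {1, 3}, {1, 4}],
    "no overlap": [{1}, {2}, {3}],
    "partial overlap": [{1, 2}, {2, 3}, {3, 4}],
}
for name, evidence in examples.items():
    print(name, "->", classify_evidence(evidence))
```

Returning the most specific label that applies is a design choice of this sketch; the paper's figure presents the four types side by side rather than as a decision procedure.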

Evidence classification could possibly be thought of in conjunction with argument classification; for the latter, see my earlier musings Towards a Catalog of Argumentation Patterns.

  1. “Dempster-Shafer Argument Schemes” by Yuqing Tang, Nir Oren, Simon Parsons, and Katia Sycara (2013) in Proceedings of ArgMAS 2013.
  2. These, the authors mention, were drawn from an earlier technical report: K. Sentz and S. Ferson. Combination of evidence in Dempster-Shafer theory. Technical Report SAND 2002-0835, Sandia National Laboratories, 2002. See especially pages 10-13. The context in that technical report is sensor fusion using Dempster-Shafer theory, which, as I have since learned, is a common approach to combining evidence.

