Wednesday was full of deep and useful talks, but I have to start at the beginning, so I had to wait for a copy of Stefan Decker’s slides.
Hidden in the orientation to DERI, there are a few slides (12-19) which will be new to DERIans. They’re based on an argument Stefan made to the database community recently: any data format enabling the data Web is “more or less” isomorphic to RDF.
The argument goes:
The three enablers for the (document) Web were:
- no censorship
- a positive feedback loop (exploiting Metcalfe’s Law) [1].
Take these as requirements for the data Web. Enabling Metcalfe’s Law, according to Stefan, requires:
- Global Object Identity.
- Composability: The value of data can be increased if it can be combined with other data.
The bulk of his argument focuses on this composability feature. What sort of data format allows composability? According to Stefan, it must:
- Have no schema.
- Be self-describing.
- Be “object centric”. In order to integrate information about different entities, data must be related to these entities.
- Be graph-based, because object-centric data sources, when composed, result in a graph in the general case.
Stefan’s claim is that any data format that fulfills the requirements is “more or less” isomorphic to RDF.
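To make the composability argument a bit more concrete, here is a minimal sketch of my own (not from Stefan’s slides). It assumes two hypothetical object-centric sources keyed by global identifiers; the `ex:` identifiers, property names, and helper functions are all invented for illustration. Each source is flattened into (subject, predicate, object) triples, and composition is nothing more than set union.

```python
# Two hypothetical object-centric sources. Keys are global identifiers;
# every name here (ex:alice, ex:worksAt, ...) is made up for illustration.
people = {"ex:alice": {"ex:name": "Alice", "ex:worksAt": "ex:acme"}}
orgs = {"ex:acme": {"ex:name": "Acme", "ex:basedIn": "ex:galway"}}


def to_triples(source):
    """Flatten an object-centric source into (subject, predicate, object) triples."""
    return {(s, p, o) for s, props in source.items() for p, o in props.items()}


def compose(*sources):
    """Compose sources by set union; no shared schema is consulted."""
    merged = set()
    for source in sources:
        merged |= to_triples(source)
    return merged


def objects(graph, subject, predicate):
    """Return all objects reachable from `subject` via `predicate`."""
    return {o for s, p, o in graph if s == subject and p == predicate}


graph = compose(people, orgs)

# A question neither source can answer alone: where is Alice's employer based?
employers = objects(graph, "ex:alice", "ex:worksAt")
cities = {c for e in employers for c in objects(graph, e, "ex:basedIn")}
print(cities)
```

The two sources never agreed on a schema. They only share the identifier `ex:acme`, and that is enough for the merged triples to form a path from Alice to a city.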
Several parts of this argument confuse me. First, it’s not clear to me that a positive feedback loop is the same as exploiting Metcalfe’s Law. Second, can’t information be composed even when it is not object-centric? (Is it obvious that entities are required, in general?) Third, I vaguely understand that composing object-centric data sources results in a (possibly disjoint) graph: but are graphs the only/best way to think about this? Further, how can I convince myself of this (presumably obvious) fact about data integration?
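One way I can at least poke at the “possibly disjoint” part is a small sketch like the one below. It is my own illustration, not part of Stefan’s argument. It assumes data already expressed as (subject, predicate, object) triples, and it uses a hypothetical convention that values starting with `ex:` are identifiers while everything else is a literal. It then counts the connected components of the composed data.

```python
def components(triples, is_id=lambda v: v.startswith("ex:")):
    """Group identifiers into connected components, ignoring literal values.

    `is_id` is an assumed convention for telling identifiers from literals.
    """
    adjacency = {}
    for s, _p, o in triples:
        adjacency.setdefault(s, set())
        if is_id(o):
            adjacency[s].add(o)
            adjacency.setdefault(o, set()).add(s)

    seen, result = set(), []
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        result.append(component)
    return result


# Three hypothetical sources; the first two share ex:acme, the third shares nothing.
source_a = {("ex:alice", "ex:worksAt", "ex:acme")}
source_b = {("ex:acme", "ex:basedIn", "ex:galway")}
source_c = {("ex:bob", "ex:name", "Bob")}

parts = components(source_a | source_b | source_c)
print(sorted(len(part) for part in parts))
```

Sources that share an identifier collapse into one component, and a source that shares none stays on its own island. That shows the result is a graph, though it does not settle whether a graph is the best way to think about it.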
[1] Metcalfe’s Law: the value of a communication network is proportional to the number of possible connections between nodes, which grows as n^2 for n nodes.