September 15, 2003

List Algebra

Via Ted Leung I came across Sébastien Paquet's posting about a feed algebra.

I think the key idea here is defining some basic operations -- set arithmetic in this case -- on lists of things. And those lists of things don't necessarily have to be RSS feeds. Choose other data sources and you can still get some interesting results. If you take a blogroll, for instance, you could find "blogs that Person X is reading that I'm not". Take a FOAF document and you find "people that Person Y 'knows' that I don't". Or things in common, etc. What's nice is that the operation is generic, but the meaning derives from its context (FOAF, blogroll, RSS, etc).
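A minimal sketch of that idea in Python, assuming each source has already been reduced to a plain set of URLs. The URLs and the function names (`not_in_mine`, `in_common`) are made up for illustration; the same two functions would apply unchanged to a set of FOAF "knows" URIs.

```python
# Hypothetical data: each blogroll reduced to a set of site URLs.
my_blogroll = {
    "http://example.org/alice",
    "http://example.org/bob",
}
person_x_blogroll = {
    "http://example.org/bob",
    "http://example.org/carol",
}


def not_in_mine(theirs, mine):
    """Items in their list that are missing from mine (set difference)."""
    return theirs - mine


def in_common(a, b):
    """Items that appear in both lists (set intersection)."""
    return a & b


# "Blogs that Person X is reading that I'm not"
print(sorted(not_in_mine(person_x_blogroll, my_blogroll)))
# "Things in common"
print(sorted(in_common(person_x_blogroll, my_blogroll)))
```

The functions know nothing about blogs; whether the result means "unread blogs" or "people I don't know" depends entirely on what was put into the sets.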

As Danny points out in a comment, RDF provides a very good foundation upon which to build such a facility. It has notions of lists (sequences, bags, etc) built-in -- although I gather these are somewhat out of favour -- but more importantly there are standardised vocabularies for the properties that you want to key the set operations off, e.g. dc:author, dc:created, dc:subject, etc.

I thought about building a similar application a while ago -- for a while I went through a phase of looking at other people's blogrolls and clicking links I'd never heard of -- and decided that there needed to be two other primitives. These aren't strictly derived from set theory but I think they're necessary to encompass as wide a dataset as possible:

  • Transform -- take a data source and convert it into a canonical format for further manipulation, e.g. converting between RSS formats
  • Annotate -- annotate a Set with additional information, e.g. ratings. (OK, maybe this is just a union. It is in RDF graph terms anyway, assuming the annotations relate to the same resource URIs)

In my head I envisage this as a pipeline arranged in a "wheel" formation: each data source goes through multiple transform/annotate steps (the spokes), converging at the center (the hub) where the actual set arithmetic is carried out.
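The wheel could be sketched roughly as follows. This is only an illustration under some loud assumptions: sources are plain lists of URLs, "transform" is reduced to a crude URL normalisation (lowercasing and dropping a trailing slash, which is not safe for real URLs with case-sensitive paths), and "annotate" just attaches a rating looked up by URL. The names `transform`, `annotate`, `spoke` and `hub` are hypothetical.

```python
from functools import reduce


def transform(entries):
    """Transform: convert raw URLs into one canonical form (simplified)."""
    return {url.strip().lower().rstrip("/") for url in entries}


def annotate(items, ratings):
    """Annotate: attach a rating (or None) to each item, keyed by URL."""
    return {url: ratings.get(url) for url in items}


def spoke(source, ratings):
    """One spoke of the wheel: transform a source, then annotate it."""
    return annotate(transform(source), ratings)


def hub(spokes, op):
    """The hub: apply one set operation across the keys of every spoke."""
    return reduce(op, [set(s) for s in spokes])


# Hypothetical sources and ratings.
mine = spoke(["http://Example.org/a/", "http://example.org/b"],
             {"http://example.org/a": 5})
theirs = spoke(["http://example.org/b/", "http://example.org/c"], {})

print(hub([theirs, mine], set.difference))    # in theirs but not in mine
print(hub([mine, theirs], set.intersection))  # in both
```

Note that the hub here only compares the keys; carrying the annotations through to the result would be a further step, and in RDF terms that merge would just be a graph union.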

Posted by ldodds at September 15, 2003 12:26 PM