July 28, 2003

Geo Services, Dashboard and Gazetteers

In Goodness gracious, great circles, Edd writes up the results of more cool Dashboard related hacking. So now you have another good reason for adding GeoURL tags to your 'blog. Edd is also looking for more geographic related services and data sets.

I came across one a while back whilst doing some related research (enriching search results by identifying place-names and adding links to more information).

This service is offered by the Alexandria Digital Library Project, and is known as the ADL Gazetteer Protocol:

...a lightweight, stateless, XML- and HTTP-based protocol for accessing gazetteers: dictionaries of geographic placenames. The protocol supports querying gazetteers (by placename, by geographic area, by feature class, and by inter-placename relationship); retrieving standard placename reports; downloading gazetteers in bulk; and managing gazetteers (by adding, removing, and relating placenames).

The hands-on introduction and test forms show the range of queries that can be made, including the use of latitude and longitude (e.g. find places within this area).
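
To make the "find places within this area" style of query a bit more concrete, here's a rough sketch in Python. It doesn't talk to the ADL service at all, and makes no attempt to guess at its request format; it just filters a tiny hand-made gazetteer by bounding box. The coordinates are approximate, and the function and field names are invented for the illustration.

```python
# A tiny in-memory gazetteer; coordinates are approximate and for illustration only.
GAZETTEER = [
    {"name": "Chippenham", "lat": 51.46, "lon": -2.12},
    {"name": "Bristol", "lat": 51.45, "lon": -2.59},
    {"name": "Santa Barbara", "lat": 34.42, "lon": -119.70},
]


def places_within(gazetteer, south, west, north, east):
    """Return names of places whose coordinates fall inside the bounding box.

    Assumes the box does not cross the 180th meridian.
    """
    return [
        place["name"]
        for place in gazetteer
        if south <= place["lat"] <= north and west <= place["lon"] <= east
    ]


if __name__ == "__main__":
    # A box roughly covering part of the west of England.
    print(places_within(GAZETTEER, south=51.0, west=-3.0, north=52.0, east=-2.0))
```

A real gazetteer would want a spatial index rather than a linear scan, but the shape of the query is the same.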

These seem to fit quite closely to the kind of use cases Edd is interested in. However I'm not sure of the licensing for this data and some initial experiments have shown it to be quite US centric; I could find Chippenham, but not Bristol for example. However the ability to add gazetteer entries seems a promising way to deal with that. As Edd notes, with open source "...it's easy to get input from lots of people with local knowledge of different parts of the world."

The web client to the Map and Imaging lab also looks promising as there are numerous search options and auto-generated maps that highlight a particular point. All of this appears to be URL addressable. For example, this is Chippenham.

Posted by ldodds at 11:35 AM | Feedback? | TrackBack

July 25, 2003

One Week to Go

Whilst winding down this sunny Friday evening (the red wine is a-breathin') I've made the mistake of visiting the Big Chill website to catch up with the latest news on the festival and line-up before next week.

Big mistake as I'm now all excited at the prospect of seeing a whole slew of cool bands.

The site is now sporting a list of the full line-up and detailed notes on each band. One of the things I love most about this festival is the wide range of music on offer.

And you can't ask for better surroundings either, they seem to have a knack of choosing excellent venues. This will be my third Big Chill: the first was at Lulworth castle -- the castle was the centre piece of the entire festival and was lit-up, to great effect, in the evening. The second was the "Enchanted Garden" at Larmer Tree Gardens. We spent most of our time at the Sanctuary stage which was an amazing setting, as you can see. What other festival has peacocks roaming wild through the crowds?

This year's festival -- which I believe will be slightly larger than previous ones -- is at Eastnor Castle. A mate of mine who went there last year reckons it's a cracking site. Can't wait.

Who am I looking forward to seeing? The Cinematic Orchestra, Quantic, Mr Scruff, Howie B, Lol Hammond, Another Fine Day and Bent (who sound like they're arranging something quite bizarre) immediately jump out of the line-up. I was pleased to see that John Peel is going to be there as well; another treat to look forward to. I'm also keen to see both Nitin Sawhney and Talvin Singh.

This year's festival is taking place right after a major project deadline. Things are cooling off a bit in work now, and I'm looking forward to kicking back and getting properly chilled out next week. I think I deserve it :)

Posted by ldodds at 08:16 PM | Feedback? | TrackBack

XML-DEV and eclectic

For a long time now eclectic has been lying fallow. I've just not been able to keep up to date with XML-DEV on a daily basis and keep it maintained the way I used to. A combination of factors is to blame, including being much busier at work (now managing a team) and at home (now a father, and soon to be so again).

However I think the main reason is that the conversations seem to be endlessly spiralling around several recurring themes ("permathreads"). This makes for very tedious reading as the trenches rarely shift very far in either direction. This has greatly reduced my tolerance for keeping up to date with the list. In the past I've tried to remain as impartial as possible, but once you've blogged about a topic for the nth time it starts to get tedious fast.

So I'm declaring eclectic to be dead. I'm no longer going to be a daily reader of XML-DEV but will stop by the archives occasionally and report on any interesting topics that I see.

Sadly, however, a brief look into this month's archives sees threads on XSLT vs CSS, Namespaces, and even one about Permathreads. So slim pickings for the moment.

It's a shame really as I've learnt a great deal from monitoring XML-DEV over the last few years. I think the community has matured to the point where the interesting stuff is now happening on the "fringes" -- separate dedicated mailing lists or other shared spaces. And that's where I intend to keep lurking for the moment.

Many thanks to Userland for providing a free hosting service for so long.

If anyone is interested in taking the content from eclectic, you can download eclectic.root (a Frontier database export of the site)

Posted by ldodds at 03:22 PM | Feedback? | TrackBack

FOAF and Privacy

Shelley Powers has been writing about FOAF this week, privacy issues in particular. Ben Hammersley has been having similar thoughts. This is all good. As I understand it one of the aims of FOAF and related vocabularies (FOAF has no 'built-in' notion of trust) is to help explore these kinds of issues.

Control over what other people and applications know about you is an important issue, and is one of the reasons why I want to become King of Data Province. I may not be able to do much about data that is already out in the wild -- innumerable web sites and archives have information about me already. And there isn't a great deal I could do about it, well, unless we can get an enforceable Data Protection Act for the entire web (which I doubt). But I can at least be the authoritative source of information. Assuming the infrastructure supports it of course. Edd Dumbill has already shown how to sign and encrypt FOAF files using PGP.

Posted by ldodds at 02:45 PM | Feedback? | TrackBack

July 18, 2003

Busy, Busy

Apologies to anyone who has mailed me over the last week or so and is still waiting for a response. We've just gone through an application release here at work and so things have been extremely busy. Not had time to think about anything other than bug fixing. In case you're interested we've just moved our production server over to JBoss.

I've got lots of things to write-up, and will be catching up with this and email over the next few days. So if you're still waiting on a response, please bear with me!

Posted by ldodds at 09:39 AM | Feedback? | TrackBack

July 04, 2003

Wiki Refactoring

I took some time last night to start refactoring the FOAF Wiki. This was something we needed to do, but comments on rdfweb-dev from Marc Canter, suggesting that folk are struggling without decent FOAF documentation, prompted me to devote a bit more time to it this week.

I'm pretty happy with the result as I've ended up with main entry points (FAQ, Tools, Developers, and DataSources) under which everything else can probably be arranged. Barring a Syntax page which ought to tie together the various vocab and extension discussions.

The FAQ should hopefully answer the majority of queries that Marc raised, including oft overlooked details such as "how do I link to my FOAF?, what icon(s) can I use?", etc.

This all got me to thinking about Wiki Refactoring in general. Is Wiki Refactoring really Refactoring?
Martin Fowler recently aired a pet peeve about the abuse of the term: people talk about "refactoring" when they really mean "rewriting". I'm pretty sure I've done this on more than one occasion.

Re-organising a Wiki -- breaking down pages into inter-linked related pages; combining related information -- seems like refactoring on the surface. But is it? The re-organisation seems to add (implicit) information, if only in the links between topics.

Not surprisingly there are pages about Wiki Refactoring on the C2 Wiki, which I'll be exploring when I get a few free minutes.

Bill de hÓra's comments about the Echo Wiki pretty much echo (ha ha) my own thoughts when I skimmed through it recently. It didn't have the same "feel" as other Wiki sites I've used. So, while it's currently a high-profile example of Wiki use I'm not sure it's necessarily a good exemplar of the Wiki concept in general. And not that it needs to be: it's a tool to get a certain job done, and if it works, great.

Posted by ldodds at 02:18 PM | Feedback? | TrackBack

MP3 GPS

Here's something I came across whilst looking for a Java API into WinAmp. Why was I looking for such a beast? Well mainly because I'm a Java weenie and can't be bothered to dredge up the bits of C/C++ I know to write WinAmp plugins properly.

The plugin I'm thinking about would be used to extract some RDF data from WinAmp -- the playlist, what I'm listening to, etc.

Anyway, this led me to Mp3 GPS, a WinAmp plugin written in Java that communicates with a GPS device attached to the serial port of your computer. (See the installation page for a link to the Java API; it doesn't seem to have an official separate home page.)

The MP3 GPS plugin can be configured to select a playlist based on information such as the system time, the current speed, direction, latitude, longitude and altitude. Which is a cool idea.

Apart from the use cases the author mentions, there are some other interesting possibilities. Such as being able to have a playlist for a particular scenic location; something that might complement the mood perhaps. Currently at high altitude and moving very fast? No problem, WinAmp will start playing some soothing tunes to make your flight more enjoyable.

There are some interesting art works that could be constructed with something like this. It basically allows a musician or DJ to take the listener's environment into account when mixing a tune. They just need to provide the rules for selecting the right playlist.
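
As a rough idea of what "providing the rules" might look like, here's a small Python sketch. It has nothing to do with how the MP3 GPS plugin is actually configured; the Reading and Rule structures, the thresholds and the playlist names are all invented for the example. Each rule is just a test against the current environment, and the first rule that matches wins.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Reading:
    """A snapshot of the listener's environment (hypothetical structure)."""
    hour: int          # local hour, 0-23
    speed_kmh: float
    altitude_m: float


@dataclass
class Rule:
    """A playlist plus the condition under which it should be chosen."""
    playlist: str
    matches: Callable[[Reading], bool]


def choose_playlist(rules: List[Rule], reading: Reading,
                    default: Optional[str] = None) -> Optional[str]:
    """Return the playlist of the first rule that matches, else the default."""
    for rule in rules:
        if rule.matches(reading):
            return rule.playlist
    return default


# Order matters: more specific rules come first.
RULES = [
    Rule("soothing.m3u", lambda r: r.altitude_m > 8000 and r.speed_kmh > 600),
    Rule("driving.m3u", lambda r: r.speed_kmh > 50),
    Rule("late-night.m3u", lambda r: r.hour >= 22 or r.hour < 5),
]

if __name__ == "__main__":
    in_flight = Reading(hour=14, speed_kmh=850.0, altitude_m=10000.0)
    print(choose_playlist(RULES, in_flight, default="everyday.m3u"))
```

Keeping the rules as plain data plus a predicate is what would make it feasible to lift them out into something more generic later on.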

This is also related to Linked which I saw chumped earlier this week.

It would be useful to extract the rules into something more generic, similarly the playlists. That way MP3 and GPS enabled phones could make use of the data. There are already some geo vocabularies for RDF, so there's probably work to build on already.

It's just a shame I don't have any GPS peripherals!

Posted by ldodds at 11:26 AM | Feedback? | TrackBack

July 03, 2003

Ping FOAFnaut

Here's a simple bookmarklet that uses FOAF Autodiscovery to find a FOAF document linked to from the page you're currently viewing and then submits it to FOAFnaut.

If FOAFnaut already knows this URL then the database will be cleared and the data reloaded, otherwise it will just load the new data.

Just drag the link below to your browser toolbar.

Ping FOAFnaut
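
For the curious, the autodiscovery step boils down to looking for a link element in the page head. The bookmarklet itself is Javascript; the following is only a Python sketch of the same lookup, assuming the usual autodiscovery convention of a link with rel="meta" and type="application/rdf+xml". The class and function names are made up, and the sketch stops short of submitting anything to FOAFnaut.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FoafLinkFinder(HTMLParser):
    """Collect href values of <link> elements that look like FOAF autodiscovery links."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None for valueless attributes, so normalise them.
        attr = {name: (value or "") for name, value in attrs}
        if (attr.get("rel", "").lower() == "meta"
                and attr.get("type", "").lower() == "application/rdf+xml"
                and attr.get("href")):
            self.hrefs.append(attr["href"])


def find_foaf(html, page_url):
    """Return the absolute URL of the first FOAF link in the page, or None."""
    finder = FoafLinkFinder()
    finder.feed(html)
    return urljoin(page_url, finder.hrefs[0]) if finder.hrefs else None


if __name__ == "__main__":
    page = ('<html><head><link rel="meta" type="application/rdf+xml" '
            'title="FOAF" href="foaf.rdf" /></head><body></body></html>')
    print(find_foaf(page, "http://example.org/blog/"))
```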

Posted by ldodds at 06:26 PM | Feedback? | TrackBack

Thoughts on Markup Tools

Great post from Dorothea on requirements for good markup tools.

I agree with everything she says, and follow a few of the habits she lists myself: I tend to write an article and then mark it up, unless it's only a very very short piece; and always fill in my URLs after the fact. I also hate tree based views, I only find them even vaguely useful when I'm writing XSLT.

That said I do personally find Wiki markup pretty unencumbering so do tend to mark up as I write when editing a Wiki. But that's mainly because Wiki markup generally involves the same "gesture" (to borrow Dorothea's term) at the beginning and end.


New Tools and Old Habits

Posted by ldodds at 03:12 PM | Feedback? | TrackBack

My Life in RDF

I have this project idea at the back of my mind which I mentally refer to as "My Life in RDF".

Rather than being an exciting diary of the trials and tribulations of a lone hacker, wrestling with semantic web technologies for fun and profit, you should read that title a bit more literally: the details of my life -- the raw details of what I do -- available as RDF.

At this point I should probably try to dispel any whiff of conceit here. I don't necessarily think that the world (or the Semantic Web) is waiting with bated breath to know exactly what page of "Impossibility" I'm on at the moment [1] or what I'm listening to as I write this [2] or what I thought of the last film I saw [3]. Far from it! However the success of the Semantic Web is obviously predicated (pun intended :) on the availability of as much machine-processable metadata as possible. And as potential citizens of the Semantic Web who wish to gain the most from that environment, this means providing information about ourselves.

Obviously there are security and trust issues to be considered and resolved here. For the moment though I'm setting those aside. Not because I think they're secondary issues, but because I think that the immediate hurdle of extracting and publishing this data from the decidedly un-semantic application environment in which we currently exist is a much bigger one. I also believe that some of the security, privacy and social issues won't be visible until we're suddenly awash in a sea of triples and new capabilities begin to emerge; these problems are never where you expect them to be.

The kind of data that would be exported by "My Life in RDF" would be as comprehensive as possible. I'm interested in finding out in what ways a stream of data about what I'm doing, thinking, watching, reading, listening to etc can be knitted together to build new kinds of applications. People-centric applications. Many people have already realised that the whole blogging phenomenon has only begun to scratch at the surface of the potential here.

"People who watched About Schmidt last night are typically listening to these artists..."

"There are actually 5 Semantic Web developers in the same area as you at the moment, do you want to co-ordinate a face-to-face meeting? There's wireless access in the coffee shop 200 yards down the road"

You know the kind of thing. Those aren't even great examples.

Another important aspect of "My Life in RDF" would be to fill in the details of your existing environment. Suddenly enabling RDF data streams about yourself and your current activities would be akin to a Semantic Web "Year Zero". To pick obvious examples what about the music or books that are already on your shelf, the novels you've already read? I (we) need to be able to import other data.

I haven't gotten very far with imagining what kind of technology framework "My Life in RDF" would actually involve. It may just be a disparate set of data feeds produced by myself, automatically by my applications, or by other organisations (if sanctioned). The central aspect would be that I would be at the center of the web controlling all aspects of it: it's my data after all.

Rather romantically I imagine myself as the King of Data Province, a Scribe ready by my shoulder to record my latest thoughts, quips and travels. In concentric circles, radiating outward from my throne situated at the pinnacle of my Castle, move my trusted friends and colleagues who have access to the Keep and the Library. Beyond the castle walls, but within the moat, is a wider community with whom I interact; anonymously sometimes, hiding my robes under a dark cloak. Still further beyond the moat lie the Bad-Lands: an unmapped territory, its people an unknown quantity, their unwanted attentions kept at bay by the moat and walls of my demesne.

With all this in mind I read with interest Edd Dumbill's note about the Semantic Web dashboard being developed by Nat Friedman. This looks like a great step in the right direction: "My Life in RDF" on the desktop, something that could be of immediate benefit to me. Inverting this, so that the data is available to a trusted group of others, is not a great step from there. Tim O'Reilly's "All Software Should Be Network Aware" article covers a similar theme. I like the idea of a set of application guidelines.

Much of the code hacking I'm working on at the moment, especially my FOAF tools, has all of this as its context. I want to help boot-strap a Semantic Web community of people [4]. Other things I have in mind are tools for cataloguing book, music and photo collections. Tools that are designed with the non-technical end user in mind.

Footnotes:

1. Somewhere in chapter 1. It's not as readable as The Meme Machine, but nevertheless extremely interesting

2. A series of tracks including An Fhomhair (Orbital, US Remix CD of The Altogether); Serpents (Nitin Sawhney); 3AM Eternal/Blue Danube (KLF/Strauss; a bizarre but cool remix by The Orb Remix Project); Chemical Beats (The Chemical Brothers); Words (The Doves); and more.

3. About Schmidt. An excellent film, but one that left me feeling quite depressed. I found the emotional "happy" ending to actually further underscore the tragedy of Schmidt's life: that his efforts to build a life that mattered, by following all the usual paths, were ultimately fruitless yet an action taken on a moment's guilty feeling had profound repercussions. But then I tend to get emotional over any fatherhood theme ever since I became one. And the suggestion that one could end up becoming a bad, or at least (emotionally) distant, parent without realising it's happening is profoundly scary.

4. And it's a shame that "community" is becoming such a tired word these days. Maybe it's a parenting thing, but I find it something that I'm craving more and more.

Posted by ldodds at 02:07 PM | Feedback? | TrackBack

July 02, 2003

Got Advogato? Get FOAF!

Continuing a week spent tinkering with FOAF utilities, I've just posted a new one that will automatically generate a basic FOAF file for you if you have an existing Advogato profile.

Advogato to FOAF

To use the utility all you need to supply is your nick name and email address. The latter is used as a unique identifier in the FOAF universe, so including that in the FOAF document makes it easier for FOAF applications to combine different people's statements about you. There's an option (the default actually) to encrypt this as a SHA1 sum so it won't be revealed to spammers.
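
The SHA1 step is simple enough to show in a few lines. This is a Python sketch rather than the utility's own Javascript, and it assumes the usual FOAF convention that the hash is taken over the full mailto: URI rather than the bare address. The function name is made up and the address is obviously a placeholder.

```python
import hashlib


def mbox_sha1sum(email):
    """Return the SHA1 hex digest of the mailto: URI for an email address."""
    uri = "mailto:" + email.strip()
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Prints a 40 character hex string that can be published in place of the address.
    print(mbox_sha1sum("someone@example.org"))
```

Two documents that hash the same address will produce the same sum, so applications can still match them up without either document revealing the address itself.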

The utility uses Javascript to validate the form and assemble a URL chain that consists of using the W3C HTML Tidy service to turn your Advogato profile page into XHTML, and then the W3C XSLT service to process the results with this stylesheet which scrapes the relevant data from the page.
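
The "URL chain" idea is easier to see in code: each service takes the address of its input as a query parameter, so the output of one can be fed to the next simply by URL-encoding the whole request and passing it along. The Python sketch below illustrates the nesting only. The two service addresses are placeholders rather than the real W3C endpoints, and the parameter names (docAddr, xslfile, xmlfile) and the profile URL pattern are assumptions made for the example.

```python
from urllib.parse import urlencode

# Placeholder endpoints: stand-ins for the tidy and XSLT services, not their real addresses.
TIDY_SERVICE = "http://tidy.example.org/tidy"
XSLT_SERVICE = "http://xslt.example.org/xslt"


def build_chain(profile_url, stylesheet_url):
    """Nest the profile URL inside a tidy request, and that inside an XSLT request."""
    # Step 1: ask the tidy service to fetch the profile page and return XHTML.
    tidy_url = TIDY_SERVICE + "?" + urlencode({"docAddr": profile_url})
    # Step 2: ask the XSLT service to fetch the tidied page and transform it.
    return XSLT_SERVICE + "?" + urlencode({"xslfile": stylesheet_url, "xmlfile": tidy_url})


if __name__ == "__main__":
    print(build_chain("http://www.advogato.org/person/example/",
                      "http://example.org/advogato2foaf.xsl"))
```

Note that the profile URL ends up encoded twice, once for each layer of the chain, which is the detail that's easiest to get wrong when assembling these by hand.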

Here's some sample output from my profile.

There are two obvious extensions to this. The first is to provide an option for automatically generating foaf:knows statements for all the people you've certified (this assumes that FOAF apps are able to smush based on foaf:homepage alone). The second is to include your (and your friends') Advogato certification level as an additional property. Bill Kearney has sketched out a schema for this already.
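
For anyone unfamiliar with "smushing", the idea is to merge separate descriptions that turn out to be about the same person because they share an identifying property value. Here's a deliberately naive Python sketch using plain dictionaries instead of RDF; the function name and data layout are invented, and a real FOAF application would do this over a proper RDF graph.

```python
def smush(descriptions, key="homepage"):
    """Merge dicts describing people when they share the same value for `key`.

    Each merged description maps property names to the set of values seen.
    Descriptions lacking the key are kept separate, since nothing identifies them.
    """
    merged = {}
    unidentified = []
    for desc in descriptions:
        ident = desc.get(key)
        if ident is None:
            unidentified.append({prop: {value} for prop, value in desc.items()})
            continue
        person = merged.setdefault(ident, {})
        for prop, value in desc.items():
            # Collect every distinct value seen for each property.
            person.setdefault(prop, set()).add(value)
    return list(merged.values()) + unidentified


if __name__ == "__main__":
    statements = [
        {"homepage": "http://example.org/~a/", "nick": "a"},
        {"homepage": "http://example.org/~a/", "name": "Alice Example"},
        {"nick": "b"},
    ]
    for person in smush(statements):
        print(person)
```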

Thanks to Morten Frederiksen for helping to make sure that the RDF output validates.

Between this and the FOAF-a-Matic you've no longer got an excuse for not having a FOAF description of yourself. Every blog should have one!

Posted by ldodds at 10:12 PM | Feedback? | TrackBack

July 01, 2003

More, More, MORE

I've been meaning to have a play with the POI API for some time now. So, when a colleague mentioned how easy it is to work with, I decided it was high time I had a look. Whilst thinking of a suitable utility it occurred to me that Office documents have metadata stored in them (see the File -> Properties dialog), and so I wondered whether it would be possible to extract this data as RDF.

The result is MORE (Microsoft Office RDF Extractor).

The tool is a simple command-line utility that generates an RDF document from one or more Office documents. Access to the embedded properties is made possible by the POI HPSF API, while the RDF manipulations are performed by Jena. So you'll need these classes in your CLASSPATH before running the application.

Download MORE 0.1

The command-line is simple:

java com.ldodds.more.MORE -help

...will get you a usage message describing the available options. To summarise, it's possible to extract RDF from several documents in one go, add RDF statements to an existing RDF document, and dump the results to a file rather than the console, which is the default.

The key part of MORE is the "mapping schema". This is a concept that I've borrowed (read: "stolen") from Norman Walsh's jpegrdf utility, which I've also been tinkering with lately. A mapping schema is basically just an RDF Schema that contains a number of rdf:Property elements. Each of these properties is annotated by a more:pidString property as follows:

<rdf:Property rdf:about="http://purl.org/dc/elements/1.1/title">
<rdfs:label>Title</rdfs:label>
<more:pidString>PID_TITLE</more:pidString>
</rdf:Property>

Here's a complete example schema.
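
To show how little there is to a mapping schema, here's a Python sketch that reads one and builds a lookup table from pidString to RDF property. This is not how MORE does it (MORE is Java and uses Jena to read the schema as RDF); the sketch just walks the XML, so it only copes with schemas serialised like the fragment above. The namespace URI bound to the more: prefix is a placeholder, as is the load_mapping name.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"
# Hypothetical namespace URI for the more: prefix; the real one is in the example schema.
MORE_NS = "http://example.org/more#"

SCHEMA = """<rdf:RDF xmlns:rdf="{rdf}" xmlns:rdfs="{rdfs}" xmlns:more="{more}">
  <rdf:Property rdf:about="http://purl.org/dc/elements/1.1/title">
    <rdfs:label>Title</rdfs:label>
    <more:pidString>PID_TITLE</more:pidString>
  </rdf:Property>
</rdf:RDF>""".format(rdf=RDF_NS, rdfs=RDFS_NS, more=MORE_NS)


def load_mapping(schema_xml):
    """Return a dict mapping each pidString to a list of RDF property URIs."""
    mapping = {}
    root = ET.fromstring(schema_xml)
    for prop in root.iter("{%s}Property" % RDF_NS):
        uri = prop.get("{%s}about" % RDF_NS)
        for pid in prop.findall("{%s}pidString" % MORE_NS):
            # One pidString may be mapped to several properties, hence the list.
            mapping.setdefault(pid.text.strip(), []).append(uri)
    return mapping


if __name__ == "__main__":
    print(load_mapping(SCHEMA))
```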

Office documents store their metadata as name-value pairs. The property names are either "built-in" (these all start with the prefix "PID_") or defined by the user in the Custom tab of the File -> Properties dialog in the application. (Actually I'm glossing over a lot of details here, see the HPSF internals document for the ugly truth; HPSF makes things easy to handle.) The pidString properties in the mapping schema are therefore just the names of metadata elements stored in a Word, Excel or PowerPoint document.

Upon encountering an item of metadata, MORE examines its mapping schema to determine which RDF properties it should add to the resulting RDF. The example mapping schema in the download shows how to create both Dublin Core and custom RDF properties. If an item of metadata doesn't have an entry in the mapping schema then it's just discarded, making it very easy to customise the tool to produce the output you desire. Also, if a property value starts with "http" or "mailto" then an rdf:resource attribute is generated rather than a literal.
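
The rules in the paragraph above are small enough to sketch in a few lines of Python. Again, this is an illustration and not the MORE source: the real tool builds a Jena model, whereas this just produces tuples, and the names (to_statements, the "Homepage" custom property, the sample values) are invented for the example.

```python
DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"

# pidString -> RDF property URIs. "Homepage" stands in for a user-defined custom property.
MAPPING = {
    "PID_TITLE": [DC + "title"],
    "PID_AUTHOR": [DC + "creator"],
    "Homepage": [FOAF + "homepage"],
}


def to_statements(doc_uri, metadata, mapping):
    """Turn name-value metadata into (subject, predicate, object, kind) tuples.

    Unmapped names are dropped; values starting with "http" or "mailto"
    are marked as resources, everything else as literals.
    """
    statements = []
    for name, value in metadata.items():
        value = str(value)
        for prop in mapping.get(name, []):
            kind = "resource" if value.startswith(("http", "mailto")) else "literal"
            statements.append((doc_uri, prop, value, kind))
    return statements


if __name__ == "__main__":
    metadata = {
        "PID_TITLE": "Annual Report",
        "Homepage": "http://example.org/",
        "PID_REVNUMBER": "3",   # no entry in the mapping, so it is discarded
    }
    for statement in to_statements("report.doc", metadata, MAPPING):
        print(statement)
```

Dropping anything that isn't in the mapping is what makes customisation cheap: changing the output is a matter of editing the schema rather than the code.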

Feedback is very welcome, particularly if it doesn't work for you or there are bugs! (One thing I'm not sure about is how best to assign a URI to each document resource. I've defaulted to just using the file name, because that's what jpegrdf does, and if it's good enough for Norm...)

While I've no firm plans to extend this tool further -- for me it's just another step down the road in learning various RDF tools and technologies -- I may add sensible new features if suggested. However I consider the code to be Public Domain (it's pretty trivial after all) so feel free to do with it what you will.

Posted by ldodds at 09:45 PM | Feedback? | TrackBack