Projects


22
Jan 11

Custom Lego Sets

For a couple of years now I’ve tried to do something a little different for Christmas presents for the kids. I’m not particularly good with my hands but I’ve always wanted to be able to make them things: something that will hopefully mean a little more than the average gift.

For example one year I made them a level in LittleBigPlanet called Sackboy Saves Christmas. (Aside for data geeks: each LittleBigPlanet level now has its own unique URI). The level isn’t great, but I had fun making it, and they’ve enjoyed playing it. A little later I also made them some real pods for their sackboys.

This year I decided to do something with Lego.

Lego Digital Designer is a simple and free CAD package for building and designing lego sets. Once you’ve designed something you can get it priced and ultimately have it turned into a real set.

I’ve tried this package a few times but found that the brick set is a little limited and the price racks up quickly. I’m also not the world’s greatest designer so my creations weren’t great. So I decided to take a slightly different tack.

Lego Community Sites

There are a lot of great lego community sites. One of these is Peeron which is a lego inventory website that provides access to a database of lego parts, set inventories, instruction scans and photos. The whole thing is crowd-sourced so you can submit new inventories or scans.

One particularly nice feature is that you can build a personal inventory of lego sets and parts. You can browse sets, ticking off those that you own, and the site builds a database of the various parts that make up the sets. When you’re browsing a set you don’t have you can also click “try to build” and the service will run the set inventory through the list of parts you own, and let you know whether you have all of the required parts, if you have any of the right part but in the wrong colour, or which parts you’re missing.
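To make the “try to build” idea concrete, here is a minimal Python sketch of that kind of check. It is not how Peeron actually implements it; the function name, the `(part_number, colour) -> quantity` data layout and the sample inventory are all assumptions made up for illustration.

```python
from collections import Counter


def try_to_build(set_inventory, owned_parts):
    """Compare a set inventory with a personal parts list.

    Both arguments map (part_number, colour) -> quantity. Returns three
    Counters keyed by the part the set asks for: exact matches, parts
    covered only by another colour, and parts that are missing.
    (Hypothetical helper, not Peeron's real logic.)
    """
    remaining = Counter(owned_parts)
    exact, wrong_colour, missing = Counter(), Counter(), Counter()
    shortfalls = {}

    # First pass: use up parts that match on both number and colour.
    for key, needed in set_inventory.items():
        used = min(needed, remaining[key])
        if used:
            exact[key] = used
            remaining[key] -= used
        if needed > used:
            shortfalls[key] = needed - used

    # Second pass: cover shortfalls with the same part in another colour.
    for (part, colour), short in shortfalls.items():
        for (other_part, other_colour), spare in list(remaining.items()):
            if short == 0:
                break
            if other_part == part and spare > 0:
                take = min(short, spare)
                wrong_colour[(part, colour)] += take
                remaining[(other_part, other_colour)] -= take
                short -= take
        if short:
            missing[(part, colour)] = short

    return exact, wrong_colour, missing


# Illustrative data only.
set_inventory = {("3001", "red"): 4, ("3003", "blue"): 2, ("3020", "grey"): 1}
owned_parts = {("3001", "red"): 3, ("3001", "white"): 2, ("3003", "blue"): 2}
exact, wrong_colour, missing = try_to_build(set_inventory, owned_parts)
```

With this sample data, three of the four red bricks are matched exactly, the fourth is covered by a white one, and the grey plate is reported as missing.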

The core of the family lego collection is the remnants of my childhood collection of Classic Lego Space sets. There were lots of parts missing, but we’ve been able to use Peeron to resurrect some of the sets with substitute parts.

While you can get new bricks, baseplates and minifigs from the lego shop, if you want to track down hard to find or discontinued pieces then there’s one place to go: Bricklink.

For the uninitiated, Bricklink is essentially an Ebay for Lego. It’s a marketplace where anyone can go to buy and sell lego bricks, sets, and instructions. Not only is it a fantastic resource for tracking down hard to find pieces, but I’ve found that even new bricks are much cheaper than buying them direct from lego.

There’s a search engine on the site for tracking down what you need. Lego part numbers are standardised so it’s easy to find what you want if you’re buying missing pieces for a set you want to build from Peeron. It’s also a great place to go to if you’re piecing together a custom set from scratch. You can maintain a wanted list and get alerts as pieces become available. And if you do buy from the marketplace, you can download the part list for your order for importing back into Peeron, to keep your part list up to date.

The Bricklink community is also very friendly and efficient. I’ve found that orders tend to be processed really quickly and come well packaged. I’ve taken care to rate sellers and comment on every order as that kind of quality interaction is something to encourage.

So if you’re thinking about building custom lego sets, Bricklink is definitely the place to start.

Finding and Creating Custom Lego Set Designs

Having ruled out trying to create something completely unique I settled on creating sets from other people’s creations. A bit of a cop out I suppose, but the end result would still be something different to what’s in the Lego catalogue.

I’ve mentioned Lego Digital Designer already. There’s also a more “professional” Lego CAD package called LDraw which is essentially a suite of open source tools for creating and manipulating Lego model designs. As well as the core CAD package itself there are also tools to support creating rendered images from designs, and even to create complete instructions that are very close to those produced by Lego themselves. The tools are a bit fiddly to work with though and surprisingly I found it hard to track down many designs that people had actually shared.

Another resource is the MOCPages community. MOC stands for “My Own Creation”. It’s essentially a community site where people can upload photo sets for models they’ve created. There are some really great (and big!) Lego models on that site! Little in the way of instructions or parts lists though, so there’s an element of reverse engineering involved.

There’s also a community of people using Flickr to share their creations. I’ve been following Peter Reid for a while as he creates the most fantastic selection of Lego space and robot models. Again, you need to be prepared to reverse engineer, but this isn’t too hard for the smaller models at least.

A final source for some small simple models is the Brick Issue. This is the magazine of the Brickish Association and has a regular “5 Minute Model” feature that provides instructions for some simple models.

What I Made

Issue 7 of the Brick Issue, for example, has a 5 minute model of a Turtle droid by Peter Reid (photo). This was perfect for my purposes as my son and I had been admiring the Turtle Factory at the Great Western Lego Show (watch the video!). So this formed the basis for the first set I put together for my son.

I found the second set I decided to package via the Neo Classic Space blog. This is a Lego fan blog focused specifically on people updating the old Lego Classic Space theme to use modern parts as well as covering some fantastic new models made to follow the theme. There are some excellent micro-scale models featured on there, including this one of an X-wing. This was pretty easy to reverse engineer so I put together a second set that consisted of three X-Wings; one with some slight tweaks to make it the “squad leader”.

Lego has been pretty hit and miss when it comes to creating sets for girls. My daughter loves Lego too, but primarily for playing with the minifigs and the towns and buildings. So for her I assembled a collection of female minifigs to add to her existing small collection.

Once I’d ordered all of the parts, which involved probably ten or more individual orders across a number of Bricklink sellers, the remaining work was to order some boxes from The Bag And Box Man. I created some custom labels, following the Lego box art style, which is pretty easy to reproduce. The availability of some great Flickr photos of the models meant that I had plenty of existing resources to draw on.

I was pretty pleased with the end result and so were the kids! It was definitely a fun project over the pre-Christmas run-up and a welcome distraction from a very busy work schedule. If you’re interested in trying this out yourself, hopefully there are some useful pointers in this post.


6
Apr 10

Linked Data Patterns: a free book for practitioners

A few months ago Ian Davis and I were chatting about some new approaches to helping practitioners climb the learning curve around Linked Data, RDF and related technologies. We were both keen to help communicate the value of Linked Data, share knowledge amongst practitioners, and to encourage the community to converge on best practices. We kicked around a number of different ideas in this vein.

For example, Ian was keen to provide guidance as to how to mix and match different vocabularies to achieve a particular goal, like describing a person or a book. Having a ready reference containing recipes for these common tasks would address a number of goals. He’s ended up exploring that idea further in the recently released Schemapedia. If you’ve not seen it yet, then you should take a look. It provides a really nice way to navigate through RDF vocabularies and explore their intersections.

The other thing that we discussed was Design Patterns. I’ve been a Design Pattern nut for some time now. Discovering them was something of a rite of passage for me during my Master’s dissertation. I’d spent weeks revising and honing a design for the distributed system I was building, only to discover that what I’d produced was already documented as a design pattern in an obscure corner of the research literature. While I’d clearly reinvented the wheel, the discovery not only provided external validation for what I’d produced, but also neatly illustrated the benefit of using design patterns to share knowledge and experience within a community. Knowing when to apply particular patterns is a key skill for any developer, and the terms are a part of the design vocabulary we all share.

I suggested to Ian that we explore writing some patterns for Linked Data. Patterns for assigning identifiers, modelling data, as well as application development. We experimented with this for a while but ended up parking the discussion for a few months whilst other priorities intervened.

I recently revived the project. It’s pretty clear to me that there’s still a big skills gap between experienced practitioners and those seeking to apply the technology. I think the current situation is reminiscent of the move of OO programming from the research lab out into the developer community; design patterns played a key role there too.

Ian and I have decided to share this with the community as an on-line book, a pattern catalogue that covers a range of different use cases. We started out with about half a dozen patterns, but over the last few weeks I’ve expanded that figure to thirty. I’ve still got a number on my short-list (more than a dozen, I think) but it’s time to start sharing this with the community. The work won’t ever be complete as the space is still unfolding; it will just get refined over time.

You can read the book online at http://patterns.dataincubator.org.

The work is licensed under a Creative Commons Attribution license so you’re free to use it as you see fit, but please attribute the source. If you want to download it, then there’s a PDF, and an EPUB too. We’re using DocBook for the text so there will be a number of different access options.

I’ll stress that this is a very early draft, so be gentle. But we’d love to hear your comments.


20
Oct 09

Surveying and Classifying SPARQL Extensions

I realised recently that, while a lot of work has been done on creating and exploring interesting extensions to the SPARQL query language, there has yet to be a systematic survey of the range of different extensions that are currently implemented in various RDF triplestores. Or if there has been a survey, then I’ve clearly missed it.

In order to get a better idea of what kinds of extensions are available I’ve set myself the task of surveying those currently implemented. I intend to write up and share the results of that work through this blog.

Rationale

I think that pulling together a list of extensions is a useful activity which should:

  • Help researchers and implementors to have a clearer view of existing work, thereby encouraging further experimentation
  • Promote convergence on a core set of useful extensions that could be implemented across a number of triplestores.
  • Help users to have a clearer understanding of what SPARQL extensions are currently supported in particular triplestores, letting them make informed decisions about which extensions to use when writing and sharing queries

It looks like the SPARQL Working Group may well be adding a standard library of extension functions into the next revision of the query language so the timing of this work should help contribute to that effort. However, I’m looking beyond their immediate goals and hope to encourage the implementor community to explore models similar to the EXSLT effort, which has been successful in creating a set of community-designed extensions for XSLT transformations. I see no reason why the same process can’t be applied to SPARQL extensions.

Clarity about which extensions are portable across triplestores is important to allow users to experiment with various triplestore implementations and services. If data is going to be truly portable, then this will be an important consideration.

With that in mind I’ve begun digging into the available documentation for a number of different triplestores. I’ve decided to organize my work by surveying each of the three different types of SPARQL extension.

Types of SPARQL Extension Function

It’s possible to extend the SPARQL query language in any of the following three ways:

  • Extension Functions
  • Property Functions (aka “Magic Predicates”)
  • Language Extensions

Let’s look at each of these in turn.

Extension Functions

Extension Functions are explicitly described by the current SPARQL specification under the banner of “extensible value testing”. The standard library of extensions that may be added to SPARQL 1.1 will fall into this category. Extension Functions are simple function calls that can be used within a FILTER in a SPARQL query to carry out some specific extra logic that cannot be handled by matching triple patterns. Examples of extension functions include substring testing, string concatenation, date tests, etc.
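As a rough mental model, an extension function is a named test applied to each candidate solution after the triple patterns have matched. The Python sketch below illustrates that idea only; it is not how any particular triplestore works, and the `http://example.org/fn#` namespace, the two functions and the `apply_filter` helper are all hypothetical.

```python
EX = "http://example.org/fn#"  # hypothetical namespace, for illustration only

# Each extension function is identified by a URI and maps to a simple test.
EXTENSION_FUNCTIONS = {
    EX + "contains": lambda value, fragment: fragment in value,
    EX + "startsWith": lambda value, prefix: value.startswith(prefix),
}


def apply_filter(solutions, function_uri, variable, *args):
    """Keep only the solutions for which the named function returns True."""
    try:
        function = EXTENSION_FUNCTIONS[function_uri]
    except KeyError:
        # A query using a function the engine doesn't know is not portable.
        raise ValueError(f"Unsupported extension function: {function_uri}")
    return [s for s in solutions if function(s[variable], *args)]


# Solutions as they might look after triple pattern matching.
solutions = [
    {"title": "Linked Data Patterns"},
    {"title": "Surveying SPARQL Extensions"},
]
matches = apply_filter(solutions, EX + "contains", "title", "SPARQL")
```

The lookup failure is the interesting part: a query that relies on a function URI an engine doesn’t recognise simply can’t be evaluated there, which is why knowing which extensions are widely supported matters.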

The specification indicates that these extension functions should have a unique URI, allowing them to be globally identified. Few engines are publishing useful information at these URIs, but this seems like it would be a useful thing to do. These URIs should be grounded in the web too.

Property Functions

Property Functions (aka “Magic Predicates”, or “Magic Properties”) are extensions to the triple matching process that is carried out when a SPARQL query is executed. This means that property functions don’t appear in a FILTER expression like an extension function. They instead appear within the graph pattern of the query. Unlike extension functions, which have a syntax like a conventional function call, property functions use Turtle syntax and appear, to the untrained eye, as standard triple patterns.

For example, a property function that could split a resource URI into a namespace and a localname might look like this in a SPARQL query:


?uri a rdfs:Class.
?uri ex:splitURI (?namespace ?localname).

In that example the property function ex:splitURI has as its input each of the URIs that are bound to the ?uri variable, and as its output binds the namespace URI and localname of those URIs to two new variables.
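To show the kind of logic that might sit behind such a function, here is a small Python sketch. The `split_uri` helper is hypothetical, and splitting at the last “#” or, failing that, the last “/” is just one common heuristic that I’m assuming here; a real implementation of ex:splitURI could behave differently.

```python
def split_uri(uri):
    """Split a URI into (namespace, localname).

    Assumes the namespace ends at the last '#', or the last '/' if the
    URI contains no '#'. Returns (uri, "") if neither is present.
    """
    for separator in ("#", "/"):
        index = uri.rfind(separator)
        if index != -1:
            return uri[:index + 1], uri[index + 1:]
    return uri, ""


# Existing bindings for ?uri, as produced by the first triple pattern.
bindings = [
    {"uri": "http://www.w3.org/2000/01/rdf-schema#Class"},
    {"uri": "http://xmlns.com/foaf/0.1/Person"},
]

# The property function adds two new variables to each solution.
for row in bindings:
    row["namespace"], row["localname"] = split_uri(row["uri"])
```

Each solution keeps its original ?uri binding and gains ?namespace and ?localname, which mirrors the input and output roles described above.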

There are other ways to structure the inputs and outputs of a property function, depending on its purpose, but the important things to recognise are that:

  • the property function is written as a conventional triple pattern
  • parameters can be passed from either the subject or object portions of the triple (or potentially both)
  • similarly, output can be bound to variables that appear in either the subject or object portions of the triple
  • one technique for passing multiple parameters or generating multiple output values is to allow specification of an RDF list in the object portion of the triple

Property functions are very powerful as they can allow arbitrary complex logic to be used to extend the triple matching process. One common use is to extend the matching process by calling out to specialised indices or logic, e.g. for full-text indexing or geospatial functions and reasoning.

It is worth noting that Property Functions are not explicitly licensed by the current SPARQL specification. The specification does not describe them at all: they are simply allowed by the fact that they conform to the overall SPARQL grammar.

Testing whether a query uses Property Functions would therefore require a validator (such as the one that Dan Brickley describes here) to either have explicit knowledge of the function, e.g. based on its URI, or for implementors to publish some useful information at those locations so that a validator might determine whether a specific predicate is actually a “real” predicate or an extension through dereferencing the URI. I’m not aware of any implementation that currently does this.
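The “explicit knowledge” option can be sketched in a few lines of Python. This is only an illustration of the idea: the list of known property function URIs and the `find_property_functions` helper are made up, and a real validator would first need to parse the query to extract its predicates.

```python
# Hypothetical registry of predicates known to be property functions.
KNOWN_PROPERTY_FUNCTIONS = {
    "http://example.org/fn#splitURI",
}


def find_property_functions(predicates, known=KNOWN_PROPERTY_FUNCTIONS):
    """Return, in query order, the predicates that are known property functions."""
    return [predicate for predicate in predicates if predicate in known]


# Predicates taken from the triple patterns of the example query.
predicates = [
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "http://example.org/fn#splitURI",
]
flagged = find_property_functions(predicates)
```

The weakness is obvious: the registry has to be maintained by hand, which is why publishing useful descriptions at the function URIs themselves would be the more scalable approach.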

Language Extensions

The final category of SPARQL extension is extensions to the language itself. This type of extension involves amending the grammar of the language to include new operators, keywords, and types of expression. Examples of this type of extension include sub-queries and aggregates (e.g. min and max). The forthcoming SPARQL 1.1 specification will standardise these and a few other language extensions that have been commonly implemented.

Arguably, if one changes the grammar of a language then you’re creating a new language: “SPARQL plus some extensions”. So some care needs to be taken with respect to this type of extension if one wants queries to be portable.

In my view, while there is plenty of scope for the community to collaborate and converge on common extensions of all of the types I’ve described here, the best place for language extensions to be formally ratified and agreed on is through the SPARQL Working Group. I personally don’t expect the Working Group to have to, or want to, sign off on every extension function or property function, but interoperability is ultimately best served by co-ordinating language extensions through the Working Group. Naturally this should happen after the implementor community have had a period of experimentation and research. This is obviously the process that has happened to date, and hopefully this will continue as the language continues to evolve. A bit of collective action ought to help ensure interoperability in other areas.

A Survey

For my survey of SPARQL extensions I’ve decided to tackle things in the order in which I have presented them here: I will first look at Extension Functions, then Property Functions, and then Language Extensions. For the rationale and reasons I’ve already outlined, I think the community is best served by organizing itself around standardising two of those types of extensions. And Extension Functions seem like the lowest hanging fruit.

I’m intending to do the survey in as open a way as possible, and want to ensure that I include as many different implementations as possible. Having said that, initially I’m going to impose some editorial control simply to ensure consistency and quality. Implementors, feel free to drop me a line providing me with information on your extensions or, preferably, pointers to the relevant documentation. I’ll also stress that while this survey has obvious relevance for my day job, this is a personal project so things will progress as quickly as I’m able to find some time to push things forward.

I’m going to send regular status updates to the public-sparql-dev mailing list as that is the correct place for further discussion. I’ll also summarize my findings in further blog posts here. I’ve already begun the process of cataloguing Extension Functions as you can see by my recent email to the mailing list. I still have to include some additional information helpfully provided by OpenLink and to also update the entries for Mulgara to list its support for some of the EXSLT functions.

One other task I have on my list is to help provide some guidance on how implementors should publish information about their SPARQL extensions. It would be useful to have some descriptive metadata for these available from the relevant URIs. I’m intending to spend some time at Vocamp DC pulling together a vocabulary for that purpose. Let me know if you’re attending and want to collaborate.


30
Sep 03

FOAF-a-Matic Mark 2 beta-2

It’s now been 10 months since I released the first beta of the FOAF-a-Matic Mark 2. An embarrassingly long time indeed, so I thought it was high time that I produced a second beta for you all to play with.
I’ve not been working on this solidly for 10 months and when you take a look you’ll see that there’s not a huge amount of extra functionality, but there are a few fun bits in there which I’d like some feedback on, so I thought I’d go ahead and push it out the door anyway.
I’m calling this beta-2 but there’s every likelihood that there are more bugs in this than the original so be careful and back-up your FOAF file before setting my tool on it. Also be aware that because it still doesn’t support all of FOAF (e.g. foaf:Group, foaf:Project, foaf:nearestAirport, etc) it won’t faithfully round-trip files.

Continue reading →


22
Sep 03

MusicBrainz Java API beta-1

I’ve just uploaded the beta-1 of my shiny new Java API onto the MusicBrainz RDF web service.
If you’re not familiar with MusicBrainz, it’s similar to CDDB: it stores lists of artists, albums and tracks that can be used to add metadata to your music collection. Aaron Swartz wrote a nice article on it a while ago: “MusicBrainz: A Semantic Web Service” (warning PDF).
There’s been a C/C++ API for some time now with bindings for other languages, but no Java API. And as I want to hook some Java code up to the server I went ahead and wrote one.
It’s not complete yet. It’s read-only at the moment so doesn’t support the query methods used to authenticate and submit data to the service. However this is enough for me at present and I thought I’d release it in case anyone else finds it useful.
The API is built on the spangly new Jena 2 API, and provides “raw” access to the RDF responses from the server or a simple bean interface for those of you not interested in the RDF.
You can download the API and read the package documentation online. The latter contains a few code fragments and enough information to get you started. The unit tests are pretty comprehensive too, so look there for additional examples.
This API is released under the Creative Commons Attribution-ShareAlike License.


22
Aug 03

FOAF-a-Matic Translations: More Coming, and How It Works

I’ve just added a comment to Danny’s “Translator exchange for open projects” LazyWeb entry, asking for additional translations of the FOAF-a-Matic. Having a simple bulletin board like this is a great idea. Here’s hoping that it gets some more linkage and pulls in more project requests and translators offering their services.
I’ve already followed-up Diane Panek’s offering of Filipino/Tagalog translation. And, if you’re a dufus like me, then you’ll be interested to know that Tagalog is “…the second most commonly-spoken Asian language (after Chinese) in the United States, and the sixth non-English language spoken in America”. More information available here.
I’ve also had a nice email from Minsu Jang who has offered me a Korean version of the FOAF-a-Matic. I’ve mailed Minsu with instructions, which basically boil down to translating this file.
For the techies out there, here are some details about how I generate a new version of the application…

Continue reading →


25
Jul 03

XML-DEV and eclectic

For a long time now eclectic has been lying fallow. I’ve just not been able to keep up to date with XML-DEV on a daily basis and keep it maintained the way I used to. This is down to a combination of factors, including being much busier at work (now managing a team) and at home (now a father, and soon to be so again).
However I think the main reason is that the conversations seem to be endlessly spiralling around several recurring themes (“permathreads”). This makes for very tedious reading as the trenches rarely shift very far in either direction. This has greatly reduced my tolerance for keeping up to date with the list. In the past I’ve tried to remain as impartial as possible, but once you’ve blogged about a topic for the nth time it starts to get tedious fast.
So I’m declaring eclectic to be dead. I’m no longer going to be a daily reader of XML-DEV but will stop by the archives occasionally and report on any interesting topics that I see.
Sadly however, a brief look into this month’s archives sees threads on XSLT vs CSS, Namespaces, and even one about Permathreads. So slim pickings for the moment.
It’s a shame really as I’ve learnt a great deal from monitoring XML-DEV over the last few years. I think the community has matured to the point where the interesting stuff is now happening on the “fringes”: separate dedicated mailing lists or other shared spaces. And that’s where I intend to keep lurking for the moment.
Many thanks to Userland for providing a free hosting service for so long.
If anyone is interested in taking the content from eclectic, you can download eclectic.root (a Frontier database export of the site).


2
Jul 03

Got Advogato? Get FOAF!

Continuing a week spent tinkering with FOAF utilities, I’ve just posted a new one that will automatically generate a basic FOAF file for you if you have an existing Advogato profile.

Continue reading →


1
Jul 03

More, More, MORE

I’ve been meaning to have a play with the POI API for some time now. So, when a colleague mentioned how easy it is to work with, I decided it was high time I had a look. Whilst thinking of a suitable utility it occurred to me that Office documents have metadata stored in them (see the File -> Properties dialog), and so I wondered whether it would be possible to extract this data as RDF.
The result is MORE (Microsoft Office RDF Extractor).

Continue reading →


11
Jun 03

FOAF-a-Matic In Japanese

The title says it all. The Japanese version of the FOAF-a-Matic was made possible by the kind efforts of Masahide Kanzaki. Big thanks to Masahide for his swift turn-around of the translation.
Masahide has also created a Japanese introduction to FOAF.
Danny has links to other recent FOAF activities.