<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Lost Boy &#187; Science and Technology</title>
	<atom:link href="http://www.ldodds.com/blog/category/science-and-technology/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.ldodds.com/blog</link>
	<description>A journal of no fixed aims or direction, by Leigh Dodds</description>
	<lastBuildDate>Sat, 22 Jan 2011 20:23:23 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Ants, Overlays and Open Data</title>
		<link>http://www.ldodds.com/blog/2008/08/ants-overlays-and-open-data/</link>
		<comments>http://www.ldodds.com/blog/2008/08/ants-overlays-and-open-data/#comments</comments>
		<pubDate>Mon, 11 Aug 2008 18:09:21 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Science and Technology]]></category>

		<guid isPermaLink="false">http://www.ldodds.com/lostboy/?p=313</guid>
		<description><![CDATA[Whilst standing behind the yellow line on the platform this morning, waiting for a train to Oxford, I noticed an ant on the floor wending its way along the tarmac, within the bounds of the thick yellow paint. The little black speck stood out quite sharply against the bright yellow.
Obviously the ant wasn&#8217;t following the [...]]]></description>
			<content:encoded><![CDATA[<p>Whilst standing behind the yellow line on the platform this morning, waiting for a train to Oxford, I noticed an ant on the floor wending its way along the tarmac, within the bounds of the thick yellow paint. The little black speck stood out quite sharply against the bright yellow.<br />
Obviously the ant wasn&#8217;t following the line, but neither was it moving randomly. It was clearly following its own little invisible marker, an ant scent trail, that just happened to coincide with the platform markings.<br />
Last night BBC 1 showed <a href="http://www.bbc.co.uk/iplayer/episode/b00d24qq/">Britain from Above</a>, an aerial view of Britain during a 24 hour period. The show had some great information visualisations, including traffic patterns for taxis, garbage collection, commuters, shipping and aircraft, as well as more static landmarks such as railway lines, electricity cables, water courses and telephone and network cabling. If you didn&#8217;t catch it, the programme is definitely worth a watch.<br />
It was this bird&#8217;s-eye view of the world that led me to reflect on that ant and its invisible trail. I wonder how many other layers of information could have been added to the human-centric views shown in the programme? Animal migratory paths are an obvious one. Paths of dispersal, ranges and colonization are some others. It doesn&#8217;t take long to come up with many, many more.<br />
The combinations of different paths and layers are also interesting to explore. Are many of these chance overlaps, like the ant on the paint, or are there dependencies or inter-relations? For example, how are migratory routes affected by no-fly zones or shipping lanes? Do migratory pathways begin to align with man-made features like roads and railways? And where have features like fish ladders and toad tunnels been introduced to avoid clashes between competing uses for the same space?<br />
It&#8217;s doubtful that these kinds of questions will be answered in the rest of the series. Judging by the trailer for next week&#8217;s episode there seems to be more of a &#8220;Pop geography&#8221; focus. (I&#8217;ll be tuning in regardless.)<br />
The truly exciting thing is that we can do this kind of exploration of layered information sources through map-based visualizations ourselves using a huge, and growing, range of commodity tools and data sets.<br />
Whilst watching the programme, what intrigued me more than the admittedly beautiful animations were questions such as: how did they approach the information holders in order to get permission to use it? What steps were taken towards privacy and anonymity? For the BBC it&#8217;s going to be very easy to get access to all kinds of data. Not least because they have resources to spend, but also because their reputation precedes them and the result of the sharing of data is immediate: &#8220;don&#8217;t you want to be on the telly&#8221;?<br />
Open data advocates may do well to band together to form an organization that can become the focal point for activism and, importantly, <i>trust</i>. Such an organization could recommend best practices, including auditing of data for privacy results. It could also put together a showcase of the end results: creative visualizations of published data. It may be easier to approach data owners as a member or representative of such a collective, open, distributed, collegial organization than as an independent interested hacker.<br />
But creating a compelling presentation is about more than just having the right technology and data. A good visualization tells a story. It&#8217;s through stories that data really comes alive. The open data movement needs the involvement of strongly creative people as much as (and perhaps more than) technology people.<br />
You need to be able to do more than animate a little black speck against a yellow band: where was that little ant going?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ldodds.com/blog/2008/08/ants-overlays-and-open-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Modern Palimpsest</title>
		<link>http://www.ldodds.com/blog/2005/12/the-modern-palimpsest/</link>
		<comments>http://www.ldodds.com/blog/2005/12/the-modern-palimpsest/#comments</comments>
		<pubDate>Fri, 16 Dec 2005 18:11:08 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Science and Technology]]></category>

		<guid isPermaLink="false">http://www.ldodds.com/lostboy/?p=258</guid>
		<description><![CDATA[The following is a brief summary of a talk I gave recently at the Ingenta Publisher Forum on the 28th November. The slides are available as a Powerpoint presentation.
In the presentation I tried to highlight some of the possibilities that could become available if academic publishers begin to share more metadata about the content they [...]]]></description>
			<content:encoded><![CDATA[<p>The following is a brief summary of a talk I gave recently at the Ingenta Publisher Forum on the 28th November. The slides are available as <a href="http://www.ingenta.com/corporate/downloads/ingenta/06_dodds.pps">a Powerpoint presentation</a>.<br />
In the presentation I tried to highlight some of the possibilities that could become available if academic publishers begin to share more metadata about the content they publish, ideally by engaging with the scientific community to expose &#8220;raw&#8221; data and results.</p>
<p><span id="more-258"></span><br />
The conceit around which I hung the presentation was the suggestion that the scientific paper is the modern equivalent of a <a href="http://en.wikipedia.org/wiki/Palimpsest">palimpsest</a>.<br />
A palimpsest, <a href="http://en.wikipedia.org/wiki/Palimpsest">as Wikipedia will tell you</a>, is a scroll or manuscript that has been written on, had its text scraped off, and then reused. The practice was common in medieval times when the costs of publishing were very high and it was cheaper to destroy a copy of another work than manufacture new parchment.<br />
There has been a great deal of success in extracting the original texts from these works. Probably the most famous example is the <a href="http://en.wikipedia.org/wiki/Archimedes_Palimpsest">Archimedes Palimpsest</a> (some <a href="http://www.thewalters.org/archimedes/frame.html">nice photos here</a>).<br />
The underlying text is known as the <i>scriptio inferior</i>, and may actually be more valuable than the more visible content.<br />
I likened the process of authoring a scientific paper to that of the creation of a palimpsest. Starting from original research results and working through the synthesis of a cogent explanation of the results or discovery, at each step the content becomes more abstracted from the original results, the previous work being &#8220;lost&#8221; to the reader.<br />
Data is presented in pre-analysed forms and is not amenable to reuse. Like the palimpsest, the raw data has not really been lost; it&#8217;s just not (easily) accessible to the reader.<br />
If the <i>scriptio inferior</i>, the underlying data, were made available to the reader, then a lot of interesting possibilities arise.<br />
This idea isn&#8217;t new of course. Many scientists have been pushing for this for many years. However, with the general trends towards open data sharing and &#8220;Web 2.0&#8221;, the time is perhaps ripe for the web development and scientific communities to engage with one another to try and make more of these ideas a reality.<br />
In my presentation I tried to stick to a pragmatic and practical line and demonstrate the possibilities by referring to actual examples. I ended up pointing to three:<br />
Firstly I demonstrated how I could re-plot some results of a geological study using <a href="http://maps.google.co.uk">Google Maps</a>. This served to highlight the interactivity in modern web applications, and illustrate how more compelling and dynamic interfaces can be made from existing data sets. The <a href="http://www.ldodds.com/tmp/map.html">very trivial demo</a> is available online. It wouldn&#8217;t be hard to envisage an application that amassed data from a range of related but independent studies in order to provide an alternative way of navigating through the corpus of documents.<br />
<a href="http://iSpecies.org">iSpecies</a> is a nice example of a science &#8220;mashup&#8221; that illustrates an alternative search interface for finding related content. I used the false results that can appear when performing simple keyword searches to reinforce the need for standard identifiers. (The need for a common, scoped identifier for authors is a particular hobby horse of mine.)<br />
I also showed the excellent <a href="http://www.hubmed.org/">HubMed</a> as an example of how both an alternative user interface can be better than the original, and also how content can be &#8220;enriched&#8221; by mixing in other sources. <a href="http://hublog.hubmed.org/archives/001236.html">The &#8220;terms&#8221; feature</a>, which dynamically links keywords in an abstract through to a number of data sources, demonstrates this very well. I used the fact that material can be sourced from user-contributed sources such as Wikipedia to promote the idea that content needn&#8217;t be fixed at the point of publication but can be annotated after the fact.<br />
The general theme of the forum was &#8220;reaching new markets&#8221; and I closed the presentation by suggesting that making more data open and the content more engaging might help promote the role of the amateur in science.<br />
There seems to be a lot of current interest in exploring possibilities available in &#8220;eScience&#8221;, <a href="http://del.icio.us/tag/webbyscience">webbyscience</a> and &#8220;<a href="http://openwetware.org/wiki/Science_2.0">Science 2.0</a>&#8221;; although I dislike that last term! The <a href="http://www.nature.com/nature/journal/v438/n7068/full/438531a.html">recent Nature articles</a> on these topics are also definitely worth reading.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ldodds.com/blog/2005/12/the-modern-palimpsest/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Nature Quote</title>
		<link>http://www.ldodds.com/blog/2005/11/nature-quote/</link>
		<comments>http://www.ldodds.com/blog/2005/11/nature-quote/#comments</comments>
		<pubDate>Fri, 25 Nov 2005 15:40:39 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Science and Technology]]></category>

		<guid isPermaLink="false">http://www.ldodds.com/lostboy/?p=255</guid>
		<description><![CDATA[There&#8217;s a short article in Nature (subscribers only I&#8217;m afraid) this week about Google Base and its potential impacts on the science community. In particular whether it might galvanise greater data sharing between scientists.
I&#8217;ve been corresponding with Declan Butler, the author of the piece, on this and some related topics recently, and he ended up [...]]]></description>
			<content:encoded><![CDATA[<p>There&#8217;s a short <a href="http://www.nature.com/nature/journal/v438/n7067/full/438400a.html">article in Nature</a> (subscribers only I&#8217;m afraid) this week about Google Base and its potential impacts on the science community. In particular whether it might galvanise greater data sharing between scientists.<br />
I&#8217;ve been corresponding with Declan Butler, the author of the piece, on this and some related topics recently, and he ended up quoting me:<br />
<cite></p>
]]></content:encoded>
			<wfw:commentRss>http://www.ldodds.com/blog/2005/11/nature-quote/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>WebCite</title>
		<link>http://www.ldodds.com/blog/2005/11/webcite/</link>
		<comments>http://www.ldodds.com/blog/2005/11/webcite/#comments</comments>
		<pubDate>Wed, 16 Nov 2005 23:56:19 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Science and Technology]]></category>

		<guid isPermaLink="false">http://www.ldodds.com/lostboy/?p=253</guid>
		<description><![CDATA[Alf Eaton posts today to point to the new WebCite service. This is going to be very useful. Don&#8217;t think so? Well there&#8217;s plenty of research to show that link atrophy is a big problem in scientific literature:
Persistence of Web References in Scientific Research
See also: A study of missing Web-cites in scholarly articles: towards an [...]]]></description>
			<content:encoded><![CDATA[<p>Alf Eaton <a href="http://hublog.hubmed.org/archives/001243.html">posts</a> today to point to the new <a href="http://www.webcitation.org/index">WebCite</a> service. This is going to be very useful. Don&#8217;t think so? Well there&#8217;s plenty of research to show that link atrophy is a big problem in scientific literature:<br />
<a href="http://doi.ieeecomputersociety.org/10.1109/2.901164">Persistence of Web References in Scientific Research</a><br />
See also: <a href="http://jis.sagepub.com/cgi/content/abstract/30/6/484?etoc">A study of missing Web-cites in scholarly articles: towards an evaluation framework</a> which reports that &#8220;<cite>[a]fter evaluating 2162 bibliographic references it was found that 48.1% (1041) of all citations used in the papers referred to a Web-located resource. A significant number of references to URLs were found to be missing (45.8%)&#8230;</cite>&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ldodds.com/blog/2005/11/webcite/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>iSpecies and taxonomy (no, not that kind)</title>
		<link>http://www.ldodds.com/blog/2005/11/ispecies-and-taxonomy-no-not-that-kind/</link>
		<comments>http://www.ldodds.com/blog/2005/11/ispecies-and-taxonomy-no-not-that-kind/#comments</comments>
		<pubDate>Thu, 03 Nov 2005 19:48:05 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Science and Technology]]></category>
		<category><![CDATA[Semantic Web]]></category>

		<guid isPermaLink="false">http://www.ldodds.com/lostboy/?p=245</guid>
		<description><![CDATA[For the last few years I&#8217;ve been lurking on a mailing list run by the Taxonomic Databases Working Group. It&#8217;s a low volume list used by scientists interested in capturing and marking up taxonomies. That&#8217;s taxonomy in the Linnaean sense not the semantic web sense. I&#8217;ve been lurking there since I wrote this paper a [...]]]></description>
			<content:encoded><![CDATA[<p>For the last few years I&#8217;ve been lurking on <a href="http://listserv.nhm.ku.edu/archives/tdwg-sdd.html">a mailing list</a> run by the <a href="http://www.tdwg.org/">Taxonomic Databases Working Group</a>. It&#8217;s a low volume list used by scientists interested in capturing and marking up taxonomies. That&#8217;s <a href="http://en.wikipedia.org/wiki/Taxonomy">taxonomy</a> in the Linnaean sense not the semantic web sense. I&#8217;ve been lurking there since I wrote <a href="http://www.ldodds.com/delta/">this paper</a> a while back proposing an XML format to replace a text based format that had been popular.<br />
Yesterday on the list this interesting little mash-up was announced: <a href="http://ispecies.org">ispecies.org</a>. It <a href="http://darwin.zoology.gla.ac.uk/~rpage/ispecies/how.html">works</a> by searching NCBI, Yahoo images and Google Scholar to attempt to find relevant information on biological species. <a href="http://darwin.zoology.gla.ac.uk/~rpage/ispecies/?q=Panthera+Leo">Lions for example</a>.<br />
I found it interesting mainly because it was one of the first mashups I&#8217;ve seen that isn&#8217;t a combination of the same old APIs (maps, music, bookmarks), but also because it&#8217;s clearly focused on a particular scientific community.<br />
The author, Rod Page (apparently a big RDF fan), built this as an off-shoot of a wider project that&#8217;s storing phylogenetic data as RDF. His site also has a <a href="http://darwin.zoology.gla.ac.uk/~rpage/portal/">Taxonomic Search Engine</a> which federates a number of taxonomic name databases. Perform a search and it links you to metadata about the organism. There&#8217;s a paper on the application on <a href="http://www.biomedcentral.com/1471-2105/6/48">BioMedCentral</a>.<br />
Given an LSID (Life Sciences Identifier) it turns out you can get RDF metadata about the organism. <a href="http://sp2000.org.lsid.zoology.gla.ac.uk/authority/metadata/?lsid=urn:lsid:sp2000.org.lsid.zoology.gla.ac.uk:record_id:575689">Lions for example</a>.<br />
There&#8217;s a lot of interesting mash-up potential in this data, as well as that available from <a href="http://del.icio.us/ldodds/classification">a few other projects</a> in this area.<br />
I&#8217;ve been keeping half an eye on this space recently, after reading <a href="http://www.nature.com/nbt/journal/v23/n9/abs/nbt1139.html">this paper</a> on how bioinformatics researchers are bumping into limits of XML and looking at RDF instead: &#8220;<cite>&#8230;the syntactic and document-centric XML cannot achieve the level of interoperability required by the highly dynamic and integrated bioinformatics applications</cite>&#8221;.<br />
These guys have a <i>lot</i> of data that needs integrating and merging. Modern classification is about much more than the old Linnaean system. It has to be able to merge together data sources ranging from molecular biology through to field observations, and depending on what sources you draw on, and from what level, the tree of life can be drawn quite differently.<br />
The early web was pioneered in part by the needs of scientists exchanging research papers. It strikes me that &#8220;eScience&#8221; and bioinformatics may very well become the driving forces behind a more semantic web.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ldodds.com/blog/2005/11/ispecies-and-taxonomy-no-not-that-kind/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Information Aesthetics</title>
		<link>http://www.ldodds.com/blog/2005/09/information-aesthetics/</link>
		<comments>http://www.ldodds.com/blog/2005/09/information-aesthetics/#comments</comments>
		<pubDate>Fri, 16 Sep 2005 13:35:02 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Science and Technology]]></category>

		<guid isPermaLink="false">http://www.ldodds.com/lostboy/?p=236</guid>
		<description><![CDATA[I don&#8217;t normally do link blogging, but the information aesthetics blog is too cool not to share, where else can you read about an augmented reality kitchen, the gori node garden, or street clocks?
No attribution as I can&#8217;t remember where I discovered it. Quite possibly via oishii! which is often a source of my random [...]]]></description>
			<content:encoded><![CDATA[<p>I don&#8217;t normally do link blogging, but the <a href="http://infosthetics.com/">information aesthetics</a> blog is too cool not to share, where else can you read about an <a href="http://infosthetics.com/archives/2005/09/augmented_reality_kitchen.html">augmented reality kitchen</a>, the <a href="http://infosthetics.com/archives/2005/09/gori_node_garden.html">gori node garden</a>, or <a href="http://infosthetics.com/archives/2005/07/streetclock.html">street clocks</a>?<br />
No attribution as I can&#8217;t remember where I discovered it. Quite possibly via <a href="http://opencontent.org/oishii/">oishii!</a> which is often a source of my random browsing.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ldodds.com/blog/2005/09/information-aesthetics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Konfabulator</title>
		<link>http://www.ldodds.com/blog/2005/01/konfabulator/</link>
		<comments>http://www.ldodds.com/blog/2005/01/konfabulator/#comments</comments>
		<pubDate>Fri, 07 Jan 2005 02:19:39 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Science and Technology]]></category>

		<guid isPermaLink="false">http://www.ldodds.com/lostboy/?p=171</guid>
		<description><![CDATA[Via Catalogablog I&#8217;ve just learnt that Konfabulator is available for Windows. Looked interesting, so I installed it.
I&#8217;m in love.
Looking forward to seeing this del.icio.us based widget.
]]></description>
			<content:encoded><![CDATA[<p>Via <a href="http://catalogablog.blogspot.com/2005/01/widgets.html">Catalogablog</a> I&#8217;ve just learnt that <a href="http://www.konfabulator.com/">Konfabulator</a> is available for Windows. Looked interesting, so I installed it.<br />
I&#8217;m in love.<br />
Looking forward to seeing <a href="http://www.familytimes.com/node/view/560">this del.icio.us based widget</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ldodds.com/blog/2005/01/konfabulator/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Working In A Small World</title>
		<link>http://www.ldodds.com/blog/2004/09/working-in-a-small-world/</link>
		<comments>http://www.ldodds.com/blog/2004/09/working-in-a-small-world/#comments</comments>
		<pubDate>Wed, 01 Sep 2004 17:08:20 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Science and Technology]]></category>

		<guid isPermaLink="false">http://www.ldodds.com/lostboy/?p=157</guid>
		<description><![CDATA[Stumbled over these musings on how small world theory applies to company organization. They&#8217;ve been languishing in my personal wiki for many months, thought I might as well post them as is.
Whilst reading the first few chapters of &#8220;Small World&#8221; by Mark Buchanan, I was fascinated by the work of Granovetter (see &#8220;The Strength of [...]]]></description>
			<content:encoded><![CDATA[<p><i>Stumbled over these musings on how small world theory applies to company organization. They&#8217;ve been languishing in my personal wiki for many months, thought I might as well post them as is.</i><br />
Whilst reading the first few chapters of &#8220;<a href="http://www.amazon.co.uk/exec/obidos/ASIN/075381689X">Small World</a>&#8221; by Mark Buchanan, I was fascinated by the work of Granovetter (see &#8220;The Strength of Weak Ties&#8221;). This basically highlights the fact that it is weak ties between individuals that are the important ones in a social network; not strong ties as one would expect. People with strong ties in common often have strong ties between them also, hence these links are less important than weak ties (acquaintances) as their removal has little effect on the structure of the graph (as measured in number of degrees between points). Previous descriptions I&#8217;ve read about small world phenomena have focussed on hubs/authorities, which is a much less human-centric metaphor; quite rightly perhaps, as &#8220;small worldism&#8221; isn&#8217;t tied to any particular type of graph, but it&#8217;s not very evocative.<br />
This led me to thinking about relationships within companies. Exploiting social networks to find work, etc. seems well explored; indeed it&#8217;s behind the current drive for many of the social networking sites and applications that are springing up at the moment. Work relationships seem like a different framework within which to explore small world phenomena. Or at least it&#8217;s the one that occurred to me whilst washing up after dinner.</p>
<p><span id="more-157"></span><br />
So some thoughts on this:</p>
<ul>
<li>encouraging small world social graphs in a company is beneficial to the flow of information. This is &#8220;small world for spreading memes&#8221; in the microcosm. However it can also be detrimental, as the same social structure encourages spreading of other kinds of information: gossip and rumours. We might take from this that even if morale is low in a company, at least the lines of communication are still open.
</li>
<li>that networking is important is not really news to anyone, but small world studies prove it&#8217;s effective, and support all those fluffy corporate events.
</li>
<li>that the optimum corporate structure isn&#8217;t hierarchical, neither is it completely decentralised; it&#8217;s somewhere in between. Networks that lie in the middle of these two should gain the most benefits (stability, but still good sharing of knowledge)
</li>
<li>that inter-team communication is as important as bonding in a team, and that the manager alone shouldn&#8217;t be the gateway between the team and the rest of the company. If the manager leaves, or is on leave, then you&#8217;ve lost the all-important weak links and while the team may still stay cohesive they can become isolated.</li>
<li>that as an individual, your role in a company can be secured by networking with others. However the detrimental side of this is that as you quickly become a &#8220;hub&#8221; (people come to you for information because you can find it quicker, or know who to refer them to) the many communication channels can distract you from your key role, and also lift you away from the work that you&#8217;re interested in. A company would be wise to acknowledge its hubs, but immediately route around them so that they have backups and so that those weak links don&#8217;t cripple the company during leave/job changes.</li>
<li>it&#8217;s the small odd little tasks that people do, those that have them interact with a slightly wider social circle, that can be the most important: they build weak links between teams. It&#8217;s too easy to re-organize and draw lines, saying &#8220;this isn&#8217;t something that Team X should do&#8221;, but then you&#8217;re further isolating Team X.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.ldodds.com/blog/2004/09/working-in-a-small-world/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Champernowne&#8217;s Constant</title>
		<link>http://www.ldodds.com/blog/2004/08/champernownes-constant/</link>
		<comments>http://www.ldodds.com/blog/2004/08/champernownes-constant/#comments</comments>
		<pubDate>Wed, 01 Sep 2004 00:42:18 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Science and Technology]]></category>

		<guid isPermaLink="false">http://www.ldodds.com/lostboy/?p=155</guid>
		<description><![CDATA[Whilst reading von Baeyer&#8217;s &#8216;Information&#8217; recently, I came across the following fun mathematical tidbit which I thought was worth sharing. Mainly because I couldn&#8217;t find many references to it elsewhere on the &#8216;net.
In the chapter on &#8220;Randomness&#8221;, von Baeyer introduces several definitions of the term &#8220;random&#8221;, iteratively showing how each is slightly flawed. Considering a [...]]]></description>
			<content:encoded><![CDATA[<p>Whilst reading <a href="http://www.amazon.co.uk/exec/obidos/tg/detail/-/0674013875/">von Baeyer&#8217;s &#8216;Information&#8217;</a> recently, I came across the following fun mathematical tidbit which I thought was worth sharing. Mainly because I couldn&#8217;t find many references to it elsewhere on the &#8216;net.<br />
In the chapter on &#8220;Randomness&#8221;, von Baeyer introduces several definitions of the term &#8220;random&#8221;, iteratively showing how each is slightly flawed. Considering a binary sequence of digits, the first definition describes a random number as one in which there is no pattern to the series of 1&#8217;s and 0&#8217;s. However, a sequence such as 000110000100 is not random as it has an unequal proportion of the binary digits. A slightly improved definition is one which states that the numbers of each digit are approximately equal. But not only that: the combinations of the two digits (00, 01, 10, 11) must also occur in roughly equal proportions. And so on for combinations of three, four, five digits. Sequences that meet this restriction are apparently known as &#8220;normal numbers&#8221;.<br />
The first explicit (rather than theoretical) example of a normal number is <a href="http://mathworld.wolfram.com/ChampernowneConstant.html">Champernowne&#8217;s Constant</a> which was produced (discovered?) in 1933. David Champernowne pointed out that if one starts with zero, then one, then strings together all four possible pairings, then all eight triples, and so on, one ends up with a number which must, by construction, contain all possible patterns, and is therefore &#8220;normal&#8221;.<br />
Von Baeyer then points out that this number in its binary form is &#8220;<cite>a fabulous object. Using Morse code, or some other translation of zeroes and ones into typographical symbols, it can be transformed into a string of letters, spaces and punctuation marks. Since every conceivable finite sequence of words is buried somewhere in the string&#8217;s tedious gobbledygook, every poem, every traffic ticket, every love letter and every novel ever written, or ever to be composed in the future is there in that string&#8230;You may have to travel out along the string for billions of light years before you find them, but they are all in there somewhere&#8230;.</cite>&#8221; (pp101-102).<br />
So who needs a million chimpanzees with typewriters? Distributed computing project anyone? <grin/></p>
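<p>For anyone tempted, the construction is simple enough to sketch in a few lines of Python. This is only an illustration of the sequence as described above (zero, one, then every pair, then every triple, and so on), not code from von Baeyer&#8217;s book; the function names <code>champernowne_bits</code> and <code>prefix</code> are made up for the example.</p>
<pre>
```python
from itertools import islice, product


def champernowne_bits():
    """Yield the digits of 0, 1, 00, 01, 10, 11, 000, ... one at a time."""
    length = 1
    while True:
        # product("01", repeat=length) visits every block of this length
        # in counting order, e.g. 00, 01, 10, 11 for length 2.
        for block in product("01", repeat=length):
            yield from block
        length += 1


def prefix(n):
    """Return the first n digits of the sequence as a string."""
    return "".join(islice(champernowne_bits(), n))


# The first ten digits cover every block of length one and two.
print(prefix(10))

# All blocks up to length 12 fit in the first 90114 digits, so any
# 12-digit pattern, including the example from earlier, appears by then.
print("000110000100" in prefix(90114))
```
</pre>
<p>Each extra digit in the pattern roughly doubles how far out the construction guarantees you will find it, which gives some feel for why the love letters are so far along the string.</p>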
]]></content:encoded>
			<wfw:commentRss>http://www.ldodds.com/blog/2004/08/champernownes-constant/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Searching Small Worlds</title>
		<link>http://www.ldodds.com/blog/2004/01/searching-small-worlds/</link>
		<comments>http://www.ldodds.com/blog/2004/01/searching-small-worlds/#comments</comments>
		<pubDate>Fri, 16 Jan 2004 18:36:15 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Science and Technology]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Web]]></category>

		<guid isPermaLink="false">http://www.ldodds.com/lostboy/?p=114</guid>
		<description><![CDATA[Interesting &#8220;small world&#8221; article in New Scientist this week (&#8220;Know Thy Neighbour&#8221;, January 17 2004, Mark Buchanan), this time discussing how people and information can be located within a small world network.
The essay discusses Milgram&#8217;s famous experiment in which he asked people to attempt to route a letter, via their contacts, to a given person. [...]]]></description>
			<content:encoded><![CDATA[<p>Interesting &#8220;small world&#8221; article in New Scientist this week (&#8220;Know Thy Neighbour&#8221;, January 17 2004, Mark Buchanan), this time discussing how people and information can be located within a small world network.<br />
The essay discusses Milgram&#8217;s famous experiment in which he asked people to attempt to route a letter, via their contacts, to a given person. Most of the letters got there within a small number of hops and apparently the strategy that most people, quite naturally, adopted was along the lines of &#8220;Mr X (the end-point) works in the financial sector, who else do I know that works in that sector&#8230;&#8221;. In essence people were comparing their contacts with what they knew about the end point, categorising them into groups.<br />
Groups are therefore an important feature of small world networks that are &#8220;searchable&#8221;. Classifying nodes in this way allows your local knowledge of the network (your contacts) to help manipulate it. In the case of the Milgram experiment, that manipulation was to use people to route letters; however, the New Scientist article suggests that similar techniques could be used to benefit internet search engines.</p>
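<p>The letter-routing strategy can be sketched as a greedy search: each holder only knows their own contacts, and passes the letter to whichever contact shares the most groups with the end point. This is a toy illustration under made-up assumptions (the names, groups and the <code>route</code> function are all hypothetical), not the model from the article.</p>

```python
def route(contacts, groups, start, target, max_hops=10):
    """Greedily forward a letter from start to target using only local knowledge.

    Returns the list of people the letter passed through, or None if it got stuck
    or ran out of hops.
    """
    path = [start]
    current = start
    while current != target and len(path) <= max_hops:
        if target in contacts[current]:
            # The holder knows the end point directly.
            nxt = target
        else:
            # Never hand the letter back to someone who has already held it.
            candidates = [c for c in contacts[current] if c not in path]
            if not candidates:
                return None
            # "Who else do I know that works in that sector?"
            nxt = max(candidates, key=lambda c: len(groups[c] & groups[target]))
        path.append(nxt)
        current = nxt
    return path if current == target else None


# Hypothetical network: who knows whom, and which groups each person belongs to.
contacts = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "erin"],
    "dave": ["bob"],
    "erin": ["carol", "mr_x"],
    "mr_x": ["erin"],
}
groups = {
    "alice": {"oxford"},
    "bob": {"oxford", "cycling"},
    "carol": {"finance"},
    "dave": {"cycling"},
    "erin": {"finance", "london"},
    "mr_x": {"finance", "london"},
}

print(route(contacts, groups, "alice", "mr_x"))
```

<p>Here alice picks carol over bob purely because carol shares the &#8220;finance&#8221; group with Mr X; nobody in the chain ever needs a map of the whole network.</p>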
<p><span id="more-114"></span><br />
<a href="http://informatics.indiana.edu/fil/default.asp">Filippo Menczer</a> at Indiana University is carrying out research in this area. His list of <a href="http://informatics.indiana.edu/fil/papers.asp">papers</a> is online, and a quick surf through them makes interesting reading.<br />
For example in <a href="http://informatics.indiana.edu/fil/Papers/TOIT.pdf">Topical Web Crawlers: Evaluating Adaptive Algorithms</a> (PDF) Menczer <i>et al</i> describe &#8220;topical crawlers&#8221; (emphasis mine):<br />
<cite><br />
Topical crawlers (also known as focused crawlers) respond to the particular information<br />
needs expressed by topical queries or interest profiles. These could be the<br />
needs of an individual user (query time or online crawlers) or those of a community<br />
with shared interests (topical or vertical search engines and portals). Topical<br />
crawlers support decentralizing the crawling process, which is a more scalable approach&#8230;<b>An additional benefit is that such crawlers can be driven by a rich context (topics, queries, user profiles) within which to interpret pages and select the links to be visited</b>.<br />
</cite><br />
In other words, the crawler can get away with indexing fewer pages as it&#8217;s guided to the most relevant material by other cues. The paper <a href="http://informatics.indiana.edu/fil/Papers/se-crawler.pdf">Search Engine-Crawler Symbiosis: Adapting to Community Interests</a> describes how a community search engine can improve web crawler performance, and vice versa, through learning the community&#8217;s interests.<br />
It seems to me that FOAF could play a role here: rather than rely solely on machine learning to discover information about a document and community interests, one could explicitly gather that data from the aggregated FOAF descriptions of that community.<br />
E.g. using a FOAF description one can not only determine the <a href="http://xmlns.com/foaf/0.1/#term_interest">interests</a> of a <a href="http://xmlns.com/foaf/0.1/#term_Person">person</a> linking to a given <a href="http://xmlns.com/foaf/0.1/#term_Document">document</a>, but one can also determine the interests of the <a href="http://xmlns.com/foaf/0.1/#term_maker">author</a> of that document, assuming there are appropriate links from the HTML to the FOAF description (cf: <a href="http://rdfweb.org/topic/Autodiscovery">autodiscovery</a>, and <a href="http://usefulinc.com/foaf/iMadeThis">I Made This</a>). There&#8217;s even a <a href="http://xmlns.com/foaf/0.1/#term_Group">grouping</a> mechanism that can further help search agents to adapt their paths, using the same technique as Milgram&#8217;s test subjects. Again the algorithm seems natural: if you want to learn more about a particular topic or event, you&#8217;d start looking at the websites, documents and blogs of people you know are interested in that area.<br />
Cool stuff. Just wish I had the maths to understand it all properly!</p>
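<p>A rough sketch of how FOAF data might steer a crawler, assuming the maker and interest information has already been harvested into plain dictionaries. Everything here is hypothetical (the URLs, the people, the <code>topical_crawl</code> function), and it is a simple best-first crawl rather than the adaptive algorithms from Menczer&#8217;s papers.</p>

```python
import heapq


def topical_crawl(pages, interests, seed, topic, budget):
    """Visit up to budget pages, preferring pages whose maker lists topic as an interest."""

    def score(url):
        # 1 if the page's maker is known to be interested in the topic, else 0.
        maker = pages[url]["maker"]
        return 1 if topic in interests.get(maker, set()) else 0

    # heapq is a min-heap, so scores are negated; ties fall back to URL order.
    frontier = [(-score(seed), seed)]
    seen = {seed}
    visited = []
    while frontier and len(visited) < budget:
        _, url = heapq.heappop(frontier)
        visited.append(url)
        for link in pages[url]["links"]:
            if link in pages and link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score(link), link))
    return visited


# Hypothetical web: each page records its maker and its outgoing links.
pages = {
    "http://example.org/home": {
        "maker": "alice",
        "links": [
            "http://example.org/cats",
            "http://example.org/rdf",
            "http://example.org/foaf",
        ],
    },
    "http://example.org/cats": {"maker": "bob", "links": ["http://example.org/more-cats"]},
    "http://example.org/more-cats": {"maker": "bob", "links": []},
    "http://example.org/rdf": {"maker": "carol", "links": ["http://example.org/sparql"]},
    "http://example.org/foaf": {"maker": "carol", "links": []},
    "http://example.org/sparql": {"maker": "dave", "links": []},
}
# Hypothetical interests, as might be aggregated from FOAF descriptions.
interests = {
    "alice": {"blogging"},
    "bob": {"cats"},
    "carol": {"semantic web"},
    "dave": {"semantic web"},
}

print(topical_crawl(pages, interests, "http://example.org/home", "semantic web", 4))
```

<p>With a budget of four pages the crawl never touches the cat pages: links to documents made by people interested in the topic always jump the queue.</p>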
]]></content:encoded>
			<wfw:commentRss>http://www.ldodds.com/blog/2004/01/searching-small-worlds/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

