Patterns of Intermediation

Maintained by Leigh Dodds. See acknowledgements for a full list of contributors.

Version 0.2, 4th April 2005

Introduction

This is a working document whose aim is to define and describe a number of design patterns suitable for use when writing web intermediaries: bookmarklets, Greasemonkey scripts, or dynamic web interfaces, e.g. using XMLHttpRequest.

The initial impetus for documenting these patterns is described in this posting to my blog: Patterns of Intermediation.

To provide feedback on this document please email: leigh@ldodds.com. To publicly comment on this document, e.g. in a blog posting, please use the tag.

The ultimate goal is to move this content into an open public Wiki so that it can be collaboratively extended. I thought it useful to have some editorial input in its early stages, and I don't yet have time to set up a dedicated Wiki.

Cataloguing Existing Practices

In this blog posting I suggested 3 axes to consider when classifying existing practices and designs. Using a classification scheme will help further the discovery and discussion of individual patterns.

The axes are as follows: Data Source, Action, and Trigger.

Each of these axes is described further below.

Data Source

Where does the intermediary find the data that it acts upon?

Action

What does the intermediary do with the data?

Trigger

How does the intermediary get invoked?

Data Patterns

URL Parser

Parse the current URL, using regexps, to extract identifiers or key fields.
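A minimal sketch of this pattern, assuming a product page whose URL carries a 10-character identifier after a "/dp/" or "/ASIN/" path segment. The path layout and the function name `extractProductId` are assumptions made for illustration, not something this document specifies.

```typescript
// Hypothetical URL layout: .../dp/<id> or .../ASIN/<id>, where <id> is
// nine digits followed by a digit or "X" (the shape of an ISBN-10).
const PRODUCT_ID = /\/(?:dp|ASIN)\/([0-9]{9}[0-9X])(?:[\/?#]|$)/;

// Return the identifier found in the URL, or null when there is none.
function extractProductId(url: string): string | null {
  const match = PRODUCT_ID.exec(url);
  if (match === null) return null;
  const id = match[1];
  return id === undefined ? null : id;
}
```

In a bookmarklet or user script the input would typically be `window.location.href`.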

Screen Scraper

Run regexps or perform DOM manipulations to extract text from the document body.
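As a rough illustration of the regexp half of this pattern, the sketch below pulls ISBN-10 values out of plain text. It assumes the page labels them with the word "ISBN"; that labelling convention and the name `scrapeIsbns` are illustrative assumptions.

```typescript
// Collect distinct ISBN-10 values that follow an "ISBN" label in the text.
function scrapeIsbns(text: string): string[] {
  const pattern = /\bISBN(?:-10)?:?\s*([0-9]{9}[0-9Xx])\b/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  // exec() with the "g" flag resumes from the end of the previous match.
  while ((match = pattern.exec(text)) !== null) {
    const isbn = match[1];
    if (isbn !== undefined && found.indexOf(isbn) === -1) {
      found.push(isbn);
    }
  }
  return found;
}
```

The text to search could come from the `textContent` of the document body, so that markup is ignored.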

Link Scanner

Find and extract links that meet specific criteria from the page. E.g. find all Amazon links.
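One way to sketch the "find all Amazon links" example is to filter a list of link targets by host name. The helper names `hostOf` and `scanLinks` are hypothetical, and the host parsing is deliberately simple (it ignores user-info sections in URLs).

```typescript
// Extract the lower-cased host from an absolute http(s) URL, or null.
function hostOf(href: string): string | null {
  const match = /^https?:\/\/([^\/?#:]+)/i.exec(href);
  if (match === null) return null;
  const host = match[1];
  return host === undefined ? null : host.toLowerCase();
}

// Keep only the links whose host is `domain` or one of its subdomains.
function scanLinks(hrefs: string[], domain: string): string[] {
  const suffix = "." + domain;
  return hrefs.filter((href) => {
    const host = hostOf(href);
    if (host === null) return false;
    return host === domain || host.slice(-suffix.length) === suffix;
  });
}
```

In a page, the `hrefs` array could be built from the `href` property of each entry in `document.links`.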

Distinct from Screen Scraper because that pattern refers specifically to searching the text, i.e. content in the page that is not marked up, for particular data patterns. Related to Semantic Markup Reader in that markup cues, e.g. "this is an anchor", are used to extract the data; the difference is that

Semantic Markup Reader

Use the browser DOM to find markup elements present in the document which contain data of interest. The page author introduces semantic markup into the page explicitly to make it easier to consistently style and/or find certain page elements. See XFN, and Semantic XHTML for example.
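Taking XFN as the example, relationships are carried in the `rel` attribute of anchors, so a reader only needs to look for a given `rel` value. The sketch below works on a minimal anchor shape (`AnchorLike`, a hypothetical interface) rather than real DOM nodes so that it stays self-contained.

```typescript
// The two anchor properties this sketch relies on.
interface AnchorLike {
  href: string;
  rel: string;
}

// Return the targets of all anchors whose rel attribute lists `value`.
// rel is a whitespace-separated list, e.g. rel="friend met".
function findByRel(anchors: AnchorLike[], value: string): string[] {
  return anchors
    .filter((a) => a.rel.split(/\s+/).indexOf(value) !== -1)
    .map((a) => a.href);
}
```

Browser anchor elements expose both `href` and `rel`, so the result of `document.getElementsByTagName("a")` could be copied into an array and passed in.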

Service Invoker

Invoke a service to fetch data. May be done either synchronously or asynchronously. It's likely that one of the other data patterns will be used to provide data to the service.

The service may use other integration patterns (need a new category here) to send its response: e.g. XML, or the Serialized Object Model pattern (cf. JSON).
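The two ends of a service call can be sketched without committing to a transport: build the request URL from data found by another pattern, then validate the JSON that comes back. The query parameters, the `BookInfo` shape, and both function names are assumptions for illustration; the request itself (for example via XMLHttpRequest) is left out.

```typescript
// Hypothetical response shape for an imaginary book lookup service.
interface BookInfo {
  title: string;
  available: boolean;
}

// Build the request URL; the "isbn" and "format" parameters are invented.
function buildLookupUrl(baseUrl: string, isbn: string): string {
  return baseUrl + "?isbn=" + encodeURIComponent(isbn) + "&format=json";
}

// Parse and check a JSON response body; return null if it is malformed.
function parseLookupResponse(body: string): BookInfo | null {
  let data: unknown;
  try {
    data = JSON.parse(body);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const record = data as { title?: unknown; available?: unknown };
  if (typeof record.title !== "string" || typeof record.available !== "boolean") {
    return null;
  }
  return { title: record.title, available: record.available };
}
```

Checking the response shape before use matters more than usual here, since the intermediary is acting on data from a service it does not control.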

Temporary Form

Invite structured user input by displaying a new dynamically generated document, whose form submit triggers further processing by the intermediary. Could be used to generate entirely self-contained intermediaries, or to prompt for additional data that may then be used by a Service Invoker.

Page Context Reader

by Mark Pilgrim

Editorial Note: The following specific data sources were suggested and documented by Mark Pilgrim. I've initially gathered these together into a general class of data source pattern. They can be expanded on later as the respective techniques become better understood -- LRD

This is a general class of pattern that extracts data from the current page context. Parsing the current URL is covered in URL Parser, and the page text and/or markup in Screen Scraper and Semantic Markup Reader. But there are other sources of data available from the current request context:

HTTP headers: Imagine a GM script that looks at the HTTP_REFERER and, if the user came to the current page from a search engine, highlights the search terms automatically. (Individual sites do this, but it makes much more sense on the client side.)

Cookies: A local script can access the cookies of the current page, since it operates in the page's context. All sorts of mischief is possible here (not the least of which is stealing the cookies and posting them to a remote server).

Page styles: For example, a script that changes Arial to Helvetica. See Restyler for the action of changing a page's style.
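The first two sources can be sketched as plain string handling. The code assumes the search engine passes its query in a `q` parameter (an assumption; engines differ) and that cookies arrive in the `name=value; name=value` form that `document.cookie` returns. Both function names are hypothetical.

```typescript
// Pull the search terms out of a referrer URL, assuming a "q" parameter.
function searchTermsFromReferrer(referrer: string): string[] {
  const match = /[?&]q=([^&#]*)/.exec(referrer);
  if (match === null) return [];
  const raw = match[1];
  if (raw === undefined) return [];
  let decoded: string;
  try {
    // "+" stands for a space in query strings.
    decoded = decodeURIComponent(raw.replace(/\+/g, " "));
  } catch {
    return [];
  }
  return decoded.split(/\s+/).filter((term) => term.length > 0);
}

// Split a cookie string such as "a=1; b=2" into a name/value lookup.
function parseCookies(cookieString: string): Record<string, string> {
  const cookies: Record<string, string> = {};
  for (const part of cookieString.split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue;
    const name = part.slice(0, eq).trim();
    if (name.length > 0) cookies[name] = part.slice(eq + 1).trim();
  }
  return cookies;
}
```

In a page these would be fed from `document.referrer` and `document.cookie` respectively.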

External Context Reader

by Mark Pilgrim

Editorial Note: The following specific data sources were suggested and documented by Mark Pilgrim. I've initially gathered these together into a general class of data source pattern. They can be expanded on later as the respective techniques become better understood -- LRD

This is a general class of pattern that extracts data from outside of the current request context. Retrieving data from an external service is covered in Service Invoker. Retrieving data from the user is covered in Temporary Form. But there are other sources of data available from outside the current request context:

Local Environment: window size, operating system, etc. For example, a GM script that resizes page fonts based on current window size.

Current Date and Time: Imagine a GM script that made text brighter at night, or one that sanitized dirty words -- but only between 6 AM and 6 PM.
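The time-of-day example might look something like the sketch below, which masks a list of words only between 6 AM and 6 PM local time. The word list is supplied by the caller, and the names `isDaytime` and `sanitize` are illustrative.

```typescript
// True between 06:00 and 18:00 in the local time zone of `now`.
function isDaytime(now: Date): boolean {
  const hour = now.getHours();
  return hour >= 6 && hour < 18;
}

// Replace each blocked word with asterisks, but only during the day.
function sanitize(text: string, blocked: string[], now: Date): string {
  if (!isDaytime(now)) return text;
  let result = text;
  for (const word of blocked) {
    // Escape regexp metacharacters so the word is matched literally.
    const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    const mask = new Array(word.length + 1).join("*");
    result = result.replace(new RegExp("\\b" + escaped + "\\b", "gi"), mask);
  }
  return result;
}
```

Passing the clock in as a parameter, rather than calling `new Date()` inside, keeps the external context explicit and makes the behaviour easy to check at any hour.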

Action Patterns

Redirector

Send the user to an alternate location, e.g. a library OPAC rather than Amazon.
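A sketch of the OPAC example: take the identifier from the current URL and compute where to send the user instead. The "/dp/" and "/ASIN/" path layout, the OPAC search URL, and the name `redirectTarget` are all assumptions for illustration.

```typescript
// Compute the alternate location for a product page, or null if the
// current URL carries no recognisable ISBN-10.
function redirectTarget(currentUrl: string, opacSearchUrl: string): string | null {
  const match = /\/(?:dp|ASIN)\/([0-9]{9}[0-9X])(?:[\/?#]|$)/.exec(currentUrl);
  if (match === null) return null;
  const isbn = match[1];
  if (isbn === undefined) return null;
  // opacSearchUrl is expected to end with the query parameter, e.g. "...?isbn=".
  return opacSearchUrl + encodeURIComponent(isbn);
}
```

In a browser, a non-null result would be assigned to `window.location.href` to perform the redirect.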

Annotator

General pattern describing any additions to the current page, e.g. markup or text.

Link Rewriter

A specific Annotator pattern: find and rewrite links on the current page, targeting them at an alternate service or location. E.g. Barnes and Noble not Amazon; Library OpenURL resolver, not a specific A&I database, etc.
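The rewriting step can be sketched over any objects with a writable `href`, which is what anchor elements provide. The URL layout being matched and the resolver address passed in are assumptions; `LinkLike` and `rewriteLinks` are hypothetical names.

```typescript
// Anything with a writable href, such as a browser anchor element.
interface LinkLike {
  href: string;
}

// Point every matching Amazon product link at another service instead.
// Returns the number of links that were rewritten.
function rewriteLinks(links: LinkLike[], resolverBase: string): number {
  const amazon = /^https?:\/\/(?:[^\/]*\.)?amazon\.com\/(?:.*\/)?(?:dp|ASIN)\/([0-9]{9}[0-9X])(?:[\/?#]|$)/;
  let rewritten = 0;
  for (const link of links) {
    const match = amazon.exec(link.href);
    if (match === null) continue;
    const isbn = match[1];
    if (isbn === undefined) continue;
    link.href = resolverBase + encodeURIComponent(isbn);
    rewritten++;
  }
  return rewritten;
}
```

Links that do not match are left untouched, so the script is safe to run on pages that mix product links with ordinary navigation.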

Link Dropper

Add additional links to the page. E.g. finding company or product names and making them hyperlinks; discovering ISBN or ISSN numbers and linking those; dropping in a list of RSS feeds or related pages.
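For the ISBN case, a sketch might scan text for ten-character candidates, keep only those with a valid ISBN-10 check digit, and wrap them in links. The lookup address is supplied by the caller and is assumed to be safe to place in an attribute; `isValidIsbn10` and `dropIsbnLinks` are illustrative names.

```typescript
// ISBN-10 check: weights 10 down to 1, with "X" standing for 10 in the
// last position; the weighted sum must be divisible by 11.
function isValidIsbn10(candidate: string): boolean {
  if (!/^[0-9]{9}[0-9X]$/.test(candidate)) return false;
  let sum = 0;
  for (let i = 0; i < 10; i++) {
    const ch = candidate.charAt(i);
    const value = ch === "X" ? 10 : parseInt(ch, 10);
    sum += (10 - i) * value;
  }
  return sum % 11 === 0;
}

// Wrap each valid ISBN-10 found in the text in a link to lookupBase.
function dropIsbnLinks(text: string, lookupBase: string): string {
  return text.replace(/\b[0-9]{9}[0-9X]\b/g, (candidate: string) =>
    isValidIsbn10(candidate)
      ? '<a href="' + lookupBase + encodeURIComponent(candidate) + '">' + candidate + "</a>"
      : candidate
  );
}
```

The check digit test is what keeps arbitrary ten-digit numbers, such as phone numbers, from being turned into links.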

Interface Generator

Pop up a form for the user to interact with. E.g. del.icio.us posting tools. Used to create quick posting tools for sites, enhance posting tools for existing sites, etc.

Another example may be to visualise incoming links to this page. Or view related pages based on tag metadata.

This is distinct from Temporary Form in that here the generation of the interface is the ultimate goal of the intermediary, whereas in that pattern the generated form is used to provide input to the intermediary.

Restyler

by Mark Pilgrim

Change the current page style, e.g. a script that changes Arial to Helvetica. Could be used to improve readability, accessibility, etc.
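The Arial to Helvetica example reduces to rewriting a `font-family` value. The sketch below does only that string step; `swapFont` is a hypothetical name, and applying the result to page elements is left to the surrounding script.

```typescript
// Replace one font name in a comma-separated font-family list,
// ignoring case and any quotes around the name.
function swapFont(fontFamily: string, from: string, to: string): string {
  return fontFamily
    .split(",")
    .map((name) => {
      const trimmed = name.trim();
      const bare = trimmed.replace(/^["']|["']$/g, "");
      return bare.toLowerCase() === from.toLowerCase() ? to : trimmed;
    })
    .join(", ");
}
```

In a page, the input could be read with `getComputedStyle(element).fontFamily` and the result written back to `element.style.fontFamily`.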

Blocker

by Mark Pilgrim

The reverse of Annotator: remove content from the page.

"Kill the ads" is the #1 site-specific request on http://dunck.us/collab/GreaseMonkeyUserScriptRequest. Possibly better done with other technologies, but it's an action that's arguably distinct from "annotate".

Rearranger

by Mark Pilgrim

Rearrange the content/structure of the page. There is a GM script that takes the main content on CNN stories and moves it to the top of the page.

Trigger Patterns

Button Press

Standard JavaScript bookmarklet. A user-initiated action.
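A bookmarklet is a `javascript:` URL saved as a bookmark, so packaging a script for this trigger is mostly a matter of wrapping and encoding it. The sketch below shows one way to do that; `toBookmarklet` is a hypothetical helper.

```typescript
// Turn a snippet of script source into a javascript: URL.
// The snippet is wrapped in an immediately invoked function so that
// its variables do not leak into the page.
function toBookmarklet(source: string): string {
  const wrapped = "(function(){" + source + "})();";
  return "javascript:" + encodeURIComponent(wrapped);
}
```

For example, `toBookmarklet("alert(document.title)")` produces a URL that can be used as the `href` of a link for the user to drag to their bookmarks toolbar.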

Event Driven

Trigger processing based on some event, e.g. loading of a page from a particular URL. There may be specific patterns to elucidate here: e.g. loading of a page, receipt of an event from a system such as mod_pubsub.
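For the page-load case, Greasemonkey-style scripts declare the URLs they apply to as patterns with `*` wildcards. A rough sketch of that matching step follows; it is an approximation written for illustration, not Greasemonkey's own implementation, and the function names are hypothetical.

```typescript
// Convert a wildcard pattern such as "http://*.example.com/*" into a
// regular expression that must match the whole URL.
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape everything except "*"
    .replace(/\*/g, ".*");                 // "*" matches any run of characters
  return new RegExp("^" + escaped + "$");
}

// Decide whether the intermediary should fire for the loaded URL.
function shouldRun(url: string, includes: string[]): boolean {
  return includes.some((glob) => globToRegExp(glob).test(url));
}
```

The check would run once per page load, with the script body executing only when `shouldRun` returns true.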

Other Areas To Research

Patterns of data integration, e.g. XML, or the Serialized Object Model pattern (cf. JSON).

The Bookmarklet Bootloader pattern for distributing bookmarklets.

Relationships to Other Work

Note how issues such as web service design (e.g. is it RESTful, what's in the URL?) and web site design (e.g. semantic markup cues) can help foster particular types of intermediary pattern. Point out that good service/site design can encourage easier (and stable) intermediation.

Change History, Acknowledgements

Acknowledgements

The following is a list of contributors to this document.

Changes