Catalan has to be the official language of XML -- there's so many X's. It's the only language that can support all the project.

Different vision for doc. types:

  • well-formed is too much
  • validity is too little.

Uses of XML

  • interchange
  • marking up text
  • existing data import
  • closed loop editing
  • value-adding
  • verifying
  • etc, etc.

Can't say that industry sectors are clumped. Some shared perspectives.

Trade-offs

  • atomic vs. mixed content
  • trees or linked structures
  • does data exist independently of markup (e.g. book)
  • symbolic vs. data proc.
  • natural output format vs. none
  • active vs. static

Publishing

  • Mixed content
  • Linked structures
  • Independent Existence
  • Symbolic processing (not numeric; no need for types)
  • Natural output/delivery format
  • Static

Use these features to select a schema language.

Visual vs. Automatic Verification

Validation = auto
- difficult to create/maintain results
- can't detect pigeonhole errors

Visual verification, e.g. used in authoring stylesheets
- verifying the abstraction, so can miss errors
- only catch visible errors

Status Quo

Axis of complexity/power.

WF to the left, Valid to the right

+ NS, +XLink, etc moves towards the right, and possibly XML 2.0.

SGML '86

Original form of SGML had more to the left AND the right.

Lefthand:

  • tag omissions
  • short tags
  • short references
  • delimiter remapping -- rare
  • data tags -- rare

More and more markup is implied as you move further to the left.

Righthand:

  • DTD (stronger)
  • Lexical types
  • Architectural Forms

Lack of use of AFs may be due to complexity, or just not useful for publishing.

Don't believe there's a universal schema language. That idea is a marketing phenomenon.

Look at UML: 9 diagrams to model constraints.

SGML '98

Broke need for DTD?

'Amply Tagged' -- weaker than WF. Valid is still stronger.

W3C Future PSVI

Schema valid further to the right. Industry tradeoffs will determine how useful that is.

Progression of Logical Phases

Feasibly Tagged
Inferably
Impliably
Amply Tagged

WF

Feasibly Valid
...
...
Minimally Valid

Valid

ISO MDTS

MDTS

Modular Document Type Specification. 2 years, fully baked. Some parts out soon.

Driven by publishing requirements, but a lot of things are publishing. Targeting high-end publishing initially, but might be applicable more widely.

  • Validation Candidate Selection -- e.g. namespace association
  • RELAX NG module
  • Schematron module -- first draft in a few weeks. May borrow some ideas from XCSL, which is better in some areas
  • Integrity Module -- still room for that. Schematron not really suited for that. Needs to be declarative, which Schematron isn't.
  • Character Repertoire Module -- request from euro publisher. Testing character content.
  • ...Others + Extensions -- biggest extension will be allowing XSD
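
As a rough illustration of what "testing character content" could mean for the Character Repertoire Module, here is a minimal Python sketch. It is not MDTS syntax. The function name and the sample repertoire are assumptions made up for this example.

```python
import string
import xml.etree.ElementTree as ET

# Hypothetical repertoire: ASCII letters, digits, space, newline, basic punctuation.
REPERTOIRE = set(string.ascii_letters + string.digits + " .,;:-'\n")

def out_of_repertoire(xml_text, allowed=REPERTOIRE):
    """Return the sorted characters found in text content that are not allowed."""
    root = ET.fromstring(xml_text)
    # itertext() walks all text content (including tails) below the root element.
    found = set("".join(root.itertext()))
    return sorted(found - set(allowed))

print(out_of_repertoire("<p>Caf\u00e9 <b>ok</b></p>"))  # ['\u00e9'], i.e. the e-acute
print(out_of_repertoire("<p>plain text</p>"))           # []
```

Only character data is checked here; element names and attribute values are ignored.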

Still very early. No sense in 'opposing' XSD. Not all modules may get in.

With modules, and schema translation, the emphasis on particular languages is lessened.

Modular schema framework might be able to help with 'cut-and-paste' issues.

What's Wrong with XSD?

Wanted to avoid competitive feeling. "Richness is the appropriate thing"

No publishing support.

  • mixed content
  • character repertoire
  • lexical typing is under-utilized
  • intrusion of DBMS ideas: nil, suffixation limits extensibility

Bad architecture

  • no subsetting = monolithic = difficult to implement
  • pretension of universality
  • PSVI -- no objection, provided it's not called XML. It's about how to enrich data inside a process.

Design problems (dates, integrity, complexity)

"BUT good for a lucrative and large niche. No need to knee-jerk (either way!)"

Gets lost in spec!! "shows that it's at least not memorable"

Supporting Amply Tagged

Editors Concrete Syntax (ECS)

  • SGML concrete syntax
  • basically XML, but without end tags, quoting, or case differences. WF without the "terseness is of minimal importance" principle

Made a syntax that uses this, available next week.

Already widespread: HTML. Still no DTD.

  • Good for colouring editors

Supporting Inferencing

Named Information Items (NII) -- may go into MDTS

Simple format for sets of declarations.

May do it through augmenting schemas, e.g. appinfo.

When Validity Not Enough

Adam Smith: often a document isn't marked up linearly. Each editor has a limited number of tags. Therefore not a linear process.

Incremental validity, islands of validity.

Weakly Valid

E.g. where a level is missing (e.g. body). A patent exists (based on tree considerations). SGML has tag omission (prior art?)

Minimally Valid

All required elements

Or the document has all the required elements until a choice is forced

Impliably

Document is missing parts, added from schemas
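
One way to picture "added from schemas" is the small Python sketch below. It assumes a deliberately simplified schema: a flat, ordered list of (name, required) pairs. The function name and sample model are hypothetical and are not taken from the talk.

```python
def imply_missing(children, model):
    """Return `children` with missing required elements added in model order.

    `model` is an ordered list of (name, required) pairs; each name may
    occur at most once (a simplifying assumption for this sketch).
    """
    pending = list(children)
    result = []
    for name, required in model:
        if pending and pending[0] == name:
            result.append(pending.pop(0))   # present in the document
        elif required:
            result.append(name)             # implied from the schema
    if pending:
        # Leftover elements are unknown or out of order: nothing to imply.
        raise ValueError(f"cannot place elements: {pending}")
    return result

HTML_MODEL = [("head", True), ("body", True)]
print(imply_missing(["body"], HTML_MODEL))  # ['head', 'body']
```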

Inferably Valid

Heuristics to ignore errors, or generate placeholders.

Feasibly Valid

Subsequence valid -- conforms to the content model up to a point. The elements present are ordered such that further elements could be introduced to make the document valid.
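
The subsequence idea can be sketched in a few lines of Python. The sketch assumes the content model is a flat sequence in which each element name appears once; real content models with choices and repetition would need more machinery. The names used are illustrative only.

```python
def is_feasibly_valid(children, model):
    """True if `children` occur in the same relative order as in `model`."""
    remaining = iter(model)
    # `name in remaining` consumes the iterator up to the first match,
    # so each child has to be found after the previous one.
    return all(name in remaining for name in children)

MODEL = ["title", "author", "abstract", "body", "references"]
print(is_feasibly_valid(["title", "body"], MODEL))  # True: author, abstract can be added later
print(is_feasibly_valid(["body", "title"], MODEL))  # False: no insertion can fix the order
```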

Implementation Options

  • strength reducing the schemas
  • Schematron phasing
  • logic systems/languages: develop a rule base for the schema. Because it is logic, it can be queried and prompted for suggestions (or used to limit options in an editor)
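
To make the first option concrete, here is a minimal Python sketch of "strength reducing" a schema. It assumes a toy content model: an ordered list of (name, min, max) particles with no two adjacent particles sharing a name. Both functions and the sample model are hypothetical illustrations, not any schema language's API.

```python
def validate(children, model):
    """Check an ordered list of child names against (name, min, max) particles.

    `max` may be None for "unbounded". Greedy matching is sufficient under
    the assumption that adjacent particles have distinct names.
    """
    i = 0
    for name, lo, hi in model:
        count = 0
        while i < len(children) and children[i] == name and (hi is None or count < hi):
            i += 1
            count += 1
        if count < lo:
            return False
    return i == len(children)

def reduce_strength(model):
    """Weaken a model by making every particle optional (min occurrence 0)."""
    return [(name, 0, hi) for name, _lo, hi in model]

STRICT = [("title", 1, 1), ("author", 1, None), ("para", 1, None)]
draft = ["title", "para"]  # the author has not been entered yet
print(validate(draft, STRICT))                   # False
print(validate(draft, reduce_strength(STRICT)))  # True
```

The reduced model still enforces element order, so a draft can be checked against it while parts are missing and against the strict model once it is complete.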

When WF is Too Much

Constraining and unpleasant for use with editors.

  • Wiki at low end
  • ECS
  • SGML minimization

Operators may not think in terms of trees.

"Will be very surprised if XHTML has any success. The XML rules are too strict..."

Tag grammar: transform document into a tag-document

<a_start/><x/><a_end/>

Can then validate with other tools.
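
A minimal Python sketch of the tag-document transformation follows. It rewrites each start tag and end tag as an empty element, so that even input with mismatched or omitted tags becomes well-formed and can be handed to ordinary XML tools. The `_start`/`_end` suffixes follow the notes; the wrapper element `tags`, the function name, and the regular expression are assumptions for this example. Comments, processing instructions, CDATA sections, and attribute values containing `>` are not handled.

```python
import re
import xml.etree.ElementTree as ET

# Matches start tags, end tags and empty-element tags (simplified).
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)([^<>]*?)(/?)>")

def to_tag_document(text, root="tags"):
    """Rewrite tags as empty elements and wrap the result in one root element."""
    def repl(match):
        closing, name, attrs, empty = match.groups()
        if closing:
            return f"<{name}_end/>"
        if empty:
            return f"<{name}{attrs}/>"       # already an empty element
        return f"<{name}_start{attrs}/>"
    return f"<{root}>" + TAG.sub(repl, text) + f"</{root}>"

print(to_tag_document("<a><x/></a>"))
# <tags><a_start/><x/><a_end/></tags>

# Mismatched tags still yield a parseable tag-document.
doc = ET.fromstring(to_tag_document("<a><x/></b>"))
print([child.tag for child in doc])  # ['a_start', 'x', 'b_end']
```

A grammar over the resulting flat sequence of `*_start` and `*_end` elements can then express constraints on tag order, including cases where tags are omitted.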

This page (revision-1) was last changed on 21-Aug-2002 18:23 by unknown


Referenced by
XMLEurope2002