[Schematron] Help sought: implementation of Character Repertoire in XSLT2 for embedding in schematron

Rick Jelliffe rjelliffe at allette.com.au
Mon Sep 22 05:07:36 EDT 2008

Dave Pawson wrote:
> So when I'm talking about a 'charset' or (my own term) I need to
> remember to use 'property'
> Don't let it become too all encompassing Rick? Shoehorning lots of bits into
> GP terms reaches a point of confusion?
Good point.  The <properties> idea certainly needs feedback and comment, 
nothing is baked.

Rather than using foreign attributes, a mechanism like this allows 
single definition-multiple use. This is
very useful where the same properties need to be applied to different uses.

 <schema ...><title>Stubbed schema to show embedded CRepDL script</title>
   <rule context="x">
    <assert test="true()" 
	properties="iso8859-15-text http-iri current-section current-text" 
     >The <name/> element should be an http IRI that only contains ISO 8859-15 text</assert>
   </rule>

   <property id="iso8859-15-text">

   </property>

   <property id="http-iri">
	<xsd:pattern value="http:.*"/>
   </property>

   <property id="current-section">
	<sch:value-of select="ancestor::section[1]/title" />
   </property>

   <property id="current-text">
	<sch:value-of select="." />
   </property>
 </schema>


In this example, we have properties for
    1) the character repertoire
    2) the XSD datatype
    3) dynamic text for some other information

The need for 3 has repeatedly come up: I very often see people putting 
home-made microformats into the assertions or diagnostics, in order to 
get more marked-up information into the output. This rather pollutes the 
assertions. So SVRL would be extended so that the output would also 
contain properties, nicely marked up.
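
A rough sketch of the difference, assuming the <property> syntax from the
stub above (which is not settled). The test expression, the bracketed
microformat and the message wording are made up for illustration:

   <!-- home-made microformat mixed into the assertion text -->
   <assert test="starts-with(., 'http:')"
    >[section=<value-of select="ancestor::section[1]/title"/>] The <name/>
   element should be an http IRI</assert>

   <!-- same assertion, with the extra information moved into a property -->
   <assert test="starts-with(., 'http:')" properties="current-section"
    >The <name/> element should be an http IRI</assert>
   <property id="current-section">
	<value-of select="ancestor::section[1]/title"/>
   </property>

In the second form the assertion text stays plain natural language, and
the section title can be reported as a separate, named item.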

The previous system I had prototyped used different attributes for each:
   <assert test="true()"  ext:cdrl-type="ISO-8859-15"  
ext:xsd-type="http-iri" >
but I didn't see that this kind of specificity bought anything other 
than pain.

Another design I tried was foreign elements
   <rule context="x">
       <xsd:simpleType ref="http-iri" />
       <cdrl:ref href="#ISO-8859-15" />
where you just use the appropriate linking element. This is neat, 
doesn't change Schematron, and so on.

But it has, to my thinking, a showstopper: there is no assertion text. I 
think the central distinguishing feature
of Schematron is not the use of XPaths, or even the use of patterns 
rather than types, it is the primacy of
the natural language assertion.  In fact, this is the reason why I asked 
the XSD WG not to use the Schematron
namespace when XSD 1.1 was looking at adopting assertions: XSD does not 
have natural language assertion text
as a first class object. 

Schematron schemas are outside-in: the assertion text says what should 
be, and the test implements this as best
as possible. XSD (and RELAX NG) schemas are inside-out: the content 
models say what should be and then
you may have some comments to say constraints that cannot be expressed. 
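
A small sketch of the contrast, using the http-iri constraint from the
stub above. The test expression and the wording of the comment are made
up for illustration:

   <!-- Schematron, outside-in: the text states the rule, -->
   <!-- and the test approximates it -->
   <rule context="x">
     <assert test="starts-with(., 'http:')"
      >The <name/> element should be an http IRI</assert>
   </rule>

   <!-- XSD, inside-out: the declaration is the rule, -->
   <!-- and prose is a side note -->
   <xsd:simpleType name="http-iri">
     <!-- should also be a well-formed IRI; not expressed by the pattern -->
     <xsd:restriction base="xsd:anyURI">
       <xsd:pattern value="http:.*"/>
     </xsd:restriction>
   </xsd:simpleType>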

So the use of test="true()" and @properties may indeed look strange and 
over-generic ("more meta than thou" is the witticism I heard at W3C 
about this kind of design issue), but I think it satisfies a stronger 
design goal: the primacy of the assertion text. It also provides, to me, 
a clearer conceptual classification of this extra information, and in 
particular allows the properties to have a name (id) independent of 
their location.

What the properties give is, in effect, a post-schema validation 
infoset, because they link extra information to nodes. The big 
differences between these and the XSD PSVI are 1) there is an XML syntax 
from the word go, namely the properties as they come out in the SVRL 
document, and 2) these properties are extensible, while the outcomes of 
XSD validation are fixed.  (Now, it might be argued that these 
properties do nothing more than XSD's annotation/* elements do. I 
disagree, because the properties can have dynamic values.)
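
To make the comparison concrete, here is a guess at what such SVRL output
might look like. svrl:failed-assert, its test and location attributes,
and svrl:text are existing SVRL; the svrl:property element and its id
attribute are hypothetical, as are the sample location, section title and
IRI value:

   <svrl:failed-assert test="starts-with(., 'http:')"
                       location="/doc/section[2]/x[1]">
     <svrl:text>The x element should be an http IRI</svrl:text>
     <!-- hypothetical: one element per property named on the assertion -->
     <svrl:property id="current-section">Introduction</svrl:property>
     <svrl:property id="current-text">ftp://example.org/a</svrl:property>
   </svrl:failed-assert>

The values of current-section and current-text are computed from the
instance at validation time, which a static annotation could not carry.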

