[Schematron] Help sought: implementation of Character Repertoire in XSLT2 for embedding in schematron
Rick Jelliffe
rjelliffe at allette.com.au
Mon Sep 22 05:07:36 EDT 2008
Dave Pawson wrote:
> So when I'm talking about a 'charset' or (my own term) I need to
> remember to use 'property'
> Don't let it become too all encompassing Rick? Shoehorning lots of bits into
> GP terms reaches a point of confusion?
>
Good point. The <properties> idea certainly needs feedback and comment;
nothing is baked.
Rather than using foreign attributes, a mechanism like this allows
single definition, multiple use. This is
very useful where the same properties need to be applied in different places.
<schema ...><title>Stubbed schema to show embedded CRepDL script</title>
<pattern>
<rule context="x">
<assert test="true()"
properties="iso8859-15-text http-iri current-section current-text"
>The <name/>
element should be an http IRI that only contains ISO 8859-15 text</assert>
</rule>
</pattern>
<properties>
<property id="iso8859-15-text">
<cdrl:union>
<cdrl:char>
....
</cdrl:union>
</property>
<property id="http-iri" >
<xsd:anyUri>
<xsd:pattern value="http:.*"/>
</xsd:anyUri>
</property>
<property id="current-section">
<sch:value-of select="ancestor::section[1]/title" />
</property>
<property id="current-text">
<sch:value-of select="." />
</property>
</properties>
</schema>
In this example, we have properties for
1) the character repertoire
2) the XSD datatype
3) dynamic text for some other information
The need for 3 has repeatedly come up: I very often see people putting
home-made microformats into the
assertions or diagnostics, in order to get more marked up information
into the output. This
rather pollutes the assertions. So SVRL would be extended so that the
output would also contain
properties, nicely marked up.
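A rough sketch of how such extended SVRL output might look. The
svrl:property element and its id attribute are hypothetical names
invented for illustration, not taken from the SVRL schema, and the test,
location and values shown are made-up sample data for an assertion
assumed to have failed on an x element:

```xml
<!-- Hypothetical extended SVRL: svrl:property and its id attribute are
     assumptions for illustration only. -->
<svrl:failed-assert xmlns:svrl="http://purl.oclc.org/dsdl/svrl"
    test="starts-with(., 'http:')"
    location="/doc/section[1]/x[1]">
  <svrl:text>The x element should be an http IRI that only contains
    ISO 8859-15 text</svrl:text>
  <!-- One child per dynamic property named in @properties; the values
       are invented sample data. -->
  <svrl:property id="current-section">Introduction</svrl:property>
  <svrl:property id="current-text">ftp://example.org/</svrl:property>
</svrl:failed-assert>
```

The assertion text stays clean, and the extra information travels
alongside it as separately addressable markup.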
The previous system I had prototyped used different attributes for each:
<assert test="true()" ext:cdrl-type="ISO-8859-15"
ext:xsd-type="http-iri" >
but I didn't see that this kind of specificity bought anything other
than pain.
Another design I tried was foreign elements:
<rule context="x">
<xsd:simpleType ref="http-iri" />
<cdrl:ref href="#ISO-8859-15" />
</rule>
where you just use the appropriate linking element. This is neat,
doesn't change Schematron, and so on.
But it has, to my thinking, a showstopper: there is no assertion text. I
think the central distinguishing feature
of Schematron is not the use of XPaths, or even the use of patterns
rather than types, it is the primacy of
the natural language assertion. In fact, this is the reason why I asked
the XSD WG not to use the Schematron
namespace when XSD 1.1 was looking at adopting assertions: XSD does not
have natural language assertion text
as a first class object.
Schematron schemas are outside-in: the assertion text says what should
be, and the test implements this as best
as possible. XSD (and RELAX NG) schemas are inside-out: the content
models say what should be and then
you may have some comments to say constraints that cannot be expressed.
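The contrast can be sketched with the same http IRI constraint written
both ways. This is only an illustration: the starts-with() test and the
documentation wording are assumptions, not taken from any real schema.

```xml
<!-- Outside-in (Schematron): the natural-language text is the
     assertion; the test is a best-effort implementation of it. -->
<sch:rule xmlns:sch="http://purl.oclc.org/dsdl/schematron" context="x">
  <sch:assert test="starts-with(., 'http:')">The <sch:name/>
    element should be an http IRI</sch:assert>
</sch:rule>

<!-- Inside-out (XSD): the declaration is the constraint; any prose is
     relegated to an optional annotation. -->
<xsd:simpleType xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    name="http-iri">
  <xsd:annotation>
    <xsd:documentation>Should be an http IRI</xsd:documentation>
  </xsd:annotation>
  <xsd:restriction base="xsd:anyURI">
    <xsd:pattern value="http:.*"/>
  </xsd:restriction>
</xsd:simpleType>
```

In the first fragment the sentence cannot be left out without losing the
assertion; in the second the annotation can be dropped and the type
still validates exactly the same way.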
So the use of test="true()" and @properties may indeed look strange and
over generic ("more meta than thou"
is the witticism I heard at W3C about this kind of design issue), but I
think it
satisfies a stronger design requirement: the primacy of the assertion text.
It also provides, to me, a clearer
conceptual classification of this extra information, and in particular
allows the properties to have a name (id)
independent of their location.
What the properties give is, in effect, a post-schema validation
infoset, because they link extra information to
nodes. The big differences between these and the XSD PSVI are 1) there
is an XML syntax from the word go,
namely the properties as they come out in the SVRL document, and 2) these
properties are extensible, while
the outcomes of XSD validation are fixed. (Now, actually, it might be
argued that these properties do nothing
more than XSD's annotation/* elements do, however, because the
properties can have dynamic values, I disagree.)
Cheers
Rick