[Schematron] Is schematron missing a top level

rjelliffe at allette.com.au rjelliffe at allette.com.au
Sun Aug 23 16:04:46 EDT 2009


This is a little more theoretical/conceptual than many postings on this
list, but I am interested in any thoughts, or at least in preparing the
ground for them.

I have been wondering recently whether schematron is missing a top level,
or at least if there is a certain class of assertions that are important
enough (for reporting considerations) that they should have some special
syntax or names.

The issue can be summarized as this: are 'patterns' really first-class
objects in Schematron if we cannot even make the assertion that a pattern
needs to be found in a document?

Now, we can certainly make a Schematron schema that looks through an SVRL
output from some validation with another schema, and demand that certain
patterns are present with no failed assertions. But generality of
design is frequently a burden on users: just because we can do X in a
pipeline, it doesn't mean that X is best modelled with a pipeline.
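
To make the pipeline approach concrete, here is a rough sketch of such a
second-pass schema, run over the SVRL produced by the first validation. It
assumes the first schema has a pattern with id "head-body-structure", and
that the implementation copies that id onto svrl:active-pattern (the id
attribute is optional in SVRL, so this depends on the implementation). The
pattern id "required-patterns" and the message wording are made up for
illustration.

<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
   <sch:ns prefix="svrl" uri="http://purl.oclc.org/dsdl/svrl"/>
   <sch:pattern id="required-patterns">
      <sch:rule context="/svrl:schematron-output">
         <!-- SVRL is flat: a fired-rule or failed-assert belongs to the
              nearest preceding active-pattern sibling -->
         <sch:assert test="svrl:fired-rule[preceding-sibling::svrl:active-pattern[1]/@id
                  = 'head-body-structure']">At least one rule of the pattern
            "head-body-structure" should have fired.</sch:assert>
         <sch:assert test="not(svrl:failed-assert[preceding-sibling::svrl:active-pattern[1]/@id
                  = 'head-body-structure'])">The pattern "head-body-structure"
            should have no failed assertions.</sch:assert>
      </sch:rule>
   </sch:pattern>
</sch:schema>

This works, but it takes a second schema and a second validation run just
to say "this pattern is required".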

Now at the moment, the way to require that a pattern exists is to have a
rule that matches /  or *, since every XML document must have these. And
certainly we could trace through assertions under these for required
elements or attributes, and then see if other patterns have matching rules
and therefore are necessary.
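
For concreteness, the current workaround might look like the following
sketch. It assumes un-namespaced HTML element names; the pattern id and the
assertion wording are illustrative only.

<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
   <sch:pattern id="head-body-structure">
      <!-- "/" exists in every document, so this rule always fires and
           the pattern cannot be silently skipped -->
      <sch:rule context="/">
         <sch:assert test="html">The document element should be
            html.</sch:assert>
      </sch:rule>
      <!-- This rule fires only when an html element is present -->
      <sch:rule context="html">
         <sch:assert test="head">A document should have a head.</sch:assert>
         <sch:assert test="body">A document should have a body.</sch:assert>
      </sch:rule>
   </sch:pattern>
</sch:schema>

The requirement is expressed only indirectly here: a reader has to trace
from the / rule to the html rule to see that the pattern is meant to be
mandatory.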

For example, would it be useful to have the following?

<sch:phase name="simple-html">
   <sch:active pattern="head-body-structure"
        expect="always" />
   <sch:active pattern="paragraphs"
        expect="sometimes" />
</sch:phase>

where the semantics of  expect="always" might be something as weak as,
say, "at least one rule must fire".

By "useful" I mean for being able to report to users better, and to
structure tests, and to simplify schemas slightly. I have been thinking
about a transformation of SVRL to make a simple table summary which
collates all the individual assertion results, like the JUnit red/green
test results, such as:

Pattern "head-body-structure" (required)   valid 1/1
  Rule  /                                  valid 1/1
    Assert head                            valid 1/1
    Assert body                            valid 1/1

Pattern "paragraphs"                       valid 85/86
  Rule   p                                 valid 75/76
    Assert not( marquee )                  valid 75/76
  Rule   blockquote                        valid 10/10
    Assert not( marquee )                  valid 10/10

and I remember the requests that have been made where people want to
confirm that a rule or pattern or assertion was in fact run, as a kind of
logging and audit necessity. Hence this post.
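
A first cut at such a summary transformation could be as small as the
XSLT 1.0 sketch below. It is only an approximation of the table above:
standard SVRL records fired rules and failed assertions but not the
assertions that passed, so it reports counts of those two things per
pattern rather than true passed/total figures. It again assumes the
implementation writes the pattern id onto svrl:active-pattern.

<xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:svrl="http://purl.oclc.org/dsdl/svrl">
   <xsl:output method="text"/>

   <xsl:template match="/svrl:schematron-output">
      <xsl:for-each select="svrl:active-pattern">
         <xsl:variable name="pattern" select="generate-id()"/>
         <!-- Select the fired rules and failed assertions whose nearest
              preceding active-pattern is the current one -->
         <xsl:variable name="fired"
               select="following-sibling::svrl:fired-rule
                  [generate-id(preceding-sibling::svrl:active-pattern[1])
                     = $pattern]"/>
         <xsl:variable name="failed"
               select="following-sibling::svrl:failed-assert
                  [generate-id(preceding-sibling::svrl:active-pattern[1])
                     = $pattern]"/>
         <xsl:text>Pattern "</xsl:text>
         <xsl:value-of select="@id"/>
         <xsl:text>"   rules fired: </xsl:text>
         <xsl:value-of select="count($fired)"/>
         <xsl:text>   failed assertions: </xsl:text>
         <xsl:value-of select="count($failed)"/>
         <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
   </xsl:template>
</xsl:stylesheet>

A pattern that reports "rules fired: 0" is exactly the case that something
like expect="always" would turn into an error.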

Another approach would be

<sch:phase name="simple-html">
   <sch:active pattern="head-body-structure"
        require-rule-fire="head-rule" />
   <sch:active pattern="paragraphs" />
</sch:phase>

In this approach, we nominate which rule in some pattern must fire, by
reference to an ID. So if the rule is

<rule context="html"  id="head-rule">
    <assert test="head">in order to allow metadata,
         a document should have a head</assert>
</rule>

then there is an implicit declaration

<rule context="/">
   <assert test="head">The pattern "head-body-structure" must be
        present.</assert>
</rule>

Anyway, this is less than half-baked of course.

Cheers
Rick Jelliffe


