ph-schematron is a Java library that validates XML documents via ISO Schematron. It is licensed under Apache 2.0 license.
Schematron is now also on GitHub: https://github.com/Schematron It offers several different possibilities to perform this task where each solution offers its own advantages and disadvantages that are outlined below in more detail. ph-schematron only supports ISO Schematron and no other Schematron version. The most common way is to convert the source Schematron file to an XSLT script and apply this XSLT on the XML document to be validated. Alternatively ph-schematron offers a native implementation for the Schematron XPath binding which offers superior performance over the XSLT approach but has some other minor limitations.
Continue reading the full documentation at http://phax.github.io/ph-schematron/.
- v5.0.1 - 2018-02-01
- Moved
getBeautifiedLocationto classSVRLHelperand made it public - Updated to Saxon-HE 9.8.0-7
- Requires ph-commons 9.0.1
- Moved
- v5.0.0 - 2018-01-02
- Updated to ph-commons 9.0.0
- Added new ANT task for preprocessing Schematron files only
- Improved support for
base-uri()XPath function when using the pure implementation (#47) - Fixed issue with
roleattribute in SVRL when using pure implementation (#54) - Updated to Saxon-HE 9.8.0-6 - therefore no XLST v1 scripts can be used anymore - this only works up to 9.7.x!
- Added ANT task property
failOnError(#57)
- v4.3.4 - 2017-07-27
- Added new class
SchematronDebugthat centrally manages the debug flags for logging etc.
- Added new class
- v4.3.3 - 2017-07-27
- Reverted to Saxon-HE 9.7.0_18 because of incompatibilities in production
- v4.3.2 - 2017-07-25
- Updated to Saxon-HE 9.8.0-3
- Changed all XSLT scripts to use and create only XSLT 2.0 (because Saxon 9.8.x does not support XSLT 1.0 anymore)
- Updated to ph-commons 8.6.6
- v4.3.1 - 2017-05-29
- Updated to ph-commons 8.6.5
- Fixed too verbose logging of created XSLT
- Removed some old deprecated methods
- v4.3.0 - 2017-05-15
- Updated to Saxon-HE 9.7.0-18
- Fixed an error with nested SVRL directories in Maven plugin (#37)
- Added possibility to use "negative" tests in Maven plugin (#38)
- Added ANT plugin to validate Schematron resources (#39, #40)
- Using the EntityResolver also for the XML files to be validated (not just the Schematron)
- Added a default
EntityResolverand a defaultURIResolverthat tries to resolve includes relative to the base Schematron.
- v4.2.2 - 2017-02-22
- Updated to Saxon-HE 9.7.0-15
- Fixed usage of
<let>in<extend>-based rules for the pure implementation (#36)
- v4.2.1 - 2017-01-20
- Added WrappedCollectingPSErrorHandler
- v4.2.0 - 2017-01-09
- Binds to ph-commons 8.6.0
- Updated to Saxon-HE 9.7.0-14
- Added a new Schematron validation Maven plugin
- v4.1.1 - 2016-11-03
- Added possibility to use XML EntityResolver (#30)
- Updated to Saxon-HE 9.7.0-10
- v4.1.0 - 2016-09-09
- Binding to ph-commons 8.5.x
- v4.0.2 - 2016-07-22
- v4.0.1 - 2016-07-05
- better integration of sch2xslt Maven plugin into m2e - thanks to @baerrach
- v4.0.0 - 2016-06-15
- updated to JDK8
- updated to Saxon-HE 9.7
- v3.0.1 - 2015-10-14
- keep diagnostics in Pure version; resource resolving emits to error handler
- v3.0.0 - 2015-07-29
- because of update to ph-commons 6.0.0; extended XSLT based API
- v2.9.2 - 2015-03-12
- because of update to ph-commons 5.6.0
- v2.9.1 - 2015-02-03
- fixes a classloader issue added in 2.9.0
- v2.9.0 - 2015-01-30
- introduced new APIs in several places
- updated to Saxon-HE 9.6
- v2.8.4 - 2014-10-30
- v2.8.3 - 2014-09-16
- An easy way to use XQuery functions (like funcx library) as custom XPath functions was added
- v2.8.2 - 2014-09-02
- v2.8.1 - 2014-08-29
- v2.8.0 - 2014-08-28
The dependency for ph-schematron looks like this:
<dependency>
<groupId>com.helger</groupId>
<artifactId>ph-schematron</artifactId>
<version>5.0.1</version>
</dependency>It transitively contains ph-commons, SLF4J and Saxon HE.
Maven plugin to convert Schematron (SCH) to XSLT at compile time using ph-schematron as the converter.
The conversion of Schematron to XSLT is quite costly. That’s why this Maven plugin that does the conversion at build time.
By default the plugin is run in the Maven lifecycle phase generate-resources. The basic configuration of the plugin in the pom.xml looks like this (inside the <build>/<plugins> element):
<plugin>
<groupId>com.helger.maven</groupId>
<artifactId>ph-sch2xslt-maven-plugin</artifactId>
<version>5.0.1</version>
<executions>
<execution>
<goals>
<goal>convert</goal>
</goals>
</execution>
</executions>
</plugin>The possible configuration parameters are:
schematronDirectory- The directory where the Schematron files reside. Defaults to${basedir}/src/main/schematron.schematronPattern- A pattern for the Schematron files. Can contain Ant-style wildcards and double wildcards. All files that match the pattern will be converted. Files in theschematronDirectoryand its subdirectories will be considered. Default is**/*.sch.xsltDirectory- The directory where the XSLT files will be saved. Default is${basedir}/src/main/xslt.xsltExtension- The file extension of the created XSLT files. Default is.xslt.overwriteWithoutQuestion- Overwrite existing Schematron files without notice? If this is set tofalsethan existing XSLT files are not overwritten. Default istrue.phaseName- Define the phase to be used for XSLT creation. By default thedefaultPhaseattribute of the Schematron file is used.languageCode- Define the language code for the XSLT creation. Default isnullwhich means English. Supported language codes are: cs, de, en, fr, nl.
Maven plugin to validate XML files against convert Schematron (SCH) at compile time using ph-schematron as the validator.
This plugin was introduced in version 4.2.0.
By default the plugin is run in the Maven lifecycle phase process-resources. The basic configuration of the plugin in the pom.xml looks like this (inside the <build>/<plugins> element):
<plugin>
<groupId>com.helger.maven</groupId>
<artifactId>ph-schematron-maven-plugin</artifactId>
<version>5.0.1</version>
<executions>
<execution>
<goals>
<goal>validate</goal>
</goals>
</execution>
</executions>
</plugin>The possible configuration parameters are:
schematronFile- The Schematron file to be applied. This parameter is mandatory.schematronProcessingEngine- The Schematron processing engine to be used. Can be one ofpure(pure implementation),schematron(Schematron to XSLT engine) andxslt(pre-compiled XSLT files available). Default ispure.xmlDirectory- The directory where the XML files to be validated are contained. It is expected that the XML files in this directory match the Schematron rules. Either this parameter orxmlErrorDirectoryparameter are mandatory.xmlIncludes- A pattern for the XML files to be included. Can contain Ant-style wildcards and double wildcards. All files that match the pattern will be validated. Files in thexmlDirectoryand its subdirectories will be considered. Default is**/*.xml.xmlExcludes- A pattern for the XML files to be excluded. Can contain Ant-style wildcards and double wildcards. All files that match the pattern will NOT be validated. Files in thexmlDirectoryand its subdirectories will be considered.svrlDirectory- The directory where the SVRL (Schematron Validation Report Language) files will be saved. If this property is not specified, no SVRL files will be written. By default the name of the SVRL file corresponds to the XML file that is validated (defined by the parametersxmlDirectory,xmlIncludesandxmlExcludes) with the suffix.svrl.xmlErrorDirectory(since v4.3.0) - The directory where the XML files to be validated are contained. It is expected that the XML files in this directory do not match the Schematron rules. Either this parameter orxmlDirectoryparameter are mandatory.xmlErrorIncludes(since v4.3.0) - A pattern for the erroneous XML files to be included. Can contain Ant-style wildcards and double wildcards. All files that match the pattern will be validated. Files in thexmlDirectoryand its subdirectories will be considered. Default is**/*.xml.xmlErrorExcludes(since v4.3.0) - A pattern for the erroneous XML files to be excluded. Can contain Ant-style wildcards and double wildcards. All files that match the pattern will NOT be validated. Files in thexmlDirectoryand its subdirectories will be considered.svrlErrorDirectory(since v4.3.0) - The directory where the erroneous SVRL files will be saved. If this property is not specified, no SVRL files will be written. By default the name of the SVRL file corresponds to the XML file that is validated (defined by the parametersxmlErrorDirectory,xmlErrorIncludesandxmlErrorExcludes) with the suffix.svrl.phaseName- Define the phase to be used for XSLT creation. By default thedefaultPhaseattribute of the Schematron file is used.languageCode- Define the language code for the XSLT creation. Default isnullwhich means English. Supported language codes are: cs, de, en, fr, nl.
A validator for Schematron definitions based on RelaxNG definition.
Add the following to your pom.xml to use this artifact:
<dependency>
<groupId>com.helger</groupId>
<artifactId>ph-schematron-validator</artifactId>
<version>5.0.1</version>
</dependency>
Since ph-schematron 4.3.0 there is an Apache Ant task that enables you to validate XML files against Schematron rules. As I'm not an Ant expert please forgive me if some of the explanations are not 100% accurate. ph-schematron 5.0.0 adds a new task for preprocessing Schematron files.
There is currently only one task:
<taskdef name="schematron" classname="com.helger.schematron.ant.Schematron" />For this Ant Task to be available you need to include the ph-schematron-ant-task "JAR with dependencies" in your classpath.
Alternatively you can use the classpath attribute to reference a classpath that is defined internally in the build script.
A compiled version of the "JAR with dependencies" is available at the Maven Central Repository.
The validation itself looks like this:
<target name="validate">
...
<schematron schematronFile="sample_schematron.sch" expectSuccess="true">
<fileset dir="xml">
<include name="*.xml" />
<exclude name="err*.xml" />
</fileset>
</schematron>
...
</target>Basically you declare the Schematron file (relative to the project's base directory), define whether you expect a successful validation or failures, and finally you name the XML files to be validated (as resource collections - e.g. Filesets).
The schematron element allows for the following attributes:
FileschematronFile - The Schematron file to be used for validationStringschematronProcessingEngine - The internal engine to be used. Must be one ofpure,schematronorxslt. The default isschematron. In v5.0.0schwas added as an alias forschematron.FilesvrlDirectory - An optional directory where the SVRL files should be written to.StringphaseName - The optional Schematron phase to be used. Note: this is only available when using the processing enginepureorschematron. For enginexsltthis is not available because this was defined when the XSLT was created.StringlanguageCode - The optional language code to be used. Note: this is only available when using the processing engineschematron. For enginexsltthis is not available because this was defined when the XSLT was created. Default is English (en). Supported language codes are: cs, de, en, fr, nl.booleanexpectSuccess -trueto expect successful validation,falseto expect validation errors. If the expectation is incorrect, the build will fail.booleanfailOnError (since v5.0.0) -trueto break the build if an error occurred,falseto continue with the following tasks on error.
Additionally you can use an XMLCatalog that acts as an Entity and URI resolver both for the Schematron and the XML files to be validated! See https://ant.apache.org/manual/Types/xmlcatalog.html for details on the XML catalog. Here is an example that shows how to use an inline XML catalog:
<target name="validate">
<schematron schematronFile="../sch/test.sch"
expectSuccess="true"
schematronProcessingEngine="pure">
<fileset dir=".">
<include name="test.xml" />
</fileset>
<xmlcatalog>
<dtd publicId="-//bla//DTD XML test//EN" location="../dtd/test.dtd"/>
</xmlcatalog>
</schematron>
</target>There is currently only one task:
<taskdef name="schematron-preprocess" classname="com.helger.schematron.ant.SchematronPreprocess" />For this Ant Task to be available you need to include the ph-schematron-ant-task "JAR with dependencies" in your classpath.
Alternatively you can use the classpath attribute to reference a classpath that is defined internally in the build script.
The validation itself looks like this:
<target name="validate">
...
<schematron-preprocess srcFile="src.sch" dstFile="dst.sch" />
...
</target>Basically you define source and destination Schematron files and that's it. Additionally you can define a few settings controlling the output.
The schematron element allows for the following attributes:
FilesrcFile - The source Schematron file to be preprocessed. This parameter is required.FiledstFile - The destination file in which the preprocessed content should be written. This parameter is required.booleankeepTitles -trueto keep<title>-elements,falseto delete them. Default isfalse.booleankeepDiagnostics -trueto keep<diagnostic>-elements,falseto delete them. Default isfalse.booleankeepReports -trueto keep<report>-elements,falseto change them to<assert>-elements. Default isfalse.booleankeepEmptyPatterns -trueto keep<pattern>-elements without rules,falseto delete them. Default istrue.booleanfailOnError -trueto break the build if an error occurred,falseto continue with the following tasks on error.
My personal Coding Styleguide | On Twitter: @philiphelger