Chapter 2. XSL processors

Table of Contents

XSLT processors
XSL-FO processors
Portability considerations

An XSL processor is the software that transforms an XML file into formatted output. There is a growing list of XSL processors to choose from. Each tool implements part or all of the XSL standard, which has several components:

The XSL Standards

Extensible Stylesheet Language (XSL)

A language for expressing stylesheets written in XML. It includes the XSL formatting objects (XSL-FO) language, but refers to separate documents for the transformation language and the path language.

XSL Transformation (XSLT)

The part of XSL for transforming XML documents into other XML documents, HTML, or text. It can be used to rearrange the content and generate new content.

XML Path Language (XPath)

A language for addressing parts of an XML document. It is used to find the parts of your document to apply different styles to. All XSL processors use this component.
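To make the idea of addressing parts of a document concrete, here is a minimal sketch that evaluates a few XPath expressions with the XPath engine bundled in the Java runtime. The sample document and the class and method names are hypothetical, chosen only for illustration; an XSLT processor does the same kind of selection internally when it matches templates.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {
    // Hypothetical sample document, not taken from any real book source.
    static final String DOC =
        "<book><title>Sample</title>"
      + "<chapter><title>One</title></chapter>"
      + "<chapter><title>Two</title></chapter></book>";

    // Count the nodes that an XPath expression selects in DOC.
    static int countMatches(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expr, new InputSource(new StringReader(DOC)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Return the string value of the first node an expression selects.
    static String firstValue(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expr, new InputSource(new StringReader(DOC)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Chapter titles only: an absolute path from the root element.
        System.out.println(countMatches("/book/chapter/title"));
        // Every title at any depth, including the book title.
        System.out.println(countMatches("//title"));
        // A positional predicate picks out the second chapter.
        System.out.println(firstValue("/book/chapter[2]/title"));
    }
}
```

The first expression selects only the two chapter titles, the second also picks up the book title, and the third uses a predicate to address a single chapter.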

To publish HTML from your XML documents, you just need an XSLT processor. It will include the XPath language since that is used extensively in XSLT. To get to print, you need an XSLT processor to produce an intermediate formatting objects (FO) file, and then you need an XSL-FO processor to produce PostScript or PDF output from the FO file. A diagram of the DocBook Publishing Model is available if you want to see how all the components flow together.
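The HTML half of that workflow can be sketched with the standard Java transformation API (JAXP), which hands the work to whatever XSLT processor is installed in the Java runtime. The tiny document and stylesheet below are hypothetical stand-ins, far simpler than a real DocBook stylesheet, and the class name is an assumption made for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltToHtmlSketch {
    // Hypothetical source document.
    static final String XML =
        "<article><title>Hello</title><para>First paragraph.</para></article>";

    // Hypothetical stylesheet: one template per element, HTML output method.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html'/>"
      + "<xsl:template match='/article'>"
      + "<html><body><xsl:apply-templates/></body></html>"
      + "</xsl:template>"
      + "<xsl:template match='title'>"
      + "<h1><xsl:value-of select='.'/></h1>"
      + "</xsl:template>"
      + "<xsl:template match='para'>"
      + "<p><xsl:value-of select='.'/></p>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Apply a stylesheet to a document and return the serialized result.
    static String transform(String xml, String xsl) {
        try {
            // Compile the stylesheet with the default XSLT processor.
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL));
    }
}
```

For print output the first step has the same shape, except that the stylesheet would emit formatting objects instead of HTML, and the resulting FO file would then be passed to a separate XSL-FO processor.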

XSLT processors

Currently there are three processors that are widely used for XSLT processing because they most closely conform to the XSLT specification:

Saxon

Saxon (http://saxon.sourceforge.net/) was written by Michael Kay, the author of XSLT Programmer's Reference, one of the best books on XSLT. Saxon is a free processor written in Java, so it can be run on any operating system with a modern Java interpreter. Saxon now comes in two flavors: Saxon 6, which handles the XSLT 1.0 standard, and Saxon 8, which handles the newly emerging XSLT 2.0 and other new XML standards.

Xalan

Xalan (http://xml.apache.org/xalan-j/index.html) is part of the Apache XML Project. It has versions written in both Java and C++, both of them free. The Java version is described in this book because it is highly portable and easier to set up. Generally Xalan is used with the Xerces XML parser, also available from the Apache XML Project.

xsltproc

The xsltproc (http://xmlsoft.org/XSLT/) processor is written in C by Daniel Veillard. It is free, as the command-line tool of the open source libxslt library, which is built on the libxml2 library from the GNOME project. It is highly conformant to the specification and is considered the fastest of the processors, much faster than either of the Java processors. It also processes XIncludes.
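Because several processors may be installed on one machine, it can help to ask the processor to identify itself. XSLT 1.0 defines a system-property() function for this purpose. The following is a minimal sketch, assuming a Java runtime and using a hypothetical class name; the text it prints depends entirely on which processor the Java runtime selects, so no particular output is shown.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ProcessorIdSketch {
    // Stylesheet that reports the vendor and the XSLT version supported,
    // using the standard system-property() function.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select=\"system-property('xsl:vendor')\"/>"
      + "<xsl:text>, XSLT </xsl:text>"
      + "<xsl:value-of select=\"system-property('xsl:version')\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Run the stylesheet against a dummy document and return the report.
    static String describeProcessor() {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader("<doc/>")), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The Java class that implements the active processor.
        System.out.println(TransformerFactory.newInstance().getClass().getName());
        // What the processor says about itself from inside a stylesheet.
        System.out.println(describeProcessor());
    }
}
```

The same stylesheet can be run unchanged with any conforming processor, which makes it a quick check of which tool and which XSLT version you are actually using.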

There are a few other XSLT processors that should also be mentioned:

XT

James Clark's XT (http://www.blnz.com/xt/index.html) was the first useful XSLT engine, and it is still in use. It is written in Java, so it runs on many platforms, and it is free. XT comes with James Clark's nonvalidating parser XP, but you can substitute a different Java parser.

MSXML

Microsoft's MSXML (http://msdn.microsoft.com/xml/) engine includes an XSLT processor. It is reported to be fast, but it runs only on Windows.

Sablotron

Sablotron (http://www.gingerall.com/charlie/ga/xml/p_sab.xml) is written in C++ and comes from Ginger Alliance.

4XSLT

4XSLT (http://sourceforge.net/projects/foursuite/) is written in Python and is now an open project on SourceForge.