Details to watch out for

Olinks provide the tremendous power of cross referencing between documents, but they have a price. Olinks introduce dependencies between documents that are not an issue with standalone documents. The documents in a collection must "play together", and so they must follow a few rules.

Target database location

The location of the olink targets database is specified by the stylesheet parameter target.database.document. Note these features:

  • The target.database.document parameter has no default value, so you must always set this parameter if you are using olinks.

  • The parameter value can be a full path, but then it should be expressed using URI syntax since that is what the XSLT document() function expects. For example:

    <xsl:param name="target.database.document">file:///c:/xml/tools/olinkdb.xml</xsl:param>

  • The parameter value can be a relative path, in which case it is taken as relative to the directory containing the document being processed. Since the parameter takes URI syntax, you can use forward slashes even on Windows. You can include ../ in the path to access directory levels above the current document.

  • You cannot use a relative path if you are using profile-docbook.xsl or profile-chunk.xsl. Those stylesheets create an in-memory copy of your document that does not have a base directory. Use a full path, or use two-pass profiling (see the section “Two-pass processing”).

  • An XML catalog entry can be used to map the parameter value to a specific location in a filesystem. Note that when using a Java processor, the parameter value must be a full path expression in order for the catalog entry to work, because Java will replace any relative path with an absolute path before the catalog resolver sees it.

If you are sharing a target database among several documents, as is common with olinking, you should put it in a path that is accessible to all documents in the collection. If the relative path from all your documents to the database is the same, then you can just put that path in the parameter. If the relative paths to the shared database are not the same, you have some choices:

  • Set the target.database.document parameter in the build script for each document directory, using an appropriate relative path for documents in that directory.

  • Set the parameter to a fixed full path to the database file.

  • Use a phony full path that is mapped to the actual location using an XML catalog file. See the section “Relative SYSTEM identifiers may not work” for more information on this trick.
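
The catalog approach in the last item might look like the following sketch. Both the phony path (file:///olink/olinkdb.xml) and the real location are hypothetical names chosen for illustration, and the sketch assumes your XSLT processor has been configured to consult this catalog file.

```xml
<?xml version="1.0"?>
<!-- Hypothetical catalog file: both paths below are illustrative assumptions. -->
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Map the phony full path given in target.database.document
       to the actual location of the shared database file. -->
  <uri name="file:///olink/olinkdb.xml"
       uri="file:///c:/xml/tools/olinkdb.xml"/>
</catalog>
```

With such an entry in place, every document in the collection could set target.database.document to the same phony path, and only the catalog would need to change if the database file moves.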

Using a sitemap

One of the most powerful features of the olink system is the sitemap in the target database document. The sitemap is an XML structure that parallels the directory structure of your HTML or PDF output tree. By recording the output locations for all the documents in your olink database, the stylesheet can compute relative links between any two documents. The stylesheets compute the correct number of ../ steps to move up, and the right sequence of directory names to move down to locate a file. Relative links make your HTML highly portable, as long as you keep the same directory structure when you move the files.

If you put all your output in one directory, then you do not need to use a sitemap. You can omit the sitemap and dir elements, and just create a flat list of document elements as children of the targetset element in the database file. For PDF output or non-chunked HTML output, the baseuri attribute of each document element must still contain the filename of its PDF or HTML output file, because that name is not available to the stylesheet.
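
A flat database of this kind might look like the following sketch for two non-chunked HTML documents written to one directory. The targetdoc identifiers, output filenames, and target data file paths are all hypothetical.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE targetset [
  <!-- Hypothetical target data files, one collected per document. -->
  <!ENTITY ugtargets SYSTEM "file:///c:/xml/doc/userguide/target.db">
  <!ENTITY agtargets SYSTEM "file:///c:/xml/doc/adminguide/target.db">
]>
<targetset>
  <!-- No sitemap or dir elements: all output lives in one directory. -->
  <!-- baseuri supplies the output filename the stylesheet cannot know. -->
  <document targetdoc="UserGuide" baseuri="userguide.html">
    &ugtargets;
  </document>
  <document targetdoc="AdminGuide" baseuri="adminguide.html">
    &agtargets;
  </document>
</targetset>
```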

Keep in mind that the sitemap records the HTML or PDF output hierarchy, not the XML source hierarchy. The location of your XML documents does not matter. Creating an output sitemap requires advance planning for your document collection. You need to decide the name and location of each directory containing output. If you change where you put your HTML or PDF files, be sure to update your sitemap as well.

For the sitemap to work, you have to set the current.docid parameter for each document you process. You set the parameter value to the targetdoc identifier for the current document. That informs the stylesheet of the starting point for computing relative references, since that information is not recorded in the document itself.
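
As a minimal sketch, the parameter could be set in a small customization layer like the one below. The import path, the database path, and the UserGuide identifier are assumptions for illustration. Because the value differs for every document, it may be more practical to pass current.docid to the processor as a command-line parameter from each document's build script instead.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <!-- Assumed install location; point this at your own DocBook XSL copy. -->
  <xsl:import href="file:///c:/xml/docbook-xsl/html/docbook.xsl"/>

  <!-- Shared olink database (hypothetical full path in URI syntax). -->
  <xsl:param name="target.database.document">file:///c:/xml/tools/olinkdb.xml</xsl:param>

  <!-- Must match the targetdoc attribute of this document's
       document element in the database (hypothetical identifier). -->
  <xsl:param name="current.docid">UserGuide</xsl:param>
</xsl:stylesheet>
```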

Here are some guidelines for understanding the sitemap feature. See Example 24.1, “Target database document” for examples.

  • The sitemap element itself must contain just a single top-level dir element that serves as a container for the other dir elements. The top-level name attribute is irrelevant, since it is never used in hrefs (it is always represented by ../).

  • The output directory hierarchy is represented by nested dir elements under the top level dir in the sitemap. Each dir element's name attribute must match the name of its output directory. Thus a sequence of dir descendants can represent part of a pathname.

  • A dir element can also contain one or more document elements. A typical setup will have terminal dirs containing a single document element, especially if that document is chunked. But a dir element can contain a document element and other dir elements, if that is your directory structure.

  • Each document element's targetdoc attribute value is the same document identifier used for olinking to that document. This identifier keys the stylesheet to the current document's location in the sitemap so it can compute relative paths from there to other documents.

  • The content of each document element is the set of target data collected for that document. This is usually inserted as a system entity reference, although XInclude can be used as well (see the section “Using XInclude in the database document”).

  • Non-chunked documents may need a baseuri attribute on their document element to indicate the HTML filename. This is necessary if the olink.base.uri parameter was not used to write the same filename into each href when collecting the document's target data. Do not use both the parameter and the attribute, or both will appear in the generated hrefs.

  • A directory can contain the output for more than one document. Expressed in the sitemap, this means a dir element can contain more than one document element. This feature is most useful for putting together several non-chunked documents. Chunked documents run the risk of duplicate filenames that would overwrite each other.
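
The guidelines above can be illustrated with a small sitemap sketch. It is not the book's Example 24.1; the directory names, targetdoc identifiers, filenames, and entity paths are hypothetical. It shows one chunked document in its own directory and two non-chunked documents sharing a directory.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE targetset [
  <!-- Hypothetical target data files collected for each document. -->
  <!ENTITY ugtargets SYSTEM "file:///c:/xml/doc/userguide/target.db">
  <!ENTITY ratargets SYSTEM "file:///c:/xml/doc/refa/target.db">
  <!ENTITY rbtargets SYSTEM "file:///c:/xml/doc/refb/target.db">
]>
<targetset>
  <sitemap>
    <!-- Single top-level container; its name never appears in hrefs. -->
    <dir name="documentation">
      <dir name="guides">
        <!-- Terminal dir holding one chunked document. -->
        <dir name="userguide">
          <document targetdoc="UserGuide">
            &ugtargets;
          </document>
        </dir>
      </dir>
      <!-- One dir holding two non-chunked documents, each with a baseuri. -->
      <dir name="reference">
        <document targetdoc="RefA" baseuri="refa.html">
          &ratargets;
        </document>
        <document targetdoc="RefB" baseuri="refb.html">
          &rbtargets;
        </document>
      </dir>
    </dir>
  </sitemap>
</targetset>
```

Under this layout, processing the user guide with current.docid set to UserGuide would be expected to produce links into RefA that climb two levels and descend into reference, along the lines of ../../reference/refa.html.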