Shared text entities

Text entities let you define text strings such as product names or version numbers as variables, so that you can change them globally by changing the entity declaration. However, using entities with modular files can be a maintenance nightmare if you do not set them up properly.

If each module is to be a valid document, its DOCTYPE declaration must include declarations for any entities that are referenced in the module. You could put the entity declarations that a given module needs at the top of the file in the internal subset of the DTD (included as part of the DOCTYPE declaration). By doing so, however, you proliferate declarations of a given entity name, which can lead to inconsistencies and surprises. For example, the xsltproc processor will flag as an error any entities that are declared with the same name but different content in the including and included documents. Also, putting entity declarations in each document file makes them hard to change globally.
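
To make the per-module approach concrete, here is a minimal sketch of a module that carries its own declarations in the internal subset. The entity names and values are borrowed from the example later in this section, and the paragraph content is hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
                "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!-- Declarations local to this one file. Every other module needs
     its own copy, and the copies can drift apart over time. -->
<!ENTITY productname "VisionFinder">
<!ENTITY version "3.0.1">
]>
<section>
<title>Configuring &productname;</title>
<para>These steps apply to version &version;.</para>
</section>
```

The module is valid on its own, but changing the version number means editing every file that repeats the declaration.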

Shared text entities are best declared in a separate file that is referenced by each module and master document that assembles modules. An entities file gives you a consistent set of entity declarations across your modules, and lets you change a definition in one place and have it apply to all of your files. You declare and reference the file containing the entity declarations as a parameter entity in the DOCTYPE declaration of all of your modules and document files. The following example shows how it's done.

Example 23.3. Shared text entities

Entities file named myproject.ent:
<?xml version="1.0" encoding="iso-8859-1" ?>
<!ENTITY productname "VisionFinder">
<!ENTITY version "3.0.1">
<!ENTITY userguide "Using &productname;">
...

Content module using the entities file:
<?xml version="1.0"?>
<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
                "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % myents SYSTEM "myproject.ent" >
%myents;
]>
<section>
<title>Configuring &productname;</title>
...

If all of your modules and document files have the same entity declaration and reference, then they will all share the same set of entities. You can use an XML catalog to map the filename to a specific pathname on a system so that the entities file can be kept in a central location. By using a different catalog at runtime, you can map the same filename to a different pathname that might contain different versions of the entities.
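
A catalog entry for this kind of mapping might look like the following sketch. The target path is hypothetical, and the sketch assumes the resolver passes the system identifier to the catalog exactly as written in the DOCTYPE. Some resolvers make a relative system identifier absolute before the lookup, in which case a systemSuffix entry (XML Catalogs 1.1) that matches on the trailing myproject.ent may be needed instead.

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Map the bare filename used in the DOCTYPE to a central copy.
       The uri value is a hypothetical location. -->
  <system
     systemId="myproject.ent"
     uri="file:///usr/local/share/myproject/myproject.ent"/>
</catalog>
```

A second catalog with the same systemId but a different uri could then select a different version of the entities at runtime.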

Note

Files loaded as system entities are assumed to be encoded in UTF-8 unless they declare otherwise. That is why the entities file in the example begins with an XML declaration (as specified by the XML standard) indicating that the character encoding for that file is iso-8859-1 instead.

Putting customized entities in the DTD

The above method of sharing entities works fine, but if you have a large number of modular files, then inserting the parameter entity declaration and reference in the DOCTYPE of every file is a tedious process. Fortunately, if you are using a DocBook DTD, you can customize a certain DTD module to contain your entity declarations, and then they are globally available to all file modules that use the DTD.

The DocBook DTD prior to version 5 includes an empty placeholder module named dbgenent.mod that is intended for general entities declared by the user. Here is how it is referenced in the docbookx.dtd file:

<!ENTITY % dbgenent PUBLIC
"-//OASIS//ENTITIES DocBook Additional General Entities V4.5//EN"
"dbgenent.mod">
%dbgenent;

The DTD declares a parameter entity using both PUBLIC and SYSTEM identifiers, and then it references the entity. In the stock DTD, this effectively does nothing because dbgenent.mod contains nothing but comments. However, if you populate that file in the DTD directory with your own entity declarations, then they will automatically be incorporated into the DTD and be available for reference in documents.
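
As a sketch, a populated dbgenent.mod could contain nothing more than the same declarations shown in the myproject.ent example. The values here are illustrative.

```xml
<!-- dbgenent.mod: user-populated copy (contents are illustrative) -->
<!ENTITY productname "VisionFinder">
<!ENTITY version "3.0.1">
<!ENTITY userguide "Using &productname;">
```

A module that uses the standard DocBook 4.5 DOCTYPE with no internal subset could then reference &productname; directly, because the declarations arrive as part of the DTD.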

What if you cannot edit the dbgenent.mod file? That may be the case if it is on a shared system or in a read-only directory. One option is to copy the entire DTD to a new location that is writable, edit dbgenent.mod there, and have all your documents reference the DTD in the new location by using an XML catalog.
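
For the copy-the-DTD option, the catalog entry might look like the following sketch. The directory path is hypothetical, and the sketch assumes that no other catalog entry maps the DTD's module public identifiers back to the original location.

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <group prefer="public">
    <!-- Resolve the DocBook 4.5 public identifier to the writable copy.
         The path below is a hypothetical location. -->
    <public
       publicId="-//OASIS//DTD DocBook XML V4.5//EN"
       uri="file:///home/me/docbook-xml-4.5/docbookx.dtd"/>
  </group>
</catalog>
```

Because docbookx.dtd refers to dbgenent.mod by a relative system identifier, the edited copy sitting next to the relocated DTD should be the one that is loaded.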

Another option is to leave the DTD in its original place and use an XML catalog to redirect the lookup of just the dbgenent.mod file. See Chapter 5, XML catalogs for more information on catalogs. This method works if you are using the Java catalog resolver used in Saxon or Xalan, but not the catalog resolver in xsltproc. Here is how it works:

Using an XML catalog to relocate dbgenent.mod

  1. Create your version of dbgenent.mod containing your entity declarations. You can use any filename and directory location.

  2. In your XML catalog file, add an entry that matches on the PUBLIC identifier used for dbgenent.mod, but set the uri attribute to locate your new file. Include prefer="public" on the group wrapper to ensure the public identifier is used before the original system identifier.

    <group prefer="public">
      <public
         publicId="-//OASIS//ENTITIES DocBook Additional General Entities V4.5//EN"
         uri="mygenent.mod"/>
    </group>
    <nextCatalog catalog="path/to/docbook/catalog.xml"/>
    

    The public identifier in the catalog entry must exactly match the one used in your version of the DocBook DTD (this example is for version 4.5). The entry must also appear before the nextCatalog reference to the DocBook catalog file, which would resolve that public identifier to the original empty file.

Then when you process your documents with Saxon or Xalan configured to use your XML catalog, the parser will locate your new entities file and not use the empty version supplied with the DTD.

Note

Unfortunately, this process does not work with xsltproc, which skips the catalog lookup for a file if the file's system identifier works. In the case of the DocBook DTD, the original system identifier does work because the empty file is present, so no catalog lookup takes place. The only way to force xsltproc to use the catalog is to delete or rename the empty dbgenent.mod file in the DTD directory.

If you are using DocBook version 5, there is no provision in the DTD or RELAX NG schema to support entity declarations in the schema. You have to put entity declarations, including any named character entities like &trade;, into an entities file and reference that file in the DOCTYPE declaration, as described in the section “Shared text entities”.
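
A DocBook 5 module that follows this advice might look like the sketch below. The DOCTYPE carries no PUBLIC or SYSTEM identifier for a DTD, only an internal subset that pulls in the entities file. The filename myproject.ent is reused from the earlier example, and the paragraph content is hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE section [
<!-- No DTD identifiers here: the internal subset only loads entities -->
<!ENTITY % myents SYSTEM "myproject.ent">
%myents;
]>
<section xmlns="http://docbook.org/ns/docbook" version="5.0">
<title>Configuring &productname;</title>
<para>This section applies to version &version;.</para>
</section>
```

If the documents use &trade;, the entities file would also need a declaration for it, such as `<!ENTITY trade "&#x2122;">`.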