Kick-starting a DocBook Project

When I started writing XMPP: The Definitive Guide, I switched from LaTeX to DocBook as my writing tool, mainly because DocBook was O’Reilly’s suggested format. After a few months of writing with DocBook, I had grown quite attached to the format: not only does it force you to separate presentation from content, but its strict XML syntax also makes it easy to write tools that transform and validate your document. For the XMPP book, for example, we had several short Python scripts that checked, among other things, whether the stanzas used in the book were well-formed and whether all web URLs were valid. Today, I use DocBook for practically all of my documents. Because getting a DocBook environment up and running requires putting together quite a few pieces from different places, I created a “DocBook kit” that lets me start a new DocBook project without much hassle.
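To give an idea of what such a checking script can look like, here is a minimal sketch (not the script we used for the book). It assumes that every programlisting element holds exactly one XML snippet, and the function name find_malformed_listings is made up for this example:

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix so DocBook 4 and DocBook 5 files both match."""
    return tag.rsplit("}", 1)[-1]


def find_malformed_listings(docbook_source):
    """Return (listing number, parser error) for each listing that is not well-formed XML."""
    root = ET.fromstring(docbook_source)
    listings = [el for el in root.iter() if local_name(el.tag) == "programlisting"]
    problems = []
    for number, listing in enumerate(listings, start=1):
        # Assumption: the listing text is a single XML snippet (e.g. one stanza).
        snippet = "".join(listing.itertext()).strip()
        try:
            ET.fromstring(snippet)
        except ET.ParseError as error:
            problems.append((number, str(error)))
    return problems


SAMPLE = """<article>
  <programlisting><![CDATA[<message to="a@example.com"><body>Hi</body></message>]]></programlisting>
  <programlisting><![CDATA[<message><body>Hi</message>]]></programlisting>
</article>"""

problems = find_malformed_listings(SAMPLE)
for number, message in problems:
    print("listing %d: %s" % (number, message))
```

In the sample, only the second listing is reported, because its body element is never closed. Snippets that use namespace prefixes without declaring them would also be flagged, so a real script would probably need to wrap them in a suitable context first.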

Starting a DocBook project requires a few elementary tools to be installed:

  • The DocBook XML schemas. These are used to check whether your document is a valid DocBook document.
  • The DocBook XSL stylesheets. These are used to transform your DocBook input file into other formats, such as HTML or XSL-FO (a format which can be converted to PDF).
  • XML and XSL processing tools. These tools take the schemas and stylesheets mentioned above, and apply them to your document to do the actual checking and transformation. I use xmllint and xsltproc (available out of the box on many platforms and distributions), but other tools can be used as well (e.g. Saxon or Xalan).
  • An XSL-FO processor, to transform the intermediate XSL-FO format generated by the stylesheets to PDF. I use the free Apache FOP, but commercial tools such as RenderX XEP and Antenna House are very popular for this too.

Once you have all these tools, you need to tie them together to get from your DocBook file to your desired output format, such as PDF or HTML. Because this involves quite a bit of boilerplate scripting, I created a DocBook kit to make this as painless as possible. The DocBook kit comes with a Makefile, which does all the work for you. The kit also automatically downloads the DocBook XML schemas and XSL stylesheets, making it possible to start working on a DocBook project on a machine with only some basic tools (xmllint and xsltproc) installed.
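To illustrate the boilerplate involved, the following sketch builds the three commands of a typical PDF build by hand: validate, transform to XSL-FO, and render to PDF. It is not taken from the kit. The schema and stylesheet locations are assumptions, and validating with --relaxng assumes a DocBook 5 RELAX NG schema (for a DTD-based DocBook 4 document, xmllint --valid would be the equivalent):

```python
import shlex
from pathlib import PurePosixPath


def build_commands(document, schema, stylesheet_dir):
    """Return the validate, transform, and render commands for one document."""
    fo_file = str(PurePosixPath(document).with_suffix(".fo"))
    pdf_file = str(PurePosixPath(document).with_suffix(".pdf"))
    # Assumption: stylesheet_dir is an unpacked DocBook XSL distribution.
    fo_stylesheet = stylesheet_dir + "/fo/docbook.xsl"
    return [
        # 1. Expand XIncludes and validate against the schema, printing only errors.
        ["xmllint", "--noout", "--xinclude", "--relaxng", schema, document],
        # 2. Transform the DocBook source into XSL-FO.
        ["xsltproc", "--xinclude", "--output", fo_file, fo_stylesheet, document],
        # 3. Render the XSL-FO file to PDF with Apache FOP.
        ["fop", "-fo", fo_file, "-pdf", pdf_file],
    ]


# Hypothetical paths, for illustration only.
commands = build_commands("MyDocument.xml", "schemas/docbook.rng", "docbook-xsl")
for command in commands:
    print(shlex.join(command))
```

The sketch only prints the commands. To actually run them, each list could be passed to subprocess.run(command, check=True), which stops the build at the first failing step.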

To use the kit, simply drop it in the directory of your project, and create a minimal Makefile such as the following one:

# The toplevel DocBook file of our project
DOCUMENT = MyDocument.xml
# Include the DocBook Kit's makefile rules
include docbook-kit/tools/Makefile.inc

This makes several make targets available, such as:

  • make, make pdf, make html, make txt: Creates a PDF, HTML, and/or TXT version of your document.
  • make check: Validates your document syntactically.
  • make check-spelling: Runs a spell checker on your document.
  • make package: Creates a flat DocBook file (i.e. one in which all XIncludes have been expanded, so that every included part is inlined), normalizes all figure names, and packages the result up into a tarball.

The makefile also tracks the dependencies of the document, making sure that your document is rebuilt whenever one of its dependencies (e.g. images, document parts included using XInclude) changes.
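To show what such dependency tracking has to discover, here is a small sketch that lists the files a DocBook document refers to directly. It is only an illustration, not how the kit's makefile does it, and the function name direct_dependencies is made up for this example:

```python
import xml.etree.ElementTree as ET

XINCLUDE_NS = "http://www.w3.org/2001/XInclude"


def direct_dependencies(docbook_source):
    """Return XInclude targets and image files referenced by the document, in document order."""
    root = ET.fromstring(docbook_source)
    dependencies = []
    for element in root.iter():
        # Ignore the namespace of DocBook elements so DocBook 4 and 5 both work.
        name = element.tag.rsplit("}", 1)[-1]
        if element.tag == "{%s}include" % XINCLUDE_NS and element.get("href"):
            dependencies.append(element.get("href"))
        elif name == "imagedata" and element.get("fileref"):
            dependencies.append(element.get("fileref"))
    return dependencies


SAMPLE = """<book xmlns="http://docbook.org/ns/docbook"
      xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="chapter1.xml"/>
  <mediaobject>
    <imageobject><imagedata fileref="figs/arch.png"/></imageobject>
  </mediaobject>
</book>"""

print(direct_dependencies(SAMPLE))  # → ['chapter1.xml', 'figs/arch.png']
```

Included files can have dependencies of their own, so a complete version would apply the same function recursively to every XInclude target.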

Besides tools, the kit also provides a customization layer around the standard XSL stylesheets. These customizations change fonts, spacing, and other presentation parameters of the output document. You can use them as an example of how to make your DocBook document look the way you want. Detailed information on using and customizing the DocBook XSL stylesheets can be found in Bob Stayton’s DocBook XSL: The Complete Guide.
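A customization layer is simply a stylesheet that imports the standard one and overrides some of its parameters. The sketch below generates a minimal one to show its shape. It is not the kit's own layer: the stylesheet path and the chosen values are assumptions, while body.font.family and paper.type are parameters of the standard FO stylesheets:

```python
from xml.sax.saxutils import escape, quoteattr

XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def customization_layer(base_stylesheet, params):
    """Return an XSL stylesheet that imports base_stylesheet and overrides params."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<xsl:stylesheet xmlns:xsl="%s" version="1.0">' % XSL_NS,
        # The import must come first, so the overrides below take precedence.
        "  <xsl:import href=%s/>" % quoteattr(base_stylesheet),
    ]
    for name, value in params.items():
        lines.append("  <xsl:param name=%s>%s</xsl:param>" % (quoteattr(name), escape(value)))
    lines.append("</xsl:stylesheet>")
    return "\n".join(lines) + "\n"


# Hypothetical stylesheet location and values, for illustration only.
layer = customization_layer(
    "docbook-xsl/fo/docbook.xsl",
    {"body.font.family": "Georgia", "paper.type": "A4"},
)
print(layer)
```

The resulting file is then passed to the XSL processor in place of the standard stylesheet. Template overrides, which go beyond simple parameters, are added to the same file after the import.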

The DocBook kit is available from my Git repository (or on GitHub, or as a ZIP file), and comes with an example of a simple project using the kit. More tools and features will be added in the future.

Published by

Remko Tronçon

Software Engineer · Hobby musician · BookWidgets