Kick-starting a DocBook Project

When I started writing XMPP: The Definitive Guide, I switched from LaTeX to DocBook as my writing tool, mainly because DocBook was O’Reilly’s suggested format. After a few months of writing with DocBook, I became quite attached to the format: not only does it force you to separate presentation from content, but its strict XML syntax also makes it easy to write tools that transform and validate your document. For example, for the XMPP book, we had several short Python scripts that checked whether the stanzas used in the book were well-formed, whether all web URLs were valid, and so on. Today, I use DocBook for practically all of my documents. Because getting a DocBook environment up and running requires putting together quite a few pieces from different places, I created a “DocBook kit” that lets me start a new DocBook project without much hassle.
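A check like the stanza one can be quite short. The following is a minimal sketch rather than the script used for the book: it assumes the stanzas live in programlisting elements of a DocBook 4.x (non-namespaced) document, and the function name find_malformed_listings is made up for illustration.

```python
import xml.etree.ElementTree as ET


def find_malformed_listings(docbook_text):
    """Return (index, error message) for each listing that is not well-formed XML."""
    problems = []
    root = ET.fromstring(docbook_text)
    for index, listing in enumerate(root.iter("programlisting")):
        snippet = "".join(listing.itertext()).strip()
        # A listing may hold several sibling stanzas, so wrap them in a
        # dummy root element before handing them to the parser.
        try:
            ET.fromstring("<wrapper>" + snippet + "</wrapper>")
        except ET.ParseError as error:
            problems.append((index, str(error)))
    return problems
```

Note that this sketch treats every listing as XML, and that a stanza using an undeclared namespace prefix would also be reported, since the parser is namespace-aware.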

Starting a DocBook project requires a few elementary tools to be installed:

  • The DocBook XML schemas. These are used to check whether your document is a valid DocBook document.
  • The DocBook XSL stylesheets. These are used to transform your DocBook input file into other formats, such as HTML or XSL-FO (a format which can be converted to PDF).
  • XML and XSL processing tools. These tools take the schemas and stylesheets mentioned above, and apply them to your document to do the actual checking and transformation. I use xmllint and xsltproc (available out of the box on many platforms and distributions), but other tools can be used as well (e.g. Saxon or Xalan).
  • An XSL-FO processor, to transform the intermediate XSL-FO format generated by the stylesheets to PDF. I use the free Apache FOP, but commercial tools such as RenderX XEP and Antenna House are very popular for this too.

Once you have all these tools, you need to tie them together to get from your DocBook file to your desired output format, such as PDF or HTML. Because this involves quite a bit of boilerplate scripting, I created a DocBook kit to make it as painless as possible. The kit comes with a Makefile that does all the work for you. It also automatically downloads the DocBook XML schemas and XSL stylesheets, so you can start working on a DocBook project on a machine with only some basic tools (xmllint and xsltproc) installed.
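To make the boilerplate concrete, here is a rough sketch of the three steps behind a PDF build. It is not the kit's actual Makefile logic: the function name, the stylesheet location docbook-xsl/fo/docbook.xsl, and the choice of options are assumptions about a typical xmllint/xsltproc/FOP setup.

```python
from pathlib import Path


def pdf_build_commands(document, fo_stylesheet="docbook-xsl/fo/docbook.xsl"):
    """Return the command lines that take a DocBook file to PDF."""
    source = Path(document)
    fo_file = source.with_suffix(".fo")
    pdf_file = source.with_suffix(".pdf")
    return [
        # 1. Expand XIncludes, then validate against the DTD named in the document.
        ["xmllint", "--noout", "--xinclude", "--postvalid", str(source)],
        # 2. Apply the XSL-FO stylesheet (the default path here is hypothetical).
        ["xsltproc", "--xinclude", "--output", str(fo_file), fo_stylesheet, str(source)],
        # 3. Let Apache FOP render the intermediate XSL-FO file to PDF.
        ["fop", "-fo", str(fo_file), "-pdf", str(pdf_file)],
    ]
```

Each returned list can be handed to subprocess.run(command, check=True) in turn, so that a failed validation stops the build before any transformation happens.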

To use the kit, simply drop it in the directory of your project, and create a minimal Makefile such as the following one:

# The toplevel DocBook file of our project
DOCUMENT = MyDocument.xml
# Include the DocBook Kit's makefile rules
include docbook-kit/tools/Makefile.inc

This makes several make commands available, such as:

  • make, make pdf, make html, make txt: Create a PDF, HTML, and/or TXT version of your document.
  • make check: Validates your document syntactically.
  • make check-spelling: Runs a spell checker on your document.
  • make package: Creates a flat DocBook file (i.e. one in which all parts included via XInclude are expanded inline), normalizes all figure names, and packages the result into a tarball.
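The flattening step of make package can be illustrated with Python's standard library. This is only a sketch of the idea, not how the kit does it: flatten_docbook is a hypothetical name, included files are resolved relative to the top-level document only, and the figure renaming and tarball steps are left out.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from xml.etree import ElementInclude


def flatten_docbook(path):
    """Parse the file at `path` and expand its xi:include elements in place."""
    base_dir = Path(path).parent

    def load(href, parse, encoding=None):
        # Resolve included files relative to the top-level document.
        target = base_dir / href
        if parse == "xml":
            return ET.parse(target).getroot()
        return target.read_text(encoding=encoding or "utf-8")

    tree = ET.parse(path)
    ElementInclude.include(tree.getroot(), loader=load)
    return tree
```

Calling flatten_docbook("MyDocument.xml").write("flat.xml", encoding="utf-8") would then write the flat file. ElementTree drops the DOCTYPE declaration, so a real packaging step would have to add it back.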

The makefile also tracks the dependencies of the document, making sure that your document is rebuilt whenever one of its dependencies (e.g. images, document parts included using XInclude) changes.
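Finding such dependencies boils down to scanning the document for file references. The sketch below shows one way to do that; it is an assumption about the general approach rather than the kit's implementation, the name direct_dependencies is hypothetical, and it only looks at xi:include and imagedata elements in a single file.

```python
import xml.etree.ElementTree as ET

XINCLUDE = "{http://www.w3.org/2001/XInclude}include"


def direct_dependencies(docbook_text):
    """List files referenced through xi:include href or imagedata fileref."""
    root = ET.fromstring(docbook_text)
    dependencies = []
    for element in root.iter():
        if element.tag == XINCLUDE and element.get("href"):
            dependencies.append(element.get("href"))
        elif element.tag == "imagedata" and element.get("fileref"):
            dependencies.append(element.get("fileref"))
    return dependencies
```

A complete tracker would repeat this for every included file and write the result out as make prerequisites for the output targets.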

Besides tools, the kit also provides a customization layer around the standard XSL stylesheets. These customizations change fonts, spacing, and other presentation parameters of the output document. You can use them as an example of how to make your DocBook document look the way you want. Detailed information on using and customizing the DocBook XSL stylesheets can be found in Bob Stayton’s DocBook XSL: The Complete Guide.

The DocBook kit is available from my Git repository (or on GitHub, or as a ZIP file), and comes with an example of a simple project using the kit. More tools and features will be added in the future.

9 Responses to “Kick-starting a DocBook Project”

  1. Matěj Cepl says:

    Well, the obvious question: what did you use for actually writing the text? If it was Emacs with nxml-mode (I don’t think there is much else available in the open source world), had you used Emacs before, or did you install it just for the book?

  2. @Matěj A good question. I used good ol’ Vim (without the XML plugin I found on the net; it annoyed me too much). Writing XML in Vim sounds painful, but it wasn’t, really (and I think my co-authors on the XMPP book, who also used Vim, agree).

    For a non-commercial article, I did a test-drive of XMLMind, which is nice because it gives you WYSIWYG, and even expands XIncludes inline, making it very easy to review and correct the document. It’s also relatively easy to add new entries to bibliographies etc. However, I quickly went back to Vim, because any type of restructuring is very hard in a restrictive editor like XMLMind.

    Personally, I would still recommend using your own favorite editor, be it Vim, Emacs, TextMate, or something else. And if you have good plugins for XML that don’t get in your way, all the better!

  3. stpeter says:

    You rock. This looks like a great “starter kit” for working with DocBook.

  4. Jonas says:

    Great!
    Conglomerate is a free visual/source XML editor for GNU/Linux which has DocBook templates (Article, Book, …). I recommend it.

  5. JRDegan says:

    Thanks for that, in particular the sample customization layer. I have been bending my brain trying to get my head around that for a while and longing for a straightforward example to push me along. This helps enormously.
    For the record, I am using an editor from a company in Romania and I have found it to be excellent. Not open source, but the company is paying, so what the hell?
    All the best.

  6. [...] added a DocBook XSL customization layer to my DocBook Kit that outputs an HTML/PHP version of the document that automatically integrates with a WordPress [...]

  7. [...] for a similar project, and, surprisingly, didn’t find too much in this space. I did find one project which has precisely the same goals, but it relies on make and other command-line tools typically [...]

  8. Ron Smith says:

    What version of DocBook does the kit support? 4.X, 5?

  9. @Ron 4.X

    The tool support (xmllint, xsltproc, docbook-xsl) for DocBook 5 wasn’t perfect yet the last time I checked, which is why I’m holding off on DocBook 5. However, I think you can use the kit with 5 as is, but you’ll get a bunch of warnings from the stylesheets and the processor.
