Appendix D. Getting started with SGML/DocBook

Table of Contents
D.1. What is SGML/DocBook
D.2. Jade
D.3. DocBook
D.4. The DSSSL stylesheets
D.5. Using the tools
D.6. An alternative approach to catalog files
D.7. Producing PostScript output
D.8. Links

This appendix describes the installation of the tools needed to produce a formatted version of the NetBSD guide. SGML/DocBook and the DSSSL are not described here, but at the end of this appendix there is a section containing links to useful documents which can get you started.

The SGML/DocBook environment can be installed using the netbsd-docs meta-package; this is the easiest way and you are encouraged to use it. This appendix describes the installation of the components one by one and can be used for a more fine-grained installation or as a reference for troubleshooting your installation.

Note: It is possible that the netbsd-docs meta-package installs some useful packages which are not described in this document because they are not strictly needed for the NetBSD guide.

This document describes the installation of the SGML tools using precompiled packages. For details on packages see Chapter 8.

Note: The version numbers of the tools that we are going to install can change as new versions are added to the package system.

D.1. What is SGML/DocBook

SGML (Standard Generalized Markup Language) is a language used to define other markup-based languages: with SGML you can define the grammar (i.e. the valid constructs) of a markup language. HTML, for example, can be defined using SGML. If you are a programmer, think of SGML as something like BNF (Backus-Naur Form): a tool used to define grammars.

DocBook is a markup language defined using SGML; DocBook lists the valid tags that can be used in a DocBook document and how they can be combined. If you are a programmer, think of DocBook as the grammar of a language specified with BNF. For example, it says that the tags

<para> ... </para>    

define a paragraph, and that a <para> can be inside a <sect1> but that a <sect1> cannot be inside a <para>.

Therefore, when you write a document, you write it in DocBook and not in SGML: in this respect DocBook is the counterpart of HTML (although the markup is richer and the concepts are different).
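As a rough illustration of these nesting rules, the following shell sketch writes a tiny DocBook file. The file name sample.sgml and its contents are made up for this example, and the first line of the file assumes the DocBook 4.1 document type that is installed later in this appendix.

```shell
# Write a minimal DocBook document (hypothetical file name: sample.sgml).
# The <para> sits inside a <sect1>, which is the nesting DocBook allows.
cat > sample.sgml <<'EOF'
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V4.1//EN">
<article>
  <title>A sample article</title>
  <sect1>
    <title>First section</title>
    <para>This paragraph lives inside a section.</para>
  </sect1>
</article>
EOF

# Show what was written.
cat sample.sgml
```

Once the tools described in this appendix are installed and the catalog files are set up, running nsgmls -sv sample.sgml should validate the file; moving the <sect1> inside the <para> would be expected to produce errors.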

The DocBook specification (i.e. the list of tags and rules) is called a DTD (Document Type Definition).

In short, a DTD defines what your source documents look like, but it says nothing about the format of your final (compiled) documents. A further step is required: the DocBook sources must be converted to some other representation, for example HTML or PDF. This step is performed by a tool like Jade, which applies the DSSSL transforms to the source document. DSSSL (Document Style Semantics and Specification Language) is a format used to define the stylesheets necessary to perform the conversion from DocBook to other formats.

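The life of a DocBook document is thus the following: the source is written in DocBook markup according to the DTD, it is optionally validated with a parser such as nsgmls, and it is then converted by Jade, using the DSSSL stylesheets, to a final format such as HTML or PDF.

Therefore what you need to start working is a text editor, the DocBook DTD, and a tool like Jade together with the DSSSL stylesheets.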

D.2. Jade

Jade is an SGML/XML parser which implements the DSSSL engine. The Jade package includes the validating parser, nsgmls.

Install Jade using a precompiled package:

# pkg_add jade-1.2.1.tgz    

You will find some documentation in /usr/pkg/share/doc/jade/index.htm, but the most important directory installed is /usr/pkg/share/sgml/jade/: this is where you can find Jade's catalog file.

D.3. DocBook

The next thing that you need to install is the DocBook DTD (i.e. the template used to write DocBook documents).

This package requires the package with the character entity sets from ISO 8879:1986. Therefore let's add the entities:

# pkg_add iso8879-1986.tgz    

The entities are installed in the directory /usr/pkg/share/sgml/iso8879/ and the catalog file is /usr/pkg/share/sgml/iso8879/catalog.

Now we can install the DocBook DTD.

# pkg_add docbook-4.1.tgz    

Despite its name, this package installs several versions of the DocBook DTD (2.4.1, 3.0, 3.1, 4.0 and 4.1). This lets you process documents which use different versions of the DTD.

The root of the installation is /usr/pkg/share/sgml/docbook/4.1/. Each version of the DTD has a separate directory and each has its own catalog file, e.g. /usr/pkg/share/sgml/docbook/4.1/catalog.

D.4. The DSSSL stylesheets

Now it's time to install the DSSSL stylesheets:

# pkg_add dsssl-docbook-modular-1.57.tgz    

The stylesheets install their catalog too, in /usr/pkg/share/sgml/docbook/dsssl/modular/catalog. You will find the documentation of the Modular DocBook stylesheets in /usr/pkg/share/sgml/docbook/dsssl/modular/doc/index.html.

D.5. Using the tools

Let's try to use the tools that we have installed and produce an HTML version of the English guide.

Change to the base directory of the guide and then:

$ cd en
$ make netbsd.html
    

You will get a long list of errors, because the SGML parser, nsgmls, can't find the catalog files. Therefore, type the following commands (and add them to your ~/.profile):

SGML_ROOT=/usr/pkg/share/sgml
SGML_CATALOG_FILES=${SGML_ROOT}/jade/catalog
SGML_CATALOG_FILES=${SGML_ROOT}/iso8879/catalog:$SGML_CATALOG_FILES
SGML_CATALOG_FILES=${SGML_ROOT}/docbook/3.0/catalog:$SGML_CATALOG_FILES
SGML_CATALOG_FILES=${SGML_ROOT}/docbook/3.1/catalog:$SGML_CATALOG_FILES
SGML_CATALOG_FILES=${SGML_ROOT}/docbook/4.0/catalog:$SGML_CATALOG_FILES
SGML_CATALOG_FILES=${SGML_ROOT}/docbook/4.1/catalog:$SGML_CATALOG_FILES
SGML_CATALOG_FILES=${SGML_ROOT}/docbook/dsssl/modular/catalog:$SGML_CATALOG_FILES
export SGML_CATALOG_FILES    
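The same list can also be built with a loop. The following is only a sketch: the function name build_sgml_catalogs is made up for this example, and skipping catalogs that do not exist is an addition that the assignments above do not make.

```shell
# Print a colon-separated catalog list for the SGML root given as $1.
# Directories are visited in the same order as the assignments above and
# each catalog is prepended, so the final ordering matches them too.
# Catalogs that do not exist are skipped (an addition in this sketch).
build_sgml_catalogs() {
    root=$1
    list=
    for dir in jade iso8879 docbook/3.0 docbook/3.1 docbook/4.0 \
               docbook/4.1 docbook/dsssl/modular; do
        catalog="$root/$dir/catalog"
        if [ -f "$catalog" ]; then
            list="$catalog${list:+:$list}"
        fi
    done
    printf '%s\n' "$list"
}

SGML_ROOT=/usr/pkg/share/sgml
SGML_CATALOG_FILES=$(build_sgml_catalogs "$SGML_ROOT")
export SGML_CATALOG_FILES
```

If a package is missing, its catalog is simply left out of the variable; echo "$SGML_CATALOG_FILES" shows which catalogs were actually found.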

Once the SGML_CATALOG_FILES environment variable is set, run make again:

$ make netbsd.html
nsgmls -sv netbsd.sgml
nsgmls:I: SP version "1.3.3"
jade -d ../dsl/myhtml.dsl -t sgml -o netbsd.html netbsd.sgml    

This time everything goes well and the HTML version of the guide is generated. The RTF version is created in the same way:

$ make netbsd.rtf
nsgmls -sv netbsd.sgml
nsgmls:I: SP version "1.3.3"
jade -d ../dsl/myrtf.dsl -t rtf -o netbsd.rtf netbsd.sgml    
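Both targets run the same validation step and differ only in the stylesheet and in the backend passed to jade. The sketch below prints the commands instead of running them; the function name jade_command is made up for this example, and only the two formats shown above are handled.

```shell
# Print the jade command for one output format, mirroring the make output
# shown above. Only "html" and "rtf" are handled, since those are the only
# two invocations shown in this appendix.
jade_command() {
    case $1 in
        html) style=../dsl/myhtml.dsl; backend=sgml ;;
        rtf)  style=../dsl/myrtf.dsl;  backend=rtf ;;
        *)    echo "unsupported format: $1" >&2; return 1 ;;
    esac
    echo "jade -d $style -t $backend -o netbsd.$1 netbsd.sgml"
}

# Validation comes first, then the conversion.
for format in html rtf; do
    echo "nsgmls -sv netbsd.sgml"
    jade_command "$format"
done
```

Note that the HTML output uses the sgml backend of jade, while RTF has a backend of its own.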

With this setup you can create only the HTML and RTF versions; the generation of PS and PDF requires the installation and configuration of TeX and Jadetex.

D.6. An alternative approach to catalog files

In my installations I usually create a master catalog file which references all the other catalog files. If you like this approach, create the /usr/pkg/share/sgml/catalog file containing the following lines:

CATALOG "/usr/pkg/share/sgml/docbook/3.0/catalog"
CATALOG "/usr/pkg/share/sgml/docbook/3.1/catalog"
CATALOG "/usr/pkg/share/sgml/docbook/4.0/catalog"
CATALOG "/usr/pkg/share/sgml/docbook/4.1/catalog"
CATALOG "/usr/pkg/share/sgml/docbook/dsssl/modular/catalog"
CATALOG "/usr/pkg/share/sgml/iso8879/catalog"
CATALOG "/usr/pkg/share/sgml/jade/catalog"    

When you have created this file you can simplify your ~/.profile like this:

SGML_CATALOG_FILES=/usr/pkg/share/sgml/catalog
export SGML_CATALOG_FILES    
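If you prefer not to type the master catalog by hand, it can be generated. This is a minimal sketch: the function name write_master_catalog is made up for this example, and leaving out entries whose catalog file is missing is an addition to the listing above.

```shell
# Write a master catalog to file $2 that references the catalogs under the
# SGML root $1, in the same order as the listing above. Entries whose
# catalog file is missing are left out (an addition in this sketch).
write_master_catalog() {
    root=$1
    master=$2
    : > "$master" || return 1
    for dir in docbook/3.0 docbook/3.1 docbook/4.0 docbook/4.1 \
               docbook/dsssl/modular iso8879 jade; do
        if [ -f "$root/$dir/catalog" ]; then
            printf 'CATALOG "%s"\n' "$root/$dir/catalog" >> "$master"
        fi
    done
}

# Example call (run as root on a real system):
# write_master_catalog /usr/pkg/share/sgml /usr/pkg/share/sgml/catalog
```

The function overwrites the target file, so keep a copy of any master catalog you have already edited by hand.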

D.7. Producing PostScript output

To create a printable version of the guide the following steps are needed:

The following sections describe each of the steps in detail.

D.7.3. Creating the hugelatex format

Jadetex requires the hugelatex format, which is not included in the default installation of teTeX. Make a backup copy of /usr/pkg/share/texmf/web2c/texmf.cnf and add the following lines at the end of the file (we will need the jadetex and pdfjadetex settings when we install Jadetex later):

% hugelatex settings
main_memory.hugelatex = 1100000
param_size.hugelatex = 1500
stack_size.hugelatex = 1500
hash_extra.hugelatex = 15000
string_vacancies.hugelatex = 45000
pool_free.hugelatex = 47500
nest_size.hugelatex = 500
save_size.hugelatex = 5000
pool_size.hugelatex = 500000
max_strings.hugelatex = 55000
font_mem_size.hugelatex = 400000

% jadetex & pdfjadetex
main_memory.jadetex = 1500000
param_size.jadetex = 1500
stack_size.jadetex = 1500
hash_extra.jadetex = 15000
string_vacancies.jadetex = 45000
pool_free.jadetex = 47500
nest_size.jadetex = 500
save_size.jadetex = 5000
pool_size.jadetex = 500000
max_strings.jadetex = 55000

main_memory.pdfjadetex = 2500000
param_size.pdfjadetex = 1500
stack_size.pdfjadetex = 1500
hash_extra.pdfjadetex = 50000
string_vacancies.pdfjadetex = 45000
pool_free.pdfjadetex = 47500
nest_size.pdfjadetex = 500
save_size.pdfjadetex = 5000
pool_size.pdfjadetex = 500000
max_strings.pdfjadetex = 55000
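The backup and the append can be done in one step. The following is a sketch only: the function name append_with_backup, the .orig suffix and the file jadetex-settings.cnf (assumed to hold the settings listed above) are all made up for this example.

```shell
# Back up the configuration file $1, then append the contents of $2 to it.
# Refuses to run if a backup already exists, so an earlier copy of the
# original file is never overwritten.
append_with_backup() {
    cnf=$1
    additions=$2
    if [ -e "$cnf.orig" ]; then
        echo "backup $cnf.orig already exists, nothing changed" >&2
        return 1
    fi
    cp -p "$cnf" "$cnf.orig" && cat "$additions" >> "$cnf"
}

# Example call (as root), with the settings saved in a hypothetical file:
# append_with_backup /usr/pkg/share/texmf/web2c/texmf.cnf jadetex-settings.cnf
```

Because the function stops when a backup is already present, running it twice by mistake does not append the settings a second time.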

This is how the hugelatex format can be created according to the Jadetex installation guide:

# cp -R /usr/pkg/share/texmf/tex/latex/config /tmp
# cd /tmp/config
# tex -ini -progname=hugelatex latex.ini
# mv latex.fmt hugelatex.fmt
# mv hugelatex.fmt /usr/pkg/share/texmf/web2c
# ln -s /usr/pkg/bin/tex /usr/pkg/bin/hugelatex
      

D.7.4. Installing Jadetex

Fetch the most recent distribution of Jadetex (currently jadetex-3.12.zip), unzip it, then:

# cd jadetex
# make install
# mktexlsr
# cd /usr/pkg/bin
# ln -s tex jadetex
      

During the installation, the jadetex and pdfjadetex format files are copied to the TeX tree along with other utility files.

The jadetex distribution contains two manual pages that are not installed automatically. You can just copy them manually; for example:

# cp jadetex.1 pdfjadetex.1 /usr/local/man/man1      

Now you are ready to create the PostScript version of the NetBSD guide (and of any document you like, of course).

D.8. Links

You can find a simple and well-written introduction to SGML/DocBook and a description of the tools in SGML comme format de fichier universel.

The official DocBook home page is where you can find the definitive documentation on DocBook. You can also read online or download a copy of the book DocBook: The Definitive Guide by Norman Walsh and Leonard Muellner.

For DSSSL, start by looking at nwalsh.com.

Jade/OpenJade sources and info can be found on the OpenJade Home Page.

If you want to produce PostScript and PDF documents from your DocBook source, look at the home page of JadeTex.

The home page of Markus Hoenicka explains everything you need to know if you want to work with SGML/DocBook on the Windows NT platform.