Version 1.0d

Xhtmlpp Reference

Description

Xhtmlpp is a preprocessor for XHTML files, and is intended to simplify the task of maintaining large sets of XHTML documents. You provide xhtmlpp with a document that is a mix of XHTML-tagged text and xhtmlpp commands. Xhtmlpp generates a set of XHTML files from that document.

Command-line Syntax

To run xhtmlpp, use the following syntax:

xhtmlpp [-option...] filename ...

If filename has no extension, xhtmlpp assumes '.xhp'. You can use these command-line options:

All command-line options can be shortened to their significant letters; for example, '-d' is the same as '-debug', and '-nof' is the same as '-nofunc'.

Inserting Symbols

Xhtmlpp replaces symbols in command lines and XHTML text. You can specify a symbol in various ways:

$(name)
Inserts the symbol name. If it is not defined and the multilingual symbols option is turned on (see $(USE_LANG) variable below), xhtmlpp will also search for the symbol name.xx where xx is the current language code (see $(LANG) variable below). If none of the symbols is defined (see .define command below) you get an error message.
$(name?default)
Inserts the symbol name. If it is not defined and the multilingual symbols option is turned on, it will also search for the symbol name.xx where xx is the current language code. If none of the symbols is defined, inserts the supplied default value.
$(*name)
Inserts a link for the symbol name. This is shorthand for:
<a href="$(name)">name</a>.
If the symbol is not defined and the multilingual symbols option is turned on, it will also search for the symbol name.xx where xx is the current language code. Also, if name already includes a language code (e.g. $(*home.es)) which is different from the current language and the multilingual symbols option is turned on, the XHTML "hreflang=xx" attribute is automatically added in the <a...> tag.
If both symbols name and name.$(LANG) have an empty value, the <a...> and </a> tags are left out; that is, the link is not active.
$(*name*attributes*)
Inserts a link for the symbol name, including attributes within the <a...> tag. This is useful if you use attributes in your XHTML. This is shorthand for:
<a href="$(name)" attributes>name</a>.
If the symbol is not defined and the multilingual symbols option is turned on, it will also search for the symbol name.xx where xx is the current language code. Also, if name already includes a language code (e.g. $(*home.es)) that is different from the current language and the multilingual symbols option is turned on, the XHTML "hreflang=xx" attribute is automatically added in the <a...> tag.
If both symbols name and name.$(LANG) have an empty value, the <a...> and </a> tags are left out.
$(*name=label)
$(*name="label")
Inserts a link for the symbol name, with label as specified. This is shorthand for:
<a href="$(name)">label</a>.
If the symbol is not defined and the multilingual symbols option is turned on, it will also search for the symbol name.xx where xx is the current language code. Also, if name already includes a language code (e.g. $(*home.es)) that is different from the current language and the multilingual symbols option is turned on, the XHTML "hreflang=xx" attribute is automatically added in the <a...> tag.
If both symbols name and name.$(LANG) have an empty value, the <a...> and </a> tags are left out. You can use double quotes if the label itself contains ')' or its first character is '*'.
$(*name*attributes*=label)
$(*name*attributes*="label")
Inserts a link for the symbol name, with label as specified, and including attributes within the <a...> tag. This is shorthand for:
<a href="$(name)" attributes>label</a>.
If the symbol is not defined and the multilingual symbols option is turned on, it will also search for the symbol name.xx where xx is the current language code. Also, if name already includes a language code (e.g. $(*home.es)) that is different from the current language and the multilingual symbols option is turned on, the XHTML "hreflang=xx" attribute is automatically added in the <a...> tag.
If both symbols name and name.$(LANG) have an empty value, the <a...> and </a> tags are left out. You can use double quotes if the label itself contains ')'.
$(*name=)
Inserts a link for the symbol name, with the full reference as label. This is shorthand for:
<a href="$(name)">$(name)</a>.
If the symbol is not defined and the multilingual symbols option is turned on, it will also search for the symbol name.xx where xx is the current language code. Also, if name already includes a language code (e.g. $(*home.es)) that is different from the current language and the multilingual symbols option is turned on, the XHTML "hreflang=xx" attribute is automatically added in the <a...> tag.
$(*name=*attributes*)
Inserts a link for the symbol name, with the full reference as label, and including attributes within the <a...> tag. This is shorthand for:
<a href="$(name)" attributes>$(name)</a>.
If the symbol is not defined and the multilingual symbols option is turned on, it will also search for the symbol name.xx where xx is the current language code. Also, if name already includes a language code (e.g. $(*home.es)) that is different from the current language and the multilingual symbols option is turned on, the XHTML "hreflang=xx" attribute is automatically added in the <a...> tag.
&(Perl program fragment)
Replaces the symbol by the output of the specified Perl code. The Perl code is executed using the eval command -- see your Perl documentation if you want to use this feature. This is how to replace the symbol by the output of a Perl program: &(`perl program`). You must enclose the program fragment in double quotes if it contains ( or ), or else escape these characters using '\'.
&name(arguments)
Replaces the symbol by the result of an intrinsic function. These are predefined functions that xhtmlpp provides for various purposes.
%(variable)
Replaces the symbol by the value of an environment variable. If the variable does not exist, inserts an empty value. For portability, always define environment variables in uppercase.
%(variable?default)
Replaces the symbol by the value of an environment variable. If the variable does not exist, inserts the specified default value. For portability, always define environment variables in uppercase.
\(
Replaces this by "(". This is to 'escape' symbol definitions so that they are not translated.
\.
Replaces this by ".". This is to 'escape' dots so that they are not interpreted as commands, at the start of a line.

You can define symbols in terms of symbols: $($(name)) is quite okay, if you know what you are doing. Xhtmlpp inserts symbols from right to left in the line.
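
As an illustration, the fragment below combines several of the forms listed above. It is only a sketch: the symbol names home, contact, and css, their values, and the environment variable SITE_OWNER are invented for the example.

```
.define home      index.html
.define contact   mailto:webmaster@example.com
<p>Start at $(*home="the home page") or write to $(*contact=).</p>
<p>Maintained by %(SITE_OWNER?webmaster).</p>
<p>Style sheet: $(css?default.css)</p>
<p>A literal reference looks like this: $\(home).</p>
```

Going by the descriptions above, the first link should come out as <a href="index.html">the home page</a>, the second should use the full mailto: reference as its label, the undefined symbol css should fall back to default.css, and the escaped reference in the last line should be left in the output as $(home).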

Types of Symbol

Symbols are of various types

Standard Symbols

Xhtmlpp provides these standard symbols for use at any point in the document:

$(DATE)
The date that xhtmlpp started, formatted as an 8-character string: YY/MM/DD.
$(TIME)
The time that xhtmlpp started, formatted as an 8-character string: HH:MM:SS.
$(DOCBASE)
The main document filename, without extension.
$(INC)
A counter, which starts at zero and is bumped-up each time you refer to it. This is used to number filenames, in the .page command. The first time you use $(INC), it is empty - i.e. "". The second time it is "1", then "2", "3", "many", "manymany", and "manymanymany" (joke, sorry).
$(PAGE)
After a .page command, this holds the page filename, exactly as specified in the .page command.
$(TITLE)
After a .page command, this holds the page title. It is nice to use this in the header block.
$(PIPE_TITLE)
After a .pipe command, this holds the pipe title. It is nice to use this in the pipe_header block.
$(PASS)
Contains either 0 or 1, depending on whether xhtmlpp is scanning for titles (0) or building the output files (1).
You may want to (re)define some of these symbols:
$(BASE)
Defined as "doc". This is used in .page commands for automatic filename generation.
$(EXT)
Defined as "html", and commonly-used hot on the heels of a $(BASE).
$(DIR)
Defined as "." and used to prefix the filename for the generated HTML pages.
$(SILENT)
Defined as 0. If you .define this as 1, xhtmlpp will try to be a bit quieter. When you are generating *lots* of pages, it is easy to lose real warnings and errors amidst the information messages.
$(LINEMAX)
Defined as 79. Xhtmlpp warns if it finds longer lines. If you don't want to see these warnings (for example, if you have JavaScript in your pages), set it to 0.
$(DEBUG_MODE)
Has the value 1 when xhtmlpp is operating in debug mode, and 0 otherwise. To use debug mode, use the -debug command-line option.
$(LANG)
A two-character code for the language that xhtmlpp uses in the date-formatting subroutines. It is also used to search for symbols of type name.$(LANG) when multilingual variable search is activated. Its default value is "en" for English. Other supported languages are "es" for Spanish, "dk" for Danish, and "fr" for French.
$(USE_LANG)
Defined as 0. If you .define this as 1, xhtmlpp will search for multilingual symbols of type name.$(LANG), and will add the hreflang attribute in <a...> tags when the name in $(*name) already ends with a .xx language code.
$(USE_RELPATH)
Defined as 0. If you .define this as 1, all $(*name) links whose URL does not start with "http://", "ftp:", "mailto:", "./", or "../" will be considered within-site absolute links and will be made relative.
$(Hn)
Where 'n' is 1 to 9, defines a header level number. You would use this to generate headers like this:
1
1.1
1.2
1.2.1
1.2.2 ... etc.
Xhtmlpp automatically manages the numbering of header levels. You are, however, limited to the 'dotted number' syntax.
Unless you use .ignore pages, these symbols are available in header and footer blocks (you can use them elsewhere, but you'll get warnings):
$(FIRST_PAGE)
The filename for the first page of the document.
$(LAST_PAGE)
The filename for the last page of the document.
$(NEXT_PAGE)
The filename for the next page of the document.
$(PREV_PAGE)
The filename for the previous page of the document.
$(FIRST_TITLE)
The title for the first page of the document.
$(LAST_TITLE)
The title for the last page of the document.
$(NEXT_TITLE)
The title for the next page of the document.
$(PREV_TITLE)
The title for the previous page of the document.
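
A typical use of these symbols is a navigation bar in the footer block. The fragment below is a sketch under assumptions: the markup is invented for the example, and it relies on the .block command and the $(*name="label") link form described elsewhere in this reference.

```
.block header
<html><head><title>$(TITLE)</title></head>
<body>
<h1>$(TITLE)</h1>
.block footer
<hr />
<p>
$(*PREV_PAGE="$(PREV_TITLE)") |
$(*FIRST_PAGE="$(FIRST_TITLE)") |
$(*NEXT_PAGE="$(NEXT_TITLE)")
</p>
</body></html>
.endblock
```

The link form is used here rather than a hand-written <a> tag because, as described under Inserting Symbols, xhtmlpp leaves out the <a...> and </a> tags when the symbol has an empty value. Whether $(PREV_PAGE) is actually empty on the first page is not stated in this reference, so check the generated output.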

In addition, xhtmlpp will include the current environment symbols if you run it with the -env option. You can use this to redefine any of the standard symbols such as $(EXT). Remember that you can also access any of the environment symbols using the %(...) syntax; e.g. %(PATH).

Xhtmlpp Commands

An xhtmlpp command starts with a dot, in column 1, followed by a keyword. You can put spaces between the dot and the keyword. To continue the command line over the next line, end the line with a hyphen (though you need to put at least the dot and the keyword on the same line). Commands can be in upper- or lower-case: .endblock and .EndBlock are equivalent.

These are the commands that xhtmlpp understands:

.define symbol [value]
Define a symbol with the specified value. The symbol name can consist of letters, digits, -, ., and _. The value is everything else up to the end of the line. If you omit the value, the variable is undefined. You can redefine a variable as often as you like simply by repeating the .define command. Use lowercase for your own symbols. Predefined xhtmlpp symbols are uppercase. Case is significant. You can assign values to the built-in xhtmlpp variables like INC if you want to. In some cases this is even useful. If you append .xx to the symbol name, where xx is a two-character language code, you can afterwards use the variable as $(symbol) without writing the language code, provided $(USE_LANG)=1 and $(LANG)=xx. You should use standardized ISO-639 language codes.
.define symbol = expression
Evaluates the expression and stores the result in symbol. Note that you must use '=' to evaluate an expression. Otherwise the expression is considered as a string and stored as-is in the symbol. Xhtmlpp passes the expression to Perl for evaluation, so you can use any valid Perl syntax. If you want your xhtmlpp files to be portable to (future) non-Perl implementations of xhtmlpp, restrict the expressions to simple arithmetic (+, -, *, /, and parentheses). This is an example:
.define count = 1
.echo $(count)
.define count = $(count) + 1
.echo $(count)
Of course it helps to know that xhtmlpp will evaluate all variables before passing the expression to Perl to work out. So, the second .define is evaluated as '1 + 1'. If you decide to rely on Perl (a good bet for now), you can use the .define = command to execute shell commands, e.g.:
.if $(PASS)
.  define junk = system "rm *.htm";
.endif
If you append .xx to the symbol name, where xx is a two-character language code, you can afterwards use the variable as $(symbol) without writing the language code, provided $(USE_LANG)=1 and $(LANG)=xx. You should use standardized ISO-639 language codes.
.define symbol++ initial_value
Creates or re-initialises a counter with the initial value. Each time you use the counter symbol, it is incremented. The $(INC) symbol is actually defined internally like this:
.define INC++ ""
Note that the empty string is treated as zero; the next time the symbol will be '1'. You can also use '--' after the symbol name to subtract one from its value each time it is used. You can stick the '++' or '--' before the symbol name: then the symbol is incremented or decremented before its value is taken.
The .define statement is resolved as late as possible: if the statement refers to other variables, these are inserted when the .define'd variable is used, rather than when it is defined. For instance, if you refer to a .define'd variable in the page header, it will be re-evaluated each time the page header is output.
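The two counter forms can be compared side by side. This is a sketch only: the counter names fig and step are invented, and the values mentioned afterwards are what the description above implies, not captured output.

```
.define fig++ 1
.define ++step 0
<p>Figure $(fig): overview</p>
<p>Figure $(fig): detail</p>
<p>Step $(step)</p>
<p>Step $(step)</p>
```

With '++' after the name, the value is taken first and then incremented, so the figures should be numbered 1 and 2. With '++' before the name, the increment happens first, so a counter that starts at 0 should also give 1 and 2. Every reference bumps the counter, so refer to it exactly once per item you want to number.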
.macro [-nosplit|-noquote] name macro-body
Defines or redefines a macro. The macro body can go over several lines; end each continuation line with '-'. When xhtmlpp outputs a macro, it replaces arguments in the body with values you supply at the time. The arguments are $1, $2, and upwards and $*, which work as follows: $1 to $n are the first to n'th argument you supply; $* is the whole string of arguments. If the macro body is "", the macro is given an empty body. Otherwise, any quotes you use will be included in the macro body as-is. If you use the -nosplit option before the macro name, the macro will only ever have one argument, $1. Without this option, macro arguments are split on whitespace, with quotes and apostrophes being used to group arguments. See the section on macros for details. The -noquote option automatically escapes quotes in the macro arguments.
.include filename
Start reading from the specified file. You can nest .include files as much as you like. Xhtmlpp checks for circular references. If the same file was already included earlier, xhtmlpp ignores the command, like the Perl 'require' operator. Xhtmlpp searches along the XHTMLPP_PATH environment variable for the file. If you specify a filename with a full path, xhtmlpp won't search the XHTMLPP_PATH. If xhtmlpp can't find the document using XHTMLPP_PATH, it'll search LIBPATH (which was used by htmlpp) too. If xhtmlpp can't find the document using LIBPATH, it'll search PATH too.
.include filename!
Include the file in any case, like a C #include directive.
.include `command`
Execute 'command' and include the output of the command in the generated XHTML text. The command can be any program with arguments; it should respect any operating system conventions or limitations. The output text can contain xhtmlpp symbols in the normal manner. It cannot contain xhtmlpp commands.
.page filename = ["]title["]
Start writing a new XHTML file. The title is required. At any point after the .page, you can refer to $(PAGE) and $(TITLE) for the current file name and title. For instance, you'll often see this:
<h1>$(TITLE)</h1>
.page ["]title["]
Equivalent to .page $(BASE)$(INC).$(EXT) = "title". Just easier.
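The two forms of .page can be mixed in one document. In this sketch the value guide for BASE, the page titles, and the filename faq.html are invented for the example.

```
.define BASE guide
.page "Introduction"
<p>First page.</p>
.page "Installation"
<p>Second page.</p>
.page faq.html = "Questions and Answers"
<p>A page with an explicit filename.</p>
```

Assuming $(EXT) keeps its default value "html" and $(INC) behaves as described under Standard Symbols (empty on first use, then 1, 2, and so on), the first two pages should be written to guide.html and guide1.html.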
.pipe filename=title
Tells xhtmlpp to create a secondary file, as specified, and to send output there as well as to the primary file specified by the last .page command. The .pipe command is useful when you want to send part of an HTML page into another file, for instance to generate a small readme.htm file for an installation. The piped file is prefixed by the PIPE_HEADER block and ended with the PIPE_FOOTER block, if these are defined.
.ignore header
Ignore the next header line as far as the table of contents is concerned. This is good for headers like <h2>Table of Contents</h2>.
.ignore header level
Ignores all headers with level greater than or equal to level. This is useful if a section has a lot of H3s and H4s that you don't want in the table of contents. Use .ignore header 99 to re-include all further headers.
.ignore pages
Ignore all .page commands except to pick-up the page titles. Use this when you want to create a super-document. When you use .ignore pages, xhtmlpp also ignores the .build toc and .build index commands. So, if you want a table of contents, do the .build toc before you say .ignore pages. You can also use .if commands to skip blocks of text under certain conditions.
.ignore page
Ignore next .page command for any future .build index command. This is the right way of keeping the index page itself out of the index. Note that the index page does take part in the general page-to-page linking scheme provided by $(PREV_PAGE) and such.
.if expression
[.else]
.endif
If the expression returns a false value, xhtmlpp skips until the .else or .endif line. You can nest .if blocks. An .else is always part of the closest preceding .if. Xhtmlpp passes the expression to Perl for evaluation, so you can use any valid Perl syntax. This is quite okay:
.if -f myfile.htm
An .if block must be entirely in one line.
These are some examples of .if expressions:
.if $(number) == 0
.if $(number) != 1
.if $(number) > 2
.if $(string) eq "value"
.if $(string) ne "value"
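A complete .if block with an .else branch might look like the sketch below. It uses only the standard symbols $(DEBUG_MODE), $(DATE), $(TIME), and $(LANG); the paragraph text is invented for the example.

```
.if $(DEBUG_MODE)
<p>Draft built on $(DATE) at $(TIME).</p>
.else
<p>Last updated $(DATE).</p>
.endif
.if $(LANG) eq "es"
<p>Esta página también existe en inglés.</p>
.endif
```

Because xhtmlpp inserts symbol values before it passes the expression to Perl, the first test is evaluated as 1 or 0, and the second as a string comparison against the current language code.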
.block blockname
Define a piece of XHTML text to be output as part of a .build command. You can end the .block with an .endblock or another .block. Xhtmlpp knows about these block names:
header
Output at the start of each new XHTML page; i.e. whenever you use a .page command.
footer
Output at the end of each XHTML page.
toc_open
Output at the start of a .build toc block (see below), and whenever xhtmlpp decides to indent a new level.
toc_open_outer
Additional output at the start of a .build toc block (see below).
toc_open_inner
Additional output whenever xhtmlpp decides to indent a new level. (This is useful for XHTML which wants nested lists to be enclosed in their own list item)
toc_entry
Output for each entry in the table of contents. Use these symbols: $(TOC_HREF) - the local URL for the file and section; $(TOC_TITLE) - the title for the section, taken from the header line; $(TOC_LEVEL) - the table-of-contents level, 1 and higher.
toc_close
Output whenever xhtmlpp decides to outdent a level, and at the end of the table of contents.
toc_close_inner
Additional output whenever xhtmlpp decides to outdent a level.
toc_close_outer
Additional output at the end of the table of contents.
dir_open
Output at the start of a .build dir block (see below).
dir_entry
Output for each entry in a .build dir block. Use these symbols: $(DIR_HREF) - URL for the file; $(DIR_NAME) - the filename, left-justified; $(DIR_EXT) - the file extension, always put into lowercase; $(DIR_SIZE) - the file size, right-justified; $(DIR_DATE) - the file date; $(DIR_TIME) - the file time. You can also use $(DIR_SIZEK) and $(DIR_SIZEM) to get the file size in Kbytes and Mbytes, and $(DIR_HREFL) to get the URL for the file in lowercase.
dir_close
Output at the end of a .build dir block.
index_open
Output at the start of a .build index block (see below).
index_entry
Output for each entry in a .build index block. Use these symbols: $(INDEX_PAGE) - the filename; $(INDEX_TITLE) - the file title. For compatibility with earlier versions, xhtmlpp also accepts the name 'index'.
index_close
Output at the end of a .build index block.
anchor
Output whenever you use a .build anchor. Use this symbol: $(ANCHOR) - name of anchor.
Any other block is treated as a user-defined block and can be output at any point using a matching .build command.
.endblock
End the previous .block. You can end a .block with an .endblock or a further .block command. Any other command within a .block is interpreted when the block has been generated.
.block blockname local
You can follow the .block command by the keyword local - this defines a block that will be used one time only. The local keyword applies to header, footer, and anchor blocks. Local blocks are used to change the way a single page looks, without disturbing the headers and footers of the whole document. Typically, you would define a general document header at the start of the document, then a local header and footer for a specific .page. Note that you should define the local page footer after the .page command. If you define a local footer block before the first page, xhtmlpp handles this correctly.
.build toc
Build table of contents for document. Xhtmlpp scans the document and all include files once to collect titles (<hn>...</hn>) and once to create the XHTML pages. Titles (<hn>...</hn>) must be entirely on a single line, or xhtmlpp will not find them. You can manage the contents of the table of contents through the .ignore header command. You will normally use a .build toc at the start of a document.
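A minimal set of blocks for a table of contents could look like this. It is a sketch, not the definitions shipped with xhtmlpp: the list markup, the filename index.html, and the page title are assumptions made for the example.

```
.block toc_open
<ul>
.block toc_entry
<li><a href="$(TOC_HREF)">$(TOC_TITLE)</a></li>
.block toc_close
</ul>
.endblock
.page index.html = "Contents"
.ignore header
<h2>Table of Contents</h2>
.build toc
```

The .ignore header line keeps the "Table of Contents" heading itself out of the list. Since toc_open and toc_close are also output at each change of level, this simple version opens a nested <ul> directly inside the outer one; if you need each nested list wrapped in its own list item, the toc_open_inner and toc_close_inner blocks are the place to add that.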
.build dir directory [filespec...]
Build directory listing as specified. The .build dir command only works if you mirror the server directory on some local disk that xhtmlpp can access. This is a Good Idea in any case. Before you can use .build dir you must define LOCAL and SERVER. For example:
.define LOCAL   i:/site:
.define SERVER  http://www.imatix.com
The directory must be relative to either of these two. It should start with '/' but not end with '/'. You can specify zero or more filenames or wildcards (xhtmlpp accepts * and ?, according to UNIX rules). If you specify no filespecs, xhtmlpp assumes you mean '*'. The filespecs can include Perl regular expressions: place the filespec between double quotes, e.g. to match all files with 'doc' or 'txt' somewhere in the name: .build dir /pub "doc|txt". An example might help:
.define .txt   Text file
.define .htm   HTML document
.define .zip   ZIP archive
.block dir_open
<pre>
.block dir_entry
$(*DIR_HREF="$(DIR_NAME)") $(DIR_SIZE)  $($(DIR_EXT))
.block dir_close
</pre>
.endblock
Note the sneaky double-dereferencing of $(DIR_EXT), which translates the file extension into a comment like 'Text file'. This distribution includes a separate .include file, filetype.def, which contains such .defines (to which you can add what you want).
.build index
Build file index for document. This is basically a list of all pages in the document with their titles. If you use this, you may want to put an .ignore page before the .page that starts the index page. It may be useful to do a .build index inside the footer of a page -- this is quite okay.
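The index blocks and the .ignore page command fit together as in the following sketch. The list markup, the filename pages.html, and the page title are invented for the example.

```
.block index_open
<ul>
.block index_entry
<li><a href="$(INDEX_PAGE)">$(INDEX_TITLE)</a></li>
.block index_close
</ul>
.endblock
.ignore page
.page pages.html = "List of Pages"
.build index
```

The .ignore page line comes immediately before the .page that holds the index, so that page should not list itself.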
.build anchor anchor-name[=title]
Build an anchor definition. This is useful. Basically you do a '.build anchor somename' in a document, then do a $(*somename) or $(*somename="label") anywhere in any other document. Xhtmlpp saves anchor symbols in the file anchor.def; otherwise anchor symbols are treated much like normal .define'd symbols. One difference: anchor symbols and normal symbols do not share the same namespace; if you .define a symbol with the same name as the anchor symbol, the .define'd symbol takes precedence. If you undefine the symbol, the anchor symbol reappears by magic. This may or may not be useful, but it is the way it works. If you change the file structure of your document, run everything through xhtmlpp *TWICE*, so that all anchor references can get really solidly updated. You can delete the anchor.def file at any time; it is just kept to save some context between runs. When you use the form $(*somename) to refer to an anchor, xhtmlpp will insert the anchor title as the label.
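As a sketch of the whole cycle, the fragment below defines an anchor block, builds one anchor, and refers to it from another page. The anchor name requirements, the filenames, and the markup in the anchor block are assumptions made for the example; this reference does not say whether xhtmlpp supplies a default anchor block.

```
.block anchor
<a id="$(ANCHOR)" name="$(ANCHOR)"></a>
.endblock
.page install.html = "Installation"
.build anchor requirements=System Requirements
<h2>System Requirements</h2>
<p>...</p>
.page faq.html = "Questions"
<p>Check the $(*requirements) first.</p>
<p>You can also jump to $(*requirements="what you need").</p>
```

According to the description above, the first reference should use the anchor title "System Requirements" as its label, and the second should use the label given in quotes. On a first run the anchor may not be known yet when it is referenced, which is why running xhtmlpp twice is recommended.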
.build user_block_name
Output the user-defined block specified. This is any amount of text that you do not want to specifically put into a separate file for use with the .include command. You define the block using the .block command.
.echo [-] text
Echoes the text to the console. Strips-off any leading and trailing spaces, but you can enclose the text in single or double quotes if you want leading/trailing spaces. Unless you place a hyphen before the text, xhtmlpp adds a newline.
.for name in item...
.endfor
Repeats the text between .for and .endfor, where $(name) has the value of each item in the list. The item list is separated by spaces.
.for name in `command`
Repeats the text between .for and .endfor, where $(name) has the value of each line in the output generated by the command. The special variables $(1), $(2), and so on will hold each word in the line.
.for name in @filename
Repeats the text between .for and .endfor, where $(name) has the value of each line in the specified file. The special variables $(1), $(2), and so on will hold each word in the line.
.for name in %filename separator comment_flag [orderby[>]] [exact_match] [case_sensitive] [criterium ...]
Repeats the text between .for and .endfor, where $(name) has the value of each line which satisfies the given query in the specified file, which must be a flat-text database whose fields are separated by separator. Use Perl conventions for the separator character, and escape backslashes with a backslash (e.g. if the separator character is '|', you should use '\|' in Perl, and '\\|' as separator). The special variables $(1), $(2), and so on will hold each field in the line.
When reading the database, all lines that start with comment_flag will be skipped. The rows resulting from the query will be sorted in ascending order according to field number orderby, unless you append a > character to the field number, in which case they will be sorted in descending order.

Each query criterion (a criterium in the syntax above) consists of four pipe-delimited fields. You can use as many criteria as you want. The fields are the

  1. value to be matched
  2. index of the database field that this criterion applies to (starting at 1)
  3. operator for comparison
    Possible values: >,<,>=,<=,=,!= (not equal)
    The operator is compared the following way:

    value OPERATOR database_field_value

    That is, the value in (1) is the left-hand side of the operator, and the value of the database field selected in (2) is the right-hand side.
  4. data type of the field (this determines how the operator in (3) gets applied to the data)
    The data type can be: date, number, or string
    If the data type is a date, then the operator for comparison is done after the value to be matched. Note that dates must be in MM/DD/YY format.
    If the data type is a number, the comparison uses the numeric operators (>, <, ==, etc.).
    If the data type is a string, the comparison uses the string operators (gt, lt, eq, ne, etc.), with ONE EXCEPTION.
    If the data type is a STRING *AND* the operator is =, the search becomes more flexible:
    • All the words in the value to be matched are split apart and searched for as separate keywords in the text of the fields.
    • By default, the search on string = string is a pattern match search and is not case sensitive.
    • If you want this special string,= combination searching to be case sensitive and to match on whole words only, you MUST set case_sensitive and exact_match to on
      If exact_match is on then the combination of string,= in the query criteria array will match on WHOLE WORDS only.
      If case_sensitive is on then the combination of string,= in the query criteria array must have matching case values (upper/lower).
Examples:
  • "20|9|<=|number" "40|9|>=|number": return all rows in the database whose field number 9 value is between 20 and 40.
  • "hello world|1,2,3,4,5|=|string": return all rows in the database that contain the words "hello" and "world" (in that order if exact_match is set to on, otherwise in any order) in any of the fields from 1 to 5.
Suggestions:
  • You can use some of the free tools on the Internet to generate flat-text databases with an XHTML interface, so that other people can enter the data from which your pages are generated. See for example Selena Sol's Database Manager.
  • By combining the -set command-line option with this .for loop you can automate mass-production of similar pages from a single xhtmlpp document.
    Imagine you want to make a page for each of the employees in your company. The employees themselves can fill in their data through the web with some of the tools mentioned above. You make an xhtmlpp document with something like:
    .for row in %database "\\|" "COMMENT:" "on" "on" "$(person)|1|=|string"
    .page "$(5)" = "$(2) $(3)'s Homepage"
    <ul>
        <li>E-mail: <a href="mailto:$(4)">$(4)</a></li>
        <li>Office: $(5)</li>
    [...]
    .endfor
    
    Then calling 'xhtmlpp -set person=personcode template_file' for each person will produce that person's page.
.for name from start to end
Repeats the text between .for and .endfor, where $(name) has a numeric value from start to end inclusive. Xhtmlpp will count up or down as necessary.
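The simpler .for forms can be sketched as follows. The loop variable names, the list items, and the file authors.txt are invented for the example.

```
<ul>
.for colour in red green blue
<li>$(colour)</li>
.endfor
</ul>
.for n from 3 to 1
<p>$(n)...</p>
.endfor
.for line in @authors.txt
<p>First word: $(1), whole line: $(line)</p>
.endfor
```

The first loop should produce one list item per word, the second should count down from 3 to 1 because start is greater than end, and the third should run once for each line of the file, with $(1) holding the first word of that line.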
 

Xhtmlpp Macro Processing

Macros are a shorthand way to produce XHTML tags and other constructs. For example, this is the standard macro 'H2':

.macro H2 <h2>$*</h2>

This uses all-uppercase names for macros, but this is just a convention, since case is not important. We can use a macro like H2 in three ways:

.H2 some text

or

<!--.H2 some text-->

or

<.H2 some text>

The first form is good for titles and other constructs that come naturally on a line by themselves. Since it uses a syntax similar to xhtmlpp commands, there is a certain danger that a macro will conflict with some future command. This is just too bad; the alternative of inventing yet another syntax for macros was a worse choice. In any case, xhtmlpp will warn you if you try to define a macro that already exists as a command. The second form is compatible with HTML editors and some other HTML preprocessors, but is frankly a pain to type. The third form is good for mark-up tags. The second and third forms suffer from one problem: the whole thing has to come on a single line.

When you use a macro like this: <.H2 some text> you are supplying arguments. Here we supply two, 'some' and 'text'. You can refer to these as $1 and $2 inside the macro definition, or together as $*. Xhtmlpp can handle quotes correctly, so <.H2 "some text"> only supplies one argument, $1.

The $+ symbol expands to anything left over after $1, $2, etc. For instance, if you refer to $1 and $3 in the macro body, $+ refers to $4 and any remaining arguments.

The $# symbol expands to the number of macro arguments.

You can define a macro with a section that repeats for each argument. This is useful if you don't know in advance how many arguments you are going to have. For instance, the standard .THEAD macro generates a table heading for one, two, three, or more columns. You specify the repeating section as {...$n...}. The text between '{' and '}' is repeated for each argument; "$n" (dollar sign, small 'n') is replaced by the argument value.

To use multi-word arguments, enclose them in quotes.
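
As a sketch of a repeating section, quoting, and the -nosplit option together, consider the two macros below. ROW and NOTE are invented for this example and are not claimed to be part of macro.def.

```
.macro ROW <tr>{<td>$n</td>}</tr>
.macro -nosplit NOTE <p class="note">$1</p>
<table>
<.ROW Name "Office number" E-mail>
</table>
.NOTE this whole line is a single argument
```

Going by the rules above, the ROW call supplies three arguments, because the quotes group "Office number" into one, so the repeated section should yield three <td> cells inside one <tr>. The NOTE macro is defined with -nosplit, so everything after the macro name should arrive as $1.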

When a macro refers to a variable using $(xxx), this will be expanded as soon as the macro is expanded. Usually this is what you expect, but sometimes you need the variable to be expanded in the next pass, for instance if you generate the .define in the same pass. In this case, escape the variable: $\(xxx).

The file macro.def that comes with xhtmlpp defines a set of standard macros. You can define multiline macros that include other commands, like .if and .include.

Multilingual support

Support For Accented Characters

You can type accented characters directly, and xhtmlpp will do its best to convert these into XHTML metacharacters. For instance, if your document contains an e-circumflex, xhtmlpp will replace it by the metacharacter &ecirc;.

Supported character sets are ISO-8859-1 and MS-DOS (codepage 850). In general you can use ISO-8859-1 both for Unix Latin-1 and Windows 1250. You can define the character set through the -charset command-line option, or let xhtmlpp do a little testing of the wind to figure out whether it's running under a Unix or a DOS system (Windows testing is not supported). If you use xhtmlpp on a Mac, or on documents encoded in another character set, it won't work. Basically xhtmlpp handles MS-DOS accents if there is an environment variable 'COMPSPEC' defined, and ISO-8859-1 accents if there is a file called "/etc/passwd" on the system.

If you use any character which is not in ISO-8859-1 or MS-DOS CP850 you will find that it comes out as '?' (not found). If you have the HTML metachar for the character (which must be a Unicode numeric reference rather than an entity reference if it is outside ISO-8859-1) and the octal ASCII code for the character set you are using, please send it to me.
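
The conversion described above can be pictured with a small Python sketch. This is only an approximation for ISO-8859-1 input: the function name accents_to_entities is hypothetical, and the entity table comes from Python's standard html.entities module rather than from xhtmlpp.

```python
from html.entities import codepoint2name


def accents_to_entities(text: str) -> str:
    """Replace Latin-1 accented characters with named entities (sketch only)."""
    converted = []
    for char in text:
        code = ord(char)
        if code < 128:
            converted.append(char)  # plain ASCII is left alone
        elif code <= 255 and code in codepoint2name:
            converted.append(f"&{codepoint2name[code]};")  # named Latin-1 entity
        else:
            converted.append("?")  # character outside the supported set
    return "".join(converted)


print(accents_to_entities("f\u00eate"))
```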

Date Formatting

Days of the week and month names in date formatting functions can be written in several languages. Currently supported languages are "es" for Spanish, "dk" for Danish, and "fr" for French. If no language is specified in the formatting function, the value of the standard symbol $(LANG) is used, which is "en" by default if it has not been redefined.

Multilingual Variables

Xhtmlpp allows you to add two-character "extensions" to symbols to denote the language for the symbol value. This way you don't have to remember different symbol names for each alternative language you offer in your site. When you use the symbol you don't have to specify the language if it coincides with the current value of $(LANG). This feature only works when you activate the multilingual variable search option by defining $(USE_LANG) to 1.

Example:
.define USE_LANG 1
.define home.en  "http://www.myserver.com/english/index.html"
.define home.es  "http://www.myserver.com/spanish/index.html"
.define home.fr  "http://www.myserver.com/french/index.html"
Now if you use the symbol $(home), xhtmlpp will
  1. Look for variable $(home).
  2. Because $(home) does not exist, look for $(home.$(LANG)).
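
The two lookup steps can be sketched in Python as follows. This is an illustration under stated assumptions, not xhtmlpp's implementation: the symbol table is modelled as a plain dictionary, and the function name lookup_symbol is hypothetical.

```python
def lookup_symbol(name, symbols, default=None):
    """Resolve $(name) or $(name?default) with the multilingual fallback (sketch)."""
    # Step 1: look for the plain symbol.
    if name in symbols:
        return symbols[name]
    # Step 2: if USE_LANG is 1, look for name.xx where xx is the current language.
    if symbols.get("USE_LANG") == "1":
        localized = f"{name}.{symbols.get('LANG', 'en')}"
        if localized in symbols:
            return symbols[localized]
    # Fall back to the supplied default, or report an error.
    if default is not None:
        return default
    raise KeyError(f"undefined symbol: {name}")


symbols = {
    "USE_LANG": "1",
    "LANG": "es",
    "home.en": "http://www.myserver.com/english/index.html",
    "home.es": "http://www.myserver.com/spanish/index.html",
}
print(lookup_symbol("home", symbols))
```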

If you want to make a link to a variable which is in a different language from the current one in $(LANG), you can use the full symbol name and xhtmlpp will add the 'hreflang=xx' attribute for you:

.define LANG  es
$(*home=Home) -->
     <a href="http://www.myserver.com/spanish/index.html">Home</a>
$(*home.es=Home) -->
     <a href="http://www.myserver.com/spanish/index.html">Home</a>
$(*home.en=Home) -->
     <a href="http://www.myserver.com/english/index.html" hreflang=en>Home</a>
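
A minimal Python sketch of this link expansion is shown here. It is an approximation only: expand_link is a hypothetical name, the symbol table is a plain dictionary, and an undefined symbol is treated like an empty one (link left inactive), which is an assumption. The output mirrors the unquoted hreflang form shown in the example above.

```python
import re


def expand_link(spec, symbols):
    """Expand the inside of $(*name=label) into an <a> tag (sketch only)."""
    name, _, label = spec.partition("=")
    label = label or name
    lang = symbols.get("LANG", "en")
    use_lang = symbols.get("USE_LANG") == "1"

    target = symbols.get(name)
    if target is None and use_lang:
        target = symbols.get(f"{name}.{lang}")
    if not target:
        return label  # assumption: no value means the link is not active

    # Add hreflang when the name carries a language code other than $(LANG).
    suffix = re.search(r"\.([a-z]{2})$", name)
    hreflang = ""
    if use_lang and suffix and suffix.group(1) != lang:
        hreflang = f" hreflang={suffix.group(1)}"
    return f'<a href="{target}"{hreflang}>{label}</a>'


symbols = {
    "USE_LANG": "1",
    "LANG": "es",
    "home.en": "http://www.myserver.com/english/index.html",
    "home.es": "http://www.myserver.com/spanish/index.html",
}
print(expand_link("home.en=Home", symbols))
```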

Generating Several Language Versions from a Single Source File

You can use the following trick. Write your source file as:

.if ("$(LANG)" eq "en")

  [... version in english ...]

.elsif ("$(LANG)" eq "es")

  [... version in spanish ...]

.elsif ("$(LANG)" eq "fr")

  [... version in french ...]

.endif

Then to process each language invoke xhtmlpp as 'xhtmlpp -set LANG=xx filename'. You can also write a simple shell script to process the three languages at once by calling xhtmlpp three times.
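
Such a driver script might look like the following Python sketch. The helper names language_commands and build_all are hypothetical, and the sketch assumes xhtmlpp is on your PATH; it says nothing about where each run writes its output, so you may need to arrange separate output names yourself.

```python
import subprocess


def language_commands(filename, languages=("en", "es", "fr")):
    """Build one 'xhtmlpp -set LANG=xx filename' command per language."""
    return [["xhtmlpp", "-set", f"LANG={lang}", filename] for lang in languages]


def build_all(filename, languages=("en", "es", "fr")):
    """Run xhtmlpp once per language, stopping at the first failure."""
    for command in language_commands(filename, languages):
        subprocess.run(command, check=True)


for command in language_commands("index.xhp"):
    print(" ".join(command))
```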

 

Xhtmlpp Guru Mode

Recognising that a True Guru does not have time to painfully mark-up large XHTML documents, xhtmlpp includes a basic text-to-XHTML converter. You can invoke this as a preprocessing phase to the normal xhtmlpp process. This is an either-or choice; you either use xhtmlpp commands in an XHTML document, or a text document and guru mode, but not a mixture of the two modes.

You can, usefully, use xhtmlpp's guru mode to mark-up a document, then fine-tune it by hand. Or you can eschew the fine-tuning and use it to create a web-version of a document which has plain text source (for example, an FAQ document which is regularly posted to a newsgroup or mailing list). See http://www.katspace.com//works/formatfaq.html for an example.

To use guru mode, run xhtmlpp with the '-guru' option:

xhtmlpp -guru filename

Guru mode works by recognising layout, and converting this to XHTML. Basically, guru mode relies on layout rules that also help to make the text readable in any case. For example, blank lines and indentation are significant in most places. One consequence of this is that the plain text file is very readable even before it is XHTML'd (assuming you do your bit to help things.)

In guru mode, xhtmlpp reads an input text file (with any name and extension except '.xhp') and creates an output file with the same name and the extension '.xhp'. It then processes this file as it would any normal input file. The '.xhp' file remains afterwards, so you can use it as the basis for further refinement if wanted. (You should call it something else, to avoid embarrassing mistakes.)

Another argument in guru mode is the '-notable' argument, which disables table creation (see below).

Standard Guru Mode Definitions

The file 'guru.def' is always inserted at the start of the newly-created file. You can modify this file as wanted, to tune the results of guru mode. You cannot choose another name for this file other than by changing xhtmlpp's source code, which I don't recommend (unless you read the Camel book).

Xhtmlpp looks for a file called 'guru.fmt' which may exist and which may redefine the various XHTML tags it uses. A file 'guru_opt.fmt' is supplied in the xhtmlpp distribution; rename or copy this to 'guru.fmt' and change any values you want to (I'd suggest you remove anything that does not change, just to make things clear). I've made it work in this way so that if you reinstall xhtmlpp, you don't lose your work.

Chapter and Section Headers

Xhtmlpp handles four levels of headers, H1, H2, H3, and H4. In the text these look like this:

Chapter Header
**************

Section Header
==============

Subsection Header
-----------------

Subsubsection Header
^^^^^^^^^^^^^^^^^^^^

The line following the header text must start with 3 or more asterisks, equals, hyphens or carets. I recommend that you start the document with a chapter header.

You can also request a horizontal rule (<hr />) by putting four or more dots on a line by themselves:

....

or four or more dashes on a line by themselves:

----

or four or more underscores on a line by themselves:

____

(What can I say? You're spoiled for choice! The thing is, that people often use dashes for lines in plain text documents, so I added it.)

The header text line must come after a blank line, or at the start of the document.
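
To make the header and rule conventions concrete, here is a minimal Python sketch of a line classifier. It is not guru mode's actual logic: the function name classify is hypothetical, and the way it tells a dashed header underline from a dashed rule (by checking whether the previous line is blank) is an assumption.

```python
import re

HEADER_LEVELS = {"*": "h1", "=": "h2", "-": "h3", "^": "h4"}


def classify(previous_line: str, line: str) -> str:
    """Classify `line` given the line before it (sketch only)."""
    text = line.strip()
    if previous_line.strip():
        # Three or more identical marks under a text line make that line a header.
        if re.fullmatch(r"\*{3,}|={3,}|-{3,}|\^{3,}", text):
            return HEADER_LEVELS[text[0]]
    elif re.fullmatch(r"\.{4,}|-{4,}|_{4,}", text):
        # Four or more dots, dashes or underscores on their own request a rule.
        return "hr"
    return "text"


print(classify("Chapter Header", "**************"))
print(classify("", "----"))
```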

Table of Contents

If your document contains at least two chapters, xhtmlpp will insert a table of contents before the second chapter header. This works best if the first chapter is empty or contains a brief text to introduce the document. Xhtmlpp inserts the table of contents by adding a section header called 'Table of Contents', and then a line '.include contents.def', in the normal manner. You should not call the first chapter 'Table of Contents'.

Pagination

Xhtmlpp inserts a '.page' command before each chapter header. Therefore, use chapter headers wisely to break the document into usable pages.

Page Headers and Footers

The guru.def file normally includes 'prelude.def', which defines page headers and footers for the document. You will normally tune these for any project -- the supplied files contain references to things that may not be appropriate for your work. I like to use the same headers and footers (the same prelude.def) for all the files in a project, including those that use guru mode.

You can see the dramatic difference a prelude.def file can make when you compare the version of this documentation on the KatSpace website with the version which is included in the download archive xhtmlpp.tgz. Both use exactly the same source file, xhtmlpp.xhp, but have differing prelude.def files -- the one on the KatSpace site uses the prelude.def for that site, while the version in the archive uses the prelude.def included in the archive.

Paragraphs

A paragraph is anything following a blank line that does not look like something else. Basically, any plain text following a blank line is given a <p> tag. Note however the exceptions that follow...

If a line is shorter than a given length, then it will be treated as a short line (such as might be used with poetry) and a <br /> tag is put after it, to preserve the layout without having to make it preformatted text (see below). The default length of a short line is 40 characters. It can be changed with the '-l' option to xhtmlpp.

For example, to make the length of a short line be 20 characters, call xhtmlpp like so:

	xhtmlpp -guru -l 20 somefile.txt
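
The short-line rule can be pictured with this small Python sketch. The function name mark_short_lines is hypothetical, and "shorter than" is taken literally as a strict comparison, which is an assumption about the boundary case.

```python
def mark_short_lines(paragraph_lines, short_length=40):
    """Append <br /> to every line shorter than short_length (sketch only)."""
    return [
        line + "<br />" if len(line) < short_length else line
        for line in paragraph_lines
    ]


poem = ["Roses are red,", "Violets are blue,"]
print("\n".join(mark_short_lines(poem, short_length=20)))
```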

Text Style

There is a convention on mailing lists and newsgroups that emphasized text is surrounded by *asterisks*, and underlined text by _underscores_. Guru mode follows these conventions as best it can, and adds another one, carets for ^bold^ text.

Text surrounded by asterisks is wrapped in <em> tags (usually rendered as italics). Text surrounded by carets is wrapped in <strong> tags (usually rendered as bold). The opening and closing marks don't have to be on the same line; they can be separated by more than one line, so long as they're in the same paragraph. (If they aren't in the same paragraph, you will get illegal XHTML.)

Text surrounded by underscores, on the same line, that isn't part of a URL, is given <tt><b> tags, since the <u> tag is deprecated in XHTML.

Because there can be use of underscores which don't mean underlines (such as with variable names in a discussion of program code), I have added the '-nounderlines' option to guru mode, which will disable underline interpretation.
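
These text-style conventions can be approximated with a few regular expressions, as in the Python sketch below. The function name apply_text_styles and the exact patterns are my own assumptions, not guru mode's code; in particular the URL check is reduced to "no letter, digit or slash next to the underscore".

```python
import re


def apply_text_styles(paragraph: str, underlines: bool = True) -> str:
    """Convert *emphasis*, ^bold^ and _underline_ marks in one paragraph (sketch)."""
    # Asterisks and carets may span several lines of the same paragraph.
    paragraph = re.sub(r"\*(\S(?:.*?\S)?)\*", r"<em>\1</em>", paragraph, flags=re.DOTALL)
    paragraph = re.sub(r"\^(\S(?:.*?\S)?)\^", r"<strong>\1</strong>", paragraph, flags=re.DOTALL)
    if underlines:
        # Underscores must open and close on the same line and stand apart from words.
        paragraph = re.sub(
            r"(?<![\w/])_(\S(?:[^\n]*?\S)?)_(?![\w/])",
            r"<tt><b>\1</b></tt>",
            paragraph,
        )
    return paragraph


print(apply_text_styles("this is *very* important"))
```

Passing underlines=False plays the role of the '-nounderlines' option.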

Preformatted Text

If a line is indented by 4 or more spaces, or a tab, xhtmlpp treats the line as 'preformatted' text and inserts a <pre> tag. You can mix blank lines with preformatted text.

Bulleted and Numbered Lists

A paragraph starting with a hyphen and a space is considered to be a bulleted list item. A paragraph starting with a digit and a dot and optionally a space is considered to be a numbered list item. You can put blank lines between list items, but it's not necessary. Cosmetically, when list items are short, blank lines are disturbing. But when list items are several lines, blank lines make the text more readable. Either way, xhtmlpp is happy.

You can also nest bulleted lists by starting the nested list item with a space(s), then a hyphen, then a space. Only one level of nesting is allowed, however.

Definition Lists

A definition list is a line ending in ':' followed by some lines indented by one or more spaces. For example:

Definition:
 Explanation of definition.

You can put blank lines between definition items, but again, it's a matter of cosmetics. There should be a blank line before the first definition item, however.

Tables

Tables are one of the real pains of XHTML markup, in my opinion. Here xhtmlpp tries to solve the most common case; a two-column table consisting of a term or value in one column, and an explanation in the second column.

A table can start with a header, which is a line like this:

Some column:  Followed by some explanation:

Here, the colons (':') are important. Xhtmlpp also wants a capital letter at the start of both phrases, and a space after the first colon. The table header is optional; you can start immediately with table items. Either way, xhtmlpp needs a blank line before the table. A table item looks like this:

Some_word:    Followed by some explanation
               which can come on several lines.

The first column must be a single word - if you want several words, use underlines. Xhtmlpp replaces these by spaces. The explanation can come on several lines, which must be indented by one or more spaces.
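
As a sketch of how one table item could be picked apart, consider the Python function below. It is illustrative only: parse_table_item is a hypothetical name, and the function handles a single item that has already been separated from its neighbours.

```python
import re


def parse_table_item(lines):
    """Split one guru-mode table item into (term, explanation), or return None."""
    match = re.match(r"(\S+):\s+(.*)", lines[0])
    if not match:
        return None
    term = match.group(1).replace("_", " ")  # underlines stand for spaces
    parts = [match.group(2).strip()]
    # Continuation lines must be indented by one or more spaces.
    parts += [line.strip() for line in lines[1:] if line.startswith(" ")]
    return term, " ".join(parts)


item = [
    "Some_word:    Followed by some explanation",
    "               which can come on several lines.",
]
print(parse_table_item(item))
```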

Disabling Tables

Unfortunately, if one is processing text which just happens to contain colons (after all, they are a normal punctuation character) which aren't meant to be interpreted as tables, it can be even more of a pain. Reformatting the text or removing the colons sometimes just isn't the answer.

So, there is the '-notable' argument to the rescue! If you call xhtmlpp with '-guru -notable' then none of the table processing above will be done, and a colon will just be a colon.

Figures and Images

To insert a figure, use one of these conventions:

[Figure filename: caption]
[Figure "filename": caption]

Xhtmlpp inserts a figure caption, numbering the figures in a document from 1 upwards. The caption is followed by an <img> tag to display the file. You can use a URI (a path) as the filename, or a URL (with a host name specifier); you must put a URL in quotes. My preference is to put image files locally with the XHTML files, and use a simple filename without a path. This is just easier to manage and lets you put the XHTML files plus images in any directory. If xhtmlpp can find the image you specify, and it's a .GIF or .JPG file, it will insert the width= and height= attributes automatically.

To insert a plain image, omit the 'Figure' keyword. For example, these are all examples of valid images:

[Figure somefile.gif: caption]
[somefile.gif: caption]
[Figure somefile.gif]
[somefile.gif]

Hyperlinks

If you use <name@address>, this is converted into a mailto: URL hyperlink. If you use <http://address/document> -- or any other URL -- this is converted into a hyperlink as well. You can follow the URL by ':description' if you like, e.g. <http://www.katspace.com:Kat's Site>. You can also refer to local files using the syntax </localfile[:description]>. And if the filename doesn't start with '/', you can tell xhtmlpp that it's a URL by writing <URL:localfile>.

Xhtmlpp does not presently allow links within the document or to other documents.

Special Characters

Since you're not typing XHTML, xhtmlpp replaces <, > and & by XHTML metacharacters. < and > are used to indicate hyperlinks.

 

Xhtmlpp Intrinsic Functions

Xhtmlpp provides a number of intrinsic functions that you can use in your text. The syntax for using an intrinsic function is:

&function-name(arguments)

This function:                  Does this:
&date("picture", date)          Format specified date using picture
&date("picture", date, lc)      Format specified date using picture and language code
&date("picture")                Format current date using picture
&date()                         Return current date value
&time()                         Format current time as hh:mm:ss
&week_day([date])               Get day of week, 0=Sunday, 6=Saturday
&year_week([date])              Get week of year, 1 is first full week
&julian_date([date])            Get Julian date for date
&lillian_date([date])           Get Lillian date for date
&date_to_days(date)             Convert yyyymmdd to Lillian date
&days_to_date(days)             Convert Lillian date to yyyymmdd
&future_date(days,[date])       Calculate future date
&past_date(days,[date])         Calculate past date
&date_diff(date1,date2)         Calculate difference between dates
&image_height("image.ext")      Get image height (GIF, JPEG, PNG etc)
&image_width("image.ext")       Get image width (GIF, JPEG, PNG etc)
&image_size("image.ext")        Get image size (width & height as string) (GIF, JPEG, PNG etc)
&file_size("filename",arg)      Get size of file; optional arg K or M
&file_date("filename")          Get date of file
&file_time("filename")          Get time of file as hh:mm:ss
&normalise("filename")          Normalise filename to UNIX format
&system("command")              Get result of some system utility
&upper("string")                Convert string to uppercase text
&lower("string")                Convert string to lowercase text
&pageref("page","title")        Build link for page index
&relpath("to")                  Get relative path from current document->to
&relpath(["from"],"to")         Get relative path from->to
 

The &date Function

Syntax:

&date(picture, value)
&date(picture, value, language)
&date(picture)
&date()

Without a picture, returns the current date. With a picture, formats the current date according to a picture that you specify. You can optionally supply a date value in the standard 8-digit format; YYYYMMDD (as returned by &date()), or use 0 to indicate today's date. You can optionally follow the picture and value by a language code; the values currently accepted are "es" for Spanish, "fr" for French, and "dk" for Danish. Anything else is taken to mean English. If no language is specified, $(LANG) is used by default. The picture can consist of any mixture of these elements:

cc      century 2 digits, 01-99
y       day of year, 1-366
yy      year 2 digits, 00-99
yyyy    year 4 digits, 100-9999
m       month, 1-12
mm      month, 01-12
mmm     month, 3 letters
mmmm    month, full name
MMM     month, 3 letters, ucase
MMMM    month, full name, ucase
d       day, 1-31
dd      day, 01-31
ddd     day of week, Sun-Sat
dddd    day of week, Sunday-Saturday
DDD     day of week, SUN-SAT
DDDD    day of week, SUNDAY-SATURDAY
w       day of week, 1-7 (1=Sunday)
ww      week of year, 1-53
q       year quarter, 1-4
\x      literal character x
other   literal character

Examples:

.echo &date()             --> Nov 13, 99
.echo &date('mmm d, yy')  --> Dec 2, 98
.echo &date('d mmm, yy')  --> 2 Dec, 98
.echo &date("yymd")       --> 9812 2
.echo &date("yyyymmdd")   --> 19981202
.echo &date("d \de mmmm \de yyyy", 0, "es")  --> today's date in Spanish
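
To show how a picture string maps onto a date, here is a minimal Python sketch of a picture formatter. It is not xhtmlpp's implementation: format_date is a hypothetical name, only English names are handled (no language argument), 'cc' is assumed to mean the first two digits of the year, and 'ww' is assumed to follow the Sunday-based week numbering of Python's %U.

```python
import re
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]

# Longer elements are listed first so that 'mmmm' is not read as four 'm's.
TOKEN = re.compile(
    r"\\(.)|yyyy|yy|y|cc|mmmm|mmm|mm|m|MMMM|MMM|dddd|ddd|dd|d|DDDD|DDD|ww|w|q"
)


def format_date(picture: str, value: int) -> str:
    """Format an 8-digit YYYYMMDD value according to a picture (sketch only)."""
    day = date(value // 10000, value // 100 % 100, value % 100)
    weekday = (day.weekday() + 1) % 7  # 0 = Sunday
    month_name = MONTHS[day.month - 1]
    day_name = DAYS[weekday]
    fields = {
        "cc": f"{day.year // 100:02d}",
        "y": str(day.timetuple().tm_yday),
        "yy": f"{day.year % 100:02d}",
        "yyyy": str(day.year),
        "m": str(day.month), "mm": f"{day.month:02d}",
        "mmm": month_name[:3], "mmmm": month_name,
        "MMM": month_name[:3].upper(), "MMMM": month_name.upper(),
        "d": str(day.day), "dd": f"{day.day:02d}",
        "ddd": day_name[:3], "dddd": day_name,
        "DDD": day_name[:3].upper(), "DDDD": day_name.upper(),
        "w": str(weekday + 1),
        "ww": str(int(day.strftime("%U"))),
        "q": str((day.month - 1) // 3 + 1),
    }

    def replace(match: re.Match) -> str:
        if match.group(1) is not None:
            return match.group(1)  # \x is the literal character x
        return fields[match.group(0)]

    return TOKEN.sub(replace, picture)


print(format_date("d mmm, yy", 19981202))
```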
 

The &time Function

Syntax:

&time()

Formats the current time in the same way as the $(TIME) symbol. The difference is that $(TIME) is set when xhtmlpp starts working; &time() reflects the current time.

 

The &week_day Function

Syntax:

&week_day()
&week_day(date)

Returns the day of the week for the specified date, or for the current date if no argument is given. Day 0 is Sunday; day 6 is Saturday.

 

The &year_week Function

Syntax:

&year_week()
&year_week(date)

Returns the week of the year for the specified date, or for the current date if no argument is given. Week 1 is the first full week, starting with a Sunday.

 

The &julian_date Function

Syntax:

&julian_date()
&julian_date(date)

Returns the Julian date for the specified date, or for the current date if no argument is given. Day 1 is January 1.

 

The &lillian_date Function

Syntax:

&lillian_date()
&lillian_date(date)

Returns the Lillian date for the specified date, or for the current date if no argument is given. This is the number of days since a starting (but unspecified) epoch (which in fact is around 1582).

 

The &date_to_days Function

Syntax:

&date_to_days(date)

Returns the Lillian date for the specified date. This function is really the same as &lillian_date() except that you must supply a date argument. It's provided for orthogonality with &days_to_date().

 

The &days_to_date Function

Syntax:

&days_to_date(days)

Converts a Lillian date back into a normal date in the form yyyymmdd. You can use this function (in combination with the reverse function, &date_to_days()) to calculate past and future dates.
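
The round trip between the two functions can be sketched in Python like this. The function names mirror the intrinsics but the code is only an illustration; in particular the epoch used here (day 1 is 15 October 1582) is an assumption, since the text above leaves the exact starting point unspecified.

```python
from datetime import date

# Assumption: day 1 is 15 October 1582, so the day before it is day 0.
EPOCH_ORDINAL = date(1582, 10, 14).toordinal()


def date_to_days(value: int) -> int:
    """Convert an 8-digit YYYYMMDD value to a day count (sketch only)."""
    day = date(value // 10000, value // 100 % 100, value % 100)
    return day.toordinal() - EPOCH_ORDINAL


def days_to_date(days: int) -> int:
    """Convert a day count back to an 8-digit YYYYMMDD value."""
    day = date.fromordinal(days + EPOCH_ORDINAL)
    return day.year * 10000 + day.month * 100 + day.day


# Tomorrow's date, computed by going through the day count.
print(days_to_date(date_to_days(19981231) + 1))
```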

 

The &future_date Function

Syntax:

&future_date(days)
&future_date(days,date)

Calculates a date at some point in the future. For instance, &future_date(1) will produce tomorrow's date. If the date argument is not provided, calculates from today.

 

The &past_date Function

Syntax:

&past_date(days)
&past_date(days,date)

Calculates a date at some point in the past. For instance, &past_date(1) will produce yesterday's date. If the date argument is not provided, calculates from today.

 

The &date_diff Function

Syntax:

&date_diff(date)
&date_diff(date1,date2)

Calculates the difference between two dates, in days. The calculation is date1 - date2. If date2 is not supplied, calculates using today, and will therefore return a positive value if date is in the future, and a negative value if date is in the past.
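
The three calculations above can be pictured with Python's datetime module, as in this sketch. The function names mirror the intrinsics, but the code is only an illustration under the assumption that dates are plain 8-digit YYYYMMDD integers; the private helpers _parse and _format are hypothetical.

```python
from datetime import date, timedelta


def _parse(value: int) -> date:
    return date(value // 10000, value // 100 % 100, value % 100)


def _format(day: date) -> int:
    return day.year * 10000 + day.month * 100 + day.day


def future_date(days: int, start=None) -> int:
    """Date `days` days after `start`, or after today if no start is given."""
    base = _parse(start) if start is not None else date.today()
    return _format(base + timedelta(days=days))


def past_date(days: int, start=None) -> int:
    """Date `days` days before `start`, or before today if no start is given."""
    return future_date(-days, start)


def date_diff(date1: int, date2=None) -> int:
    """date1 - date2 in days; date2 defaults to today."""
    other = _parse(date2) if date2 is not None else date.today()
    return (_parse(date1) - other).days


print(future_date(1, 19981231), past_date(1, 19990101), date_diff(19981202, 19981130))
```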

 

The &image_width Function

Syntax:

&image_width(filename)

Returns the width of the specified image. The width is returned in pixels. The image_ intrinsics use the Image::Size Perl module, and hence will work for any image filetype that it supports. This includes GIF, JPG and PNG.

 

The &image_height Function

Syntax:

&image_height(filename)

Returns the height of the specified image, in pixels. The file can be any type of image file which is supported by the Image::Size Perl module (GIF, JPG, PNG etc).

 

The &image_size Function

Syntax:

&image_size(filename)

Returns the width and height of the specified image, in the form

width="width" height="height"
The file can be any type of image file which is supported by the Image::Size Perl module (GIF, JPG, PNG etc).

 

The &file_size Function

Syntax:

&file_size(filename)
&file_size(filename, K)
&file_size(filename, M)

Returns the size of the specified file. If the second argument is K or M, calculates the size in Kb or Mb as appropriate. Always returns an integer value.
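
A rough Python equivalent is sketched below. The function name mirrors the intrinsic, but two details are assumptions: a unit of 1024 bytes per Kb, and rounding down to get the integer result.

```python
import os


def file_size(filename: str, unit: str = "") -> int:
    """Size of a file in bytes, Kb ('K') or Mb ('M'), as an integer (sketch)."""
    divisor = {"": 1, "K": 1024, "M": 1024 * 1024}[unit]
    return os.path.getsize(filename) // divisor  # assumption: round down
```

For a file of 2048 bytes this sketch gives 2048, 2 and 0 for no unit, K and M respectively.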

 

The &file_date Function

Syntax:

&file_date(filename)

Returns the date of the specified file, as an 8-digit value, YYYYMMDD.

 

The &file_time Function

Syntax:

&file_time(filename)

Returns the time of the specified file, as a string, HH:MM:SS.

 

The &normalise Function

Syntax:

&normalise(filepath)

Returns the filepath in a UNIX-style format. You can use this, for instance, under MS-DOS, when filenames taken from (e.g.) the environment contain back slashes which can cause problems. Replaces \ by / and spaces by underlines.
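
In Python terms the transformation amounts to two string replacements, as in this sketch (the sample path is made up):

```python
def normalise(filepath: str) -> str:
    """Turn back slashes into forward slashes and spaces into underlines."""
    return filepath.replace("\\", "/").replace(" ", "_")


print(normalise("C:\\My Docs\\index.htm"))
```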

 

The &system Function

Syntax:

&system(string)

Returns the result of some system utility. For instance:

.define SERVER  http://&system("hostname")
 

The &upper Function

Syntax:

&upper(string)

Returns the string in uppercase letters.

 

The &lower Function

Syntax:

&lower(string)

Returns the string in lowercase letters.

 

The &pageref Function

Syntax:

&pageref("page","title")

Does the same as this xhtmlpp code:

.if "page" eq "$(PAGE)"
title
.else
<a href="page">title</a>
.endif

This function strips-off any XHTML tags that you put around the title text when it uses it to build a link. So, you can do this kind of thing, which can, for example, be used to build an index in the page footer:

.block index_entry
&pageref("$(INDEX_PAGE)","<em>$(INDEX_TITLE)</em>")
.endblock

Beware of this potential problem, however: if you use this definition of index_entry, you can't use page-titles which include parentheses, because xhtmlpp will interpret the closing parenthesis in your title as the closing parenthesis of the pageref intrinsic.
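
The behaviour described in this section can be sketched in Python as follows. This is only my reading of the text: the function takes the current page as an explicit third parameter (current_page, a hypothetical name), returns the bare title when the entry refers to the current page, and otherwise builds a link from the title with its tags stripped.

```python
import re


def pageref(page: str, title: str, current_page: str) -> str:
    """Build an index entry: plain title on the current page, a link elsewhere."""
    if page == current_page:
        return title
    plain_title = re.sub(r"<[^>]+>", "", title)  # strip tags before linking
    return f'<a href="{page}">{plain_title}</a>'


print(pageref("index.html", "<em>Home</em>", "about.html"))
```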

 

The &relpath Function

Syntax:

&relpath(["from"], "to")

Returns the relative path from 'from' to 'to'. If only one argument is specified, the current XHTML page is used as 'from'. For example:

&relpath("john/peter/david/me.html","john/henry/you.html")
would return "../../henry/you.html". You can use this to create a web site with pure relative references to other pages. Remember that you don't need to use this function with $(*name) links if $(USE_RELPATH) is set to 1. Note that if you use relative references you can test and use the XHTML pages on a local hard disk as well as on the server without changes.
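
The same calculation can be reproduced with Python's standard posixpath module, as in this sketch. The wrapper name relpath and its argument order simply mirror the intrinsic; the relative path is computed from the directory that contains the 'from' page.

```python
import posixpath


def relpath(from_page: str, to: str) -> str:
    """Relative path from the directory of from_page to the page `to`."""
    start = posixpath.dirname(from_page) or "."
    return posixpath.relpath(to, start=start)


print(relpath("john/peter/david/me.html", "john/henry/you.html"))
```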

Multipass Processing

Xhtmlpp uses a multipass technique to allow embedded blocks. For example, you can place .include actions in the header or footer blocks, or define your own blocks that have .define, .page, and other actions.

Xhtmlpp handles this using the following rules:

  1. .include actions are executed immediately.
  2. .block actions are executed immediately.
  3. .build actions are executed as soon as possible after the first pass. This allows xhtmlpp time to collect the document titles, which it needs to build the table of contents.
  4. .if ... .else ... .endif actions are handled at once.
  5. .page commands are handled in two stages; headers and footers are built in the second pass, and individual page files are built during the last pass.
  6. Xhtmlpp will process the document (actually a temporary copy) as many times as necessary, until all actions have been processed.

One consequence of this is that xhtmlpp needs a minimum of 3 passes to fully process a document: one to collect all the titles, one to insert page headers and footers, and a last one to break the text into individual pages. If any genius can reduce this to two passes (or one!), go ahead.

The upside is that you can do really funky stuff in headers and footers: for instance, the xhtmlpp pages build a document index in the footer, switching hyperlinks on and off to indicate the current page in the index.

Multipass Debugging

To see what xhtmlpp is doing with its passes, use the -debug option, like this:

xhtmlpp -debug filename
This leaves a number of .wrk files lying around; these contain the result of each pass.

Other Things to Know

