The Metamark Markup Languages and Translators

The Metamark system is a flexible way to permit one source document to represent many versions. A book source can be translated to LaTeX chapters or interlinked HTML sections. A C program source can be translated to versions that compile with various C compilers as well as an annotated HTML version that spans multiple web pages.

History

When I wrote "Tcl/Tk for Programmers", I wanted a markup language that could efficiently produce interlinking HTML sections as well as indexed LaTeX chapters. I also wanted my markups to have syntax error checking similar to what Knuth built into TeX.

Since I figured there would be rather long stretches of text without markup commands, my language had only four symbols that caused the translator to hiccup and take closer notice: "+", "-", "{", and "}". I chose the syntax to be less "in my face" than SGML-based syntaxes are. For example, +/l{hi} for <i>hi</i>.

I implemented that markup language in Perl and wrote "Tcl/Tk for Programmers" in it. It produced a LaTeX version that was edited by a LaTeX expert to produce the publisher's page masters. It also produced an HTML version, 25% of which can be found at the book's home page ("Tcl/Tk for Programmers"). I called this markup language the "HL" markup language.

After that book was finished, I began to think about adaptations to the HL markup language that would have made life easier for the layout editor as well as for future projects requiring different markup languages. These thoughts led to my Metamark system, which I implemented in Tcl/Tk and describe here. The Metamark system separates the markup language definition from the language parser and from any language translator. In short, it enables the creation of different markup languages with the same syntax and with multiple targets.

Then, I used Metamark to create a Tcl-extension markup language, the purpose of which was to speed up the creation of extensions to Tcl. The products of the Tcl-extension markup language are of two kinds: C source code and HTML documentation. The C source code can be produced in different versions for use with C/C++ translators on different platforms. The Tcl-extension markup language put me into a position where I could quickly create Tcl extensions that compile in different environments and are self-documenting.

The Metamark System

The Metamark system retains the use of four special symbols from its HL predecessor: "+", "-", "{", and "}". Commands appear in one of the following two forms:
+/command name

-/command name

Command names consist of alphameric characters and are not expected to be long. Optionally, commands may be followed by arguments as in
+/command name{argument}

-/command name{argument}
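The two command forms above can be recognized with a small scanner. The following is only a sketch in Python, not the Tcl implementation described here: the regular expressions and the name scan are assumptions, and nested braces inside arguments are deliberately not handled.

```python
import re

# Assumed shape: "+/" or "-/" followed by an alphanumeric name and zero or
# more brace-delimited arguments. Nested braces are not handled here.
COMMAND = re.compile(r"([+-])/([A-Za-z0-9]+)((?:\{[^{}]*\})*)")
ARGUMENT = re.compile(r"\{([^{}]*)\}")


def scan(text):
    """Split text into ("text", str) and ("cmd", sign, name, args) tokens."""
    tokens = []
    pos = 0
    for match in COMMAND.finditer(text):
        if match.start() > pos:
            # Plain text between commands is passed through as one token.
            tokens.append(("text", text[pos:match.start()]))
        sign, name, raw_args = match.groups()
        tokens.append(("cmd", sign, name, ARGUMENT.findall(raw_args)))
        pos = match.end()
    if pos < len(text):
        tokens.append(("text", text[pos:]))
    return tokens
```

Because only the four special symbols matter, long stretches of ordinary text fall through as single tokens and the scanner looks closely only where a command begins.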

Seven Metamark commands are built in. Particular markup languages in the Metamark system do not differ in overall syntax rules or these seven built-in commands. They get their distinct features from other markup commands and from the translations that are assigned to these other commands. Command names, types, and number of parameters are described in a syntax file and command translations are described in a translator file. Both the syntax file and the translator file must be written in Tcl.

Each document is written for a particular markup language. That markup language is identified by putting

+/lang{name of syntax file}
at the beginning of the document. The Metamark system will run with a syntax file and no translator file. In that mode, it merely checks the document for syntax. When starting new projects I have begun writing documents before I have completed a translator file.
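The syntax-only mode can be pictured as follows. This is a rough Python sketch under stated assumptions: real syntax files are Tcl, whereas here a syntax file is stood in for by a hypothetical table (SYNTAX_TABLES, with the invented language name "tclext") that maps each command name to its expected number of arguments. Argument types are ignored.

```python
import re

COMMAND = re.compile(r"([+-])/([A-Za-z0-9]+)((?:\{[^{}]*\})*)")

# Hypothetical stand-in for syntax files, keyed by the name given to +/lang.
# Each entry maps a command name to the number of arguments it expects.
SYNTAX_TABLES = {
    "tclext": {"lang": 1, "proc": 2, "end": 0},
}


def check_syntax(document):
    """Return a list of error strings; an empty list means the check passed."""
    first = re.match(r"\s*\+/lang\{([^{}]*)\}", document)
    if first is None:
        return ["document must begin with +/lang{name of syntax file}"]
    table = SYNTAX_TABLES.get(first.group(1))
    if table is None:
        return [f"unknown syntax file: {first.group(1)}"]
    errors = []
    for match in COMMAND.finditer(document):
        name = match.group(2)
        argc = match.group(3).count("{")  # one "{" per argument
        if name not in table:
            errors.append(f"unknown command: {name}")
        elif argc != table[name]:
            errors.append(f"{name}: expected {table[name]} argument(s), got {argc}")
    return errors
```

No translator is involved at any point, which is why a document can be started and checked before its translator file exists.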

The kind of translation performed on a document is determined by a translator file, and a given Metamark language may have more than one. Passing a translator file name along with a document name to the Metamark system selects the translation. For example, a markup language for C programs can have translator files that produce platform-specific code.

Let's consider such a markup language, the one I created for making Tcl extensions. That language has a command

+/proc
for marking the names of those C functions that will be executed from Tcl. The command might be used this way
+/proc{advanceChar}{}
code for the advanceChar function goes here
+/end
There is no prototype or other form of declaration needed for advanceChar. The effect of what you see above is to generate boilerplate code (in different places of the target document) which fills out the necessary C declarations and also invokes the functions from Tcl's API for linking the advanceChar function to Tcl's run-time system.
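To make "boilerplate in different places" concrete, here is a hedged Python sketch of a generator that writes to three places in the target file from one list of procs. The output actually produced by the Tcl-extension translators is not shown in this text. The function name emit_extension, the package name, and the choice of Tcl_CreateObjCommand (a real Tcl C API call) are assumptions made for illustration.

```python
def emit_extension(package, procs):
    """procs is a list of (name, body) pairs collected from +/proc ... +/end."""
    signature = "(ClientData cd, Tcl_Interp *interp, int objc, Tcl_Obj *const objv[])"
    out = ["#include <tcl.h>", ""]
    # Place 1: forward declarations, so the author never writes a prototype.
    for name, _ in procs:
        out.append(f"static int {name}{signature};")
    out.append("")
    # Place 2: the function definitions, wrapping the author's code.
    for name, body in procs:
        out.append(f"static int {name}{signature}")
        out.append("{")
        out.append(body)
        out.append("}")
        out.append("")
    # Place 3: registration with the interpreter in the package init function.
    out.append(f"int {package}_Init(Tcl_Interp *interp)")
    out.append("{")
    for name, _ in procs:
        out.append(f'    Tcl_CreateObjCommand(interp, "{name}", {name}, NULL, NULL);')
    out.append("    return TCL_OK;")
    out.append("}")
    return "\n".join(out)
```

The point of the sketch is that a single +/proc mention in the source fans out into a declaration, a definition header, and a registration call.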

Moreover, the boilerplate generated by this system is context dependent. It will be done one way if +/proc appears between the +/class and -/class commands and another way if +/proc appears between +/object and -/object. (Here "class" and "object" have meanings very similar to those you are used to with object-oriented languages. To understand the exact meanings in the context of Tcl, think of the difference in Tk between a button command and a .button1 configure command.)
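One simple way to obtain this kind of context dependence is to keep a stack of the enclosing commands that are currently open. The Python sketch below assumes that approach and does not describe how Metamark actually tracks context. The function name proc_contexts and the (sign, name) input format are illustrative.

```python
def proc_contexts(commands):
    """commands is a sequence of (sign, name) pairs in document order.

    Returns the enclosing context ("class", "object", or None) seen by each
    +/proc command.
    """
    stack = []
    seen = []
    for sign, name in commands:
        if name in ("class", "object"):
            if sign == "+":
                stack.append(name)          # +/class or +/object opens a context
            elif stack and stack[-1] == name:
                stack.pop()                 # -/class or -/object closes it
            else:
                raise ValueError(f"-/{name} without matching +/{name}")
        elif sign == "+" and name == "proc":
            seen.append(stack[-1] if stack else None)
    return seen
```

A translator could then choose its boilerplate for each +/proc according to the context reported for it.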

I wrote translator files for three C compilers. This is much cleaner than the usual way of generating platform-dependent code with macros because the source code is not burdened with the various alternatives. Duplicate code within these three translator backends is avoided because the Metamark system permits one backend to inherit from another.
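Backend inheritance can be pictured with ordinary class inheritance. The sketch below is a Python analogy only, since the real translator files are Tcl. The class names, method names, and emitted strings are hypothetical and are not what the three translators produce.

```python
class BaseCBackend:
    """Hypothetical shared backend: rules common to every C target."""

    def export_prefix(self):
        # Most targets need no special decoration on exported functions.
        return ""

    def translate_proc(self, name):
        return f"{self.export_prefix()}int {name}(void);"


class MsvcBackend(BaseCBackend):
    """Overrides only the one rule that differs for this compiler."""

    def export_prefix(self):
        return "__declspec(dllexport) "
```

For example, BaseCBackend().translate_proc("advanceChar") and MsvcBackend().translate_proc("advanceChar") share every rule except the prefix, so the platform difference is stated once and the source document never sees it.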

A final feature of Metamark is that it enables you to define a filter function. A filter is applied to all parts of a document that are not part of a markup command. Of course, you do not need to use this feature; a default filter function leaves all document text unfiltered. For the Tcl-extension example, the translators that produce C code use the default filter, but the translators that produce HTML alter the text so that characters that might otherwise begin HTML commands are not perceived by the browser to do so.
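A filter of this kind might look like the following Python sketch. It assumes the same simplified command pattern used for illustration elsewhere (no nested braces) and uses html.escape from the Python standard library. The function names are hypothetical.

```python
import html
import re

COMMAND = re.compile(r"([+-])/([A-Za-z0-9]+)((?:\{[^{}]*\})*)")


def identity_filter(text):
    """Default filter: leave document text unchanged."""
    return text


def html_filter(text):
    """Escape &, <, and > so a browser does not read them as markup."""
    return html.escape(text, quote=False)


def apply_filter(document, text_filter=identity_filter):
    """Apply text_filter to every stretch of text outside a markup command."""
    pieces = []
    pos = 0
    for match in COMMAND.finditer(document):
        pieces.append(text_filter(document[pos:match.start()]))
        pieces.append(match.group(0))  # commands pass through untouched
        pos = match.end()
    pieces.append(text_filter(document[pos:]))
    return "".join(pieces)
```

With the default filter, C code such as "if (a < b)" reaches the compiler unchanged. With html_filter, the same text becomes "if (a &lt; b)" in the HTML listing.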

An Example of the Metamark System

To show how a Metamark language appears to the user, here's the original marked-up C code that served as the source document for a Tcl extension. From it, the Metamark system produced this annotated HTML listing as well as separate unannotated versions that compile with gcc, Borland, and Microsoft C/C++ compilers.

Advantages of the Metamark System

  1. When I use the same markup commands from project to project, I can concentrate more on my writing and less on the markups.
  2. When moving on to a new system (e.g. a new target for printed text or a new target compiler), I don't have to learn that system well enough to think in it. It is only necessary to learn it well enough to write the backend for a Metamark translator. The difference in time it takes to reach a useful level of skill can be quite large.
  3. Any tricky logic I want in controlling the new system can be done in Tcl rather than in a system that is new to me. Tcl is familiar and powerful. It is not a bad choice for defining Metamark syntax and writing Metamark backends. Even so, if I ever rewrite Metamark I'll probably use Python. You, of course, could use any suitable language that you are comfortable with. If you do port or redesign Metamark, I would like to hear about it.
  4. Using Metamark, my approach to avoiding duplicated effort remains the same when I work in different programming language and text processing systems. The time I've invested in this system is not lost as I move on.
  5. The same source document can produce a single document translation or a multiple document translation.
  6. Things that ought to be automatically generated can be automatically generated. When I am working in a new system, this is often not done simply because I am not yet efficient enough in writing for that system to do the things necessary to avoid duplicating my efforts. With Metamark I am using an old and familiar system to avoid duplicated effort.
  7. Often I don't have to know exactly what I'm doing when I start a project. I feel better about this when I can easily diddle my markup language and its translators as I understand my needs better.
Related to Tcl/Tk for Programmers
June 8, 2000