Next: Feedback Up: Workflow of a Conversion Previous: Specifying the Goal

Intercepting the Converter

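Doing the conversion is no problem if the conversion tools do what they should, but this is rarely the case. Often the tools produce incorrect output, which can be fixed in one of two ways: early in the process, by fixing the converter itself, or afterwards, by post-processing its output.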

Early interception in the conversion process means recognizing the errors a converter produces and fixing the converter so that it no longer produces them. This is usually an iterative process: run the converter, fix it, run it again, and so on until all bugs have been identified and fixed. A number of fixes usually still have to be made by hand afterwards, either because they cannot be made in the converter or the source at all, or because making them there would take more time and knowledge than is available. Fixing the converter requires its source code, which is usually no problem with freely available products. With commercial converters it is a problem: bug reports can be sent to the vendor, but the wait for a fix is usually far too long. Nevertheless, fixing the converter means making a change in one place only, rather than in many places all over the document (possibly missing some by mistake). Another advantage is that the fix can be contributed back to everyone who has the same problem, which improves and supports freely available software.

If the source for a converter is not available, or one lacks the knowledge to modify it, the output of the conversion step has to be post-processed, either by hand or by a script (or set of scripts) that fixes the mistakes the converter introduced. Often these fixes can be applied with a (multi-file) search-and-replace operation, but sometimes the fix is not so easy. A scripting language for processing the HTML files is then needed, usually Perl, as it is available on all major conversion platforms. With such a language it is possible, for example, to find the first heading (number and/or name) and write it into the title of an HTML document, or to find out which page or chapter number a given symbolic reference in a LaTeX or FrameMaker document corresponds to. See [6] and [5] for some examples.
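
As an illustration of the heading-to-title fix mentioned above, here is a minimal sketch. It is written in Python rather than Perl purely for illustration, and it is not taken from [5] or [6]. The function names (first_heading, fix_title, fix_files), the regular expressions, and the default file pattern and encoding are all assumptions made for this example. Real converter output may need a proper HTML parser instead of regular expressions.

```python
import re
from pathlib import Path

# Hypothetical patterns for this sketch; they assume reasonably regular
# converter output (a plain <title> element, headings <h1>..<h6>).
HEADING_RE = re.compile(r"<h[1-6][^>]*>(.*?)</h[1-6]>", re.IGNORECASE | re.DOTALL)
TITLE_RE = re.compile(r"<title>.*?</title>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def first_heading(html):
    """Return the text of the first <h1>..<h6> element, or None if there is none."""
    match = HEADING_RE.search(html)
    if match is None:
        return None
    # Drop nested markup (e.g. anchors inside the heading) and collapse whitespace.
    text = TAG_RE.sub("", match.group(1))
    return " ".join(text.split())


def fix_title(html):
    """Replace the contents of the first <title> element with the first heading."""
    heading = first_heading(html)
    if heading is None:
        return html  # no heading found: leave the document unchanged
    # A function is used as the replacement so that backslashes in the
    # heading text are not interpreted as regex escapes.
    return TITLE_RE.sub(lambda m: "<title>" + heading + "</title>", html, count=1)


def fix_files(directory, pattern="*.html", encoding="utf-8"):
    """Apply fix_title to every matching file; return the names of changed files."""
    changed = []
    for path in sorted(Path(directory).glob(pattern)):
        original = path.read_text(encoding=encoding)
        fixed = fix_title(original)
        if fixed != original:
            path.write_text(fixed, encoding=encoding)
            changed.append(path.name)
    return changed
```

Calling fix_files on the converter's output directory would apply the same fix to every file in one pass, which is the point of scripting the repair instead of doing it by hand. The encoding parameter is there because older converter output is often not UTF-8.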

If it is not possible to employ such a language to fix a problem in all files, the output has to be fixed by hand. This is no problem if a mistake occurs in only a few places, but if it occurs all over the document, fixing by hand is an endless waste of time that should be avoided if possible.


Hubert Feyrer
1998-03-18