AscToHTM Documentation for the AscToHTM conversion utility
This documentation can be downloaded as part of the documentation set in .zip format (200k).



1 Introduction

AscToHTM is an ASCII to HTML conversion tool. It has, of course, been used to generate the HTML version of this document from the text file a2hdoco.txt (see 6.1 for more details). The HTML version of this document is presented "as is". That is, no post-production of the HTML has occurred. This should give you a flavour of what AscToHTM is capable of.

Any RTF version of this document will have been made by AscToRTF, the sister product that shares the same text analysis engine.

AscToHTM is made available for download via the Internet from the download page.


1.1 AscToHTM's design objectives

1.1.1 Intelligent analysis

AscToHTM is designed to analyse a document to determine its structure and layout. This analysis allows AscToHTM to decide how best to mark up the HTML so as to accurately represent the author's original meaning.

It also helps AscToHTM to reduce errors by allowing it to spot anomalies in the document source. This is important in minimising the amount of any post-production work required.


1.1.2 Human-readable HTML

AscToHTM tries to create HTML that can be easily read and modified in an editor. This is useful if corrections are necessary, or further development is required.

For example, AscToHTM:

  1. produces short (usually <80 character) output lines.

  2. attempts to indent the HTML to match the output indentation.

  3. adds comments to the HTML to indicate include files etc.

  4. uses <BLOCKQUOTE> tags for indentation, rather than placing the whole file in <TABLE>...</TABLE> tags.

Note that later moves towards more standards-compliant and browser-compatible HTML tend to work against human-readable code. For example, most browsers have rendering problems when newline characters are placed in certain key locations, whereas adding newline characters can make the HTML easier to read.
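The short-line and <BLOCKQUOTE> points above can be illustrated with a minimal Python sketch. This is not AscToHTM's own code: the function name blockquote_wrap, the two-space indent step and the 79-character limit are all assumptions made for the illustration.

```python
import html
import textwrap


def blockquote_wrap(text, indent_level, width=79):
    """Return `text` as short HTML lines nested in <BLOCKQUOTE> tags.

    Hypothetical helper: one <BLOCKQUOTE> per indentation level, with
    the HTML source indented to mirror the nesting.
    """
    lines = []
    # Open one <BLOCKQUOTE> per level, each on its own indented line.
    for depth in range(indent_level):
        lines.append("  " * depth + "<BLOCKQUOTE>")
    pad = "  " * indent_level
    # Escape markup characters, then wrap. Long words are left whole,
    # so lines are usually (not always) shorter than `width`.
    lines.extend(
        textwrap.wrap(
            html.escape(text),
            width=width,
            initial_indent=pad,
            subsequent_indent=pad,
            break_long_words=False,
        )
    )
    # Close the tags in reverse order.
    for depth in reversed(range(indent_level)):
        lines.append("  " * depth + "</BLOCKQUOTE>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(blockquote_wrap("An indented paragraph of source text. " * 5, 2))
```

Nesting depth here stands in for the indentation level found in the source text, which is why the result stays easy to follow in an editor.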


1.1.3 Simple user input

Inevitably, users have to supply additional information to tell AscToHTM where its analysis has gone wrong, and to add details such as a document title. AscToHTM offers a large number of options (also known as "policies") that the user can modify.

Broadly speaking, these policies fall into two camps

AscToHTM can save your policies to a file, so that next time you run it you can load this information back from the "policy" file.

Policies are described fully in the Policy manual. Previously they were described in section 6 of this document.

You can further refine the conversion by placing special lines and tags into your source file. These are known as pre-processor commands (see Using the preprocessor) and in-line tags (see Using in-line tags).

To help users formulate and modify their document's policy, AscToHTM can be made to create an output policy file (see 4.2.2.8). Users can then simply edit this file and feed it back into the conversion process.

A summary of the recognised policy lines is given in the Policy manual.


1.1.4 Standards compliance

New in version 3.2

Prior to version 3.2, AscToHTM made no real attempt to be standards compliant. It didn't produce bad HTML; it simply didn't produce strictly correct HTML either.

From 3.2 onwards, standards compliance is a stated goal. We can't guarantee standards compliance because the HTML generation is so complex that errors can and do occur, but it is a goal.

Compliance has proved to be vital to get cross-browser compatibility, and to stand a chance of successfully applying CSS to created pages.

I wouldn't claim to be a born-again standards bearer, but I certainly "get the point" now, and hang my head in shame for my past sins.

Original versions of AscToHTM were (loosely) targeted at producing HTML 3.2 code.

Version 3.2 is targeted at "HTML 4.0 Transitional", which allows CSS, but also permits <FONT> tags (although these are deprecated). This is a compromise standard that is best placed to be well viewed by V3 and V4 browsers.

Future versions of the program will attempt to generate stricter HTML 4.0 code, while still offering production of the earlier HTML standards.


1.2 Expected uses of AscToHTM

Large amounts of unconverted text exist. As people plan to put this information on the Web, conversion to HTML will become necessary.

This can be a tedious and time-consuming task. AscToHTM will do much of the work for you.

AscToHTM is priced to be worth an hour or two of your time. Check the registration page for details.

This means that the "pay back" time is negligible (we only mention this in case you have bean-counters to convince :). If you don't think AscToHTM will save you hours, then by all means don't buy it.

The HTML created by AscToHTM may not be as pretty or as clever as that generated by a full blown HTML editor (read as "bloated").

But...

It'll be easier to write, edit and spell-check, and it may have a hyperlinked contents list generated.

AscToHTM can be used to automatically convert text documents that you receive. For this we usually suggest you run in command line mode.


1.3 Other uses of AscToHTM

Please note, AscToHTM DOES NOT convert Word's .doc or .rtf file formats.

AscToHTM was never intended to handle Word documents. We fully expect HTML export and import filters to appear (they have in Word '97), and we would advise anyone whose master document is in Word to search out these filters and give them a try.

That said... a lot of people seem unhappy with what's already available, and AscToHTM does a reasonable job if you save the file as text with line breaks, though obviously tables and figures will get lost (in the case of tables, because Word throws them away).

The main problem is that Word produces lousy looking text. This is one area where AscToHTM does a little better than "garbage in, garbage out".

(This is a bit cheeky, but does actually work.)

Use AscToHTM to convert text to HTML, then import this into your word processing package. Since the text analysis engine in AscToHTM out-performs that in Word in many respects (URL, table and heading detection to name but three), you can often get better results than by importing the text directly.

That's because AscToHTM's analysis engine is smarter.

NOTE:
In the near future the author plans to use the analysis engine to generate a text-to-RTF convertor more suited to this purpose. This uses the same analysis, but slightly different formatting. See the AscToRTF home page for details.

Use AscToHTM to convert text to HTML, then print the file from within Netscape or whatever. The result is a much nicer looking document with fonts'n'stuff.

AscToHTM has a "link dictionary" feature that can be used to add hyperlinks to any word or phrase (see the Policy manual).

This can greatly enhance an otherwise dull set of text pages.
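The idea behind a link dictionary can be sketched in a few lines of Python. This is only an illustration of the concept, not AscToHTM's implementation or its policy syntax (for that, see the Policy manual). The function name apply_link_dictionary and the example.com URLs are hypothetical, and the sketch assumes each phrase starts and ends with a letter or digit.

```python
import html
import re


def apply_link_dictionary(text, links):
    """Turn each phrase found in `links` into a hyperlink.

    `links` maps a phrase to the URL it should point at.
    """
    if not links:
        return html.escape(text)
    # Try longer phrases first so they win over shorter ones that
    # start at the same position.
    phrases = sorted(links, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(p) for p in phrases) + r")\b"
    )
    out = []
    pos = 0
    for match in pattern.finditer(text):
        # Copy (and escape) the plain text before this phrase.
        out.append(html.escape(text[pos:match.start()]))
        url = html.escape(links[match.group(0)], quote=True)
        out.append('<A HREF="%s">%s</A>' % (url, html.escape(match.group(0))))
        pos = match.end()
    out.append(html.escape(text[pos:]))
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical dictionary entries, for illustration only.
    demo_links = {
        "policy manual": "http://example.com/policy",
        "AscToRTF": "http://example.com/rtf",
    }
    print(apply_link_dictionary("See the policy manual & AscToRTF.", demo_links))
```

Matching is case-sensitive and every occurrence is linked; a real tool would probably make both behaviours configurable.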





Converted from a single text file by AscToHTM
© 1997-99 John A. Fotheringham