AscToHTM

Documentation for the AscToHTM conversion utility



This documentation can be downloaded as part of the documentation set in .zip format (370k)





6 Using Document Policy files

This chapter has been largely superseded by the Policy manual.

Document policy files are ordinary text files that list the "policies" that AscToHTM should implement when converting your document. The file can also contain comment lines (starting with a "!" or "#" character) and headings for clarity.

A summary of the recognised policy lines is given in the Policy manual.

In most cases recognised policy lines are identical to those listed in the generated policy file (see 4.1). This is usually a good place to start when making your own policy.

Only those lines that are recognised policies are acted upon.

To use a policy file, simply list it on the command line after the name of the file being converted (see 4.2.2.3).

Document policies have two main uses:

  1. To correct any failure of analysis that AscToHTM makes. Hopefully this won't be needed too much as the core analysis engine improves.

     Examples include page width, whether or not underlined section headings are expected etc.

  2. To tell the program how to produce a better HTML end product in ways that couldn't possibly be inferred from the original text.

     Examples include adding colour and titles to the page, as well as requesting that a large document is split into several pages, and a contents list created.

The document sections in this chapter that described the policies in detail have been moved to a standalone document called the "Policy manual". That document describes the scope, effect, location and default values for all policies recognised by the program.


6.1 An example conversion

This documentation has itself been converted using AscToHTM. The files used were

This policy file "includes" the link dictionary a2hlinks.dat.

These files are included in the distribution kit as an example set of documentation.

You can, of course, use AscToHTM to convert this documentation into whatever format, colour etc. that you wish.




6.2 Analysis policies

These policies are used to control and correct the analysis of files during conversion. Full descriptions of these policies can be found in the Policy manual.

6.2.1 Overview ("look for") policies

The following analysis policies give you an overview of what the program is looking for, and let you enable or disable each item being looked for.

"Look for indentation"
"Look for hanging paragraphs"
"Look for white space"
"Look for short lines"
"Look for horizontal rulers"
"Minimum ruler length"
"Look for bullets"
"Search for definitions"
"Look for quoted text"
"Look for MAIL and USENET headers"
"Look for preformatted text"
"Attempt TABLE generation"
"Look for diagrams"


6.2.2 General Layout policies

The following analysis policies help control general layout parameters :-

"Page width"
"TAB size"
"Short line length"
"Min chapter size"

"Expect blank lines between paras"
"Hanging paragraph position(s)"

"Search for Definitions"
"New Paragraph Offset"
"Definition Char"

"Indent position(s)"


 

6.2.3 Bullet policies

AscToHTM has the following bullet point policies that will normally be correctly calculated on the analysis pass :-

"Look for bullets"

"Expect alphabetic bullets"
"Expect numbered bullets"
"Expect roman numeral bullets"

"Recognise '-' as a bullet"
"Recognise 'o' as a bullet"
"Bullet char"

AscToHTM tries hard not to get confused by any "1", "a" or "I" that happens to end up at the start of a line by chance. These could otherwise be mistaken for bullet points.


6.2.4 Contents analysis policies

There is only one contents analysis policy :-

"Expect contents list"

This is described together with all the output contents list policies in Contents generation policies (6.3.3).

For more information on contents list generation see 5.6.2.


6.2.5 File Structure policies

AscToHTM has the following file structure policies that will normally need to be set manually :-

"Keep it simple"

"Expect code samples"
"Input file contains DOS characters"
"Input file contains MIME encoding"
"Input file contains PCL codes"
"Input file contains Japanese characters"
"Input file has change bars"
"Input file has page markers"
"Page marker size (in lines)"

"Text Justification"
"Input file is double spaced"


 

6.2.6 Heading policies

AscToHTM has the following section heading policies that will normally be correctly calculated on the analysis pass :-

"Expect Numbered Headings"
"Expect Underlined Headings"
"Expect Capitalised Headings"
"Expect Embedded Headings"
"Heading key phrases"

"Check indentation for consistency"

"Expect Second Word Headings"
"First Section Number"
"Smallest possible section number"
"Largest possible section number"

"Preserve underlining of headings"

Section headers are far and away the most complex things the analysis pass has to detect, and the most likely area for errors to occur.

AscToHTM will also document to a policy file the headings it finds. This is still to be finalised, but currently has the format

        We have 4 recognised headings
        Heading level 0 = "" N at indent 0
        Heading level 1 = "" N.N at indent 0
        Contents level 0 = "" N at indent 0
        Contents level 1 = "" N.N at indent 2

AscToHTM will read in such lines from a policy text file, but does not yet fully support editing these via the Windows interface.

The syntax is explained below, but this will probably change in future releases. You can edit these lines in your policy file, and through the policy options in Windows.

The lines are currently structured as follows


Line component    Value
--------------    -----
xxxx              Either "Heading" or "Contents" according to the part of
                  the policy being described.

Level n           Level number, starting at 0 for chapters, 1 for level 1
                  headings etc.

"Some_word"       Any text that may be expected to occur before the heading
                  number. E.g. "Chapter" or "Section" or "[". The case is
                  unimportant.

N.Nx              The style of the heading number. This will ultimately (in
                  later versions) be read as a series of number/separator
                  pairs.

                  The proposed format is
                    "N" = number
                    "i" / "I" = lower/upper case roman numeral
                  with an 'x' at the end signalling that trailing letters
                  may be expected (e.g. 5.6a, 5.6b)

at indent n       The indentation that this heading is expected at. This is
                  important in helping to eliminate false candidates.
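
The heading lines described above can be picked apart mechanically. The following is only a sketch, assuming the line layout shown in the example (kind, "level n", a quoted prefix, a number style, then "at indent n"). The names HeadingRule and parse_heading_rule are hypothetical and are not part of AscToHTM, and the format itself may change in future releases.

```python
import re
from typing import NamedTuple, Optional


class HeadingRule(NamedTuple):
    kind: str    # "Heading" or "Contents"
    level: int   # 0 for chapters, 1 for level 1 headings, ...
    prefix: str  # text expected before the number, e.g. "Chapter"
    style: str   # numbering style, e.g. "N.N" or "N.Nx"
    indent: int  # indentation the heading is expected at


# Assumed layout:  Heading level 1 = "" N.N at indent 0
_RULE = re.compile(
    r'^\s*(Heading|Contents)\s+level\s+(\d+)\s*=\s*"([^"]*)"'
    r'\s+(\S+)\s+at\s+indent\s+(\d+)\s*$',
    re.IGNORECASE,
)


def parse_heading_rule(line: str) -> Optional[HeadingRule]:
    """Return the parsed rule, or None if the line is not a heading rule."""
    match = _RULE.match(line)
    if match is None:
        return None
    kind, level, prefix, style, indent = match.groups()
    return HeadingRule(kind.capitalize(), int(level), prefix, style, int(indent))


rule = parse_heading_rule('Contents level 1 = "" N.N at indent 2')
```

Lines that do not fit the pattern, such as the "We have 4 recognised headings" summary line, simply return None.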

6.2.7 Pre-formatted text policies

AscToHTM has the following pre-formatted text policy that will normally be correctly calculated on the analysis pass :-

"Minimum automatic <PRE> size"


6.2.8 Table analysis policies

New in version 4

AscToHTM uses the following policies to control the detection and analysis of tables :-

"Attempt TABLE generation"

"Table extending factor"

"Expect sparse tables"
"Ignore table header during analysis"
"Column merging factor"
"Minimum TABLE column separation"

"Default TABLE layout"
"Tables could be blank line separated"

6.3 Output policies

These policies are used to control the output and generation of files during conversion. Full descriptions of these policies can be found in the Policy manual.

6.3.1 Added HTML policies

AscToHTM has the following HTML policies that will only ever take effect if supplied in a user policy file :-

"Use first heading as title"
"Use first line as title"
"Document title"

"Document description"
"Document keywords"
"Background Image"

"HTML header file"
"HTML footer file"
"HTML Script file"

"Omit <HEAD> and <BODY> from output"
"Document Base URL"
"Comment generation code"
"HTML fragments file"

These "policies" allow you to start "adding value" to the HTML generated. That is, they allow you to specify things that cannot be inferred from the original text.

You can also add HTML to your files by using the HTML preprocessor command (see 7.1.1).


6.3.2 Cascading Style sheet policies (CSS)

New in version 4

AscToHTM has the following HTML policies that influence the use of CSS in the HTML generated :-

"Document Style Sheet"

Not visible in the user interface is :-

"Create embedded style sheet"

6.3.3 Contents generation policies

AscToHTM has the following HTML policies that influence the detection and generation of contents lists :-

"Expect Contents List"

"Add contents list"
"Maximum level to show in contents"

"Use any existing contents list"

"Generate external contents file"
"External contents list filename"

"Hyperlinks on numbers"

See also the discussion in 5.6.2


6.3.4 Document Colour policies

New in version 4

AscToHTM has a large number of HTML policies that can control the colouring of the files. These policies are spread across a number of areas of functionality.

General

"Suppress all colour markup"

"Active Link Colour"
"Background Colour"
"Text Colour"
"Unvisited Link Colour"
"Visited Link Colour"

Frames

"Header frame background colour"
"Header frame text colour"
"Contents frame background colour"
"Contents frame text colour"
"Footer frame background colour"
"Footer frame text colour"

Tables

"Colour data rows"
"Default TABLE border colour"
"Default TABLE colour"
"Default TABLE even row colour"
"Default TABLE odd row colour"

6.3.5 Directory Page policies

AscToHTM has the following policies that can be used to influence whether or not AscToHTM will attempt to generate a Directory page for the files being converted. This is really only appropriate when converting more than one file at once (see 4.3.3).

The Directory Page will consist of entries for each file being converted (in order of conversion), and can have hyperlinks to the files, and to recognised headings in the files. This makes it suitable for use as a master index to a set of files converted in a single directory.

"Make Directory"
"Indent headings in Directory"
"Show file titles in Directory"
"Directory filename"

"Directory title"
"Directory description"
"Directory keywords"
"Directory return hyperlink text"

"Directory header file"
"Directory footer file"
"Directory script file"

6.3.6 File generation policies

AscToHTM has the following HTML policies that affect the file generation process :-

"Input directory"
"Output directory"
"Use .HTM extension"
"Output file extension"

"Preserve file structure using <PRE>"
"Preserve line structure"
"Treat each line as a paragraph"

"Generate diagnostics files"
"Output policy file"
"Output policy filename"

"DOS filename root"
"Use DOS filenames"

"Split level"
"Min HTML File size"
"Add navigation bar"
"Minimise HTML file size"

"Break up long HTML lines"

These policies specify how your document is divided into one or more HTML files, and how those files are to be named and linked together with hyperlinks.

6.3.7 Font policies

AscToHTM supports the implementation of fonts via either Cascading style sheets (CSS) or via the <FONT> tag.

Related policies are :-

"Use CSS to implement fonts"
"Default font"

6.3.8 Frames policies

New in version 4

From version 4 onwards AscToHTM will support the output of HTML as a set of HTML FRAMES. A large number of policies support this process.

General

"Place document in frames"

"Output frame name"
"Add Frame border"

"Open frame links in new window"
"New frame link window name"

"Add NOFRAMES links"
"NOFRAMES link URL"

Header and Footer frame policies

"Use main header in header frame"
"Header Frame depth"

"Use main footer in footer frame"
"Footer Frame depth"

Contents frame

"Add contents frame if possible"
"Contents Frame width"
"Number of levels in contents frame"

Main Frame

"Split level"
"Min HTML File size"
"First frame page number"

Frame colours

"Header frame background colour"
"Header frame text colour"
"Contents frame background colour"
"Contents frame text colour"
"Footer frame background colour"
"Footer frame text colour"


6.3.9 Hyperlink policies

AscToHTM has the following hyperlink policies set as defaults :-

"Create hyperlinks"
"Create mailto links"
"Allow email beginning with numbers"
"Check domain name syntax"

"Create gopher links"
"Create FTP links"
"Only allow explicit FTP links"

"Create NEWS links"
"Only use known groups"
"Recognised USENET groups"

"Add <BR> to lines with URLs"

"Cross-refs at level"

"Open link in new browser window"
"new browser window name"

Hyperlinks can also be added by using a link dictionary (see 4.3.2.2 and 4.4.2).


6.3.10 Link Dictionary policies

Link definitions appear in a policy file as follows :-

        [Link Dictionary]
        -----------------
        Link definition       :  "a2hdoco.txt" = "Source text" + "/~jaf/A2HDOCOi

That is, the text to be matched, the text to be used in its place as the highlighted text, and the URL this link is to point to (in this case a relative URL).

See the discussions in 4.3.2.2 and 4.4.2.
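
As an illustration of the three parts of a link definition, the sketch below splits such a line into match text, display text and URL. It assumes the layout shown above (key phrase, ":", then three quoted strings joined by "=" and "+"). LinkDefinition and parse_link_definition are hypothetical names, and the sample link is invented for the example.

```python
import re
from typing import NamedTuple, Optional


class LinkDefinition(NamedTuple):
    match_text: str    # text to be matched in the source document
    display_text: str  # text shown as the highlighted link
    url: str           # URL the link points to


# Assumed value layout:  "match" = "display" + "url"
_LINK = re.compile(r'^\s*"([^"]*)"\s*=\s*"([^"]*)"\s*\+\s*"([^"]*)"\s*$')


def parse_link_definition(policy_line: str) -> Optional[LinkDefinition]:
    """Parse one 'Link definition : ...' policy line, or return None."""
    # Only the first ":" separates the key phrase from the value,
    # so a ":" inside the URL is left untouched.
    key, sep, value = policy_line.partition(":")
    if not sep or key.strip() != "Link definition":
        return None
    match = _LINK.match(value)
    return LinkDefinition(*match.groups()) if match else None


# Hypothetical example line (not taken from the distribution kit).
link = parse_link_definition(
    'Link definition : "user guide" = "the User Guide" + "http://example.com/guide.html"'
)
```

A line whose quotes are unbalanced or truncated does not match the pattern and yields None, which is a reasonable way to flag a damaged entry.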


6.3.11 Preprocessor policies

AscToHTM has the following policies that can be used to influence the preprocessor (see Using the preprocessor), and hence the HTML output :-

"Use Preprocessor"
"Include document section(s)"

"Allow definitions inside PRE"


6.3.12 HTML styling policies

AscToHTM has the following "styling" policies that can be used to influence the HTML output :-

"Allow automatic centring"
"Automatic centring tolerance"
"Ignore multiple blank lines"

"Highlight definition text"
"Use <DL> markup for defn. paras"

"Largest allowed <Hn> tag"
"Smallest allowed <Hn> tag"
"Headings colour"
"Preserve underlining of headings"

"Search for emphasis"

"Use <EM> and <STRONG> markup"

"Preserve New Paragraph Offset"

Also, not available in the user interface is :-

"First line indentation (in blocks)"

6.3.13 Table Generation policies

AscToHTM has the following policies that can be used to influence whether or not AscToHTM will attempt to detect and generate HTML tables, and the attributes of any tables generated.

Tables may be tailored individually by adding pre-processor commands to your source text (see 7.1.4)

"Attempt TABLE generation"

"Default TABLE cell spacing"
"Default TABLE cell padding"
"Default TABLE border size"
"Default TABLE width"

"Default TABLE colour"
"Default TABLE border colour"

"Colour data rows"
"Default TABLE even row colour"
"Default TABLE odd row colour"

"Default TABLE alignment"
"Default TABLE cell alignment"

"Convert TABLE X-refs to links"

The following policies can only be changed through a policy file, but are probably best avoided in favour of their equivalent preprocessor tags.

"Default TABLE caption"

"Default TABLE header rows"
"Default TABLE header cols"

"Column boundaries have zero width"

"Use <CODE>..</CODE> markup"


6.3.14 Miscellaneous policies

AscToHTM supports the following policies which currently can only be added by editing the .policy file

Contents List
"Add mail headers to contents list"

CSS
"Create embedded style sheet"

File generation
"Break up long HTML lines"
"HTML version to be targeted"
"Lines to ignore at end of file"
"Lines to ignore at start of file"

Fonts
"Suppress all font markup"

Headings
"Expect Second Word Headings"
"First Section Number"
"Number of words to include in filename"

HTML Generation
"HTML version to be targeted"

Style
"First line indentation (in blocks)"

Tables
"Default TABLE caption"

"Default TABLE header rows"
"Default TABLE header cols"

"Column boundaries have zero width"

"Use <CODE>..</CODE> markup"


6.4 Settings policies

New in version 4

These policies are used to control the behaviour of the program during the conversion process. Most program settings are not available as policies, but those that are available are listed here. Full descriptions of these policies can be found in the Policy manual.


6.4.1 Error reporting

The following policies can be used to tailor the number and type of messages displayed during conversion.

"Error reporting level"

"Suppress INFO messages"
"Suppress TAG ERROR messages"
"Suppress URL messages"
"Suppress WARNING messages"
"Suppress program ERROR messages"

6.5 Saving and loading policy files

This section has been copied into the Policy manual section on placing policies in a file

6.5.1 Overview

AscToHTM allows you to save policies to file so that you can later reload them. This allows you to easily define different ways of doing conversions, either for different types of files, or to produce different types of output.

The policy files have a .pol extension by default, and are simple text files, with one policy on each line. You can, if you wish, edit these policies in a text editor. This is sometimes easier than using all the dialogs in the Windows version.

When editing policies, it is important not to change the key phrase (the bit before the ":" character), as this needs to be matched exactly by AscToHTM.

For best results, it is advisable to put in your policy file only those policies you want to fix. This leaves AscToHTM to calculate document-by-document policies that suit the files being converted.

Note:
Avoid using a "full" policy file for your conversions. Such files prevent the program from adjusting to each source file, often leading to unwanted results.
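
To make the file layout concrete, here is a minimal sketch of how a policy file could be read by a script of your own. It relies only on what is stated in this chapter: one policy per line, a key phrase before the ":" character, and comment lines starting with "!" or "#". The function name parse_policy_lines and the sample values are assumptions for illustration, and this is not how AscToHTM itself reads the file.

```python
from typing import Dict, Iterable

COMMENT_PREFIXES = ("!", "#")


def parse_policy_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Map each key phrase to its value, skipping comments and headings."""
    policies: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(COMMENT_PREFIXES):
            continue  # blank line or comment
        key, sep, value = line.partition(":")
        if not sep:
            continue  # heading or underline: no key phrase present
        policies[key.strip()] = value.strip()
    return policies


# Hypothetical file content; the values shown are invented.
sample = [
    "! my partial policy file",
    "[Analysis]",
    "----------",
    "Page width : 80",
    "Look for bullets : Yes",
]
policies = parse_policy_lines(sample)
```

Because the result is a dictionary, a key phrase that legitimately repeats (such as several link definitions) would keep only its last value; a list of pairs would be the better container in that case.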

6.5.2 Generating policy files for your document

The normal way to create a policy file is by setting options and then saving them using the "save policy file" dialog. This will offer you the choice of creating a partial policy file or a full policy file (see 6.5.2.1 and 6.5.2.2).

Alternatively, you can set the "Output policy file" policy, which will generate a full policy file resulting from the analysis of the converted document.

Once a file is generated you can either edit it in a text editor (deleting policies that are of little interest to you, and editing those that are) or reload it into the program, change the policies and save them again.
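
Trimming a generated file down to the few policies you care about can also be scripted. The sketch below is one possible approach, assuming the "key phrase : value" line layout described in this chapter. The function name keep_selected_policies and the sample values are hypothetical.

```python
from typing import Iterable, List


def keep_selected_policies(lines: Iterable[str], wanted: Iterable[str]) -> List[str]:
    """Return only the policy lines whose key phrase is in `wanted`."""
    wanted_keys = {key.strip() for key in wanted}
    kept: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        stripped = line.strip()
        if not stripped or stripped.startswith(("!", "#")):
            continue  # drop blank lines and comments
        key, sep, _value = stripped.partition(":")
        # The key phrase must match exactly, so no case folding is done.
        if sep and key.strip() in wanted_keys:
            kept.append(line)
    return kept


# Hypothetical generated file; the values are invented.
full_policy = [
    "! generated policy file",
    "Page width : 80",
    "TAB size : 8",
    "Indent position(s) : 0,4",
]
partial_policy = keep_selected_policies(full_policy, ["Indent position(s)"])
```

Writing the kept lines to a new file, rather than over the generated one, leaves the original available for reference.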


6.5.2.1 Partial policy files

Partial policy files are files which have values for some, not all, policies.

These are recommended, because they leave AscToHTM free to adjust all the other policies not set in the file, allowing it to adapt to the details of the document being converted.

For example, you should only set the indentation policy if you know what indents you are using, or if you want to override those calculated.

When you save a policy file from inside AscToHTM, a partial policy file will contain

6.5.2.2 Full policy files

A "full" policy file contains a value for almost every possible policy. Such files are usually only useful for documentation and analysis purposes, and should almost never be reloaded as input into a conversion, as this would totally fix the conversion details.


6.5.3 Naming policy files

Whenever the "Output policy file" policy is set the generated "full" policy file is usually called

        <filename>.pol

where <filename> is the name of the file being created. When this happens any existing file of that name will be overwritten.

For this reason we strongly advise you adopt a naming convention of the form

        in_<filename>.pol or i<filename>.pol

or place your input policies in a different directory and ensure they are backed up.
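
The naming convention can be checked with a few lines of script before a conversion is run. This is a sketch only: it assumes that <filename> is the source file's name without its extension and that the generated policy file is written next to the source file, neither of which is confirmed here. All three function names are hypothetical.

```python
from pathlib import Path


def generated_policy_path(source: Path) -> Path:
    """Assumed location of the generated policy file: <filename>.pol."""
    return source.with_suffix(".pol")


def input_policy_path(source: Path, prefix: str = "in_") -> Path:
    """Suggested name for a hand-maintained input policy: in_<filename>.pol."""
    return source.with_name(prefix + source.stem + ".pol")


def would_be_overwritten(policy: Path, source: Path) -> bool:
    """True if `policy` has the same path as the generated policy file."""
    return policy == generated_policy_path(source)


source = Path("docs/manual.txt")
safe_policy = input_policy_path(source)
```

Passing prefix="i" gives the shorter i<filename>.pol form mentioned above.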



Converted from a single text file by AscToHTM
© 1997-2001 John A Fotheringham