Document revision date: 30 March 2001

Guide to the DEC Text Processing Utility



3.7.3.2 & (Pattern Linking Operator)

The link operator (&) is similar to the concatenation operator (+). Unlike the concatenation operator, the link operator does not necessarily cause an anchored search. If you define a pattern by specifying any pattern element, an ampersand (&), and a pattern or keyword variable, the search for the right-hand subpattern is not anchored to the text that matched the left-hand element.

If you link elements other than pattern variables, the search is an anchored search unless you specify otherwise. Strings, constants, and the results of built-in procedures are not pattern variables.

For example, suppose you defined two subpattern variables as follows:


p1 := "a" & ANY("012345678"); 
p2 := "c" & ARB (1); 

You then define the following pattern variable:


pat_var := p1 & p2; 

Given this sequence of definitions, a search for pat_var succeeds if DECTPU encounters the following string:


a5xcd 

Because two pattern variables are linked, DECTPU searches first for the text that matches p1, then unanchors the search, and then searches for the text that matches p2.

To specify an anchored search when the right-hand subpattern is a pattern or keyword variable, use a plus sign (+). You must use a plus sign (+) to anchor the search if the right-hand subpattern is a keyword variable. If the right-hand subpattern is a pattern variable, you can use the ANCHOR keyword as the first element of that subpattern to anchor the right-hand subpattern.

For example, suppose you defined the following patterns:


p1 := LINE_BEGIN + "a"; 
p2 := "b" + LINE_END; 

You anchor the search for p2 by using a plus sign (+) as follows:


pat_var := p1 + p2; 

If you use an ampersand (&), you unanchor the search for p2.

You can also anchor the search for p2 by defining p2 as follows:


p2 := ANCHOR + "b" + LINE_END; 

3.7.3.3 | (Pattern Alternation Operator)

The alternation operator (|) tells SEARCH or SEARCH_QUIETLY to match a sequence of characters if those characters match either of the pattern elements separated by the alternation operator. The following pattern matches either the string abc or the string xes:


pat1 := "abc" | "xes"; 

If the text being searched contains text that matches both alternatives, SEARCH or SEARCH_QUIETLY matches the earliest occurring match. If two matches start at the same character, SEARCH or SEARCH_QUIETLY matches the left element. For example, suppose you had the search text abcd and the following pattern definitions:


pat1 := "abc" | "bcd"; 
pat2 := "bcd" | "abc"; 
pat3 := "bc"  | "bcd"; 
pat4 := "bcd" | "bc"; 

Given these definitions and search text, a search for the patterns pat1 and pat2 would return a range that contains the text abc. A search for the pattern pat3 would return a range that contains the text bc. Finally, a search for the pattern pat4 would return a range that contains the text bcd.
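
The four results above can be checked with a short DECTPU fragment. This is only a sketch: the procedure name user_show_match is hypothetical, and it assumes the current buffer contains only the text abcd. The results shown in the comments are the ones stated in the preceding paragraph.

```tpu
PROCEDURE user_show_match (pat)
    ! Hypothetical helper: search forward from the top of the current
    ! buffer and display the matched text, if any, in the message area.
    LOCAL found;

    POSITION (BEGINNING_OF (CURRENT_BUFFER));
    found := SEARCH_QUIETLY (pat, FORWARD);
    IF found = 0
    THEN
        MESSAGE ("No match");
    ELSE
        MESSAGE (STR (found));
    ENDIF;
ENDPROCEDURE;

! Assumes abcd is the only text in the current buffer.
user_show_match ("abc" | "bcd");   ! abc  (earliest match)
user_show_match ("bcd" | "abc");   ! abc  (earliest match)
user_show_match ("bc"  | "bcd");   ! bc   (same start, left element)
user_show_match ("bcd" | "bc");    ! bcd  (same start, left element)
```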

3.7.3.4 @ (Partial Pattern Assignment Operator)

The partial pattern assignment operator (@) tells SEARCH or SEARCH_QUIETLY to create a range that contains the text matching the pattern element to the left of the partial pattern assignment operator. When the search is completed, the variable to the right of the partial pattern assignment operator references the created range. If SEARCH or SEARCH_QUIETLY is given the search text abcdefg and the following pattern, it returns a range that contains the text abcdefg:


pat1 := "abc" + (arb(2) @ var1) + remain; 

SEARCH or SEARCH_QUIETLY also assigns to var1 a range that contains the text de.
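
As an illustration, the following fragment runs the search and then displays both ranges. It is a sketch that assumes the current buffer contains the text abcdefg; the variable name found is hypothetical.

```tpu
! Assumes the current buffer contains the text abcdefg.
POSITION (BEGINNING_OF (CURRENT_BUFFER));
pat1  := "abc" + (ARB (2) @ var1) + REMAIN;
found := SEARCH_QUIETLY (pat1, FORWARD);

IF found <> 0
THEN
    MESSAGE ("Whole match: " + STR (found));   ! abcdefg
    MESSAGE ("var1: " + STR (var1));           ! de
ENDIF;
```

Because var1 is assigned only when the complete pattern matches, the fragment reads var1 only after checking that the search returned a range.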

If you assign to a variable a partial pattern that matches a position, rather than a character, the partial pattern variable is a range that contains the character or line-end at the point in the file where the partial pattern was matched. For example, in any of the following patterns that contain partial pattern assignments, the variable partial_pattern_variable contains the character or line-end at the point in the file where the partial pattern was matched:

If you use one of the preceding patterns when the cursor is free (that is, in an area that does not contain text, such as the area after the end of a line), the variable partial_pattern_variable contains the line-end or character nearest to the cursor.

SEARCH or SEARCH_QUIETLY does partial pattern assignment only if the complete pattern matches. If the complete pattern matches, it makes assignments only to those variables paired with pattern elements that are used in the complete match. If a partial pattern assignment variable appears more than once in a pattern in places where it is legal for a partial pattern assignment to occur, the last occurrence that takes part in the match determines what range SEARCH or SEARCH_QUIETLY assigns to the variable. For example, with the search text abcdefg and the following pattern, SEARCH or SEARCH_QUIETLY returns a range that contains the text abcde and assigns a range that contains the text d to the variable var1, because the alternative containing "x" is not used in the match:


pat1 := "a" + ("b" @ var1) + "c" + ("d" @ var1) 
        + ("e" | ("x" @ var1)); 

3.7.3.5 Relational Operators

You can use the two relational operators, equal (=) and not equal (<>), to compare patterns. Two patterns are equal if they are the same pattern, as pat1 and pat2 are in the following example:


pat1 := notany("abc", 2) + span("123"); 
pat2 := pat1; 

Two patterns are also equal if they have the same internal representation. Patterns have the same internal representation only if they are built in exactly the same way. The order of the characters in the arguments to ANY, NOTANY, SCAN, SCANL, SPAN, and SPANL does not matter when you are comparing patterns returned by any of these built-ins. Other than this, almost any difference in the building of two patterns makes those patterns unequal. For example, suppose you defined the variable this_pat as follows:


this_pat := ANY ("abc"); 

Given this definition, the following patterns match the same text but are not equal:


pat1 := LINE_BEGIN + ANY ("abc"); 
pat2 := LINE_BEGIN + this_pat; 
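
The two comparison rules described above can be exercised as follows. This is a sketch; the variable name same_set is hypothetical, and the outcomes in the comments follow from the rules stated in this section.

```tpu
this_pat := ANY ("abc");
same_set := ANY ("cba");               ! same characters, different order
pat1     := LINE_BEGIN + ANY ("abc");
pat2     := LINE_BEGIN + this_pat;

IF this_pat = same_set
THEN
    ! Character order in the argument to ANY does not matter.
    MESSAGE ("this_pat and same_set are equal");
ENDIF;

IF pat1 <> pat2
THEN
    ! Built differently, so unequal, even though they match the same text.
    MESSAGE ("pat1 and pat2 are not equal");
ENDIF;
```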

3.7.4 Compiling and Executing Patterns

When you execute a DECTPU statement that contains a pattern expression, DECTPU builds an internal representation of the pattern. DECTPU uses the current contents of any buffers or ranges used as arguments to pattern built-ins in the pattern expression to build the internal representation. Later changes to those buffers and ranges do not affect the internal representation for the pattern. DECTPU also uses the current values of any variables used in the pattern expression. Later changes to these variables do not affect the internal representation of the pattern. For example, suppose you wrote the following code fragment:


p1  := "abc"; 
p2  := "123"; 
pat := p1 & p2; 
p1  := "xyz"; 
SEARCH (pat, FORWARD); 

Given this code fragment, the search matches the string "abc123" because DECTPU builds pat from the values that p1 and p2 hold when the assignment statement executes. The later assignment to p1 has no effect on pat.

3.7.5 Searching for a Pattern

The SEARCH and SEARCH_QUIETLY built-ins use the following algorithm to find a match for a pattern:

  1. Put the internal marker that marks the search position at the starting position for the search. The starting position is determined as follows:
  2. Check whether the pattern matches text, starting at the current search position and extending toward the end of the searched buffer or range. If a range is being searched, the matched text cannot extend beyond the end of that range. If the pattern matches, return a range that contains the matching text and stop searching.
  3. If the previous step fails, move the search position one character forward or backward, depending upon the direction of the search. If this is impossible because the search position is at the end or beginning of the searched buffer or range, stop searching. If this step succeeds, repeat the previous step.
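
Steps 2 and 3 can be sketched in DECTPU itself by retrying an anchored match at successive positions. The procedure below is a hypothetical illustration (user_search_sketch is not a DECTPU built-in) of a forward search of the current buffer that starts at the editing point. It is not how SEARCH is implemented, and it ignores range searches, reverse searches, and the special cases described in the note that follows.

```tpu
PROCEDURE user_search_sketch (pat)
    ! Hypothetical emulation of a forward search of the current buffer.
    ! Returns the matching range, or 0 if no match is found.
    LOCAL found;

    LOOP
        ! Step 2: try to match the pattern at the current position only.
        found := SEARCH_QUIETLY (ANCHOR + pat, FORWARD);
        EXITIF found <> 0;

        ! Step 3: stop at the end of the buffer; otherwise move one
        ! character forward and try again.
        EXITIF MARK (NONE) = END_OF (CURRENT_BUFFER);
        MOVE_HORIZONTAL (1);
    ENDLOOP;

    RETURN (found);
ENDPROCEDURE;
```

Unlike SEARCH, this sketch moves the editing point as it goes, so a caller that needs the original position would have to save and restore it.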

Note

This algorithm changes if you specify a reverse search for a pattern starting with SCAN, SPAN, SCANL, or SPANL. For more information, see the descriptions of these built-in procedures in the DEC Text Processing Utility Reference Manual.

3.7.6 Anchoring a Pattern

Anchoring a pattern forces SEARCH or SEARCH_QUIETLY to match the anchored part of the pattern to text starting at the current search position. If the anchored part of a pattern fails to match that text, SEARCH or SEARCH_QUIETLY stops searching.

Usually, all pattern elements other than the first pattern element of a pattern are anchored. This means that a pattern can match text starting at any point in the searched text but that once it starts matching, each pattern element must match the text immediately following the text that matched the previous pattern element.

To direct DECTPU to stop searching if the characters starting at the editing point do not match the pattern, use the ANCHOR keyword as the first pattern element. For example, the following pattern matches only if the string abc occurs at the editing point:


pat1 := ANCHOR + "abc"; 

There are two ways to unanchor pattern elements in the midst of a pattern. The easiest is to concatenate or link the UNANCHOR keyword before the pattern element you want to unanchor. The following pattern unanchors the pattern element xyz:


pat1 := "abc" + UNANCHOR + "xyz"; 

This means that the pattern pat1 matches any text beginning with the characters abc and ending with the characters xyz. It does not matter what or how many characters or line breaks appear between the two sets of characters. Since SEARCH or SEARCH_QUIETLY matches the first xyz it finds, the text between the two sets of characters by definition does not contain the string xyz.

The second way to unanchor a pattern element is to use the special properties of the link operator (&). While the concatenation operator always anchors the right pattern element to the left, the link operator does so only if the right pattern element is not a pattern variable. If the link operator's right pattern element is a pattern variable, the link operator unanchors that pattern element. The pattern pat2 defined by the following assignments matches any sequence of text that begins with the letter a and ends with a digit.


pat1 := ANY ("0123456789"); 
pat2 := "a" & pat1; 

Any amount of text can occur between the a and the digit. The pattern pat2 matches the same text as the following pattern:


pat3 := "a" + UNANCHOR + ANY( "0123456789" ); 

The link operator unanchors a pattern variable regardless of what the left pattern element is. In particular, the following two patterns match the same text:


pat2 := "a" & pat1; 
pat3 := "a" & ANCHOR & pat1; 

If you are using pattern variables to form patterns and you wish those variables to be anchored, you have two choices: you can use the concatenation operator, or you can use the ANCHOR keyword as the first element of any pattern the pattern variables reference.
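
The following definitions collect the anchoring choices described in this section. They are a sketch; all variable names are hypothetical, and the comments assume that the searched text is ab7.

```tpu
digit          := ANY ("0123456789");
anchored_digit := ANCHOR + ANY ("0123456789");

! Unanchored: with the text ab7, both of these match ab7.
pat_linked     := "a" & digit;                          ! right element is a pattern variable
pat_unanchored := "a" + UNANCHOR + ANY ("0123456789");  ! explicit UNANCHOR

! Anchored: with the text ab7, neither of these matches, because
! no digit immediately follows the a.
pat_concat     := "a" + digit;                          ! concatenation operator
pat_keyword    := "a" & anchored_digit;                 ! ANCHOR is the first element of the variable
```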

3.8 Process Data Type

The CREATE_PROCESS built-in procedure returns a value of the process data type.

A DECTPU process runs as a subprocess.

DECTPU processes have the same restrictions that OpenVMS subprocesses have. Following are some of the restrictions:

The following example assigns a value of the process data type to the variable x:


x := CREATE_PROCESS (main_buffer, "MAIL"); 

The first parameter specifies that the output from the process is to be stored in MAIN_BUFFER. The string "MAIL" is the first command sent to the subprocess.

To pass commands to a subprocess, use the SEND built-in procedure, as follows:


SEND ("MAIL", x); 

To pass the READ command to the Mail utility, enter the following DECTPU statement:


SEND ("READ", x); 

The output from the READ command is stored in the buffer associated with the process x. If the buffer associated with a process is deleted, the process is deleted as well.
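
Put together, a session with a subprocess might look like the following sketch. The names mail_buffer and mail_process and the buffer name "mail_output" are hypothetical; the fragment uses a dedicated buffer so that deleting it does not affect other text.

```tpu
mail_buffer  := CREATE_BUFFER ("mail_output");
mail_process := CREATE_PROCESS (mail_buffer, "MAIL");

SEND ("READ", mail_process);     ! output is stored in mail_buffer

! Deleting the buffer associated with the process also deletes the process.
DELETE (mail_buffer);
```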

3.9 Program Data Type

A program is the compiled form of a sequence of DECTPU procedures and executable statements. The COMPILE and LOOKUP_KEY built-in procedures can optionally return a value of the program data type as a result. The following example assigns a value of the program data type to the variable x:


x := COMPILE (main_buffer); 

MAIN_BUFFER must contain only DECTPU declarations, executable statements, and comments. All declarations must come before any executable statements that are not included in the declarations. The declarations and statements are compiled and the resulting program is stored in the variable x.
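
Once a program has been stored in a variable, it can be run with the EXECUTE built-in procedure. The following is a minimal sketch; the test against 0 is a defensive assumption that COMPILE yields 0 when compilation does not produce a program.

```tpu
x := COMPILE (main_buffer);

IF x <> 0               ! assumption: 0 means no program was produced
THEN
    EXECUTE (x);
ENDIF;
```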

3.10 Range Data Type

A range contains all the text between (and including) two markers. You can form a range with the CREATE_RANGE built-in procedure. A range is associated with characters within a buffer. If the characters within a range move, the range moves with them. If characters are added or deleted between two markers that delimit a range, the size of the range changes. If all the characters in a range are deleted, the range moves to the nearest character.

DECTPU does not support ranges of zero length unless the range begins and ends at the end of a buffer. All other ranges contain at least one character (which could be a space character) or a line-end (if the range is created at the end of a line).

If you create a range by specifying a free marker as a parameter to the CREATE_RANGE built-in, DECTPU creates a new marker and binds the marker to the text nearest to the free marker position. DECTPU uses the new bound marker as the range delimiter. This operation does not cause insertion of padding spaces.

Deleting the markers used to create a range does not affect the range.

To convert the contents of a range to a string, use either the STR or the SUBSTR built-in procedure.

To remove a range, use the DELETE built-in procedure with the range as a parameter. For example, the following statement deletes the range range1:


DELETE (range1); 

You can also delete a range by removing all variable references to the range. To do this, set all variables referring to the range to some other value, such as 0. For example, the following statement sets the variable range1 to 0:


range1 := 0; 

Deleting a range does not remove the characters of the range from the buffer; it merely removes the range data structure. To remove the characters of a range, use the ERASE built-in procedure with the range as a parameter. For example, ERASE (my_range) removes all the characters in my_range but leaves the range structure in place, whereas DELETE (my_range) removes the range structure but leaves the characters in the buffer.
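
The difference between the two built-ins can be seen in a fragment such as the following. This is a sketch; the variable names are hypothetical, and it assumes that at least five characters follow the editing point on the current line.

```tpu
start_mark := MARK (NONE);           ! marker at the editing point
MOVE_HORIZONTAL (4);
end_mark   := MARK (NONE);           ! marker four characters later
my_range   := CREATE_RANGE (start_mark, end_mark, NONE);

ERASE (my_range);     ! removes the five characters; my_range still exists
DELETE (my_range);    ! removes the range structure; no text is affected
```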

The following built-in procedures, as well as the partial pattern assignment operator, all return values of the range data type:

The following example assigns a value of the range data type to the variable x:


x := CREATE_RANGE (mark1, mark2, UNDERLINE); 

You can specify the video attribute with which DECTPU should display a range. The possible attributes are BLINK, BOLD, REVERSE, and UNDERLINE. The UNDERLINE keyword in the preceding example specifies that the characters in the range will be underlined when they appear on the screen. You cannot give more than one video attribute to a range. However, to apply multiple video attributes to a given set of characters, you can define more than one range that contains those characters and give one video attribute to each range.
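
For instance, to display the same characters in bold and underlined, you might create two ranges over the same pair of markers. The range variable names in this sketch are hypothetical; mark1 and mark2 are the markers from the preceding example.

```tpu
bold_range      := CREATE_RANGE (mark1, mark2, BOLD);
underline_range := CREATE_RANGE (mark1, mark2, UNDERLINE);
```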

3.11 String Data Type

DECTPU uses the string data type to represent character data. A value of the string data type can contain any of the elements of a character set. You can select one of the following character sets to use with your string data:

To specify a string constant, enclose the value in quotation marks. In DECTPU, you can use either the quotation mark (") or the apostrophe (') as the delimiter for a string. The following statements assign a value of the string data type to the variable x:


x := 'abcd'; 
x := "abcd"; 

To specify the quote character itself within a string, type the character twice if you are using the same quote character as the delimiter for the string. The following statements show how to quote an apostrophe and a quotation mark, respectively:


x := '''';       ! The value assigned to x is '. 
x := """";       ! The value assigned to x is ". 

If you use the alternate quote character as the delimiter for the string within which you want to specify a quote character, you do not have to type the character twice. The following statements show how to quote an apostrophe and a quotation mark, respectively, when you use the alternate quote character to delimit the string:


x := "'";        ! The value assigned to x is '. 
x := '"';        ! The value assigned to x is ". 
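
The same rules apply when the quote character appears inside a longer string. The strings in the following sketch are illustrative only:

```tpu
x := 'Don''t panic';     ! doubled apostrophe; the value is Don't panic
x := "Don't panic";      ! alternate delimiter; the value is Don't panic
x := "Say ""yes""";      ! doubled quotation marks; the value is Say "yes"
x := 'Say "yes"';        ! alternate delimiter; the value is Say "yes"
```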

A null string is a string of length zero. You can assign a null string to the variable x in the following way:


x := ''; 

To create a string from the contents of a range, use the STR or the SUBSTR built-in procedure. To create a string from the contents of a buffer, use the STR built-in.
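
The following sketch shows these conversions. The variable names are hypothetical, and my_range is assumed to be an existing range that contains at least three characters.

```tpu
whole_text  := STR (my_range);            ! all the text in my_range
first_part  := SUBSTR (my_range, 1, 3);   ! the first three characters of my_range
buffer_text := STR (CURRENT_BUFFER);      ! the contents of the current buffer
```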

The maximum length for a string is 65,535 characters. A restriction of the DECTPU compiler is that a string constant (an open quotation mark, some characters, and a close quotation mark) must have both its opening and closing quotation marks on the same line. While a string can be up to 65,535 characters long, a line in a DECTPU buffer can only be 32,767 characters long. If you try to create a line that is longer than 32,767 characters, DECTPU truncates the inserted text to the amount that fills the line to 32,767 characters.
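
One way to work within the same-line restriction on string constants is to build a long string from several shorter constants joined with the plus sign (+). The variable name long_text in this sketch is hypothetical.

```tpu
! Each constant opens and closes on its own line; the statement
! itself continues across lines until the semicolon.
long_text := "This text would be awkward to write on a single line, "
           + "so it is built from two string constants.";
```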

Many DECTPU built-in procedures return a value of the string data type. The ASCII built-in procedure, for example, returns a string for the ordinal value that you use as a parameter. The following statement returns the string "K" in the variable my_char:


my_char := ASCII (75); 

To replicate a string, specify the string to be reproduced, then the multiplication operator (*), and then the number of times you want the string to be replicated. For example, the following DECTPU statement inserts 10 underscores into the current buffer at the editing point:


COPY_TEXT ("_" * 10) 

The string to be replicated must be on the left-hand side of the operator. For example, the following DECTPU statement produces an error:


COPY_TEXT (10 * "_") 

To reduce a string, specify the string to be modified, then the subtraction operator (-), and then the substring to be removed. Table 3-2 shows the effects of two string-reduction operations.

Table 3-2 Effects of Two String-Reduction Operations

DECTPU Statement                       Result
-------------------------------------  ------
COPY_TEXT ("FILENAME.MEM" - "FILE")    Inserts the string "NAME.MEM" into the current buffer at the editing point.
COPY_TEXT ("woolly" - "wool")          Inserts the string "ly" into the current buffer at the editing point.
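
Replication and reduction can be combined with ordinary string concatenation. In the following sketch the variable names are hypothetical, and the subtraction operator is written as a single minus sign (-).

```tpu
rule      := "_" * 10;                   ! ten underscores
base_name := "FILENAME.MEM" - ".MEM";    ! FILENAME

! Inserts FILENAME, a space, and ten underscores at the editing point.
COPY_TEXT (base_name + " " + rule);
```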

