VMS GREP User's Manual
by Eric Andresen

DESCRIPTION:

  The GREP command enables you to search through files for a string
  that matches one or more patterns. GREP displays all of the lines
  that match the pattern(s).

FORMAT:

  $ GREP file-spec[,...] pattern-string

  Command Qualifiers      Defaults

  /[NO]EXACT              /NOEXACT
  /[NO]HEADING            /NOHEADING
  /[NO]LOG                /NOLOG
  /[NO]NOT                /NONOT
  /[NO]NUMBERS            /NONUMBERS
  /[NO]OUTPUT             /OUTPUT

PARAMETERS:

  file-spec[,...]
    Specifies the names of one or more files to be searched. At least
    one file name is required; separate multiple file names with
    commas. Wildcard characters are allowed in the file-spec.

  pattern-string
    Specifies a pattern string to search for in the specified files.
    If the string contains any special characters or lowercase
    letters, enclose the string in quotes. The /EXACT qualifier causes
    GREP to differentiate between uppercase and lowercase letters. See
    the section PATTERNS for the syntax of the pattern-string.

PATTERNS:

  Regular expressions are composed of the following: [1]

  ^   matches the beginning of a line
  $   matches the end of a line
  .   matches any character
  \   followed by a single character matches that character. The
      following are special exceptions:

        \b   backspace
        \n   linefeed
        \r   carriage return
        \s   space
        \t   tab
        \\   backslash

  Note: The metacharacters ()+? are not supported as they are in the
  standard UNIX GREP command.

  A single character not otherwise endowed with special meaning
  matches that character.

  A string enclosed in brackets [] specifies a character class. Any
  single character in the string will be matched. For example, [abc]
  will match an a, b or c. Ranges of ASCII character codes may be
  abbreviated, as in [a-z0-9].

  If the first symbol following the [ is a ^ then a negative character
  class is specified. In this case, the string matches all characters
  except those enclosed in the brackets (that is, [^a-z] matches
  everything except lowercase letters).
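The rules above differ from most modern regular-expression libraries in a few places: \s means a single space, \b means backspace, and ( ) + ? are ordinary characters. As a rough illustration, the sketch below translates a pattern written in this manual's syntax into Python's re syntax. The names to_python_regex and matches are hypothetical helpers invented for this example and are not part of GREP; treating { and } as ordinary characters is an assumption, since the manual does not mention them.

```python
import re

# Escapes the manual lists as special exceptions, mapped to the literal
# characters they stand for. \s and \b mean something else in Python's re.
ESCAPES = {"b": "\b", "n": "\n", "r": "\r", "s": " ", "t": "\t", "\\": "\\"}


def to_python_regex(pattern):
    """Translate a pattern in the manual's syntax into Python re syntax.

    Hypothetical helper for illustration only; not part of GREP.
    """
    out = []
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            # \x stands for the character x, apart from the listed
            # exceptions. A trailing lone backslash is taken literally.
            nxt = pattern[i + 1] if i + 1 < len(pattern) else "\\"
            out.append(re.escape(ESCAPES.get(nxt, nxt)))
            i += 2
            continue
        if in_class:
            if ch == "]":
                in_class = False
                out.append(ch)
            elif ch == "[":
                out.append("\\[")  # a literal [ inside a class
            else:
                out.append(ch)
        elif ch == "[":
            in_class = True
            out.append(ch)
        elif ch in "()+?{}":
            # Ordinary characters in the manual's syntax (the braces are
            # an assumption), but operators in Python, so escape them.
            out.append("\\" + ch)
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def matches(pattern, line, exact=False):
    """Return True if the pattern is found anywhere in the line."""
    flags = 0 if exact else re.IGNORECASE  # /NOEXACT is the default
    return re.search(to_python_regex(pattern), line, flags) is not None
```

For instance, matches("f(x)", "y = f(x)") is true under this translation, whereas passing "f(x)" straight to Python's re would treat the parentheses as a group and also match "fx".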
  Note that a negative character class must match something, even
  though that something cannot be any of the characters listed. For
  example, ^$ is not the same as ^[^z]$. The first example matches an
  empty line (beginning of line followed by end of line); the second
  example matches a beginning of line followed by any character except
  a z followed by end of line. In the second example a character must
  be present on the line, but that character cannot be a z.

  Note that the characters *.^$ are not special characters when inside
  a character class, except for a ^ immediately following the [, which
  specifies a negative character class.

  A regular expression followed by a * matches zero or more
  occurrences of the regular expression.

  Two regular expressions concatenated match a match of the first,
  followed by a match of the second.

  Two regular expressions separated by a | match either a match for
  the first or a match for the second.

  The order of precedence is [] then * then concatenation then |.

QUALIFIERS:

  /[NO]EXACT
    Controls whether the GREP command matches the search string
    exactly, or treats uppercase and lowercase letters as equivalents.
    The default is /NOEXACT, causing GREP to ignore case differences.

  /[NO]HEADING
    Controls whether file names are printed in the output. With the
    default heading format, file names are printed only when more than
    one file is specified or when wildcard characters are used. The
    separator, a line of 30 asterisks, is displayed between groups of
    lines that belong to different files.

  /[NO]LOG
    Controls whether the GREP command produces a line containing the
    file name and the number of records and matches for each file
    searched. The log information is output to the current SYS$OUTPUT
    device.

  /[NO]NOT
    Causes lines that do not match the specified pattern(s) to be
    printed rather than those that do match. The statistics in the
    /LOG output still count the lines that match the pattern(s), even
    if /NOT is used.

  /[NO]NUMBERS
    Controls whether the source line number is displayed at the left
    margin of each line. By default, line numbers are not displayed.
  /[NO]OUTPUT[=file-spec]
    Controls whether the results of the search are output to a
    specified file. The output will be sent to the current default
    output device (SYS$OUTPUT) if you omit the /OUTPUT qualifier or
    omit the file specification. The /NOOUTPUT qualifier can be used
    to turn off output if only the /LOG information is desired.

EXAMPLES:

  The command line:

    $ GREP/NUMBERS FILE1.C,FILE2.C "^[a-z][a-z]*[\s\t]*.*([^;]*)[^;]*$"

  creates a cross-reference of the C programs FILE1.C and FILE2.C.
  GREP's output will show all function declarations in the specified
  files, preceded by the appropriate line number. Because of the
  special characters in the pattern it is necessary to enclose the
  string in double quotes. The regular expression is interpreted as
  follows:

    ^              beginning of line, followed by
    [a-z][a-z]*    one or more occurrences of any character in the
                   range a-z, followed by
    [\s\t]*        either a space or a tab repeated zero or more
                   times, followed by
    .*             any character repeated zero or more times,
                   followed by
    (              an open parenthesis, followed by
    [^;]*          any character except a semicolon repeated zero or
                   more times, followed by
    )              a close parenthesis, followed by
    [^;]*          any character except a semicolon repeated zero or
                   more times, followed by
    $              end of line

  Here are some other examples of patterns:

  a.d
    matches any word containing an a, followed by any character,
    followed by a d. This would match the substring "and" in the word
    and, and the substring "ard" in aardvark.

  ^a.d
    matches the same strings but only if they occur at the beginning
    of the line. No characters, including spaces and tabs, are allowed
    in front of the a.

  a.d$
    matches the same strings if they occur at the end of the line. No
    characters, including spaces and tabs, can follow the d.

  ^$
    matches a beginning of line, followed by an end of line; this
    means it matches all lines containing no characters or nothing but
    a newline character.
  an*d
    matches any word containing a, followed by an n repeated zero or
    more times, followed by a d. This expression will match add as
    well as and.

  .*
    matches any character repeated any number of times. This
    expression always succeeds.

  aa*
    matches one (rather than zero) or more occurrences of the letter
    a.

  [abc]
    defines a character class. A character class matches any one of
    the characters surrounded by the square brackets. This character
    class matches an a, b, or c in the corresponding position on the
    line.

  who[ms]e
    matches who, followed immediately by either an m or an s, followed
    by an e. That is, the words whomever and whose will be matched,
    but the word whole will not.

  who[a-z_]e
    matches who, followed by any character between a and z or _,
    followed by an e.

  else|if
    matches either the string else or the string if. If a line
    contains either of these strings the line will be printed.

  ^[^\s\t]
    The first ^ means beginning of line. The second one begins a
    negative character class. This matches any character in the first
    column of a record that is not a space or a tab.

-----------------------
[1] The sections PATTERNS and EXAMPLES are taken directly from
    Table 14-1 of Dr. Dobb's Toolbook of C, A Brady Book,
    Prentice Hall (1986).
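
Several of the example patterns listed in EXAMPLES (a.d, an*d, who[ms]e, else|if, ^$ and ^[^z]$) use only constructs that, as far as this manual describes them, appear to behave the same way in Python's re module. Assuming that equivalence holds, the sketch below can be used to check the claims made about them. The function name matching_lines and the sample lines are hypothetical and chosen only for illustration; the function is not part of GREP.

```python
import re


def matching_lines(pattern, lines, exact=False):
    """Return the lines in which the pattern is found.

    Mimics the default /NOEXACT behaviour by ignoring case unless
    exact is True. Hypothetical helper; not part of GREP.
    """
    flags = 0 if exact else re.IGNORECASE
    rx = re.compile(pattern, flags)
    return [line for line in lines if rx.search(line)]


# Sample records, without their trailing newline characters.
SAMPLE = ["and", "aardvark", "add", "whomever", "whose", "whole",
          "", "z", "q", "if x", "or else"]

for pattern in ["a.d", "an*d", "who[ms]e", "else|if", "^$", "^[^z]$"]:
    print(pattern, "->", matching_lines(pattern, SAMPLE))
```

With these sample lines, who[ms]e selects whomever and whose but not whole, and ^$ selects only the empty record, while ^[^z]$ selects only the single-character record q.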