Berkeley Digital Library SunSITE

SWISH-E

Stemming option

The Concept

Being able to match both the singular and plural forms of a word in a single search is very useful. A "word stemmer" is an algorithm that takes a word and strips common suffixes, such as plural and "ing" endings, along with some other English-language constructions such as "tional" (as in "emotional", where the root is "emotion").
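The idea can be illustrated with a minimal sketch. It is not the stemmer SWISH-E uses: the rule table, the MIN_STEM_LEN threshold, and the helper names toy_stem, build_index, and search are all assumptions made for this example. It shows why stemming helps: if the same stemmer is applied to the indexed words and to the query, "word" and "words" land on the same index entry.

```python
# Toy suffix stripper. This is NOT the Porter algorithm; the rule table and
# MIN_STEM_LEN are assumptions made only for this illustration.
SUFFIX_RULES = [
    ("tional", "tion"),  # "emotional" -> "emotion"
    ("ing", ""),         # "indexing"  -> "index"
    ("s", ""),           # "words"     -> "word"
]
MIN_STEM_LEN = 3  # the part left after stripping must have at least 3 letters


def toy_stem(word):
    """Return a crude stem by applying the first matching suffix rule."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        if not word.endswith(suffix):
            continue
        if len(word) - len(suffix) < MIN_STEM_LEN:
            continue
        if suffix == "s" and word.endswith("ss"):
            continue  # leave words like "class" alone
        return word[: len(word) - len(suffix)] + replacement
    return word


def build_index(docs):
    """Map each stem to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            index.setdefault(toy_stem(token), set()).add(doc_id)
    return index


def search(index, term):
    """Stem the query term the same way the indexed words were stemmed."""
    return index.get(toy_stem(term), set())


if __name__ == "__main__":
    docs = {1: "searching words", 2: "one word"}
    index = build_index(docs)
    print(search(index, "word"))
    print(search(index, "words"))
```

A fixed rule table like this one over-strips some words (for example, "string" loses its "ing"), which is one reason real stemmers attach conditions to each rule.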

Usage

Implementation

The WAIS toolkit included an implementation of the Porter stemming algorithm as documented in:

Porter, M.F., "An Algorithm For Suffix Stripping," Program 14 (3), July 1980, pp. 130-137.

The "stemmer.c" file was taken from WAIS and compiled into SWISH-E.
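To give a feel for how the Porter algorithm decides when a suffix may be removed, here is a small sketch of three pieces described in Porter's paper: the "measure" m of a stem, the plural-handling rules of step 1a, and two of the step 2 rules ("ational" and "tional"). It is not the code from "stemmer.c"; the function names are assumptions chosen for this example, and the remaining steps of the algorithm are omitted.

```python
VOWELS = "aeiou"


def is_consonant(word, i):
    """Porter's definition: 'y' is a vowel only when it follows a consonant."""
    ch = word[i]
    if ch in VOWELS:
        return False
    if ch == "y":
        return i == 0 or not is_consonant(word, i - 1)
    return True


def measure(stem):
    """Count vowel-to-consonant transitions, i.e. m in [C](VC)^m[V]."""
    m = 0
    prev_was_vowel = False
    for i in range(len(stem)):
        consonant = is_consonant(stem, i)
        if consonant and prev_was_vowel:
            m += 1
        prev_was_vowel = not consonant
    return m


def step_1a(word):
    """Plural handling: SSES -> SS, IES -> I, SS -> SS, S -> (nothing)."""
    if word.endswith("sses"):
        return word[:-2]
    if word.endswith("ies"):
        return word[:-2]
    if word.endswith("ss"):
        return word
    if word.endswith("s"):
        return word[:-1]
    return word


def step_2_subset(word):
    """Two step 2 rules only: (m>0) ATIONAL -> ATE and (m>0) TIONAL -> TION."""
    for suffix, replacement in (("ational", "ate"), ("tional", "tion")):
        if word.endswith(suffix):
            stem = word[: len(word) - len(suffix)]
            if measure(stem) > 0:
                return stem + replacement
            return word  # the longest matching suffix decides; stop here
    return word


if __name__ == "__main__":
    for w in ("caresses", "ponies", "cats"):
        print(w, "->", step_1a(w))
    for w in ("emotional", "relational", "rational"):
        print(w, "->", step_2_subset(w))
```

The measure condition is what keeps a short word such as "rational" intact while "emotional" is reduced to "emotion". The full algorithm applies further steps after these, so the final stem it produces may be shorter than what this sketch returns.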


Copyright (C) 1995, 1996, 1997, 1998, 1999, 2000 Hewlett-Packard Company
Originally by Kevin Hughes, kev@kevcom.com, March 11, 1994.
SWISH-E is distributed with no warranty under the terms of the GNU Public License,
Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA
Public questions may be posted to the SWISH-E Discussion.
Document maintained at http://sunsite.berkeley.edu/SWISH-E/Manual/UsageOverview.html by the SunSITE Manager.
Last update December 16, 1998. SunSITE Manager: manager@sunsite.berkeley.edu