Stemming

Stemming in Legal Information Retrieval

This entry outlines the basic concept of stemming as it relates to information retrieval. Stemming may also be applied to legal texts, including case law, legislation and scholarly works. Stemming refers to procedures that automatically remove certain common suffixes, or word endings (and sometimes prefixes, such as re- in re-indexing). It serves two purposes: to increase the frequency count for important words by counting their variant forms together, and to find occurrences of a word when its form in the text does not match its form in the search statement. Related words derived from a common root often appear in a variety of forms, depending on their function in a sentence or on variations in meaning: index, indexes, indexer, indexing, indexable. There are also variants, such as indices as an alternative plural of indexes. See also this Dictionary for discussion of stemming algorithms.
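As a rough illustration of the idea, the following Python sketch strips a handful of suffixes and one prefix so that the variants of index collapse to a single form. The suffix list, the prefix list, the table of irregular forms and the minimum stem length are all assumptions made for this example; it is not any published stemming algorithm, and a real system would rely on a tested one.

```python
import re
from collections import Counter

# Illustrative assumptions only: a tiny suffix list (longest first),
# one prefix, and a lookup table for irregular variants.
SUFFIXES = ("able", "ing", "ers", "er", "es", "s")
PREFIXES = ("re-",)
IRREGULAR = {"indices": "index"}
MIN_STEM_LENGTH = 3  # avoid reducing short words such as "is" to nothing


def stem(word: str) -> str:
    """Return a crude stem by removing one known prefix and one known suffix."""
    w = word.lower()
    for prefix in PREFIXES:
        if w.startswith(prefix):
            w = w[len(prefix):]
            break
    if w in IRREGULAR:
        return IRREGULAR[w]
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= MIN_STEM_LENGTH:
            return w[: -len(suffix)]
    return w


def stem_counts(text: str) -> Counter:
    """Count word occurrences after stemming, so variant forms are pooled."""
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    return Counter(stem(token) for token in tokens)


def matches(query_word: str, text_word: str) -> bool:
    """Treat two words as a match when their stems are identical."""
    return stem(query_word) == stem(text_word)


if __name__ == "__main__":
    sample = "The indexer is indexing the indices; indexes are indexable."
    print(stem_counts(sample)["index"])
    print(matches("indexing", "indexes"))
```

Pooling the counts under one stem shows the frequency effect, and comparing stems rather than surface forms shows how a search for indexing can retrieve a text that only contains indexes. A stemmer this crude will also conflate unrelated words and miss many real variants.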

