Thursday, November 11, 2004

about advas.sf.net

about advas.sf.net: "advas is a Python module which provides algorithms for advanced search. These methods are mainly used in information retrieval and linguistics.

This package contains:

* statistical algorithms
  o term frequency (tf)
  o term frequency with stop list
  o inverse document frequency (idf)
  o retrieval status value (rsv)
  o language detection
    - by keywords
  o k-nearest neighbour algorithm (kNN)

* linguistic algorithms
  o stemming algorithms
    - table lookup stemmer
    - n-gram stemmer
    - successor variety stemmer using peak-and-plateau method
  o synonym detection with the use of the OpenThesaurus (plain text version)

* sound-like methods
  o soundex
  o metaphone
  o NYSIIS algorithm

* ranking methods
  o a simple descriptor-based ranking algorithm

* text search algorithms
  o Knuth-Morris-Pratt

The documentation describes each function with its purpose, parameters and result (return value). Examples are included that show how to use each function or algorithm."
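
To give a feel for the first few statistical items in the list (tf, tf with a stop list, idf), here is a minimal sketch in plain Python. It does not use advas and is not its API: the function names `term_frequency` and `inverse_document_frequency`, the toy documents, and the choice of idf(t) = log(N / df(t)) are all assumptions made for illustration.

```python
import math
from collections import Counter


def term_frequency(tokens, stop_words=frozenset()):
    """Count how often each term occurs, skipping terms on the stop list.

    Hypothetical helper for illustration, not part of advas.
    """
    return Counter(t for t in tokens if t not in stop_words)


def inverse_document_frequency(documents):
    """Return idf(t) = log(N / df(t)) for every term in the collection.

    N is the number of documents, df(t) the number of documents containing t.
    This is one common idf variant; others add smoothing.
    """
    n = len(documents)
    df = Counter()
    for tokens in documents:
        df.update(set(tokens))  # count each term once per document
    return {term: math.log(n / count) for term, count in df.items()}


# Toy collection, already tokenized by whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat".split(),
    "a cat and a dog".split(),
]

tf = term_frequency(docs[0], stop_words={"the", "on", "a", "and"})
idf = inverse_document_frequency(docs)
```

In this sketch a term that occurs in only one document (such as "mat") gets a higher idf than one that occurs in several (such as "cat"), which is the property a ranking step would rely on.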