Thursday, July 29, 2004

Untitled Document

Untitled Document: "Ransacker is a scriptable, incrementally-double-indexed search engine written in Python.
It's scriptable in that you can index any text with any key. This makes it easy to index content ('pages') stored in databases, file systems, the web, etc.
It can index incrementally. This means you can add content or update the entry for a particular page without touching the rest of the index.
It's double-indexed, meaning that not only does every word have a list of pages, every page has a list of words. The incremental indexer relies on this, and it also lets you determine which pages have the most in common. This will allow ransacker to produce 'what's related' pages.
Currently, ransacker ranks pages by the number of times the keywords appear. It does not yet support boolean queries, fuzzy matches, or other advanced searching features."
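
To make the quoted description concrete, here is a rough sketch of what a double index could look like in Python. This is not ransacker's actual code or API: the class name DoubleIndex and its methods add, remove, search and related are all made up for illustration, and the tokenizer (lowercased runs of word characters) is an assumption. It only shows the idea: a word-to-pages map plus a page-to-words map, per-page updates that leave the rest of the index alone, ranking by keyword count, and a crude "what's related" lookup based on shared words.

```python
from collections import Counter, defaultdict
import re


class DoubleIndex:
    """Toy double index (hypothetical, not ransacker's real interface)."""

    def __init__(self):
        self.word_to_pages = defaultdict(dict)  # word -> {page key: count}
        self.page_to_words = {}                 # page key -> Counter of words

    def add(self, key, text):
        """Index text under any key, replacing an earlier entry for that key."""
        self.remove(key)
        counts = Counter(re.findall(r"\w+", text.lower()))
        self.page_to_words[key] = counts
        for word, n in counts.items():
            self.word_to_pages[word][key] = n

    def remove(self, key):
        """Drop one page. The page -> words map tells us exactly which
        word entries to touch, so nothing else in the index is scanned."""
        for word in self.page_to_words.pop(key, {}):
            pages = self.word_to_pages[word]
            del pages[key]
            if not pages:
                del self.word_to_pages[word]

    def search(self, query):
        """Rank pages by the total number of times the query words appear."""
        scores = Counter()
        for word in re.findall(r"\w+", query.lower()):
            for key, n in self.word_to_pages.get(word, {}).items():
                scores[key] += n
        return [key for key, _ in scores.most_common()]

    def related(self, key):
        """Rank other pages by how many distinct words they share with key."""
        shared = Counter()
        for word in self.page_to_words.get(key, {}):
            for other in self.word_to_pages[word]:
                if other != key:
                    shared[other] += 1
        return [other for other, _ in shared.most_common()]


index = DoubleIndex()
index.add("a", "python search engine in python")
index.add("b", "search the web")
index.add("c", "cooking recipes")

print(index.search("python search"))  # ['a', 'b']
print(index.related("b"))             # ['a']

index.add("b", "cooking at home")     # update one page only
print(index.search("search"))         # ['a']
print(index.related("b"))             # ['c']
```

Keys here are plain strings, but any hashable value (a database id, a file path, a URL) would work the same way, which is roughly what "index any text with any key" suggests.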