mspace.py

mspace.py is a Python module for performing similarity searches in a metric space. It includes three metric tree implementations (vantage-point trees and two variants of Burkhard-Keller trees) and the Levenshtein distance.
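
For readers unfamiliar with the Levenshtein distance, here is a minimal standalone sketch of the usual dynamic-programming formulation. It is not taken from mspace.py, and the function name `levenshtein` is chosen for illustration only; see the module documentation for the names mspace.py actually exports.

```python
def levenshtein(a, b):
    """Return the edit distance between the sequences a and b.

    Illustrative sketch only; not the implementation shipped in mspace.py.
    """
    if len(a) < len(b):
        a, b = b, a  # iterate over the shorter sequence in the inner loop
    # previous[j] holds the distance between the current prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, 1):
        current = [i]  # turning a[:i] into the empty sequence takes i deletions
        for j, item_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # delete item_a
                current[j - 1] + 1,                    # insert item_b
                previous[j - 1] + (item_a != item_b),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

Only two rows of the table are kept at any time, so memory use grows with the length of the shorter input rather than with the product of both lengths.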

Application Domains

Metric space indexes are useful whenever you want to find objects that are similar, but not necessarily equal, to a given object. Typical uses include spell checking and record de-duplication, but they also apply to entirely different problem domains such as biology. Metric space indexes are especially well suited to searches in high-dimensional spaces, because dimensionality plays a less significant role than it does in other approaches to the search problem.
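
To make the idea concrete, the following is a minimal sketch of a Burkhard-Keller tree answering a range query ("everything within distance r of the query"). It is an independent illustration, not the mspace.py API: the names `BKTree`, `add`, `search` and `hamming` are assumptions made for this example, and the Hamming distance over equal-length DNA fragments merely stands in for whatever metric suits your data.

```python
class BKTree(object):
    """Minimal Burkhard-Keller tree for an integer-valued metric.

    Illustrative sketch only; class and method names are hypothetical.
    """

    def __init__(self, distance):
        self.distance = distance
        self.root = None  # a node is a pair: (item, {edge_distance: child_node})

    def add(self, item):
        if self.root is None:
            self.root = (item, {})
            return
        node = self.root
        while True:
            d = self.distance(item, node[0])
            if d == 0:
                return  # an identical item is already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = (item, {})
                return
            node = child  # descend along the edge labelled d

    def search(self, query, radius):
        """Return sorted (distance, item) pairs within radius of query."""
        found = []
        stack = [self.root] if self.root is not None else []
        while stack:
            item, children = stack.pop()
            d = self.distance(query, item)
            if d <= radius:
                found.append((d, item))
            # Triangle inequality: a subtree on edge k can only contain
            # matches if d - radius <= k <= d + radius, so skip all others.
            for k, child in children.items():
                if d - radius <= k <= d + radius:
                    stack.append(child)
        return sorted(found)


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


fragments = ["GATTACA", "GATTAGA", "CATTACA", "GGGGGGG", "TATTACA", "GACTACA"]
tree = BKTree(hamming)
for fragment in fragments:
    tree.add(fragment)

# All stored fragments that differ from the query in at most one position.
print(tree.search("GATTACA", 1))
```

The pruning step is what distinguishes the index from a linear scan: it relies only on the triangle inequality, never on coordinates, which is why the same structure works for strings, records or any other objects with a suitable metric.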

Download & Installation

The latest development version of mspace.py is always available on GitHub. You can check it out from this URL:

https://github.com/solexx/mspace

If you would rather use a release with a version number attached, you can grab one of these files:

The only file you probably need is mspace.py. Copy it into a directory on your PYTHONPATH and you're done.

Usage

Check out the extensive module documentation for a complete description of the API, usage instructions and, if you need them, the more or less formal definitions behind the task.

Unfortunately, mspace.py is still only compatible with Python 2. I will probably migrate it to Python 3 at some point, and patches are always welcome.

License

mspace.py is licensed under the GNU General Public License v2. That means you can use and distribute it in any way you want, provided that you don't claim any copyright for it, distribute the source as well and only use it in other free software projects. If you need it under some other license, contact me.

Last modified: Sun Jan 24 19:35:50 2021