Performance Shootout of Nearest Neighbours: Contestants

Radim Rehurek | gensim, programming

Continuing the benchmark of libraries for nearest-neighbour similarity search, part 2. What is the best software out there for similarity search in high-dimensional vector spaces?

Document Similarity @ English Wikipedia

I’m not very fond of benchmarks on artificial datasets, and similarity search in particular is sensitive to actual data densities and distance profiles. Using fake “random gaussian datasets” seemed …