Learned metric index - proposition of learned indexing for unstructured data

Authors

ANTOL Matej, OĽHA Jaroslav, SLANINÁKOVÁ Terézia, DOHNAL Vlastislav

Year of publication 2021
Type Article in Periodical
Magazine / Source Information Systems
MU Faculty or unit

Faculty of Informatics

Web http://dx.doi.org/10.1016/j.is.2021.101774
Doi http://dx.doi.org/10.1016/j.is.2021.101774
Keywords Index structures; Learned index; Unstructured data; Content-based search; Metric space
Description The main paradigm of similarity searching in metric spaces has remained mostly unchanged for decades: data objects are organized into a hierarchical structure according to their mutual distances, and representative pivots are used to reduce the number of distance computations needed to search the data efficiently. We propose an alternative to this paradigm that replaces pivots with machine learning models, thus posing similarity search as a classification problem that stands in for numerous expensive distance computations. Even a relatively naive implementation of this idea is more than competitive with state-of-the-art methods in terms of speed and recall, demonstrating that the concept is viable and showing great potential for future development.
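
The idea in the description, a classifier that routes a query to a bucket so that distances are computed only inside that bucket, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a small softmax regression as the learned model, Euclidean distance as the metric, synthetic 2D clusters as data, and training labels taken from a hypothetical initial partitioning (here, the cluster that generated each point). All function and variable names are illustrative.

```python
import math
import random


def predict_proba(weights, x):
    """Softmax over linear scores; the last entry of each weight row is the bias."""
    scores = [sum(w * v for w, v in zip(row, x)) + row[-1] for row in weights]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_softmax(points, labels, n_classes, epochs=50, lr=0.1):
    """Fit a tiny multinomial logistic regression with plain SGD (assumed model)."""
    dim = len(points[0])
    weights = [[0.0] * (dim + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        for x, y in zip(points, labels):
            probs = predict_proba(weights, x)
            for c in range(n_classes):
                grad = probs[c] - (1.0 if c == y else 0.0)
                for j in range(dim):
                    weights[c][j] -= lr * grad * x[j]
                weights[c][dim] -= lr * grad
    return weights


def build_index(data, labels, n_buckets):
    """Train the model, then store each object in the bucket the model predicts."""
    weights = train_softmax(data, labels, n_buckets)
    buckets = [[] for _ in range(n_buckets)]
    for x in data:
        probs = predict_proba(weights, x)
        buckets[probs.index(max(probs))].append(x)
    return weights, buckets


def search(weights, buckets, query, k=1, n_probe=1):
    """Classify the query, then compute distances only in the n_probe likeliest buckets."""
    probs = predict_proba(weights, query)
    order = sorted(range(len(buckets)), key=lambda c: probs[c], reverse=True)
    candidates = [x for c in order[:n_probe] for x in buckets[c]]
    candidates.sort(key=lambda x: math.dist(query, x))
    return candidates[:k], len(candidates)


# Synthetic demo data (an assumption): three well-separated 2D clusters.
rng = random.Random(0)
centers = [(0.0, 0.0), (6.0, 0.0), (0.0, 6.0)]
data, labels = [], []
for label, (cx, cy) in enumerate(centers):
    for _ in range(30):
        data.append((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)))
        labels.append(label)

weights, buckets = build_index(data, labels, n_buckets=len(centers))
query = (5.8, 0.2)
neighbors, n_distances = search(weights, buckets, query, k=1, n_probe=1)
print(neighbors[0], "found with", n_distances, "distance computations out of", len(data))
```

Raising `n_probe` visits more buckets, trading extra distance computations for a higher chance of finding the true nearest neighbors. That is the speed and recall trade-off the description refers to.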
