Nearest neighbor search is an increasingly vital component of data analysis tasks, for example those built on vector embedding databases. Typically, the search is the efficiency bottleneck, so approximate nearest neighbor (ANN) search methods are often employed to speed up the application. However, different ANN methods come with different biases, which can help or harm the downstream application. In this project, the biases of different ANN methods and their impact on different applications will be studied.
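To make the trade-off between exact and approximate search concrete, the following is a minimal sketch in Python. It compares a linear-scan exact search with a toy approximate index that buckets points by the signs of a few random projections. The class `HyperplaneIndex`, the helper `recall_at_1`, the synthetic Gaussian data and all parameter values are assumptions chosen for illustration; they are not the methods or datasets the project will study.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_nearest(points, query):
    """Exact search by linear scan: index of the point closest to query."""
    return min(range(len(points)),
               key=lambda i: squared_distance(points[i], query))


class HyperplaneIndex:
    """Toy approximate index (hypothetical, for illustration only).

    Each point is placed in a bucket keyed by the signs of its dot
    products with a few random hyperplanes. A query only inspects the
    points in its own bucket.
    """

    def __init__(self, points, num_planes, rng):
        dim = len(points[0])
        self.points = points
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(num_planes)]
        self.buckets = {}
        for i, p in enumerate(points):
            self.buckets.setdefault(self._key(p), []).append(i)

    def _key(self, v):
        # One boolean per hyperplane: which side of the plane v lies on.
        return tuple(sum(w * x for w, x in zip(plane, v)) >= 0
                     for plane in self.planes)

    def nearest(self, query):
        """Closest point within the query's bucket, or None if it is empty."""
        candidates = self.buckets.get(self._key(query), [])
        if not candidates:
            return None
        return min(candidates,
                   key=lambda i: squared_distance(self.points[i], query))


def recall_at_1(points, queries, index):
    """Fraction of queries for which the index returns the true neighbor."""
    hits = sum(1 for q in queries
               if index.nearest(q) == exact_nearest(points, q))
    return hits / len(queries)


# Synthetic data (an assumption): 500 points and 50 queries in 8 dimensions.
rng = random.Random(0)
points = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(500)]
queries = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(50)]

index = HyperplaneIndex(points, num_planes=4, rng=rng)
print("recall@1:", recall_at_1(points, queries, index))
```

In this sketch the approximate index never finds a true neighbor that lies on the other side of a hyperplane from the query, so its errors are systematic rather than uniformly random. That is one simple instance of the kind of method-specific bias described above. Increasing `num_planes` makes buckets smaller and queries faster, at the cost of more such misses.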
Efficient Algorithms and Data Structures
The efficiency of algorithms and data structures is becoming increasingly important in the area of big data, where complicated analysis is performed on very large datasets. Often algorithm efficiency is the deciding factor in analysis quality (or even in whether the analysis is possible at all). Modelling modern computational infrastructure (such as complicated memory hierarchies, GPUs and modern client-server architectures), and developing algorithms and data structures for these models and devices, is also increasingly important.