EMA: Approximate Nearest Neighbor Search with General Attribute Filtering and Dynamic Updates
AIPR assessment
This is a hard, competitive, and fairly saturated problem, with many recent graph-filtered ANN systems optimizing the same latency-recall trade-offs. The strengths compound well: a simple structural idea, broad evaluation, and an open code release make the performance claims plausible and useful. The main weakness interaction is the lack of clear component-level ablation, which makes it harder to tell how much of the gain comes from marker encoding versus recovery versus pruning heuristics. Resu
Abstract
Filtering Approximate Nearest Neighbor (FANN) search is a critical and emerging task for strengthening the query capability of vector databases, supporting applications such as recommendation systems, retrieval-augmented generation (RAG), and agent memory. However, most existing methods are limited to range or label filtering, often incurring unacceptable index construction time and memory overhead. Predicate-agnostic approaches further struggle to handle a wide range of predicate selectivities effectively. In this paper, we propose EMA, a filtering ANN algorithm that supports multi-predicate queries over mixed numerical and categorical attributes, and efficient dynamic updates. EMA introduces Markers as compact summaries attached to graph edges, providing conservative predicate- and geometric-aware guidance with zero false negatives at the Marker level. During query processing, EMA performs Marker-augmented joint search with a bounded edge recovery mechanism, enabling efficient filtering while preserving graph navigability. Extensive experiments demonstrate that EMA achieves 1.68x--12.25x speedup over state-of-the-art general filtering ANN methods across diverse workloads.
Score Breakdown
More from this week
- Towards Interactive Video World Modeling: Frontiers, Challenges, Benchmarks, and Future Trends
- On Thin Perfect Matchings up to Polylogarithmic Factors
- ViBE: Co-Optimizing Workload Skew and Hardware Variability for MoE Serving
- LeAP: Learnable Adaptive Permutation for Feature Selection in Heterogeneous and Sparse Recommender Systems
- PolySpeech-100: A Large-Scale Benchmark for Speech Understanding Across 100+ Languages and Dialects