SCOPE: Cost-Efficient Model Selection for Compound AI Systems under Quality Constraints

AIPR assessment

This is a hard, competitive optimization problem: a large combinatorial search over many LLM assignments under an explicit quality constraint, with expensive evaluations and practical cost pressure. The strengths reinforce each other well: the query-level search idea, the proof-based feasibility guarantee, the open code, and the strong benchmark gains all point in the same direction. The main weaknesses also interact: the algorithm depends on kernelized surrogate modeling, tuned bound parameters,

Abstract

A compound AI system consists of multiple LLM modules that together handle complex, multi-step tasks beyond the capabilities of a single model. Existing systems often use a single expensive LLM across all modules to improve the result quality of the whole system. However, this configuration incurs prohibitive costs, particularly for data management and analytics tasks at scale, such as data manipulation. To address this, we formalize the problem of constrained LLM selection for compound AI systems, leveraging the diverse pricing and capabilities of different LLMs to achieve competitive quality at lower cost. Given a query dataset and a user-specified quality threshold, we aim to select an LLM for each module to minimize the system's average cost while ensuring that overall quality meets the required threshold. To solve this problem, we propose SCOPE, a cost-efficient optimization algorithm. Unlike existing approaches that rely on expensive dataset-level evaluations, SCOPE exploits per-query results to rapidly estimate the system's cost and quality, and constructs confidence bounds to guide the search for promising LLM combinations. Furthermore, SCOPE provides theoretical guarantees for meeting the quality threshold and achieving near-optimal average cost. We evaluate SCOPE against seven baselines on three data processing tasks, demonstrating that it outperforms all baselines. Under the same search budget and quality constraint, it finds solutions with up to $20\times$ lower cost than the best competitor during the search and achieves up to $6\times$ lower final cost in the returned solution.
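
To make the problem statement in the abstract concrete, the following is a minimal sketch of the constrained selection objective: pick one LLM per module so that average cost over the query dataset is minimal while average quality stays at or above the threshold. It is a naive exhaustive baseline, not SCOPE itself; the abstract says SCOPE avoids this kind of dataset-level evaluation by using per-query results and confidence bounds. All names here (`select_models`, `toy_evaluate`, the module names, and the `PRICE` and `SKILL` tables) are hypothetical and chosen only for illustration.

```python
from itertools import product
from statistics import mean

def select_models(modules, models, evaluate, queries, threshold):
    """Return (assignment, avg_cost, avg_quality) for the cheapest assignment
    whose average quality meets the threshold, or None if none is feasible.

    Exhaustive search: evaluates every assignment on every query, which is
    exactly the expensive dataset-level evaluation a smarter search avoids.
    """
    best = None
    for combo in product(models, repeat=len(modules)):
        assignment = dict(zip(modules, combo))
        # evaluate() is assumed to return a (cost, quality) pair per query.
        results = [evaluate(assignment, q) for q in queries]
        avg_cost = mean(cost for cost, _ in results)
        avg_quality = mean(quality for _, quality in results)
        feasible = avg_quality >= threshold
        if feasible and (best is None or avg_cost < best[1]):
            best = (assignment, avg_cost, avg_quality)
    return best

# Hypothetical per-word prices and per-module success rates (made-up numbers).
PRICE = {"small": 1.0, "large": 10.0}
SKILL = {"small": 0.5, "large": 1.0}

def toy_evaluate(assignment, query):
    """Toy stand-in for running the compound system on one query."""
    words = len(query.split())
    cost = words * sum(PRICE[m] for m in assignment.values())
    quality = 1.0
    for m in assignment.values():
        quality *= SKILL[m]  # a weak module degrades the whole pipeline
    return cost, quality

modules = ["extract", "summarize"]
queries = ["clean this table", "dedupe rows", "merge"]
best = select_models(modules, ["small", "large"], toy_evaluate, queries, 0.5)
print(best)
```

With these toy numbers, using the small model everywhere fails the 0.5 quality threshold, so the search settles on a mixed assignment rather than the large model in every module. The search space grows as (number of models) to the power of (number of modules), which is why exhaustive enumeration does not scale.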

Score Breakdown

Holistic Impression: 74
Novelty: 73
Rigor: 77
Applicability: 72
Clarity: 78
Citation: 80
Confidence: 85%
