Context Features Are Cheap: Rank-Aware Decomposition for Efficient Feature Interaction in Recommender Systems

AIPR assessment

This is a hard, competitive systems-and-ML problem, because industrial recommender serving has many years of optimization pressure and the gains must hold under real traffic. The strengths compound well: a simple algebraic idea, closed-form cost analysis, controlled ablations, and a real production deployment all point in the same direction. The weaknesses also interact: the most ambitious extensions, especially attention and deeper cross-network variants, are less fully demonstrated, and the la

Abstract

Modern industrial recommender systems use a deep ranking model to score N candidates against the same user and context features. Standard implementations broadcast context features early in the forward pass, so context-only operations are redundantly computed N times per request. We present a rank-aware decomposition applicable to the dominant interaction mechanisms in modern recommender architectures: Factorization Machine (FM) pairwise products, Deep Cross Network (DCNv2) cross layers, self-attention, and fully connected (FC) projection layers. It is built on a single algebraic principle: any linear or bilinear operation over a rank-partitioned input admits an exact block decomposition that moves context-only computation from once-per-candidate to once-per-request while remaining identity-equivalent to the original model. Closed-form analysis and controlled ablation verify that savings scale quadratically with the number of context features. Applied to a production DLRM-style ranker without any architectural change, the decomposition increases per-pod throughput by 87.5% (a 47% reduction in peak pod count) at identical model predictions. The identity-equivalent decomposition applies only at the first layer of cross networks and self-attention, since each layer mixes ranks in its output. To extend savings across depth, we further introduce rDCN, an architectural variant of DCNv2 that maintains rank discipline across depth and matches DCNv2 accuracy within training noise at 67% fewer total FLOPs, and we sketch an analogous architectural variant for self-attention.
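The block decomposition described in the abstract can be illustrated for the simplest case, an FC projection layer. The following is a minimal sketch, not the paper's implementation: the function names (`fc_naive`, `fc_decomposed`), the dimensions, and the random inputs are all assumptions made for illustration. It splits the weight matrix column-wise into a context block and a candidate block, computes the context block once per request, and checks that the result matches the broadcast-then-project baseline up to floating-point rounding.

```python
import random

def matvec(W, x):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def fc_naive(W, b, ctx, candidates):
    """Baseline: concatenate context with each candidate and run the
    full projection once per candidate (N full matvecs)."""
    return [[y + bi for y, bi in zip(matvec(W, ctx + cand), b)]
            for cand in candidates]

def fc_decomposed(W, b, ctx, candidates):
    """Decomposed: W [ctx; cand] + b = (W_ctx ctx + b) + W_cand cand.
    The context term is computed once per request and reused."""
    d_ctx = len(ctx)
    W_ctx = [row[:d_ctx] for row in W]    # columns acting on context
    W_cand = [row[d_ctx:] for row in W]   # columns acting on candidate
    ctx_part = [y + bi for y, bi in zip(matvec(W_ctx, ctx), b)]
    return [[c + y for c, y in zip(ctx_part, matvec(W_cand, cand))]
            for cand in candidates]

def multiply_adds(n, d_out, d_ctx, d_cand):
    """Multiply-add counts for (naive, decomposed) over n candidates."""
    return n * d_out * (d_ctx + d_cand), d_out * d_ctx + n * d_out * d_cand

# Hypothetical sizes chosen only for the demonstration.
random.seed(0)
N, D_OUT, D_CTX, D_CAND = 5, 4, 6, 3
W = [[random.gauss(0, 1) for _ in range(D_CTX + D_CAND)] for _ in range(D_OUT)]
b = [random.gauss(0, 1) for _ in range(D_OUT)]
ctx = [random.gauss(0, 1) for _ in range(D_CTX)]
candidates = [[random.gauss(0, 1) for _ in range(D_CAND)] for _ in range(N)]

out_naive = fc_naive(W, b, ctx, candidates)
out_fast = fc_decomposed(W, b, ctx, candidates)

# Summation order differs, so equality holds up to rounding error.
max_diff = max(abs(a - c)
               for ra, rc in zip(out_naive, out_fast)
               for a, c in zip(ra, rc))
naive_cost, fast_cost = multiply_adds(N, D_OUT, D_CTX, D_CAND)
```

In this toy setting the context term costs `D_OUT * D_CTX` multiply-adds once per request rather than once per candidate, so the saving grows with both N and the context width. For an FC layer the saving is linear in the number of context features; the quadratic scaling stated in the abstract would arise for bilinear operations such as FM pairwise products, which this sketch does not cover.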

Score Breakdown

Holistic Impression: 77
Novelty: 69
Rigor: 75
Applicability: 86
Clarity: 78
Citation: 83
Confidence: 85%
