The Windows IOCTL Census: A Corpus-Scale, Multi-Architecture Database of the Driver Control-Code Surface
AIPR assessment
Problem difficulty: hard and competitive. Driver vulnerability discovery and IOCTL surface recovery have been optimized by multiple groups for years, and the paper is competing against both symbolic scanners and curated vulnerability corpora.
Compounding strengths: the architecture-neutral recovery, the persistent relational store, the open data and code, and the cross-method agreement study reinforce each other and make the result more useful than any single component.
Compounding weaknesses: r
Abstract
A Windows driver exposes the kernel through I/O control (IOCTL) codes, and a single unchecked length on the buffer behind one of them turns an unprivileged call into a kernel write. The research community has strong scanners for this surface and a curated list of known-bad drivers, but no map of the surface itself. We build that map. The Windows IOCTL Census is a queryable database of the control-code dispatch surface of 27,087 signed Windows drivers, recovered by one deterministic, architecture-neutral pass with no symbolic execution. Reading a lifted intermediate representation instead of running a symbolic engine lets it recover a dispatch surface for 80% of the corpus across x86 and x64, including the 32-bit half that existing scanners abort on. On the 64-bit lane it adds handler reachability, taint, and the call graph. An LLM ranks the reachable handlers for triage. We release the census as a public dataset of tens of millions of rows: 27,087 binaries, 3.1M decoded control codes, 8.18M functions, and 15.95M call edges.
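For readers unfamiliar with what a "decoded control code" contains, the following is a minimal sketch of splitting a 32-bit IOCTL value into the four fields packed by the Windows `CTL_CODE` macro (device type, required access, function number, transfer method). The bit layout is the documented one; the names `decode_ioctl`, `ctl_code`, and `IoctlFields` are hypothetical and chosen for illustration, and nothing here is taken from the census code itself.

```python
from typing import NamedTuple

# Transfer method lives in bits 1..0, required access in bits 15..14
# (standard CTL_CODE layout from the Windows driver headers).
METHODS = ("METHOD_BUFFERED", "METHOD_IN_DIRECT",
           "METHOD_OUT_DIRECT", "METHOD_NEITHER")
ACCESS = ("FILE_ANY_ACCESS", "FILE_READ_ACCESS",
          "FILE_WRITE_ACCESS", "FILE_READ_ACCESS|FILE_WRITE_ACCESS")


class IoctlFields(NamedTuple):
    """Hypothetical container for the four CTL_CODE fields."""
    device_type: int
    access: str
    function: int
    method: str


def ctl_code(device_type: int, function: int, method: int, access: int) -> int:
    """Pack the fields the way the CTL_CODE macro does."""
    return (device_type << 16) | (access << 14) | (function << 2) | method


def decode_ioctl(code: int) -> IoctlFields:
    """Unpack a 32-bit control code into its CTL_CODE fields."""
    if not 0 <= code <= 0xFFFFFFFF:
        raise ValueError("control code must fit in 32 bits")
    return IoctlFields(
        device_type=(code >> 16) & 0xFFFF,   # bits 31..16
        access=ACCESS[(code >> 14) & 0x3],   # bits 15..14
        function=(code >> 2) & 0xFFF,        # bits 13..2
        method=METHODS[code & 0x3],          # bits 1..0
    )


if __name__ == "__main__":
    fields = decode_ioctl(0x222003)
    print(hex(fields.device_type), fields.access,
          hex(fields.function), fields.method)
```

The `METHOD_NEITHER` case is the one most often of interest in triage, because the driver receives raw user-mode pointers and must validate them itself.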