OpenAI Codex statistics

OpenAI Codex has gone through two distinct eras: (1) the original 2021 “Codex” code-generation models evaluated on benchmarks like HumanEval, and (2) the newer Codex agent (research preview launched in 2025), plus the Codex macOS app and the GPT-5.3-Codex family released in February 2026.

This article compiles the most-cited Codex numbers (benchmark scores, usage/adoption signals, and performance/latency metrics) in one place.
HumanEval pass@1 (Codex 12B): 28.8% (single sample) in the original Codex evaluation.
HumanEval pass@100 (Codex 12B): ~70.2% when sampling 100 solutions per problem (repeated sampling boosts success rates).
HumanEval pass@1 (Codex 300M): 13.2% in the same evaluation framework.
Codex API deprecation: OpenAI deprecated legacy Codex models (e.g., code-davinci-* and code-cushman-*) with shutdown dates in March 2023.
Codex usage (recent): OpenAI stated that overall Codex usage had doubled since the launch of GPT-5.2-Codex in mid-December 2025, and that more than one million developers had used Codex in the past month (as of the Codex app announcement).
Codex app adoption: More than 1 million people downloaded the Codex app in its first week (per public statements).
GPT-5.3-Codex speed: OpenAI says GPT-5.3-Codex runs “25% faster” for Codex users than its prior serving stack, alongside other improvements.
Codex-Spark throughput: OpenAI says Codex-Spark can deliver >1000 tokens/second and launches with a 128k context window (text-only).
HumanEval is a code-generation benchmark where models produce a function from a docstring and must pass unit tests. A commonly cited metric is pass@1 (the probability the first generated sample passes). The original Codex paper also highlighted how repeated sampling (e.g., 100 tries) can dramatically improve success.
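The paper reports pass@k with an unbiased estimator rather than by naively averaging over fixed k-sample batches. Below is a minimal sketch of that estimator (it matches the formula published in the Codex paper; the numeric illustration afterward is our own arithmetic):

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from the Codex paper (Chen et al., 2021):
    the chance that at least one of k samples, drawn from n generations of
    which c are correct, passes the unit tests."""
    if n - c < k:
        return 1.0
    # Numerically stable form of 1 - C(n - c, k) / C(n, k).
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Example: 200 samples for one problem, 10 of which pass.
print(round(pass_at_k(n=200, c=10, k=1), 3))    # 0.05
print(round(pass_at_k(n=200, c=10, k=100), 3))  # 0.999

# Why the reported pass@100 (~70.2%) sits well below 100%: if every problem
# shared the 12B model's 28.8% pass@1, 100 independent tries would almost
# never all fail (probability of at least one success ~= 1.0)...
print(1 - (1 - 0.288) ** 100)
# ...but difficulty varies across problems, and some are never solved no
# matter how many samples are drawn, which caps the aggregate pass@100.
```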
[Bar chart: HumanEval pass@1 comparison (original Codex paper). Accompanying note: pass@100 is the chance that at least one of 100 samples passes the tests.]
Codex as an agent (2025–2026): usage and scale signals
In 2025, OpenAI reintroduced Codex as a software engineering agent that can work in sandboxes, propose changes, and run tasks in parallel. In February 2026, OpenAI released a dedicated macOS Codex app designed for multi-agent workflows.
Recent usage/adoption stats (as publicly stated)
| Metric | Figure | Context |
| --- | --- | --- |
| Developers using Codex (past month) | >1,000,000 | Reported in the Codex app announcement as a recent usage milestone. |
| Overall Codex usage change since the GPT-5.2-Codex launch (mid-December) | 2× (doubled) | OpenAI’s framing of growth after the GPT-5.2-Codex rollout. |
| Codex app downloads (first week) | >1,000,000 | Publicly stated first-week download milestone. |
| Weekly Codex user growth (one reported week) | 60%+ | Publicly stated week-over-week growth figure. |
Bar chart: Token scale in a Codex “long-running” build example
In one showcase, OpenAI reported Codex built a racing game by working independently using more than 7 million tokens. The same section displayed smaller token totals (e.g., 800k and 60k) as comparison points for shorter runs/iterations.
| Label | Tokens |
| --- | --- |
| Long run (reported) | 7,000,000 |
| Mid run (shown) | 800,000 |
| Short run (shown) | 60,000 |
GPT-5.3-Codex and Codex-Spark: speed, throughput, and latency stats
In February 2026, OpenAI introduced GPT-5.3-Codex, positioning it as the most capable agentic coding model in the Codex lineup at that time and stating that it runs 25% faster for Codex users. Soon after, OpenAI released GPT-5.3-Codex-Spark as an ultra-fast, real-time coding variant.
Notable published performance/latency figures
| Metric | Figure | Notes |
| --- | --- | --- |
| GPT-5.3-Codex speed uplift (stated) | 25% faster | OpenAI statement about faster interactions/results for Codex users. |
| Codex-Spark throughput (stated) | >1,000 tokens/s | Positioned as “near-instant” for interactive coding workflows. |
| Codex-Spark context window (launch) | 128k | Text-only at launch (research preview). |
| Pipeline overhead reduction (roundtrip) | 80% | Reported reduction per client/server roundtrip in the Codex-Spark post. |
| Pipeline overhead reduction (per-token) | 30% | Reported reduction in per-token overhead. |
| Time-to-first-token improvement | 50% | Reported reduction in time-to-first-token. |
Bar chart: Reported latency/overhead improvements (Codex-Spark)
| Label | Value |
| --- | --- |
| Roundtrip overhead reduction | 80% |
| Time-to-first-token reduction | 50% |
| Per-token overhead reduction | 30% |
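To make the throughput and time-to-first-token figures concrete, here is a toy latency model (our own illustration: only the >1,000 tokens/s throughput and the 50% TTFT reduction come from OpenAI's statements; the 0.4 s baseline TTFT is hypothetical):

```python
def generation_time_s(n_tokens: int, ttft_s: float, tokens_per_s: float) -> float:
    """Toy end-to-end latency model: a fixed time-to-first-token,
    followed by steady-state decoding at a constant token rate."""
    return ttft_s + n_tokens / tokens_per_s

# 500-token response at the stated >1,000 tokens/s, with a hypothetical 0.4 s TTFT.
baseline = generation_time_s(500, ttft_s=0.4, tokens_per_s=1000)
print(f"baseline: {baseline:.2f} s")   # baseline: 0.90 s

# Applying the reported 50% time-to-first-token reduction to the same request.
improved = generation_time_s(500, ttft_s=0.4 * 0.5, tokens_per_s=1000)
print(f"improved: {improved:.2f} s")   # improved: 0.70 s
```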
Codex API lifecycle: legacy model deprecations
If you’re researching “Codex” specifically as an OpenAI API model family (e.g., code-davinci-002), the key statistical milestone is the retirement timeline: OpenAI listed multiple Codex completions models as deprecated with shutdown dates in March 2023 and recommended newer general models as replacements.
Commonly referenced Codex retirement point
| Item | What happened | Date |
| --- | --- | --- |
| Codex models on the OpenAI API | Deprecated and scheduled for shutdown (e.g., code-davinci-* and code-cushman-*). | March 2023 (shutdown dates listed as March 23, 2023 for several models) |
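In code terms, the shutdown forced a migration off the completions endpoint: legacy integrations called Codex completion models directly, and the recommended path was a current general chat model. A minimal sketch with the current OpenAI Python SDK (the replacement model name is illustrative, not a specific OpenAI recommendation):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Legacy (pre-shutdown) calls looked roughly like:
#   openai.Completion.create(model="code-davinci-002",
#                            prompt="# reverse a string\n", max_tokens=256)
# After March 2023, the equivalent request targets a chat model instead:
response = client.chat.completions.create(
    model="gpt-4o",  # illustrative replacement; any current general model works
    messages=[
        {"role": "user", "content": "Write a Python function that reverses a string."}
    ],
)
print(response.choices[0].message.content)
```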
Codex in the broader developer-tool ecosystem
Codex also became widely known through its role in early AI coding assistants. For example, GitHub said Copilot could suggest a sizable share of newly written code in certain languages during its early rollout period (a frequently quoted figure is around 30% in some languages).
Sources
OpenAI (2026-02-02): Introducing the Codex app — https://openai.com/index/introducing-the-codex-app/