OpenAI has announced the release of the GPT-5.3-Codex-Spark model, marking a turning point in the adoption of alternative hardware for AI inference. This model, designed for code development, is OpenAI's first to run on chips manufactured by Cerebras, rather than traditional Nvidia GPUs.

Performance and Accessibility

Codex-Spark delivers a processing speed of over 1,000 tokens per second, roughly 15 times faster than its predecessor. For comparison, Claude Opus 4.6 in fast mode reaches about 2.5 times its standard speed of 68.2 tokens per second, or roughly 170 tokens per second. The model is currently available as a research preview for ChatGPT Pro subscribers ($200/month) through the Codex app, the command-line interface, and the VS Code extension. OpenAI is gradually opening API access to select design partners.
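The throughput figures above can be put side by side with a quick back-of-the-envelope calculation. The numbers below come straight from the article; the implied speed of the previous Codex model is derived from the stated 15× speedup, not reported directly:

```python
# Throughput figures as reported in the article (tokens per second).
codex_spark_tps = 1000.0       # GPT-5.3-Codex-Spark on Cerebras
spark_speedup = 15.0           # stated speedup vs. the previous model
opus_base_tps = 68.2           # Claude Opus 4.6, standard mode
opus_fast_multiplier = 2.5     # Opus fast-mode multiplier

# Implied throughput of the previous Codex model (derived, not reported).
prev_codex_tps = codex_spark_tps / spark_speedup

# Claude Opus 4.6 in fast mode.
opus_fast_tps = opus_base_tps * opus_fast_multiplier

print(f"Previous Codex model:  ~{prev_codex_tps:.0f} tok/s")
print(f"Opus 4.6 fast mode:    ~{opus_fast_tps:.0f} tok/s")
print(f"Codex-Spark vs. Opus fast: ~{codex_spark_tps / opus_fast_tps:.1f}x")
```

On these numbers, Codex-Spark comes out at roughly six times the throughput of Claude Opus 4.6 even in its fast mode.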

Technical Details

The model supports a context window of 128,000 tokens and, at launch, handles text only. Sachin Katti, head of compute at OpenAI, emphasized the engineering collaboration with Cerebras and the team's excitement about bringing fast inference to the platform.

Implications

OpenAI's choice of Cerebras hardware highlights a growing diversification in the AI hardware landscape. The move could intensify competition and open new opportunities for inference solutions optimized for specific workloads.