It’s the chip they’re apparently running the model on.
> Codex-Spark runs on Cerebras’ Wafer Scale Engine 3 (opens in a new window)—a purpose-built AI accelerator for high-speed inference giving Codex a latency-first serving tier. We partnered with Cerebras to add this low-latency path to the same production serving stack as the rest of our fleet, so it works seamlessly across Codex and sets us up to support future models.
That's what it's running on. It's optimized for very high throughput using Cerebras' hardware which is uniquely capable of running LLMs at very, very high speeds.
It’s the chip they’re apparently running the model on.
> Codex-Spark runs on Cerebras’ Wafer Scale Engine 3 (opens in a new window)—a purpose-built AI accelerator for high-speed inference giving Codex a latency-first serving tier. We partnered with Cerebras to add this low-latency path to the same production serving stack as the rest of our fleet, so it works seamlessly across Codex and sets us up to support future models.
https://www.cerebras.ai/chip
That's what it's running on. It's optimized for very high throughput using Cerebras' hardware which is uniquely capable of running LLMs at very, very high speeds.