Very important note:

>Note that GPT‑4.1 will only be available via the API. In ChatGPT, many of the improvements in instruction following, coding, and intelligence have been gradually incorporated into the latest version

If anyone here doesn't know, OpenAI does offer the ChatGPT model version in the API as chatgpt-4o-latest, but it's bad because they continuously update it so businesses can't reliably rely on it being stable, that's why OpenAI made GPT 4.1.

> chatgpt-4o-latest, but it's bad because they continuously update it

Version explicitly marked as "latest" being continuously updated it? Crazy.

No one's arguing that it's improperly labelled, but if you're going to use it via API, you might want consistency over bleeding edge.

Lots of the other models are checkpoint releases, and latest is a pointer to the latest checkpoint. Something being continuously updated is quite different and worth knowing about.

It can be both properly communicated and still bad for API use cases.

OpenAI (and most LLM providers) allow model version pinning for exactly this reason, e.g. in the case of GPT-4o you can specify gpt-4o-2024-05-13, gpt-4o-2024-08-06, or gpt-4o-2024-11-20.

https://platform.openai.com/docs/models/gpt-4o

Yes, and they don't make snapshots for chatgpt-4o-latest, but they made them for GPT 4.1, that's why 4.1 is only useful for API, since their ChatGPT product already has the better model.

Okay so is GPT 4.1 literally just the current chatpt-4o-latest or not?

I feel like it is. But that's just the vibe.

It isn't.

Yeah, in the last week, I had seen a strong benchmark for chatgpt-4o-latest and tried it for a client's use case. I ended up wasting like 4 days, because after my initial strong test results, in the following days, it gave results that were inconsistent and poor, and sometimes just outputting spaces.

So you're saying that "ChatGPT-4o-latest (2025-03-26)" in LMarena is 4.1?

No, that is saying that some of the improvements that went into 4.1 have also gone into ChatGPT, including chatgpt-4o-latest (2025-03-26).

yeah I was surprised in they benchmarks during livestream they didn't compare to ChatGPT-4o (2025-03-26) but only older one.