It all depends on the context window size. Fast performance with a small context window won't be very useful today, since most workloads (like the requests behind Codex) involve very long contexts.