It all depends on the context window size. Fast performance at a small context size isn't very useful today, since most workloads (like the requests behind Codex) tend to have very long contexts.