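Maybe because you aren’t doing batching? It sounds like you’re assuming batching would benefit prefill more than decode, but I believe it’s the other way around: decode is typically memory-bandwidth-bound, so a batch amortizes each read of the weights over more tokens, whereas prefill already processes the whole prompt in parallel.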
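For a rough sense of why batching would help decode more than prefill, here is a back-of-the-envelope roofline sketch. Everything in it is an assumption for illustration: the function names are made up, the hardware numbers (7B parameters, 100 TFLOP/s, 1 TB/s, 2 bytes per parameter) are hypothetical, and the model ignores KV-cache reads, attention cost, and activation traffic. It only counts one pass over the weights per forward step.

```python
def step_time(tokens_in_flight, n_params, peak_flops, mem_bw, bytes_per_param=2):
    """Time for one forward pass under a simple roofline model (assumed)."""
    # Roughly 2 FLOPs per parameter per token processed in this pass.
    compute_s = 2 * n_params * tokens_in_flight / peak_flops
    # The weights are read from memory once per pass, however many tokens share it.
    memory_s = n_params * bytes_per_param / mem_bw
    # The pass takes as long as the slower of the two.
    return max(compute_s, memory_s)


def throughput(batch, tokens_per_seq, **hw):
    """Tokens processed per second for a given batch size."""
    n = batch * tokens_per_seq
    return n / step_time(n, **hw)


def batching_speedup(batch, tokens_per_seq, **hw):
    """Throughput at the given batch size relative to batch size 1."""
    return throughput(batch, tokens_per_seq, **hw) / throughput(1, tokens_per_seq, **hw)


# Hypothetical hardware and model numbers, chosen only for illustration.
HW = dict(n_params=7e9, peak_flops=100e12, mem_bw=1e12)

# Decode: each sequence contributes 1 new token per step.
decode_gain = batching_speedup(8, 1, **HW)
# Prefill: each sequence contributes its whole prompt (512 tokens assumed).
prefill_gain = batching_speedup(8, 512, **HW)

print(f"decode speedup at batch 8:  {decode_gain:.2f}x")
print(f"prefill speedup at batch 8: {prefill_gain:.2f}x")
```

Under these assumed numbers, decode at batch size 1 sits far below the point where compute becomes the bottleneck, so its throughput grows roughly in proportion to batch size, while a 512-token prefill is already compute-bound on its own and gains almost nothing from batching. Real systems will differ once KV-cache traffic and attention are included.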