Which latency do you mean? How does it compare to LLM inference speed?

See the Redpanda comment/link here.