Not all applications are chatbots. Many potential uses for LLMs/VLAMs are latency-constrained.