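Model size directly affects latency: a larger model performs more computation and reads more weights from memory per inference, so each request takes longer.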
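As a rough illustration of the claim above, here is a minimal sketch that assumes a neural network whose inference is limited by memory bandwidth, so that every weight must be read once per forward pass. The function name `min_latency_seconds`, the 16-bit weight size, and the 1 TB/s bandwidth are hypothetical values chosen for illustration, not measurements; real latency also depends on compute, batching, and caching.

```python
def min_latency_seconds(num_params: int,
                        bytes_per_param: float,
                        bandwidth_bytes_per_s: float) -> float:
    """Lower bound on one forward pass if every weight is read once from memory."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return num_params * bytes_per_param / bandwidth_bytes_per_s


# Hypothetical hardware and precision, for illustration only.
BANDWIDTH = 1e12       # assumed memory bandwidth: 1 TB/s
BYTES_PER_PARAM = 2    # assumed 16-bit weights

if __name__ == "__main__":
    for params in (1_000_000_000, 10_000_000_000):
        ms = min_latency_seconds(params, BYTES_PER_PARAM, BANDWIDTH) * 1e3
        print(f"{params:.1e} params -> at least {ms:.1f} ms per pass")
```

Under these assumptions the bound scales linearly with parameter count: ten times the parameters gives ten times the minimum latency.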