Can you ELI5 why this is so slow for local inference but so fast with hosted models?