This might be the optimal approach for running a slow inference model locally, and if we treat LLMs like compilers it makes sense. Overnight compilation is still the normal thing to do for complex codebases, so why not run LLM code generation (about the one task these models seem really good at) the same way? The workflow: each morning you review what the LLM generated the previous night, spend the day making annotations and suggestions, and at the end of the day submit everything to the LLM as an 'overnight generation' job.
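As a rough sketch of what the daytime half of that loop could look like: the script below scans a codebase for inline review notes left during the day (using a hypothetical `# LLM:` comment convention) and bundles them into a single batch prompt for the overnight run. The marker, file layout, and prompt format are all assumptions for illustration, not an existing tool.

```python
import re
from pathlib import Path

# Hypothetical convention: daytime review notes are left as '# LLM: ...' comments.
ANNOTATION = re.compile(r"#\s*LLM:\s*(.+)")

def collect_annotations(root: Path) -> list[str]:
    """Gather every '# LLM: ...' note left in the codebase during the day."""
    notes = []
    for path in sorted(root.rglob("*.py")):
        for lineno, line in enumerate(path.read_text().splitlines(), 1):
            m = ANNOTATION.search(line)
            if m:
                notes.append(f"{path.name}:{lineno}: {m.group(1).strip()}")
    return notes

def build_overnight_prompt(root: Path) -> str:
    """Bundle the day's annotations into one batch prompt for the slow local model."""
    header = "Apply the following review notes to the codebase:\n"
    return header + "\n".join(f"- {n}" for n in collect_annotations(root))

if __name__ == "__main__":
    import tempfile
    with tempfile.TemporaryDirectory() as d:
        src = Path(d) / "app.py"
        src.write_text("def f(x):\n    return x  # LLM: add type hints\n")
        print(build_overnight_prompt(Path(d)))
```

The resulting prompt would then be handed to whatever slow local model you run overnight; the interesting part of the idea is that latency stops mattering once generation is batched this way.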