How do we make sure the LLM-generated code works? We'll have LLM-generated tests! Wait a minute...