i tried to create a makefile driven workflow based on this idea and ended up with https://github.com/khimaros/enc -- it suffers from the issues you raised

i'm hoping that it becomes more useful as models improve and become more reliable at producing working code (though determinism would be great for improving prompts).