As ever, I think the answer to "how do we sandbox arbitrary code, whether human-written or machine-written, while still letting it do useful things?" is object capabilities. Run the generated code in a sandbox, but pass in capabilities to useful resources, whether those are remote servers, local directories, or whatever else. Then you know from the start the bounds of what trouble it can get up to.
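
To make the shape of that concrete, here is a rough sketch in Python, chosen purely for illustration. The names `ReadOnlyDir`, `generated_code`, and `demo` are made up for this example. Python itself does not enforce any of this: code running in the same interpreter could import `os` and go around the wrapper, so this only shows the pattern of handing in a narrow capability, and it assumes a real sandbox (a separate process, a Wasm runtime, or similar) underneath.

```python
import tempfile
from pathlib import Path


class ReadOnlyDir:
    """Hypothetical capability: read access to files under one directory."""

    def __init__(self, root):
        self._root = Path(root).resolve()

    def read_text(self, relative_path):
        target = (self._root / relative_path).resolve()
        # Refuse paths that escape the granted directory (e.g. "../secret.txt").
        # Path.is_relative_to requires Python 3.9 or later.
        if not target.is_relative_to(self._root):
            raise PermissionError(f"outside granted directory: {relative_path}")
        return target.read_text()


def generated_code(docs):
    """Stands in for untrusted code; it is only handed the `docs` capability."""
    return docs.read_text("notes.txt").upper()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        granted = Path(tmp) / "granted"
        granted.mkdir()
        (granted / "notes.txt").write_text("hello")
        (Path(tmp) / "secret.txt").write_text("do not read")

        # The host decides up front what the code may touch.
        docs = ReadOnlyDir(granted)
        result = generated_code(docs)

        # Attempting to reach outside the grant is rejected by the capability.
        try:
            docs.read_text("../secret.txt")
            escaped = True
        except PermissionError:
            escaped = False
        return result, escaped
```

The point is that the authority is visible in the call: `generated_code(docs)` can read what `docs` allows and nothing else is passed in, so the worst case is bounded by what the host chose to hand over.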