> That's when A.I. starts advancing itself and needs humans in the loop no more.
You've got to put the environment back in the loop, though: it needs a source of discovery and validity feedback for ideas. For math and code it's easy; for self-driving cars, doable but not easy; for business ideas, how would we test them without wasting money? It varies field by field: some allow automated testing, while others are slow, expensive, and rate-limited to test.
Simulation is the answer. You just need a model that's decent at economics to independently judge the outcome, unless the model itself is smart enough. Then it becomes a self-reinforcing training environment.
Now, depending on how good your simulation is, it may or may not be useful, but still, that's how you do it. Something like MuZero, which plans inside a learned model of its environment: https://en.wikipedia.org/wiki/MuZero
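To make the loop concrete, here is a toy sketch of "simulate, judge, update". Everything in it is made up for illustration: the idea names, the payoff numbers, and the `simulate`, `judge`, and `train` helpers are hypothetical, and the learner is a plain epsilon-greedy bandit, which is far simpler than MuZero. It only shows the shape of the feedback loop, and it is only as good as the simulator it trusts.

```python
import random

# Hypothetical hidden payoff of each idea inside the simulator (made-up numbers).
IDEA_PAYOFF = {"subscription": 0.2, "ads": 0.5, "marketplace": 0.8}


def simulate(idea, rng):
    """Toy simulator: a noisy outcome centered on the idea's hidden payoff."""
    return IDEA_PAYOFF[idea] + rng.gauss(0.0, 0.1)


def judge(outcome):
    """Stand-in for the judging model: clamps an outcome to a score in [0, 1]."""
    return max(0.0, min(1.0, outcome))


def train(rounds=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy loop: try an idea, simulate it, judge it, update the estimate."""
    rng = random.Random(seed)
    estimates = {idea: 0.0 for idea in IDEA_PAYOFF}
    counts = {idea: 0 for idea in IDEA_PAYOFF}
    for _ in range(rounds):
        if rng.random() < epsilon:
            idea = rng.choice(list(estimates))        # explore a random idea
        else:
            idea = max(estimates, key=estimates.get)  # exploit the current best
        reward = judge(simulate(idea, rng))
        counts[idea] += 1
        # Incremental running mean of the judged rewards for this idea.
        estimates[idea] += (reward - estimates[idea]) / counts[idea]
    return estimates


if __name__ == "__main__":
    est = train()
    print(max(est, key=est.get), est)
```

No money is spent per trial, which is the whole appeal. The catch is the one already mentioned: if `simulate` or `judge` is wrong about the real world, the loop confidently optimizes for the wrong thing.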
Electric dreams. Simulation of what?
Think scams and pure resource extraction. They won't consider many impacts outside of the bottom line.
A simulated environment suggests the possibility of alignment during training, but real-time, real-world data streams are better.
But the larger point stands: you don't need an environment to explore the abstraction landscape prescribed by systems thinking. You only need the environment at the human interface.