> You could in principle create a simulation with the same mathematical properties as the physical world but no one has ever done that. I'm not sure if we even know how.

What do you mean by that? Simulating physics is a rich field, which incidentally was one of the main drivers of parallel/super computing before AI came along.

The mapping of the physical world onto a computer representation introduces idiosyncratic measurement issues at every data point. The idiosyncratic bias, error, and non-repeatability change dynamically at every point in space and time, so they can be modeled neither globally nor statically. Some of the bias is also coupled across space and time.
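A toy sketch of what that kind of structure can look like: a clean signal corrupted by a spatially coupled bias plus noise whose scale drifts over space. Every kernel and scale here is invented for illustration; real sensor error is far messier.

```python
import numpy as np

rng = np.random.default_rng(0)

# Clean "physical" signal sampled at irregular locations.
x = np.sort(rng.uniform(0.0, 10.0, 200))
truth = np.sin(x)

# Spatially coupled bias: a draw from a squared-exponential kernel,
# so errors at nearby points move together. (Kernel and scales are
# assumptions for the sketch, not a claim about real sensors.)
K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / 0.5) ** 2)
L = np.linalg.cholesky(K + 1e-6 * np.eye(x.size))
bias = 0.1 * (L @ rng.normal(0.0, 1.0, x.size))

# Heteroscedastic noise whose scale itself varies over space.
sigma = 0.02 + 0.1 * np.abs(np.sin(0.3 * x))
noise = rng.normal(0.0, sigma)

measured = truth + bias + noise  # what you actually get to train on
```

Because the bias field changes with every realization and every sensor, there is no single global correction you can bake in, which is the point above.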

Reconstructing ground truth from these measurements, which is what you actually want to train on, is a difficult open inference problem. The idiosyncratic effects induce large changes in the relationships that can be learned from the data. Many measurements map to things that aren't real, and how badly that non-reality breaks your inference is context-dependent. Because the samples are sparse and irregular, you have to continually model the noise floor to verify there is actually some signal in the synthesized "ground truth".
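A minimal sketch of that noise-floor check, on sparse irregular samples of a deliberately weak signal (all numbers invented). The noise floor is estimated from differences of neighboring samples, assuming the underlying signal varies slowly relative to the sampling:

```python
import numpy as np

rng = np.random.default_rng(1)

# Sparse, irregular samples of a weak signal buried in noise.
x = np.sort(rng.uniform(0.0, 10.0, 80))
signal = 0.05 * np.sin(x)
y = signal + rng.normal(0.0, 0.2, x.size)

# Crude noise-floor estimate: for a slowly varying signal with
# independent noise, E[(y[i+1] - y[i])^2] ≈ 2 * sigma^2.
noise_var = 0.5 * np.mean(np.diff(y) ** 2)

# Total variance minus the noise floor approximates the signal power.
signal_var = max(y.var() - noise_var, 0.0)
```

With a signal this weak, `signal_var` can come out indistinguishable from zero, which is exactly the failure mode the check exists to catch: without it you would happily synthesize "ground truth" out of pure noise.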

In simulated physics, there are no idiosyncratic measurement issues. Every data point is deterministic, repeatable, and well-behaved. There is also much less algorithmic information, so learning is simpler. It is a trivial problem by comparison. Using simulations to train physical-world models skips over all the hard parts.

I've worked in HPC, including physics models. Taking a standard physics simulation and introducing representative idiosyncratic measurement error seems difficult. I don't think we've ever built a physics simulation with remotely the quantity and complexity of fine structure this would require.

I'm probably missing most of your point, but wouldn't the fact that inverse problems are applied in real-world situations somewhat contradict your qualms? In those cases, too, we have to deal with noisy real-world information.

I'll admit I'm not very familiar with that type of work - I'm in the forward-solve business - but if assumptions are made about the sensor noise distribution, couldn't those be inferred by more generic models? I realize I'm talking about adding a loop on top of an inverse-problem loop, which is two steps away (just stuffing a forward solve in a loop is already not very common, due to cost and engineering difficulty).

Or better yet, one could probably "primal-adjoint" this and just solve at once for physical parameters and noise model, too. They're but two differentiable things in the way of a loss function.
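A toy version of that joint solve, assuming a one-parameter linear "physics" model and a single unknown Gaussian noise scale (both invented for illustration): gradient descent on the negative log-likelihood updates the physical parameter and the noise model together.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic "measurements": y = a * x plus Gaussian sensor noise of
# unknown scale. Both a and the noise scale are unknowns of one
# optimization - two differentiable things in the way of a loss.
a_true, sigma_true = 3.0, 0.5
x = rng.uniform(0.0, 1.0, 500)
y = a_true * x + rng.normal(0.0, sigma_true, x.size)

# NLL = n*log(sigma) + sum(r^2) / (2*sigma^2), with s = log(sigma).
a, log_s = 0.0, 0.0  # initial guesses
lr = 0.05
for _ in range(5000):
    r = y - a * x                   # residuals
    inv_var = np.exp(-2.0 * log_s)
    grad_a = -np.sum(x * r) * inv_var          # d(NLL)/da
    grad_s = x.size - np.sum(r * r) * inv_var  # d(NLL)/d(log sigma)
    a -= lr * grad_a / x.size
    log_s -= lr * grad_s / x.size
# a and exp(log_s) typically end up close to a_true and sigma_true
```

In a real setting the forward model would be an expensive PDE solve and the gradients would come from an adjoint, but the structure of the joint fit is the same.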

Is this like some scale-independent version of Heisenberg's Uncertainty Principle?