There's a related but distinct problem downstream: once the agent is running in production, verification debt shifts from code to execution. Internal logs of what the agent called and what it received are mutable — if a provider disputes delivery or compliance requires an audit trail, "we have logs" is a weak defense. The deterministic verification (tests, linters, CI) handles the code side. The execution side is a different problem: you need immutable witnesses at call time, before the agent proceeds, not post-hoc reconstructions.
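To make the call-time witness idea concrete, here is an added sketch (not code from the original text): each tool call is receipted by a separate witness before the agent continues, and each receipt is chained to everything receipted earlier, so a later edit, deletion or truncation of the local log fails audit. Assumptions: `Witness`, `record_call` and `audit` are hypothetical names, the witness runs outside the agent operator's control, and an HMAC under a witness-only key stands in for the signature a real witness service would issue.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64

def entry_digest(entry):
    # Canonical JSON so the same call always hashes to the same value.
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()

class Witness:
    """Stand-in for an independent party that the agent operator cannot rewrite."""

    def __init__(self, key):
        self._key = key          # assumption: never shared with the agent host
        self.head = GENESIS
        self.count = 0

    def _mac(self, index, prev_head, digest):
        msg = f"{index}:{prev_head}:{digest}".encode()
        return msg, hmac.new(self._key, msg, hashlib.sha256).hexdigest()

    def attest(self, digest):
        # Bind the entry to its position and to everything receipted before it.
        msg, receipt = self._mac(self.count, self.head, digest)
        self.head = hashlib.sha256(msg).hexdigest()
        self.count += 1
        return receipt

    def audit(self, log):
        """Return -1 if the log matches what was witnessed, else the first bad index."""
        head = GENESIS
        for i, row in enumerate(log):
            msg, expected = self._mac(i, head, entry_digest(row["entry"]))
            if not hmac.compare_digest(expected, row["receipt"]):
                return i
            head = hashlib.sha256(msg).hexdigest()
        return -1 if len(log) == self.count else len(log)  # catches a truncated tail

def record_call(log, witness, tool, request, response):
    """Log one tool call; returns only once the witness has receipted it."""
    entry = {"tool": tool, "request": request, "response": response}
    receipt = witness.attest(entry_digest(entry))  # the agent proceeds after this line
    log.append({"entry": entry, "receipt": receipt})
    return receipt
```

The local log stays as mutable as before; what changes is that the witness's key, chain head and count make any rewrite detectable.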