Record things with 2 cameras.

Today's Sora can produce something that resembles reality from a distance, but if you look closely, especially if there's another perspective or the scene is atypical, the flaws are obvious.

Perhaps tomorrow's Sora will overcome the the "final 10%" and maintain undetectable consistency of objects in 2 perspectives. But that would require a spatial awareness and consistency that models still have a lot of trouble with.