But the point is that you'd be extracting the nonce from someone else's existing video of the same event.

If a celebrity says something and person A films a true video, and person B films a video and then manipulates it, you'd be able to see that B's light code is different. But if B simply takes A's lighting data and applies it to their own video, now you can't tell which is real.

I am not defending the proposed method, but your criticism is not the reason it fails:

Let's assume the pixels have an 8-bit luminance depth, that the 7 most significant bits are kept, and that the signature is coded in the last bit of the pixels in a frame. A hash of the full 7-bit image frame could then be cryptographically signed. You could copy the 8th bit plane to a fake video, but the signature would not check out in a verifying media player, since the fake video's leading 7 bit planes won't hash to the value that was signed.
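To make the bit-plane argument concrete, here is a minimal sketch, not anyone's actual implementation. Assumptions: a frame is a flat `bytes` of 8-bit luminance values with at least 256 pixels, and an HMAC-SHA256 tag stands in for a real public-key signature (the standard library has no asymmetric signing, and the transplant argument is the same either way). The key and all function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key; a real scheme would use a private signing key in the sensor.
KEY = b"hypothetical-sensor-key"
TAG_BITS = 256  # SHA-256 tag length in bits


def top7(frame: bytes) -> bytes:
    """Clear the least significant bit of every pixel (keep the top 7 bit planes)."""
    return bytes(p & 0xFE for p in frame)


def make_tag(frame: bytes, key: bytes) -> bytes:
    """Tag over the 7 most significant bit planes only (HMAC as a signature stand-in)."""
    return hmac.new(key, top7(frame), hashlib.sha256).digest()


def sign_frame(frame: bytes, key: bytes = KEY) -> bytes:
    """Embed the tag, one bit per pixel, in the LSB plane of the first 256 pixels."""
    if len(frame) < TAG_BITS:
        raise ValueError("frame too small to carry the tag")
    tag = make_tag(frame, key)
    bits = [(byte >> (7 - i)) & 1 for byte in tag for i in range(8)]
    out = bytearray(top7(frame))
    for i, bit in enumerate(bits):
        out[i] |= bit
    return bytes(out)


def verify_frame(frame: bytes, key: bytes = KEY) -> bool:
    """Read the tag back out of the LSB plane and compare with a fresh tag."""
    if len(frame) < TAG_BITS:
        return False
    bits = [p & 1 for p in frame[:TAG_BITS]]
    tag = bytes(
        sum(bit << (7 - j) for j, bit in enumerate(bits[i:i + 8]))
        for i in range(0, TAG_BITS, 8)
    )
    return hmac.compare_digest(tag, make_tag(frame, key))


def transplant_lsb(src: bytes, dst: bytes) -> bytes:
    """The attack: copy the LSB plane of src onto the top 7 bit planes of dst."""
    return bytes((d & 0xFE) | (s & 1) for s, d in zip(src, dst))


real = bytes(range(256)) * 2                         # stand-in for a genuine frame
fake = bytes((p * 7 + 3) % 256 for p in range(512))  # stand-in for a manipulated frame

signed = sign_frame(real)
forged = transplant_lsb(signed, fake)

print(verify_frame(signed))  # genuine frame verifies
print(verify_frame(forged))  # transplanted LSB plane does not
```

The forged frame fails because the tag covers the top 7 bit planes, and those belong to the fake content. Copying the LSB plane only moves a tag that was computed over different data.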

What does this change compared to the status quo? Nothing: you can already hash and sign a full 8-bit video and Serious-Oath that it depicts Real imagery. Your signature would also not be transplantable to someone else's video, so others can't put a fake video in your mouth.

The only difference: if the signature is generated by the image sensor, and end users are unable to extract the private key, then it decreases the number of people or entities able to credibly fake a video, but it gives manufacturers great power to sign fake videos while the masses cannot (unless they play a fake video on a high-quality screen and film it with an image sensor that contains a manufacturer private key).