Systems like this need to report confidence in their assertions.

E.g. not "this student has a gun" but "the model says this student has a gun with a probability of 60%".

If an AI can't quantify its degree of confidence, it shouldn't be used for this sort of thing.

Even better, share the frame(s) the guess was drawn from with a human for verification before triggering ANYTHING. How much trouble could that possibly be? How many "guns" is this thing detecting in a day across all sites? I doubt it's more than a couple, or we'd have heard about tons of incidents, false positives or not.
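
Roughly what I have in mind, as a toy sketch (all names hypothetical, not any vendor's actual API): the detection carries a calibrated probability plus the frames it came from, and nothing downstream fires until a human has looked.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable

@dataclass
class WeaponDetection:
    """A claim, its confidence, and the evidence it was drawn from."""
    label: str                  # e.g. "handgun"
    probability: float          # calibrated model confidence, in [0, 1]
    frame_ids: list[str]        # the frame(s) behind the guess
    camera_id: str
    detected_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

def handle(det: WeaponDetection,
           human_confirms: Callable[[WeaponDetection], bool],
           trigger_response: Callable[[WeaponDetection], None]) -> None:
    """Surface the probability and the frames; trigger nothing until a human agrees."""
    print(f"Model reports '{det.label}' at {det.probability:.0%} confidence "
          f"on {det.camera_id}; sending frames {det.frame_ids} for human review.")
    if human_confirms(det):
        trigger_response(det)

# Toy usage: the "human" rejects the guess, so nothing downstream fires.
det = WeaponDetection("handgun", 0.60, ["cam3-f1021", "cam3-f1022"], "cam3")
handle(det, human_confirms=lambda d: False, trigger_response=lambda d: print("lockdown!"))
```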

I wanna see the frames too.

I don't find that especially good as a sole remedy, because lots of people are stupid. If they see a green outline box overlaid on a video image with the label 'gun', many, many people will just respond to the label instead of looking at the underlying image and making their own judgment. Probability and validation history need to be built into the product so that there are audit logs that can be pored over and challenged. Bad human decision-making, which is rampant, is always smoothed over with justifications like 'I was concerned for everyone's safety', and usually treated in isolation rather than assessed longitudinally.
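
Something like this is what I mean by building it into the product. The names are made up, just to show the shape of an audit record that could be pored over and challenged later:

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class ReviewRecord:
    """One append-only audit entry: what the model claimed, what the human decided."""
    detection_id: str
    model_probability: float   # the confidence the model actually reported
    frame_ids: list[str]       # evidence the reviewer was shown
    reviewer: str
    confirmed: bool            # did the human agree a weapon was present?
    rationale: str             # free text: "clearly a phone", "concerned for safety", ...
    reviewed_at: str

def append_audit(record: ReviewRecord, path: str = "audit_log.jsonl") -> None:
    # Append-only JSON lines; a real deployment would want tamper-evident storage.
    with open(path, "a") as f:
        f.write(json.dumps(asdict(record)) + "\n")

append_audit(ReviewRecord(
    detection_id="det-0042",
    model_probability=0.60,
    frame_ids=["cam3-f1021", "cam3-f1022"],
    reviewer="officer.smith",
    confirmed=False,
    rationale="Object is a stapler held at an odd angle.",
    reviewed_at=datetime.now(timezone.utc).isoformat(),
))
```

With records like that accumulating, per-site and per-reviewer false-positive rates become a query instead of an anecdote, which is what makes longitudinal assessment possible.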

Doesn't work. Competition will push vendors to report inflated confidence numbers to make the product look better.