My buddy has some vision impairments, and I remember training a much older YOLO model to detect objects/enemies in Terraria for him. It worked very well.

I then tried training it on a lot of sample images from a 3D point & shoot game, and was quite disappointed in how it performed.

Has anyone else experimented with it recently? How well does it work as a base model for training custom classifiers? And with hardware growth in the last ~5 years, is it suitable to run in parallel with graphically intensive games?