The Leap Motion Controller already came out in 2014 (11 years ago, wow!) and isn't very expensive. The SDK was lacking in the beginning, if I recall correctly, but a webcam still seems inferior to it. Technology hasn't been the limiting factor for quite some time now. I'm sure many projects have existed to translate gestures to MIDI, some less polished, some more polished[0][1].
Reminds me... I even used two PlayStation Eyes (EUR 5 each) with OpenCV and the EVM algorithm[2] on a ThinkPad X230 for a dance performance piece back in 2015. It tracked movements rather than gestures and sent OSC instead of MIDI, but it worked great!
[0]: https://midipaw.com/