This is very impressive! I researched fault-tolerant octorotor control using RL in grad school for a NASA project. This may be helpful [1, see section 8.3]! The field is moving fast, so there may be better or more suitable approaches out there now.
For folks who are interested in UAV physics, I wrote up an explainer[2].
[1]: https://drive.google.com/file/d/1RTEVRd0XCWLuDXY2nkbmYuOaa5x...