That reminds me of one of the easiest big wins I've had in my career. systemd was causing issues, so I slapped in Gentoo with the real-time kernel patch. Peak latency (practically speaking, the only metric we cared about; the workload was a control loop doing a bunch of expensive math and interacting with real hardware) dropped by a factor of 5000.

That specific advice isn't terribly transferable (you might choose to hack up systemd or some other component instead, or maybe even change the problem definition itself), but the general idea of measuring and tuning the system running your code is solid.
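
For the "measuring" half, here is a rough sketch of what I mean by peak latency: ask to be woken at fixed intervals and record the worst overshoot. This is not the harness we actually used; the function name `peak_wakeup_latency_ns` and its defaults are made up for illustration, and Python adds its own jitter, so treat the numbers as relative. A dedicated tool such as cyclictest from rt-tests is the better choice for real measurements.

```python
import time


def peak_wakeup_latency_ns(iterations=1000, interval_ns=1_000_000):
    """Return the worst observed overshoot (ns) past a periodic deadline.

    Hypothetical helper: sleeps until each deadline and records how late
    the wake-up was. Only the maximum is kept, since peak latency is the
    metric of interest here.
    """
    worst = 0
    deadline = time.monotonic_ns() + interval_ns
    for _ in range(iterations):
        remaining = deadline - time.monotonic_ns()
        if remaining > 0:
            time.sleep(remaining / 1e9)
        late = time.monotonic_ns() - deadline
        worst = max(worst, late)
        # Advance on a fixed schedule so lateness does not hide drift.
        deadline += interval_ns
    return worst


if __name__ == "__main__":
    print(f"peak wake-up latency: {peak_wakeup_latency_ns() / 1000:.1f} us")
```

Run it before and after each tuning change, under the same load, and compare the peaks rather than the averages.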

What do you think is causing the issue? We are having the same kind of problem. We already use core isolation, no_hz, and core pinning, but I am still getting interrupted by NMIs.
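
For reference, a minimal sketch of how one could count NMIs per core, assuming the usual Linux `/proc/interrupts` layout (a header row of CPU names, then one row per interrupt source, with the `NMI:` row present on x86). The helper names `nmi_counts`, `nmi_delta`, and `sample_nmi_delta` are made up for illustration.

```python
import time


def nmi_counts(text):
    """Parse /proc/interrupts text and return {cpu_name: nmi_count}."""
    lines = text.splitlines()
    if not lines:
        return {}
    cpus = lines[0].split()  # header row, e.g. ["CPU0", "CPU1", ...]
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == "NMI:":
            counts = [int(f) for f in fields[1:1 + len(cpus)]]
            return dict(zip(cpus, counts))
    return {}  # no NMI row (e.g. non-x86 architecture)


def nmi_delta(before, after):
    """Per-CPU increase in NMI count between two snapshots."""
    return {cpu: after[cpu] - before.get(cpu, 0) for cpu in after}


def sample_nmi_delta(seconds=10.0, path="/proc/interrupts"):
    """Take two snapshots `seconds` apart and return the per-CPU delta."""
    with open(path) as f:
        before = nmi_counts(f.read())
    time.sleep(seconds)
    with open(path) as f:
        after = nmi_counts(f.read())
    return nmi_delta(before, after)
```

Calling `sample_nmi_delta()` while the workload runs shows whether the isolated cores keep accumulating NMIs and at roughly what rate.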