I have occasionally, just for fun, written benchmarks for some algorithm in C++ and an equivalent C# implementation, then tried to bring the managed performance in line with native using the methods you mention and others. I'm always surprised by how often I can match the performance of the unmanaged code (even when I'm trying to optimize my C++ to the limit) while still ending up with readable and maintainable C#.

JIT compilers can sometimes outperform statically compiled code, because they can observe at run time exactly which branches are taken and then optimise for that profile.
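To make the branch point a bit more concrete, here is a rough C++ sketch (not a JIT, and not anyone's actual benchmark from this thread) that times the same branchy loop over shuffled and sorted copies of the same data. The idea is only to show that run-time branch behaviour can matter for identical code. All the names (`sum_above`, `time_ms`, `make_data`, `run_demo`) are made up for illustration, and the data size, seed and threshold are arbitrary assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <random>
#include <vector>

// Sums the elements that are >= threshold. The `if` is the branch of interest:
// its outcome pattern depends entirely on the order of the input data.
std::int64_t sum_above(const std::vector<int>& data, int threshold) {
    std::int64_t sum = 0;
    for (int v : data) {
        if (v >= threshold) {
            sum += v;
        }
    }
    return sum;
}

// Runs `work` once and returns the elapsed wall time in milliseconds.
template <typename F>
double time_ms(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Fills a vector with pseudo-random values in [0, 255] from a fixed seed,
// so that repeated runs see the same data.
std::vector<int> make_data(std::size_t n, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> dist(0, 255);
    std::vector<int> data(n);
    for (int& v : data) {
        v = dist(rng);
    }
    return data;
}

// Times the same function on shuffled and sorted copies of the same values.
void run_demo() {
    const std::vector<int> shuffled = make_data(std::size_t{1} << 22, 42u);
    std::vector<int> sorted = shuffled;
    std::sort(sorted.begin(), sorted.end());

    std::int64_t sum_shuffled = 0;
    std::int64_t sum_sorted = 0;
    const double t_shuffled = time_ms([&] { sum_shuffled = sum_above(shuffled, 128); });
    const double t_sorted = time_ms([&] { sum_sorted = sum_above(sorted, 128); });

    // Same values in a different order, so the sums must agree.
    assert(sum_shuffled == sum_sorted);

    std::cout << "shuffled: " << t_shuffled << " ms\n"
              << "sorted:   " << t_sorted << " ms\n";
}
```

Call `run_demo()` from `main` to try it. Whether you see a gap at all depends on the compiler and flags: an optimising build may turn the branch into a conditional move or vectorise the loop, in which case the two timings can come out about the same. A single timed run is also noisy, so for anything serious you would want warm-up passes and repeated measurements.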

Could you please share some benchmark code? It would be incredibly useful as a learning aid!