Optimising hardware to run existing software is how you sell your hardware.

The amount of performance you can extract from a modern CPU if you really start optimising cache access patterns is astounding.
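To make that concrete, here is a minimal sketch in C (my own illustration, not taken from any particular codebase). Both functions add up the same square matrix stored as a flat array; the names `sum_row_major` and `sum_col_major` and the flat layout are assumptions for the example. The only difference is the order in which memory is touched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: m holds an n x n matrix, row after row. */

/* Inner loop walks adjacent addresses, so each cache line fetched
 * is fully used before moving on. */
long sum_row_major(const int *m, size_t n) {
    assert(m != NULL);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            total += m[i * n + j];
    return total;
}

/* Same arithmetic, but the inner loop strides n * sizeof(int) bytes
 * per step, so consecutive accesses tend to land on different
 * cache lines. */
long sum_col_major(const int *m, size_t n) {
    assert(m != NULL);
    long total = 0;
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < n; i++)
            total += m[i * n + j];
    return total;
}
```

Both return the same result. On matrices much larger than the cache, the strided version is typically noticeably slower, though how much depends on the CPU, the compiler and the matrix size, so measure on your own hardware rather than trusting a rule of thumb.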

High-performance networking is another area like this. High-performance NICs still go to great lengths to give devs a BSD socket experience, and you can still get 80-90% of the performance advantages of kernel bypass without abandoning that model.

> The amount of performance you can extract from a modern CPU if you really start optimising cache access patterns is astounding

I think this was, and I want to emphasise this, one of the main points behind the Odin programming language.