I feel like I could learn a lot just studying this. Just out of curiosity: how do you know whether something is in L1 cache or not? Are there kernel functions for that, or do you just find out through benchmarking?
From the working set size and knowledge of hardware cache behaviour. On a typical three-level hierarchy, whenever you access data from memory that isn't already cached, it gets copied four times: into L3, L2, L1 and finally into CPU registers. As you access more data, the hardware evicts older cache lines to make space for it.
If you loop through an array once and then iterate through it again, you can figure out which cache level the second pass will be served from based on the array size.
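To make that size reasoning concrete, here's a rough sketch. The cache capacities are assumptions I picked for illustration (check lscpu for your machine), and the function names are made up. It also ignores everything else competing for the cache, so treat the answer as a first guess, not a guarantee.

```python
# Assumed per-core cache capacities in bytes. These are illustrative only;
# real values differ between CPUs (see lscpu or your CPU's datasheet).
ASSUMED_CACHE_SIZES = [
    ("L1d", 32 * 1024),
    ("L2", 1024 * 1024),
    ("L3", 32 * 1024 * 1024),
]


def array_bytes(n_elements, element_size):
    """Working set size of a contiguous array, in bytes."""
    return n_elements * element_size


def smallest_cache_that_fits(working_set_bytes, caches=ASSUMED_CACHE_SIZES):
    """Return the name of the smallest cache level whose capacity can hold
    the whole working set, or "DRAM" if none of them can.

    This only compares sizes. It ignores associativity, other data in the
    cache, prefetching, and cores sharing L3.
    """
    for name, capacity in caches:
        if working_set_bytes <= capacity:
            return name
    return "DRAM"


if __name__ == "__main__":
    # 4096 four-byte ints = 16 KiB, so a second pass should mostly hit L1d.
    print(smallest_cache_that_fits(array_bytes(4096, 4)))
    # 8192 eight-byte doubles = 64 KiB, too big for the assumed 32 KiB L1d.
    print(smallest_cache_that_fits(array_bytes(8192, 8)))
```

If the array is bigger than a level, a second sequential pass tends to find the start of the array already evicted from that level, which is why the size comparison is a useful first guess.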
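Does it fit in 32K (a common L1d size)? Does it have some weird aliasing issue because too many power-of-two strides landed in the same cache set and kept evicting each other? And if you don't know the answers to these, just check the L1d hit rate with perf (for example, perf stat -e L1-dcache-loads,L1-dcache-load-misses on your program).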
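The power-of-two aliasing issue is easier to see with numbers. This sketch assumes a 32 KiB, 8-way set-associative L1d with 64-byte lines, which is a common layout but not universal; the constants and function names are mine, purely for illustration.

```python
# Assumed L1d geometry (illustrative; check your CPU's documentation).
LINE_SIZE = 64            # bytes per cache line
CACHE_SIZE = 32 * 1024    # total L1d capacity in bytes
WAYS = 8                  # lines that can live in one set at a time
NUM_SETS = CACHE_SIZE // (LINE_SIZE * WAYS)  # 64 sets with these numbers


def set_index(address):
    """Cache set an address maps to: line number modulo number of sets."""
    return (address // LINE_SIZE) % NUM_SETS


def sets_touched(base, stride, count):
    """Distinct sets used when reading `count` addresses `stride` bytes apart."""
    return {set_index(base + i * stride) for i in range(count)}


if __name__ == "__main__":
    # A 4096-byte stride is exactly NUM_SETS lines, so every access lands in
    # the same set. 16 lines competing for 8 ways means constant evictions,
    # even though 16 lines is only 1 KiB of data.
    print(len(sets_touched(0, 4096, 16)))
    # Padding the stride by one cache line spreads the accesses over 16 sets.
    print(len(sets_touched(0, 4096 + 64, 16)))
```

So a working set can be far smaller than 32K and still miss in L1d, which is the case where measuring the hit rate beats reasoning from size alone.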