https://www.intel.com/content/www/us/en/docs/intrinsics-guid...
Intel's docs are unfortunately spartan, but the guarantees around program order are a hint that this is what it does.
That doc is about visibility _outside the core_ (“globally visible”), so it's not what I'm looking for.
Similarly, if I look up MOVNTDQ in the Intel manuals (https://www.intel.com/content/dam/www/public/us/en/documents...), they say:
“Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation implemented with the SFENCE or MFENCE instruction should be used in conjunction with VMOVNTDQ instructions if multiple processors might use different memory types to read/write the destination memory locations”
Note _if multiple processors_.
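For concreteness, here is a minimal sketch of where that fence goes relative to the stores. It assumes x86-64 with SSE2 and a GCC- or Clang-style compiler; `stream_fill` and `sum_after_stream_fill` are hypothetical names made up for this example. `_mm_stream_si128` is the intrinsic for MOVNTDQ and `_mm_sfence` is the one for SFENCE.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: __m128i, _mm_set1_epi32, _mm_stream_si128 */
#include <xmmintrin.h>  /* SSE: _mm_sfence */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write `value` into n_vecs 16-byte vectors starting
   at dst, using non-temporal stores (MOVNTDQ). */
static void stream_fill(int32_t *dst, size_t n_vecs, int32_t value)
{
    /* MOVNTDQ requires a 16-byte aligned destination. */
    assert(((uintptr_t)dst & 15u) == 0);

    __m128i v = _mm_set1_epi32(value);
    for (size_t i = 0; i < n_vecs; i++) {
        _mm_stream_si128((__m128i *)dst + i, v);
    }

    /* The fence the manual asks for when other processors may read the
       destination: it orders the weakly-ordered stores above before any
       later stores, such as a "data is ready" flag. */
    _mm_sfence();
}

/* Hypothetical same-thread check: fill a small buffer, then read it back. */
static int64_t sum_after_stream_fill(void)
{
    _Alignas(16) int32_t buf[64];
    stream_fill(buf, sizeof buf / sizeof(__m128i), 7);

    int64_t sum = 0;
    for (size_t i = 0; i < 64; i++) {
        sum += buf[i];
    }
    return sum;
}
```

The read-back here happens on the same thread that issued the stores, so it only shows the stores themselves working. It does not exercise the multi-processor case the manual is warning about, which is the only case the quoted text talks about.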