> Anyway, my colleague found that there was one program that needed to allocate around 64KB of memory on the stack and initialize it. The standard way of doing this is to perform a stack probe to ensure that 64KB of memory is available, then subtracting 65536 from the stack pointer, and then initializing the memory in a small, tight loop.

Actually, the standard way of allocating 64 kB of memory on the stack is to just assume you can do it, subtract 64k from the stack pointer, and hope for the best.

Most stack allocations in the wild are not checked.

IIRC you have to probe every page of the stack on Windows. You cannot just subtract a value from ESP/RSP. If you don't probe every page in order, you get a page fault or some other exception (I don't remember which one).

The reason for this is to ensure stack overflows are detected. The OS places a guard page above the top of the stack, which will cause a segfault if accessed. That way stack overflows are guaranteed to crash rather than stomping on valid memory that belongs to something else. However, if a stack frame is larger than a page (say, because it includes a large buffer), then it is possible for the program to "jump over" the guard page and access memory beyond.

In order to protect against this, if a stack frame is larger than a page, then the compiler inserts some dummy reads or writes to ensure every page is touched in order from bottom to top. This ensures the guard page is hit before the application has a chance to write to memory beyond it.

Here's an example: https://godbolt.org/z/oTbzTczM6

How else would the OS know your read/write 16 pages away from the current stack pointer is in fact an attempt to increase the stack and not just really bad pointer arithmetic and a bug? How many pages should the runtime let you skip before its just a segfault?