Windows developer here. After reading this post, my gut instinct is that this is due to something called 'segment heap'.
A bit of backstory: there are two totally independent implementations behind the Windows heap allocation APIs (i.e. the implementation code behind RtlAllocateHeap and RtlFreeHeap, which are called by malloc/free). The older of the two, developed during the Dave Cutler era, is known as the "NT heap". The newer implementation, developed in the 2010s, is known as "segment heap". This is all documented online if anyone wants to read more.
When development on segment heap was completed, it was known to be superior to the NT heap in many ways. In particular, it was more efficient in terms of memory footprint, due to lower fragmentation-related waste. Segment heap was smarter about reusing small allocation slots that were recently freed. But, as ever, Windows was very serious about legacy app compat. Joel Spolsky calls this the 'Raymond Chen camp'. So they didn't want to turn segment heap on universally. It was known that a small portion of legacy software would misbehave because it did things like rely on a bit of use-after-free, as a treat. Or worse, it took dependencies on NT heap internals by casting addresses to internal NT heap data structures.
So the decision at the time was to make segment heap the default for packaged executables. At that time, Windows Phone still existed, and Microsoft was pushing super hard on the Universal platform being the new, recommended way to make apps on Windows. So they thought we'd see a gradual transition from unpackaged executables to packaged, and thus a gradual transition from NT heap to segment heap. Instead, the dream of UWP died, and the Windows framework landscape is more fragmented than ever. Most important software on Windows is still unpackaged, and most of it runs on x64.
Why does this matter? Because segment heap is also enabled by default on Arm. Same logic as the packaged vs unpackaged decision: Arm64 binaries on Windows are guaranteed not to be ancient, unmaintained legacy code. Arm64 Windows devices have been a big success, and users widely report that they feel more responsive than x64 devices.
A not insignificant part of why Windows feels better on Arm is that segment heap is enabled by default there.
I'd be interested to see how this test turns out if you force segment heap on x64. You can do it on a per-executable basis by creating a DWORD value named FrontEndHeapDebugOptions under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\<myExeName>.exe, and giving it a value of 8.
You can turn it on globally for all processes by creating a DWORD value named "Enabled" under HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Segment Heap, and giving it a value of 3. I do this on my dev machine and have encountered zero problems. The memory footprint savings are pretty crazy. About 15% in my testing.
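For anyone who wants to try the two registry tweaks above, here is a minimal sketch that renders them as .reg file text. It does not touch the registry; it only produces text you can review and import yourself (writing under HKLM needs an elevated prompt). The helper names (`reg_dword`, `per_exe_reg`, `global_reg`, `reg_file`) and the example executable name `myapp.exe` are made up for illustration; the key paths, value names, and values are the ones quoted in the comment.

```python
# Sketch: render the per-executable and global segment heap settings
# described above as .reg file text. Nothing is written to the registry.

IFEO_KEY = (r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT"
            r"\CurrentVersion\Image File Execution Options")
GLOBAL_KEY = (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet"
              r"\Control\Session Manager\Segment Heap")


def reg_dword(key: str, name: str, value: int) -> str:
    """Format one DWORD value in .reg syntax (8 hex digits)."""
    return f'[{key}]\n"{name}"=dword:{value:08x}\n'


def per_exe_reg(exe_name: str) -> str:
    """Per-executable opt-in: FrontEndHeapDebugOptions = 8."""
    return reg_dword(f"{IFEO_KEY}\\{exe_name}", "FrontEndHeapDebugOptions", 8)


def global_reg() -> str:
    """Global opt-in: Enabled = 3 under the Segment Heap key."""
    return reg_dword(GLOBAL_KEY, "Enabled", 3)


def reg_file(*sections: str) -> str:
    """Join sections under the standard .reg header line."""
    return "Windows Registry Editor Version 5.00\n\n" + "\n".join(sections)


if __name__ == "__main__":
    # "myapp.exe" is a placeholder; substitute the executable under test.
    print(reg_file(per_exe_reg("myapp.exe"), global_reg()))
```

Generating text rather than writing the values directly keeps the change reviewable and easy to undo by deleting the same values.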
I've measured NT Heap vs. Segment Heap for my RAM and CPU intensive workloads and got a steady 7% overall performance improvement. The combined workload finishes 7% quicker with the Segment Heap.
P.S. In the Windows 95 to Windows Vista era, there was a good tradition of "Compatible with Windows XXX" certifications for apps. If MS did something like that for Windows 10/11 and included a segment heap tick mark in it, a considerably larger number of apps and their users would benefit from increased performance. Think better energy consumption and eco-friendliness as additional bonuses.
P.S. 2: The problem with UWP was not the technology itself, it was the stubbornness to have it packaged and tied to The Store, all of which contradicts the very existence of Windows as an OS.
UWP is not strictly tied to the Windows store (you can install UWP applications packaged in the right format(s) from the command line, for business deployments for instance), but it might as well be when it comes to consumers.
I can't really complain, though. If UWP had broken through, the Steam Deck would've probably been a much more massive undertaking to get working right.
As long as developers can opt into the new system (which they can with the manifest approach), I don't think it matters whether you're doing UWP or traditional Windows applications.
Microsoft has added a mishmash of flags in the app manifest and transparently supports manifest-less applications, so developers don't have a need to ever bother including a manifest either.
It'd annoy a lot of people, but if Windows showed a "this app has been written for an older version of Windows and may be slower than modern applications" warning for old .exes (or maybe one of those popups they now like, about which apps are slower than they could be), developers would have an incentive to add a manifest to their applications and Microsoft could enable a lot more of these optimisations for a lot more applications.
> As long as developers can opt into the new system (which they can with the manifest approach) [...] Microsoft has added a mishmash of flags in the app manifest
Could you please tell me, where are all these manifest flags documented? I asked about it a decade and a half ago at stackoverflow (https://stackoverflow.com/questions/5733085/application-mani...), and the only answer was "there isn't".
https://learn.microsoft.com/en-us/windows/win32/sbscs/applic... has the majority here.
I don't see why you'd need a separate flag for memory management, Windows version, printer driver isolation, awareness of long paths, and all of that jazz.
Still, https://learn.microsoft.com/en-us/windows/win32/sbscs/applic... has a setting to enable modern memory management.
For those interested, you can opt-in to this behavior via the application manifest for your own executables: set heapType to SegmentHeap https://learn.microsoft.com/en-us/windows/win32/sbscs/applic...
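As a rough illustration of the manifest opt-in, here is a sketch that holds a minimal manifest as a string and checks that it is well-formed and carries the setting. The XML namespace URIs are my recollection of the Microsoft documentation and should be verified against the linked page before shipping; the `heap_type` helper is hypothetical, and how you embed the manifest depends on your build tooling.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the heapType setting; verify against the docs.
WS_NS = "http://schemas.microsoft.com/SMI/2020/WindowsSettings"

# Minimal manifest requesting the segment heap for this executable.
MANIFEST = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly manifestVersion="1.0" xmlns="urn:schemas-microsoft-com:asm.v1"
          xmlns:asmv3="urn:schemas-microsoft-com:asm.v3">
  <asmv3:application>
    <asmv3:windowsSettings
        xmlns="http://schemas.microsoft.com/SMI/2020/WindowsSettings">
      <heapType>SegmentHeap</heapType>
    </asmv3:windowsSettings>
  </asmv3:application>
</assembly>
"""


def heap_type(manifest_xml: str):
    """Return the heapType text from a manifest, or None if it is absent."""
    root = ET.fromstring(manifest_xml.encode("utf-8"))
    node = root.find(f".//{{{WS_NS}}}heapType")
    return None if node is None else node.text


if __name__ == "__main__":
    print(heap_type(MANIFEST))
```

A check like this could run in CI to catch a manifest that silently lost the setting during a merge.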
Two issues.
First, regarding application compatibility: the heap was already changed once prior to the segment heap. The Low Fragmentation Heap (LFH) was added in XP and made default in Vista, with applications no longer having to opt into it:
https://learn.microsoft.com/en-us/windows/win32/memory/low-f...
Second, the segment heap has different tradeoffs that make it not a guaranteed win to swap in: it trades performance for a smaller working set:
https://issues.chromium.org/issues/40138716
This feels like it deserves to live somewhere on a blog, not as a comment on some forum. This is really interesting, thanks for sharing.
Agreed! Looks like it can be enabled with a manifest at build time too: https://learn.microsoft.com/en-us/windows/win32/sbscs/applic...
This is the sort of extremely valuable hint that makes HN worthwhile.
Does that global registry key require a reboot, or does it just take effect on executable launch?
> You can turn it on globally for all processes by creating a DWORD value named "Enabled" under HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Segment Heap, and giving it a value of 3
I had previously seen this described as 0 vs non-zero. Since you have some inside experience :), anything special about 3 instead? What about 2? How would I find these value meanings out on my own (if that's even possible)?
Thanks!
It's a combination of bit flags. The lowest bit controls whether segment heap is on or off. The second-lowest bit controls some additional optimizations that go along with it, something about multithreading. A value of 3 (both flags set) gives you identical behavior to what specifying <heapType>SegmentHeap</heapType> in your application manifest does.
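To make the bit layout concrete, here is a small sketch that decodes an "Enabled" value using only what is stated above. The flag names are invented for illustration (I don't know what they are called internally), and any bits beyond the lowest two are ignored because nothing here describes them.

```python
from enum import IntFlag


class SegmentHeapEnabled(IntFlag):
    # Names are hypothetical; only the bit positions come from the comment.
    SEGMENT_HEAP = 0x1         # lowest bit: segment heap on or off
    EXTRA_OPTIMIZATIONS = 0x2  # second-lowest bit: additional optimizations


def describe(value: int) -> list[str]:
    """List the names of the known flags that are set in value."""
    return [flag.name for flag in SegmentHeapEnabled if value & flag]


def matches_manifest(value: int) -> bool:
    """True when both flags are set, i.e. the value-3 behavior."""
    return value & 0x3 == 0x3


if __name__ == "__main__":
    for v in (0, 1, 2, 3):
        print(v, describe(v), matches_manifest(v))
```

Read this way, a value of 2 sets only the second flag and leaves the on/off bit clear, and 1 turns segment heap on without the extra optimizations.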
Using the application manifest approach is the right way to ship software that opts into segment heap. The registry thing is just a convenience for local testing.
How often does software actually ship with the opt-in for segment heap turned on, though?
Any way to turn it on globally with a blacklist or denylist or whatever, in case something individual acts up?
Would the (not Framework) .NET apps I work on benefit from this?
Any app using memory allocation functions would benefit from a newer heap implementation regardless of the technology it's built with, unless it's actively constrained by compatibility burdens. In the case of .NET, memory layout compatibility is not something you usually care about unless the app loads old 3rd party .DLLs through P/Invoke. So for 99.9% of .NET (not Framework) apps, the segment heap should work just fine.
Ahh, a Windows problem surfaces - so does a registry hack allegedly fixing the problem. That's basically the (sad) story of Windows today.
I don't see how this is a 'problem', but rather a tunable. And any decent OS has tunables. Would you rather not have this option?
You could say that Windows is giving you options while defaulting to backwards compatibility.
One could also call it tuning.
Wonderful breakdown. I love reading this kind of thing. Thank you.
Seems like a no-brainer on virtualised environments for Windows servers?
Also, I'm assuming that most Microsoft first-party applications in Windows Server (DNS, etc.) would all be optimised for segment heap?