For anyone not familiar with the meaning of '2' in this context:
The Linux kernel supports the following overcommit handling modes
0 - Heuristic overcommit handling. Obvious overcommits of address space are refused. Used for a typical system. It ensures a seriously wild allocation fails while allowing overcommit to reduce swap usage. root is allowed to allocate slightly more memory in this mode. This is the default.
1 - Always overcommit. Appropriate for some scientific applications. Classic example is code using sparse arrays and just relying on the virtual memory consisting almost entirely of zero pages.
2 - Don't overcommit. The total address space commit for the system is not permitted to exceed swap + a configurable amount (default is 50%) of physical RAM. Depending on the amount you use, in most situations this means a process will not be killed while accessing pages but will receive errors on memory allocation as appropriate. Useful for applications that want to guarantee their memory allocations will be available in the future without having to initialize every page.
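If you want to poke at this on a live box, the knobs are plain text files under /proc/sys/vm. A minimal sketch (assuming the usual overcommit_memory / overcommit_ratio files; newer kernels also have overcommit_kbytes, ignored here):

```c
/* Print the current overcommit policy and ratio.
 * Reads /proc/sys/vm/overcommit_memory (0, 1 or 2) and
 * /proc/sys/vm/overcommit_ratio (the "50%" default mentioned above). */
#include <stdio.h>

static long read_proc_long(const char *path)
{
    FILE *f = fopen(path, "r");
    long val = -1;

    if (f) {
        if (fscanf(f, "%ld", &val) != 1)
            val = -1;
        fclose(f);
    }
    return val;
}

int main(void)
{
    long mode  = read_proc_long("/proc/sys/vm/overcommit_memory");
    long ratio = read_proc_long("/proc/sys/vm/overcommit_ratio");

    printf("overcommit_memory = %ld (0=heuristic, 1=always, 2=never)\n", mode);
    printf("overcommit_ratio  = %ld%% of physical RAM counted toward the commit limit\n", ratio);
    return 0;
}
```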
> exceed swap + a configurable amount (default is 50%) of physical RAM
Naive question: why is the default 50%? And more generally, why isn't it the entire RAM; what happens to the rest?
There are a lot of options. If you want to go down the rabbit hole, try `sysctl -a | grep -E "^vm"` and that'll give you a lot of things to google ;)
it's a (then-)safe default from the age when having 1GB of RAM and 2GB of swap was the norm: https://linux-kernel.vger.kernel.narkive.com/U64kKQbW/should...
Probably a safe default, as there's extra memory needed for kernel structures, file buffering, and SSH sessions so you can log in to debug why your server suddenly has high load and high iowait (swapping).
If you know a system is going to run (e.g.) a Postgres database, then tweaking the vm.* sysctl values is part of the tuning process.
Not sure if I understand your question, but nothing "happens to the rest"; overcommitting just means processes can allocate memory in excess of RAM + swap. The percentage is arbitrary: it could be 50%, 100% or 1000%. Allocating additional memory is not a problem per se; it only becomes a problem when you try to actually write (and subsequently read) more than you have.
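As a rough sketch of that distinction (what actually happens depends on which overcommit mode the box is in and how much RAM/swap it has):

```c
/* Rough illustration of "allocating is cheap, touching is what costs":
 * ./a.out <gibibytes> will malloc that much and only then write to it. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    size_t gib  = argc > 1 ? strtoul(argv[1], NULL, 10) : 8;
    size_t size = gib << 30;
    char *p = malloc(size);

    if (!p) {
        /* Refused up front: e.g. overcommit_memory=2, or an "obvious"
         * overcommit under the heuristic mode. */
        perror("malloc");
        return 1;
    }
    printf("malloc of %zu GiB succeeded; little physical memory used yet\n", gib);

    /* Touching every page is what actually consumes RAM/swap and,
     * if you overdo it, is what wakes up the OOM killer. */
    memset(p, 1, size);
    printf("touched all pages\n");

    free(p);
    return 0;
}
```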
They’re talking about the never-overcommit setting.
Just a guess, but I reckon it doesn't account for things like kernel memory usage, such as caches and buffers. Assigning 100% of physical RAM to applications is probably going to have a Really Bad Outcome.
But the memory being used by the kernel has already been allocated by the kernel. So obviously that RAM isn't available.
I can understand leaving some amount free in case the kernel needs to allocate additional memory in the future, but anything near half seems like a lot!
> For anyone not familiar with the meaning of '2' in this context:
Source:
* https://www.kernel.org/doc/Documentation/vm/overcommit-accou...
Do any of the settings actually result in "malloc" or a similar function returning NULL?
malloc() and friends may always return NULL. From the man page:
If successful, calloc(), malloc(), realloc(), reallocf(), valloc(), and aligned_alloc() functions return a pointer to allocated memory. If there is an error, they return a NULL pointer and set errno to ENOMEM.
In practice, I find a lot of code that does not check for NULL, which is rather distressing.
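For reference, the check itself is trivial; whether the NULL branch is ever reachable on a desktop Linux box is what the rest of this thread argues about. A minimal sketch:

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    size_t n = 1000 * 1000;
    double *values = malloc(n * sizeof *values);

    if (values == NULL) {                  /* the check a lot of code skips */
        fprintf(stderr, "malloc failed: %s\n", strerror(errno));
        return EXIT_FAILURE;
    }

    memset(values, 0, n * sizeof *values); /* safe to use the memory now */
    free(values);
    return EXIT_SUCCESS;
}
```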
No non-embedded libc will actually return NULL. Very, very little practical C code relies only on what the C standard actually guarantees and would work with literally any compliant C compiler on any architecture, so I don't find this particularly concerning.
Usefully handling allocation errors is very hard to do well, since it infects literally every error handling path in your codebase. Any error handling that calls a function that might return an indirect allocation error needs to not allocate itself. Even if you have a codepath that speculatively allocates and can fallback, the process is likely so close to ruin that some other function that allocates will fail soon.
It’s almost universally more effective (not to mention easier) to keep track of your large/variable allocations proactively, and then maintain a buffer for little “normal” allocations that should have an approximate constant bound.
> No non-embedded libc will actually return NULL
This is just a Linux ecosystem thing. Other full-size operating systems do memory accounting differently and are able to correctly communicate when more memory is not available.
There are functions on many C allocators that are explicitly for non-trivial allocation scenarios, but what major operating system malloc implementation returns NULL? MSVC’s docs reserve the right to return NULL, but the actual code is not capable of doing so (because it would be a security nightmare).
I hack on various C projects on a linux/musl box, and I'm pretty sure I've seen musl's malloc() return 0, although possibly the only cases where I've triggered that fall into the 'unreasonably huge' category, where a typo made my enormous request fail some sanity check before even trying to allocate.
> There are functions on many C allocators that are explicitly for non-trivial allocation scenarios, but what major operating system malloc implementation returns NULL?
Solaris (and FreeBSD?) have overcommitting disabled by default.
Solaris, AIX, *BSD and others do not offer overcommit, which is a Linux construct, and they all require enough swap space to be available. Installation manuals provide explicit guidelines on the swap partition sizing, with the rule of thumb being «at least double the RAM size», but almost always more in practice.
That is the conservative design used by several traditional UNIX systems for anonymous memory and MAP_PRIVATE mappings: the kernel accounts for, and may reserve, enough swap to back the potential private pages up front. Tools and docs in the Solaris and BSD family talk explicitly in those terms. An easy way to test it out in a BSD would be disabling the swap partition and trying to launch a large process – it will get killed at startup, and it is not possible to modify this behaviour.
Linux’s default policy is the opposite end of that spectrum: optimistic memory allocation, where allocations and private mappings can succeed without guaranteeing backing store (i.e. swap), with failure deferred to fault time and handled by the OOM killer – that is what Linux calls overcommit.
> No non-embedded libc will actually return NULL.
malloc(-1) should always return NULL. malloc() returns NULL if the virtual address space for a given process is exhausted.
It will not return NULL when the system is out of memory (depending on the overcommit settings).
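Easy to try; on any 64-bit glibc or musl box this should print a null pointer, since -1 converts to a request no address space can satisfy:

```c
/* malloc(-1): the argument converts to SIZE_MAX, which no process
 * address space can satisfy, so the allocator fails up front
 * regardless of overcommit settings. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    void *p = malloc((size_t)-1);

    printf("malloc((size_t)-1) returned %p\n", p); /* expected: (nil) */
    free(p); /* free(NULL) is a harmless no-op */
    return 0;
}
```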
It's been a while, but while I agree the man page says that, my limited understanding was that the typical libc on Linux won't really return NULL under any sane scenario, even when the memory can't be backed.
I think you're right, but "typical" is the key word. Embedded systems, systems where overcommit is disabled, bumping into low ulimit -v settings, etc. can all trigger an immediate failure with malloc(). Those are edge cases, to be sure, but some of them could apply to a typical Linux system and I, as a coder, won't be aware of it.
As an aside: to me, checking malloc() for NULL is easier than checking the pointer returned by malloc on first use, which is what you're supposed to do in the presence of overcommit.
Even with overcommit enabled, malloc may fail if there is no contiguous address space available. That's not a problem on 64-bit, but it may occasionally happen on 32-bit.
But why would you want to violate the docs on something as fundamental as malloc? Why risk relying on implementation-specific quirks in the first place?
Because it's orders of magnitudes easier not to handle it. It's really as simple as that.
malloc() is an interface. There are many implementations.
Yes.