glibc malloc still doesn't work well for multi-threaded apps. It is prone to memory fragmentation which causes excessive memory usage. One can reduce number of arenas using MALLOC_ARENA_MAX environment variable and in many cases it's a good idea but it could increase lock contention.

If you care about efficiency of a multi-threaded app you should use jemalloc (sadly no longer maintained but still works well), mi-malloc or tcmalloc.

Glibc malloc also has a fun bug where it doesn't return memory to the OS to make it look better on benchmarks.