Hacker News

Not amazing. In certain workloads I ran, once the db reached several hundred GB, writes would hang for longer and longer periods of time, eventually hours, while the db grew drastically in the background. https://news.ycombinator.com/item?id=30023623 seems to be the same issue, and it was serious enough that Shopify decided not to use LMDB.

And yes, I ensured there were no outstanding long-lived readers, verified with mdb_stat -r. My workload used one transaction per read/write anyway (never needed larger atomicity). Once the db got into the bad state, running my program on it would almost immediately hit the issue again, so I really think the db itself is in a bad state where most writes cause it to hang, unrelated to how I do transactions. This workload would pretty consistently hit the issue once the db got to several hundred GB.

Issue #10236 on the OpenLDAP bug tracker might be the root cause, who knows. It's been marked CONFIRMED for years without a fix, while other similar issues are created.

This is extremely annoying. It seems workload dependent (other workloads I've run create absolutely massive lmdb dbs without this issue) and once it happens your only recourse is to make a new db and copy the contents over (thankfully reads still work fine on these borked dbs).

Other than that, though, it's great. I never had any actual data corruption, and reads and writes are extremely fast (until this issue happens).

Edit: fun fact, since Shopify may have created Bolt in response to this bug, and then Bolt was the root cause of the 73-hour Roblox downtime in 2021, this bug may indirectly have caused one of the worst outages ever!
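For anyone who ends up needing the "make a new db and copy the contents over" workaround, here is a rough sketch of how that copy could look in Python. It is written against the shape of the py-lmdb API (env.begin(), txn.cursor(), txn.put()), but the function name copy_in_batches and the batch_size default are made up for illustration, and nothing in this thread confirms that the copy avoids the hang on every affected db. Each batch goes into its own write transaction so that no single transaction builds up an unbounded set of dirty pages.

```python
from itertools import islice


def copy_in_batches(src_env, dst_env, batch_size=10_000):
    """Copy every key/value pair from src_env into dst_env.

    Both arguments are assumed to behave like py-lmdb Environment objects:
    env.begin(write=...) returns a transaction that works as a context
    manager, exposes cursor() and put(), and commits on a clean exit.
    Returns the number of pairs copied.
    """
    copied = 0
    # One read-only transaction on the source; the source is never written,
    # so this reader does not hold back page reuse in the destination.
    with src_env.begin() as read_txn:
        pairs = iter(read_txn.cursor())  # yields (key, value) pairs in key order
        while True:
            batch = list(islice(pairs, batch_size))
            if not batch:
                break
            # One bounded write transaction per batch.
            with dst_env.begin(write=True) as write_txn:
                for key, value in batch:
                    write_txn.put(bytes(key), bytes(value))
            copied += len(batch)
    return copied
```

With py-lmdb installed, the two environments would come from lmdb.open(path, map_size=...) with a map size large enough for the data; the paths and sizes are yours to choose.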

That it keeps an infinite cache of malloc page allocations is annoying (the issue you referenced). I just removed that (after complaining on the mailing list about it). The performance advantage is probably negligible in many cases (since malloc implementations often already cache), while causing confusing memory usage behavior.

Idk if it was your issue, but long-running write transactions don't spill to disk, so all the changes get written to disk at the end of the transaction. One would think enabling write mapping fixes this, but it needs to mark all the pages as clean before commit, so it has the same effect. I fixed this for 0.9 here: https://github.com/uroni/hs5/tree/main/external/lmdb . I will have to investigate whether it is improved in 1.0, or whether I need to redo the changes.

Edit: Just noticed that the issue is about the free list in the file. Never had a problem with that, but I also had to replace that MIDL structure with something more scalable for the spilling.

jnwatson 21 hours ago

I've used LMDB in production for multi-terabyte databases, and we encountered the long write times but found a solution.

The important idea is that LMDB offloads cache management almost completely to the OS. You have to become intimately familiar with the way that the page cache works and how to configure it.
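As a starting point for "getting familiar" with the page cache, here is a small sketch that reads the Linux writeback sysctls from /proc/sys/vm. The comment above does not say which settings mattered, so the choice of these four knobs is an assumption, and the function name read_writeback_settings is made up for illustration.

```python
from pathlib import Path

# Writeback-related Linux sysctls. Which of these (if any) matter for a given
# LMDB workload is an assumption; the thread does not name specific settings.
KNOBS = (
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_writeback_settings(root="/proc/sys/vm"):
    """Return {knob: int value} for every knob that can be read under root."""
    settings = {}
    for name in KNOBS:
        path = Path(root) / name
        try:
            settings[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            continue  # not Linux, knob missing, or value not a plain integer
    return settings
```

Calling read_writeback_settings() on a Linux box returns the current values as a dict; on other systems it simply returns an empty dict.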

uroni 21 hours ago

hyc_symas 6 hours ago

LMDB 1.0 no longer uses a P_DIRTY flag, so it no longer has to explicitly mark pages as clean.

markasoftware 20 hours ago

FWIW I had this issue even with the MDB_NOSYNC flag, so it shouldn't be force-flushing to disk unless I'm out of RAM or whatever.