Yes, for Redpanda. There's a blog about that:
"The use of fsync is essential for ensuring data consistency and durability in a replicated system. The post highlights the common misconception that replication alone can eliminate the need for fsync and demonstrates that the loss of unsynchronized data on a single node still can cause global data loss in a replicated non-Byzantine system."
That said, Redpanda is still blazingly fast.
https://www.redpanda.com/blog/why-fsync-is-needed-for-data-s...
I'm highly skeptical of the method employed to simulate unsync'd writes in that example: running a non-clustered ZooKeeper and then just shutting it down, which breaks the Kafka controller and prevents any Kafka cluster state management (not just partition leader election), all while manually corrupting the log file. Oof. Is it really _that_ hard to lose ack'd data from a Kafka cluster that you had to go to such contrived and dubious lengths?
> while manually corrupting the log file
To be fair, since without fsync you don't have any ordering guarantees for your writes, a crash has a good chance of corrupting your data, not just losing recent writes.
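For intuition, here's a minimal sketch in Go of the barrier pattern a durable log relies on (the file name, framing, and helper are made up for illustration, not taken from Kafka, Redpanda, or Postgres). The Sync call after each append is what forces the record onto disk, in order, before it's acked; drop it and the kernel may flush the pages late and out of order, so a crash can leave a torn or reordered tail rather than a clean truncation of recent records:

    // Illustrative append-only log with an fsync barrier.
    package main

    import (
        "encoding/binary"
        "hash/crc32"
        "os"
    )

    func appendRecord(f *os.File, payload []byte) error {
        // Frame each record as length + CRC + payload so recovery
        // can at least detect a torn write at the tail.
        var hdr [8]byte
        binary.LittleEndian.PutUint32(hdr[0:4], uint32(len(payload)))
        binary.LittleEndian.PutUint32(hdr[4:8], crc32.ChecksumIEEE(payload))
        if _, err := f.Write(hdr[:]); err != nil {
            return err
        }
        if _, err := f.Write(payload); err != nil {
            return err
        }
        // The barrier: only after Sync returns is it safe to ack
        // this record as durable or write anything that depends on it.
        return f.Sync()
    }

    func main() {
        f, err := os.OpenFile("wal.log", os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
        if err != nil {
            panic(err)
        }
        defer f.Close()
        if err := appendRecord(f, []byte("record-1")); err != nil {
            panic(err)
        }
    }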
That's why in PostgreSQL it's feasible to disable https://www.postgresql.org/docs/18/runtime-config-wal.html#G... but not to disable https://www.postgresql.org/docs/18/runtime-config-wal.html#G....
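Concretely, those are the two knobs (a sketch of the relevant postgresql.conf lines; the comments paraphrase the linked docs):

    # Survivable: recent commits may be lost on a crash, but WAL write
    # ordering is preserved, so recovery reaches a consistent earlier state.
    synchronous_commit = off

    # Not feasible: without fsync barriers a crash can corrupt the data
    # files and the WAL itself -- unrecoverable corruption, not just lost writes.
    fsync = off

Disabling synchronous_commit trades a bounded window of recent commits for latency; disabling fsync gives up the ordering guarantees crash recovery depends on.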
I just read the post and didn’t find it contrived at all. The point is to simulate a) network isolation and b) loss of recent writes.
Kafka no longer has a ZooKeeper dependency and Redpanda never did (this is just an aside for those reading along, not a rebuttal).
We fixed that particular issue: https://jack-vanlightly.com/blog/2023/8/17/kafka-kip-966-fix...