This uses `zfs send @snapshot`, which is block-level, not file-level.

Oh! So the issue with large Postgres backups is the number of files?

No. Postgres stores table data in files of up to 1 GB each. When you change just one byte in a table, the whole 1 GB file containing it is touched, even though effectively only that byte changed. Your file-based backup tool now has to upload 1 GB of data to save 1 byte of actual changes.

That's a fair point, and it's a known challenge with file-based backups on systems like Postgres. That said, some backup systems implement chunk-level deduplication and content-addressable storage, which can significantly reduce the amount of data actually transferred, even when large files change slightly.

For example, tools like Plakar (contributor here) split data into smaller immutable chunks and only store the modified ones, avoiding full re-uploads of 1 GB files when only a few bytes change.
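To make that concrete, here's a minimal sketch of the general technique (not Plakar's actual code or storage format, just the idea): split the file into chunks, hash each one, and only upload chunks whose hash isn't already in the store. Fixed-size chunks are used here for brevity; real tools usually pick content-defined boundaries.

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # 64 KiB chunks; real tools often use content-defined boundaries

def backup(path, store):
    """Split a file into chunks, store only the chunks not already present in the
    content-addressed store, and return the file's chunk manifest."""
    manifest = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in store:      # dedup: an unchanged chunk is never uploaded twice
                store[digest] = chunk    # in practice: a PUT to object storage
            manifest.append(digest)
    return manifest
```

With 64 KiB chunks, a 1-byte change in a 1 GB file dirties a single chunk, so one chunk gets uploaded and the other ~16,000 manifest entries just point at data already in the store.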

They fixed that in pgbackrest a while ago: https://pgbackrest.org/user-guide.html#backup/block

It was a major pain point for my backups for years.
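For anyone curious, enabling the block incremental feature is roughly a repo-level config change along these lines; it builds on the file bundling feature, so both options go together (option names from memory, the linked user guide is authoritative):

```ini
# pgbackrest.conf (illustrative; see the linked user guide for exact option names)
[global]
repo1-type=s3
repo1-s3-bucket=my-pg-backups
repo1-s3-endpoint=s3.us-east-1.amazonaws.com
repo1-s3-region=us-east-1
# bundle many small files into larger repo objects (required for block incremental)
repo1-bundle=y
# store only the changed blocks of large files
repo1-block=y
```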

Does that work with S3, etc.? I don't remember them allowing partial file uploads.

I believe so, because it is done in conjunction with their file bundling feature and doesn't rely on support from the blob storage backend.

They create a new file with the diffs of a bundle of Postgres files, and upload that to blob storage.
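So no partial-object support is needed: each backup uploads complete new objects containing only the changed blocks, plus a small map recording which object holds each block. A rough sketch of that general idea (not pgbackrest's actual repository format; the bundle name and map layout here are made up):

```python
import hashlib

BLOCK_SIZE = 8 * 1024  # illustrative block size

def block_incremental(path, prev_map, put_object, bundle_name):
    """Upload one new object containing only the blocks that changed since the last
    backup; return an updated map of block index -> {digest, object, offset}."""
    changed = bytearray()
    new_map = {}
    with open(path, "rb") as f:
        index = 0
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            digest = hashlib.sha256(block).hexdigest()
            prev = prev_map.get(index)
            if prev and prev["digest"] == digest:
                # Unchanged block: the map keeps pointing at the old object.
                new_map[index] = prev
            else:
                # Changed block: append it to the new bundle object.
                new_map[index] = {"digest": digest,
                                  "object": bundle_name,
                                  "offset": len(changed)}
                changed.extend(block)
            index += 1
    if changed:
        # One ordinary full PUT of a new object; the blob store never modifies anything in place.
        put_object(bundle_name, bytes(changed))
    return new_map
```

A restore walks the map and reads each block from whichever object it lives in, so old bundles stay immutable and the storage backend only ever sees whole-object PUTs and GETs.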