I've been using ZFS for quite a while, and some time ago I realized that for a lot of my data, I could tolerate a few hours' worth of loss.

So instead of a mirror, I've set up two separate one-disk pools, with automatic snapshots of the primary pool every few hours, which are then zfs send/recv'd to the other pool.

This gives me a lot more flexibility in the disks involved (one could be an SSD and the other spinning rust, for example), at the cost of some read speed and potential uptime.

Depending on your needs, you could even have the other disk external, and only connect it every few days.

I also have a mirrored pool for more precious data. However, almost all articles on ZFS focus on the RAID aspect, and few cover the less hardware-demanding setup described above.
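Concretely, the snapshot-then-replicate step can be sketched as a small script run from cron every few hours. The pool/dataset names, the snapshot naming scheme, and the "newest snapshot on the backup side as the incremental base" logic are my own assumptions, not from the post:

  #!/bin/sh
  # Sketch of the "snapshot, then send/recv to the second pool" cron job.
  # Pool/dataset names below are placeholders.

  replicate() {
      src=$1   # e.g. fast/data
      dst=$2   # e.g. slow/data
      now="auto-$(date +%Y%m%d-%H%M)"

      # Take the new snapshot on the primary pool.
      zfs snapshot "$src@$now"

      # Use the newest snapshot already present on the backup side as
      # the incremental base, then send everything up to the new one.
      last=$(zfs list -H -t snapshot -o name -s creation "$dst" \
             | tail -n 1 | cut -d@ -f2)
      zfs send -I "$src@$last" "$src@$now" | zfs recv -Fu "$dst"
  }

  # replicate fast/data slow/data

The -I flag sends all intermediate snapshots between the base and the new one, so the backup pool keeps the same snapshot history as the primary.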

Interesting idea... thanks for sharing.

I have two setups.

1.) A mirror with an attached Tasmota power plug that I can turn on and off via curl to spin up a USB backup HDD:

  curl "$TASMOTA_HOST/cm?cmnd=POWER+ON"
  # preparation and pool imports
  # ...
  # clone the active pool onto usb pool
  zfs send --raw -RI "$BACKUP_FROM_SNAPSHOT" "$BACKUP_UNTIL_SNAPSHOT" | pv | zfs recv -Fdu "$DST_POOL"
2.) A backup server that pulls backups via zsync (https://gitlab.bashclub.org/bashclub/zsync/), so ransomware on the source has no chance

To prevent partial data loss I use zfs-auto-snapshot, zrepl or sanoid, configured to snapshot every 15 minutes and to keep daily, weekly, monthly and yearly snapshots as long as possible.
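With sanoid, a schedule like that can be sketched roughly as follows in sanoid.conf (the dataset name and the retention counts are placeholders, not the author's actual config):

  # /etc/sanoid/sanoid.conf -- sketch of the schedule described above
  [rpool/home]
          use_template = production

  [template_production]
          # "frequently" snapshots fire every frequent_period minutes
          frequent_period = 15
          frequently = 8
          hourly = 24
          daily = 31
          weekly = 8
          monthly = 12
          yearly = 10
          autosnap = yes
          autoprune = yes

sanoid then takes the snapshots and prunes old ones; replication to another pool is handled separately (e.g. by its companion tool syncoid).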

To free up space when I have too many snapshots, I wrote my own zfs-tool (https://github.com/sandreas/zfs-tool), where you can do something like this:

  zfs-tool list-snapshots --contains='rpool/home@' --required-space=20G --keep-time="30d"

That's a really cool idea and matches my use case well. I just copy-pasted it to another person in this thread who was asking about ZFS setups.

Your use case perfectly matches mine in that I wouldn't mind much about a few hours of data loss.

I guess the one issue is that it would require more disks, which at current prices is not cheap. I was surprised how expensive they were when I bought them 6 months ago, and even more surprised when I looked recently and saw the same drives cost even more now.

I opted to use a two-disk mirror and offline the slow disk. An hourly cronjob onlines the slow disk, waits, and then offlines it again.

It also gives me the benefit of automatic repair in the event of bit rot in any blocks more than an hour old.
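The hourly cycle can be sketched like this (pool name and device path are placeholders; the original comment doesn't say how the "wait" is done, so this uses zpool wait, which needs OpenZFS 2.0 or later):

  #!/bin/sh
  # Sketch of the hourly online/resilver/offline cron job described above.

  sync_slow_half() {
      pool=$1   # e.g. tank
      dev=$2    # e.g. /dev/disk/by-id/ata-SLOW-DISK

      # Re-attach the offlined mirror half; ZFS resilvers only the
      # blocks written since it was last online.
      zpool online "$pool" "$dev"

      # Block until the resilver finishes (OpenZFS >= 2.0; older
      # systems can poll 'zpool status' instead).
      zpool wait -t resilver "$pool"

      # Take the slow half offline again until the next cron run.
      zpool offline "$pool" "$dev"
  }

  # sync_slow_half tank /dev/disk/by-id/ata-SLOW-DISK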

That sounds cool; is it possible to just query ZFS to know when it has finished resilvering the slow disk, before taking it offline again? And do you think spinning the disk down and up 24 times a day won't cause much wear on the motor?

  zpool wait -t resilver <poolname>

That is another way, though annoying if you've set up automatic error reporting.
