I currently have two computers, one of which has a big ZFS raidz pool that I back everything up to. Right now, on my local computer I use rsnapshot to do snapshot backups via rsync to the remote ZFS pool. I know I’m wasting a ton of space because I have snapshotting in the rsync backup, and then the ZFS pool is snapshotted every day on top of that.
Does it make sense to just do a regular rsync into a backup directory on the ZFS pool and then rely on the pool’s own snapshots for history?
Maybe eventually I will put the local machine on zfs and then just send the local zfs snapshots over, but that will take some time. Thanks!
Using plain rsync sounds sane.
Sending local ZFS snapshots to the remote ZFS might be problematic. Suppose you accidentally delete important data locally, nuke all of your local snapshots, and then send that state to the remote ZFS. You’ve lost all of your snapshots and there’s no way to recover the deleted data. Instead, do what I do: keep the two ZFS systems separate and use a non-ZFS mechanism to transfer data (rsync, Syncthing, etc.). That way, even if you delete everything locally, nuke all local snapshots and propagate the deletions to the remote via rsync, you can still recover your data by rolling the remote ZFS back to a snapshot taken before the deletions. For reference, I have two ZFS machines taking frequent snapshots, with Syncthing replicating data between them in near real time.
!selfhosted, please do critique if you find some fundamental issues with this.
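For anyone who wants to see the rsync-plus-remote-snapshots setup spelled out, here is a rough sketch. It only builds the two command lines and does not run anything. The paths, the host name `backup-host`, the dataset `tank/backup` and the `daily-` snapshot prefix are all made up for illustration, and the snapshot command is assumed to run on the backup server itself.

```python
import shlex
from datetime import date


def rsync_command(src: str, dest: str) -> list[str]:
    """Build a plain mirror-style rsync call (no rsnapshot-style history)."""
    return [
        "rsync",
        "-a",        # archive mode: recurse, keep permissions, times, symlinks
        "--delete",  # mirror deletions too; remote ZFS snapshots keep the history
        src,
        dest,
    ]


def snapshot_command(dataset: str, day: date) -> list[str]:
    """Build the `zfs snapshot` call that would run on the backup server."""
    return ["zfs", "snapshot", f"{dataset}@daily-{day.isoformat()}"]


if __name__ == "__main__":
    # Hypothetical paths, host and dataset names, for illustration only.
    print(shlex.join(rsync_command("/home/me/", "backup-host:/tank/backup/home/")))
    print(shlex.join(snapshot_command("tank/backup", date(2024, 1, 15))))
```

With `--delete`, a local deletion does reach the mirror, which is exactly why the remote snapshots matter: they are the only place the old version still lives.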
Zfs send / receive might be what you want
Wouldn’t send/receive also sync snapshots across ZFS instances?
Docs say this, so yeah: “Send streams can either be ‘full’, containing all data in a given snapshot, or ‘incremental’, containing only the differences between two snapshots. ZFS receive reads these send streams and uses them to re-create identical snapshots on a receiving system.”
Hm, so send doesn’t “create the same state, bits and snapshots” on the other side. Instead it “adds net new snapshots” on the other side. 🤔
Perhaps I could use send instead of Syncthing after all. But then again I’m typically syncing net new data so the optimization would be minimal.
I believe there is a method to do a 1-1 build copy, but my expertise ends at this point
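To make the full vs. incremental distinction concrete, here is a small sketch that just assembles the command lines (nothing is executed). The dataset, snapshot and host names are hypothetical. As far as I know, a plain incremental send like this only adds the new snapshot on the receiving side; the closer-to-1:1 variant would be a replication stream (`zfs send -R`) received with `zfs receive -F`, which can also remove remote snapshots that no longer exist on the sender, so that is the one to be careful with.

```python
import shlex
from typing import Optional


def send_command(dataset: str, new_snap: str, prev_snap: Optional[str] = None) -> list[str]:
    """Full stream if prev_snap is None, otherwise incremental from prev_snap."""
    cmd = ["zfs", "send"]
    if prev_snap is not None:
        # -i: send only the differences between prev_snap and new_snap
        cmd += ["-i", f"{dataset}@{prev_snap}"]
    cmd.append(f"{dataset}@{new_snap}")
    return cmd


def pipeline(dataset: str, new_snap: str, host: str, target: str,
             prev_snap: Optional[str] = None) -> str:
    """Render `zfs send ... | ssh host 'zfs receive target'` as one shell line."""
    send = shlex.join(send_command(dataset, new_snap, prev_snap))
    recv = shlex.join(["zfs", "receive", target])
    return f"{send} | ssh {shlex.quote(host)} {shlex.quote(recv)}"


if __name__ == "__main__":
    # Hypothetical names: tank/data locally, backup/data on backup-host.
    print(pipeline("tank/data", "monday", "backup-host", "backup/data"))
    print(pipeline("tank/data", "tuesday", "backup-host", "backup/data", prev_snap="monday"))
```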
This is fantastically helpful, thank you. I will do this.
I don’t know why I thought sending zfs snapshots was the better option
Cause it makes sense at a glance and it’s efficient. Not for backup purposes though.
You don’t sync the deletion of snapshots; you let the remote expire its own snapshots on its own schedule.
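A minimal sketch of what remote-side expiry could look like, assuming you already have a mapping of snapshot names to the dates they were taken (how you get that list is left out here). The snapshot names and the 14-day window are invented for the example, and the `zfs destroy` commands are only built, never run.

```python
from datetime import date, timedelta


def expired(snapshots: dict[str, date], today: date, keep_days: int) -> list[str]:
    """Return names of snapshots taken more than keep_days before today."""
    cutoff = today - timedelta(days=keep_days)
    return sorted(name for name, taken in snapshots.items() if taken < cutoff)


def destroy_commands(names: list[str]) -> list[list[str]]:
    """Build one `zfs destroy` call per expired snapshot."""
    return [["zfs", "destroy", name] for name in names]


if __name__ == "__main__":
    # Hypothetical snapshots on the backup server.
    snaps = {
        "tank/backup@daily-2024-01-01": date(2024, 1, 1),
        "tank/backup@daily-2024-01-17": date(2024, 1, 17),
        "tank/backup@daily-2024-01-30": date(2024, 1, 30),
    }
    for cmd in destroy_commands(expired(snaps, date(2024, 1, 31), keep_days=14)):
        print(" ".join(cmd))
```

The point is that this decision is made on the remote alone, so wiping snapshots on the local machine never shortens the remote history.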
That is rsync.net’s entire business model.
I still rclone my Borg repos there instead of relying on snapshots though.
I also use rsync.net but as direct host for my borg repos, why rclone after?
It works the same either way. Borg does a lot of different backups on my home network. I also have more than just Borg backups that I want off-site, so an rclone of everything from that nas share once after everything else is done makes more sense than duplicating Borg everywhere. The rclone’d stuff can be used directly just like if it was put there by Borg itself.
It’d be worth checking out Borg as an alternative to rsync. Borg will handle snapshotting, and automatically de-dupe on a block-by-block basis.
I use it for all of my remote backups, and it provides a lot of quality of life stuff that rsync isn’t going to handle.
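Roughly what that looks like in practice, as a sketch that only builds the Borg command lines. This assumes the Borg 1.x command syntax (`repo::archive`), and the repository location, source path and retention numbers are placeholders, not recommendations.

```python
import shlex


def borg_create(repo: str, paths: list[str]) -> list[str]:
    """Build a `borg create` call; Borg expands {hostname} and {now} itself."""
    archive = f"{repo}::{{hostname}}-{{now}}"
    return ["borg", "create", "--stats", "--compression", "lz4", archive, *paths]


def borg_prune(repo: str, daily: int, weekly: int) -> list[str]:
    """Build a `borg prune` call that thins out old archives."""
    return ["borg", "prune", "--keep-daily", str(daily), "--keep-weekly", str(weekly), repo]


if __name__ == "__main__":
    # Hypothetical repository on the backup server.
    repo = "backup-host:/tank/borg"
    print(shlex.join(borg_create(repo, ["/home/me"])))
    print(shlex.join(borg_prune(repo, daily=7, weekly=4)))
```

Each `borg create` run produces a new archive, and only chunks the repository has not seen before get stored, which is where the space savings over rsnapshot-style copies would come from.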
So for this, would I make another ZFS pool on my remote backup server that is not snapshotted? The problem I have is that I already have snapshotting via rsync, but then the whole remote server ZFS pool is snapshotted as well, so there’s a lot of redundancy.
Have you tried a restore? A non-differential smap snapshot should be fine, but differential snapshots would make a restore difficult to impossible.
A zfs send and zfs receive with a differential (incremental) snapshot would be more traditional. If you put mbuffer in the middle, it would even be fast.
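A sketch of what "mbuffer in the middle" might look like, again only rendering the shell pipeline rather than running it. The snapshot, host and target names are hypothetical, and the buffer sizes (128k blocks, 1G of memory) are just commonly seen starting values, not tuned numbers.

```python
import shlex


def join_pipeline(stages: list[list[str]]) -> str:
    """Quote each stage and join the stages with shell pipes."""
    return " | ".join(shlex.join(stage) for stage in stages)


def buffered_send(snapshot: str, host: str, target: str,
                  block_size: str = "128k", buffer_size: str = "1G") -> str:
    """zfs send -> mbuffer -> ssh host 'zfs receive target'."""
    return join_pipeline([
        ["zfs", "send", snapshot],
        # -q: quiet, -s: block size, -m: total memory used for buffering
        ["mbuffer", "-q", "-s", block_size, "-m", buffer_size],
        ["ssh", host, shlex.join(["zfs", "receive", target])],
    ])


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(buffered_send("tank/data@tuesday", "backup-host", "backup/data"))
```

The buffer sits between the sender and the network, so short stalls on either side do not leave the other one idle.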
Don’t use filesystem snapshots as backup. They’re a safety measure against accidental deletion or casual modifications but they’re not backups.
If you want backups then use a proper, dedicated solution like Borg Backup. It connects remotely, takes care of deduplication, compression, encryption etc. and you can fully verify the backups and manage them individually.
This is the right answer. A better backup strategy is an actual backup strategy. Snapshots, drive mirroring, rsync copies, etc. aren’t really backups.