A replication task allows you to automate the copying of ZFS snapshots to another system over an encrypted connection. This allows you to create an off-site backup of a ZFS dataset or pool.
This section will refer to the system generating the ZFS snapshots as PUSH and the system to receive a copy of the ZFS snapshots as PULL.
Before you can configure a replication task, the following prerequisites must be met:
- a ZFS volume must exist on both PUSH and PULL.
- a periodic snapshot task must be created on PUSH. You will not be able to create a replication task before the first snapshot exists.
- the SSH service must be enabled on PULL. The first time the service is enabled, it will generate the required SSH keys.
A replication task uses the following keys:
- /data/ssh/replication.pub: the RSA public key used for authenticating the PUSH replication user. This key needs to be copied to the replication user account on PULL.
- /etc/ssh/ssh_host_rsa_key.pub: the RSA host public key of PULL used to authenticate the receiving side in order to prevent a man-in-the-middle attack. This key needs to be copied to the replication task on PUSH.
This section will demonstrate how to configure a replication task between the following two FreeNAS® systems:
- 192.168.2.2 will be referred to as PUSH. This system has a periodic snapshot task for the ZFS dataset /mnt/local/data.
- 192.168.2.6 will be referred to as PULL. This system has an existing ZFS volume named /mnt/remote which will store the pushed snapshots.
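The replication screens in this section expect dataset names without the /mnt/ mount prefix, so /mnt/local/data is entered as local/data. As a minimal sketch, the conversion can be written as a small shell helper; dataset_name is a hypothetical function name used only for illustration, not a FreeNAS command.

```shell
#!/bin/sh
# Hypothetical helper: convert a mount path such as /mnt/local/data
# into the dataset name (local/data) entered in the replication task.
dataset_name() {
    path=$1
    # Strip one leading "/mnt/" if present; other input is left unchanged.
    printf '%s\n' "${path#/mnt/}"
}

dataset_name /mnt/local/data   # prints: local/data
dataset_name /mnt/remote       # prints: remote
```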
A copy of the public key for the replication user on PUSH needs to be pasted to the public key of the replication user on the PULL system.
To obtain a copy of the replication key: on PUSH go to Storage → Replication Tasks → View Replication Tasks. Click the View Public Key button and copy its contents. An example is shown in Figure 6.2a.
Figure 6.2a: Paste the Replication Key
Go to PULL and click Account → Users → View Users. Click the Modify User button for the user account you will be using for replication (by default this is the root user). Paste the copied key into the "SSH Public Key" field and click OK. If a key already exists, append the new text after the existing key.
On PULL, ensure that the SSH service is enabled in Services → Control Services. Start it if it is not already running.
On PUSH, verify that a periodic snapshot task has been created and that at least one snapshot is listed in Storage → Periodic Snapshot Tasks → View Periodic Snapshot Tasks → ZFS Snapshots.
To create the replication task, click Storage → Replication Tasks → Add Replication Task. Figure 6.2b shows the required configuration for our example:
- the Volume/Dataset is local/data
- the Remote ZFS Volume/Dataset is remote
- the Remote hostname is 192.168.2.6
- the Begin and End times are at their default values, meaning that replication will occur whenever a snapshot is created
- once the Remote hostname is input, click the SSH Key Scan button; assuming the address is reachable and the SSH service is running on PULL, its key will automatically be populated to the Remote hostkey box
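If you want to look at the host key of PULL from Shell before trusting it, the OpenSSH ssh-keyscan utility performs a comparable lookup. Whether the SSH Key Scan button calls ssh-keyscan internally is an assumption here, and scan_hostkey is a hypothetical wrapper name. The sketch below only prints the command when RUN is set to echo, so it can be reviewed before it is run.

```shell
#!/bin/sh
# Sketch: fetch the RSA host public key of PULL with OpenSSH's ssh-keyscan.
# scan_hostkey is a hypothetical wrapper; host and port are example values.
# Set RUN=echo to print the command instead of executing it.
scan_hostkey() {
    host=$1
    port=${2:-22}
    # -t rsa limits the scan to RSA keys; -p selects the SSH port on PULL.
    ${RUN:-} ssh-keyscan -t rsa -p "$port" "$host"
}

# Dry run for the example PULL system: prints the command only.
RUN=echo scan_hostkey 192.168.2.6
```

Compare the key that is returned with the contents of /etc/ssh/ssh_host_rsa_key.pub on PULL before relying on it.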
Table 6.2a summarizes the available options in the Add Replication Task screen:
Figure 6.2b: Adding a Replication Task
Table 6.2a: Adding a Replication Task
| Setting | Value | Description |
|---|---|---|
| Enabled | checkbox | uncheck to disable the scheduled replication task without deleting it |
| Volume/Dataset | drop-down menu | the ZFS volume or dataset on PUSH containing the snapshots to be replicated; the drop-down menu will be empty if no snapshot exists yet |
| Remote ZFS Volume/Dataset | string | the ZFS volume on PULL that will store the snapshots; /mnt/ is assumed and should not be included in the path |
| Recursively replicate | checkbox | if checked, replicates child datasets and replaces the previous snapshot stored on PULL |
| Initialize remote side | checkbox | performs a one-time reset which destroys the replication data on PULL before reverting to normal operation; use this option if replication gets stuck |
| Limit (kB/s) | integer | limits replication speed to the specified value in kilobytes/second; the default of 0 is unlimited |
| Begin | drop-down menu | replication cannot start before this time; the times selected in the Begin and End fields set the window during which replication can occur |
| End | drop-down menu | replication must start by this time; once started, replication continues until it is finished (see the discussion following this table) |
| Remote hostname | string | IP address or DNS name of PULL |
| Remote port | string | must match the port used by the SSH service on PULL |
| Dedicated User Enabled | checkbox | allows a user account other than root to be used for replication |
| Dedicated User | drop-down menu | only available if Dedicated User Enabled is checked; select the user account to be used for replication |
| Enable High Speed Ciphers | checkbox | note that the cipher is quicker because it has a lower strength |
| Remote hostkey | string | use the SSH Key Scan button to retrieve the public key of PULL |
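When choosing a value for Limit (kB/s), it can help to estimate how long a snapshot of a given size would take at that rate. The following is a rough sketch only: estimate_seconds is a hypothetical helper, the snapshot size is supplied by you in kilobytes, and the arithmetic ignores compression, SSH overhead, and network conditions.

```shell
#!/bin/sh
# Hypothetical helper: estimate transfer time for a snapshot.
#   $1 = amount of data to send, in kB
#   $2 = value of the Limit (kB/s) field; 0 means unlimited
estimate_seconds() {
    size_kb=$1
    limit_kbs=$2
    if [ "$limit_kbs" -eq 0 ]; then
        # No limit is set, so no estimate can be derived from it.
        echo "unlimited"
        return 0
    fi
    # Integer ceiling division: round partial seconds up.
    echo $(( (size_kb + limit_kbs - 1) / limit_kbs ))
}

estimate_seconds 1048576 512   # prints: 2048
```

In this example, 1048576 kB sent at 512 kB/s needs roughly 2048 seconds, or about 34 minutes.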
By default, replication occurs when snapshots occur. For example, if snapshots are scheduled for every 2 hours, replication occurs every 2 hours. The Begin and End times can be used to create a window of time where replication occurs. Change the default times (which allow replication to occur at any time of the day a snapshot occurs) if snapshot tasks are scheduled during office hours but the replication itself should occur after office hours. For the End time, consider how long replication will take so that it finishes before the next day's office hours begin.
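The window logic described above can be sketched in a few lines of shell. This is an illustration of the idea, not the code FreeNAS uses: in_window and to_minutes are hypothetical helpers, times are given as HH:MM, the End time is treated as inclusive, and a Begin time later than the End time is assumed to mean a window that wraps past midnight.

```shell
#!/bin/sh
# Convert HH:MM to minutes since midnight.
to_minutes() {
    h=${1%%:*}
    m=${1##*:}
    # Strip one leading zero so values such as 08 are not read as octal.
    h=${h#0}
    m=${m#0}
    echo $(( ${h:-0} * 60 + ${m:-0} ))
}

# in_window NOW BEGIN END: succeed if NOW falls inside the window.
in_window() {
    now=$(to_minutes "$1")
    begin=$(to_minutes "$2")
    end=$(to_minutes "$3")
    if [ "$begin" -le "$end" ]; then
        [ "$now" -ge "$begin" ] && [ "$now" -le "$end" ]
    else
        # Assumed behavior: the window wraps past midnight (e.g. 18:00 to 06:00).
        [ "$now" -ge "$begin" ] || [ "$now" -le "$end" ]
    fi
}

# A snapshot taken at 22:30 with an after-hours window of 18:00 to 06:00.
if in_window 22:30 18:00 06:00; then
    echo "may start"
else
    echo "must wait"
fi
```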
Once the replication task is created, it will appear in the View Replication Tasks screen of PUSH.
PUSH will immediately attempt to replicate its latest snapshot to PULL. If the replication is successful, the snapshot will appear in the Storage → Periodic Snapshot Tasks → View Periodic Snapshot Tasks → ZFS Snapshots tab of PULL, as seen in Figure 6.2c.
Figure 6.2c: Verifying the Snapshot was Replicated
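The same check can be made from Shell on PULL by searching the snapshot list for the expected name. The sketch below assumes the list is produced by zfs list -H -t snapshot -o name; has_snapshot is a hypothetical helper, and the snapshot name remote/data@auto-20110922.1753-2h is only an illustration of what a replicated snapshot might be called on PULL.

```shell
#!/bin/sh
# has_snapshot NAME: succeed if NAME appears as a whole line on stdin.
has_snapshot() {
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex).
    grep -qxF -- "$1"
}

# On PULL the list would typically come from:
#   zfs list -H -t snapshot -o name | has_snapshot NAME
# Here a sample list stands in for the real output.
printf '%s\n' 'remote/data@auto-20110922.1753-2h' |
    has_snapshot 'remote/data@auto-20110922.1753-2h' && echo "replicated"
```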
If the snapshot is not replicated, see the next section for troubleshooting tips.
If you have followed all of the steps above and have PUSH snapshots that are not replicating to PULL, check to see if SSH is working properly. On PUSH, open Shell and try to ssh into PULL. Replace hostname_or_ip with the value for PULL:
ssh -vv -i /data/ssh/replication hostname_or_ip
This command should not ask for a password. If it asks for a password, SSH authentication is not working. Go to Storage → Replication Tasks → View Replication Tasks and click the "View Public Key" button. Make sure that it matches one of the values in ~/.ssh/authorized_keys on PULL, where ~ represents the home directory of the replication user.
Also check /var/log/auth.log on PULL and /var/log/messages on PUSH to see if either log gives an indication of the error.
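Comparing long keys by eye is error-prone, so the key check described above can be scripted. The following sketch compares only the base64 key material (the second field of each line), since the trailing comment may differ between copies. The helper names are hypothetical, and the sketch assumes authorized_keys lines that do not begin with an options field.

```shell
#!/bin/sh
# key_blob FILE: print the key material (second field) of each
# non-comment line that has at least two fields.
key_blob() {
    awk '!/^#/ && NF >= 2 { print $2 }' "$1"
}

# key_authorized PUBKEY AUTHORIZED_KEYS: succeed if the key material of
# PUBKEY appears in AUTHORIZED_KEYS. Returns 2 if PUBKEY holds no key.
key_authorized() {
    blob=$(key_blob "$1" | head -n 1)
    [ -n "$blob" ] || return 2
    key_blob "$2" | grep -qxF -- "$blob"
}

# Example use on PULL, after copying /data/ssh/replication.pub from PUSH
# to a temporary location (the /tmp path is an assumption):
#   key_authorized /tmp/replication.pub ~/.ssh/authorized_keys && echo "key present"
```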
If the key is correct and replication is still not working, try deleting all snapshots on PULL except for the most recent one. In Storage → Periodic Snapshot Tasks → View Periodic Snapshot Tasks → ZFS Snapshots check the box next to every snapshot except for the last one (the one with 3 icons instead of 2), then click the global Destroy button at the bottom of the screen.
Once you have only one snapshot, open Shell on PUSH and use the zfs send command. To continue our example, the ZFS snapshot on the local/data dataset of PUSH is named auto-20110922.1753-2h, the IP address of PULL is 192.168.2.6, and the ZFS volume on PULL is remote. Note that the @ is used to separate the volume/dataset name from the snapshot name.
zfs send local/data@auto-20110922.1753-2h | ssh -i /data/ssh/replication 192.168.2.6 zfs receive remote/data@auto-20110922.1753-2h
NOTE: if this command fails with the error "cannot receive new filesystem stream: destination has snapshots", check the "initialize remote side for once" box in the replication task and try again. If the zfs send command still fails, open Shell on PULL and use the zfs destroy -R volume_name@snapshot_name command to delete the stuck snapshot. You can then run zfs list -t snapshot on PULL to confirm whether the snapshot replicated successfully.
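Because a manual full send has several parts that are easy to mistype, it can help to assemble the command from its pieces and review it before running it. The sketch below only prints the command. full_send_cmd is a hypothetical helper, and the destination dataset remote/data is an assumption about where the data should land on PULL; substitute your own names.

```shell
#!/bin/sh
# full_send_cmd SRC_DATASET SNAPSHOT HOST DST_DATASET
# Print (not run) a full zfs send piped over SSH to zfs receive.
full_send_cmd() {
    src=$1
    snap=$2
    host=$3
    dst=$4
    # The @ separates the dataset name from the snapshot name.
    printf 'zfs send %s@%s | ssh -i /data/ssh/replication %s zfs receive %s@%s\n' \
        "$src" "$snap" "$host" "$dst" "$snap"
}

# Example values from this section; remote/data is an assumed destination.
full_send_cmd local/data auto-20110922.1753-2h 192.168.2.6 remote/data
```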
After successfully transmitting the snapshot, check again once the interval between snapshots has elapsed to see whether the next snapshot was transmitted. If replication is still not working, you can manually send an incremental backup from the last snapshot that exists on both systems to the current one with this command (where 1853 is the latest snapshot on your system):
zfs send -i local/data@auto-20110922.1753-2h local/data@auto-20110922.1853-2h | ssh -i /data/ssh/replication 192.168.2.6 zfs receive remote/data@auto-20110922.1853-2h
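An incremental send needs the most recent snapshot that exists on both systems. As a sketch of how to find it, the helper below takes two text files, each listing snapshot names (the part after the @) one per line, and prints the newest name present in both. latest_common is a hypothetical helper. It assumes the automatic naming shown in this section, where the timestamp in the name makes alphabetical order match chronological order.

```shell
#!/bin/sh
# latest_common PUSH_LIST PULL_LIST
# Each file holds one snapshot name per line, for example gathered with
# "zfs list -H -t snapshot -o name" and keeping the part after the @.
latest_common() {
    tmp_a=$(mktemp)
    tmp_b=$(mktemp)
    # comm requires both inputs sorted with the same collation.
    LC_ALL=C sort "$1" > "$tmp_a"
    LC_ALL=C sort "$2" > "$tmp_b"
    # comm -12 keeps only lines present in both files; the last one
    # is the newest under the naming assumption above.
    LC_ALL=C comm -12 "$tmp_a" "$tmp_b" | tail -n 1
    rm -f "$tmp_a" "$tmp_b"
}
```

The name that is printed is the snapshot to pass to zfs send -i as the starting point. If nothing is printed, the two systems share no snapshot and a full send is needed instead.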