Hyper-V Backup over SAN - DP 10.04
We have a Hyper-V cluster with two hosts.
The backup job is defined to back up all VMs through the "cluster-hostname" (cluster IP) backup host.
The cluster IP can only be on one host at a time and fails over to the other node if you restart the host where it is currently deployed.
Since DP only backs up VMs directly over SAN on the node where the cluster IP is located, all other VMs on the second host have to be sent over the LAN to the backup host, which creates a bottleneck.
A few days ago I migrated all VMs to the host where the cluster IP is deployed to see how big the time difference would be, and it was huge. If all VMs are on the cluster-IP node, the backup job needs only around 8 hours; if the VMs are distributed evenly across both nodes, it takes around 15-20 hours depending on where they are located.
So my question is: is there a better way to back up my VMs and save time?
Actually this works slightly differently than you have described. In Hyper-V there is something called the coordinator role, and it plays as important a part as the DP backup host.
The technologies that allow CSV-enabled volumes to operate still require one cluster node to be responsible for coordinating file access. That node is called the coordinator node, and each individual LUN has its own coordinator node.
So... you have the Hyper-V node hosting the VM, plus the Hyper-V node acting as coordinator for the CSV that the VM data is on (these are not always the same node!), and then you also have the DP backup host and the DP Media Agent.
When the backup starts, the DP backup host (in your case, whichever cluster node is hosting the cluster VIP) figures out which cluster nodes are hosting the VMs (using VEPALIB_HYPERV) and then tells each of those nodes to quiesce its VMs and send the VM data to the DP Media Agent for backup (VSSBAR handles that part).
Just to further complicate things... once the VM is quiesced, the actual data transfer is done via the cluster node hosting the CSV coordinator role. So if that is a different cluster node than the one hosting the VM, the data is sent via that node to the backup target.
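You can see this split in your own cluster by comparing which node owns each VM with which node owns (i.e. coordinates) the CSV its disks live on. A quick sketch using the standard FailoverClusters cmdlets, run on one of the cluster nodes (output columns are illustrative):

```powershell
# Which node owns each CSV? The OwnerNode is the coordinator for that volume.
Get-ClusterSharedVolume | Select-Object Name, OwnerNode

# Which node currently hosts each clustered VM?
Get-ClusterGroup | Where-Object { $_.GroupType -eq 'VirtualMachine' } |
    Select-Object Name, OwnerNode
```

Wherever a VM's owner node differs from the owner node of its CSV, the backup data for that VM will take the extra hop described above.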
The only way (IMO) to truly automate end-to-end FC backup is with a client-side-deduplication-enabled FC backup device, so if you don't have one, this is not going to help you. If you do have one, here is what you need to do:
1. Identify the VMs that *really* need to run end-to-end FC to shorten the backup window, for example any VM larger than 500 GB. It's unrealistic to aim to move all VMs; that is just not practical. So some of the backups will still run across Ethernet.
2. Write a PowerShell script that runs every day (before your backup window starts) for the VMs identified above. It should check the cluster node hosting each VM against the cluster node acting as coordinator for the CSV containing the VM's data, and if they are not the same, change it (e.g. if the VM is hosted by ClusNode1 but its data is on a CSV coordinated by ClusNode3, either move the VM to ClusNode3 *or* move the CSV coordinator role to ClusNode1).
3. Ensure you have configured a source-side gateway in the DP B2D configuration, *and* install the DP Media Agent on all of your cluster nodes, then zone them into the disk-based backup device (e.g. the StoreOnce FC ports). Run a hardware scan on the nodes to confirm the disk-based devices appear in Device Manager. Make sure the source-side gateway is configured for FC only, or FC with IP fallback.
4. In the backup specification, use the cluster virtual resource as the "Backup Host" and select the source-side (client-side) gateway as the destination/target.
With the above in place, provided you ensure that every VM is hosted by the same cluster node that acts as coordinator for the CSV its data is on, you will see multiple cluster nodes start streaming the identified VMs' data directly from the CSV to disk, all of it end-to-end over FC.
The above requires a significant change of mindset when deploying CSVs and VMs. The customers I've implemented this for so far have had to make changes to their environments to optimise them for backup. But if you can get their commitment to do the above, it will vastly improve your backup window.
Until MS redesigns how clustered disks are accessed, this is ALWAYS going to be a real headache.
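The alignment check in step 2 could be sketched roughly like this. This is a hypothetical script, assuming the FailoverClusters and Hyper-V modules are available and that moving the CSV coordinator role (rather than live-migrating the VM) is the preferred fix; the VM names are placeholders, and you should test it carefully before scheduling it:

```powershell
# For each large VM identified in step 1, make sure the node hosting it
# also holds the coordinator role for the CSV(s) backing its disks.
$BigVMs = @('SQLVM01', 'FileVM01')   # hypothetical names from step 1

foreach ($VMName in $BigVMs) {
    # Node currently hosting the VM's cluster group
    $VMNode = (Get-ClusterGroup -Name $VMName).OwnerNode.Name

    # Paths of the VM's virtual disks (query the node that owns the VM)
    $DiskPaths = (Get-VM -Name $VMName -ComputerName $VMNode).HardDrives.Path

    foreach ($Csv in Get-ClusterSharedVolume) {
        $CsvRoot = $Csv.SharedVolumeInfo.FriendlyVolumeName
        $OnThisCsv = $DiskPaths | Where-Object { $_ -like "$CsvRoot*" }

        if ($OnThisCsv -and $Csv.OwnerNode.Name -ne $VMNode) {
            # Move the CSV coordinator role to the node hosting the VM
            Move-ClusterSharedVolume -Name $Csv.Name -Node $VMNode
        }
    }
}
```

Run daily before the backup window starts (e.g. as a scheduled task on a cluster node). Moving the coordinator role is usually less disruptive than live-migrating the VM, but either approach satisfies the alignment requirement.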
If my post was useful, please click on KUDOS!
LAN-free backups have been enhanced since Data Protector 8.11. Could you please confirm that your target devices have multipath and are also accessible to all the Hyper-V clients? If that's the case, then you will have to enable the following global variable:
This variable overrides how the BSM determines the multipath host; with a value of 1 it will always try to use a local device.
If that still doesn't work, then try LANfree=2.
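For reference, a sketch of what the setting looks like: the variable goes into the Data Protector global options file on the Cell Manager (the exact path depends on the platform; on a Windows Cell Manager it is typically under ProgramData\OmniBack\Config\Server\Options\global):

```
# Data Protector global options file (Cell Manager)
# LANfree=1 -> BSM prefers a local device when resolving the multipath host
LANfree=1
```

Changes to the global file take effect for newly started sessions, so no service restart should be needed for the next backup run.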
AFAIK the data transfer will still run via Ethernet unless the coordinator role and the VM-hosting cluster node are the same. This is because the backup "client" is always the cluster node hosting the VM (that is where vssbar.exe gets started), whereas the data transfer for VE Hyper-V is always done by the node holding the coordinator role for the CSV.
The LANfree variable is designed to help you avoid having to configure the preferred path on physical tape drives within the backup spec every time.
I'd love to be wrong here though... but I'm pretty sure I'm not...
Thanks for your long and exact answer.
You really helped me understand better how the procedure works.
Anyway, I just tried the LANfree=1 option, but without success (just as you mentioned).
There are only two servers that have around 2 TB of data to back up; all other VMs are less than 200 GB.
So I will definitely try what you wrote and see whether it can lower the duration of my backups.
Thanks a lot,