From VMware to IBM Cloud VPC VSI, part 4: Backup and restore

See all blog posts in this series:

  1. From VMware to IBM Cloud VPC VSI, part 1: Introduction
  2. From VMware to IBM Cloud VPC VSI, part 2: VPC network design
  3. From VMware to IBM Cloud VPC VSI, part 3: Migrating virtual machines
  4. From VMware to IBM Cloud VPC VSI, part 4: Backup and restore
  5. From VMware to IBM Cloud VPC VSI, part 5: VPC object model
  6. From VMware to IBM Cloud VPC VSI, part 6: Disaster recovery

IBM Cloud provides two backup offerings that are relevant to the backup of your applications running on IBM Cloud VSI:

  1. For catastrophic VM failures, leverage IBM Cloud Backup for VPC to perform crash-consistent volume backup and recovery for entire virtual machines
  2. For recovery of specific files and folders, leverage IBM Cloud Backup and Recovery’s agent-based “physical server” backup capabilities to back up part or all of your VSI filesystems, and restore selected files or folders either to the original location or a new location

Because of these offerings’ complementary focus on volume backup versus file backup, you will need to combine the two of them to cover all failure scenarios. Let’s consider their capabilities and limitations in turn.

Backup for VPC—volume snapshots

If you navigate the IBM Cloud “hamburger menu” to Infrastructure | Storage | Backup policies, you can create backup policies. These backup policies allow you to select one or more volumes (block or file) by volume tagging criteria, or to select the volumes for one or more VSIs by VSI tagging criteria. You can define up to four schedules for the snapshots, meaning that you could, for example, have a daily schedule with 7 days of retention plus a weekly schedule with 90 days of retention. IBM Cloud maintains the snapshots in a space-efficient chain. Unlike VMware snapshots, IBM Cloud’s block storage snapshots exist in a separate chain from the VSI volumes and the VSI boot image. IBM Cloud’s snapshots remain intact even if the VSI is deleted. There are some important things to be aware of:

  1. You can also schedule your own automation to create point-in-time snapshots of either individual volumes or consistent snapshots of sets of volumes for individual VSIs. You could use this approach, for example, if you want to quiesce a database or filesystem prior to taking the snapshot.
  2. Snapshots for VSI volumes (either standalone snapshots or snapshots created via backup policy) are write-order-consistent across all of the volumes for the VSI. However, it is not currently possible to ensure write-order consistency across multiple VSIs unless you employ your own out-of-band approach to quiesce database or filesystem writes during the snapshot.
  3. Consistency groups are not currently supported for the second-generation sdp volume profiles. I hope for this to change over time, but for now I recommend against using these profiles for volumes that you need to snapshot consistently.
  4. If you plan to leverage the Backup for VPC service to create backup policies rather than directly invoking snapshots yourself, you need to create a set of four service-to-service authorizations between it and several other resources in your VPC, so that it can perform the snapshot scheduling and execution. It is important to create the specific detailed authorizations to these four resource types; in my experience, creating a blanket authorization for Backup for VPC to manage all of VPC Infrastructure Services resources does not work.
  5. If you require a resilient backup outside of the region where your VSI is running, it is possible to copy individual snapshots (one by one, not as part of their consistency groups) to another region. You can automate this process to ensure that some number of your backups are available outside of your region.
  6. As the number of VSIs and volumes grows, multiplied by your backup policy, you will find that visualizing and managing your consistency groups and snapshots in the IBM Cloud UI becomes unwieldy. You will likely need to build your own automation for visualizing and managing these.
  7. The process described here only backs up your VSI volumes. Additional VSI configuration such as its name, instance profile, VNI and IP address, security group, floating IP, etc. is not backed up by this approach. In order to restore your data, you will need to create a new VSI from the volume(s) or from each consistency group, and you will need to reconstitute all of the additional configuration for the VSI. You will likely need to build your own approach to recording this data and automating the restoration of VSIs if you expect to need to restore VSIs at scale.
  8. There is a concept of fast restore where select snapshots are copied from the regional backup storage to zonal block storage for fast provisioning. However, fast restore is available only for snapshots of individual volumes, not for VSI consistency groups.
  9. Consult the IBM Cloud docs for additional limitations on backup policies.
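
As item 7 notes, snapshots capture only volume data, so you need your own record of each VSI's remaining configuration. The following is a minimal sketch of one way to keep such a record, assuming you gather the values yourself (for example from your provisioning automation). The `VsiRecord` type, its field names, and the sample values are hypothetical illustrations, not part of any IBM Cloud API.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class VsiRecord:
    """Hypothetical record of VSI settings that volume snapshots do not capture."""
    name: str
    instance_profile: str
    zone: str
    primary_ip: str
    security_groups: List[str] = field(default_factory=list)
    floating_ip: Optional[str] = None
    volume_names: List[str] = field(default_factory=list)


def save_records(records: List[VsiRecord]) -> str:
    """Serialize the records to JSON so they can be stored alongside the backups."""
    return json.dumps([asdict(r) for r in records], indent=2, sort_keys=True)


def load_records(text: str) -> List[VsiRecord]:
    """Rebuild the records from JSON when it is time to recreate the VSIs."""
    return [VsiRecord(**item) for item in json.loads(text)]


# Illustrative values only; substitute the details of your own VSIs.
example = VsiRecord(
    name="app-server-1",
    instance_profile="bx2-2x8",
    zone="us-south-1",
    primary_ip="10.240.0.5",
    security_groups=["app-sg"],
    floating_ip=None,
    volume_names=["app-server-1-boot", "app-server-1-data"],
)

# Round-trip the record through JSON, as a restore procedure would.
restored = load_records(save_records([example]))
```

Storing this JSON somewhere that survives the loss of the VSI (and ideally the region) gives your restore automation what it needs to recreate the instance around the restored volumes.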

Whole-volume and whole-VSI backup and restore is quite heavy-handed for some backup scenarios such as recovery of deleted files. For your convenience, you may wish to complement the Backup for VPC capabilities by also using Backup and Recovery.

Backup and Recovery of files and folders

The Backup and Recovery offering leverages agents running on your VSIs (although it calls them “physical servers”) to back up files, folders, and certain databases. IBM publishes a list of currently supported operating system and database versions.

The steps you will follow to leverage this service are:

  1. Open the IBM Cloud “hamburger menu” and navigate to Backup and Recovery | Backup service instances
  2. Create an instance in the region of your choice. Note that Backup and Recovery currently integrates with HPCS encryption keys, but support for Key Protect encryption keys is forthcoming. If you don’t configure your own key, your backup storage will be encrypted using an IBM-managed key.
  3. Click on the details of your newly created instance and then click on Launch dashboard to log in to your instance’s dashboard. Although the dashboard leverages the same credentials you use for IBM Cloud, it does not have dynamic SSO integration, which means that you will need to log in a second time.
  4. Navigate to System | Data Source Connections and create a logical connection representing your VPC. If you have multiple VPCs you should create a connection for each one.
  5. Now you need to perform some steps within your IBM Cloud VPC:
    • Prepare your VPC by creating a virtual private endpoint (VPE) gateway for the Cloud Object Storage (COS) service in the same region as your VPC. You may already have such a gateway if you have provisioned an IBM Cloud Kubernetes instance into your VPC.
    • Prepare your VPC by creating a VPE gateway for your new Backup and Recovery instance.
    • Follow the instructions to create one or more VSI connectors within your VPC that will serve as the data movers for the logical connection that you created in the previous step. IBM recommends that you create at least two VSI connectors for high availability, and recommends as a rule of thumb that you create one connector for every 10 VSIs that you will be backing up simultaneously.
    • It’s important to ensure that your security groups allow these connector VSIs to communicate with the two VPEs and with your workload VSIs. By default this is normally the case; however, I found that the COS VPE that had previously been created by my IBM Cloud Kubernetes Service did not share a security group with my connector VSIs. This prevented my backups from succeeding until I corrected the problem.
  6. Now you need to prepare the workload VSIs that you will be backing up. You manage this within the Data Protection | Sources view in your Backup and Recovery dashboard. From this view, you will download the agents for your VSI, then register each VSI as a “physical server.” In the process of doing this, you will associate it with the logical connection that you created previously. This will trigger your connectors to connect to the agents on your VSI. If you are carefully managing the firewall on your VSI, you will want to take note of the ports you need to open between the connectors and your workload VSIs.
  7. Now you can manage your backup schedule in the Data Protection | Policies view, and schedule backup jobs leveraging these policies by creating protection groups in the Data Protection | Protection view. By default the system will back up all files on your VSI, but you can also select lists of folders either to include or to exclude.
  8. Finally, you can recover files and folders using the Data Protection | Recoveries view. You have the option of downloading them through your browser, restoring them to the original system (either in the original location or in a temporary folder), or restoring them to an alternative system that is running the backup agent and is registered in Data Protection | Sources.
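
The connector sizing guidance in step 5 (at least two connectors, and roughly one connector for every 10 VSIs backed up simultaneously) can be turned into a quick calculation. This is only a sketch of that rule of thumb; the function name and its parameters are my own, and your actual sizing may differ depending on data volume and backup windows.

```python
import math


def recommended_connectors(simultaneous_vsis: int,
                           vsis_per_connector: int = 10,
                           minimum: int = 2) -> int:
    """Apply the rule of thumb: one connector per 10 simultaneous VSIs, never fewer than two."""
    if simultaneous_vsis < 0:
        raise ValueError("simultaneous_vsis must not be negative")
    # Round up so that a partially filled group of VSIs still gets a connector.
    by_load = math.ceil(simultaneous_vsis / vsis_per_connector)
    # The high-availability floor applies even to very small environments.
    return max(minimum, by_load)


small = recommended_connectors(8)    # the high-availability minimum dominates
medium = recommended_connectors(25)  # 25 VSIs round up to three connectors
large = recommended_connectors(45)   # 45 VSIs round up to five connectors
```

If your protection groups are staggered so that fewer VSIs are backed up at the same time, pass the peak concurrent count rather than the total number of VSIs.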

As always, be sure to thoroughly review the documentation to familiarize yourself with other considerations such as alerting.
