Complex (2)

I have generally found NoSQL to be a disaster. Like agile processes, it allows you to dispense with certain disciplines, but for use at scale and over time it requires you to engage in substitute disciplines. Too often these are not practiced. From a recent work chat with minor adaptation:

Data hygiene is crucial. I wouldn’t be opposed to broader NoSQL/JSON use if we used JSON schemas wherever appropriate, but at that point it is probably simpler to flatten the data into tables.

A good schema is a species of defensive code; e.g., you can have higher confidence that the value you are reaching for is actually there no matter how old the document.

See also: Complex

Connecting VMware Cloud Director with IBM Cloud VPC

IBM Cloud offers IBM–managed VMware Cloud Director through its VMware Solutions Shared offering. This offering is currently available in IBM Cloud’s Dallas and Frankfurt multi-zone regions, enabling you to deploy VMware virtual machines across three availability zones in those regions.

IBM Cloud also offers a virtual private cloud (VPC) for deployment of virtual machine and container workloads. Although VMware Cloud Director is operated in IBM Cloud’s “classic infrastructure,” it is still possible to interconnect your Cloud Director workload with your VPC workload using private network endpoints (PNEs) that are visible to your VPC.

In this article we’ll discuss how to implement this solution. It allows for bidirectional connectivity, but for illustrative purposes consider the use case of hosting an application in IBM Cloud VPC and a database in VMware Cloud Director:

Reviewing this topology from the top down:

  • Incoming traffic is handled by an IBM Cloud Load Balancer
  • The load balancer distributes connections to the applications, which in our example run on virtual server instances (VSIs) but could optionally run as Kubernetes services. The application is deployed in two zones for high availability.
  • Each zone in the VPC has a router that will tunnel traffic to and from Cloud Director using BGP over IPsec. For the purposes of this exercise we used a Red Hat Enterprise Linux 8 VSI, but you could deploy virtual gateway appliances from a vendor of your choice.
  • The VPC routers connect over the private IBM Cloud network through private network endpoints (PNEs) to edge appliances in Cloud Director.
  • The Cloud Director workload is distributed across three virtual datacenters (VDCs), one in each availability zone. Two edge services gateways (ESGs), one in each of two zones, serve as the ingress and egress points. These operate in active–standby state so that a stateful firewall can be used.
  • The database is deployed across three zones for high availability.

Caveats

The solution described here uses the IBM Cloud private network. This is a nice feature of the solution, but for reasons that may not be immediately obvious, it is also currently a requirement. If you wish to connect only a single availability zone between VCD and VPC, you could do so using a public VPN connection between your VCD edge and the IBM Cloud VPC VPN gateway service. However, the VPC VPN service currently does not support BGP peering, so it is not possible to create a highly available connection that can fail over to a different VCD edge endpoint.

Also, the solution outlined here deploys only a single router device in each VPC zone. For high availability, you likely want to deploy multiple virtual router appliances and, for routing purposes, share a virtual IP address that you reserve in your VPC subnet. At this time, IBM Cloud VPC does not support multicast or protocols other than ICMP, TCP, and UDP. These limitations exclude protocols like HSRP and VRRP; you should ensure that your router’s approach to HA is able to operate using unicast ICMP, TCP, or UDP.

Deploy your VPC resources

Create a VPC in Dallas or Frankfurt. By default the VPC creation will generate address prefixes and subnets for you; I recommend that you de-select “Create a default prefix for each zone” so that you can choose your own later:

Next, navigate to your VPC and create address prefixes of your choice:

In order to create subnets, you must navigate away from the VPC to the subnet page. In our case, since we are hosting workloads in only two zones, we needed only two subnets:

Next, create four virtual server instances (VSIs), two in each zone. Within each zone, one VSI will serve as the application and the other will serve as a virtual router. For the purposes of this example we use Red Hat Enterprise Linux 8.

You need to modify the router VSI network interfaces, either when you create the VSIs or afterwards, to enable IP spoofing. This allows the routers to forward traffic for IP addresses other than their own:

Be sure to update the operating system packages and reboot each VSI.
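
On the Red Hat Enterprise Linux 8 VSIs used in this example, that amounts to running the following as root:

dnf -y update
reboot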

Finally, create an IBM Cloud load balancer instance pointing to each of your application VSIs. Because this is a multi-zone load balancer you must use the DNS-based application load balancer:

Deploy your Cloud Director resources

Next create three VMware Solutions Shared virtual data centers (VDCs). Note that while VPC availability zones are named 1, 2, and 3, VDC availability zones are named according to the IBM Cloud classic infrastructure data center names. Thus, we will deploy to Dallas 10, 12, and 13, which correspond to the three VDC zones:

After creating your three virtual data centers, you need to view any one of these VDCs and reset the administrator password to gain access to the single Cloud Director organization for your account. Using this administrator account you can create additional users and optionally integrate with your own SSO provider:

Next, use these credentials to log in to the Cloud Director console. We will create a Data Center Group and assign all three of our VDCs to it so that they have a shared stretched network and network egress. Navigate to Data Centers | Data Center Groups and create a new data center group. Ensure that you select the “Create Local Group” option; although the VDCs are actually in different availability zones, they are designated in the same fault domain from a Cloud Director perspective, and we will use active–standby routing. There is only one network pool available for you to use:

After creating the data center group, create a stretched network that will be shared by all three VDCs:

Add your DAL10 edge as the active egress point, and your DAL12 edge as the passive egress point:

Next, navigate to each of your VDCs, view the stretched network, and create an IP pool for each VDC that is a subset of your stretched network:

Next, configure your DAL10 and DAL12 edges (see the IBM Cloud docs for details) to allow and to SNAT egress traffic from your VDCs to the IBM Cloud service network (e.g., for DNS and Red Hat Satellite) and to the public network. If you wish to DNAT traffic from the public internet to reach your virtual machines, keep in mind that the DAL10 edge is the active edge and you should not use DAL12 for ingress except in case of DAL10 failure.

At a minimum, you want your workload to reach the IBM Cloud private service network, which includes 52.117.132.0/24 and 161.26.0.0/16. Because we are using private network endpoints (PNEs) you also need to permit 166.9.0.0/15; this address range is also used by any other IBM Cloud services offering private endpoints. For this example I simply configured the edge firewalls to permit all outbound traffic to both the private and public networks:

You must configure an SNAT rule for the private service network (note that this rule is created on the service interface):

and, if needed, an SNAT rule for the public network (note that this rule is created on the external interface):

Next, create the virtual machines that will serve as your database, one in each VDC. For the purposes of this example, we deployed RHEL 8 virtual machines from the provided templates and connected them to IBM Cloud’s Red Hat Satellite server following the directions in the /etc/motd file. There are a few caveats to the deployment:

  • You should connect the virtual machine interfaces to the stretched network before starting them so that the network customization configures their IP address. Choose an IP address from the pool you created earlier.
  • At first power-on, you should “power on and force recustomization;” afterwards you can view the root password from the customization properties.
  • When using a stretched network, customization does not set the DNS settings for your virtual machines. For RHEL we entered the IBM Cloud DNS servers into /etc/sysconfig/network-scripts/ifcfg-ens192 as follows, and then applied the change as shown after this list:
DNS1=161.26.0.10
DNS2=161.26.0.11
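
To apply the DNS change, reload and re-activate the connection (ens192 is the interface and connection name used in this example; your connection name may differ):

nmcli connection reload
nmcli connection up ens192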

Configure BGP over IPsec connectivity between VCD and VPC

In order to expose your Cloud Director edges to your VPC using the IBM Cloud private network, you must create private network endpoints (PNEs) for your DAL10 and DAL12 VDCs. First, in the IBM Cloud console, view your VPC details. A panel on that page lists the “Cloud Service Endpoint service addresses.” These addresses are not visible within your VPC, but they represent your VPC’s traffic to service endpoints, and you will need to permit them to access your PNEs. Take note of these addresses:

Now, navigate to your DAL10 and DAL12 VDCs in the IBM Cloud console and click “Create a private network endpoint.” Select the device type of your choice and enter the IP addresses you noted above:

The PNE may take some time to create because it is an operator-assisted activity. After it has been successfully created, you will need to create a second PNE in each of the two zones. A second PNE is needed because the PNE hides the source IP address of incoming connections, so we cannot configure policies for two different IPsec tunnels using the same PNE. The IBM Cloud console does not allow you to create a second PNE yourself, so you must open a support ticket with the VMware Solutions team. Phrase your ticket as follows:

Hi, I have already created a PNE for my VCD edges edge-dal10-xxxxxxxx and edge-dal12-yyyyyyyy. Please create a second service IP for each of these edges with an additional PNE for each edge. Please use the same whitelist for the existing PNEs. Thank you!

Note that in our example we are connecting only Dallas 1 and Dallas 2 zones from our VPC to Cloud Director. If you wanted to connect Dallas 3 as well, you would need to request three rather than two PNEs for each of your DAL10 and DAL12 edges.

Now we need to configure each of our two NSX edges and our two VPC routers to have dual BGP over IPsec connections to their peers. You need to select which PNE will be used for each VPC router connection.

On the VCD side, the IPsec VPN site configuration for one of the VPC routers looks as follows. In this case, the 52.x address is the PNE’s “service network IP” and the 166.x address is the PNE’s “private network IP:”

And the corresponding BGP configuration is as follows:

Finally, you must be sure to permit the VCD and VPC interconnectivity in both edge firewalls:

For the purposes of this example we are using RHEL 8 VSIs as simple routers on the VPC side. First, we need to modify /etc/sysctl.conf to allow IP forwarding:

net.ipv4.ip_forward = 1

And then turn it on dynamically:

[root@smoonen-router1 ~]# echo 1 >/proc/sys/net/ipv4/ip_forward
[root@smoonen-router1 ~]#

Next we installed the libreswan package for IKE/IPsec support, and the frr package for BGP support.
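
Both packages can be installed from the standard RHEL 8 repositories (frr ships in the AppStream repository on recent RHEL 8 releases):

dnf -y install libreswan frr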

In order to use dynamic routing, the IPsec tunnel must be configured using a virtual tunnel interface (VTI). The IPsec configuration for our Dallas 1 router is as follows. The left and leftid values are the address and identity of the router appliance itself. The right value has been obscured; it reflects the address of the VCD edge as known to the router; this is the PNE’s “private network IP.” The rightid value has also been obscured; it reflects the identity of the VCD edge, which we have previously set to the PNE’s “service network IP:”

# Connection to ESG1
conn routed-vpn-esg1
    left=192.168.1.4
    leftid=192.168.1.4
    right=166.9.xx.xx
    rightid=52.117.xx.xx
    authby=secret
    leftsubnet=0.0.0.0/0
    rightsubnet=0.0.0.0/0
    leftvti=10.10.10.1/30
    auto=start
    ikev2=insist
    ike=aes128-sha256;modp2048
    mark=5/0xffffffff
    vti-interface=vti01
    vti-shared=no
    vti-routing=no

# Connection to ESG2
conn routed-vpn-esg2
    left=192.168.1.4
    leftid=192.168.1.4
    right=166.9.yy.yy
    rightid=52.117.yy.yy
    priority=2000
    authby=secret
    leftsubnet=0.0.0.0/0
    rightsubnet=0.0.0.0/0
    leftvti=10.10.10.5/30
    auto=start
    ikev2=insist
    ike=aes128-sha256;modp2048
    mark=6/0xffffffff
    vti-interface=vti02
    vti-shared=no
    vti-routing=no

Note that the two tunnels use different marks and VTI interfaces. Next, in /etc/frr/daemons, enable bgpd:

bgpd=yes

Then define your tunnel interfaces in /etc/frr/zebra.conf; these are the interfaces for our Dallas 1 router:

!
interface vti01
ip address 10.10.10.1/30
ipv6 nd suppress-ra
!
interface vti02
ip address 10.10.10.5/30
ipv6 nd suppress-ra

Finally, configure BGP in /etc/frr/bgpd.conf:

hostname smoonen-router1
router bgp 64555
 bgp router-id 10.10.10.1
  network 10.10.10.0/30
  network 10.10.10.4/30
  network 192.168.1.0/24
  neighbor 10.10.10.2 remote-as 65010
  neighbor 10.10.10.2 route-map RMAP-IN in
  neighbor 10.10.10.2 route-map RMAP-OUT out
  neighbor 10.10.10.2 soft-reconfiguration inbound
  neighbor 10.10.10.2 weight 2
  neighbor 10.10.10.6 remote-as 65010
  neighbor 10.10.10.6 route-map RMAP-IN in
  neighbor 10.10.10.6 route-map RMAP-OUT out
  neighbor 10.10.10.6 soft-reconfiguration inbound
  neighbor 10.10.10.6 weight 1

ip prefix-list PRFX-VCD seq 5 permit 172.16.0.0/12 le 32
ip prefix-list PRFX-VPC seq 5 permit 192.168.0.0/16 le 32

route-map RMAP-IN permit 10
 match ip address prefix-list PRFX-VCD
route-map RMAP-OUT permit 10
 match ip address prefix-list PRFX-VPC

log file /var/log/frr/bgpd.log debug

Taken together, we have configured:

  • Cloud Director to use DAL10 as the active egress and DAL12 as the standby egress
  • The Cloud Director edges to advertise the entire stretched network (172.16.1.0/24) to the VPC routers
  • Each VPC router to prefer the DAL10 edge
  • Each VPC router to advertise its own zone’s subnet (192.168.1.0/24 or 192.168.2.0/24) to the Cloud Director edges

Now enable IPsec and FRR:

systemctl start ipsec
systemctl enable ipsec
ipsec auto --add routed-vpn-esg1
ipsec auto --add routed-vpn-esg2
ipsec auto --up routed-vpn-esg1
ipsec auto --up routed-vpn-esg2

chown frr:frr /etc/frr/bgpd.conf
chown frr:frr /etc/frr/staticd.conf
systemctl start frr
systemctl enable frr
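
At this point you can verify that the IPsec tunnels and BGP sessions are established. For example, using the connection names configured above (vtysh is provided by the frr package):

# Confirm that both tunnels are up
ipsec status | grep routed-vpn

# Confirm the BGP sessions and the routes learned from the VCD edges
vtysh -c 'show ip bgp summary'
vtysh -c 'show ip route bgp'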

Finally, you need to visit the IBM Cloud console and find the route table configuration for your VPC:

Modify the route table configuration to direct the VCD networks to your router VSI in each zone. Remember that for this example we are hosting applications only in two zones:

After the tunnel is up and the initial BGP exchange is complete, you should have bidirectional connectivity between the two environments. Here is a ping from one of our application VSIs:

[root@smoonen-application1 ~]# ping -c 3 -I 192.168.1.5 172.16.1.10
PING 172.16.1.10 (172.16.1.10) from 192.168.1.5 : 56(84) bytes of data.
64 bytes from 172.16.1.10: icmp_seq=1 ttl=61 time=3.21 ms
64 bytes from 172.16.1.10: icmp_seq=2 ttl=61 time=2.34 ms
64 bytes from 172.16.1.10: icmp_seq=3 ttl=61 time=2.87 ms

--- 172.16.1.10 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 2.344/2.809/3.210/0.356 ms
[root@smoonen-application1 ~]#

We have not tuned BGP, but even so, if we disable BGP on the DAL10 edge (this effectively severs both its connection to the stretched network and its connection to the VPC), we see that connectivity from the VPC fails over to the DAL12 edge:

64 bytes from 172.16.1.10: icmp_seq=16 ttl=61 time=2.51 ms
64 bytes from 172.16.1.10: icmp_seq=17 ttl=61 time=16.9 ms
64 bytes from 172.16.1.10: icmp_seq=18 ttl=61 time=2.63 ms
64 bytes from 172.16.1.10: icmp_seq=137 ttl=61 time=8.52 ms
64 bytes from 172.16.1.10: icmp_seq=138 ttl=61 time=6.06 ms
64 bytes from 172.16.1.10: icmp_seq=139 ttl=61 time=5.07 ms

Conclusion

We have successfully established bidirectional connectivity over the IBM Cloud private network between VMware Cloud Director and IBM Cloud VPC using BGP over IPsec.

As described above, it is possible to extend this solution by deploying a router appliance in the third VPC availability zone, in which case you would need to deploy two more PNEs, one for each of your VCD edges. Also, you will need additional PNEs if you deploy more than one router appliance into each zone for HA. Thus, you could require up to twelve PNEs (two router appliances in each of three zones, each of which has a connection to two VCD edges).

Many thanks to Mike Wiles and Jim Robbins for their assistance in developing this solution.

How to bring your team back to the office (2)

I wrote previously on bringing your team back to the office.

Aaron Renn (you should follow him) makes an interesting point about remote work:

Of course this is only one consideration among several. For a long time I have been keen to return to the office; I miss the productive and rewarding personal interaction with my local teammates (although my team is spread around the world). And until 2020, my employer had long expected software teams to be generally present in the office, recognizing the real benefits of face-to-face collaboration.

However, I really appreciate the affordances of remote work, and I would be glad to have a hybrid approach. My company is still silent on how and when teams will return to the office.

I found this fascinating:

The linked article reveals quite a contradictory mess among employers. I think this proves Taleb’s point well. The whole system is a ball of anxiety in the sense that Edwin Friedman uses. Employees and especially employers are locked in an anxious state, and creating a corporate culture by soliciting employee engagement simply perpetuates this situation; anxiety is only amplified when it is solicited and coddled.

There are some truly impressive actors out there, but the crises of the past year have revealed the fear behind the mask. This kind of anxiety among leaders will metastasize if it is not done away with. Decisive leadership is willing to undergo a little short term pain for the sake of long term health.

deGoogling, part 1b

In my previous post I summarized my move from Gmail to Fastmail. After making this move, I noticed that from time to time my iPhone battery drained quickly, with background activity by the Calendar application by far the largest consumer of battery life.

The Fastmail team took a look at what was going on and indicated that, strangely, my macOS client was frequently changing the order of calendars. This explained why I saw the phone battery drain only from time to time: when my Mac was awake.

The macOS configuration doesn’t provide much insight into or control over this. I am no expert on iCalendar, but that doesn’t prevent me from fashioning two naïve theories about what was going on. First, possibly the Fastmail implementation doesn’t handle ordering of shared calendars well; this seems unlikely, since I share a calendar with my wife and her phone was unaffected. Second, possibly macOS doesn’t handle calendars with duplicate names well (I had ended up with two calendars named Birthdays).

I condensed my list of calendars significantly and no longer see the issue.

deGoogling, part 1

I’m engaging in a slow effort to decouple from Google.

My first step was to decouple email, calendar, and contacts. For a long time I’ve been using Fastmail as a mail server for my non–Gmail domains, and then fetching this mail into Gmail. Before deciding to reverse this flow, I briefly considered ProtonMail and Mailbox.org. While I appreciate the data residency and security of ProtonMail, it seems that the added security creates some connectivity challenges. As for Mailbox.org, there were no compelling features that motivated me to switch from Fastmail.

Fastmail provides direct integration with Gmail to import mail, contacts, and calendars. Before doing this, I first disabled Gmail’s email import from my Fastmail account. When I first attempted to import my Gmail content into Fastmail, I was surprised by the storage estimate. Google reported that Gmail was using about 4GB of my Google storage, and this matched the size of email data when I downloaded all of my email from Google Takeout. However, Fastmail reported that there was an estimated 116GB of email that would be downloaded from Google using IMAP. This was quite disconcerting to me, partly because Google indicates that they may throttle IMAP downloads to 2.5GB per day, and partly because Fastmail charges based on my storage usage.

The Fastmail import from Gmail may create duplicate email messages for any emails that have multiple labels, so my first thought was to ensure that I had no emails with multiple labels. However, this did not affect Fastmail’s download estimate. My second thought was to sift through old archived emails that I no longer needed; while this helped, it did not help significantly.

After working with Fastmail support, I learned that the IMAP import size estimate is based on the entire storage usage that Google reports, which includes Google Drive; I currently have over 100GB of data on my Google Drive (stay tuned). After crossing my fingers, I went ahead and imported my email. Google’s 4GB of email was fully imported in just under an hour, after which Fastmail reported that I had about 4GB of email.

When I imported my own email, for some reason the Gmail Sent and Spam folders appeared as custom folders within Fastmail, and I had to move the emails to Fastmail’s equivalent folders. However, when I did this for my wife’s email, these two folders transferred seamlessly.

I appreciate that the keyboard shortcuts in Fastmail’s web interface generally match those of Gmail. I also appreciate how easy Fastmail makes it to configure Apple devices. So far the experience has been relatively seamless for me, and I haven’t noticed any problems at all with my contacts or my calendar.

Managing Veeam backup encryption using IBM Cloud key management

Veeam Backup and Replication offers the ability to encrypt your backups using passwords, which function as a kind of envelope encryption key for the encryption keys protecting the actual data. Veeam works hard to protect these passwords from exposure, to the degree that Veeam support cannot recover your passwords. You can ensure the resiliency of these passwords either with a password–encrypted backup of your Veeam configuration, or by using Veeam Backup Enterprise Manager, which can protect and recover these passwords using an asymmetric key pair that it manages. However, neither of these options allows integration with an external key manager for key storage and lifecycle. As a result, you must implement automation if you want to achieve Veeam backup encryption without your administrators and operators having direct knowledge of your encryption passwords. Veeam provides a set of PowerShell encryption cmdlets for this purpose.

In this article, I will demonstrate how you can use IBM Cloud Key Protect or IBM Cloud Hyper Protect Crypto Services (HPCS) to create and manage your Veeam encryption passwords.

Authenticating with the IBM Cloud API

Our first step is to use an IBM Cloud service ID API key to authenticate with IBM Cloud IAM and obtain a limited–time token that we will provide as our authorization for Key Protect or HPCS APIs. For this purpose we will use IBM Cloud’s recently released private endpoint for the IAM token service, which allows us to avoid connection to the public internet provided we have enabled VRF and service endpoints in our account.

# Variables

$apikey = '...'

# URIs and script level settings

$tokenURI = 'https://private.iam.cloud.ibm.com/identity/token'
$ErrorActionPreference = 'Stop'
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12

# Exchange IBM Cloud API key for token

$headers = @{Accept='application/json'}
$body = @{grant_type='urn:ibm:params:oauth:grant-type:apikey'; apikey=$apikey}
$tokenResponse = Invoke-RestMethod -Uri $tokenURI -Method POST -Body $body -Headers $headers

# Bearer token is now present in $tokenResponse.access_token

This token will be used in each of the following use cases.

Generating a password

In order to generate a new password for use with Veeam, we will use this token to call the Key Protect or HPCS API to generate an AES256 key and “wrap” (that is, encrypt) it with a root key. The service ID associated with our API key above needs Reader access to the Key Protect or HPCS instance to perform this operation. The following example uses the Key Protect private API endpoint; if you are using HPCS you will have a private API endpoint specific to your instance that looks something like https://api.private.us-south.hs-crypto.cloud.ibm.com:12345. In this script we use a pre–selected Key Protect or HPCS instance (identified by $kms) and root key within that instance (identified by $crk).

# Variables

$kms = 'nnnnnnnn-nnnn-nnnn-nnnn-nnnnnnnnnnnn'
$crk = 'nnnnnnnn-nnnn-nnnn-nnnn-nnnnnnnnnnnn'

# URIs and script level settings

$kmsURIbase = 'https://private.us-south.kms.cloud.ibm.com/api/v2/keys/'
$ErrorActionPreference = 'Stop'
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12

# Perform wrap operation with empty payload to generate an AES 256 key that will be used as password

$headers = @{Accept='application/json'; 'content-type'='application/vnd.ibm.kms.key_action_wrap+json'; 'bluemix-instance'=$kms; Authorization=("Bearer " + $tokenResponse.access_token); 'correlation-id'=[guid]::NewGuid()}
$body = @{}
$wrapResponse = Invoke-RestMethod -Uri ($kmsURIbase + $crk + "/actions/wrap") -Method POST -Body (ConvertTo-Json $body) -Headers $headers

# Plaintext key is present in $wrapResponse.plaintext, and wrapped key in $wrapResponse.ciphertext

After generating the key, we create a new Veeam password with that content. The output of the wrap operation includes both the plaintext key itself and also the wrapped form of the key. Our password can only be extracted from this wrapped ciphertext by someone who has sufficient access to the root key. We should store this wrapped form somewhere for recovery purposes; for the purposes of this example I am storing it as the password description together with a name for the password, $moniker, which in the full script is collected earlier from the script parameters.

$plaintext = ConvertTo-SecureString $wrapResponse.plaintext -AsPlainText -Force
$wdek = $wrapResponse.ciphertext
Remove-Variable wrapResponse

# Store this key as a new Veeam encryption key. Retain it in base64 format for simplicity.

Add-VBREncryptionKey -Password $plaintext -Description ($moniker + " | " + $wdek)

Write-Output ("Created new key " + $moniker)

You can see the full example script create-key.ps1 in GitHub.

Re–wrap a password

Because Veeam does not directly integrate with an external key manager, we have extra work to do if we want to respond to rotation of the root key, or to cryptographic erasure. The following code uses the rewrap API call to regenerate the wrapped form of our key in case the root key has been rotated. This ensures that our backup copy of the key is protected by the latest version of the root key.

# Perform rewrap operation to rewrap our key
# If this operation fails, it is possible your root key has been revoked and you should destroy the Veeam key

$headers = @{Accept='application/json'; 'content-type'='application/vnd.ibm.kms.key_action_rewrap+json'; 'bluemix-instance'=$kms; Authorization=("Bearer " + $tokenResponse.access_token); 'correlation-id'=[guid]::NewGuid()}
$body = @{ciphertext=$wdek}
$rewrapResponse = Invoke-RestMethod -Uri ($kmsURIbase + $crk + "/actions/rewrap") -Method POST -Body (ConvertTo-Json $body) -Headers $headers

Note that this API call will fail with a 4xx error in cases that include the revocation of the root key. In this case, if the root key has been purposely revoked, it is appropriate for you to remove your Veeam password to accomplish the cryptographic erasure. However, assuming that the rewrap is successful, we should update our saved copy of the wrapped form of the key to this latest value. In this example, $key is a PSCryptoKey object that was earlier collected from the Get-VBREncryptionKey cmdlet, and represents the key whose description will be updated:

$newWdek = $rewrapResponse.ciphertext
Remove-Variable rewrapResponse

# Update the existing description of the Veeam encryption key to reflect the updated wrapped version

Set-VBREncryptionKey -EncryptionKey $key.Description -Description ($moniker + " | " + $newWdek)

Write-Output ("Rewrapped key " + $moniker)

You can see the full example script rewrap-key.ps1 in GitHub.
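
For reference, here is a minimal sketch (not part of the linked scripts) of how $key and $wdek might be gathered before the rewrap, assuming the “name | wrapped-key” description format established by create-key.ps1:

# Locate the Veeam encryption key whose description begins with our key name ($moniker)
$key = Get-VBREncryptionKey | Where-Object { $_.Description -like ($moniker + " | *") }

# Recover the wrapped (ciphertext) form of the key from the description
$wdek = ($key.Description -split ' \| ', 2)[1]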

Recover a password

Within a single site the above approach is sufficient. For additional resilience, you can use Veeam backup copy jobs to copy your data to a remote location. If you have a Veeam repository in a remote site and you lose the VBR instance and repositories in your primary site, Veeam enables you to recover VBR in the remote site from an encrypted configuration backup, after which you can restore backups from the repository in that site.

However, you need to plan carefully for recovery not only of your data but also of your encryption keys. Ideally, you would protect both the Veeam configuration backup and the VM backups using keys that are protected by IBM Cloud Key Protect or HPCS. This means that for configuration backups and for remote backups you should choose a Key Protect or HPCS instance in the remote location, so that your key management in the remote site is not affected by a failure of the original site. You might therefore use two key manager instances: a local instance for keys protecting the local backup jobs used for routine recovery operations, and a remote instance for keys protecting your configuration backup and the backup copy jobs used in case of disaster.

This also implies that the key used to protect your configuration backups should be preserved in a location other than your VBR instance and in a form other than a Veeam key object; in fact, the Veeam configuration restore process requires you to enter the password manually. You should store the key in its secure wrapped form, ideally alongside your Veeam configuration backup. You will then need to unwrap the key when you restore the configuration. In this example, the wrapped form of the key is expected as one of the script arguments; this underscores the need to protect it with a key manager that will still be available in case of the original site failure:

# Perform unwrap operation

$headers = @{Accept='application/json'; 'content-type'='application/vnd.ibm.kms.key_action_unwrap+json'; 'bluemix-instance'=$kms; Authorization=("Bearer " + $tokenResponse.access_token); 'correlation-id'=[guid]::NewGuid()}
$body = @{ciphertext=$args[0]}
$unwrapResponse = Invoke-RestMethod -Uri ($kmsURIbase + $crk + "/actions/unwrap") -Method POST -Body (ConvertTo-Json $body) -Headers $headers

Write-Output ("Plaintext key: " + $unwrapResponse.plaintext)

Because this exposes your key to your administrator or operator, after restoring VBR from configuration backup, you should generate a new key for subsequent configuration backups.

You can see the full example script unwrap-key.ps1 in GitHub.

Summary

In this article, I’ve shown how you can use IBM Cloud key management APIs to generate and manage encryption keys for use with Veeam Backup and Replication. You can see full examples of the scripts excerpted above in GitHub. These scripts are basic examples intended to be extended and customized for your own environment. You should take special care to consider how you manage and protect your IBM Cloud service ID API keys, and how you save and manage the wrapped form of the keys generated by these scripts. Most likely you would store all of these in your preferred secret manager.

Complex

I’ve joked for a while that EDR and similar systems like CrowdStrike or Carbon Black would become Skynet. Or, more likely, a tool of international espionage and cyber–warfare. It doesn’t feel good to be vindicated. (Now imagine someone accomplishing this with a major browser or password manager application.) Complex and highly interconnected systems are difficult to make stable, resilient, or secure, and cannot possibly be made anti-fragile. (This is a lesser reason why I miss my old pickup truck.) I’m not very excited about Kubernetes for the same reason. It’s also partly why I’m not very excited about artificial intelligence; but additionally because analysis of data, whether by machine or by human, does not automatically involve either wisdom or decisiveness (see also Edwin Friedman and Nassim Taleb).

Crossposted on I gotta have my orange juice.