deGoogling, part 1b

In my previous post I summarized my move from Gmail to Fastmail. After making this move, I noticed that from time to time my iPhone battery drained quickly, with background activity by the Calendar application by far the biggest consumer of battery life.

The Fastmail team took a look at what was going on and indicated that, strangely, my macOS client was frequently changing the order of calendars. This explained why I saw the phone battery drain only from time to time: when my Mac was awake.

The macOS configuration doesn’t provide much insight into or control over this. I am no expert on iCalendar, but that didn’t prevent me from fashioning two naïve theories about what was going on. First, possibly the Fastmail implementation doesn’t handle the ordering of shared calendars well; this seems unlikely, since I share a calendar with my wife and her phone was unaffected. Second, possibly macOS doesn’t handle calendars with duplicate names well (I had ended up with two calendars named Birthdays).

I condensed my list of calendars significantly and no longer see the issue.

deGoogling, part 1

I’m engaging in a slow effort to decouple from Google.

My first step was to decouple email, calendar, and contacts. For a long time I’ve been using Fastmail as a mail server for my non-Gmail domains, and then fetching this mail into Gmail. Before deciding to reverse this flow, I briefly considered ProtonMail and Mailbox.org. While I appreciate ProtonMail’s data residency and security, it seems that the added security creates some connectivity challenges. As for Mailbox.org, there were no compelling features that motivated me to switch away from Fastmail.

Fastmail provides direct integration with Gmail to import mail, contacts, and calendars. Before doing this, I first disabled Gmail’s email import from my Fastmail account. When I first attempted to import my Gmail content into Fastmail, I was surprised by the storage estimate. Google reported that Gmail was using about 4GB of my Google storage, and this matched the size of email data when I downloaded all of my email from Google Takeout. However, Fastmail reported that there was an estimated 116GB of email that would be downloaded from Google using IMAP. This was quite disconcerting to me, partly because Google indicates that they may throttle IMAP downloads to 2.5GB per day, and partly because Fastmail charges based on my storage usage.

The Fastmail import from Gmail may create duplicate email messages for any emails that have multiple labels, so my first thought was to ensure that I had no emails with multiple labels. However, this did not affect Fastmail’s download estimate. My second thought was to sift through old archived emails that I no longer needed; while this helped, it did not help significantly.

After working with Fastmail support, it turns out that the IMAP import size estimate is based on the entire storage usage that Google reports, which includes Google Drive, and I currently have over 100GB of data in my Google Drive (stay tuned). After crossing my fingers, I went ahead and imported my email. Google’s 4GB of email was fully imported in just under an hour, after which Fastmail reported that I had about 4GB of email.

When I imported my own email, for some reason the Gmail Sent and Spam folders appeared as custom folders within Fastmail, and I had to move the emails to Fastmail’s equivalent folders. However, when I did this for my wife’s email, these two folders transferred seamlessly.

I appreciate that the keyboard shortcuts in Fastmail’s web interface generally match those of Gmail. I also appreciate how easy Fastmail makes it to configure Apple devices. So far the experience has been relatively seamless for me, and I haven’t noticed any problems at all with my contacts or my calendar.

Managing Veeam backup encryption using IBM Cloud key management

Veeam Backup and Replication offers the ability to encrypt your backups using passwords, which function as a kind of envelope encryption key for the encryption keys protecting the actual data. Veeam works hard to protect these passwords from exposure, to the degree that Veeam support cannot recover your passwords. You can ensure the resiliency of these keys either with a password-encrypted backup of your Veeam configuration, or by using Veeam Backup Enterprise Manager, which can protect and recover these passwords using an asymmetric key pair managed by Enterprise Manager. However, neither of these offerings allows integration with an external key manager for key storage and lifecycle. As a result, you must implement automation if you want to achieve Veeam backup encryption without your administrators and operators having direct knowledge of your encryption passwords. Veeam provides a set of PowerShell encryption cmdlets for this purpose.
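For orientation, the Veeam cmdlets used below are Add-VBREncryptionKey, Get-VBREncryptionKey, and Set-VBREncryptionKey. As a quick sanity check that your session can use them, here is a minimal sketch (the server name is an assumption); note that the returned key objects expose only metadata such as the description, never the password itself.

# Connect to the VBR server (assumed name; prompts for credentials)
Connect-VBRServer -Server 'vbr.example.com'

# List existing Veeam encryption keys; only metadata is exposed, not the passwords
Get-VBREncryptionKey | Format-List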

In this article, I will demonstrate how you can use IBM Cloud Key Protect or IBM Cloud Hyper Protect Crypto Services (HPCS) to create and manage your Veeam encryption passwords.

Authenticating with the IBM Cloud API

Our first step is to use an IBM Cloud service ID API key to authenticate with IBM Cloud IAM and obtain a limited-time token that we will provide as our authorization for the Key Protect or HPCS APIs. For this purpose we will use IBM Cloud’s recently released private endpoint for the IAM token service, which allows us to avoid connecting over the public internet, provided we have enabled VRF and service endpoints in our account.

# Variables

$apikey = '...'

# URIs and script level settings

$tokenURI = 'https://private.iam.cloud.ibm.com/identity/token'
$ErrorActionPreference = 'Stop'
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12

# Exchange IBM Cloud API key for token

$headers = @{Accept='application/json'}
$body = @{grant_type='urn:ibm:params:oauth:grant-type:apikey'; apikey=$apikey}
$tokenResponse = Invoke-RestMethod -Uri $tokenURI -Method POST -Body $body -Headers $headers

# Bearer token is now present in $tokenResponse.access_token

This token will be used in each of the following use cases.

Generating a password

In order to generate a new password for use with Veeam, we will use this token to call the Key Protect or HPCS API to generate an AES-256 key and “wrap” (that is, encrypt) it with a root key. The service ID associated with our API key above needs Reader access to the Key Protect or HPCS instance to perform this operation. The following example uses the Key Protect private API endpoint; if you are using HPCS, you will have a private API endpoint specific to your instance that looks something like https://api.private.us-south.hs-crypto.cloud.ibm.com:12345. In this script we use a pre-selected Key Protect or HPCS instance (identified by $kms) and a root key within that instance (identified by $crk).

# Variables

$kms = 'nnnnnnnn-nnnn-nnnn-nnnn-nnnnnnnnnnnn'
$crk = 'nnnnnnnn-nnnn-nnnn-nnnn-nnnnnnnnnnnn'

# URIs and script level settings

$kmsURIbase = 'https://private.us-south.kms.cloud.ibm.com/api/v2/keys/'
$ErrorActionPreference = 'Stop'
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12

# Perform wrap operation with empty payload to generate an AES 256 key that will be used as password

$headers = @{Accept='application/json'; 'content-type'='application/vnd.ibm.kms.key_action_wrap+json'; 'bluemix-instance'=$kms; Authorization=("Bearer " + $tokenResponse.access_token); 'correlation-id'=[guid]::NewGuid()}
$body = @{}
$wrapResponse = Invoke-RestMethod -Uri ($kmsURIbase + $crk + "/actions/wrap") -Method POST -Body (ConvertTo-Json $body) -Headers $headers

# Plaintext key is present in $wrapResponse.plaintext, and wrapped key in $wrapResponse.ciphertext

After generating the key, we create a new Veeam password with that content. The output of the wrap operation includes both the plaintext key itself and the wrapped form of the key. Our password can only be extracted from this wrapped ciphertext by someone who has sufficient access to the root key. We should store this wrapped form somewhere for recovery purposes; in this example I store it in the password description together with a name for the password, $moniker, which the full script collects earlier from its parameters.

$plaintext = ConvertTo-SecureString $wrapResponse.plaintext -AsPlainText -Force
$wdek = $wrapResponse.ciphertext
Remove-Variable wrapResponse

# Store this key as a new Veeam encryption key. Retain it in base64 format for simplicity.

Add-VBREncryptionKey -Password $plaintext -Description ($moniker + " | " + $wdek)

Write-Output ("Created new key " + $moniker)

You can see the full example script create-key.ps1 in GitHub.

Re-wrap a password

Because Veeam does not directly integrate with an external key manager, we have extra work to do if we want to respond to rotation of the root key, or to cryptographic erasure. The following code uses the rewrap API call to regenerate the wrapped form of our key in case the root key has been rotated. This ensures that our backup copy of the key is protected by the latest version of the root key.

# Perform rewrap operation to rewrap our key
# If this operation fails, it is possible your root key has been revoked and you should destroy the Veeam key

$headers = @{Accept='application/json'; 'content-type'='application/vnd.ibm.kms.key_action_rewrap+json'; 'bluemix-instance'=$kms; Authorization=("Bearer " + $tokenResponse.access_token); 'correlation-id'=[guid]::NewGuid()}
$body = @{ciphertext=$wdek}
$rewrapResponse = Invoke-RestMethod -Uri ($kmsURIbase + $crk + "/actions/rewrap") -Method POST -Body (ConvertTo-Json $body) -Headers $headers

Note that this API call will fail with a 4xx error in several cases, including revocation of the root key. If the root key has been purposely revoked, it is appropriate for you to remove your Veeam password to accomplish the cryptographic erasure; a sketch of this cleanup follows the script reference below. However, assuming that the rewrap is successful, we should update our saved copy of the wrapped form of the key to this latest value. In this example, $key is a PSCryptoKey object that was earlier collected from the Get-VBREncryptionKey cmdlet, and represents the key whose description will be updated:

$newWdek = $rewrapResponse.ciphertext
Remove-Variable rewrapResponse

# Update the existing description of the Veeam encryption key to reflect the updated wrapped version

Set-VBREncryptionKey -EncryptionKey $key -Description ($moniker + " | " + $newWdek)

Write-Output ("Rewrapped key " + $moniker)

You can see the full example script rewrap-key.ps1 in GitHub.
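If the rewrap instead fails because the root key was deliberately revoked, the corresponding cleanup is to remove the Veeam password itself. This is not part of the rewrap-key.ps1 script; here is a minimal sketch, reusing the $key object collected earlier from Get-VBREncryptionKey:

# The root key was revoked, so complete the cryptographic erasure by removing
# the Veeam password; backups encrypted with it become unrecoverable
Remove-VBREncryptionKey -EncryptionKey $key

Write-Output ("Removed revoked key " + $moniker)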

Recover a password

Within a single site the above approach is sufficient. For additional resilience, you can use Veeam backup copy jobs to copy your data to a remote location. If you have a Veeam repository in a remote site and you lose the VBR instance and repositories in your primary site, Veeam enables you to recover VBR in the remote site from an encrypted configuration backup, after which you can restore backups from the repository in that site.

However, you need to plan carefully for recovery not only of your data but also your encryption keys. Ideally, you would choose to protect both the Veeam configuration backup and the VM backups using keys that are protected by IBM Cloud Key Protect or HPCS. This means that for configuration backups and for remote backups, you should choose a Key Protect or HPCS key manager instance in the remote location so that your key management in the remote site is not subject to the original site failure. You might therefore be using two key manager instances: one local key manager instance for keys to protect your local backup jobs used for common recovery operations, and another remote instance for keys to protect your configuration backup and your copy backup jobs used in case of disaster.

This also implies that the key used to protect your configuration backups should be preserved in a location other than your VBR instance, and in a form other than a Veeam key object; in fact, the Veeam configuration restore process requires you to enter the password manually. You should store the key in its secure wrapped form, ideally alongside your Veeam configuration backup. You will then need to unwrap the key when you restore the configuration. In this example, the wrapped form of the key is expected to be one of the script arguments, which underscores the need to protect this key with a key manager that will still be available in case of the original site failure:

# Perform unwrap operation

$headers = @{Accept='application/json'; 'content-type'='application/vnd.ibm.kms.key_action_unwrap+json'; 'bluemix-instance'=$kms; Authorization=("Bearer " + $tokenResponse.access_token); 'correlation-id'=[guid]::NewGuid()}
$body = @{ciphertext=$args[0]}
$unwrapResponse = Invoke-RestMethod -Uri ($kmsURIbase + $crk + "/actions/unwrap") -Method POST -Body (ConvertTo-Json $body) -Headers $headers

Write-Output ("Plaintext key: " + $unwrapResponse.plaintext)

Because this exposes your key to your administrator or operator, you should generate a new key for subsequent configuration backups after restoring VBR from the configuration backup.

You can see the full example script unwrap-key.ps1 in GitHub.

Summary

In this article, I’ve shown how you can use IBM Cloud key management APIs to generate and manage encryption keys for use with Veeam Backup and Replication. You can see full examples of the scripts excerpted above in GitHub. These scripts are basic examples intended to be extended and customized for your own environment. You should take special care to consider how you manage and protect your IBM Cloud service ID API keys, and how you save and manage the wrapped form of the keys generated by these scripts. Most likely you would store all of these in your preferred secrets manager.

Complex

I’ve joked for a while that EDR and similar systems like CrowdStrike or Carbon Black would become Skynet. Or, more likely, a tool of international espionage and cyber-warfare. It doesn’t feel good to be vindicated. (Now imagine someone accomplishing this with a major browser or password manager.) Complex and highly interconnected systems are difficult to make stable, resilient, or secure, and cannot possibly be made anti-fragile. (This is a lesser reason why I miss my old pickup truck.) I’m not very excited about Kubernetes for the same reason. It’s also partly why I’m not very excited about artificial intelligence; but additionally because analysis of data, whether by machine or by human, does not automatically involve either wisdom or decisiveness (see also Edwin Friedman and Nassim Taleb).

Crossposted on I gotta have my orange juice.

How to bring your team back to the office

A lot of people are offering advice on how to return to the office safely. Here is my evidence-based approach:

Bring your team back to the office.

You should, of course, continue to give people the freedom to work at home if you are able.

I’ve written much more on my other blog stressing that churches must be open for worship, and should do so as normally as possible. But there is also a tremendous benefit to other spheres of life—business, family, education, politics—for people to be face to face with one another. Beyond that, there is actual harm in exercising authority to constrain any of these spheres of life beyond natural limits.

Here is my evidence: Alex Berenson’s Unreported Truths. In addition, Edwin Friedman’s A Failure of Nerve offers the appropriate framework for leadership in difficult times: a non-anxious presence.

reCAPTCHA v3 on Rails

I implemented reCAPTCHA v3 on a Rails site recently. It was very straightforward, and I’m generally pleased with the outcome. Interestingly, the vast majority of spam requests seem to be made without generating a reCAPTCHA token at all, suggesting that the bots are not even loading JavaScript. This points to a possible poor man’s approach to spam suppression: generate your CSRF token using JavaScript rather than emitting it directly in the form, so that form posts from clients that never ran JavaScript fail CSRF validation.

For better or worse, it is difficult to operate on the web today without JavaScript.

Version 3 of reCAPTCHA operates by observing all user activity on your site, which is a privacy concern, but this is also what allows it to function unobtrusively.

I defined several variables in my environment:

RECAPTCHA_CHECK = true
RECAPTCHA_SITE_KEY = 'xyz'
RECAPTCHA_SECRET_KEY = 'xyz'

I load the reCAPTCHA script on secondary pages of my signup site, but not the root page and not any internal pages:

<% if RECAPTCHA_CHECK -%>
  <script src="https://www.google.com/recaptcha/api.js?render=<%= RECAPTCHA_SITE_KEY %>"></script>
<% end -%>

Then I attach an action to the signup form submission to generate the reCAPTCHA token and include it in the form data (the form is named signup_form and has a hidden input named recaptcha):

<% if RECAPTCHA_CHECK -%>
  // Add reCAPTCHA token on form submission
  $(function() {
    $('#signup_form').submit(function(event) {
      event.preventDefault();
      $('#signup_form').off('submit');
      grecaptcha.ready(function() {
        grecaptcha.execute('<%= RECAPTCHA_SITE_KEY %>', {action: 'submit'}).then(function(token) {
          $('#recaptcha').val(token);
          $('#signup_form').submit();
        });
      });
    });
  });
<% end -%>

This site already has several conditions for trashing a suspicious signup request, with a simple error page that directs visitors to reach out to our administrators by email. I added a new condition to check the reCAPTCHA token:

if RECAPTCHA_CHECK
  # Check reCAPTCHA
  recap_uri = URI.parse('https://www.google.com/recaptcha/api/siteverify')
  recap_params = { secret: RECAPTCHA_SECRET_KEY, response: params[:recaptcha] }
  response = Net::HTTP.post_form(recap_uri, recap_params)
  logger.info "Recaptcha result: " + response.code + " / " + response.body
  response_json = JSON.load(response.body) if response.code == "200"
end

# Silently throttle requests that fail reCAPTCHA
if RECAPTCHA_CHECK && (response.code != "200" || !response_json['success'] || response_json['score'] < 0.5)
  Mailer.exception_notification(Exception.new("reCAPTCHA throttled signup"), params).deliver_now
  redirect_to :action => :thankyou
end

As you can see, for now I am operating with a threshold of 0.5, but I may adjust this over time. In my own testing, I generated a confidence value of 0.9.

Using multiple KMS clusters in vCenter

VMware vCenter Server allows you to create multiple KMS clusters, but does not currently provide a policy-based mechanism by which you can direct particular objects to be protected by a specific KMS cluster. Instead, for both vSphere and vSAN encryption, all new objects requiring encryption are protected by the default KMS cluster.

However, VMware architect Mike Foley has provided us with some helpful PowerCLI ammunition that we can leverage to rekey objects under the protection of the KMS cluster of our choice. You can use this approach either to manage multiple KMS connections, or to migrate from one KMS to another without decrypting your resources. Here are the steps that I’ve used to test this capability:

First, you need to connect vCenter to each of your KMS clusters. You can leverage the same client certificate or different client certificates, as you wish. If you are configuring multiple connections to the same key manager, you will need to distinguish these connections with their own username and password. Choose one of your KMS clusters to be the default key provider. Using the VMEncryption module’s Get-KMSCluster cmdlet, you can then see that you are connected to two clusters:

PS /Users/smoonen/vmware> Get-KMSCluster

Name                      DefaultForSystem     ClientCertificateExpiryDate
----                      ----------------     ---------------------------
management-kms            False                4/5/2030 5:33:48 PM
workload-kms              True                 4/5/2030 5:51:42 PM

Here you can see we have created two VMs that are both protected by the default KMS cluster:

PS /Users/smoonen/vmware> Get-VM | Select Name,KMSserver

Name        KMSserver
----        ---------
testvm-2    workload-kms
testvm-1    workload-kms

The VMEncryption module’s Set-VMEncryptionKey cmdlet allows us to rekey one of these VMs using an alternate KMS cluster:

PS /Users/smoonen/vmware> Get-VM testvm-2 | Set-VMEncryptionKey -KMSClusterId management-kms

PS /Users/smoonen/vmware> Get-VM | Select Name,KMSserver

Name        KMSserver
----        ---------
testvm-2    management-kms
testvm-1    workload-kms

There are two other types of resources that we may need to rekey in this manner: hosts and vSAN clusters. If a vSphere cluster is using either vSphere or vSAN encryption, recall that your hosts are issued keys for encryption of core dumps. You can rekey your hosts using the Set-VMHostCryptoKey cmdlet:

PS /Users/smoonen/vmware> Get-VMhost | Select Name,KMSserver

Name                        KMSserver
----                        ---------
host000.smoonen.example.com management-kms
host001.smoonen.example.com management-kms

PS /Users/smoonen/vmware> Get-VMHost -Name host000.smoonen.example.com | Set-VMHostCryptoKey -KMSClusterId workload-kms

PS /Users/smoonen/vmware> Get-VMHost -Name host001.smoonen.example.com | Set-VMHostCryptoKey -KMSClusterId workload-kms

PS /Users/smoonen/vmware> Get-VMHost | Select Name,KMSserver

Name                        KMSserver
----                        ---------
host000.smoonen.example.com workload-kms
host001.smoonen.example.com workload-kms

Likewise, VMware offers a VsanEncryption module that allows you to rekey your vSAN cluster using a new KMS. The Set-VsanEncryptionKms cmdlet allows you to choose a new KMS cluster for any given vSAN cluster:

PS /Users/smoonen/vmware> Set-VsanEncryptionKms -Cluster cluster1 -KMSCluster workload-kms

Multipath iSCSI for VMware in IBM Cloud

Today we’re really going to go down the rabbit hole. Although there was not a great deal of fanfare, earlier this year IBM Cloud released support for up to 64 VMware hosts attaching an Endurance block storage volume using multipath connections. Using multipath requires some APIs that are not well documented. After a lot of digging, here is how I was able to leverage this support.

First, your account must be enabled for what IBM Cloud calls “iSCSI isolation.” All new accounts beginning in early 2020 have this enabled. You can check whether it is enabled using the following Python script:

import SoftLayer

# Connect to SoftLayer
client = SoftLayer.Client(username = USERNAME, api_key = API_KEY, endpoint_url = SoftLayer.API_PUBLIC_ENDPOINT)

# Assert that iSCSI isolation is enabled
isolation_disabled = client['SoftLayer_Account'].getIscsiIsolationDisabled()
assert isolation_disabled == False

iSCSI isolation enforces that all devices in your account use authentication to connect to iSCSI. In rare cases, some accounts may be using unauthenticated connections. If the above test passes, your account is ready to go! If the above test fails, you should first audit your usage of iSCSI connections to ensure they are all authenticated. Only once you have verified that either you are not using iSCSI at all, or that all of your iSCSI connections are authenticated, should you open a support ticket as follows. Plan for this process to take several days, as it requires internal approvals and configuration changes:

Please enable my account for iSCSI isolation according to the standard block storage method of procedure.

Thank you!

Once the above test for iSCSI isolation passes, we are good to proceed. We need to order the following from IBM Cloud classic infrastructure:

  1. Endurance iSCSI block storage in the same datacenter as your hosts, with OS type VMware.
  2. A private portable subnet on the storage VLAN in your instance. Ensure the subnet is large enough to allocate two usable IP addresses for every current or future host in your cluster. We are ordering a single subnet for convenience, although it is possible to authorize multiple subnets (either for different hosts, or for different interfaces on each host). A single /25 subnet should be sufficient for any cluster, since VMware vCenter Server (VCS) limits you to 59 hosts per cluster, and 59 hosts × 2 addresses = 118 fits comfortably within a /25.

The Endurance authorization process authorizes each host individually to the storage, and assigns a unique IQN and CHAP credentials to each host. After authorizing the hosts, we then specify which subnet or subnets each host will be using to connect to the storage, so that the LUN accepts connections not only from the hosts’ primary IP addresses but also from these alternate portable subnets. The following Python script issues the various API calls needed for these authorizations, assuming that we know the storage, subnet, and host ids:

import sys
import SoftLayer

STORAGE_ID = 157237344
SUBNET_ID = 2457318
HOST_IDS = (1605399, 1641947, 1468179)

# Connect to SoftLayer
client = SoftLayer.Client(username = USERNAME, api_key = API_KEY, endpoint_url = SoftLayer.API_PUBLIC_ENDPOINT)

# Authorize hosts to storage
for host_id in HOST_IDS :
  try :
    client['SoftLayer_Network_Storage_Iscsi'].allowAccessFromHost('SoftLayer_Hardware', host_id, id = STORAGE_ID)
  except :
    if 'Already Authorized' in str(sys.exc_info()[1]) :
      pass
    else :
      raise

# Lookup the "iSCSI ACL object id" for each host
hardwareMask = 'mask[allowedHardware[allowedHost[credential]]]'
result = client['SoftLayer_Network_Storage_Iscsi'].getObject(id = STORAGE_ID, mask = hardwareMask)
aclOids = [x['allowedHost']['id'] for x in result['allowedHardware']]

# Add our iSCSI subnet to each host's iSCSI ACL
for acl_id in aclOids :
  # Assign; note subnet is passed as array
  client['SoftLayer_Network_Storage_Allowed_Host'].assignSubnetsToAcl([SUBNET_ID], id = acl_id)

  # Verify success
  result = client['SoftLayer_Network_Storage_Allowed_Host'].getSubnetsInAcl(id = acl_id)
  assert len(result) > 0

At this point, the hosts are authorized to the storage. But before we can connect them to the storage, we need to collect some additional information. First, we need to collect the IQN and CHAP credentials that were issued to each host for the storage:

import pprint
import SoftLayer

STORAGE_ID = 157237344

# Connect to SoftLayer
client = SoftLayer.Client(username = USERNAME, api_key = API_KEY, endpoint_url = SoftLayer.API_PUBLIC_ENDPOINT)

# Lookup the iQN and credentials for each host
hardwareMask = 'mask[allowedHardware[allowedHost[credential]]]'
result = client['SoftLayer_Network_Storage_Iscsi'].getObject(id = STORAGE_ID, mask = hardwareMask)
creds = [ { 'host' : x['fullyQualifiedDomainName'],
            'iqn'  : x['allowedHost']['name'],
            'user' : x['allowedHost']['credential']['username'],
            'pass' : x['allowedHost']['credential']['password'] } for x in result['allowedHardware']]
print("Host connection details")
pprint.pprint(creds)

For example:

Host connection details
[{'host': 'host002.smoonen.example.com',
  'iqn': 'iqn.2020-07.com.ibm:ibm02su1368749-h1468179',
  'pass': 'dK3bACHQQSg5BPwA',
  'user': 'IBM02SU1368749-H1468179'},
 {'host': 'host001.smoonen.example.com',
  'iqn': 'iqn.2020-07.com.ibm:ibm02su1368749-h1641947',
  'pass': 'kFCw2TDLr5bL4Ex6',
  'user': 'IBM02SU1368749-H1641947'},
 {'host': 'host000.smoonen.example.com',
  'iqn': 'iqn.2020-07.com.ibm:ibm02su1368749-h1605399',
  'pass': 'reTLYrSe2ShPzZ6A',
  'user': 'IBM02SU1368749-H1605399'}]

Note that Endurance storage uses the same IQN and CHAP credentials for all LUNs authorized to a host. This will enable us to attach multiple LUNs using the same HBA.

Next, we need to retrieve the two IP addresses for the iSCSI LUN:

import pprint
import SoftLayer

STORAGE_ID = 157237344

# Connect to SoftLayer
client = SoftLayer.Client(username = USERNAME, api_key = API_KEY, endpoint_url = SoftLayer.API_PUBLIC_ENDPOINT)

print("Target IP addresses")
storage = client['SoftLayer_Network_Storage_Iscsi'].getIscsiTargetIpAddresses(id = STORAGE_ID)
pprint.pprint(storage)

For example:

Target IP addresses
['161.26.114.170', '161.26.114.171']

Finally, we need to identify the vendor suffix on the LUN’s WWN so that we can positively identify it in vSphere. We can do this as follows:

import SoftLayer

STORAGE_ID = 157237344

# Connect to SoftLayer
client = SoftLayer.Client(username = USERNAME, api_key = API_KEY, endpoint_url = SoftLayer.API_PUBLIC_ENDPOINT)

props = client['SoftLayer_Network_Storage_Iscsi'].getProperties(id = STORAGE_ID)
try    : wwn = [x['value'] for x in props if len(x['value']) == 24 and x['value'].isalnum()][0]
except : raise Exception("No WWN")
print("WWN: %s" % wwn)

For example:

WWN: 38305659702b4f6f5a5a3044

Armed with this information, we can now attach the hosts to the storage.

First, create two new portgroups on your private vDS. Our design uses a shared vDS across clusters but unique portgroups, so they should be named based on the instance and cluster name; for example, smoonen-mgmt-iSCSI-A and smoonen-mgmt-iSCSI-B. Tag these portgroups with the storage VLAN, and ensure that each portgroup has only one active uplink: iSCSI-A should have uplink1 active and uplink2 unused, while iSCSI-B should have uplink2 active and uplink1 unused.
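If you prefer to script this step, here is a minimal PowerCLI sketch; the vDS name, VLAN id, and uplink names are assumptions you should adjust to your environment:

# Assumed names: vDS 'smoonen-vds', storage VLAN 1234, uplinks 'uplink1'/'uplink2'
$vds = Get-VDSwitch -Name 'smoonen-vds'
$pgA = New-VDPortgroup -VDSwitch $vds -Name 'smoonen-mgmt-iSCSI-A' -VlanId 1234
$pgB = New-VDPortgroup -VDSwitch $vds -Name 'smoonen-mgmt-iSCSI-B' -VlanId 1234

# Pin each portgroup to a single active uplink, leaving the other unused
Get-VDUplinkTeamingPolicy -VDPortgroup $pgA | Set-VDUplinkTeamingPolicy -ActiveUplinkPort 'uplink1' -UnusedUplinkPort 'uplink2'
Get-VDUplinkTeamingPolicy -VDPortgroup $pgB | Set-VDUplinkTeamingPolicy -ActiveUplinkPort 'uplink2' -UnusedUplinkPort 'uplink1'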

Next, create kernel ports for all hosts in each portgroup, using IP addresses from the subnet you ordered earlier. You will end up using two IP addresses for each host. Set the gateway to “Configure on VMkernel adapters,” using the gateway address for your subnet.
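Here is a minimal PowerCLI sketch of the kernel ports for one host; the host name, IP addresses, and netmask are assumptions (I set the per-adapter gateway override itself in the vSphere client, as described above):

# Create one VMkernel port per iSCSI portgroup on this host
$myhost = Get-VMHost 'host000.smoonen.example.com'
$vds = Get-VDSwitch -Name 'smoonen-vds'
New-VMHostNetworkAdapter -VMHost $myhost -VirtualSwitch $vds -PortGroup 'smoonen-mgmt-iSCSI-A' -IP '10.100.20.10' -SubnetMask '255.255.255.128'
New-VMHostNetworkAdapter -VMHost $myhost -VirtualSwitch $vds -PortGroup 'smoonen-mgmt-iSCSI-B' -IP '10.100.20.11' -SubnetMask '255.255.255.128'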

Next, let’s begin a PowerCLI session to connect to the storage and create the datastore. First, as a one-time setup, we must enable the software iSCSI adapter on every host:

PS /Users/smoonen/vmware> $myhost = Get-VMHost host000.smoonen.example.com
PS /Users/smoonen/vmware> Get-VMHostStorage -VMHost $myhost | Set-VMHostStorage -SoftwareIScsiEnabled $True

SoftwareIScsiEnabled
--------------------
True

Next, also as a one-time setup on each host, bind the iSCSI kernel ports to the iSCSI adapter:

PS /Users/smoonen/vmware> $vmkA = Get-VMHostNetworkAdapter -PortGroup smoonen-mgmt-iSCSI-A -VMHost $myhost
PS /Users/smoonen/vmware> $vmkB = Get-VMHostNetworkAdapter -PortGroup smoonen-mgmt-iSCSI-B -VMHost $myhost
PS /Users/smoonen/vmware> $esxcli = Get-EsxCli -V2 -VMHost $myhost
PS /Users/smoonen/vmware> $esxcli.iscsi.networkportal.add.Invoke(@{adapter='vmhba64';force=$true;nic=$vmkA})
true
PS /Users/smoonen/vmware> $esxcli.iscsi.networkportal.add.Invoke(@{adapter='vmhba64';force=$true;nic=$vmkB})
true

Finally, once for each host, we set the host IQN to the value expected by IBM Cloud infrastructure, and also initialize the CHAP credentials:

PS /Users/smoonen/vmware> $esxcli.iscsi.adapter.set.Invoke(@{adapter='vmhba64'; name='iqn.2020-07.com.ibm:ibm02su1368749-h1605399'}) 
false
PS /Users/smoonen/vmware> $hba = Get-VMHostHba -VMHost $myhost -Device vmhba64
PS /Users/smoonen/vmware> Set-VMHostHba -IscsiHba $hba -MutualChapEnabled $false -ChapType Preferred -ChapName "IBM02SU1368749-H1605399" -ChapPassword "reTLYrSe2ShPzZ6A"

Device     Type         Model                          Status
------     ----         -----                          ------
vmhba64    IScsi        iSCSI Software Adapter         online

Now, for each LUN, on each host we must add that LUN’s target addresses (obtained above) as dynamic discovery targets. You should not assume that all LUNs created in the same datacenter share the same addresses:

PS /Users/smoonen/vmware> New-IScsiHbaTarget -IScsiHba $hba -Address "161.26.114.170"             

Address              Port  Type
-------              ----  ----
161.26.114.170       3260  Send

PS /Users/smoonen/vmware> New-IScsiHbaTarget -IScsiHba $hba -Address "161.26.114.171"

Address              Port  Type
-------              ----  ----
161.26.114.171       3260  Send

After this, we rescan on each host for available LUNs and datastores:

PS /Users/smoonen/vmware> Get-VMHostStorage -VMHost $myhost -RescanAllHba -RescanVmfs

SoftwareIScsiEnabled
--------------------
True

This enables us to locate the new LUN and create a VMFS datastore on it. We locate the LUN on all hosts but create the datastore on one host. Locate the LUN using the WWN suffix obtained above:

PS /Users/smoonen/vmware> $disks = Get-VMHostDisk -Id *38305659702b4f6f5a5a3044
PS /Users/smoonen/vmware> New-Datastore -VMHost $myhost -Vmfs -Name "smoonen-mgmt2" -Path $disks[0].ScsiLun.CanonicalName        

Name                               FreeSpaceGB      CapacityGB
----                               -----------      ----------
smoonen-mgmt2                           48.801          49.750

Finally, rescan on all hosts to discover the datastore:

PS /Users/smoonen/vmware> Get-VMHostStorage -VMHost $myhost -RescanAllHba -RescanVmfs

SoftwareIScsiEnabled
--------------------
True

We can confirm that we have multiple paths to the LUN as follows:

PS /Users/smoonen/vmware> $luns = Get-ScsiLun -Id *38305659702b4f6f5a5a3044
PS /Users/smoonen/vmware> Get-ScsiLunPath -ScsiLun $luns[0]

Name       SanID                                    State      Preferred
----       -----                                    -----      ---------
vmhba64:C… iqn.1992-08.com.netapp:stfdal1303        Standby    False
vmhba64:C… iqn.1992-08.com.netapp:stfdal1303        Standby    False
vmhba64:C… iqn.1992-08.com.netapp:stfdal1303        Active     False
vmhba64:C… iqn.1992-08.com.netapp:stfdal1303        Active     False

Migrating vCenter SSO from IWA to LDAPS

For some time I’ve used Integrated Windows Authentication (IWA) for VMware vCenter single sign-on (SSO). But a few considerations are driving me from IWA to LDAPS. First, IWA is deprecated starting in vSphere 7. Second, I want to leverage LDAPS rather than LDAP since it is more secure, especially now that Microsoft is pushing the use of LDAP signing more aggressively. Here are the steps that I followed to migrate from IWA to LDAPS:

  1. I chose to leverage Active Directory Certificate Services (AD CS) rather than an external CA in order to benefit from autoenrollment. Install the AD CS server role on each Active Directory domain controller; this also installs the certificate management feature. I configured AD CS as follows (see the PowerShell sketch after this list):
    1. Credentials should be those of $DOMAIN\Administrator
    2. Configure only the Cert Authority role service
    3. Create an Enterprise CA rather than a Standalone CA
    4. Create a Root CA rather than a Subordinate CA
    5. Create a new private key rather than using an existing private key
    6. Use the RSA#Microsoft Software Key Storage Provider
    7. Use a 4096-bit RSA key
    8. Use SHA256 hash algorithm
    9. Accept the default CN
    10. Set a 10 year validity period
    11. Use the default database and log location
  2. I found in one case that the Local Computer | Personal certificate was either created immediately for my AD server’s hostname, or else created on demand when I attempted an LDAPS connection. In another case I had to reboot before the server certificate was autoenrolled. If this doesn’t work for you, you may wish to try using the ldifde command to create the LDAPS certificate. You can test for enrollment either by searching for the certificate in the Local Computer | Personal certificate store, or by attempting to connect to LDAPS on port 636 (a quick test is sketched after this list).
  3. Export the CA certs from the AD servers and convert them from CER format to PEM format for use with vCenter and any other LDAP clients:
    openssl x509 -inform der -in adns1.cer -out adns1.pem
  4. Using your Administrator@vsphere.local account, remove the IWA identity source and create a new identity source as follows. In this example I am joining the domain example.com and using an unprivileged service account I created for vCenter’s use. In my experience, my vCenter role and permission settings were preserved independently of changes to the identity source:
    1. Identity source type = Active Directory over LDAP
    2. Users = DC=example,DC=com
    3. Groups = DC=example,DC=com
    4. Domain = example.com
    5. Alias = EXAMPLE
    6. Username = vCenter LDAP service user
    7. Password = vCenter LDAP service account password
    8. Connect to = Specific domain controllers
    9. Specify one or two AD server URLs in the following format: ldaps://adnssmoonen1.example.com:636
    10. Upload all PEM files generated above for SSL certificates
  5. After ensuring that vCenter and any other LDAP clients (for example, HyTrust Cloud Control) are successfully leveraging LDAPS, configure the group policy as follows to enforce LDAP signing:
    Default Domain Controllers Policy :: Computer Configuration | Policies | Windows Settings | Security Settings | Local Policies | Security Options | Domain Controller: LDAP server signing requirements = Require signing
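For step 1 above, the AD CS installation can also be scripted. Here is a minimal PowerShell sketch using the ADCSDeployment cmdlets, mirroring the settings in the list; run it on each domain controller as $DOMAIN\Administrator, and treat it as a sketch rather than a tested procedure:

# Install the AD CS role along with the certificate management tools
Install-WindowsFeature ADCS-Cert-Authority -IncludeManagementTools

# Configure an Enterprise Root CA: new 4096-bit RSA key, SHA256 hash,
# 10-year validity, default CN, default database and log locations
Install-AdcsCertificationAuthority -CAType EnterpriseRootCa `
  -CryptoProviderName 'RSA#Microsoft Software Key Storage Provider' `
  -KeyLength 4096 -HashAlgorithmName SHA256 `
  -ValidityPeriod Years -ValidityPeriodUnits 10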
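And for the port 636 test in step 2, here is a minimal PowerShell sketch that opens a TLS connection to a domain controller and prints the certificate it presents; the host name is an assumption, and the validation callback deliberately accepts any certificate so that you can inspect it even before the new CA is trusted:

# Open a TLS connection to the domain controller's LDAPS port
$server = 'adnssmoonen1.example.com'
$tcp = New-Object System.Net.Sockets.TcpClient($server, 636)

# Accept any certificate so we can inspect it even if the CA is not yet trusted
$ssl = New-Object System.Net.Security.SslStream($tcp.GetStream(), $false, { $true })
$ssl.AuthenticateAsClient($server)

# Display the certificate presented by the server
$cert = New-Object System.Security.Cryptography.X509Certificates.X509Certificate2($ssl.RemoteCertificate)
$cert | Format-List Subject, Issuer, NotAfter
$tcp.Close()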

Occam’s razor

On my nontechnical blog I reflect on simplicity:

I’ve been appealing to Occam’s razor lately as a rule for evaluating architectural decisions and their tradeoffs. In particular, architectural decisions must take into account not only ideal considerations, but also a team’s capacity to develop, maintain, and support these decisions. Simplicity has its own rewards regardless of the size of your team, but the smaller the team, the more aggressively you must press for that simplicity. Don’t multiply entities unnecessarily!