Sunday, October 10, 2021

Get rid of certificate errors with Avi (NSX-ALB) and HashiCorp Vault!

 Have you ever seen this error before?

This is a really important issue in enterprise infrastructure because unauthenticated TLS connections teach our end users to be complacent and ignore this error. 

TLS Authentication

SSL/TLS for internal enterprise administration typically only addresses the confidentiality aspects of an organizational need - yet the integrity aspects are not well realized:

This is an important aspect of our sense of enterprise security, but the level of effort required to authenticate endpoints is high for TLS, so we make do with what we have.

The practice of ignoring authentication errors for decades has promoted complacency

Here's another error that enterprise systems administrators see all the time:

ssh {{ ip }}
The authenticity of host '{{ ip }} ({{ ip }})' can't be established.
RSA key fingerprint is SHA256:{{ hash }}.
Are you sure you want to continue connecting (yes/no)?

This probably looks familiar too - Secure Shell (SSH) follows a different method of establishing trust, where the user should verify by some method that the hash is correct. If the hash later changes, SSH will throw an error that we hopefully don't ignore:

ssh {{ ip }}
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
SHA256:{{ hash }}.
Please contact your system administrator.
Add correct host key in known_hosts to get rid of this message.
Offending ECDSA key in known_hosts
{{ cipher }} host key for {{ ip }} has changed and you have requested strict checking.
Host key verification failed.

SSH is performing something very valuable here - authentication. By default, SSH will record a host's public key in a file called known_hosts the first time you connect, to ensure that the server is in fact the same as the last time you accessed it. In turn, once the server authenticates, you provide some level of authentication (user, key) afterward to ensure that you are who you say you are too. Always ensure that the service you're giving a secret to (like your password!) is authenticated or validated in some way first!
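To make the "verify that hash" step concrete, here is a minimal sketch of how the SHA256 fingerprint shown in the prompt relates to a host's public key. It assumes the usual one-line OpenSSH public key layout (key type, base64 blob, optional comment); the function name and the demo key are placeholders, not real host data.

```python
import base64
import hashlib


def ssh_fingerprint(public_key_line: str) -> str:
    """Return the SHA256 fingerprint of an OpenSSH public key line.

    Expects the "<key-type> <base64-blob> [comment]" layout used by
    files such as /etc/ssh/ssh_host_ed25519_key.pub.
    """
    blob = base64.b64decode(public_key_line.split()[1])
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest as base64 with the "=" padding removed.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


# Placeholder blob for demonstration only; this is not a real host key.
demo_line = "ssh-ed25519 " + base64.b64encode(b"not-a-real-key").decode("ascii") + " demo"
print(ssh_fingerprint(demo_line))
```

Running the function against the server's own public key file, over a channel you already trust, gives you a value to compare with what the SSH client displays on first connection.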

Web of Trust versus Centralized Identity

Web-of-Trust (WoT)

Web-of-Trust (WoT) is typically the easiest form of authentication scheme to start out with, but it results in quadratic scaling issues later on if executed properly, because every pair of peers has to validate each other. In this model, it's on the individual to validate identities from each peer they interact with, since WoT neither requires nor wants a centralized authority to validate against.

Typically enterprises use WoT because it's baked into a product, not because of any particular need. Certain applications work well with it, so generally you should:
  • Keep your circle small
  • Replace crypto regularly
  • Leverage multiple identities for multiple tasks
    • e.g. separate your code signing keys from your SSH authentication keys
Advantages:
  • Easy initial set-up
  • Doesn't depend on a third party to establish infrastructure
Disadvantages:
  • The user is empowered to make both good and bad decisions, and the vast majority of users don't care enough about security to maintain vigilance
  • If you're in an organization with hundreds of "things to validate", you have to personally validate a lot of keys
  • It's a lot of work to properly validate - e.g. you probably don't ask for a person's ID every time you share emails with them
  • Revocation: If a key is compromised, you're relying on every single user to revoke it (or renew it - change your crypto, folks) in a timely manner. This is a lot of work depending on how much a key is used.
Examples: SSH, PGP
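The scaling concern can be put into rough numbers. This is a back-of-the-envelope sketch, assuming every participant validates every other participant exactly once in a web of trust, versus once against a single authority in a centralized model; real deployments will land somewhere in between.

```python
def web_of_trust_validations(participants: int) -> int:
    """Pairwise validations if every participant verifies every other once."""
    return participants * (participants - 1) // 2


def central_validations(participants: int) -> int:
    """Validations if each participant is verified once by one authority."""
    return participants


for n in (5, 50, 500):
    print(n, web_of_trust_validations(n), central_validations(n))
```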

Centralized Identity

Centralized Identity services are the sweetheart of large enterprises. Put your security officers in charge of one of these babies and they'll make it sing.

In this model, it's on the Identity Administrator to ensure the integrity of any Identity Store. They'll typically do quite a bit better than your average WoT user because it's their job to do so.

Centralized Identity services handle routine changes like ID refreshes/revocations much more easily with dedicated staffing - mostly because the application and maintainer are easy to define. But here's the rub: you have to be able to afford it. Most of the products that fit in this category are not free and require at least part-time supervision by a capable administrator.

It's not impossible, though. One can build centralized authentication mechanisms with open source tooling; it just takes work. If you aren't the person doing this work, you should help them by being a vigilant user - if an identity was compromised, report it quickly, even if it was your fault, because the time to respond is vital. Try to shoulder some of this weight whenever you can - it's an uphill hike for the people doing it and every little contribution counts.

Back to TLS and Certificates

In the case of an Application Delivery Administrator, an individual is responsible for the integrity and confidentiality of the services they deliver. This role must work hand-in-glove with Identity administrators in principle, and both are security administrators at heart.

This is really just a flowery way to say "get used to renewing and filing Certificate Signing Requests (CSRs)".

In an ideal world, an Application Delivery Controller (ADC) will validate the integrity of a backend server (Real Server) before passing traffic to it, in addition to providing the whole "CIA Triad" to clients. Availability is an ADC's thing, after all.

Realistically an ADC Administrator will only control one of these two legs - and it's plenty on its own. Here's one way to execute this model.

Certificate Management

Enough theory, let's do some things. First, we'll build a PKI inside of HashiCorp Vault - this assumes a full Vault installation. Here's a view of the planned Certificate Hierarchy:

From the HashiCorp Vault GUI - let's set up a PKI secrets engine for the root CA:

Note: The default duration is roughly a month (Vault's system default TTL is 768 hours), so I've overridden this by setting the default and max-lifetime under each CA, labeled as "TTL".
Let's create the services and user CAs:

This will provide a CSR - we need to sign it under the root CA:

Copy the resulting certificate into your clipboard - these secrets engines are autonomous, and don't interoperate - so we'll have to install it into the intermediate CA.
We install the certificate via the "Set signed intermediate" button in Vault:
Now, we have a hierarchical CA!
NB: You will need to create a Vault "role" here, which defines what the CA is allowed to issue:
Mega NB: The root CA should nominally be "offline" and at a minimum part of a separate Vault instance!
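For anyone who would rather script the GUI steps above, here is a rough sketch of the same flow against Vault's HTTP API: mount two PKI engines, generate the root, generate and sign the intermediate, install it, and create a role. The mount names root-ca and svc-ca come from this post; the role name, common names, domain, and TTL values are assumptions for illustration. The transport is passed in as a function so the sequence can be dry-run without a Vault server.

```python
import json
import urllib.request
from typing import Callable, Optional

# A transport takes (method, path, payload) and returns the decoded JSON reply.
Transport = Callable[[str, str, Optional[dict]], dict]


def vault_http(addr: str, token: str) -> Transport:
    """Return a transport that talks to a real Vault server."""
    def call(method: str, path: str, payload: Optional[dict]) -> dict:
        request = urllib.request.Request(
            f"{addr}/v1/{path}",
            data=json.dumps(payload or {}).encode("utf-8"),
            headers={"X-Vault-Token": token},
            method=method,
        )
        with urllib.request.urlopen(request) as response:
            body = response.read()
        return json.loads(body) if body else {}
    return call


def build_hierarchy(call: Transport, root: str = "root-ca",
                    intermediate: str = "svc-ca",
                    role: str = "avi-services") -> str:
    """Create a root CA, a signed intermediate CA, and one issuing role."""
    for mount in (root, intermediate):
        # Mount a PKI engine with a maximum TTL longer than the default.
        call("POST", f"sys/mounts/{mount}",
             {"type": "pki", "config": {"max_lease_ttl": "87600h"}})
    call("POST", f"{root}/root/generate/internal",
         {"common_name": "Example Root CA", "ttl": "87600h"})
    csr = call("POST", f"{intermediate}/intermediate/generate/internal",
               {"common_name": "Example Services CA"})["data"]["csr"]
    certificate = call("POST", f"{root}/root/sign-intermediate",
                       {"csr": csr, "format": "pem_bundle",
                        "ttl": "43800h"})["data"]["certificate"]
    call("POST", f"{intermediate}/intermediate/set-signed",
         {"certificate": certificate})
    # The role limits which names and lifetimes the intermediate may issue.
    call("POST", f"{intermediate}/roles/{role}",
         {"allowed_domains": ["example.com"], "allow_subdomains": True,
          "max_ttl": "8760h"})
    return certificate
```

Against a live server this would be invoked as build_hierarchy(vault_http("https://vault.example.com:8200", token)), where the address and token are placeholders. It keeps both CAs on one Vault instance purely for brevity, which is exactly what the warning above advises against for production.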

For this post, we'll just be issuing certificates manually. We need to extract the intermediate and root certificates to install in NSX ALB and participating clients. These can be pulled from the root-ca module:
Note: Vault doesn't come with a certificate reader as of 1.8.3. You can read these certificates with online tools or by performing the following command with OpenSSL:
openssl x509 -in cert1.crt -noout -text
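When juggling a root and an intermediate, it is easy to import the wrong file. As a small sketch using only the Python standard library, the helper below splits a PEM bundle and prints a SHA-256 fingerprint per certificate, which should match the hex value from openssl x509 -in cert1.crt -noout -fingerprint -sha256. The function name is made up for this example, and it only fingerprints the encoded bytes; it does not parse or validate the certificate.

```python
import hashlib
import re
import ssl
from typing import List

PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def cert_fingerprints(pem_text: str) -> List[str]:
    """Return the SHA-256 fingerprint of each certificate in a PEM bundle."""
    fingerprints = []
    for block in PEM_BLOCK.findall(pem_text):
        der = ssl.PEM_cert_to_DER_cert(block)
        digest = hashlib.sha256(der).hexdigest().upper()
        # Group as colon-separated byte pairs, the way OpenSSL prints them.
        fingerprints.append(
            ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))
        )
    return fingerprints
```

Comparing these values before and after upload is a cheap way to confirm that the certificate you pulled from Vault is the one you imported elsewhere.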

Once we have the files, let's upload them to Avi:
For each certificate, click "Root/Intermediate CA Certificate" and Import. Note that you do need to click on Validate before importing.

Now that we have the CA available, we should start by authenticating Avi itself and create a controller certificate:
Fulfilling the role of PKI Administrator, let's sign the CSR after verifying authenticity.
Back to the role of Application Administrator! We've received the certificate, let's install it in the Avi GUI!
Once we've verified the certificate is healthy, let's apply it to the management plane under Administration -> Settings -> Access Settings:
At this point, we'll need to trust the root certificate created in Vault - else we'll still see certificate errors. Once that's done, we'll be bidirectionally authenticated with the Avi controller!

From here on out, we'll be able to leverage the same process. In short:
  • Under Avi -> Templates -> Security -> TLS/SSL Certificates, create a new Application CSR
    • Ensure that all appropriate Subject Alternative Names (SANs) are captured!
  • Under Vault -> svc-ca -> issued-certificates -> Sign Certificate
  • Copy issued certificate to TLS Certificate created in the previous step
  • Assign to a virtual service. Unlike F5 LTM, this is decoupled from the clientssl profile.
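The signing step in the list above can also be done through Vault's HTTP API instead of the GUI. This is a hedged sketch rather than the workflow used in this post: the svc-ca mount name is taken from the steps above, while the role name, TTL, Vault address, and token are placeholders you would replace with your own.

```python
import json
import urllib.request


def build_sign_request(vault_addr, token, csr_pem, common_name,
                       mount="svc-ca", role="avi-services", ttl="8760h"):
    """Build (but do not send) a request asking Vault to sign a CSR."""
    payload = {"csr": csr_pem, "common_name": common_name, "ttl": ttl}
    return urllib.request.Request(
        f"{vault_addr}/v1/{mount}/sign/{role}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Vault-Token": token, "Content-Type": "application/json"},
        method="POST",
    )


def sign_csr(request):
    """Send the request and return the issued certificate in PEM form."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["data"]["certificate"]
```

Building the request separately from sending it makes it easy to review exactly what will be submitted before any certificate is issued. The returned PEM is what gets pasted into the certificate object created from the Avi CSR.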

Sunday, September 19, 2021

Get an A with VMware Avi / NSX ALB (and keep it that way with SemVer!)

Cryptographic security is an important aspect of hosting any business-critical service.

When hosting a public service secured by TLS, it is important to strike a balance between compatibility (the Availability aspect of CIA) and strong cryptography (the Integrity or Authentication and Confidentiality aspects of CIA). To illustrate, let's look at the CIA model:

In this case, we need to balance backward compatibility with using good quality cryptography - here's a brief and probably soon-to-be-dated overview of what we ought to use and why.


Choosing protocol versions is fairly easy, as older protocols are worse, right?

TLS 1.3

As a protocol, TLS 1.3 has quite a few great improvements and is fundamentally simpler to manage with fewer knobs and dials. There is a major concern with TLS 1.3 currently - security tooling in the large enterprise hasn't caught up with this protocol yet, as new ciphers like ChaCha20 don't have hardware-assisted lanes for decryption. Here are some of the new capabilities you'll like:
  • Simplified Crypto sets: TLS 1.3 deprecates a ton of less-than-secure crypto - TLS 1.2 supports up to 356 cipher suites, 37 of which are new with TLS 1.2. This is a mess - TLS 1.3 supports five.
    • Note: The designers of TLS 1.3 achieved this by removing the key exchange (forward secrecy) and authentication methods from the cipher suite; they are negotiated separately.
  • Simplified handshake: TLS 1.3 connections require fewer round-trips, and session resumption features allow a 0-RTT handshake.
  • AEAD Support: AEAD ciphers provide both integrity and confidentiality. AES Galois/Counter Mode (GCM) and ChaCha20-Poly1305 (designed by Daniel J. Bernstein and championed by Google) serve this purpose.
  • Forward Secrecy: If a cipher suite doesn't have PFS (I disagree with "perfect") support, an attacker who captures your network traffic can decrypt it later if the private keys are ever acquired. TLS 1.3 removed the static RSA and static Diffie-Hellman key exchanges, so full handshakes are always forward secret.

Here are some of the things you can do to mitigate the risk if you're in a large enterprise that performs decryption:
  • Use a load balancer - since this is about a load balancer, you can protect your customers' traffic in transit by performing SSL/TLS bridging. Set the LB-to-Server (serverssl) profile to a high-efficiency cipher suite (TLS 1.2 + AES-CBC) that inspection tooling can handle, so the server-side leg stays encrypted while clients still get TLS 1.3.

TLS 1.2

TLS 1.2 is like the Toyota Corolla of TLS: it has been running forever, and not everyone maintains it properly.

It can still perform well if properly configured and maintained - we'll go into more detail on how in the next section. The practices outlined here are good for all editions of TLS.

Generally, TLS 1.0 and 1.1 should not be used. Two OS families (Windows XP, and Android 4 and below) were disturbingly slow to adopt TLS 1.2, so if this is part of your customer base, beware.


Cipher guidance is much more likely to become dated. I'll try to keep this short:


  • (AEAD) AES-GCM: This is usually my all-around cipher. It's decently fast and supports partial acceleration with hardware ADCs / CPUs. AES is generally pretty fast, so it's a good balance of performance and confidentiality. I don't personally think it's worth running anything but 256-bit on modern hardware.
  • (AEAD) ChaCha20: This was designed by Daniel J. Bernstein and brought into TLS (paired with Poly1305) largely through Google's efforts, and is still "being proven". Generally trusted by the public, this newer cipher is fast despite a lack of hardware acceleration.
  • AES-CBC: This was the "advanced" cipher for confidentiality before AES-GCM. Standardized in 2001, AES is highly performant and motivated users to move from ciphers like DES and RC4 by being both more performant and stronger. As with AES-GCM, I prefer not to use anything but 256-bit on modern hardware.
  • Everything else: This is the "don't bother" bucket: RC4, DES, 3DES


For hashing (message authentication), AEAD generally provides an advantage because integrity is built into the cipher. SHA3 isn't generally available in TLS yet, but SHA2 variants should be the only thing used. The more bits the better!

Forward Secrecy

  • ECDHE (Elliptic Curve Diffie-Hellman Ephemeral): This should be mandatory with TLS 1.2 unless you have customers with old Android phones and Windows XP.
  • TLS 1.3 lets you select multiple PFS algorithms that are EC-based.

Matters of Practice

Before we move into the Avi-specific configuration, I have a recommendation that is true for all platforms: version your TLS profiles.
Cryptography practices change over time - and some of these changes break compatibility. Semantic versioning (SemVer) provides the capability to support three scales of change:
  • Major Changes: First number in a version. Since the specification is focused on APIs, I'll be more specific here. This is what you'd iterate if you are removing cipher suites or negotiation parameters in a way that might break existing clients.
  • Minor Changes: Second number in a version. This category is for tuning and adding support for something new that won't break compatibility. Examples here would be cipher order preference changes or adding new ciphers.
  • Patch Changes: Third number in a version. This won't be used much in this case - here's where we'd document a fix that restores a Minor Change's intent, like correcting a mistake in cipher order preference.
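The three scales of change above map fairly naturally onto a small helper. This is only a sketch of one possible policy, with a made-up function name: removing a cipher counts as major, additions or re-ordering count as minor, and a change flagged as a correction counts as a patch. Protocol version changes are not modeled here.

```python
def next_profile_version(current, old_ciphers, new_ciphers, is_correction=False):
    """Suggest the next SemVer string for a TLS profile change."""
    major, minor, patch = (int(part) for part in current.split("."))
    if set(old_ciphers) - set(new_ciphers):
        # Removing a cipher may break existing clients: major change.
        return f"{major + 1}.0.0"
    if list(old_ciphers) == list(new_ciphers):
        return current  # nothing changed
    if is_correction:
        # Fixing a mistake in an earlier minor change: patch.
        return f"{major}.{minor}.{patch + 1}"
    # Added ciphers or a new preference order: minor change.
    return f"{major}.{minor + 1}.0"
```

Putting the resulting version into the profile name keeps older and newer profiles side by side, so virtual services can be moved between them deliberately.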

Let's do it!

Let's move into an example leveraging NSX ALB (Avi Vantage). Here, I'll be creating a "first version," but the practices are the same. First, navigate to Templates -> Security -> SSL/TLS Profile:

Note: I really like this about Avi Vantage, even if I'm not using it here. The security scores here are accurate, albeit capped out - VMware is probably doing this to encourage use of AEAD ciphers:
...but, I'm somewhat old-school. I like using Apache-style cipher strings because they can apply to anything, and everything will run TLS eventually. Here are the cipher strings I'm using - the first is TLS 1.2, the second is TLS 1.3.
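As a hedged illustration of the Apache-style (OpenSSL) cipher string syntax, and not the strings used in this profile, the sketch below applies a hypothetical ECDHE-plus-AEAD string to a Python TLS context and lists what it expands to. It is a client-side context, used only to show how such a string is interpreted. Note that Python's set_ciphers only governs TLS 1.2 and below; TLS 1.3 suites are listed regardless.

```python
import ssl

# Hypothetical example string: ECDHE key exchange with AEAD ciphers only.
TLS12_CIPHERS = "ECDHE+AESGCM:ECDHE+CHACHA20"

context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse TLS 1.0 and 1.1
context.set_ciphers(TLS12_CIPHERS)                # applies to TLS 1.2 and below

for suite in context.get_ciphers():
    print(suite["protocol"], suite["name"])
```

Expanding a string this way before applying it to a load balancer is a quick check that it selects what you think it does.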

One gripe I have here is that Avi doesn't offer a "What If" analysis like F5's TMOS does (14+ only). Conversely, applying this profile is much easier. To do this, open the virtual service, and navigate to the bottom right:

That's it! Later on, we'll provide examples of coverage reporting for these profiles. In a production-like deployment, these services should be managed with release strategies given that versioning is applied.

Friday, September 17, 2021

Static IPv4/IPv6 Addresses - Debian 11

Here's how to set both static IPv4 and IPv6 addressing on Debian 11.

First, edit /etc/network/interfaces:

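# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto ens192
allow-hotplug ens192
iface ens192 inet static
    # Address in CIDR notation, e.g. 192.0.2.10/24
    address {{ ipv4.address }}
    gateway {{ ipv4.gateway }}
iface ens192 inet6 static
    # Address in CIDR notation, e.g. 2001:db8::10/64
    address {{ ipv6.address }}
    gateway {{ ipv6.gateway }}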

Then, restart your networking stack:
systemctl restart networking
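A mistyped gateway is an easy way to lose access to a remote host when the networking stack restarts. As an optional sanity check, and assuming the addresses are written in CIDR notation, this sketch verifies that each gateway is reachable on-link from its address. The function name is made up for this example and the addresses used with it are documentation prefixes, not real values.

```python
import ipaddress


def gateway_is_reachable(address: str, gateway: str) -> bool:
    """Check that a gateway is on-link for an address given in CIDR form."""
    interface = ipaddress.ip_interface(address)
    gw = ipaddress.ip_address(gateway)
    if interface.version != gw.version:
        return False
    # IPv6 routers are commonly addressed by their link-local address.
    if gw.version == 6 and gw.is_link_local:
        return True
    return gw in interface.network


print(gateway_is_reachable("192.0.2.10/24", "192.0.2.1"))
print(gateway_is_reachable("2001:db8::10/64", "fe80::1"))
```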

Friday, September 10, 2021

VMware NSX ALB (Avi Networks) and NSX-T Integration, Installation

Note: I created a common baseline for pre-requisites in this previous post. We'll be following VMware's Avi + NSX-T Design guide.

This will be a complete re-install. Avi Vantage appears to develop some tight coupling issues with using the same vCenter for both Layer 2 and NSX-T deployments - which is not an issue that most people will typically have. Let's start with the OVA deployment:

Initial setup here will be very different compared to a typical vCenter standalone or read-only deployment. Follow the setup wizard only as far as strictly necessary:

With a more "standard" deployment methodology, the Avi Service Engines will be running on their own Tier-1 router, and leveraging Source-NAT (misnomer, since it's a TCP proxy) for "one-arm load balancing":

To perform this, we'll need to add two segments to the ALB Tier-1: one for management, and one for vIPs. I have created the following NSX-T segments, one running DHCP and one for vIPs:
Note: I used underscores in this segment name. In my own testing, both . and / are illegal characters. Avi's NSX-T Cloud Connector will report "No Transport Nodes Found" if it cannot match the segment name due to these characters.
Note: If you configure an NSX-T cloud and discover this issue, you will need to delete and re-add the cloud after fixing the names!
Note: IPv6 is being used, but I will not share my globally routable prefixes.
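Since fixing a bad segment name means deleting and re-adding the cloud, it may be worth checking names up front. This tiny sketch only encodes the two characters observed to cause trouble in the testing described above; the segment names are hypothetical, and other characters may be problematic as well.

```python
ILLEGAL_CHARACTERS = "./"


def illegal_characters_in(segment_name: str) -> list:
    """Return the problem characters found in a segment name, if any."""
    return sorted(set(segment_name) & set(ILLEGAL_CHARACTERS))


for name in ("avi_mgmt", "avi.mgmt", "avi/vip"):
    print(name, illegal_characters_in(name) or "ok")
```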

First off, let's create NSX-T Manager and vCenter Credentials:
There is one thing that needs to be created on vCenter as well - a content library. Just create a blank one and label it accordingly, then proceed with the following steps:
Click Save, and get ready to wait. The Avi controller has automated quite a few steps, and it will take a while to run. If you want, the way to track any issue in NSX ALB is to navigate to Operations -> Events -> Show Internal:
Once the NSX Cloud is reporting as "Complete" under Infrastructure -> Dashboard, we need to specify some additional data to ensure that the service engines will deploy. To do this, we navigate to Infrastructure -> Cloud Resources -> Service Engine Groups, and select the Cloud:
Then let's build a Service Engine Group. This will be the compute resource attached to our vIPs. Here I configured a naming convention and a compute target - and it can automatically drop SEs into a specific folder.
The next step here is to configure the built-in IPAM. Let's add an IP range under Infrastructure -> Cloud Resources -> Networks by editing the appropriate network ID. Note that you will need to select the NSX-T cloud to see the correct network:
Those of you who have been LTM Admins will appreciate this. Avi SEs also perform "Auto Last Hop," so you can reach a vIP without a default route, but monitors (health checks) will fail. The spot to configure the custom routes is under Infrastructure -> Cloud Resources -> Routing:

Finally, let's verify that the NSX-T Cloud is fully configured. An interesting thing I saw here is that Avi 21 shows an unconfigured or "In Progress" cloud as green now, so we'll have to mouse over the cloud status to check in on it. 
Now that everything is configured (at least in terms of infrastructure), Avi will not deploy Service Engines until there's something to do! So let's do that:
Let's define a pool (back-end server resources):

Let's set an HTTP-to-HTTPS redirect as well:

Finally, let's make sure that the correct SE group is selected:
And that's it! You're up and running with Avi Vantage 21! After a few minutes, you should see deployed service engines:
The service I configured is also now up - In this case, I'm using Hyperglass, and I can leverage the load-balanced vIP to check and see what the route advertisement from Avi looks like. As you can see, it's firing a multipath BGP host address:
