Best-Practice Security, Automation & Operability, with mTLS

Author: CECG | Posted on: June 10, 2024



Discover how we designed a robust authentication approach which can flexibly handle a diverse range of communication protocols and which scales efficiently.

The Background

This blog is about the path to creating a robust, scalable and highly operable authentication layer for a cloud-hosted, Kubernetes-based service.

For this particular project, we had a requirement to create a telemetry aggregator service which would act as a central hub, receiving data from clients across Europe. The service would sit behind a firewall so that only specific source IP ranges could connect to it, but we also needed some sort of authentication mechanism.

This telemetry service will receive metrics, logs and trace data over a variety of protocols and standards:

  • gRPC
  • fluentd’s “forward” protocol
  • syslog
  • HTTP, for metric data
  • OTLP, for trace data

So, to begin with, we’re supporting five different protocols – and we may need to support further protocols in future depending on additional client requirements.

We therefore needed an auth system which was highly secure but also highly flexible. We immediately ruled out HTTP basic auth: it isn’t part of the core gRPC spec, so many gRPC clients have no mechanism to use it, and it isn’t supported by the syslog protocol either.

That leaves mTLS. It’s a great authentication mechanism because it’s very secure, and because it’s part of the TLS protocol itself, it completely eliminates the possibility of a misconfigured client sending credentials in plaintext. mTLS is also ubiquitously supported, including by all of the protocols we’re intending to offer.

Auth == (Authentication + Authorization)

It’s important to stress that the authentication approach presented here is relatively simple, and this is a consequence of the inherent simplicity of the system for which we’ve designed it. The system is an API which enables external services to push data, but to perform no other operation – not even reading data.

Accordingly, we needed authentication at the service level rather than for individual users, and only one level of authorisation was required. On this basis, the mTLS approach outlined here was deemed to be sufficient. If the system were extended in future to serve individual users – for example, to let them read data – we would need to develop a more sophisticated auth approach, such as scoped, time-limited tokens.

The Design

The service is intended to act as a central endpoint for clients distributed across Europe, and one of its aims is to attach appropriate labels to each piece of data it receives.

These labels will match the client from whom the data was sent. So, all logs from team Alpha will be labelled “team=alpha” and so on. This raises the question of how to identify which logs came from which team. We could ask teams to apply an identifier to each piece of data – however, if teams fail to do this, or do it incorrectly, data could be lost (or assigned to the wrong team), and that could be tricky to rectify. A better approach is to implement something server-side which matches each log to its corresponding client, without obliging the client to do anything at all.

We decided we would create a dedicated DNS record for each client, so all data sent to a particular subdomain would be routed to a corresponding backend service, which would in turn attach the appropriate label. So, team Alpha would be given a sub-domain of alpha.logs.telemetry-aggregator.com.

Within this sub-domain, we decided to support the different log protocols on dedicated ports:

  • Team Alpha gRPC: alpha.logs.telemetry-aggregator.com:443
  • Team Alpha Fluent/Forward: alpha.logs.telemetry-aggregator.com:24224
  • Team Alpha syslog: alpha.logs.telemetry-aggregator.com:601

Recall that we’re supporting multiple protocols, only one of which is HTTP. Lower-level protocols (e.g. syslog) won’t declare the DNS record to which they’ve sent the data; they’ll resolve the DNS record to an IP and the IP alone will be transmitted across the wire, rather than the intended DNS record.

All the DNS records will resolve to one load-balancer. So once the TCP packets arrive, how will we know which DNS record was the intended recipient - or to put it more plainly, how will we know which team sent each piece of received data?

Luckily the TLS spec contains an element specifically created to address this problem: Server Name Indication, or SNI. SNI is simply a declaration by the client during the TLS handshake which states the host to which the TLS connection pertains. This allows a single server IP to use multiple different certificates, returning the certificate which matches the SNI stated by the client.

In practice, the SNI is the same as the DNS record to which the client addressed their request.

So, we can use the SNI to determine the intended DNS address, and by extension the identity of the client establishing the connection – even before any client data is transmitted. This removes the need for clients to insert a label into the telemetry data itself; as long as they address the correct DNS record, their data will be routed to the correct backend.

Routing aside, we also need to prevent unwanted users from sending data to our system. We do this by requiring clients to present a valid client certificate – in other words, mutual TLS – which will block any agent without such a certificate.

However, if all the client certificates are signed by the same CA as the server certificate, each client will be able to push data to every other client’s dedicated DNS record, and this is something which we do not want.

One way around this would be to use a different CA for each client, but this would be extremely inefficient and burdensome.

Luckily, once again, TLS contains a feature which addresses this: the Subject Alternative Name (SAN) extension, which, in brief, restricts the DNS names against which a certificate can be used. So, we can add each client’s dedicated DNS record to their certificate as a SAN – and we can add the complete list of all client DNS records to the server’s certificate.

Provided each side validates the other’s certificate, every client will accept the server’s certificate – but each client will only be able to initiate a connection to its own dedicated DNS record. So with this approach we’ve achieved our security aims with an elegantly minimal PKI architecture: one CA, one server certificate and one certificate per client.

The Implementation

To begin with, we need something that can create the DNS records for each team. Creating these manually, or in terraform, would be time-consuming and inefficient. Instead, we’re using a tool called ExternalDNS, a Kubernetes utility which listens for annotations on Ingress or Service objects and creates DNS records which point to the associated load-balancer. It’s a handy tool because it means you can perform service-discovery of Kubernetes workloads whilst working only with k8s YAML – and once properly configured it will continuously track the lifecycle of the k8s resources and automatically update and delete the DNS records as appropriate.
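As a sketch of how this works, a Service annotated along the following lines would prompt ExternalDNS to create the record – the resource names, hostname and ports here are illustrative, not our production config:

```yaml
# Sketch: a LoadBalancer Service annotated for ExternalDNS.
# Names, hostname and ports are illustrative.
apiVersion: v1
kind: Service
metadata:
  name: envoy-proxy
  annotations:
    # ExternalDNS watches for this annotation and creates a DNS record
    # pointing the hostname at the Service's load-balancer IP.
    external-dns.alpha.kubernetes.io/hostname: alpha.logs.telemetry-aggregator.com
spec:
  type: LoadBalancer
  selector:
    app: envoy
  ports:
    - name: grpc
      port: 443
      targetPort: 8443
```

When the Service is updated or deleted, ExternalDNS adjusts or removes the record to match.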

Next, we need some tooling to provision the TLS certificates. The de facto standard tool for this is cert-manager. This k8s operator can automatically provision and renew certificates by listening for a custom “Certificate” resource: once the resource is created in the cluster, cert-manager will obtain the cert and then place it, plus its associated key, in a k8s Secret.
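To make this concrete, a pair of Certificate resources along these lines – the names and issuer are illustrative assumptions, not our actual manifests – would give team Alpha a client cert whose SAN is their dedicated DNS record, and give the server a single cert listing every client record:

```yaml
# Sketch: client certificate for team Alpha. The dnsNames entry becomes the
# SAN restricting this cert to the team's dedicated record.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: alpha-client-cert
spec:
  secretName: alpha-client-tls        # cert-manager writes the cert + key here
  dnsNames:
    - alpha.logs.telemetry-aggregator.com
  usages:
    - client auth
  issuerRef:
    name: private-ca-issuer           # hypothetical issuer backed by our CA
    kind: Issuer
---
# Sketch: the single server certificate carries every client's dedicated
# record as a SAN, so all clients can validate it.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: telemetry-server-cert
spec:
  secretName: telemetry-server-tls
  dnsNames:
    - alpha.logs.telemetry-aggregator.com
    - bravo.logs.telemetry-aggregator.com   # one entry per client
  usages:
    - server auth
  issuerRef:
    name: private-ca-issuer
    kind: Issuer
```

cert-manager renews both certificates automatically as they approach expiry.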

We also need a Certificate Authority to issue the certs. We could provision our own CA using, say, Hashicorp Vault, and run it in the Kubernetes cluster. But this would impose a significant maintenance burden, particularly in terms of securing the root key. Our cloud provider, GCP, offers a Private CA service – which meets our needs perfectly. We can provision the CA with Terraform, so we have an audit trail and the ability to rescue or recreate the CA should we need to. And because the CA is held by the cloud provider, we don’t need to worry about securing the root key.

Lastly, we need a proxy which will do the work of validating and terminating the mTLS connections, routing the requests based on the domain to which the data was sent, and load-balancing connections to the aggregators in the backend.

We were already using Envoy elsewhere in this project, so it made sense to re-use the prior art for this piece of work. On top of this, Envoy has some features which make it well-suited to this use case. One in particular is that Envoy can perform a “hot reload” of certs without restarting or dropping connections – so, when cert-manager renews a server certificate, Envoy can use the updated cert immediately, without any interruption or manual intervention. Another benefit is that Envoy supports Certificate Revocation Lists, so if a client cert were ever compromised we could add it to a CRL and block it.
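Putting the pieces together, an Envoy listener can require a client certificate and select a filter chain by SNI, along these lines – the cluster names, ports and file paths are illustrative, not our production config:

```yaml
# Sketch: Envoy listener that terminates mTLS and routes by SNI.
# Cluster names, ports and file paths are illustrative.
static_resources:
  listeners:
    - name: telemetry_listener
      address:
        socket_address: { address: 0.0.0.0, port_value: 8443 }
      listener_filters:
        # Reads the SNI from the ClientHello so filter chains can match on it.
        - name: envoy.filters.listener.tls_inspector
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.listener.tls_inspector.v3.TlsInspector
      filter_chains:
        # One chain per client, matched on the SNI they presented.
        - filter_chain_match:
            server_names: ["alpha.logs.telemetry-aggregator.com"]
          transport_socket:
            name: envoy.transport_sockets.tls
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
              require_client_certificate: true   # this is what makes it mutual TLS
              common_tls_context:
                tls_certificates:
                  - certificate_chain: { filename: /etc/tls/server.crt }
                    private_key: { filename: /etc/tls/server.key }
                validation_context:
                  trusted_ca: { filename: /etc/tls/ca.crt }
          filters:
            - name: envoy.filters.network.tcp_proxy
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
                stat_prefix: alpha
                cluster: alpha_aggregator  # backend that applies team=alpha labels
```

A client presenting the SNI of another team’s record will hit a different filter chain (or none), so it can’t reach a backend it isn’t entitled to.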

Figure 1. A diagram representing our architecture for a cloud authentication system which uses DNS & SNI to route data to appropriate backend services. DNS records resolve to one network passthrough load-balancer. The load-balancer routes traffic to an Envoy deployment. Envoy checks the mTLS client-certificate, terminates the TLS, and then routes the traffic to the relevant backend service via the SNI.

The Deployment

We’re using Helm to template and deploy our Kubernetes manifest files. This enables us to programmatically create different sets of resources for each client whilst keeping our k8s YAML very DRY. It also has the benefit of putting guard-rails in place to limit the possibility of introducing incorrect config – which could be a big problem in the case of Envoy.

Our Helm chart can provision a single Envoy deployment, to serve as the auth and load-balancing layer – and one aggregator deployment per client, to apply the requisite labels. We can provision one LoadBalancer Service to expose the Envoy deployment externally – and we can populate its annotations programmatically, enabling ExternalDNS to create the corresponding DNS records. Likewise, our chart can contain a server and a client certificate template – with one of the latter created for each client.
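Onboarding a new client then reduces to adding an entry to the chart’s input, something along these lines – a hypothetical sketch, not our actual values file:

```yaml
# Sketch of a Helm values file: one entry per client drives the templates.
# Domain and client names are illustrative.
domain: logs.telemetry-aggregator.com
clients:
  - name: alpha        # yields alpha.logs.telemetry-aggregator.com
    labels:
      team: alpha      # attached to all data from this client
  - name: bravo
    labels:
      team: bravo
```

The templates loop over `clients` to stamp out the per-client aggregator deployment, client certificate and Envoy route, and to extend the ExternalDNS annotations and the server certificate’s SAN list.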

The upshot of all this is that we can deploy and modify the entire architecture outlined above – proxies, aggregator agents, load-balancers, certificates, DNS records – simply by modifying one Helm YAML input file, making our stack very easy to operate and easy to incorporate into CI/CD.

Conclusion

This approach has given us a robust auth mechanism which will work for any protocol. On top of this, the partnership of Helm, Envoy, cert-manager and ExternalDNS enables us to onboard new clients and provision all the associated resources simply by maintaining a Helm YAML file – so we have achieved a good outcome in terms of automation and operability.

Routing all the traffic through a single Envoy layer means we have a single point of failure, so it’s important to ensure that any changes to the Envoy config template are comprehensively tested. Adding new input parameters to the existing template is very safe, because this simply adds or removes pre-tested elements – but updating the template itself carries a risk, which we mitigate by ensuring comprehensive coverage in our integration tests.

Another pitfall of this approach is the requirement to transmit certificates to our clients. Because of the difficulty inherent in doing this, the certificates issued in the present architecture have long expiry periods. This underscores the importance of a robust revocation mechanism to mitigate a compromised certificate. One way the system could be improved is to shorten the client certificate lifetime and grant clients access to a secret store from which they can retrieve rotated certificates. With such an approach, we could issue client certificates with very short expiry periods, as is best practice.