Supporting private service access in GCP from a multi-tenanted Kubernetes platform

Author: Tiago Alves | Posted on: October 25, 2024

Having a developer platform that works out of the box is great, but integrating it with other cloud provider resources outside the platform can be tricky:

Cost calculation for tenants becomes more complex, as we need to work out internally which resources belong to which tenant and what they cost.
Access to the created resources needs to be part of the feature.
Multi-tenanted resources: we would need to ensure there are no clashes and no noisy neighbours.

One potential solution to these issues is to have tenants fully manage their own resources. For that to work, the platform needs to support connecting to those resources.

What considerations should this feature have?

It should be easy to add the configuration that establishes connectivity between the tenant and their resources (e.g., databases and storage buckets).
The platform needs to ensure that only the tenant who owns the resource can connect to it.

What have we implemented in our GKE platform?

As part of our Core Platform on GKE, we’ve taken advantage of a few features that GKE provides.

IAM Auth & Connectivity

This feature (Workload Identity on GKE) allows us to add an IAM policy to a Google Cloud Service Account that binds it to one or more namespaces and a specific Kubernetes Service Account name. This ensures:

I can’t use other Kubernetes Service Accounts to impersonate a Google Service Account because only an explicitly named Kubernetes Service Account can be used.
I can’t copy the Kubernetes Service Account and use it in another namespace because it needs to be listed on the allowed namespace list. A nice feature here is that your Kubernetes namespace does not need to exist when it is referenced, so you can create the policy before creating the namespace.
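The binding described above can be sketched as a small helper. This is a minimal illustration rather than our actual tooling: it assumes the standard Workload Identity member format and the roles/iam.workloadIdentityUser role, and the project, namespace and Service Account names are hypothetical.

```python
# Role that lets a Kubernetes Service Account impersonate a Google Service Account.
WORKLOAD_IDENTITY_ROLE = "roles/iam.workloadIdentityUser"


def workload_identity_members(project_id, namespaces, ksa_name):
    """Return one IAM member per allowed namespace for a single KSA name.

    project_id is the project that hosts the GKE cluster (its workload pool).
    """
    return [
        f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"
        for namespace in namespaces
    ]


def workload_identity_binding(project_id, namespaces, ksa_name):
    """Build the IAM policy binding to attach to the Google Service Account."""
    return {
        "role": WORKLOAD_IDENTITY_ROLE,
        "members": workload_identity_members(project_id, namespaces, ksa_name),
    }


# Hypothetical values, for illustration only.
binding = workload_identity_binding(
    project_id="platform-project",
    namespaces=["tenant-a-dev", "tenant-a-prod"],
    ksa_name="tenant-a-sa",
)
print(binding)
```

Because each member names both the namespace and the Kubernetes Service Account, a Service Account with the same name in any namespace outside the list gets no access.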

You can then reuse these same Service Accounts to authenticate against services like databases.

Let’s take the example of Cloud SQL with MySQL Server. You’ll need to:

Create the Database with cloudsql_iam_authentication enabled.
Create an IAM user on the Database.

IMPORTANT NOTE - IAM Users are created without any privileges. To grant them, you’ll need a regular user to run a command such as:

GRANT ALL PRIVILEGES ON `database`.* TO `iam-username`@`%`;
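If you create these grants for many tenants, the statement can be generated. The sketch below is an illustration, not part of our platform. It assumes that, for Cloud SQL for MySQL, a Service Account's database username is its email with the domain suffix removed. The helper names and the example email are hypothetical.

```python
def mysql_iam_username(service_account_email):
    """Derive the database username from a Service Account email.

    Assumption: Cloud SQL for MySQL uses the part before the "@".
    """
    return service_account_email.split("@", 1)[0]


def quote_identifier(name):
    """Quote a MySQL identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def grant_all_statement(database, iam_username, host="%"):
    """Build the GRANT statement a regular (non-IAM) user needs to run."""
    return (
        f"GRANT ALL PRIVILEGES ON {quote_identifier(database)}.* "
        f"TO {quote_identifier(iam_username)}@{quote_identifier(host)};"
    )


# Hypothetical Service Account, for illustration only.
username = mysql_iam_username(
    "tenant-a-sa@platform-project.iam.gserviceaccount.com"
)
print(grant_all_statement("database", username))
```

In practice you would probably grant a narrower set of privileges than ALL; the helper only mirrors the statement shown above.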

There are then two ways to connect to your newly created Database from the Developer Platform:

Use one of the existing connectors. This means you might need a Cloud SQL-specific implementation, and therefore some logic to pick different clients if you’re planning to connect to regular Databases as well (in memory, deployed MySQL instances for testing, etc).
The other, more language-agnostic solution is to have a Cloud SQL Auth Proxy running as a sidecar on your pods.
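To make the sidecar option more concrete, here is a minimal sketch of the container spec you would add to a pod. It assumes version 2 of the Cloud SQL Auth Proxy and its --auto-iam-authn and --port flags. The image tag and the instance connection name are placeholders, not values from our platform.

```python
import json


def cloud_sql_proxy_sidecar(instance_connection_name, image, port=3306):
    """Return a container spec for a Cloud SQL Auth Proxy sidecar."""
    return {
        "name": "cloud-sql-proxy",
        "image": image,
        "args": [
            "--auto-iam-authn",  # log in to the database as the pod's IAM identity
            f"--port={port}",  # local port the application connects to
            instance_connection_name,  # format: project:region:instance
        ],
        "securityContext": {"runAsNonRoot": True},
    }


# Placeholder image tag and instance name, for illustration only.
sidecar = cloud_sql_proxy_sidecar(
    instance_connection_name="platform-project:europe-west2:tenant-a-db",
    image="gcr.io/cloud-sql-connectors/cloud-sql-proxy:TAG",
)
print(json.dumps(sidecar, indent=2))
```

The application then connects to 127.0.0.1 on the chosen port with a plain MySQL client and the IAM username, which is what keeps this approach language agnostic.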

Private Service Access

Another way to connect to resources/services that live in different GCP Accounts is to use Private Service Access (PSA).

This works very differently from the IAM auth described in the previous section. It works by creating a VPC with a given private IP range inside the same Network on the Developer Platform’s account. Supported resources created by tenants (e.g., Databases) use IPs from this Subnetwork, and the Core Platform is able to connect to them.


This has both advantages and disadvantages. The main advantage is that it’s slightly easier for Platform users to wrap their heads around, as they don’t need to deal with Service Accounts and Cloud SQL proxies for authentication. A Cloud SQL instance created with PSA will be reachable from anywhere inside the cluster by using its IP directly. There are several disadvantages though:

Only a single PSA VPC can be created per GCP Account.
An account can only be linked to a single other GCP Account. Linking your account to 2 different GCP Accounts is not possible.
Additional work is needed to restrict access to the requesting tenant: the whole cluster can reach that IP, so only network policies can restrict who can actually connect to it. Database Authentication would of course still control access, but we cannot provide that as a feature, so network policies need to be in place to fully isolate that resource.
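One way to approach that last point is sketched below. It is an illustration under assumptions, not what we run: it assumes network policy enforcement is enabled on the cluster and relies on Kubernetes NetworkPolicies being additive. Every namespace gets a policy that allows egress everywhere except the PSA range, and the owning namespace gets an extra policy that allows its own instance IP. The namespaces, range and IP are hypothetical.

```python
import ipaddress


def allow_instance_egress(namespace, instance_ip, port=3306):
    """Owner namespace: allow egress to the tenant's own instance IP."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-own-psa-instance", "namespace": namespace},
        "spec": {
            "podSelector": {},  # applies to every pod in the namespace
            "policyTypes": ["Egress"],
            "egress": [{
                "to": [{"ipBlock": {"cidr": f"{instance_ip}/32"}}],
                "ports": [{"protocol": "TCP", "port": port}],
            }],
        },
    }


def deny_psa_range_egress(namespace, psa_range):
    """Any namespace: allow egress everywhere except the PSA range."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "deny-psa-range", "namespace": namespace},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Egress"],
            "egress": [{
                "to": [{"ipBlock": {"cidr": "0.0.0.0/0", "except": [psa_range]}}],
            }],
        },
    }


def tenant_isolation_policies(owner, namespaces, psa_range, instance_ip):
    """Build the policies that leave only the owner able to reach the IP."""
    if ipaddress.ip_address(instance_ip) not in ipaddress.ip_network(psa_range):
        raise ValueError("instance IP is outside the PSA range")
    policies = [deny_psa_range_egress(ns, psa_range) for ns in namespaces]
    policies.append(allow_instance_egress(owner, instance_ip))
    return policies


# Hypothetical namespaces and addresses, for illustration only.
policies = tenant_isolation_policies(
    owner="tenant-a",
    namespaces=["tenant-a", "tenant-b"],
    psa_range="10.100.0.0/16",
    instance_ip="10.100.0.5",
)
for policy in policies:
    print(policy["metadata"])
```

Since a tenant could otherwise add their own allow policy, this only isolates the resource if the platform controls who can create network policies in tenant namespaces.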

Final Thoughts

If we were to pick one, we’d go with IAM Authentication & Connectivity. It provides isolation out of the box, and the complexity added by the Service Accounts is low. However, some services don’t support IAM Auth & Connectivity, as is the case with Memorystore (Redis), and in that scenario we’re forced to go with PSA.