

Data Contracts Terraform/Kubernetes

I know, I just wrote a rant about using Kubernetes, and now here I am with yet another technical article about it?

The reality is that I've been using Kubernetes to run production workloads at scale for several years now. I'm also a strong supporter of SOLID principles and of everything as code.

With this article I'm going to share a pattern I came up with, which I've named "Data Contracts Terraform/Kubernetes". We successfully introduced this pattern, and it makes life easier both for Platform Engineers and for the internal customers of such a platform.

What am I trying to solve?

In general, I'm trying to keep the responsibility for holding information, as a single source of truth, in one place (inspired by the SOLID principles), and to use proper APIs or contracts to let this information be consumed.

But what information are we talking about? Let's go through some real-life use cases.

Assumptions in this scenario

  1. We run Terraform (or, better, Terragrunt) to provision our infrastructure, up to the point where we have a provisioned Kubernetes control plane and one or more node groups/pools to deploy our workloads.
  2. From this moment on, the workloads in Kubernetes are deployed via a GitOps pattern, for example with FluxCD or ArgoCD.
  3. We use HELM to provision the Kubernetes components.

Use-Case: Identities for K8S workloads

When - in Kubernetes - you need to run an application that in some way requires interaction with cloud resources, you need to enable the service account behind the workload to assume a role in the cloud provider and get temporary tokens. That is, at a high level, how AWS IRSA (IAM Roles for Service Accounts) and Pod Identity, or Azure Workload Identity, work.

To enable the Service Account, you have to pair the service account namespace and name with a dedicated role carrying only the required privileges, and also provision a hosted zone to manage. All these resources are provisioned in Terraform, but within the realm of K8S there is no knowledge of them.

A practical example is ExternalDNS. This K8S component needs to perform changes on the hosted zone (Route 53, Azure DNS zone, ...). In order to deploy a working configuration, we need to pass some information in the HELM release: for example, in Azure, the annotation azure.workload.identity/client-id that you preconfigured with a managed identity in Terraform, the ResourceGroup, the SubscriptionID, etc.
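
To make the Azure flavour concrete, below is a minimal Terraform sketch of that pairing, assuming an AKS cluster resource with workload identity enabled and a variable pointing at the cluster's DNS zone; every name here is illustrative, not taken from a real repository.

# Hedged sketch: a managed identity for ExternalDNS plus the federated
# credential that pairs it with the service account namespace/name.
resource "azurerm_user_assigned_identity" "dns_management" {
  name                = "id-external-dns"
  location            = var.location
  resource_group_name = var.resource_group_name
}

resource "azurerm_federated_identity_credential" "external_dns" {
  name                = "external-dns"
  resource_group_name = var.resource_group_name
  parent_id           = azurerm_user_assigned_identity.dns_management.id
  audience            = ["api://AzureADTokenExchange"]
  issuer              = azurerm_kubernetes_cluster.this.oidc_issuer_url
  # namespace and name of the ServiceAccount used by the ExternalDNS pods
  subject             = "system:serviceaccount:external-dns:external-dns"
}

# Grant only the privileges required to manage the cluster's hosted zone
resource "azurerm_role_assignment" "dns_zone_contributor" {
  scope                = var.dns_zone_id # the cluster's dedicated DNS zone
  role_definition_name = "DNS Zone Contributor"
  principal_id         = azurerm_user_assigned_identity.dns_management.principal_id
}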

With your GitOps technology of choice you have the possibility to assign the values.yaml while deploying the base chart.

Now, do we want to hardcode these values? What about deploying the same blueprint on another cluster? Do we need to keep multiple overrides, one per cluster?

Enter Infra/Kubernetes contracts

The Infra/K8S contract is nothing more than a glorified ConfigMap that is in charge of holding infrastructure metadata that would otherwise not be known inside the Kubernetes realm.

When we provision the infrastructure via code, we provision a dedicated hosted zone for each cluster. This hosted zone will be managed exclusively by the installation of ExternalDNS in each cluster.
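
As a hedged sketch (naming is illustrative, modelled on the FQDNs used later in this article), the per-cluster zone could look like this in Terraform:

# One dedicated DNS zone per cluster; its name is what will later be exposed
# to the cluster as clusterFQDNSuffix, e.g. myaks-westeurope.g-c.dev.
resource "azurerm_dns_zone" "cluster" {
  name                = "${var.cluster_name}-${var.location}.g-c.dev"
  resource_group_name = var.resource_group_name
}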

Since we know the information about the identities, resource groups and hosted zone, we can make it available inside the cluster via a ConfigMap. For example:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-metadata
  namespace: default
data:
  clusterFQDNSuffix: myaks-westeurope.g-c.dev
  subscriptionId: 453004c1-3997-44b1-99e1-3210956f4cb4
  resourceGroup: myResourceGroup
  [...]

See the diagram below:

sequenceDiagram
    title Terraform / Kubernetes contract exchange

    participant terraform as Terraform/Terragrunt

    terraform -->> Cloud : provision Cloud resources/identities
    terraform -->> AKS : provision AKS controlplane/nodes
    terraform -->> FluxCD : provision FluxCD
    terraform -->> AKS : generate cluster-metadata
    note over terraform,AKS: ConfigMap: clusterFQDNSuffix, dnsManagementClientId, ...

    FluxCD -->> FluxCD: process HELM Charts
    FluxCD -->> AKS : consume cluster-metadata

    box Kubernetes Realm (GitOps)
    participant AKS
    participant FluxCD
    end

Consuming the contract

From within the Kubernetes realm, we have the possibility to dynamically discover this information.

Examples

I have a working example with my AKSLab Cluster. It is composed of 2 repositories.

1 - Provisioning ClusterMetadata

Here you can see how - after provisioning the AKS control plane and node pool - I collect all the data that I want to project inside the Kubernetes realm via the cluster-metadata ConfigMap. The latter is created via a Terraform K8S manifest, here.
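
The shape of that resource is roughly the following; this is a hedged sketch that reuses the illustrative resource names from the earlier snippets, not a copy of the repository code.

# The Infra/K8S contract itself: Terraform projects the infrastructure
# metadata into the cluster as the cluster-metadata ConfigMap, using the
# keys that the GitOps side will substitute later.
data "azurerm_client_config" "current" {}

resource "kubernetes_config_map" "cluster_metadata" {
  metadata {
    name      = "cluster-metadata"
    namespace = "default"
  }

  data = {
    clusterFQDNSuffix               = azurerm_dns_zone.cluster.name
    dnsManagementClientId           = azurerm_user_assigned_identity.dns_management.client_id
    dnsManagementAzureResourceGroup = var.resource_group_name
    azureTenantId                   = data.azurerm_client_config.current.tenant_id
    azureSubscriptionId             = data.azurerm_client_config.current.subscription_id
  }
}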

2 - Configuring FluxCD to finalize the HELM Templates

In my case I'm using Flux Kustomization to finalize the HELM templates.

Here I create the Flux manifest to reference the Kustomization and include the postBuild stage to finalize the templates.
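
A minimal sketch of such a Kustomization, assuming a GitRepository source named gitops-repo in the same namespace and an illustrative path, could look like this:

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: cluster-components
  # substituteFrom references are resolved in the Kustomization's own
  # namespace, so it has to match where cluster-metadata was created
  namespace: default
spec:
  interval: 10m
  prune: true
  path: ./clusters/base
  sourceRef:
    kind: GitRepository
    name: gitops-repo
  # postBuild injects the contract values into the ${...} placeholders
  # of the manifests rendered by this Kustomization
  postBuild:
    substituteFrom:
      - kind: ConfigMap
        name: cluster-metadata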

3 - Helm Charts

Again, with FluxCD, in my GitOps repository I can keep the configuration completely DRY and leverage the substitution from the cluster-metadata to finalize the HELM template.

See this example for ExternalDNS (relevant snippet below)

[...]
# note the placeholders that will be dynamically substituted by FluxCD at deploy time
        azure.workload.identity/client-id: ${dnsManagementClientId:=notProvided}
      labels:
        azure.workload.identity/use: "true"
    azure:
      resourceGroup: ${dnsManagementAzureResourceGroup:=notProvided}
      tenantId: ${azureTenantId:=notProvided}
      subscriptionId: ${azureSubscriptionId:=notProvided}

The final result is that I can keep the HELM charts very DRY and agnostic with respect to the cluster where they are deployed.

One more example: ingress hostnames

When you prepare a helm chart for your workload, you typically template the ingress configuration. An aspect of this configuration is the FQDN (fully qualified domain name) of the workload. This will probably trigger certificate creation, hostname registration in a dedicated hosted zone (managed by ExternalDNS), and more.

But, when I create my helm chart, I don't know the complete FQDN. I surely know the hostname I want to use, but I might not even care about the rest of the FQDN.

If we indeed treat Kubernetes clusters as cattle instead of pets, we should not be interested in knowing beforehand the hosted zone managed by the cluster.

As an application owner, I'm interested in obtaining an endpoint on an allocated cluster in a specific region, and in registering that endpoint behind my commercially oriented hosted zone (the one I advertise to my customers).

This is a very good pattern: I don't directly disclose the cluster FQDN of my workload, and it gives me the chance to re-route the DNS traffic toward another cluster (another region, or blue-green deployments, you decide).

Below is the very high-level diagram:

---
title: Example of AKS MultiRegion Workload
---
graph LR
    U(User) --> HZ(["`Hosted Zone _api.g-c.dev_`"])
    HZ --> EP1{{"`AKS / westeurope _api.myaks-westeurope.g-c.dev_`"}}
    HZ --> EP2{{"`AKS / northeurope _api.myaks-northeurope.g-c.dev_`"}}

    EP1 --> WKLD1(Workload)
    EP2 --> WKLD2(Workload)

What I really need to have under control is the hosted zone api.g-c.dev and the hostname (only the initial part) that I decide on each compute allocation (AKS clusters).

Now, most of the charts require me to have knowledge of the FQDN. How can I make sure I can create a chart without knowing the information strictly related to the target cluster?

Again, cluster-metadata

With a helm chart we can template the ingress manifest very easily, and leverage the helm lookup capability to automatically retrieve the necessary information from the cluster-metadata ConfigMap.

Example of how we can implement the ingress template with HELM:

NOTE: I've redacted most of the template and kept only the interesting snippet

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-workload
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  tls:
  - hosts:
    - {{ .Values.hostname }}.{{ (lookup "v1" "ConfigMap" "default" "cluster-metadata").data.clusterFQDNSuffix }}
    secretName: my-workload-tls
  rules:
  - host: {{ .Values.hostname }}.{{ (lookup "v1" "ConfigMap" "default" "cluster-metadata").data.clusterFQDNSuffix }}
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-workload-service
            port:
              number: 80

With this template we don't care about the FQDN, and we can deploy the same values on each cluster in each region, without having to hardcode or keep separate values per target.
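
For instance, the per-workload values can be reduced to something as small as the sketch below; the rendered hosts reuse the example FQDNs from the diagram above.

# Same values file shipped to every cluster: only the short hostname is set,
# the FQDN suffix is looked up in-cluster from cluster-metadata.
hostname: api
# Rendered host on the westeurope cluster:  api.myaks-westeurope.g-c.dev
# Rendered host on the northeurope cluster: api.myaks-northeurope.g-c.dev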

Conclusion

With this pattern you keep a single source of truth, propagate the information required in the Kubernetes realm, and keep it always in sync after each subsequent infrastructure change.

In my previous project we were deploying clusters at scale for a Kubernetes Platform as a Service, and this pattern proved extremely convenient and powerful for having standard blueprints applicable to every cluster.

And, finally, treating Kubernetes clusters as cattle and not as pets.