Introduction
Kubernetes has become increasingly popular in recent years. Many new projects choose it as their default platform, and its flexibility and advantages are such that many legacy applications are being refactored, containerized, and migrated to it.
In real life, it is common to have services you develop and maintain deployed alongside third-party services and tools in the same cluster (or clusters).
Although not all services are exposed via an endpoint (e.g. batch jobs), many of them are, whether to a restricted set of trusted clients or, more often, to the public at large. In both cases, communication goes through networks we do not own.
From a security standpoint, a lot can be done to protect these communications, but a must-have is securing them with TLS certificates.
In this article, we will explore how to secure Kubernetes services using Let's Encrypt certificates, and how to automatically generate these certificates and renew them before they expire.
Preparing the Kubernetes cluster
We will be using AWS as a Cloud Provider, and EKS to create our cluster. The DNS will be managed by Route 53.
We will be using mainly Terraform to provision our Infrastructure.
Let’s start by creating a Kubernetes cluster that we will be using during our exploration, and let’s start with the basic building blocks.
In the following setup phase, we will prepare an EKS cluster and install the ExternalDNS controller and the NGINX Ingress Controller. If you want to skip this part, feel free to jump directly to Configure and Install cert-manager.
Installing controllers
At this stage, we start with a simple EKS cluster as defined here.
However, some functionality we will be needing is not there yet: no Ingress Controller is installed, and DNS records are not handled by any component.
We will install NGINX Ingress Controller, and ExternalDNS to automatically update our Route 53 zone from ingress resources.
These controllers' Helm Charts will be installed using the Terraform Helm provider.
We won’t go deep into the details of these controllers as they are not our main topic.
For our Kubernetes workloads, we will make use of IRSA (IAM Roles for Service Accounts) through the iam-assumable-role-with-oidc Terraform module. This will allow us to have dedicated IAM roles for our Kubernetes service accounts, with only the required permissions.
ExternalDNS
To manage our AWS Route 53 zone we will install and configure external-dns. We will define a Terraform variable with our DNS zone:
public_dns_zone = "dev.cloudiaries.com"
For the ExternalDNS-specific variables:
variable "external-dns" {
default = {
chart_version = "6.14.1"
namespace = "system"
service_account = "external-dns"
}
}
ExternalDNS IAM
Let’s get the ID of our DNS zone:
data "aws_route53_zone" "selected" {
name = var.public_dns_zone
private_zone = false
}
And create an IAM Role for ExternalDNS deployment:
module "iam_assumable_role_for_external_dns" {
source = "terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc"
version = "5.14.3"
create_role = true
number_of_role_policy_arns = 1
role_name = "external-dns-role-${var.eks.cluster_name}"
provider_url = replace(var.eks.cluster_oidc_issuer_url, "https://", "")
role_policy_arns = [aws_iam_policy.external_dns.arn]
oidc_fully_qualified_subjects = ["system:serviceaccount:${var.external-dns.namespace}:${var.external-dns.service_account}"]
}
# ExternalDNS policy
data "aws_iam_policy_document" "external_dns" {
statement {
actions = ["sts:AssumeRole"]
resources = ["*"]
}
statement {
actions = [
"route53:ChangeResourceRecordSets"
]
resources = [
"arn:aws:route53:::hostedzone/${data.aws_route53_zone.selected.zone_id}",
]
}
statement {
actions = [
"route53:ListHostedZones",
"route53:ListResourceRecordSets"
]
resources = ["*"]
}
}
resource "aws_iam_policy" "external_dns" {
name = "external-dns-policy-${var.eks.cluster_name}"
policy = data.aws_iam_policy_document.external_dns.json
}
We will use the following basic configuration for ExternalDNS, watching Ingress resources for DNS record management. We'll also create the CRDs, and force the deployment to be scheduled on the system node group. The following configuration will be part of our external-dns-values.yml:
tolerations:
  - key: "workload_type"
    operator: "Equal"
    value: "system"
    effect: "NoSchedule"
nodeSelector:
  workload_type: system
sources:
  - ingress
crd:
  create: true
logFormat: json
policy: sync
And finally, the chart installation:
resource "helm_release" "external_dns" {
name = "external-dns"
repository = "https://charts.bitnami.com/bitnami"
chart = "external-dns"
version = var.external-dns.chart_version
values = [file("${path.module}/external-dns-values.yml")]
render_subchart_notes = false
namespace = var.external-dns.namespace
create_namespace = true
set {
name = "serviceAccount.name"
value = var.external-dns.service_account
}
set {
name = "provider"
value = "aws"
}
set {
name = "aws.region"
value = var.region
}
set {
name = "aws.assumeRoleArn"
value = module.iam_assumable_role_for_external_dns.iam_role_arn
}
set {
name = "aws.zoneType"
value = "public"
}
set {
name = "domainFilters[0]"
value = data.aws_route53_zone.selected.name
}
set {
name = "zoneIdFilters[0]"
value = data.aws_route53_zone.selected.zone_id
}
set {
name = "serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn"
value = module.iam_assumable_role_for_external_dns.iam_role_arn
}
}
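Before going further, we can confirm the controller is running. A quick check, assuming the Bitnami chart's standard Kubernetes labels:
kubectl -n system get pods -l app.kubernetes.io/name=external-dns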
Ingress Controller
We will use ingress-nginx as our Ingress Controller.
The Ingress Controller will be exposing our services through a Load Balancer.
For the ingress-specific variables:
variable "ingress" {
default = {
namespace = "system"
chart_version = "4.5.2"
timeout = "600"
}
}
A lot can be said about NGINX ingress configuration, but it is not our main topic. Let's install the Ingress Controller as follows:
resource "helm_release" "ingress_controller" {
name = "nginx-ingress-controller"
repository = "https://kubernetes.github.io/ingress-nginx"
chart = "ingress-nginx"
version = var.ingress.chart_version
render_subchart_notes = false
namespace = var.ingress.namespace
create_namespace = true
values = [file("${path.module}/ingress-values.yml")]
timeout = var.ingress.timeout
set {
name = "controller.service.annotations.external-dns\\.alpha\\.kubernetes\\.io/hostname"
value = "*.${var.public_dns_zone}"
}
set {
name = "controller.ingressClass"
value = "nginx"
}
set {
name = "controller.service.annotations.service\\.beta\\.kubernetes\\.io/aws-load-balancer-proxy-protocol"
value = "*"
}
set {
name = "controller.config.use-forwarded-headers"
value = true
}
set {
name = "controller.config.use-proxy-protocol"
value = true
}
set {
name = "controller.config.compute-full-forwarded-for"
value = true
}
}
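At this point the controller should have provisioned an AWS load balancer for its Service. A quick sanity check, assuming the chart's standard labels:
# The EXTERNAL-IP column should show the load balancer hostname
kubectl -n system get svc -l app.kubernetes.io/name=ingress-nginx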
Configure and Install cert-manager
At this stage, we have an EKS cluster with an Ingress Controller and ExternalDNS to manage records in our AWS Route 53 zone.
Let’s now tackle our main topic: generating certificates using cert-manager.
cert-manager is a Kubernetes controller responsible for issuing certificates from different Issuers; it supports many public as well as private issuers. It also makes sure certificates stay up to date, renewing them before they expire.
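For example, once a Certificate resource exists (we will create one later in this article), cert-manager records the planned renewal time in its status. A quick way to check it, assuming the Certificate name we use below:
kubectl -n dev get certificate dev.cloudiaries.com -o jsonpath='{.status.renewalTime}'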
Install cert-manager
To install cert-manager we will be using a Helm Chart, as with the previous components.
We will be using the Let's Encrypt staging server to generate certificates for this demo. In order to get a certificate issued, we need to solve a challenge to prove we own the domain name we are generating the certificate for. cert-manager supports two challenge validation methods: HTTP01 and DNS01.
For the HTTP01 challenge, the client is asked to present a token at an HTTP URL that is publicly routable and accessible. Once Let's Encrypt successfully fetches the URL and finds the expected token, the certificate is issued. The URL has the format: http://<YOUR_DOMAIN>/.well-known/acme-challenge/<TOKEN>
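In other words, during an HTTP01 challenge the CA performs the equivalent of the request below (same placeholders as above):
# Let's Encrypt must receive the expected token from this URL
curl http://<YOUR_DOMAIN>/.well-known/acme-challenge/<TOKEN>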
For the DNS01 challenge, we will be asked to present a token in a TXT DNS record. Once Let's Encrypt successfully resolves the record and finds the expected key, the certificate is issued. The token is expected to be in a TXT record named _acme-challenge.<YOUR_DOMAIN> under your domain name.
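Equivalently, a DNS01 challenge boils down to a lookup like the following, which you can also run yourself to watch the record propagate:
# Should return the challenge token once the TXT record is in place
dig +short TXT _acme-challenge.<YOUR_DOMAIN>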
Let's start by setting up the IAM Role for the cert-manager Service Account, which will be needed for DNS01 challenges; it requires the permissions listed in the official documentation. As with the other controllers, we assume a cert-manager Terraform variable holding the chart version, namespace, and service account name.
We can now create the IAM policy:
data "aws_iam_policy_document" "cert_manager" {
statement {
actions = ["sts:AssumeRole"]
resources = ["*"]
}
statement {
actions = ["route53:GetChange"]
resources = ["arn:aws:route53:::change/*"]
}
statement {
actions = [
"route53:ChangeResourceRecordSets",
"route53:ListResourceRecordSets"
]
resources = [
"arn:aws:route53:::hostedzone/${data.aws_route53_zone.selected.zone_id}"
]
}
statement {
actions = [
"route53:ListHostedZonesByName"
]
resources = ["*"]
}
}
resource "aws_iam_policy" "cert_manager" {
name = "cert-manager-policy-${var.eks.cluster_name}"
policy = data.aws_iam_policy_document.cert_manager.json
}
And the IAM Role for the Kubernetes Service Account:
module "iam_assumable_role_for_cert_manager" {
source = "terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc"
version = "5.14.3"
create_role = true
number_of_role_policy_arns = 1
role_name = "cert-manager-role-${var.eks.cluster_name}"
provider_url = replace(var.eks.cluster_oidc_issuer_url, "https://", "")
role_policy_arns = [aws_iam_policy.cert_manager.arn]
oidc_fully_qualified_subjects = ["system:serviceaccount:${var.cert-manager.namespace}:${var.cert-manager.service_account}"]
}
We can now install the cert-manager Helm Chart, enabling CRDs and setting the ARN of the IAM Role in the Service Account annotation:
resource "helm_release" "cert_manager" {
name = "cert-manager"
repository = "https://charts.jetstack.io"
chart = "cert-manager"
version = var.cert-manager.chart_version
render_subchart_notes = false
namespace = var.cert-manager.namespace
create_namespace = true
values = [file("${path.module}/cert-manager-values.yml")]
set {
name = "installCRDs"
value = true
}
set {
name = "serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn"
value = module.iam_assumable_role_for_cert_manager.iam_role_arn
}
}
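Before moving on, it's worth checking that the three cert-manager Deployments (controller, webhook, and cainjector) are up. A quick check, assuming the chart's default labels:
# All three deployments should report READY
kubectl -n system get deploy -l app.kubernetes.io/instance=cert-manager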
Configure a Cluster Issuer
First of all, we will need to configure an Issuer or a ClusterIssuer. The first is a namespaced resource, while the second is cluster-wide. Both of these resources define the CA that can sign certificates in response to Certificate Signing Requests. You can have multiple Issuers/ClusterIssuers on your cluster; when you request a certificate, you specify the issuer you want to use.
The HTTP01 challenges will use the NGINX Ingress Controller to expose a public endpoint. For that, we need to specify the ingress class used by our Ingress Controller, which is 'nginx' in our example.
DNS01 challenges need access to the Route 53 zone to add the TXT record. In the DNS01 solver configuration, we will provide the IAM role we created for cert-manager in the previous step, as well as the hosted zone ID.
The following Cluster Issuer definition contains three main sections to take note of:
- The ACME server to use (Let's Encrypt staging) and your email, which is used for certificate expiry notifications.
- The HTTP01 solver, with the ingress class to use and a selector on the DNS zone dev.cloudiaries.com. This means all certificates under this domain will use this HTTP01 solver.
- The DNS01 solver, with the Route 53 config and a selector on DNS names. This means all certificates for these names will use the DNS01 solver.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    email: youraccount@example.com
    preferredChain: ""
    privateKeySecretRef:
      name: issuer-account-key
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    solvers:
      - http01:
          ingress:
            class: nginx
        selector:
          dnsZones:
            - dev.cloudiaries.com
      - dns01:
          route53:
            hostedZoneID: <HOSTED_ZONE_ID>
            region: <AWS_REGION>
            role: <CERT_MANAGER_IAM_ROLE_ARN>
            secretAccessKeySecretRef:
              name: ""
        selector:
          dnsNames:
            - "*.dev.cloudiaries.com"
Let’s now install it:
kubectl apply -f staging-cluster-issuer.yaml
and check that the cluster issuer has been created successfully:
kubectl get clusterissuer
NAME READY AGE
letsencrypt-staging True 5m
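If the issuer does not become ready, describing it will show the ACME account registration status and any errors:
kubectl describe clusterissuer letsencrypt-staging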
Services configuration
We now have an EKS cluster with:
- An Ingress Controller
- An ExternalDNS controller for managing our Route 53 zone
- cert-manager for creating/renewing our certificates, with a Cluster Issuer definition
In the next steps, we will deploy the podinfo microservice in our cluster, then secure its ingress using Let's Encrypt certificates, with two different methods.
Again, we will use a Helm Chart here.
Service with wildcard certificate
For this first method, we will create a podinfo service that will use a wildcard certificate.
Let's start by generating the dev wildcard certificate *.dev.cloudiaries.com. To do so, we will need to create a Certificate object for the cert-manager controller.
The following is the definition of our Certificate object:
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: dev.cloudiaries.com
  namespace: dev
spec:
  # Secret name that will hold the issued certificate
  secretName: dev.cloudiaries.com
  duration: 2160h    # 90d: the duration of the certificate validity
  renewBefore: 360h  # 15d: renew the certificate 15 days before expiry
  subject:
    organizations:
      - cloudiaries
  isCA: false
  privateKey:
    algorithm: RSA
    encoding: PKCS1
    size: 2048
  usages:
    - server auth
    - client auth
  # At least one of a DNS Name, URI, or IP address is required.
  dnsNames:
    - "*.dev.cloudiaries.com"
  # Issuer references are always required.
  issuerRef:
    name: letsencrypt-staging
    # We can reference ClusterIssuers by changing the kind here.
    # The default value is Issuer (i.e. a locally namespaced Issuer).
    kind: ClusterIssuer
When we create this object, the DNS name will match the DNS01 solver, which will be used for the challenge. cert-manager will create a CSR (Certificate Signing Request) and submit it to the ACME server defined in the ClusterIssuer.
The privateKey section of the Certificate object is used to generate a private key. This generated key is then used to sign the CSR, and it will later become the private key of the issued certificate.
Once cert-manager requests a certificate from Let's Encrypt, it will be asked to solve a challenge by adding a TXT DNS record under the specified domain.
We can inspect the main resources’ status on the cluster. The challenge is pending, waiting for TXT records to be set as requested (and propagated):
kubectl -n dev get challenges
NAME STATE DOMAIN AGE
dev.cloudiaries.com-c8t66-847405157-2663135921 pending dev.cloudiaries.com 35s
The certificate request is approved by cert-manager but not yet ready:
kubectl -n dev get certificaterequest
NAME APPROVED DENIED READY ISSUER REQUESTOR AGE
dev.cloudiaries.com-c8t66 True False letsencrypt-staging system:serviceaccount:system:cert-manager 5s
Similarly, the certificate is not yet ready:
kubectl -n dev get certificate
NAME READY SECRET AGE
dev.cloudiaries.com False dev.cloudiaries.com 6s
When Let's Encrypt succeeds in fetching the TXT record, a certificate is issued and handed back to cert-manager.
kubectl -n dev get certificate
NAME READY SECRET AGE
dev.cloudiaries.com True dev.cloudiaries.com 2m38s
Finally, the certificate is stored in the secret whose name we specified in the Certificate definition.
kubectl -n dev get secrets dev.cloudiaries.com
NAME TYPE DATA AGE
dev.cloudiaries.com kubernetes.io/tls 2 2m7s
As you can see in the DATA field, the secret contains two keys:
- tls.crt: The issued certificate
- tls.key: The certificate private key (generated for the CSR)
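As a sanity check, we can verify that the two halves match: the public key derived from tls.key should be identical to the one embedded in tls.crt:
# Both commands should print the same public key
kubectl -n dev get secrets dev.cloudiaries.com -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -pubkey
kubectl -n dev get secrets dev.cloudiaries.com -o jsonpath='{.data.tls\.key}' | base64 -d | openssl pkey -pubout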
We can check the certificate in the secret to make sure it contains the information we defined in the Certificate object:
kubectl -n dev get secrets dev.cloudiaries.com -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -subject
subject= /CN=*.dev.cloudiaries.com
And it is generated from the Let’s Encrypt staging server:
kubectl -n dev get secrets dev.cloudiaries.com -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -issuer
issuer= /C=US/O=(STAGING) Let's Encrypt/CN=(STAGING) Ersatz Edamame E1
And the public key:
kubectl -n dev get secrets dev.cloudiaries.com -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -pubkey
-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE75HnxlU0dVC8mliZFEJEWvAQ1LX0
oy/gJnVLFjDrSMObURIpSm9g48RVjzuRRprcmI6TZb7kqY52Oi/2BMhG4w==
-----END PUBLIC KEY-----
Notice that, as secrets are base64-encoded, we need to decode the certificate before passing it to the openssl command.
Let's now deploy our service using the wildcard certificate, starting with the service ingress config in a podinfo-dev-values.yaml file:
ingress:
  enabled: true
  className: "nginx"
  hosts:
    - host: podinfo-dev.dev.cloudiaries.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: dev.cloudiaries.com
      hosts:
        - podinfo-dev.dev.cloudiaries.com
Make sure you have added the podinfo Helm Chart repo:
helm repo add podinfo https://stefanprodan.github.io/podinfo
And let’s install the chart:
helm -n dev install podinfo-dev podinfo/podinfo -f podinfo-dev-values.yaml
You will notice that a new record podinfo-dev.dev.cloudiaries.com has been added to the Route 53 zone, thanks to ExternalDNS. Remember that ExternalDNS was configured to watch Ingress resources.
Give some time for DNS to propagate.
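You can watch the propagation with a simple lookup, which should eventually return the Ingress Controller's load balancer address:
dig +short podinfo-dev.dev.cloudiaries.com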
A few moments later, we can check that our service is reachable:
curl --insecure https://podinfo-dev.dev.cloudiaries.com/version
{
"commit": "67e2c98a60dc92283531412a9e604dd4bae005a9",
"version": "6.3.5"
}
Notice the curl --insecure flag. It is required because Let's Encrypt staging certificates are not trusted.
We can use openssl to inspect the certificate, to make sure the generated certificate is being served:
openssl s_client -connect podinfo-dev.dev.cloudiaries.com:443 -showcerts </dev/null | openssl x509 -noout -pubkey
# Removed output
-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE75HnxlU0dVC8mliZFEJEWvAQ1LX0
oy/gJnVLFjDrSMObURIpSm9g48RVjzuRRprcmI6TZb7kqY52Oi/2BMhG4w==
-----END PUBLIC KEY-----
We can see that the public key of our service is the same as the key from our certificate generated previously.
We can also inspect the certificate using a browser.
We’re now done with this example. We can uninstall the Chart:
helm -n dev uninstall podinfo-dev
Service using dedicated certificate
In the previous example we generated a wildcard certificate *.dev.cloudiaries.com that can be used for any service under the subdomain dev.cloudiaries.com.
In the following example, we will create a dedicated certificate for one service under the dev.cloudiaries.com subdomain. Let's call it simply podinfo.dev.cloudiaries.com.
This type of certificate will be generated using an HTTP01 challenge. We won't need to create a Certificate object this time: a simple method is to add annotations on our Ingress resource, and cert-manager will take care of creating the certificate. Let's create our ingress configuration:
ingress:
  enabled: true
  className: "nginx"
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-staging
  hosts:
    - host: podinfo.dev.cloudiaries.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: podinfo.dev.cloudiaries.com
      hosts:
        - podinfo.dev.cloudiaries.com
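The chart is then installed the same way as in the previous example; we assume here that the values above are saved in a podinfo-values.yaml file:
helm -n dev install podinfo podinfo/podinfo -f podinfo-values.yaml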
We can see that the certificate request is approved but not yet ready:
kubectl -n dev get certificaterequest
NAME APPROVED DENIED READY ISSUER REQUESTOR AGE
podinfo.dev.cloudiaries.com-xxxpg True False letsencrypt-staging system:serviceaccount:system:cert-manager 65s
Waiting for the challenge to complete:
kubectl -n dev get challenges
NAME STATE DOMAIN AGE
podinfo.dev.cloudiaries.com-xxxpg-3709303441-4086217025 pending podinfo.dev.cloudiaries.com 63s
And given this certificate request will use the HTTP01 solver, a new ingress resource is created to complete the challenge:
kubectl -n dev get ingress
NAME CLASS HOSTS ADDRESS PORTS AGE
cm-acme-http-solver-7nqgc <none> podinfo.dev.cloudiaries.com a1a0e3219534c4c77b5e2fc5ef859be7-2059265022.eu-west-1.elb.amazonaws.com 80 65s
podinfo nginx podinfo.dev.cloudiaries.com a1a0e3219534c4c77b5e2fc5ef859be7-2059265022.eu-west-1.elb.amazonaws.com 80, 443 67s
Although the ingress class of the ACME solver ingress shows as <none>, it is actually nginx:
kubectl -n dev get ing cm-acme-http-solver-7nqgc -o jsonpath='{.metadata.annotations.kubernetes\.io/ingress\.class}'
nginx
This type of challenge (HTTP01) is completed by exposing a specific URL. We can see that in the Ingress specification:
spec:
  rules:
    - host: podinfo.dev.cloudiaries.com
      http:
        paths:
          - backend:
              service:
                name: cm-acme-http-solver-hx5nw
                port:
                  number: 8089
            path: /.well-known/acme-challenge/IzWGIem3ciEqz_drxbnRUd79nBxchuGCRD67WF0q6f4
            pathType: ImplementationSpecific
Once the challenge is completed and the certificate generated, the Ingress resource used for validation will be deleted.
We can see that this time we get a certificate for our service itself, not a wildcard certificate.
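As a quick check, an openssl query like the one we used earlier should now report the service hostname itself as the certificate subject:
# Expect: subject= /CN=podinfo.dev.cloudiaries.com
openssl s_client -connect podinfo.dev.cloudiaries.com:443 -showcerts </dev/null | openssl x509 -noout -subject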
Conclusion
Throughout this article, we've explored the installation and configuration of cert-manager. We've used it with Let's Encrypt to secure our services' endpoints, and we've gone through the different methods for solving challenges (HTTP01 and DNS01).
We've been using the Let's Encrypt staging server to generate certificates. Switching to the production server can be done by simply replacing the server in the ClusterIssuer with https://acme-v02.api.letsencrypt.org/directory
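A minimal sketch of that switch, assuming the ClusterIssuer name from our example (in practice you may prefer to create a second, production ClusterIssuer and keep the staging one around):
# Point the existing issuer at the production ACME server
# (assumes the ClusterIssuer name used in this article)
kubectl patch clusterissuer letsencrypt-staging --type merge \
  -p '{"spec":{"acme":{"server":"https://acme-v02.api.letsencrypt.org/directory"}}}'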