Kubernetes Operators and Helm — It takes Two to Tango
When it comes to running complex application workloads on Kubernetes two technologies standout — Helm and Kubernetes Operators. In this post we compare them and discuss how they actually complement each other towards solving problems of day-1 and day-2 operations when it comes to running complex application workloads on Kubernetes. We also present guidelines for creating Helm charts for Operators.
What is Helm?
The basic idea of Helm is to enable reusability of Kubernetes YAML artifacts through templatization. Helm allows defining Kubernetes YAMLs with marked up properties. The actual values for these properties are defined in a separate file. Helm takes the templatized YAMLs and the values file and merges them before deploying the merged YAMLs into a cluster. The package consisting of templatized Kubernetes YAMLs and the values file is called a ‘Helm chart’. Helm project has gained considerable popularity as it solves one of the key problems that enterprises face — creating custom YAMLs for deploying the same application workload with different settings, or deploying it in different environments (dev/test/prod).
What is a Kubernetes Operator?
The basic idea of a Kubernetes Operator is to extend the Kubernetes’s level-driven reconciliation loop and API set towards running stateful application workloads natively on Kubernetes. A Kubernetes Operator consists of Kubernetes Custom Resource(s) / API(s) and Kubernetes Custom Controller(s). The Custom Resources represent an API that takes declarative inputs on the workload being abstracted and the Custom Controller implements corresponding actions on the workload. Kubernetes Operators are typically implemented in a standard programming language (Golang, Python, Java, etc.). An Operator is packaged as a container image and is deployed in a Kubernetes cluster using Kubernetes YAMLs. Once deployed, new Custom Resources (e.g. Mysql, Cassandra, etc.) are available to the end users similar to built-in Resources (e.g. Pod, Service, etc.). This allows them to orchestrate their application workflows more effectively leveraging additional Custom Resources.
Day-1 vs. Day-2 operations
Helm and Operators represent two different phases in managing complex application workloads on Kubernetes. Helm’s primary focus is on the day-1 operation of deploying Kubernetes artifacts in a cluster. The ‘domain’ that it understands is that of Kubernetes YAMLs that are composed of available Kubernetes Resources / APIs. Operators, on the other hand, are primarily focused on addressing day-2 management tasks of stateful / complex workloads such as Postgres, Cassandra, Spark, Kafka, SSL Cert Mgmt etc. on Kubernetes.
Both these mechanisms are complementary to deploying and managing such workloads on Kubernetes. To Helm, Operator deployment YAMLs represent one of the artifact types that can be potentially templatized and deployed with different settings and in different environments. To an Operator developer, Helm represents a standard tool to package, distribute and install Operator deployment YAMLs without tie-in to any Kubernetes vendor or distribution. As an Operator developer, it is tremendously useful for your users if you create Helm chart for its deployment. Below are some guidelines that you should follow when creating such Operator Helm charts.
Guidelines for creating Operator Helm charts
- Register CRDs in Helm chart (and not in Operator code)
As mentioned earlier, an Operator consists of one or more Custom Resources and associated Controller(s). In order for the Custom Resources to be recognized in a cluster, they need to be first registered in the cluster using Kubernetes’s meta API of ‘Custom Resource Definition (CRD)’. The CRD itself is a Kubernetes resource. Our first guideline is that registering the Custom Resources using CRDs should be done as Kubernetes YAML files in a Helm chart, rather than in Operator’s code (Golang/Python, etc.). The primary reason for this is that installing CRD YAML requires cluster-scope permission whereas an Operator Pod may not require cluster-scope permissions for its day-2 operations on an application workload. If you include CRD registration in your Operator’s code, then you will have to deploy your Operator Pod with cluster-scope permissions, which may come in the way of your security preferences. By defining CRD registration in Helm chart, you only need to give the Operator Pod those permissions that are necessary for it to perform the actual day-2 operations on the underlying application workload.
2. Make sure CRDs are getting installed prior to the Operator deployment
The second guideline is to define crd-install hook as part of the Operator Helm chart. What is this hook? ‘crd-install’ is a special annotation that you can add to your Helm chart’s CRD manifest. When installing a chart in a cluster, Helm installs any YAML manifests with this annotation before installing any other manifests of a chart. By adding this annotation you will ensure that the Custom Resources that your Operator works with are registered in the cluster before the Operator is deployed. This will prevent errors that happen if Operator starts running without its CRDs registered in the cluster. Note that in Helm 3.0, this hook is going away. Instead, CRDs are getting a special directory inside the charts directory.
3. Define Custom Resource validation rules in CRD YAML
Next guideline is about adding Custom Resource validation rules as part of CRD YAML. Kubernetes Custom Resources are designed to follow the Open API Spec. Additionally from Operator logic perspective, Custom Resource instances should have valid values for its defined Spec properties. You can define validity rules for the Custom Resource Spec properties as part of CRD YAML. These validation rules are used by Kubernetes machinery before a Custom Resource YAML reaches your Operator through one of the CRUD (Create, Read, Update, Delete) operations on the Custom Resource. This guideline helps prevent end user of the Custom Resource making unintentional errors while using it.
4. Use values.yaml or ConfigMaps for Operator configurables
Your Operator itself may need to be customized for different environments or even for different namespaces within a cluster. The customizations may range from small modifications, such as customizing the base MySQL image used by a MySQL Operator, to larger modifications such as installing different default set of plugins on different Moodle instances by a Moodle Operator. As mentioned earlier, Helm supports mechanisms of Spec property markers and Values.yaml file for templatization/customization. You can leverage these for small modifications that typically can be represented as singular properties in Operator deployment manifests. For supporting larger modifications to an Operator, consider passing the configuration data via ConfigMaps. The names of such ConfigMaps need to be passed to the Operator. For that you can use Helm’s property markers and Values.yaml file.
5. Add Platform-as-Code annotations to enable easy discovery and consumption of Operator’s Custom Resources
The last guideline that we have is to add ‘Platform-as-Code’ annotations on your Operator’s CRD YAML. There are two annotations that we recommend — ‘composition’ and ‘man’. The value of the ‘platform-as-code/composition’ annotation is listing of all Kubernetes built-in resources that are created by the Operator when the Custom Resource instance is created. The value of the ‘platform-as-code/man’ annotation is the name of a ConfigMap that packages ‘man page’ like usage information about the Custom Resource. Make sure to include this ConfigMap in your Operator’s Helm chart. These two annotations are part of our KubePlus API Add-on that simplifies discovery, binding and orchestration of Kubernetes Custom Resources to create platform workflows in typical multi-Operator environments.
Helm and Operators are complementary technologies. Helm is geared towards performing day-1 operations of templatization and deployment of Kubernetes YAMLs — in this case Operator deployment. Operator is geared towards handling day-2 operations of managing application workloads on Kubernetes. You will need both when running stateful / complex workloads on Kubernetes.
At CloudARK, we have been helping platform engineering teams build their custom platform stacks assembling multiple Operators to run their enterprise workloads. Our novel Platform-as-Code technology is designed to simplify workflow management in multi-Operator environments by bringing in consistency across Operators and simplifying consumption of Custom Resources towards defining required platform workflows. For this we have developed comprehensive Operator curation guidelines and open source Platform-as-Code tooling.
Reach out to us to find out how you too can build your Kubernetes Native platforms with Platform-as-Code approach.