Coding and testing Kubernetes controllers
As Kubernetes adoption grows, controllers are becoming an increasingly common part of the ecosystem. While I'm not saying you should throw away your Terraform or Ansible processes, it's important to understand what Kubernetes controllers are and how they work, especially now that more and more companies are building their software on Kubernetes. The Elasticsearch operator and the Prometheus operator are among the best-known and most frequently used examples.
A few weeks ago, I gave a presentation at Cloud Est about Kubernetes controllers and how to test them. I wanted to summarize it here for my international engineering fellows, as the talk was in French. I'll talk about controllers, of course, but also about how we applied this pattern at Artifakt. Let's start!
What is a Kubernetes controller?
A Kubernetes controller uses a control loop that reconciles the current state with the desired state. To make it clear: it's a piece of code that is in charge of reconciling Kubernetes API objects with system resources. Each resource has a spec that describes the desired state. The controller checks the current state; if it doesn't match the desired state, it makes the appropriate changes to bring it closer to the desired state.
What is a reconciliation loop and how does it work?
A reconciliation loop is the mechanism that maintains the desired state. The controller continuously observes the current state and compares it to the user-declared desired state. If there's a difference between the two, the controller takes action to make sure the current state always converges toward the desired state.
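The observe/compare/act cycle can be sketched in a few lines of plain Go. This is a toy model, not real controller code: the `State` type and the replica counter are illustrative stand-ins for Kubernetes objects, and a real controller would act on the API server rather than on an in-memory struct.

```go
package main

import "fmt"

// State is a toy resource: just a replica count.
type State struct {
	Replicas int
}

// reconcile moves the current state one step toward the desired state
// and reports whether anything had to change.
func reconcile(desired State, current *State) bool {
	if current.Replicas == desired.Replicas {
		return false // already converged, nothing to do
	}
	// In a real controller this would create or delete resources;
	// here we simply converge the counter.
	current.Replicas = desired.Replicas
	return true
}

func main() {
	desired := State{Replicas: 3}
	current := &State{Replicas: 1}

	// The control loop: observe, compare, act, repeat until converged.
	for reconcile(desired, current) {
		fmt.Println("reconciled, replicas =", current.Replicas)
	}
	fmt.Println("converged:", current.Replicas == desired.Replicas)
}
```

The important property is that the loop is level-based, not edge-based: it only looks at the two states, so it converges correctly no matter how or when the drift happened.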
Controllers at Artifakt
At Artifakt, we created our own controllers. As a PaaS, we deploy our clients' code on a cloud provider. The client states what they need (storage, additional services…), which is declared as a desired state. The Artifakt platform takes this declared state and, for each object, controllers take action to build the infrastructure that matches that desired state.
By adding our own objects and our own controllers to manipulate all the resources, we can manage the infrastructure declaratively. All the ops/infra logic lives on the platform, which then manages the creation of the infrastructure.
There are two main benefits to this system:
- We use a single point of entry: the Kubernetes API. Our engineering team doesn't have to add another abstraction layer on top of the cloud providers with its own specific API.
- The solution is agnostic of the cloud provider.
This is how it looks:
You have the Artifakt console, which is the frontend and backend. It connects to and interacts with the platform through the Kubernetes API. On this cluster are deployed the Artifakt controllers, which reconcile with the cloud providers.
When implementing controllers, we are developing software. Duh!
This means that we're going through CI/CD, lifecycle management, bug and incident management, and we must test, test, test!
I've created a dummy sample controller that you're free to fork and play with. This controller only does one simple thing: it creates a deployment from an Application custom resource (CRD) and registers the deployment reference in the status.
We receive a key (name + namespace) for the object we need to reconcile and use it to fetch the Application object. From this object we read the spec and status, and then fetch the associated deployment.
Two outcomes can happen:
- We found the deployment
- The deployment doesn’t exist
We then compare what we were expecting with what we have. This allows us to identify errors as the controller reconciles: if what we have and what we're expecting don't match, it means something has drifted and needs debugging.
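The steps above can be sketched as a single reconcile function. To keep it self-contained this sketch uses a plain map as a stand-in for the cluster's Deployment store; the type names (`Application`, `Deployment`, `reconcileApp`) are illustrative, not the actual names from the sample repository, and a real controller would use the Kubernetes client instead of a map.

```go
package main

import (
	"errors"
	"fmt"
)

// Application mirrors the sample CRD: a spec (desired replicas) and a
// status recording the deployment reference once it exists.
type Application struct {
	Name      string
	Replicas  int    // spec
	DeployRef string // status: name of the owned Deployment
}

type Deployment struct {
	Name     string
	Replicas int
}

// deployments stands in for the cluster's Deployment store.
var deployments = map[string]*Deployment{}

var errNotFound = errors.New("not found")

func getDeployment(name string) (*Deployment, error) {
	d, ok := deployments[name]
	if !ok {
		return nil, errNotFound
	}
	return d, nil
}

// reconcileApp handles the two outcomes: the Deployment exists, or it
// doesn't and must be created. It then fixes any drift and records the
// deployment reference in the status.
func reconcileApp(app *Application) error {
	name := app.Name + "-deploy"
	d, err := getDeployment(name)
	switch {
	case errors.Is(err, errNotFound):
		// Outcome 2: no Deployment yet, create it from the spec.
		d = &Deployment{Name: name, Replicas: app.Replicas}
		deployments[name] = d
	case err != nil:
		return err
	}
	// Outcome 1: it exists; compare what we expect with what we have.
	if d.Replicas != app.Replicas {
		d.Replicas = app.Replicas // something drifted: converge it
	}
	app.DeployRef = d.Name // register the reference in the status
	return nil
}

func main() {
	app := &Application{Name: "shop", Replicas: 2}
	if err := reconcileApp(app); err != nil {
		panic(err)
	}
	fmt.Println("status ref:", app.DeployRef)
}
```

Note that the same function handles both outcomes: creation is just the first step of converging, after which the drift check applies uniformly.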
At Artifakt, controllers are more complex than in the example above, but spending time writing the reconciliation loop allowed us to identify our use cases and our tests. So I highly recommend doing so, even if it means spending more time on it.
Once you've identified your test use cases, you can create a FakeClient to mock Kubernetes and avoid the hassle of launching a real cluster. In summary, the test compares the action taken against the FakeClient with the one we were expecting.
Interested in knowing more about what we've been working on at Artifakt? Check out our GitHub.