Migrating from Mesos to Kubernetes: First week experiences

Fun times in Kubernetes

A couple of months ago, D2IQ (formerly known as Mesosphere) CEO Tobi Knaup announced that all their efforts are going towards their Kubernetes offering: https://d2iq.com/blog/d2iq-takes-the-next-step-forward

If you read between the lines of all the marketing and management blabla, you will find out that DCOS is End Of Life. Since D2IQ was the main contributor to Mesos, one of the core components of DCOS, we can safely conclude that Mesos is losing its main contributor.

Since most of the community has already left Mesos and now its main contributor is leaving too, the question is not whether you should leave Mesos or DCOS and migrate to another option, but rather: when are you going to migrate to another platform? New updates will no longer be released for DCOS, and in Q4 DCOS will be deprecated and no longer receive security updates.

As the D2IQ post already mentioned, Kubernetes is basically the number one container infrastructure platform. A couple of years ago I had my first encounter with Kubernetes, and it did not leave a good impression. The etcd v2 to v3 migration was initially unsupported and clusters needed to be rebuilt from scratch; there was no way to upgrade your etcd data store in place. You had to build a new cluster, set up all your applications again and copy over the data. These days Kubernetes is considered stable, and the etcd migration was 3-4 years ago, which is basically a lifetime in software. So in that sense I am starting fresh again.

Setup used

After hitting some quirks in my first week with Kubernetes, I decided to list a few of them. In a virtual machine I have Rancher running; Rancher runs in a Docker container to get up and running with Kubernetes as quickly as possible. Another node running k3s is added through the Rancher interface.

Since these platforms are complex and offer many ways to extend them so they can do basically anything, your mileage may vary. What I ran into can be totally different in another Kubernetes environment.

Container runtimes

At first glance it seemed impossible to use multiple container runtimes in Kubernetes; however, there are a few documents online that describe how to do it. It does not seem to be a common way of running workloads in your clusters.
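
The documented route is a RuntimeClass: it points at a runtime handler configured in your CRI (containerd or CRI-O), and pods opt into it explicitly. A minimal sketch, assuming a handler is already set up on the nodes (the names "sandboxed" and "runsc" are placeholders, not something Kubernetes ships by default):

kubectl apply -f - <<'EOF'
# RuntimeClass referencing a handler that must already exist in the CRI config
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: sandboxed
handler: runsc
---
# A pod opts into the alternative runtime explicitly
apiVersion: v1
kind: Pod
metadata:
  name: runtime-test
spec:
  runtimeClassName: sandboxed
  containers:
  - name: nginx
    image: nginx:1.21
EOF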

In Mesos it was quite normal to have multiple container runtimes; usually you had the Docker and Mesos Containerizer runtimes installed by default on your cluster. While Docker is known as the default container runtime these days, the Mesos Containerizer offered less overhead and network isolation. Mesos containerizers were basically cgroups and namespaces in the early days; later they got extended to support NVIDIA cards and other fancy features such as CSI (the Container Storage Interface). They even supported Docker images after a while.

The main benefit was that this gave platform engineers the ability to launch something like node exporter without messing around much with capabilities, privileges or even the need to build a container. It allowed me to point at the URL where the binary was located and have the tool up and running in less than a second. This allowed business-critical applications to recover fast; if you have ever experienced an outage, you know how long a second feels. Not having the Mesos container runtime is something I will miss; there seems to be nothing similar in Kubernetes.

There is no restart command

Error: unknown command "restart" for "kubectl"

DCOS offered a restart command. What it did was scale up an extra instance of the service and perform a health check on it; the moment the health check passed, the older instance got terminated. With multiple containers running per service it performed a rolling restart, where one by one all the containers got restarted. This was a very clean way of replacing old containers and it helped platform engineers like me out of tricky situations.

The number one reason for restarting a container was, in our case, filling disks. Roughly once or twice a month a platform engineer gets paged about a disk filling up. Nine times out of ten the reason is a newly deployed tool where the team forgot to stop logging to files inside the container. The result is a filling disk, and restarting containers cleanly allowed a couple of things to happen:

  • Restart with no downtime
  • Clean up the container that was filling the disk
  • Give the team time to solve it without taking the application offline

In most cases it gave me the ability to just ping the team and mention:

Hey, your application is filling the disk. Can you fix this asap?

Applications never needed to be taken offline, and this often gave the developers a day or so to fix the issue while their application stayed online. In Kubernetes, the commonly mentioned solution is to scale the application down and then up, which means downtime, or to scale up and then down manually.

Since Kubernetes 1.15 you can do rollout restarts, and after testing this it works similarly. It is only available for StatefulSets, DaemonSets and Deployments.

Example:

kubectl rollout restart statefulset -n mynamespace awesome-app
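
You can follow the rolling restart with the matching status command:

kubectl rollout status statefulset -n mynamespace awesome-app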

If you for whatever reason decided to deploy a single pod, then you are out of luck and need to manually play around to mimic a restart command.
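
One workaround I have seen for a bare pod is dumping its manifest and force-replacing it, which deletes and recreates the pod in one go, so unlike a rolling restart there is a short gap (pod and namespace names here are made up):

# deletes and recreates the pod from its own manifest, causing a brief gap
kubectl get pod my-single-pod -n mynamespace -o yaml | kubectl replace --force -f -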

Service changes

In DCOS it was very common that the moment you changed a configuration, even a minor setting, the containers restarted. Sometimes this was an annoying feature: if you simply wanted to change an environment variable, it resulted in your containers restarting. In test clusters this sometimes meant that another app waiting for resources picked up your resources the moment your application restarted, which was not ideal.

In Kubernetes this behaves almost the same: change the container image version of your pods and your containers restart. Secrets and ConfigMaps, however, do not seem to be monitored, even though a change to a ConfigMap would in my opinion be a reason to restart your container so it loads the new settings. The result is that the moment you change a configuration value or a password in a Secret, you need to manually restart your container (and as you saw in the previous point, that is not that easy). There are however some clever folks who ran into this issue before and decided to write an application called Reloader to handle these kinds of scenarios:

https://github.com/stakater/Reloader
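
Opting a workload in is essentially one annotation on the Deployment, going by the project's README; Reloader then watches the ConfigMaps and Secrets that the Deployment references and triggers a rolling restart when they change (the deployment name below is made up):

kubectl annotate deployment awesome-app -n mynamespace reloader.stakater.com/auto=true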

What I don’t understand is why this is not default functionality in Kubernetes and you need third-party tools to handle it.

Public and private nodes

In DCOS there were 2 node types your workloads could run on:

  • Public
  • Private

Public nodes were reserved for gateway and load balancer type workloads; for example, the Traefik binary would run on a public node. Private nodes would run all the other workloads, even if they were web servers. Your configuration had to specifically tell the scheduler that you wanted your service to land on a public node; by default, all services would end up on the private nodes if you did not specify where your task needed to run.

Kubernetes does not have this concept by default. You are free to set things up as you please and add nodes with different roles; it is however not enforced as it was with DCOS. In Kubernetes it is common to have your load balancer share the same nodes as all the other applications.
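
If you want to recreate the DCOS split, node labels and taints get you most of the way there: taint the edge nodes so ordinary workloads stay off them, and have the load balancer select the label and tolerate the taint. A sketch with invented node and label names:

# mark a node as "public" and keep ordinary workloads away from it
kubectl label node edge-node-1 node-role.example.com/public=true
kubectl taint node edge-node-1 dedicated=public:NoSchedule

# the load balancer pod spec then needs something like:
#   nodeSelector:
#     node-role.example.com/public: "true"
#   tolerations:
#   - key: dedicated
#     operator: Equal
#     value: public
#     effect: NoSchedule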

There are benefits and downsides to sharing a node with your load balancer; the main downside I see is in resource-contested clusters. In scenarios with multiple node failures, having your load balancer compete for the same resources as less critical services is not ideal. Not having designated public nodes in these scenarios can end in disaster, although in production clusters you usually keep spare capacity. Still, Murphy’s law is a thing in 2021. Having more nodes available with spare capacity will probably save you, and it is perhaps more efficient to let your cron jobs run in that spare capacity.

The options

Another thing to mention: my god, the options… There are so many options and names for simple things. In DCOS you added a volume to a container, and using this volume resulted in a persistent application. It did not magically transcend into a StatefulSet; you simply added a volume for your application to use. In Kubernetes you have StatefulSets, DaemonSets, Deployments, single pods and Services.
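
For comparison, the closest thing to the DCOS "just add a volume" pattern that I found is a PersistentVolumeClaim mounted into a plain Deployment, no StatefulSet required. A sketch with made-up names, assuming the cluster has a default StorageClass:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: awesome-app-data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: awesome-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: awesome-app
  template:
    metadata:
      labels:
        app: awesome-app
    spec:
      containers:
      - name: app
        image: nginx:1.21
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: awesome-app-data
EOF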

Port binding?

This is something I still need to investigate a bit further, since I expect the way I run Rancher in my test setup is the cause. I will update this when I have a few extra minutes to play around.

My first test setup in Rancher broke within a couple of minutes; all it took was deploying nginx containers. I am still not sure whether the fault can be attributed to Kubernetes or to Rancher, but I suspect it is the way I am running Rancher. This is something I still need to test.

In DCOS most of the networking is done by port binding: you start a service and it binds to a host port. Either the port gets generated dynamically and then forwarded into your container, or you explicitly tell the service which port you want to use. A host port is therefore a resource in DCOS; if all nodes have port 80 in use and you want to deploy a service that uses port 80, you will have to wait and your application does not get deployed. There are pros and cons to this: the negative is that you can only use port 80 once, the positive is that you do not get port conflicts.

In Rancher I decided to test port binding: start up the nginx container, bind it to ports 80 and 443 and see what would happen. The Rancher interface went offline. There are a couple of possible reasons for this:

  • Kubernetes does not take port reservations into account. (very unlikely)
  • Kubernetes does have port reservations on host level, but Rancher didn’t enable port reservations in Kubernetes. (also unlikely)
  • Rancher in Docker does weird things and port 80 is reserved at the system level for the Rancher interface (likely)
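
Expressed as a plain manifest, that kind of host port binding looks roughly like this (my reconstruction of the test; the names and image tag are mine):

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nginx-hostport-test
spec:
  containers:
  - name: nginx
    image: nginx:1.21
    ports:
    - containerPort: 80
      hostPort: 80        # binds directly to port 80 on the node
    - containerPort: 443
      hostPort: 443
EOF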

This is still a topic I need to investigate, and I hope I will have the time to update this post.

Ending note

New stuff is exciting; it allows you to try new things and gives you opportunities to grow. While it is sad that Mesos failed, Kubernetes gives me the option to rethink some of the paradigms that came with Mesos, and it offers a vibrant community. Not once during this week did I get stuck looking for a solution; a quick Google search always turned something up.

I want to take this moment to wish everyone a Happy new year!