Navigating etcd v3 waters

A post about etcd v3, mainly about how paths work in v3.

Today I was playing around with an empty etcd v3 cluster, placing keys in it with the Go client library, and I was looking around to get some context on how etcd stores its data. I usually need an idea of how data is stored before I feel comfortable with a solution. You can view this post as a sort of gist for myself, but I hope it can help you too!

In the past I have played around with Consul, and I went into etcd as if it were Consul. I quickly found out that etcd didn’t behave as I expected. I thought it shared the same setup as Consul, where you can create paths and browse through them.

The Go code I generated returned no errors, and I wanted to see if I could retrieve the value without using the Go library. While trying to find the key I decided to start at the top of the path: v3.

Online I found that etcd supports HTTP calls through a gRPC gateway. When I tried to curl http://localhost:2379/v3 I got a 404 as the return code, which was surprising to me. With Consul you would get some form of a response, since each part of the path is an object. I played around with a few paths, and even the full path gave the same result. So I decided to give etcdctl a try.

To my surprise, etcdctl returned nothing for the top path v3, while the full path did return data. After this I decided to drop the path altogether and go back to the examples on the etcd website, so I placed the key: bla = xyz

With etcdctl the key returned its data, so far so good. Then I tried curl again and got the same result: 404. After reading through some documentation I found out that the gRPC gateway provides a range path: http://localhost:2379/v3/kv/range

My first instinct was to use a GET request, but GET doesn’t work with the range path. You need to use a POST request, and you need to base64 encode the key you are looking for.

What I did not know is that echo appends a newline to its output, so piping it to base64 encodes that newline as well:

echo bla | base64

outputs the following value:

YmxhCg==

Using printf instead leaves the newline out:

printf bla | base64

with the following value:

Ymxh
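Since I was already writing Go, the same comparison can be sketched with the standard library. This is only an illustration: encodeKey is a name I made up for this post, not something from the etcd client.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodeKey is a hypothetical helper: it base64 encodes a key
// the way the JSON gateway expects to receive it.
func encodeKey(key string) string {
	return base64.StdEncoding.EncodeToString([]byte(key))
}

func main() {
	// Same bytes as `printf bla | base64`.
	fmt.Println(encodeKey("bla")) // Ymxh
	// Same bytes as `echo bla | base64`, trailing newline included.
	fmt.Println(encodeKey("bla\n")) // YmxhCg==
}
```

The two strings differ only because of that one extra byte, which is enough to turn it into a different key.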

The newline matters when retrieving values with curl: using YmxhCg== never returned anything from the grpc endpoint, because it asks for the key bla followed by a newline, which does not exist. With the base64 issue resolved and the range path known, I was finally able to perform a curl request:

curl -X POST localhost:2379/v3/kv/range -d '{"key": "Ymxh"}'

{"header":{"cluster_id":"17790290530319697850","member_id":"3513314090463485222","revision":"36","raft_term":"3"},"kvs":[{"key":"Ymxh","create_revision":"21","mod_revision":"24","version":"2","value":"eHl6"}],"count":"1"}
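The key and value in that response are base64 encoded as well. A minimal Go sketch for decoding them, using only the standard library, could look like this. The rangeResponse type and the decodeKVs function are names I picked for illustration, and the struct only models the kvs part of the response.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rangeResponse models only the "kvs" field of the JSON response.
// encoding/json decodes base64 strings into []byte fields by itself.
type rangeResponse struct {
	Kvs []struct {
		Key   []byte `json:"key"`
		Value []byte `json:"value"`
	} `json:"kvs"`
}

// decodeKVs is a hypothetical helper that turns a range response
// body into a plain map of decoded keys and values.
func decodeKVs(body []byte) (map[string]string, error) {
	var resp rangeResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	out := make(map[string]string, len(resp.Kvs))
	for _, kv := range resp.Kvs {
		out[string(kv.Key)] = string(kv.Value)
	}
	return out, nil
}

func main() {
	// Trimmed copy of the response shown above.
	body := []byte(`{"kvs":[{"key":"Ymxh","value":"eHl6"}],"count":"1"}`)
	kvs, err := decodeKVs(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(kvs["bla"]) // xyz
}
```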

With this, I finally got the data back. Now it was time to look at the path issue and understand how that works in etcd v3.

After investigating for a while I decided to start a Minikube cluster and learn how Kubernetes utilizes etcd (see this post). Unfortunately in the Minikube Docker setup, the etcd ports were not bound to my machine directly. So I needed to ssh into the minikube machine:

minikube ssh

I quickly found out that etcdctl was not working there. The etcdctl binary was not supplied (which can easily be resolved by using docker), and I also kept getting this error while trying to connect from within the minikube machine: Error while dialing dial tcp 127.0.0.1:2381: connect: connection refused

I therefore decided to docker exec into the etcd container and use etcdctl from there, so I could look around and understand how Kubernetes structures its data in etcd.

docker exec -it $(docker ps |grep "etcd " |awk '{print $1}') /bin/sh

After this was done I still could not connect. Looking at the output of ps | grep etcd, I found out that the etcd process was running with certificates in place. The certificate paths were conveniently listed in the output, which helped me put together the etcdctl parameters:

etcdctl --endpoints localhost:2379 \
  --cert=/var/lib/minikube/certs/etcd/server.crt \
  --key=/var/lib/minikube/certs/etcd/server.key \
  --cacert=/var/lib/minikube/certs/etcd/ca.crt \
  member list

When I ran this command inside the etcd container, it returned the following output:

aec36adc501070cc, started, minikube, https://192.168.49.2:2380, https://192.168.49.2:2379, false

With this, I was able to query the Kubernetes etcd instance from the minikube machine. However, I was still not able to browse the paths. It took a while, and only when I found the --prefix flag in etcdctl was I able to get data.

To get all keys:

etcdctl --endpoints localhost:2379 \
  --cert=/var/lib/minikube/certs/etcd/server.crt \
  --key=/var/lib/minikube/certs/etcd/server.key \
  --cacert=/var/lib/minikube/certs/etcd/ca.crt \
  get "" --prefix

Providing an empty string together with --prefix returns all the keys in etcd.

From there I could go through all the data and understand that etcd does not really use a path setup. I went into this with the idea that etcd was based on the Linux path /etc, but it doesn’t use the directory structure the Linux filesystem uses. The previous etcdctl v2 supported mkdir and hinted towards a path-based setup, but v3 no longer has this feature.

Without --prefix, etcdctl only returns information if you provide the full key. With --prefix it returns every key that starts with the string you provide, like:

etcdctl --endpoints localhost:2379 \
  --cert=/var/lib/minikube/certs/etcd/server.crt \
  --key=/var/lib/minikube/certs/etcd/server.key \
  --cacert=/var/lib/minikube/certs/etcd/ca.crt \
  get "/registry/storageclasses/standard" --prefix

Only then will you get the values. The path itself does not hold keys the way it does in Consul. This made etcd a bit harder for me to understand, but we got there!
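As far as I understand the v3 API, a prefix lookup is just a range request over a flat, sorted keyspace: it starts at the prefix and stops at range_end, which is the prefix with its last byte incremented. The sketch below builds the JSON body for the /v3/kv/range path from earlier. The function names prefixRangeEnd and prefixRequestBody are my own for this post, and I have not run the resulting body against every etcd version.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// prefixRangeEnd returns the first key that sorts after every key
// starting with prefix: the prefix with its last non-0xff byte
// incremented. A single zero byte stands for "no upper bound".
func prefixRangeEnd(prefix []byte) []byte {
	end := make([]byte, len(prefix))
	copy(end, prefix)
	for i := len(end) - 1; i >= 0; i-- {
		if end[i] < 0xff {
			end[i]++
			return end[:i+1]
		}
	}
	return []byte{0}
}

// prefixRequestBody builds a range request body for all keys that
// start with prefix. An empty prefix becomes key "\0" with
// range_end "\0", which the API treats as "all keys".
func prefixRequestBody(prefix string) (string, error) {
	key := []byte(prefix)
	if len(key) == 0 {
		key = []byte{0}
	}
	body, err := json.Marshal(map[string]string{
		"key":       base64.StdEncoding.EncodeToString(key),
		"range_end": base64.StdEncoding.EncodeToString(prefixRangeEnd([]byte(prefix))),
	})
	return string(body), err
}

func main() {
	fmt.Println(string(prefixRangeEnd([]byte("/registry/")))) // /registry0
	body, err := prefixRequestBody("bla")
	if err != nil {
		panic(err)
	}
	fmt.Println(body) // {"key":"Ymxh","range_end":"Ymxi"}
}
```

So "/registry/" is not a directory that contains things. It is only the start of a range of keys that happen to share those first bytes.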

I’m pretty sure it’s not good for performance to go through all the data on a busy production cluster, so be careful with using an empty prefix to list all keys!

Going through a path structure is perhaps a bit more friendly in those scenarios, but this is all I found so far. Perhaps when I have worked more with etcd I can suggest other options to safely go through its setup without querying all the keys that exist.
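One thing that might soften the impact is asking for keys only and capping the number of results. The range request has limit and keys_only fields for that, and etcdctl get has matching --limit and --keys-only flags. Below is a rough Go sketch of such a request body. The rangeRequest struct and the listRequestBody function are hypothetical names of mine, and I have not measured how much this helps on a busy cluster.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// rangeRequest models the few range request fields used here.
type rangeRequest struct {
	Key      string `json:"key"`
	RangeEnd string `json:"range_end"`
	Limit    int64  `json:"limit"`
	KeysOnly bool   `json:"keys_only"`
}

// listRequestBody builds a body that asks for at most limit keys
// from the whole keyspace, without their values.
func listRequestBody(limit int64) (string, error) {
	// Key "\0" with range_end "\0" means "all keys".
	zero := base64.StdEncoding.EncodeToString([]byte{0})
	body, err := json.Marshal(rangeRequest{
		Key:      zero,
		RangeEnd: zero,
		Limit:    limit,
		KeysOnly: true,
	})
	return string(body), err
}

func main() {
	body, err := listRequestBody(10)
	if err != nil {
		panic(err)
	}
	// The output can be passed to curl with -d against /v3/kv/range.
	fmt.Println(body)
}
```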

At least I hope this post helps people connect to etcd, understand how to retrieve all the keys within it, and find a way to browse through the data.