Most Kubernetes problems aren’t Kubernetes problems. They’re architecture decisions made under pressure, configurations copied from Stack Overflow, and upgrades nobody wants to touch.

I help engineering teams get Kubernetes into a state where it’s boring infrastructure: reliable, observable, and maintained without heroics.

What Changes When I Get Involved

Teams I work with typically see:

  • Fewer incidents: Proper resource management, health checks, and deployment strategies that don’t gamble with availability
  • Faster deployments: GitOps workflows where engineers ship through pull requests, not kubectl access
  • Upgrades that happen: Clusters that stay current instead of drifting into unsupported territory
  • Costs that make sense: Right-sized workloads, spot instance strategies, and visibility into what’s actually consuming resources
  • On-call that’s sustainable: Runbooks, alerting that means something, and incidents that get resolved — not just restarted
  • Teams that understand Kubernetes: Knowledge transfer through pair programming, documentation, and practical advice
  • Documented architecture: Clear diagrams and explanations of how the cluster is set up and why

Problems I Solve

“Our cluster is a black box”

Nobody knows why pods get evicted, why deployments sometimes fail, or what half the namespaces are for. I establish observability, clean up sprawl, and document what matters.

“Deployments are manual and scary”

Engineers SSH into bastion hosts or run kubectl from laptops. I implement GitOps: code review for infrastructure, rollback in seconds, and every deployment automated and auditable.
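
As a minimal sketch of what that looks like in practice (the application name, namespaces, and repository URL below are placeholders, not a specific client setup), a GitOps tool such as ArgoCD keeps the cluster converged on whatever the reviewed Git branch declares, so a merged pull request is a deployment and a revert is a rollback:

    # Illustrative ArgoCD Application; names and the repo URL are placeholders.
    apiVersion: argoproj.io/v1alpha1
    kind: Application
    metadata:
      name: payments-service
      namespace: argocd
    spec:
      project: default
      source:
        repoURL: https://github.com/example-org/platform-config.git
        targetRevision: main
        path: apps/payments-service/production
      destination:
        server: https://kubernetes.default.svc
        namespace: payments
      syncPolicy:
        automated:
          prune: true      # remove resources that were deleted from Git
          selfHeal: true   # revert manual kubectl changes back to the Git state

The details vary per team; the constant is that the Git history, not someone’s terminal, is the record of what is running and why.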

“We’re stuck on an old version”

The cluster is three versions behind, and upgrading feels like defusing a bomb. I plan and execute upgrades with minimal disruption, then establish a cadence so you stay current.

“It works, but we don’t know why”

The platform evolved through trial and error. Helm charts are forked and modified, and YAML is copy-pasted between environments. I refactor toward maintainable, reproducible infrastructure.

“We’re migrating to Kubernetes”

You’re moving from VMs, ECS, or another orchestrator. I’ve led migrations of 400+ services with almost no downtime. I’ll help you avoid the architectural mistakes that become expensive later.

“Growth is out of control”

Spiky traffic, runaway costs, and resource contention. I implement autoscaling, cost visibility, and resource quotas so the cluster scales with your needs, not the other way around. Together with your developers, I analyze and optimize workloads, bringing understanding and control to the team.
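
As a minimal illustration (names, namespaces, and numbers are placeholders to be sized from real usage data), autoscaling and quotas are ordinary Kubernetes objects: a HorizontalPodAutoscaler scales a workload on observed utilization, and a ResourceQuota keeps any single namespace from consuming the whole cluster.

    # Illustrative manifests; the workload name, namespace, and limits are placeholders.
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: api-gateway
      namespace: team-a
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: api-gateway
      minReplicas: 3
      maxReplicas: 20
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 70
    ---
    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: team-a-quota
      namespace: team-a
    spec:
      hard:
        requests.cpu: "40"
        requests.memory: 80Gi
        limits.cpu: "80"
        limits.memory: 160Gi

The point is less the specific numbers than that scaling behavior and limits live in version-controlled manifests, where the team can see and adjust them.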

What I Work On

Hands-on technical work across the full Kubernetes stack:

  • Cluster architecture: Multi-tenant design, node pools, networking (Cilium, Calico), service mesh evaluation
  • GitOps implementation: ArgoCD, Flux, environment promotion strategies, secrets management
  • Reliability engineering: Resource quotas, pod disruption budgets, health probes, graceful shutdowns
  • Observability: Prometheus, Grafana, distributed tracing, log aggregation, alerting that reduces noise
  • Security posture: RBAC design, network policies, admission controllers, supply chain security
  • Cost optimization: Resource right-sizing, cluster autoscaling, spot/preemptible strategies
  • Migration planning: Containerization strategy, stateful workload handling, cutover orchestration
  • Platform engineering: Self-service abstractions, developer experience, internal tooling

I work in AWS (EKS), GCP (GKE), Azure (AKS), and self-managed on-prem clusters.

How Engagements Work

I work on a project basis with clear scope and outcomes.

An engagement starts with a paid scoping phase (typically 2-4 weeks). The deliverable is a project brief: scope, approach, success criteria, and timeline. You review it, we align on terms, then implementation begins.

This structure exists because it works:

  • You get clarity before committing to larger investments
  • I bring focused expertise to bounded problems
  • We both know what success looks like upfront

It also means our working relationship is structured as project-based from day one, which matters under current Dutch labor regulations (Wet DBA). I define how and when I work. You define what needs to be solved.

Background

Over a decade building and operating Kubernetes and container platforms at scale:

  • Led container platform migration for 400+ services with almost no downtime
  • Tech lead for platform teams at scale-ups and enterprises
  • Deep experience with EKS, GKE, and self-managed clusters
  • Background in high-performance computing and data-intensive workloads
  • Designed multi-tenant platforms serving hundreds of engineering teams

I’ve worked with large retailers, research institutions, and energy companies where downtime has real cost.

Discuss your Kubernetes challenges

30 minutes to understand your situation and see if I can help.