Zero-Downtime Akka.NET on Kubernetes

Akka.NET Kubernetes

Production Patterns

Why Akka.NET + Kubernetes?

  • ✓ Native StatefulSet support for stateful actors
  • ✓ Mesh networking simplified via K8s Services
  • ✓ Customizable health checks for graceful transitions
  • ✓ Zero-downtime rolling updates

K8s Networking for Akka.NET

Load Balancer (HTTP/gRPC :8080)
Headless Service (pod-N.svc.cluster.local)
Pod-0 :5055 • Pod-1 :5055 • Pod-2 :5055
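
A minimal sketch of the headless Service in this layout. The names `akka-cluster` (Service and label) and `akka` (namespace) are hypothetical; port 5055 is the Akka.Remote port from the diagram. Setting `publishNotReadyAddresses` is an assumption about what cluster formation needs here, not something the slides state.

```yaml
# Hypothetical names: "akka-cluster" (Service, label) and "akka" (namespace).
apiVersion: v1
kind: Service
metadata:
  name: akka-cluster
  namespace: akka
spec:
  clusterIP: None                  # headless: DNS resolves to individual pod IPs
  publishNotReadyAddresses: true   # assumption: nodes can find each other before readiness passes
  selector:
    app: akka-cluster
  ports:
    - name: akka-remote
      port: 5055                   # Akka.Remote port from the diagram
      targetPort: 5055
```

With a StatefulSet whose `serviceName` points at this Service, each pod gets a DNS name of the form `<pod>.<service>.<namespace>.svc.cluster.local`.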

Deployment Strategies Compared

StatefulSet

  • ✅ Stable DNS names (pod-0, pod-1)
  • ✅ Simple seed node config
  • ⚠️ 2x rebalancing per pod update
  • ⚠️ Manual intervention on failed deployments

Update pattern: Pod down → rebalance → pod up → rebalance
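
A rough sketch of a StatefulSet that produces this update pattern. The names, namespace, and image are hypothetical. With the `RollingUpdate` strategy, Kubernetes replaces pods one at a time from the highest ordinal down, and each pod is terminated before its replacement starts.

```yaml
# Hypothetical names, namespace, and image throughout.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: akka-cluster
  namespace: akka
spec:
  serviceName: akka-cluster        # must name the headless Service for stable pod DNS
  replicas: 3
  selector:
    matchLabels:
      app: akka-cluster
  updateStrategy:
    type: RollingUpdate            # highest ordinal first, one pod at a time
  template:
    metadata:
      labels:
        app: akka-cluster
    spec:
      containers:
        - name: akka-node
          image: registry.example.com/akka-node:2.0
          ports:
            - name: akka-remote
              containerPort: 5055
            - name: http
              containerPort: 8080
```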

Deployment + Surge

  • ✅ Single rebalancing per pod
  • ✅ Faster rollbacks
  • ✅ Auto-recovery on failures
  • ⚠️ Requires Akka.Management setup
  • ⚠️ Temporary n+1 pods (cost)

Update pattern: New pod joins → old pod leaves → rebalance

Deployment with maxSurge=1 halves rebalancing overhead: one rebalance per pod update instead of two
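
The surge behavior comes from the Deployment's rolling update strategy. A minimal sketch, assuming hypothetical names and image; `maxUnavailable: 0` is an assumption that makes Kubernetes wait for the new pod to become ready before removing an old one.

```yaml
# Hypothetical names, namespace, and image throughout.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: akka-cluster
  namespace: akka
spec:
  replicas: 3
  selector:
    matchLabels:
      app: akka-cluster
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1                  # allow one extra pod (temporary n+1)
      maxUnavailable: 0            # assumption: never drop below the desired replica count
  template:
    metadata:
      labels:
        app: akka-cluster
    spec:
      containers:
        - name: akka-node
          image: registry.example.com/akka-node:2.0
          ports:
            - name: akka-remote
              containerPort: 5055
```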

StatefulSet Rolling Update

Initial: Pod-2 v1.0 | Pod-1 v1.0 | Pod-0 v1.0
Step 1: Pod-2 DOWN | Pod-1 v1.0 ⬅ | Pod-0 v1.0 ⬅
Step 2: Pod-2 v2.0 ✓ | Pod-1 v1.0 ⬅ | Pod-0 v1.0 ⬅
Step 3: Pod-2 v2.0 ✓ | Pod-1 DOWN | Pod-0 v1.0 ⬅
Step 4: Pod-2 v2.0 ✓ | Pod-1 v2.0 ✓ | Pod-0 v1.0 ⬅

Pod DOWN → rebalance → Pod UP → rebalance again

⚠️ 4 rebalancing events (steps marked ⬅) for just 2 pods!

Deployment Rolling Update (Surge)

Initial: Pod-C v1.0 | Pod-B v1.0 | Pod-A v1.0
Step 1: Pod-D v2.0 ✓ | Pod-C v1.0 | Pod-B v1.0 | Pod-A v1.0
Step 2: Pod-D v2.0 ✓ | Pod-C v1.0 ⬅ | Pod-B v1.0 ⬅
Step 3: Pod-E v2.0 ✓ | Pod-D v2.0 ✓ | Pod-C v1.0 | Pod-B v1.0
Step 4: Pod-E v2.0 ✓ | Pod-D v2.0 ✓ | Pod-C v1.0 ⬅

New pod JOINS → old pod leaves → rebalance once

✓ Only 2 rebalancing events (steps marked ⬅) for 2 pods!

RBAC for Akka.Discovery.Kubernetes

graph LR
  SA[ServiceAccount] --> Role[Role]
  Role --> RB[RoleBinding]
  Role --> Perms["Permissions<br/>get pods<br/>list pods<br/>watch pods"]
  style SA fill:#42affa
  style Role fill:#17b2c3
  style RB fill:#17b2c3
  style Perms fill:#2d2d2d

⚠️ Critical: All resources must specify an explicit namespace!

Omitting the namespace causes "Forbidden" errors during pod discovery
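
A sketch of the three RBAC resources with the namespace spelled out on each one, including the RoleBinding subject. The resource names and the `akka` namespace are hypothetical; the verbs are the ones listed in the diagram.

```yaml
# Hypothetical names; "akka" stands in for the namespace the cluster runs in.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: akka-cluster
  namespace: akka
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: akka
rules:
  - apiGroups: [""]                # core API group, where pods live
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: akka-cluster-pod-reader
  namespace: akka
subjects:
  - kind: ServiceAccount
    name: akka-cluster
    namespace: akka                # explicit namespace on the subject as well
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-reader
```

The pod template then needs `serviceAccountName: akka-cluster` so that discovery runs under this account.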

Liveness vs Readiness Probes

/healthz/live

Question: Is the process alive?

  • ✓ Fast, simple check
  • ✓ Basic responsiveness
  • Restart pod on failure

Always healthy during startup

/healthz/ready

Question: Can the pod accept traffic?

  • ✓ Includes dependencies
  • ✓ Database connectivity
  • Remove from load balancer on failure

Waits for full initialization

Separate probes prevent restart loops during startup and recovery
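
The two endpoints map onto separate probe definitions in the container spec. A minimal sketch, assuming the health endpoints are served on the HTTP port 8080; the container name, image, and timing values are hypothetical.

```yaml
# Fragment of a pod template spec; name, image, and timings are hypothetical.
containers:
  - name: akka-node
    image: registry.example.com/akka-node:2.0
    ports:
      - name: http
        containerPort: 8080
    livenessProbe:                 # failure restarts the container
      httpGet:
        path: /healthz/live
        port: 8080
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:                # failure removes the pod from Service endpoints
      httpGet:
        path: /healthz/ready
        port: 8080
      periodSeconds: 5
      failureThreshold: 3
```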

Securing Akka.Remote with TLS

Step 1: Create Secret (kubectl create secret from .pfx file)
Step 2: Mount Volume (Secret as /certs, read-only)
Step 3: Configure (Akka.Hosting reads /certs/*.pfx)
Step 4: Protocol Switch (akka.tcp → akka.ssl.tcp)

Why Secrets? Binary support • Encryption at rest • Fine-grained RBAC
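
Steps 1 and 2 might look like the following sketch. The Secret name `akka-tls`, the file name `akka-node.pfx`, the namespace, and the image are hypothetical; the `/certs` mount path and read-only flag come from the steps above. The Akka.Hosting configuration from Step 3 is not shown.

```yaml
# Step 1 (one possible form, hypothetical names):
#   kubectl create secret generic akka-tls --from-file=akka-node.pfx --namespace akka
#
# Step 2: fragment of a pod template spec mounting that Secret at /certs.
spec:
  volumes:
    - name: akka-tls
      secret:
        secretName: akka-tls
  containers:
    - name: akka-node
      image: registry.example.com/akka-node:2.0
      volumeMounts:
        - name: akka-tls
          mountPath: /certs        # the .pfx appears as /certs/akka-node.pfx
          readOnly: true
```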

Thank You!

Questions?

https://aaronstannard.com
@aaronstannard