Webhook Configuration

This guide covers the configuration and operation of OptiPod’s mutating admission webhook, including TLS setup, cert-manager integration, high availability, and performance tuning.

OptiPod’s webhook provides runtime resource injection for pods. When enabled, the webhook:

  1. Intercepts pod creation requests
  2. Reads resource recommendations from workload annotations
  3. Injects resource requests and limits into pod specifications
  4. Allows the pod to be created with optimized resources

This approach is GitOps-compatible because:

  • Recommendations are stored in workload metadata (not pod templates)
  • ArgoCD/Flux manage the workload spec
  • The webhook applies recommendations at pod creation time
  • No conflicts with GitOps controllers

Use the webhook strategy when:

  • Using GitOps: ArgoCD or Flux manage your workloads
  • Immutable infrastructure: Pod templates should not be modified
  • Gradual rollout: Apply recommendations only to new pods
  • Compliance requirements: Workload specs must match Git

Use Server-Side Apply (SSA) instead when:

  • Direct kubectl management: No GitOps controller
  • Immediate updates: Want to update existing pods
  • Simpler setup: Don’t want to manage webhook certificates

With the webhook strategy, a recommendation reaches a pod through the following steps:

  1. Policy Reconciliation: Controller generates recommendations and stores them in workload annotations
  2. Pod Creation: User or controller creates a pod (e.g., Deployment rollout)
  3. Webhook Intercept: Kubernetes API server calls OptiPod webhook
  4. Annotation Lookup: Webhook reads recommendations from parent workload annotations
  5. Resource Injection: Webhook generates JSON patches to modify pod resources
  6. Pod Admission: Modified pod is created with optimized resources
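
Steps 4 and 5 above can be sketched as follows. This is a minimal illustration, not OptiPod's implementation: the function names and the shape of the `recommendations` mapping are assumptions made for the example, and the real annotation format may differ. Only the JSON Patch operation format (RFC 6902) and the base64 encoding of the patch in an admission response are standard.

```python
import base64
import json


def build_resource_patches(pod, recommendations):
    """Return JSON Patch (RFC 6902) operations that set container resources.

    `recommendations` is a hypothetical mapping of container name to a
    Kubernetes-style resources dict, as it might be parsed from annotations.
    """
    patches = []
    for index, container in enumerate(pod["spec"]["containers"]):
        recommended = recommendations.get(container["name"])
        if recommended is None:
            continue  # no recommendation for this container: leave it alone
        # "add" creates the member when absent and replaces it when present.
        patches.append({
            "op": "add",
            "path": f"/spec/containers/{index}/resources",
            "value": recommended,
        })
    return patches


def encode_patch(patches):
    """Admission responses carry the patch as base64-encoded JSON."""
    return base64.b64encode(json.dumps(patches).encode("utf-8")).decode("ascii")


pod = {"spec": {"containers": [{"name": "app"}, {"name": "sidecar"}]}}
recommendations = {
    "app": {
        "requests": {"cpu": "250m", "memory": "128Mi"},
        "limits": {"cpu": "500m", "memory": "256Mi"},
    },
}
patches = build_resource_patches(pod, recommendations)
print(json.dumps(patches, indent=2))
```

Only the `app` container receives a patch here; `sidecar` has no recommendation and is left untouched.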

The webhook requires TLS certificates for secure communication with the Kubernetes API server. OptiPod supports two certificate management approaches:

cert-manager automatically generates and rotates certificates.

Prerequisites:

  • cert-manager installed in the cluster

Installation:

# Install OptiPod with cert-manager
helm install optipod optipod/optipod \
--set webhook.enabled=true \
--set webhook.certManager.enabled=true \
--namespace optipod-system \
--create-namespace

How it works:

  1. Helm creates a self-signed Issuer
  2. Helm creates a Certificate resource with 1-year validity
  3. cert-manager generates the certificate and stores it in a Secret
  4. cert-manager injects the CA bundle into the MutatingWebhookConfiguration
  5. cert-manager automatically renews certificates before expiry

Verification:

# Check certificate status
kubectl get certificate -n optipod-system
# Check secret
kubectl get secret webhook-server-certs -n optipod-system
# Check webhook configuration has CA bundle
kubectl get mutatingwebhookconfiguration optipod-webhook -o yaml | grep caBundle

For clusters without cert-manager, use manually generated certificates.

Generate certificates:

# Create certificate directory
mkdir -p /tmp/optipod-certs
cd /tmp/optipod-certs
# Generate private key and certificate
openssl req -x509 -newkey rsa:2048 \
-keyout tls.key -out tls.crt \
-days 365 -nodes \
-subj "/CN=optipod-webhook-service.optipod-system.svc" \
-addext "subjectAltName=DNS:optipod-webhook-service.optipod-system.svc,DNS:optipod-webhook-service.optipod-system.svc.cluster.local"
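# Create the namespace first; the Secret cannot be created in a namespace that does not exist yet
kubectl create namespace optipod-system
# Create Kubernetes secret
kubectl create secret tls webhook-server-certs \
--cert=tls.crt \
--key=tls.key \
-n optipod-system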

Install OptiPod:

helm install optipod optipod/optipod \
--set webhook.enabled=true \
--set webhook.certManager.enabled=false \
--namespace optipod-system \
--create-namespace

Certificate Rotation:

Manual certificates must be rotated before expiry:

# Check certificate expiry
openssl x509 -in tls.crt -noout -enddate
# Rotate certificate (repeat generation steps)
# Update secret
kubectl create secret tls webhook-server-certs \
--cert=tls.crt \
--key=tls.key \
-n optipod-system \
--dry-run=client -o yaml | kubectl apply -f -
# Restart webhook pods
kubectl rollout restart deployment optipod-webhook -n optipod-system

Enable the webhook in Helm values:

values.yaml
webhook:
  enabled: true
  # Certificate management
  certManager:
    enabled: true # Use cert-manager (recommended)
  # Replica count for high availability
  replicas: 2
  # Resource limits
  resources:
    limits:
      cpu: 200m
      memory: 256Mi
    requests:
      cpu: 50m
      memory: 64Mi

values.yaml
webhook:
  enabled: true
  # High availability
  replicas: 3
  # Pod disruption budget
  podDisruptionBudget:
    enabled: true
    minAvailable: 1
  # Webhook server configuration
  port: 9443
  timeoutSeconds: 10
  failurePolicy: Ignore # or "Fail" for strict enforcement
  # Namespace selector (exclude system namespaces).
  # Selectors match namespace labels, so this assumes each namespace carries a
  # "name" label; Kubernetes 1.21+ sets kubernetes.io/metadata.name automatically.
  namespaceSelector:
    matchExpressions:
      - key: name
        operator: NotIn
        values: ["kube-system", "kube-public", "kube-node-lease"]
  # Object selector (matches pod labels: only pods labeled optipod.io/webhook-enabled: "true")
  objectSelector:
    matchExpressions:
      - key: optipod.io/webhook-enabled
        operator: In
        values: ["true"]
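
The selector semantics can be modeled in a few lines. The sketch below is an illustrative re-implementation of Kubernetes `matchExpressions` evaluation (the real evaluation happens inside the API server, and the function name `matches` is made up for this example). It highlights one easy-to-miss rule: `NotIn` also matches objects that do not have the key at all, so a namespace exclusion only works if the namespaces actually carry the label being tested.

```python
def matches(labels, match_expressions):
    """Evaluate Kubernetes-style matchExpressions against a label dict.

    All expressions must hold (they are ANDed together).
    """
    for expr in match_expressions:
        key, operator = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if operator == "In":
            ok = key in labels and labels[key] in values
        elif operator == "NotIn":
            # A missing key also satisfies NotIn.
            ok = key not in labels or labels[key] not in values
        elif operator == "Exists":
            ok = key in labels
        elif operator == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unsupported operator: {operator!r}")
        if not ok:
            return False
    return True


namespace_selector = [{
    "key": "name",
    "operator": "NotIn",
    "values": ["kube-system", "kube-public", "kube-node-lease"],
}]
object_selector = [{
    "key": "optipod.io/webhook-enabled",
    "operator": "In",
    "values": ["true"],
}]

# A namespace labeled name=kube-system is excluded from the webhook...
print(matches({"name": "kube-system"}, namespace_selector))  # False
# ...but one without a "name" label is still selected.
print(matches({"kubernetes.io/metadata.name": "kube-system"}, namespace_selector))  # True
# Pods are selected only when they carry the label.
print(matches({"optipod.io/webhook-enabled": "true"}, object_selector))  # True
print(matches({}, object_selector))  # False
```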

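The failurePolicy determines how the Kubernetes API server handles a pod creation request when the webhook server is unavailable, returns an error, or does not answer within timeoutSeconds: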

  • Ignore (default): Allow pod creation even if webhook fails

    • Pros: No service disruption
    • Cons: Pods may be created without optimizations
    • Use for: Non-critical workloads, testing
  • Fail: Block pod creation if webhook fails

    • Pros: Ensures all pods are optimized
    • Cons: Service disruption if webhook is down
    • Use for: Critical workloads, strict compliance

Recommendation: Start with Ignore, switch to Fail after validating webhook stability.
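
The difference between the two policies can be modeled as a small decision function. This is a simplified sketch of the API server's behavior, not actual Kubernetes or OptiPod code; `admit_pod` and `webhook_call` are hypothetical names used only for illustration.

```python
def admit_pod(failure_policy, webhook_call):
    """Model how a mutating webhook failure affects pod admission.

    `webhook_call` is a hypothetical callable that returns a list of patches
    and raises an exception when the webhook is unreachable or times out.
    Returns a tuple (allowed, patches).
    """
    if failure_policy not in ("Ignore", "Fail"):
        raise ValueError(f"unknown failurePolicy: {failure_policy!r}")
    try:
        return True, webhook_call()
    except Exception:
        if failure_policy == "Ignore":
            return True, []  # pod is admitted without resource patches
        return False, []     # "Fail": pod creation is rejected


def healthy():
    return [{"op": "add", "path": "/spec/containers/0/resources", "value": {}}]


def unavailable():
    raise TimeoutError("webhook did not answer within timeoutSeconds")


print(admit_pod("Ignore", unavailable))  # (True, [])
print(admit_pod("Fail", unavailable))    # (False, [])
print(admit_pod("Fail", healthy)[0])     # True
```

With Ignore, an outage is invisible to the workload but also silently skips optimization, which is why monitoring admission failures matters before switching to Fail.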

Run multiple webhook replicas for availability:

values.yaml
webhook:
  replicas: 3 # Minimum 2 for HA
  podDisruptionBudget:
    enabled: true
    minAvailable: 1 # At least 1 replica always available

Distribute replicas across nodes:

values.yaml
webhook:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app.kubernetes.io/name: optipod-webhook
          topologyKey: kubernetes.io/hostname

Adjust based on cluster size:

# Small clusters (< 100 pods/min)
webhook:
  resources:
    limits:
      cpu: 200m
      memory: 256Mi
    requests:
      cpu: 50m
      memory: 64Mi

# Large clusters (> 1000 pods/min)
webhook:
  resources:
    limits:
      cpu: 1000m
      memory: 512Mi
    requests:
      cpu: 500m
      memory: 256Mi

Balance between reliability and performance:

# Fast timeout (low latency, may fail under load)
webhook:
  timeoutSeconds: 5

# Standard timeout (balanced)
webhook:
  timeoutSeconds: 10

# Long timeout (high reliability, higher latency)
webhook:
  timeoutSeconds: 30

The webhook exposes Prometheus metrics on port 8080:

# Admission requests
optipod_webhook_admission_requests_total{namespace, pod_name, dry_run}
# Admission success
optipod_webhook_admission_success_total{namespace, policy, patches_applied}
# Admission failures
optipod_webhook_admission_failures_total{namespace, failure_reason}
# Mutation duration
optipod_webhook_mutation_duration_seconds{namespace, policy}
# Server health
optipod_webhook_server_health_status{endpoint}
# Certificate expiry
optipod_webhook_certificate_expiry_timestamp_seconds{cert_type}

Example queries:

# Admission request rate
rate(optipod_webhook_admission_requests_total[5m])
# Success rate (aggregate both sides, because the two metrics carry different labels)
sum(rate(optipod_webhook_admission_success_total[5m])) /
  sum(rate(optipod_webhook_admission_requests_total[5m]))
# P95 mutation latency
histogram_quantile(0.95,
  sum by (le) (rate(optipod_webhook_mutation_duration_seconds_bucket[5m])))
# Certificate expiry (days remaining)
(optipod_webhook_certificate_expiry_timestamp_seconds - time()) / 86400
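
The certificate expiry query is plain arithmetic on Unix timestamps, and the same calculation can drive a rotation check. The sketch below is illustrative only: the function names and the 30-day threshold are assumptions, not OptiPod defaults.

```python
import time

SECONDS_PER_DAY = 86400


def days_until_expiry(expiry_timestamp, now=None):
    """Days remaining until expiry; negative once the certificate has expired."""
    now = time.time() if now is None else now
    return (expiry_timestamp - now) / SECONDS_PER_DAY


def needs_rotation(expiry_timestamp, threshold_days=30, now=None):
    """True when fewer than `threshold_days` remain (threshold is an assumed value)."""
    return days_until_expiry(expiry_timestamp, now) < threshold_days


# Fixed "now" so the example is reproducible.
now = 1_700_000_000
expiry = now + 10 * SECONDS_PER_DAY
print(days_until_expiry(expiry, now=now))  # 10.0
print(needs_rotation(expiry, now=now))     # True
```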

Symptoms: Pods created without resource modifications

Diagnosis:

# Check webhook configuration exists
kubectl get mutatingwebhookconfiguration optipod-webhook
# Check webhook service
kubectl get svc optipod-webhook-service -n optipod-system
# Check webhook pods
kubectl get pods -n optipod-system -l app.kubernetes.io/name=optipod-webhook

Solutions:

  1. Verify webhook is enabled in Helm values
  2. Check namespace and object selectors match your pods
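  3. Ensure the workload has the optipod.io/webhook-enabled: "true" annotation. If an objectSelector on that key is configured, the pod must also carry it as a label, because objectSelector matches labels, not annotations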

Symptoms: Webhook admission failures with TLS errors

Diagnosis:

# Check certificate secret
kubectl get secret webhook-server-certs -n optipod-system
# Check certificate validity
kubectl get certificate -n optipod-system
# Check webhook logs
kubectl logs -n optipod-system -l app.kubernetes.io/name=optipod-webhook

Solutions:

  1. Verify cert-manager is installed and running
  2. Check certificate is not expired
  3. Verify CA bundle in MutatingWebhookConfiguration matches certificate
  4. Restart webhook pods after certificate rotation

Symptoms: Slow pod creation times

Diagnosis:

# Check webhook metrics (port-forward blocks, so run it in a separate terminal)
kubectl port-forward -n optipod-system svc/optipod-webhook-service 8080:8080
# In a second terminal:
curl http://localhost:8080/metrics | grep webhook_mutation_duration
# Check webhook resource usage
kubectl top pods -n optipod-system -l app.kubernetes.io/name=optipod-webhook

Solutions:

  1. Increase webhook resource limits
  2. Add more webhook replicas
  3. Increase timeout in webhook configuration
  4. Optimize policy selectors to reduce matching overhead

Restrict webhook network access:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: optipod-webhook-netpol
  namespace: optipod-system
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: optipod-webhook
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector: {}
      ports:
        - protocol: TCP
          port: 9443 # Webhook port
  egress:
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: TCP
          port: 443 # Kubernetes API

The webhook runs with restricted security context:

securityContext:
  runAsNonRoot: true
  runAsUser: 65532
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  capabilities:
    drop: ["ALL"]
  seccompProfile:
    type: RuntimeDefault