Webhook Configuration
This guide covers the configuration and operation of OptiPod’s mutating admission webhook, including TLS setup, cert-manager integration, high availability, and performance tuning.
Overview
OptiPod’s webhook provides runtime resource injection for pods. When enabled, the webhook:
- Intercepts pod creation requests
- Reads resource recommendations from workload annotations
- Injects resource requests and limits into pod specifications
- Allows the pod to be created with optimized resources
This approach is GitOps-compatible because:
- Recommendations are stored in workload metadata (not pod templates)
- ArgoCD/Flux manage the workload spec
- The webhook applies recommendations at pod creation time
- No conflicts with GitOps controllers
When to Use the Webhook
Use the webhook strategy when:
- Using GitOps: ArgoCD or Flux manage your workloads
- Immutable infrastructure: Pod templates should not be modified
- Gradual rollout: Apply recommendations only to new pods
- Compliance requirements: Workload specs must match Git
Use Server-Side Apply (SSA) instead when:
- Direct kubectl management: No GitOps controller
- Immediate updates: Want to update existing pods
- Simpler setup: Don’t want to manage webhook certificates
Architecture
Webhook Flow
- Policy Reconciliation: Controller generates recommendations and stores them in workload annotations
- Pod Creation: User or controller creates a pod (e.g., Deployment rollout)
- Webhook Intercept: Kubernetes API server calls OptiPod webhook
- Annotation Lookup: Webhook reads recommendations from parent workload annotations
- Resource Injection: Webhook generates JSON patches to modify pod resources
- Pod Admission: Modified pod is created with optimized resources
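The resource-injection step can be pictured as an RFC 6902 JSON Patch that the webhook returns in its AdmissionReview response. The sketch below is illustrative only: the helper names `build_resource_patch` and `encode_patch`, the container index, and the sample values are assumptions, and the patch OptiPod actually emits may be laid out differently.

```shell
# Hypothetical helper (not taken from OptiPod): build a JSON Patch that sets
# requests and limits on the first container of a pod.
# Arguments: request CPU, request memory, limit CPU, limit memory.
build_resource_patch() {
  local req_cpu="$1" req_mem="$2" lim_cpu="$3" lim_mem="$4"
  printf '[{"op":"add","path":"/spec/containers/0/resources","value":{"requests":{"cpu":"%s","memory":"%s"},"limits":{"cpu":"%s","memory":"%s"}}}]' \
    "$req_cpu" "$req_mem" "$lim_cpu" "$lim_mem"
}

# The API server expects the patch base64-encoded in response.patch,
# alongside patchType: JSONPatch.
encode_patch() {
  base64 | tr -d '\n'
}

# Print the raw patch, then the encoded form a webhook would return.
build_resource_patch 250m 256Mi 500m 512Mi
echo
build_resource_patch 250m 256Mi 500m 512Mi | encode_patch
echo
```

Because the `add` operation targets the whole `resources` object, it replaces whatever the pod template declared for that container, which is why both requests and limits are supplied together.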
TLS Certificate Management
The webhook requires TLS certificates for secure communication with the Kubernetes API server. OptiPod supports two certificate management approaches:
Option 1: cert-manager (Recommended)
cert-manager automatically generates and rotates certificates.
Prerequisites:
- cert-manager installed in the cluster
Installation:
    # Install OptiPod with cert-manager
    helm install optipod optipod/optipod \
      --set webhook.enabled=true \
      --set webhook.certManager.enabled=true \
      --namespace optipod-system \
      --create-namespace
How it works:
- Helm creates a self-signed Issuer
- Helm creates a Certificate resource with 1-year validity
- cert-manager generates the certificate and stores it in a Secret
- cert-manager injects the CA bundle into the MutatingWebhookConfiguration
- cert-manager automatically renews certificates before expiry
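The renewal and CA-injection steps can be observed from the command line. The following is a minimal sketch: the function names are made up for illustration, and it assumes the chart relies on cert-manager's `cert-manager.io/inject-ca-from` annotation for CA injection, which the source does not state explicitly.

```shell
# List each Certificate with its expiry and the time cert-manager plans to
# renew it (status.notAfter and status.renewalTime are cert-manager fields).
show_certificate_renewal() {
  kubectl get certificate -n optipod-system \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.notAfter}{"\t"}{.status.renewalTime}{"\n"}{end}'
}

# Print which Certificate the cainjector copies the CA bundle from.
# Assumption: the webhook configuration carries the inject-ca-from annotation.
show_ca_injection_source() {
  kubectl get mutatingwebhookconfiguration optipod-webhook \
    -o jsonpath='{.metadata.annotations.cert-manager\.io/inject-ca-from}'
}

# Usage:
#   show_certificate_renewal
#   show_ca_injection_source
```

An empty result from `show_ca_injection_source` would suggest the CA bundle is being populated some other way.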
Verification:
    # Check certificate status
    kubectl get certificate -n optipod-system

    # Check secret
    kubectl get secret webhook-server-certs -n optipod-system

    # Check webhook configuration has CA bundle
    kubectl get mutatingwebhookconfiguration optipod-webhook -o yaml | grep caBundle
Option 2: Manual Certificates
For clusters without cert-manager, use manually generated certificates.
Generate certificates:
    # Create certificate directory
    mkdir -p /tmp/optipod-certs
    cd /tmp/optipod-certs

    # Generate private key and certificate
    openssl req -x509 -newkey rsa:2048 \
      -keyout tls.key -out tls.crt \
      -days 365 -nodes \
      -subj "/CN=optipod-webhook-service.optipod-system.svc" \
      -addext "subjectAltName=DNS:optipod-webhook-service.optipod-system.svc,DNS:optipod-webhook-service.optipod-system.svc.cluster.local"

    # Create the namespace first; the secret cannot be created before it exists
    kubectl create namespace optipod-system

    # Create Kubernetes secret
    kubectl create secret tls webhook-server-certs \
      --cert=tls.crt \
      --key=tls.key \
      -n optipod-system
Install OptiPod:
    helm install optipod optipod/optipod \
      --set webhook.enabled=true \
      --set webhook.certManager.enabled=false \
      --namespace optipod-system \
      --create-namespace
Certificate Rotation:
Manual certificates must be rotated before expiry:
    # Check certificate expiry
    openssl x509 -in tls.crt -noout -enddate

    # Rotate certificate (repeat generation steps)
    # Update secret
    kubectl create secret tls webhook-server-certs \
      --cert=tls.crt \
      --key=tls.key \
      -n optipod-system \
      --dry-run=client -o yaml | kubectl apply -f -

    # Restart webhook pods
    kubectl rollout restart deployment optipod-webhook -n optipod-system
Webhook Configuration
Basic Configuration
Enable the webhook in Helm values:
    webhook:
      enabled: true

      # Certificate management
      certManager:
        enabled: true  # Use cert-manager (recommended)

      # Replica count for high availability
      replicas: 2

      # Resource limits
      resources:
        limits:
          cpu: 200m
          memory: 256Mi
        requests:
          cpu: 50m
          memory: 64Mi
Advanced Configuration
    webhook:
      enabled: true

      # High availability
      replicas: 3

      # Pod disruption budget
      podDisruptionBudget:
        enabled: true
        minAvailable: 1

      # Webhook server configuration
      port: 9443
      timeoutSeconds: 10
      failurePolicy: Ignore  # or "Fail" for strict enforcement

      # Namespace selector (exclude system namespaces).
      # kubernetes.io/metadata.name is the label Kubernetes sets on every
      # namespace; a plain "name" label only works if you add it yourself.
      namespaceSelector:
        matchExpressions:
          - key: kubernetes.io/metadata.name
            operator: NotIn
            values: ["kube-system", "kube-public", "kube-node-lease"]

      # Object selector (only pods carrying the opt-in label;
      # selectors match labels, not annotations)
      objectSelector:
        matchExpressions:
          - key: optipod.io/webhook-enabled
            operator: In
            values: ["true"]
Failure Policy
The failurePolicy determines webhook behavior when the webhook server is unavailable:
Ignore (default): Allow pod creation even if webhook fails
- Pros: No service disruption
- Cons: Pods may be created without optimizations
- Use for: Non-critical workloads, testing
Fail: Block pod creation if webhook fails
- Pros: Ensures all pods are optimized
- Cons: Service disruption if webhook is down
- Use for: Critical workloads, strict compliance
Recommendation: Start with Ignore, switch to Fail after validating webhook stability.
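One way to follow that recommendation is to inspect the live policy and then switch it through Helm, so the change is not overwritten by the next upgrade. This is a sketch under assumptions: the function names are hypothetical, and it presumes the release is called `optipod` and that `webhook.failurePolicy` is the values key, as in the examples in this guide.

```shell
# Print the failurePolicy of every webhook entry in the configuration.
show_failure_policy() {
  kubectl get mutatingwebhookconfiguration optipod-webhook \
    -o jsonpath='{range .webhooks[*]}{.name}{"\t"}{.failurePolicy}{"\n"}{end}'
}

# Switch the policy via Helm. Only the two values Kubernetes accepts are allowed.
set_failure_policy() {
  case "$1" in
    Ignore|Fail) ;;
    *)
      echo "failurePolicy must be Ignore or Fail, got: $1" >&2
      return 1
      ;;
  esac
  helm upgrade optipod optipod/optipod \
    --namespace optipod-system \
    --reuse-values \
    --set webhook.failurePolicy="$1"
}

# Usage:
#   show_failure_policy
#   set_failure_policy Fail
```

Validating the argument before calling Helm avoids rolling out a configuration the API server would reject.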
High Availability
Multiple Replicas
Run multiple webhook replicas for availability:
    webhook:
      replicas: 3  # Minimum 2 for HA

      podDisruptionBudget:
        enabled: true
        minAvailable: 1  # At least 1 replica always available
Pod Anti-Affinity
Distribute replicas across nodes:
    webhook:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app.kubernetes.io/name: optipod-webhook
              topologyKey: kubernetes.io/hostname
Performance Tuning
Resource Limits
Adjust based on cluster size:
    # Small clusters (< 100 pods/min)
    webhook:
      resources:
        limits:
          cpu: 200m
          memory: 256Mi
        requests:
          cpu: 50m
          memory: 64Mi

    # Large clusters (> 1000 pods/min)
    webhook:
      resources:
        limits:
          cpu: 1000m
          memory: 512Mi
        requests:
          cpu: 500m
          memory: 256Mi
Timeout Configuration
Balance between reliability and performance:
    # Fast timeout (low latency, may fail under load)
    webhook:
      timeoutSeconds: 5

    # Standard timeout (balanced)
    webhook:
      timeoutSeconds: 10

    # Long timeout (high reliability, higher latency)
    webhook:
      timeoutSeconds: 30
Monitoring
Metrics
The webhook exposes Prometheus metrics on port 8080:
    # Admission requests
    optipod_webhook_admission_requests_total{namespace, pod_name, dry_run}

    # Admission success
    optipod_webhook_admission_success_total{namespace, policy, patches_applied}

    # Admission failures
    optipod_webhook_admission_failures_total{namespace, failure_reason}

    # Mutation duration
    optipod_webhook_mutation_duration_seconds{namespace, policy}

    # Server health
    optipod_webhook_server_health_status{endpoint}

    # Certificate expiry
    optipod_webhook_certificate_expiry_timestamp_seconds{cert_type}
Grafana Dashboard
Example queries:
    # Admission request rate
    rate(optipod_webhook_admission_requests_total[5m])

    # Success rate (aggregated on both sides, because the two counters
    # carry different label sets and would not match series-by-series)
    sum(rate(optipod_webhook_admission_success_total[5m]))
      /
    sum(rate(optipod_webhook_admission_requests_total[5m]))

    # P95 mutation latency
    histogram_quantile(0.95, sum by (le) (rate(optipod_webhook_mutation_duration_seconds_bucket[5m])))

    # Certificate expiry (days remaining)
    (optipod_webhook_certificate_expiry_timestamp_seconds - time()) / 86400
Troubleshooting
Webhook Not Called
Symptoms: Pods created without resource modifications
Diagnosis:
    # Check webhook configuration exists
    kubectl get mutatingwebhookconfiguration optipod-webhook

    # Check webhook service
    kubectl get svc optipod-webhook-service -n optipod-system

    # Check webhook pods
    kubectl get pods -n optipod-system -l app.kubernetes.io/name=optipod-webhook
Solutions:
- Verify webhook is enabled in Helm values
- Check namespace and object selectors match your pods
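- Ensure the workload has the optipod.io/webhook-enabled: "true" annotation, and, if an objectSelector is configured, that its pods also carry the matching label (selectors match labels, not annotations)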
Certificate Errors
Symptoms: Webhook admission failures with TLS errors
Diagnosis:
    # Check certificate secret
    kubectl get secret webhook-server-certs -n optipod-system

    # Check certificate validity
    kubectl get certificate -n optipod-system

    # Check webhook logs
    kubectl logs -n optipod-system -l app.kubernetes.io/name=optipod-webhook
Solutions:
- Verify cert-manager is installed and running
- Check certificate is not expired
- Verify CA bundle in MutatingWebhookConfiguration matches certificate
- Restart webhook pods after certificate rotation
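The CA bundle comparison can be scripted. The sketch below is a rough check rather than full chain validation: it assumes a single-certificate bundle, assumes the first webhook entry is the relevant one, and falls back to `tls.crt` when the secret has no `ca.crt` (the case for a self-signed certificate created with `kubectl create secret tls`). The function names are hypothetical.

```shell
# Compare two base64-encoded PEM blobs, ignoring whitespace differences.
same_pem() {
  local left right
  left="$(printf '%s' "$1" | base64 -d | tr -d '[:space:]')"
  right="$(printf '%s' "$2" | base64 -d | tr -d '[:space:]')"
  [ -n "$left" ] && [ "$left" = "$right" ]
}

# Compare the webhook's caBundle with the CA stored in the certificate secret.
check_ca_bundle() {
  local bundle ca
  bundle="$(kubectl get mutatingwebhookconfiguration optipod-webhook \
    -o jsonpath='{.webhooks[0].clientConfig.caBundle}')"
  ca="$(kubectl get secret webhook-server-certs -n optipod-system \
    -o jsonpath='{.data.ca\.crt}')"
  # No ca.crt in the secret: treat the self-signed tls.crt as its own CA.
  if [ -z "$ca" ]; then
    ca="$(kubectl get secret webhook-server-certs -n optipod-system \
      -o jsonpath='{.data.tls\.crt}')"
  fi
  if same_pem "$bundle" "$ca"; then
    echo "caBundle matches the certificate secret"
  else
    echo "caBundle does NOT match the certificate secret" >&2
    return 1
  fi
}

# Usage:
#   check_ca_bundle
```

A mismatch after rotation usually means the webhook configuration still holds the old CA and needs to be updated before the API server will trust the new certificate.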
High Latency
Symptoms: Slow pod creation times
Diagnosis:
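    # Check webhook metrics (port-forward runs in the background so that
    # curl can be issued from the same shell)
    kubectl port-forward -n optipod-system svc/optipod-webhook-service 8080:8080 &
    sleep 2  # Give the port-forward a moment to start
    curl http://localhost:8080/metrics | grep webhook_mutation_duration
    kill %1  # Stop the port-forward

    # Check webhook resource usage
    kubectl top pods -n optipod-system -l app.kubernetes.io/name=optipod-webhook
Solutions: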
- Increase webhook resource limits
- Add more webhook replicas
- Increase timeout in webhook configuration
- Optimize policy selectors to reduce matching overhead
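When raising the timeout, keep in mind that Kubernetes only accepts admission webhook timeouts between 1 and 30 seconds. The sketch below checks a candidate value before it is applied; the function names are hypothetical.

```shell
# Return success only for an integer in the 1-30 second range that
# Kubernetes allows for admission webhook timeoutSeconds.
valid_webhook_timeout() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;  # reject empty or non-numeric input
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 30 ]
}

# Print the timeout currently configured on each webhook entry.
show_webhook_timeout() {
  kubectl get mutatingwebhookconfiguration optipod-webhook \
    -o jsonpath='{.webhooks[*].timeoutSeconds}'
}

# Usage:
#   show_webhook_timeout
#   valid_webhook_timeout 15 && echo "15s is an acceptable timeout"
```

A longer timeout only hides latency; if requests regularly approach the limit, adding replicas or resources is the more durable fix.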
Security Considerations
Section titled “Security Considerations”Network Policies
Section titled “Network Policies”Restrict webhook network access:
apiVersion: networking.k8s.io/v1kind: NetworkPolicymetadata: name: optipod-webhook-netpol namespace: optipod-systemspec: podSelector: matchLabels: app.kubernetes.io/name: optipod-webhook policyTypes: - Ingress - Egress ingress: - from: - namespaceSelector: {} ports: - protocol: TCP port: 9443 # Webhook port egress: - to: - namespaceSelector: {} ports: - protocol: TCP port: 443 # Kubernetes APIPod Security Standards
Section titled “Pod Security Standards”The webhook runs with restricted security context:
securityContext: runAsNonRoot: true runAsUser: 65532 readOnlyRootFilesystem: true allowPrivilegeEscalation: false capabilities: drop: ["ALL"] seccompProfile: type: RuntimeDefault
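To confirm what is actually deployed, the security context can be read back from the Deployment. This is a minimal sketch with a hypothetical function name; it assumes the Deployment is named `optipod-webhook`, as in the rotation commands earlier in this guide, and prints both the pod-level and container-level settings because some of these fields may be set at either level.

```shell
# Print the pod-level security context, then each container's own.
show_webhook_security_context() {
  kubectl get deployment optipod-webhook -n optipod-system \
    -o jsonpath='{.spec.template.spec.securityContext}{"\n"}{range .spec.template.spec.containers[*]}{.name}{": "}{.securityContext}{"\n"}{end}'
}

# Usage:
#   show_webhook_security_context
```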