# Creating Your First Policy
This guide walks you through creating and configuring your first OptimizationPolicy. You’ll learn about each field, understand the options, and create a policy tailored to your needs.
## What is an OptimizationPolicy?

An OptimizationPolicy is a Kubernetes Custom Resource that tells OptiPod:
- Which workloads to optimize (via selectors)
- How to collect metrics (provider, window, percentile)
- Safety constraints (resource bounds, safety factors)
- How to apply changes (Auto, Recommend, or Disabled mode)
## Policy Structure

Every OptimizationPolicy has five main sections:

```yaml
apiVersion: optipod.optipod.io/v1alpha1
kind: OptimizationPolicy
metadata:
  name: my-policy
  namespace: default
spec:
  mode: Recommend       # 1. Operational mode
  selector: {}          # 2. Workload selection
  metricsConfig: {}     # 3. Metrics collection
  resourceBounds: {}    # 4. Safety constraints
  updateStrategy: {}    # 5. Update behavior
```

Let's build a policy step by step.
## Step 1: Choose an Operational Mode

The `mode` field controls what OptiPod does with recommendations:
### Recommend Mode (Safe Default)

Start here! OptiPod computes recommendations but doesn't apply them.

```yaml
spec:
  mode: Recommend
```

What happens:
- ✅ Recommendations are written as annotations on workloads
- ✅ No pods are restarted
- ✅ No resources are changed
- ✅ Safe to try in production
Use when:
- Testing OptiPod for the first time
- Evaluating recommendations before applying
- Generating impact reports
### Auto Mode (Opt-In)

OptiPod automatically applies recommendations.

```yaml
spec:
  mode: Auto
```

What happens:
- ⚠️ Resources are updated automatically
- ⚠️ Pods may be restarted (depending on update strategy)
- ✅ Continuous optimization
Use when:
- You’ve reviewed recommendations in Recommend mode
- You’ve generated and reviewed an impact report
- You’re confident in your safety constraints
### Disabled Mode

Stop processing workloads while preserving the policy configuration.

```yaml
spec:
  mode: Disabled
```

Use when:
- Temporarily pausing optimization
- Troubleshooting issues
- Maintaining policy configuration without active processing
## Step 2: Select Workloads

The `selector` field determines which workloads the policy applies to. You can combine multiple selector types.
### By Workload Labels

Target specific workloads using labels:

```yaml
spec:
  selector:
    workloadSelector:
      matchLabels:
        optipod.io/enabled: "true"
        tier: backend
```

Label your workloads:

```sh
kubectl label deployment my-app optipod.io/enabled=true
```

### By Namespace Labels
Target all workloads in namespaces with specific labels:

```yaml
spec:
  selector:
    namespaceSelector:
      matchLabels:
        environment: production
```

### By Namespace Names
Use allow/deny lists for namespace filtering:

```yaml
spec:
  selector:
    namespaces:
      allow:
        - production
        - staging
      deny:
        - kube-system
        - kube-public
```

Note: Deny takes precedence over allow.
### By Workload Type

Filter by workload type (Deployment, StatefulSet, DaemonSet):

```yaml
spec:
  selector:
    # Only optimize Deployments
    workloadTypes:
      include:
        - Deployment
```

Or exclude specific types:

```yaml
spec:
  selector:
    # Optimize everything except StatefulSets
    workloadTypes:
      exclude:
        - StatefulSet
```

Note: Exclude takes precedence over include.
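The two precedence rules above (deny beats allow, exclude beats include) can be sketched as follows. This is an illustrative model only; the helper names are hypothetical and the controller's actual implementation may differ.

```python
def namespace_allowed(ns, allow, deny):
    """Deny takes precedence over allow; an empty allow list means 'all namespaces'."""
    if deny and ns in deny:
        return False
    if allow:
        return ns in allow
    return True

def workload_type_selected(kind, include, exclude):
    """Exclude takes precedence over include; an empty include list means 'all types'."""
    if exclude and kind in exclude:
        return False
    if include:
        return kind in include
    return True
```

For example, a namespace listed in both `allow` and `deny` is denied, and a workload type listed in both `include` and `exclude` is excluded.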
### Combined Example

Combine multiple selectors for precise targeting:
```yaml
spec:
  selector:
    # Namespaces with environment=production label
    namespaceSelector:
      matchLabels:
        environment: production

    # Workloads with optipod.io/enabled=true label
    workloadSelector:
      matchLabels:
        optipod.io/enabled: "true"

    # Additional namespace filtering
    namespaces:
      deny:
        - kube-system

    # Only Deployments and DaemonSets
    workloadTypes:
      include:
        - Deployment
        - DaemonSet
```

## Step 3: Configure Metrics Collection
The `metricsConfig` section defines how OptiPod collects and analyzes metrics.
### Basic Configuration

```yaml
spec:
  metricsConfig:
    provider: metrics-server
    rollingWindow: 24h
    percentile: P90
    safetyFactor: 1.2
```

### Provider
Choose your metrics backend:

```yaml
metricsConfig:
  provider: metrics-server  # or prometheus
```

metrics-server:
- ✅ Simple setup
- ✅ Lower resource usage
- ✅ Built into most clusters
- ⚠️ Limited historical data (~15 minutes)
prometheus:
- ✅ Rich historical data
- ✅ Advanced queries
- ✅ Existing monitoring infrastructure
- ⚠️ Requires Prometheus installation
### Rolling Window

How far back to analyze metrics:

```yaml
metricsConfig:
  rollingWindow: 24h  # 1 day; 48h, 7d, etc.
```

Recommendations:
- 24h: Standard for most workloads
- 48h: More stable for variable workloads
- 7d: Very conservative, includes weekly patterns
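The window values above follow a simple `<number><unit>` shape. A minimal parser sketch for that shape, to make the accepted formats concrete (the controller's actual duration grammar is an assumption here):

```python
from datetime import timedelta

# Maps a unit suffix to the matching timedelta keyword argument.
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}

def parse_window(window):
    """Parse strings like '24h', '48h', or '7d' into a timedelta."""
    value, unit = int(window[:-1]), window[-1]
    if unit not in _UNITS:
        raise ValueError(f"unsupported duration unit: {unit!r}")
    return timedelta(**{_UNITS[unit]: value})
```

So `24h` and `7d` parse to one day and one week of history, respectively.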
### Percentile

Which percentile to use for recommendations:

```yaml
metricsConfig:
  percentile: P90  # P50, P90, or P99
```

P50 (Median):
- More aggressive optimization
- Use for stable, predictable workloads
- Higher risk of resource exhaustion
P90 (Recommended):
- Balanced approach
- Handles most spikes
- Good for typical workloads
P99 (Conservative):
- Very safe
- Handles nearly all spikes
- Use for critical or variable workloads
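To make the P50/P90/P99 trade-off concrete, here is how the three percentiles summarize a series of CPU samples with one spike. This sketch uses the nearest-rank method; OptiPod's exact interpolation method is not specified here.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest sample covering p% of the data."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Ten CPU samples in millicores; one spike at 480m.
samples = [200, 220, 250, 240, 230, 480, 210, 225, 235, 245]
```

On this data P50 gives 230m, P90 gives 250m, and only P99 chases the 480m spike — which is why P50 is aggressive and P99 conservative.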
### Safety Factor

Multiplier applied to the percentile:

```yaml
metricsConfig:
  safetyFactor: 1.2  # 20% buffer above P90
```

Recommendations:
- 1.1: Aggressive (10% buffer)
- 1.2: Balanced (20% buffer) - recommended
- 1.3-1.5: Conservative (30-50% buffer)
Example: If P90 CPU usage is 500m and safety factor is 1.2, the recommendation is 600m (500m × 1.2).
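The worked example above, as a one-line sketch (rounding to whole millicores is an assumption, not documented behavior):

```python
def recommend_cpu(p_usage_millicores, safety_factor):
    """Apply the safety-factor buffer on top of the observed percentile."""
    return round(p_usage_millicores * safety_factor)
```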
## Step 4: Set Resource Bounds

The `resourceBounds` section defines min/max constraints for recommendations.

```yaml
spec:
  resourceBounds:
    cpu:
      min: "100m"
      max: "4000m"
    memory:
      min: "128Mi"
      max: "8Gi"
```

Important: OptiPod will NEVER recommend values outside these bounds, regardless of observed usage.
### CPU Bounds

```yaml
resourceBounds:
  cpu:
    min: "100m"   # 0.1 CPU cores
    max: "4000m"  # 4 CPU cores
```

Recommendations:
- min: Set based on minimum viable performance
- max: Set based on node capacity and cost constraints
### Memory Bounds

```yaml
resourceBounds:
  memory:
    min: "128Mi"  # 128 mebibytes
    max: "8Gi"    # 8 gibibytes
```

Recommendations:
- min: Set based on application startup requirements
- max: Set based on node capacity and OOM risk tolerance
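The "bounds always win" behavior described above amounts to a clamp. A sketch (values in millicores; the function name is illustrative, not OptiPod's API):

```python
def clamp_to_bounds(recommended, minimum, maximum):
    """Resource bounds override the observed-usage recommendation."""
    return max(minimum, min(recommended, maximum))
```

For example, with CPU bounds of 100m-4000m, an observed-usage recommendation of 50m is raised to 100m and one of 6000m is capped at 4000m.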
## Step 5: Configure Update Strategy

The `updateStrategy` section controls how changes are applied.
### Basic Configuration

```yaml
spec:
  updateStrategy:
    allowInPlaceResize: true
    allowRecreate: false
    updateRequestsOnly: true
    useServerSideApply: true
```

### Allow In-Place Resize
Enable in-place pod resize (Kubernetes 1.29+):

```yaml
updateStrategy:
  allowInPlaceResize: true
```

When true:
- Resources updated without pod restart (if supported)
- Faster, less disruptive
When false:

- Changes require pod recreation
- Only applied if `allowRecreate: true`
### Allow Recreate

Allow pod recreation when in-place resize isn't available:

```yaml
updateStrategy:
  allowRecreate: false  # Safer default
```

When true:
- ⚠️ Pods will be restarted to apply changes
- Use with caution in production
When false:
- Changes requiring recreation are skipped
- Safer, but some optimizations won’t apply
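Taken together, `allowInPlaceResize` and `allowRecreate` form a small decision table for how a change gets applied. A hypothetical sketch of that table (outcome names are illustrative, not controller output):

```python
def update_action(in_place_supported, allow_in_place, allow_recreate):
    """Decide how a recommendation is applied, given cluster support and policy flags."""
    if allow_in_place and in_place_supported:
        return "resize-in-place"   # no restart needed
    if allow_recreate:
        return "recreate"          # pods restart to pick up new resources
    return "skip"                  # change cannot be applied safely
```

With the safe defaults shown above (`allowRecreate: false`), a cluster without in-place resize support simply skips the change.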
### Update Requests Only

Control whether to update limits:

```yaml
updateStrategy:
  updateRequestsOnly: true  # Recommended
```

When true:
- Only resource requests are updated
- Limits remain unchanged
- Simpler, less risky
When false:

- Both requests and limits are updated
- Limits calculated using `limitConfig` multipliers
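A sketch of what multiplier-based limit derivation could look like. The per-resource multiplier shape is an assumption inferred from the mention of `limitConfig`; consult the CRD Reference for the actual field schema.

```python
def derive_limits(requests, multipliers):
    """Derive resource limits by scaling each request by its multiplier.

    requests and multipliers map resource names ("cpu", "memory") to numbers;
    a resource with no configured multiplier keeps its request as the limit.
    """
    return {res: value * multipliers.get(res, 1.0) for res, value in requests.items()}
```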
### Use Server-Side Apply

Enable Server-Side Apply for GitOps compatibility:

```yaml
updateStrategy:
  useServerSideApply: true  # Recommended
```

When true:
- ✅ Compatible with ArgoCD, Flux
- ✅ Field-level ownership tracking
- ✅ No sync conflicts
When false:
- ⚠️ May conflict with GitOps tools
- Uses Strategic Merge Patch
## Complete Example

Here's a complete, production-ready policy:
```yaml
apiVersion: optipod.optipod.io/v1alpha1
kind: OptimizationPolicy
metadata:
  name: production-backend
  namespace: default
spec:
  # Start in Recommend mode
  mode: Recommend

  # Target backend workloads in production
  selector:
    namespaceSelector:
      matchLabels:
        environment: production
    workloadSelector:
      matchLabels:
        tier: backend
        optipod.io/enabled: "true"
    namespaces:
      deny:
        - kube-system
    workloadTypes:
      include:
        - Deployment

  # Balanced metrics configuration
  metricsConfig:
    provider: metrics-server
    rollingWindow: 24h
    percentile: P90
    safetyFactor: 1.2

  # Conservative resource bounds
  resourceBounds:
    cpu:
      min: "100m"
      max: "4000m"
    memory:
      min: "128Mi"
      max: "8Gi"

  # Safe update strategy
  updateStrategy:
    allowInPlaceResize: true
    allowRecreate: false
    updateRequestsOnly: true
    useServerSideApply: true

  # Check every 5 minutes
  reconciliationInterval: 5m
```

## Applying Your Policy
### Create the Policy

```sh
kubectl apply -f my-policy.yaml
```

### Verify Creation

```sh
# Check policy exists
kubectl get optimizationpolicy

# View policy details
kubectl describe optimizationpolicy production-backend
```

### Label Workloads
```sh
# Label a single deployment
kubectl label deployment my-app optipod.io/enabled=true

# Label multiple deployments
kubectl label deployment -l tier=backend optipod.io/enabled=true
```

### Monitor Policy Status

```sh
# Check workload counts
kubectl get optimizationpolicy production-backend -o yaml | grep -A5 status

# Watch for changes
kubectl get optimizationpolicy -w
```

## Reviewing Recommendations
Section titled “Reviewing Recommendations”Check Policy Status
Section titled “Check Policy Status”kubectl describe optimizationpolicy production-backendLook for:
workloadsDiscovered: How many workloads matchworkloadsProcessed: How many were successfully processedlastReconciliation: When the policy last ran
### View Workload Recommendations

Recommendations are stored as annotations on each workload:

```sh
# View all annotations
kubectl get deployment my-app -o yaml | grep optipod.io

# View specific recommendations
kubectl get deployment my-app -o jsonpath='{.metadata.annotations}' | jq
```

Example annotations:

```yaml
metadata:
  annotations:
    optipod.io/managed: "true"
    optipod.io/policy: "production-backend"
    optipod.io/last-recommendation: "2025-01-28T10:30:00Z"
    optipod.io/recommendation.app.cpu-request: "250m"
    optipod.io/recommendation.app.memory-request: "512Mi"
```

### Generate Impact Report
Before switching to Auto mode, generate a report:

```sh
curl -fsSL https://raw.githubusercontent.com/Sagart-cactus/optipod/main/scripts/optipod-recommendation-report.sh -o report.sh
chmod +x report.sh
./report.sh -o html -f impact-report.html
```

## Switching to Auto Mode
Once you're satisfied with the recommendations:

```sh
kubectl patch optimizationpolicy production-backend \
  --type=merge \
  -p '{"spec":{"mode":"Auto"}}'
```

Important: Monitor your workloads closely after switching to Auto mode.
## Common Patterns

### Pattern 1: Start Conservative

```yaml
spec:
  mode: Recommend
  metricsConfig:
    percentile: P99
    safetyFactor: 1.3
  updateStrategy:
    allowRecreate: false
```

### Pattern 2: Gradual Adoption
```yaml
# Week 1: Recommend mode, review
mode: Recommend

# Week 2: Auto mode, no recreation
mode: Auto
updateStrategy:
  allowRecreate: false

# Week 3: Allow recreation if needed
updateStrategy:
  allowRecreate: true
```

### Pattern 3: Environment-Specific
```yaml
# Development: Aggressive
metricsConfig:
  percentile: P50
  safetyFactor: 1.1
updateStrategy:
  allowRecreate: true

# Production: Conservative
metricsConfig:
  percentile: P99
  safetyFactor: 1.3
updateStrategy:
  allowRecreate: false
```

## Troubleshooting
Section titled “Troubleshooting”No Workloads Discovered
Section titled “No Workloads Discovered”Check:
- Workload labels match selector
- Namespace is not in deny list
- Workload type is not excluded
```sh
# Verify labels
kubectl get deployment my-app --show-labels

# Check policy selector
kubectl get optimizationpolicy production-backend -o yaml | grep -A10 selector
```

### Recommendations Not Appearing
Check:
- Policy is in Recommend or Auto mode
- Sufficient metrics data collected
- Controller logs for errors
```sh
# Check mode
kubectl get optimizationpolicy production-backend -o jsonpath='{.spec.mode}'

# Check controller logs
kubectl logs -n optipod-system -l app.kubernetes.io/component=controller --tail=50
```

### Changes Not Applied (Auto Mode)
Check:

- `allowRecreate` setting if in-place resize is unavailable
- Recommendations within resource bounds
- Controller logs for skip reasons
```sh
# Check update strategy
kubectl get optimizationpolicy production-backend -o yaml | grep -A5 updateStrategy

# Check logs
kubectl logs -n optipod-system -l app.kubernetes.io/component=controller | grep -i skip
```

## Next Steps
- Architecture Overview - Understand how OptiPod works
- Operational Modes - Deep dive into Auto, Recommend, and Disabled modes
- Update Strategies - Learn about SSA vs Webhook strategies
- CRD Reference - Complete field documentation
For advanced configuration options, see the CRD Reference.