Key Terminology
Understanding these key terms will help you work effectively with OptiPod.
Core Concepts
Operator
A Kubernetes operator is a software extension that uses custom resources to manage applications and their components. OptiPod is an operator that manages resource optimization for your workloads.
Key characteristics:
- Runs as a deployment in your cluster
- Watches for OptimizationPolicy resources
- Analyzes workload metrics
- Generates and applies recommendations
OptimizationPolicy
A custom resource, defined by a Custom Resource Definition (CRD), that tells OptiPod how to optimize workloads. Each policy specifies:
- Target workloads: Which resources to optimize (by namespace, labels, workload types)
- Operational mode: Recommend, Auto, or Disabled
- Metrics configuration: Provider, rolling window, percentile, safety factor
- Resource bounds: Min/max limits for CPU and memory
- Update strategy: How to apply recommendations (SSA or webhook)
Example:

    apiVersion: optipod.optipod.io/v1alpha1
    kind: OptimizationPolicy
    metadata:
      name: production-policy
    spec:
      mode: Recommend
      selector:
        workloadSelector:
          matchLabels:
            env: production
        workloadTypes:
          include: [Deployment, StatefulSet]
      metricsConfig:
        provider: prometheus
        rollingWindow: 7d
      resourceBounds:
        cpu:
          min: 10m
          max: 2000m
        memory:
          min: 64Mi
          max: 4Gi
      updateStrategy:
        strategy: webhook

Workload
A Kubernetes resource that runs containers. OptiPod supports:
- Deployments: Stateless applications
- StatefulSets: Stateful applications
- DaemonSets: Node-level services
Note: Jobs and CronJobs are not currently supported.
Recommendation
A suggested change to a workload’s resource requests and limits, based on actual usage patterns. Recommendations are calculated using:
- Percentile-based analysis: P50, P90, or P99 of historical usage
- Safety factor: Multiplier applied to percentile (default 1.2)
- Resource bounds: Min/max constraints from policy
- Weight: Policy priority when multiple policies match
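The inputs listed above can be combined as in the following sketch. It is illustrative only and not OptiPod's actual implementation: the `Policy` dataclass, its field names, and the `recommend_cpu_millicores` helper are hypothetical. The order of operations (pick the highest-weight policy, multiply the percentile by the safety factor, round to the nearest millicore, then clamp to the bounds) is an assumption based on the descriptions in this glossary.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    """Hypothetical, simplified view of an OptimizationPolicy (CPU only)."""
    name: str
    weight: int = 100            # default weight stated in this glossary
    safety_factor: float = 1.2   # default safety factor stated in this glossary
    cpu_min_m: int = 10          # lower bound in millicores (example value)
    cpu_max_m: int = 2000        # upper bound in millicores (example value)


def recommend_cpu_millicores(percentile_m: float, policies: list[Policy]) -> int:
    """Return a CPU request recommendation in millicores.

    Assumed steps: the highest-weight policy wins, the percentile is scaled
    by that policy's safety factor, and the result is clamped to its bounds.
    """
    if not policies:
        raise ValueError("at least one matching policy is required")
    # On equal weights max() keeps the first policy; real tie-breaking is unknown.
    policy = max(policies, key=lambda p: p.weight)
    scaled = round(percentile_m * policy.safety_factor)
    return max(policy.cpu_min_m, min(scaled, policy.cpu_max_m))


policies = [
    Policy(name="default"),
    Policy(name="production", weight=200, safety_factor=1.5, cpu_max_m=300),
]
# "production" wins on weight; 250m x 1.5 = 375m, clamped to its 300m maximum.
print(recommend_cpu_millicores(250, policies))
```

In this sketch the bounds act last, so a generous safety factor can never push a recommendation past the policy's maximum.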
Recommendations are stored as per-resource annotations on the workload:

    optipod.io/recommendation.<container-name>.cpu-request
    optipod.io/recommendation.<container-name>.memory-request
    optipod.io/recommendation.<container-name>.cpu-limit
    optipod.io/recommendation.<container-name>.memory-limit

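As a small illustration of the annotation naming scheme, the sketch below builds the four keys for a container and reads the matching values out of a workload's annotation map. The helper names and the sample values (`300m`, `256Mi`) are hypothetical; only the key pattern comes from this page.

```python
ANNOTATION_PREFIX = "optipod.io/recommendation"
RESOURCE_SUFFIXES = ("cpu-request", "memory-request", "cpu-limit", "memory-limit")


def recommendation_annotation_keys(container_name: str) -> list[str]:
    """Build the four per-resource annotation keys for one container."""
    return [
        f"{ANNOTATION_PREFIX}.{container_name}.{suffix}"
        for suffix in RESOURCE_SUFFIXES
    ]


def read_recommendations(annotations: dict[str, str], container_name: str) -> dict[str, str]:
    """Collect the recommendation values present for one container."""
    prefix = f"{ANNOTATION_PREFIX}.{container_name}."
    found = {}
    for key, value in annotations.items():
        if key.startswith(prefix) and key[len(prefix):] in RESOURCE_SUFFIXES:
            found[key[len(prefix):]] = value
    return found


# Hypothetical annotations as they might appear on a Deployment.
annotations = {
    "optipod.io/recommendation.app.cpu-request": "300m",
    "optipod.io/recommendation.app.memory-request": "256Mi",
    "unrelated/annotation": "value",
}
print(recommendation_annotation_keys("app"))
print(read_recommendations(annotations, "app"))
```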
GitOps
A deployment methodology where Git is the single source of truth for infrastructure and application configuration. Changes are made through Git commits and automatically applied to the cluster.
OptiPod’s GitOps compatibility:
With Webhook Strategy:
- Recommend mode stores suggestions as annotations only
- Workload specs in Git remain unchanged
- You review and commit changes to Git if desired
- Webhook applies recommendations at pod creation time
- No conflicts with ArgoCD/Flux
With SSA Strategy:
- OptiPod directly updates resource fields in workload specs
- May require configuring GitOps tools to ignore resource fields
- Use ArgoCD’s ignoreDifferences or SSA sync options
Server-Side Apply (SSA)
A Kubernetes feature that allows multiple controllers to manage different fields of the same resource without conflicts. OptiPod can use SSA to apply recommendations directly to workload specs.
Benefits:
- No webhook required
- Direct field updates
- Clear field ownership
- In-place pod resize support (Kubernetes 1.27+; alpha in 1.27 behind the InPlacePodVerticalScaling feature gate)
Metrics
Metrics Provider
The source of resource usage data. OptiPod supports:
- Prometheus: Full-featured metrics with historical data (recommended)
- metrics-server: Basic CPU/memory metrics with sampling
- Custom providers: Integrate your own metrics source (future)
Rolling Window
The time period OptiPod analyzes to generate recommendations. Longer windows provide more stable recommendations but take longer to react to changes.
Typical values:
- 7 days: Balanced (default)
- 14 days: More stable, slower to adapt
- 24 hours: Faster adaptation, less stable
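Window lengths like these are written as short duration strings such as `24h` or `7d`. How OptiPod parses them is not described here, so the following is only a sketch of one way to turn such a string into a time span. The `parse_rolling_window` helper is hypothetical and deliberately supports only whole hours and days.

```python
import re
from datetime import timedelta

# Assumed format: a positive integer followed by "h" (hours) or "d" (days).
_WINDOW_RE = re.compile(r"(\d+)([hd])")


def parse_rolling_window(value: str) -> timedelta:
    """Convert a duration string such as '24h' or '7d' into a timedelta."""
    match = _WINDOW_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"unsupported rolling window: {value!r}")
    amount = int(match.group(1))
    if match.group(2) == "h":
        return timedelta(hours=amount)
    return timedelta(days=amount)


for window in ("24h", "7d", "14d"):
    print(window, "->", parse_rolling_window(window))
```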
Configuration:

    metricsConfig:
      rollingWindow: 7d # or 24h, 14d, 30d

Percentile
The statistical measure used to determine resource recommendations. OptiPod supports P50, P90, and P99.
Example:
- P90 CPU = 250m means 90% of the time, CPU usage is below 250m
- P99 provides more headroom for spikes
- P50 is more aggressive but riskier
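To make the percentile idea concrete, here is a minimal sketch using the nearest-rank method on a handful of made-up CPU samples. This is an assumption for illustration: the page does not say which percentile estimator OptiPod or its metrics provider uses, and the sample values are invented.

```python
def nearest_rank_percentile(samples: list[int], percentile: int) -> int:
    """Return the nearest-rank percentile of the samples.

    The rank is ceil(percentile / 100 * n), computed with integer math
    to avoid floating-point rounding surprises.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be between 1 and 100")
    ordered = sorted(samples)
    rank = (percentile * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Invented CPU usage samples in millicores, including one spike.
cpu_millicores = [100, 120, 130, 150, 160, 180, 200, 220, 250, 400]

for p in (50, 90, 99):
    print(f"P{p} = {nearest_rank_percentile(cpu_millicores, p)}m")
```

With these samples P90 lands on 250m while P99 picks up the 400m spike, which is why a higher percentile leaves more headroom.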
Configuration:

    metricsConfig:
      percentile: P90 # or P50, P99

Safety Factor
A multiplier applied to the selected percentile to add headroom. Default is 1.2 (20% above percentile).
Example:
- P90 = 250m, safety factor = 1.2
- Recommendation = 250m × 1.2 = 300m
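The same arithmetic can be repeated for other factors to see how much headroom each one adds. The sketch below is illustrative: `apply_safety_factor` is a hypothetical helper, and rounding to the nearest millicore is an assumption, since the page does not state how fractional results are handled.

```python
from decimal import ROUND_HALF_UP, Decimal


def apply_safety_factor(percentile_millicores: int, safety_factor: str) -> int:
    """Scale a percentile (in millicores) by a safety factor.

    The factor is passed as a string so Decimal can represent it exactly.
    """
    scaled = Decimal(percentile_millicores) * Decimal(safety_factor)
    return int(scaled.to_integral_value(rounding=ROUND_HALF_UP))


p90 = 250  # millicores, as in the example above
for factor in ("1.0", "1.2", "1.5"):
    print(f"safety factor {factor}: {apply_safety_factor(p90, factor)}m")
```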
Configuration:

    metricsConfig:
      safetyFactor: 1.2 # 1.0 = no headroom, 1.5 = 50% headroom

Resource Management
Resource Request
The amount of CPU/memory Kubernetes guarantees for a container. Used for scheduling decisions.
Impact:
- Too low: Pod may be scheduled on overloaded nodes, leading to throttling or OOM
- Too high: Wastes cluster capacity and increases costs
Resource Limit
The maximum amount of CPU/memory a container can use.
Impact:
- CPU: Container is throttled if exceeded
- Memory: Container is OOM-killed if exceeded
Limit Configuration
OptiPod can calculate limits based on requests using multipliers:

    updateStrategy:
      limitConfig:
        cpuLimitMultiplier: 1.0 # limit = request × 1.0
        memoryLimitMultiplier: 1.1 # limit = request × 1.1

Or you can update only requests and leave limits unchanged:

    updateStrategy:
      updateRequestsOnly: true

Vertical Pod Autoscaler (VPA)
An add-on from the Kubernetes autoscaler project that adjusts resource requests/limits automatically; it is installed separately rather than shipped with the cluster. OptiPod is an alternative designed to be more GitOps-friendly and explainable.
Key differences:
- VPA uses webhooks that can conflict with GitOps
- OptiPod offers webhook strategy that’s GitOps-safe
- OptiPod provides explainable recommendations with percentile-based calculations
Safety
Memory Safety
OptiPod includes built-in protections to prevent dangerous memory reductions:
Default behavior:
- Memory increases are always allowed
- Memory decreases are blocked if they could cause OOM kills
- Checks current memory usage before reducing
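The default behavior can be pictured as a simple gate on each proposed change. The sketch below is an assumption about the shape of such a check and not OptiPod's actual logic: the function name and the rule that a decrease is blocked when the new value would not exceed current usage are hypothetical. The override flag mirrors the `allowUnsafeMemoryDecrease` setting described in this section.

```python
MIB = 1024 * 1024


def is_memory_change_allowed(
    current_request: int,
    recommended: int,
    current_usage: int,
    allow_unsafe_decrease: bool = False,
) -> bool:
    """Decide whether a memory recommendation (in bytes) may be applied."""
    if recommended >= current_request:
        return True  # increases (and no-op changes) are always allowed
    if allow_unsafe_decrease:
        return True  # explicit override: skip the usage check
    # Assumed rule: a decrease is safe only if it stays above current usage.
    return recommended > current_usage


# Container requests 512Mi and currently uses 300Mi.
print(is_memory_change_allowed(512 * MIB, 640 * MIB, 300 * MIB))  # increase
print(is_memory_change_allowed(512 * MIB, 384 * MIB, 300 * MIB))  # decrease above usage
print(is_memory_change_allowed(512 * MIB, 256 * MIB, 300 * MIB))  # decrease below usage
```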
Override (use with caution):

    updateStrategy:
      allowUnsafeMemoryDecrease: true

Gradual Memory Decrease (Not Yet Implemented)
⚠️ Status: This feature is not yet implemented. The configuration is accepted but has no effect.
For safer memory optimization, OptiPod plans to support applying large decreases incrementally:

    updateStrategy:
      gradualDecreaseConfig:
        enabled: true
        memoryDecreasePercentage: 10 # Max 10% per reconciliation
        minimumDecreaseThreshold: 100Mi # Apply gradually if decrease > 100Mi
        maximumTotalDecrease: 70 # Never decrease more than 70% total

Current behavior: Memory decreases are applied immediately in full, subject to safety checks.
Resource Bounds
Min/max constraints that limit recommendations:

    resourceBounds:
      cpu:
        min: 10m # Never recommend less than 10m
        max: 4000m # Never recommend more than 4000m
      memory:
        min: 64Mi # Never recommend less than 64Mi
        max: 8Gi # Never recommend more than 8Gi

Policy Weight
When multiple policies match the same workload, the policy with the highest weight takes precedence:

    spec:
      weight: 200 # Higher weight = higher priority (default: 100)

Next Steps
Now that you understand the terminology, learn about: