Key Terminology

Understanding these key terms will help you work effectively with OptiPod.

Core Concepts

Operator

A Kubernetes operator is a software extension that uses custom resources to manage applications and their components. OptiPod is an operator that manages resource optimization for your workloads.

Key characteristics:

Runs as a Deployment in your cluster
Watches for OptimizationPolicy resources
Analyzes workload metrics
Generates and applies recommendations

OptimizationPolicy

A custom resource, defined by a Custom Resource Definition (CRD), that tells OptiPod how to optimize workloads. Each policy specifies:

Target workloads: Which resources to optimize (by namespace, labels, workload types)
Operational mode: Recommend, Auto, or Disabled
Metrics configuration: Provider, rolling window, percentile, safety factor
Resource bounds: Min/max limits for CPU and memory
Update strategy: How to apply recommendations (SSA or webhook)

Example:

apiVersion: optipod.optipod.io/v1alpha1
kind: OptimizationPolicy
metadata:
  name: production-policy
spec:
  mode: Recommend
  selector:
    workloadSelector:
      matchLabels:
        env: production
    workloadTypes:
      include: [Deployment, StatefulSet]
  metricsConfig:
    provider: prometheus
    rollingWindow: 7d
  resourceBounds:
    cpu:
      min: 10m
      max: 2000m
    memory:
      min: 64Mi
      max: 4Gi
  updateStrategy:
    strategy: webhook

Workload

A Kubernetes resource that runs containers. OptiPod supports:

Deployments: Stateless applications
StatefulSets: Stateful applications
DaemonSets: Node-level services

Note: Jobs and CronJobs are not currently supported.

Recommendation

A suggested change to a workload’s resource requests and limits, based on actual usage patterns. Recommendations are calculated using:

Percentile-based analysis: P50, P90, or P99 of historical usage
Safety factor: Multiplier applied to percentile (default 1.2)
Resource bounds: Min/max constraints from policy
Weight: Decides which policy's settings are used when multiple policies match the same workload

Recommendations are stored as per-resource annotations on the workload:

optipod.io/recommendation.<container-name>.cpu-request
optipod.io/recommendation.<container-name>.memory-request
optipod.io/recommendation.<container-name>.cpu-limit
optipod.io/recommendation.<container-name>.memory-limit
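
As a rough illustration of how these keys can be consumed, the sketch below groups them by container. It relies only on the key layout shown above; the function name `parse_recommendations` and the sample values (`300m`, `256Mi`) are made up for the example and are not taken from OptiPod.

```python
PREFIX = "optipod.io/recommendation."

# Suffixes taken from the four annotation keys listed above.
FIELDS = ("cpu-request", "memory-request", "cpu-limit", "memory-limit")


def parse_recommendations(annotations: dict[str, str]) -> dict[str, dict[str, str]]:
    """Group recommendation annotations by container name (hypothetical helper)."""
    result: dict[str, dict[str, str]] = {}
    for key, value in annotations.items():
        if not key.startswith(PREFIX):
            continue  # unrelated annotation
        # The field is everything after the last dot; the rest is the container name.
        container, sep, field = key[len(PREFIX):].rpartition(".")
        if not sep or not container or field not in FIELDS:
            continue  # not one of the four documented keys
        result.setdefault(container, {})[field] = value
    return result


# Sample annotations; the values are invented for illustration.
example = {
    "optipod.io/recommendation.app.cpu-request": "300m",
    "optipod.io/recommendation.app.memory-request": "256Mi",
    "kubectl.kubernetes.io/restartedAt": "2024-01-01T00:00:00Z",
}
print(parse_recommendations(example))
```

Annotations that do not match the documented layout are skipped rather than raising an error, which keeps the helper safe to run against workloads that carry other annotations.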

GitOps

A deployment methodology where Git is the single source of truth for infrastructure and application configuration. Changes are made through Git commits and automatically applied to the cluster.

OptiPod’s GitOps compatibility:

With Webhook Strategy:

Recommend mode stores suggestions as annotations only
Workload specs in Git remain unchanged
You review and commit changes to Git if desired
Webhook applies recommendations at pod creation time
No conflicts with ArgoCD/Flux

With SSA Strategy:

OptiPod directly updates resource fields in workload specs
May require configuring GitOps tools to ignore resource fields
For example, use Argo CD’s ignoreDifferences setting or its Server-Side Apply sync option

Server-Side Apply (SSA)

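A Kubernetes feature that tracks which controller owns each field of a resource, so multiple controllers can manage different fields of the same object and conflicting writes are detected instead of silently overwritten. OptiPod can use SSA to apply recommendations directly to workload specs.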

Benefits:

No webhook required
Direct field updates
Clear field ownership
In-place pod resize support (Kubernetes 1.27+; the feature is alpha in 1.27 and requires the InPlacePodVerticalScaling feature gate)

Metrics

Metrics Provider

The source of resource usage data. OptiPod supports:

Prometheus: Full-featured metrics with historical data (recommended)
metrics-server: Basic CPU/memory metrics with sampling
Custom providers: Integrate your own metrics source (future)

Rolling Window

The time period OptiPod analyzes to generate recommendations. Longer windows provide more stable recommendations but take longer to react to changes.

Typical values:

7 days: Balanced (default)
14 days: More stable, slower to adapt
24 hours: Faster adaptation, less stable

Configuration:

metricsConfig:
  rollingWindow: 7d  # or 24h, 14d, 30d
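
To make the stability trade-off concrete, here is a small sketch that converts window strings into sample counts. It assumes only the `h` and `d` suffixes seen in the examples and a hypothetical 60-second scrape interval; the helper names are illustrative, and OptiPod's own parsing may accept other formats.

```python
from datetime import timedelta

# Only the two suffixes that appear in the examples above are handled here.
_UNIT_KWARGS = {"h": "hours", "d": "days"}


def parse_window(text: str) -> timedelta:
    """Convert a window string such as '7d' or '24h' into a timedelta."""
    number, unit = text[:-1], text[-1:]
    if unit not in _UNIT_KWARGS or not number.isdigit():
        raise ValueError(f"unsupported rolling window: {text!r}")
    return timedelta(**{_UNIT_KWARGS[unit]: int(number)})


def samples_in_window(window: str, scrape_interval_seconds: int) -> int:
    """Number of samples a full window holds at a fixed scrape interval."""
    return int(parse_window(window).total_seconds()) // scrape_interval_seconds


# Assumed scrape interval of 60 seconds (not an OptiPod default).
for window in ("24h", "7d", "14d"):
    print(window, samples_in_window(window, scrape_interval_seconds=60))
```

Under that assumption a 24h window holds 1,440 samples and a 7d window 10,080, so a single unusual hour weighs seven times more in the shorter window. That is why short windows adapt faster and are less stable.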

Percentile

The statistical measure used to determine resource recommendations. OptiPod supports P50, P90, and P99.

Example:

P90 CPU = 250m means CPU usage is at or below 250m 90% of the time
P99 provides more headroom for spikes
P50 yields smaller, more aggressive recommendations but is riskier, since usage exceeds it about half the time

Configuration:

metricsConfig:
  percentile: P90  # or P50, P99
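
The following sketch shows one common way to compute such percentiles, the nearest-rank method. The source does not say which method OptiPod uses, so treat this as an illustration of the concept; the sample values are invented millicore readings.

```python
def percentile(samples: list[int], p: int) -> int:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range 1..100")
    ordered = sorted(samples)
    # Ceiling division in integer arithmetic gives the 1-based rank.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Ten invented CPU usage samples, in millicores.
usage = [100, 120, 130, 150, 180, 200, 210, 220, 250, 400]

for p in (50, 90, 99):
    print(f"P{p} = {percentile(usage, p)}m")
```

With this data P50 is 180m, P90 is 250m, and P99 is 400m. The single 400m spike only affects P99, which is the extra headroom the higher percentile buys.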

Safety Factor

A multiplier applied to the selected percentile to add headroom. Default is 1.2 (20% above percentile).

Example:

P90 = 250m, safety factor = 1.2
Recommendation = 250m × 1.2 = 300m

Configuration:

metricsConfig:
  safetyFactor: 1.2  # 1.0 = no headroom, 1.5 = 50% headroom

Resource Management

Resource Request

The amount of CPU/memory Kubernetes guarantees for a container. Used for scheduling decisions.

Impact:

Too low: Pods may be packed onto nodes that end up overcommitted, leading to CPU contention or out-of-memory (OOM) kills
Too high: Wastes cluster capacity and increases costs

Resource Limit

The maximum amount of CPU/memory a container can use.

Impact:

CPU: Container is throttled if exceeded
Memory: Container is OOM-killed if exceeded

Limit Configuration

OptiPod can calculate limits based on requests using multipliers:

updateStrategy:
  limitConfig:
    cpuLimitMultiplier: 1.0      # limit = request × 1.0
    memoryLimitMultiplier: 1.1   # limit = request × 1.1

Or you can update only requests and leave limits unchanged:

updateStrategy:
  updateRequestsOnly: true
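
A minimal sketch of the two options above, using plain integers (millicores or MiB). The function name `plan_limit` and the rounding to the nearest whole unit are assumptions made for the example; the source does not specify how OptiPod rounds.

```python
def plan_limit(
    new_request: int,
    current_limit: int,
    multiplier: float,
    update_requests_only: bool = False,
) -> int:
    """Return the limit to set alongside a new request (hypothetical helper)."""
    if update_requests_only:
        # Mirrors updateRequestsOnly: true, where the existing limit is left alone.
        return current_limit
    # limit = request x multiplier, rounded to the nearest whole unit (assumption).
    return round(new_request * multiplier)


print(plan_limit(300, current_limit=500, multiplier=1.0))   # CPU, millicores
print(plan_limit(256, current_limit=512, multiplier=1.1))   # memory, MiB
print(plan_limit(256, current_limit=512, multiplier=1.1, update_requests_only=True))
```

With the multipliers from the example configuration, a 300m CPU request gets a 300m limit and a 256Mi memory request gets a limit of about 282Mi.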

Vertical Pod Autoscaler (VPA)

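A Kubernetes autoscaling component that adjusts resource requests/limits. It is maintained in the Kubernetes autoscaler project and installed separately rather than shipped with core Kubernetes. OptiPod is an alternative designed to be more GitOps-friendly and explainable.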

Key differences:

VPA uses webhooks that can conflict with GitOps
OptiPod offers a webhook strategy that’s GitOps-safe
OptiPod provides explainable recommendations with percentile-based calculations

Safety

Memory Safety

OptiPod includes built-in protections to prevent dangerous memory reductions:

Default behavior:

Memory increases are always allowed
Memory decreases are blocked if they could cause OOM kills
Checks current memory usage before reducing

Override (use with caution):

updateStrategy:
  allowUnsafeMemoryDecrease: true
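
The default behavior and the override can be pictured with the sketch below. How OptiPod actually decides that a decrease "could cause OOM kills" is not stated here, so the rule used (block any decrease to a value at or below current usage) is an assumption, as are the function and parameter names. Values are in MiB.

```python
def memory_change_allowed(
    current_mib: int,
    recommended_mib: int,
    current_usage_mib: int,
    allow_unsafe_decrease: bool = False,
) -> bool:
    """Decide whether a memory change may be applied (illustrative rule only)."""
    if recommended_mib >= current_mib:
        return True  # increases (and no-ops) are always allowed
    if allow_unsafe_decrease:
        return True  # mirrors allowUnsafeMemoryDecrease: true
    # Assumed safety rule: never drop to or below what the container uses right now.
    return recommended_mib > current_usage_mib


print(memory_change_allowed(512, 768, current_usage_mib=400))  # increase
print(memory_change_allowed(512, 450, current_usage_mib=400))  # safe decrease
print(memory_change_allowed(512, 300, current_usage_mib=400))  # blocked decrease
```

The third call is rejected because the recommended 300Mi is below the 400Mi the container currently uses; passing `allow_unsafe_decrease=True` would let it through.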

Gradual Memory Decrease (Not Yet Implemented)

⚠️ Status: This feature is not yet implemented. The configuration is accepted but has no effect.

For safer memory optimization, OptiPod plans to support applying large decreases incrementally:

updateStrategy:
  gradualDecreaseConfig:
    enabled: true
    memoryDecreasePercentage: 10      # Max 10% per reconciliation
    minimumDecreaseThreshold: 100Mi   # Apply gradually if decrease > 100Mi
    maximumTotalDecrease: 70          # Never decrease more than 70% total

Current behavior: Memory decreases are applied immediately in full, subject to safety checks.

Resource Bounds

Min/max constraints that limit recommendations:

resourceBounds:
  cpu:
    min: 10m      # Never recommend less than 10m
    max: 4000m    # Never recommend more than 4000m
  memory:
    min: 64Mi     # Never recommend less than 64Mi
    max: 8Gi      # Never recommend more than 8Gi
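
Putting the pieces together, the sketch below applies a safety factor to a percentile and then clamps the result to the bounds. Applying the bounds after the safety factor is an assumption about ordering, and `clamp` and `recommend` are illustrative names. Values are integers in millicores.

```python
def clamp(value: int, lower: int, upper: int) -> int:
    """Restrict value to the inclusive range [lower, upper]."""
    if lower > upper:
        raise ValueError("min bound must not exceed max bound")
    return max(lower, min(value, upper))


def recommend(percentile_value: int, safety_factor: float, lower: int, upper: int) -> int:
    """Percentile x safety factor, then bounded (assumed order of operations)."""
    return clamp(round(percentile_value * safety_factor), lower, upper)


# CPU bounds from the example above: min 10m, max 4000m.
print(recommend(250, 1.2, lower=10, upper=4000))    # within bounds
print(recommend(5, 1.2, lower=10, upper=4000))      # raised to the minimum
print(recommend(3800, 1.2, lower=10, upper=4000))   # capped at the maximum
```

The first call returns 300, matching the safety factor example; the other two return 10 and 4000 because the raw values (6 and 4560) fall outside the bounds.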

Policy Weight

When multiple policies match the same workload, the policy with the highest weight takes precedence:

spec:
  weight: 200  # Higher weight = higher priority (default: 100)
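
Selection by weight can be sketched as follows. The `Policy` class and `select_policy` function are hypothetical stand-ins, and the source does not say how ties between equal weights are resolved; this sketch simply keeps the first policy encountered.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Policy:
    """Minimal stand-in for an OptimizationPolicy (illustrative only)."""
    name: str
    weight: int = 100  # default weight stated above


def select_policy(matching: Sequence[Policy]) -> Optional[Policy]:
    """Return the matching policy with the highest weight, or None if none match."""
    if not matching:
        return None
    # max() keeps the first item on ties; real tie-breaking is unspecified.
    return max(matching, key=lambda policy: policy.weight)


candidates = [Policy("default-policy"), Policy("production-policy", weight=200)]
winner = select_policy(candidates)
print(winner.name if winner else "no policy matches")
```

Here `production-policy` wins because its weight of 200 exceeds the default of 100.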

Next Steps

Now that you understand the terminology, learn about: