
Multi-tenant Setup

OptiPod supports multi-tenant Kubernetes environments where different teams or applications share the same cluster. This guide shows you how to configure OptiPod for multi-tenancy.

In the namespace-per-tenant model, each tenant gets its own namespace(s) with isolated policies.

Benefits:

  • Clear separation of concerns
  • Team autonomy
  • Independent policy configuration
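
The dedicated namespaces themselves can be created up front. A minimal sketch (the `tenant: "true"` label is an assumption here, chosen so namespaces can later be discovered with a label selector, as the chargeback script does):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
  labels:
    tenant: "true"
---
apiVersion: v1
kind: Namespace
metadata:
  name: team-b
  labels:
    tenant: "true"
```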

Setup:

# Team A policy
apiVersion: optipod.optipod.io/v1alpha1
kind: OptimizationPolicy
metadata:
  name: team-a-policy
  namespace: team-a
spec:
  mode: Auto
  selector:
    workloadTypes:
      include: [Deployment]
    workloadSelector:
      matchLabels:
        optipod.io/enabled: "true"
  metricsConfig:
    provider: prometheus
  resourceBounds:
    cpu:
      min: 10m
      max: 4000m
    memory:
      min: 64Mi
      max: 8Gi
  updateStrategy:
    strategy: webhook
---
# Team B policy
apiVersion: optipod.optipod.io/v1alpha1
kind: OptimizationPolicy
metadata:
  name: team-b-policy
  namespace: team-b
spec:
  mode: Recommend
  selector:
    workloadTypes:
      include: [Deployment, StatefulSet]
    workloadSelector:
      matchLabels:
        optipod.io/enabled: "true"
  metricsConfig:
    provider: prometheus
  resourceBounds:
    cpu:
      min: 10m
      max: 4000m
    memory:
      min: 64Mi
      max: 8Gi
  updateStrategy:
    strategy: webhook

Alternatively, multiple tenants can share namespaces and rely on labels for isolation.

Benefits:

  • Flexible resource sharing
  • Fine-grained control
  • Works with existing namespace structure

Setup:

apiVersion: optipod.optipod.io/v1alpha1
kind: OptimizationPolicy
metadata:
  name: tenant-policy
  namespace: shared
spec:
  mode: Auto
  selector:
    workloadSelector:
      matchLabels:
        tenant: team-a
        optipod.io/enabled: "true"
  metricsConfig:
    provider: prometheus
  resourceBounds:
    cpu:
      min: 10m
      max: 4000m
    memory:
      min: 64Mi
      max: 8Gi
  updateStrategy:
    strategy: webhook
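
For the label-based policy above to select anything, workloads must carry both labels. An illustrative Deployment fragment (the workload name is hypothetical):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: team-a-api   # hypothetical workload name
  namespace: shared
  labels:
    tenant: team-a
    optipod.io/enabled: "true"
```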

In the centralized model, a single operations team manages all policies.

# Central ops team creates policies for all tenants
apiVersion: optipod.optipod.io/v1alpha1
kind: OptimizationPolicy
metadata:
  name: standard-policy
  namespace: tenant-1
  labels:
    managed-by: ops-team
spec:
  mode: Auto
  selector:
    workloadSelector:
      matchLabels:
        optipod.io/enabled: "true"
  metricsConfig:
    provider: prometheus
    rollingWindow: 7d
    percentile: P90
    safetyFactor: 1.2
  resourceBounds:
    cpu:
      min: 10m
      max: 4000m
    memory:
      min: 64Mi
      max: 8Gi
  updateStrategy:
    strategy: ssa
    useServerSideApply: true

RBAC:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: optipod-admin
rules:
  - apiGroups: ["optipod.optipod.io"]
    resources: ["optimizationpolicies"]
    verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ops-team-optipod-admin
subjects:
  - kind: Group
    name: ops-team
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: optipod-admin
  apiGroup: rbac.authorization.k8s.io

In the self-service model, each tenant manages its own policies.

# Tenant-specific RBAC
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: optipod-manager
  namespace: team-a
rules:
  - apiGroups: ["optipod.optipod.io"]
    resources: ["optimizationpolicies"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-optipod-manager
  namespace: team-a
subjects:
  - kind: Group
    name: team-a
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: optipod-manager
  apiGroup: rbac.authorization.k8s.io

A hybrid model combines central defaults with tenant overrides: when multiple policies match the same workload, the policy with the higher weight takes precedence.

# Central default policy
apiVersion: optipod.optipod.io/v1alpha1
kind: OptimizationPolicy
metadata:
  name: default-policy
  namespace: team-a
  labels:
    policy-type: default
spec:
  mode: Recommend
  weight: 100 # Lower priority
  selector:
    workloadSelector:
      matchLabels:
        optipod.io/enabled: "true"
  metricsConfig:
    provider: prometheus
    rollingWindow: 7d
  resourceBounds:
    cpu:
      min: 10m
      max: 2000m
    memory:
      min: 64Mi
      max: 4Gi
  updateStrategy:
    strategy: webhook
---
# Tenant override policy
apiVersion: optipod.optipod.io/v1alpha1
kind: OptimizationPolicy
metadata:
  name: team-a-custom
  namespace: team-a
  labels:
    policy-type: custom
spec:
  mode: Auto
  weight: 200 # Higher priority - overrides default
  selector:
    workloadSelector:
      matchLabels:
        team: team-a
        tier: non-critical
  metricsConfig:
    provider: prometheus
    rollingWindow: 14d
  resourceBounds:
    cpu:
      min: 10m
      max: 4000m
    memory:
      min: 128Mi
      max: 8Gi
  updateStrategy:
    strategy: webhook

Ensure OptiPod respects namespace quotas:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "100"
    requests.memory: 200Gi
    limits.cpu: "200"
    limits.memory: 400Gi

Configure policy bounds within quota:

spec:
  resourceBounds:
    cpu:
      min: 10m
      max: 2000m # Per-pod max, well within quota
    memory:
      min: 64Mi
      max: 4Gi # Per-pod max, well within quota
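
A quick sanity check on these numbers: even in the worst case where every pod sits at the per-pod maximum, the quota still admits a comfortable number of pods.

```shell
# Pods that fit the team-a quota if every pod requests the per-pod max:
echo $(( 100000 / 2000 ))   # 100 CPU cores quota (in millicores) / 2000m per-pod max → 50
echo $(( 200 / 4 ))         # 200Gi memory quota / 4Gi per-pod max → 50
```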

Set default limits for tenant namespaces:

apiVersion: v1
kind: LimitRange
metadata:
  name: team-a-limits
  namespace: team-a
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 512Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi
      max:
        cpu: 4000m
        memory: 8Gi
      min:
        cpu: 10m
        memory: 64Mi

OptiPod queries Prometheus with namespace filtering automatically. Prometheus configuration is external (via Helm values or environment variables), not in the CRD.

# Helm values for OptiPod
prometheus:
  address: http://prometheus-server.monitoring.svc:9090

Policy configuration:

spec:
  metricsConfig:
    provider: prometheus
    rollingWindow: 7d
    # Queries are automatically filtered by namespace

Each tenant can use their own Prometheus by configuring different addresses via Helm:

# Team A Helm values
prometheus:
  address: http://prometheus-team-a.team-a.svc:9090

# Team B Helm values
prometheus:
  address: http://prometheus-team-b.team-b.svc:9090

Create reusable policy templates for tenants:

# Template ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: optipod-policy-template
  namespace: optipod-system
data:
  standard-policy.yaml: |
    apiVersion: optipod.optipod.io/v1alpha1
    kind: OptimizationPolicy
    metadata:
      name: standard-policy
      namespace: ${NAMESPACE}
    spec:
      mode: Recommend
      selector:
        workloadTypes:
          include: [Deployment, StatefulSet]
        workloadSelector:
          matchLabels:
            optipod.io/enabled: "true"
      metricsConfig:
        provider: prometheus
        rollingWindow: 7d
      resourceBounds:
        cpu:
          min: 10m
          max: 4000m
        memory:
          min: 64Mi
          max: 8Gi
      updateStrategy:
        strategy: webhook

Apply template for new tenant:

#!/bin/bash
set -euo pipefail
# envsubst reads ${NAMESPACE} from the environment, so export it
export NAMESPACE="$1"
kubectl create namespace "$NAMESPACE"
envsubst < policy-template.yaml | kubectl apply -f -

Track OptiPod activity per tenant using Prometheus metrics:

# Recommendations generated per namespace
sum by (namespace) (optipod_recommendations_total)
# Policies per namespace
count by (namespace) (optipod_policy_info)
# Workloads managed per namespace
sum by (namespace) (optipod_workloads_managed_total)

Note: Actual metric names depend on OptiPod’s controller-runtime metrics. Check available metrics:

kubectl port-forward -n optipod-system svc/optipod-controller-metrics 8080:8080
curl localhost:8080/metrics | grep optipod

Tenant Dashboards

Create Grafana dashboards per tenant:

apiVersion: v1
kind: ConfigMap
metadata:
  name: team-a-dashboard
  namespace: team-a
data:
  dashboard.json: |
    {
      "dashboard": {
        "title": "Team A - OptiPod",
        "panels": [
          {
            "title": "Recommendations",
            "targets": [{
              "expr": "optipod_recommendations_total{namespace='team-a'}"
            }]
          }
        ]
      }
    }

Estimate per-tenant cost savings with PromQL:

# CPU cost savings per tenant (assuming $0.05 per core-hour, 30-day month)
sum by (namespace) (
  optipod_cpu_savings_cores * 0.05 * 24 * 30
)

# Memory cost savings per tenant (assuming $0.01 per GB-hour, 30-day month)
sum by (namespace) (
  optipod_memory_savings_bytes / 1024 / 1024 / 1024 * 0.01 * 24 * 30
)

Generate monthly reports:

generate-chargeback-report.sh
#!/bin/bash
set -euo pipefail
for namespace in $(kubectl get namespaces -l tenant=true -o jsonpath='{.items[*].metadata.name}'); do
  echo "Tenant: $namespace"
  # Read savings from each policy's status
  cpu_savings=$(kubectl get optimizationpolicy -n "$namespace" -o json | \
    jq -r '.items[].status.totalCPUSavings')
  memory_savings=$(kubectl get optimizationpolicy -n "$namespace" -o json | \
    jq -r '.items[].status.totalMemorySavings')
  echo "  CPU Savings: $cpu_savings cores"
  echo "  Memory Savings: $memory_savings GB"
  echo "  Estimated Cost Savings: \$$(calculate_cost "$cpu_savings" "$memory_savings")"
  echo
done
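
The `calculate_cost` helper is not defined in the script; a minimal sketch using the same per-hour rates assumed in the PromQL examples ($0.05 per core-hour, $0.01 per GB-hour, 30-day month):

```shell
# Estimate monthly cost savings from CPU cores and memory GB saved.
calculate_cost() {
  local cpu_cores=$1 mem_gb=$2
  # (cores * $/core-hour + GB * $/GB-hour) * 24 hours * 30 days
  awk -v c="$cpu_cores" -v m="$mem_gb" \
    'BEGIN { printf "%.2f", (c * 0.05 + m * 0.01) * 24 * 30 }'
}

calculate_cost 2 8   # → 129.60
```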

Isolate tenant traffic:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: tenant-isolation
  namespace: team-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: team-a
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              name: team-a
    - to: # Allow OptiPod operator
        - namespaceSelector:
            matchLabels:
              name: optipod-system
Enforce security standards per tenant:

apiVersion: v1
kind: Namespace
metadata:
  name: team-a
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted

Best practices:

  1. Namespace per tenant: Use dedicated namespaces for clear isolation
  2. RBAC boundaries: Restrict policy management to appropriate teams
  3. Resource quotas: Prevent resource exhaustion
  4. Separate metrics: Use tenant-specific Prometheus when possible
  5. Policy templates: Standardize configurations across tenants
  6. Monitor per tenant: Track savings and issues by tenant
  7. Document policies: Clear documentation of who manages what
  8. Regular reviews: Audit tenant configurations quarterly

Problem: Multiple policies targeting same workload

Solution:

# Find conflicting policies
kubectl get optimizationpolicy -A -o json | \
  jq -r '.items[] | select(.spec.selector.workloadSelector.matchLabels."optipod.io/enabled" == "true") |
    "\(.metadata.namespace)/\(.metadata.name)"'

Use more specific selectors:

selector:
  workloadSelector:
    matchLabels:
      optipod.io/enabled: "true"
      tenant: team-a # Add tenant label

Problem: Tenant trying to access other tenant’s policies

Solution: Verify RBAC:

kubectl auth can-i get optimizationpolicy \
  --namespace team-b \
  --as system:serviceaccount:team-a:default

With correctly scoped RBAC, this prints "no".