9.13 Kubernetes Operators

Automating Complex Applications with Operators

Operators extend Kubernetes to manage stateful applications like databases and monitoring systems. They encode operational knowledge into software that can provision, scale, back up, and maintain complex applications automatically.

What are Operators?

Operator Pattern

Operators combine:

  • Custom Resource Definitions (CRDs) - New Kubernetes API types

  • Controllers - Logic to manage those resources

  • Operational Knowledge - Best practices encoded in software
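To make the first ingredient concrete, here is a minimal sketch of a CRD. The `example.com` group and the `Backup` kind are hypothetical and not tied to any operator discussed in this section; a real operator ships its own, far more detailed schema.

```yaml
# Hypothetical CRD: registers a new "Backup" API type under example.com
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # The name must be <plural>.<group>
  name: backups.example.com
spec:
  group: example.com
  scope: Namespaced
  names:
    kind: Backup
    plural: backups
    singular: backup
  versions:
  - name: v1alpha1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            required: ["schedule"]
            properties:
              schedule:
                type: string    # cron expression a controller would act on
              retentionDays:
                type: integer
                minimum: 1
```

Once such a CRD is applied, `kubectl get backups` works like any built-in resource, but nothing happens to those objects until a controller watches them and acts on their spec.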

Benefits:

  • Automated deployment and scaling

  • Self-healing applications

  • Backup and disaster recovery

  • Rolling updates and maintenance

PostgreSQL Operator

Zalando PostgreSQL Operator

The PostgreSQL Operator manages PostgreSQL clusters with high availability, backups, and monitoring.

Installation

# Install PostgreSQL Operator
kubectl apply -k github.com/zalando/postgres-operator/manifests

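# Verify installation (the stock manifests deploy into the current namespace,
# so search all namespaces by the operator's label)
kubectl get pods --all-namespaces -l name=postgres-operator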

PostgreSQL Cluster

# High-availability PostgreSQL cluster
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: webapp-postgres
  namespace: production
spec:
  teamId: "webapp"  # the cluster name must start with "<teamId>-"
  volume:
    size: 100Gi
    storageClass: "gp3"
  numberOfInstances: 3
  users:
    webapp_user: []
    readonly_user: []
  databases:
    webapp_db: webapp_user
  postgresql:
    version: "15"
    parameters:
      max_connections: "200"
      shared_buffers: "256MB"
      effective_cache_size: "1GB"
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: 2000m
      memory: 4Gi
  patroni:
    initdb:
      encoding: "UTF8"
      locale: "en_US.UTF-8"
    pg_hba:
      - hostssl all all 0.0.0.0/0 md5
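  # Nightly logical backup (pg_dump) at 02:00. The storage target and the
  # retention period (for example 30 days) are typically set in the operator
  # configuration or on the backup bucket, not in this manifest.
  enableLogicalBackup: true
  logicalBackupSchedule: "0 2 * * *"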

Using the PostgreSQL Cluster

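# Application connecting to PostgreSQL
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  namespace: production  # same namespace as the cluster and its credentials secret
spec:
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
      - name: app
        image: webapp:latest
        env:
        # Declare the password first: $(VAR) only expands variables defined earlier in the list
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              # Default secret name: {username}.{cluster}.credentials.postgresql.acid.zalan.do,
              # with underscores in the username replaced by dashes
              name: webapp-user.webapp-postgres.credentials.postgresql.acid.zalan.do
              key: password
        - name: DATABASE_URL
          value: "postgresql://webapp_user:$(POSTGRES_PASSWORD)@webapp-postgres:5432/webapp_db"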

PostgreSQL Operations

# List PostgreSQL clusters
kubectl get postgresql -n production

# Check cluster status
kubectl describe postgresql webapp-postgres -n production

# Get connection details (values are base64-encoded)
kubectl get secret webapp-user.webapp-postgres.credentials.postgresql.acid.zalan.do -n production -o yaml

# Scale cluster
kubectl patch postgresql webapp-postgres -n production --type=merge -p '{"spec":{"numberOfInstances":5}}'

Prometheus Operator

Kube-Prometheus Stack

The Prometheus Operator provides Kubernetes-native deployment and management of Prometheus and related monitoring components.

Installation

# Add Helm repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Install kube-prometheus-stack
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace \
  --set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.storageClassName=gp3 \
  --set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage=50Gi

Prometheus Instance

# Custom Prometheus configuration
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: webapp-prometheus
  namespace: monitoring
spec:
  serviceAccountName: prometheus-webapp-prometheus
  serviceMonitorSelector:
    matchLabels:
      team: webapp
  ruleSelector:
    matchLabels:
      prometheus: webapp
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: gp3
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi
  retention: 30d
  replicas: 2
  resources:
    requests:
      memory: 2Gi
      cpu: 1000m
    limits:
      memory: 4Gi
      cpu: 2000m
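The `serviceAccountName` referenced above has to exist, and it needs read access to the targets Prometheus discovers. A minimal sketch, modeled on the RBAC commonly used with the Prometheus Operator; the `prometheus-webapp` role name is an assumption, and the rules may need narrowing for your cluster.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-webapp-prometheus
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-webapp    # assumed name
rules:
# Read-only access for service discovery and scraping
- apiGroups: [""]
  resources: ["nodes", "nodes/metrics", "services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["discovery.k8s.io"]
  resources: ["endpointslices"]
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-webapp    # assumed name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-webapp
subjects:
- kind: ServiceAccount
  name: prometheus-webapp-prometheus
  namespace: monitoring
```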

ServiceMonitor for Application

# Monitor application metrics
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: webapp-metrics
  namespace: monitoring
  labels:
    team: webapp
spec:
  selector:
    matchLabels:
      app: webapp
  endpoints:
  - port: metrics
    interval: 30s
    path: /metrics
  namespaceSelector:
    matchNames:
    - production
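A ServiceMonitor selects Services, not Pods, and its `port: metrics` refers to a named Service port. The following sketch shows a Service that this ServiceMonitor would match; the port numbers are assumptions about where the application listens.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: webapp
  namespace: production      # matched by namespaceSelector.matchNames
  labels:
    app: webapp              # matched by the ServiceMonitor's selector
spec:
  selector:
    app: webapp
  ports:
  - name: http
    port: 80
    targetPort: 8080         # assumed application port
  - name: metrics            # must equal endpoints[].port in the ServiceMonitor
    port: 9102               # assumed metrics port
    targetPort: 9102
```

Unless `jobLabel` is set on the ServiceMonitor, the `job` label of the scraped series defaults to the Service name, so a Service named `webapp` yields `job="webapp"`.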

PrometheusRule for Alerts

# Application alerts
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: webapp-alerts
  namespace: monitoring
  labels:
    prometheus: webapp
spec:
  groups:
  - name: webapp.rules
    rules:
    - alert: WebappDown
      expr: up{job="webapp"} == 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "Webapp is down"
        description: "Webapp has been down for more than 5 minutes"

    - alert: HighErrorRate
      expr: rate(http_requests_total{job="webapp",status=~"5.."}[5m]) > 0.1
      for: 2m
      labels:
        severity: warning
      annotations:
        summary: "High error rate detected"
        description: "Error rate is {{ $value }} per second"

Cert-Manager Operator

TLS Certificate Automation

# Install cert-manager
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set installCRDs=true

ClusterIssuer

# Let's Encrypt issuer
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          class: nginx
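Let's Encrypt rate-limits its production endpoint, so it can help to validate the setup against the staging endpoint first. A sketch of such an issuer follows; the name `letsencrypt-staging` is an assumption, and certificates issued by staging are not trusted by browsers.

```yaml
# Staging issuer for dry runs before switching to letsencrypt-prod
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging    # assumed name
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-staging
    solvers:
    - http01:
        ingress:
          class: nginx
```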

Certificate Request

# Automatic certificate
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: webapp-tls
  namespace: production
spec:
  secretName: webapp-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
  - webapp.example.com
  - api.example.com
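cert-manager stores the issued key pair in the `webapp-tls` Secret, which an Ingress in the same namespace can then reference. A sketch, assuming an NGINX ingress controller and a Service named `webapp` serving HTTP on port 80 (both are assumptions, not part of the setup above):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: webapp
  namespace: production
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - webapp.example.com
    - api.example.com
    secretName: webapp-tls     # written by the Certificate resource
  rules:
  - host: webapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: webapp       # assumed Service name
            port:
              number: 80
```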

Operator Management

Operator Lifecycle Manager (OLM)

# Install OLM
curl -sL https://github.com/operator-framework/operator-lifecycle-manager/releases/download/v0.24.0/install.sh | bash -s v0.24.0

# List available operators
kubectl get packagemanifests

Installing Operators via OLM

# Operator subscription (OLM also requires an OperatorGroup in this namespace)
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: postgresql-operator
  namespace: postgres-operator
spec:
  channel: stable
  name: postgresql-operator
  source: community-operators
  sourceNamespace: olm
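OLM only resolves a Subscription when its namespace also contains an OperatorGroup. A sketch, assuming the operator should watch just its own namespace; the group name is hypothetical, and whether single-namespace installation is supported depends on the operator's declared install modes.

```yaml
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: postgres-operator-group    # hypothetical name
  namespace: postgres-operator
spec:
  targetNamespaces:
  - postgres-operator
```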

Essential Commands

# PostgreSQL Operator
kubectl get postgresql -n production
kubectl describe postgresql webapp-postgres -n production
# Adjust -n to the namespace the operator was installed into
kubectl logs -f deployment/postgres-operator -n postgres-operator

# Prometheus Operator
kubectl get prometheus -n monitoring
kubectl get servicemonitors -n monitoring
kubectl get prometheusrules -n monitoring
kubectl port-forward svc/prometheus-operated 9090:9090 -n monitoring

# Cert-Manager
kubectl get certificates -n production
kubectl get clusterissuers
kubectl describe certificate webapp-tls -n production

# General operator management
kubectl get crds | grep -E "(postgresql|monitoring|cert-manager)"
# Available only when OLM is installed
kubectl get operators

What’s Next?

Operators significantly simplify managing complex applications in Kubernetes. Consider exploring:

  • ArgoCD Operator for GitOps workflows

  • Grafana Operator for dashboard management

  • Elasticsearch Operator for logging infrastructure

  • Redis Operator for caching solutions

Operators encode operational expertise into software, turning routine tasks such as failover, backups, and upgrades into repeatable, automated processes.