11.7 Google Kubernetes Engine (GKE)

Note

Google Kubernetes Engine (GKE) is a managed Kubernetes service that makes it easy to deploy, manage, and scale containerized applications using Google’s infrastructure. GKE provides a production-ready environment for deploying containerized applications with automatic upgrades, auto-repair, and built-in monitoring.

GKE Overview

Why GKE?

  • Fully Managed: Google manages the control plane

  • Auto-Upgrade: Automatic Kubernetes version upgrades

  • Auto-Repair: Automatic node health monitoring and repair

  • Auto-Scaling: Cluster and pod autoscaling

  • Integrated Logging: Cloud Logging and Monitoring integration

  • Security: Built-in security features and compliance

  • Multi-Zone: High availability across zones

  • Workload Identity: Secure access to Google Cloud services

GKE vs Self-Managed Kubernetes:

  Feature                    GKE             Self-Managed
  ------------------------   -------------   ----------------
  Control Plane Management   Fully managed   Manual
  Upgrades                   Automatic       Manual
  Node Repair                Automatic       Manual
  Monitoring                 Built-in        Setup required
  Cost                       Pay for nodes   Pay for all

GKE Modes:

  • Standard Mode: Full control over cluster configuration

  • Autopilot Mode: Google manages the entire cluster infrastructure, including nodes (recommended for most users)

Prerequisites

Enable Required APIs:

# Enable GKE API
gcloud services enable container.googleapis.com

# Enable Artifact Registry (for container images)
gcloud services enable artifactregistry.googleapis.com

# Set default region and zone
gcloud config set compute/region us-central1
gcloud config set compute/zone us-central1-a

Install kubectl:

# Install kubectl with gcloud
gcloud components install kubectl

# Verify installation
kubectl version --client

# Or install standalone (Linux)
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

Creating GKE Clusters

Create Standard Cluster (Quick Start):

# Create a basic cluster
gcloud container clusters create my-cluster \
    --zone us-central1-a \
    --num-nodes 3

# Get cluster credentials
gcloud container clusters get-credentials my-cluster \
    --zone us-central1-a

# Verify connection
kubectl get nodes

Create Autopilot Cluster (Recommended):

# Create Autopilot cluster
gcloud container clusters create-auto my-autopilot-cluster \
    --region us-central1

# Get credentials
gcloud container clusters get-credentials my-autopilot-cluster \
    --region us-central1

# Verify
kubectl get nodes

Create Production-Ready Cluster:

# Create cluster with best practices
gcloud container clusters create production-cluster \
    --zone us-central1-a \
    --num-nodes 3 \
    --machine-type n1-standard-4 \
    --disk-size 50 \
    --disk-type pd-standard \
    --enable-autoscaling \
    --min-nodes 3 \
    --max-nodes 10 \
    --enable-autorepair \
    --enable-autoupgrade \
    --enable-ip-alias \
    --network "default" \
    --subnetwork "default" \
    --logging=SYSTEM,WORKLOAD \
    --monitoring=SYSTEM \
    --addons HorizontalPodAutoscaling,HttpLoadBalancing,GcePersistentDiskCsiDriver \
    --workload-pool=PROJECT_ID.svc.id.goog \
    --enable-shielded-nodes \
    --release-channel regular

Cluster Configuration Options:

  • --machine-type: Node machine type (n1-standard-2, e2-medium, etc.)

  • --num-nodes: Initial number of nodes per zone

  • --enable-autoscaling: Enable cluster autoscaler

  • --min-nodes / --max-nodes: Autoscaling limits

  • --enable-autorepair: Automatic node repair

  • --enable-autoupgrade: Automatic Kubernetes upgrades

  • --release-channel: Update channel (rapid, regular, stable)
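
After creating a cluster, it is worth confirming that these options took effect. A minimal spot-check, assuming the production-cluster example above (field names may differ slightly across gcloud versions):

# Verify release channel, version, and node-pool autoscaling settings
gcloud container clusters describe production-cluster \
    --zone us-central1-a \
    --format="yaml(releaseChannel, currentMasterVersion, nodePools[].autoscaling)"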

Deploying Applications

Deploy Simple Application:

# Create deployment
kubectl create deployment hello-web \
    --image=gcr.io/google-samples/hello-app:1.0

# Verify deployment
kubectl get deployments
kubectl get pods

# Expose deployment as a service
kubectl expose deployment hello-web \
    --type LoadBalancer \
    --port 80 \
    --target-port 8080

# Get external IP (may take a few minutes)
kubectl get service hello-web --watch

# Once EXTERNAL-IP is assigned, test the application
curl http://EXTERNAL_IP

Deploy with YAML Manifest:

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
            cpu: "500m"
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: LoadBalancer
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80

# Apply manifest
kubectl apply -f deployment.yaml

# Check status
kubectl get deployments
kubectl get pods
kubectl get services

# View logs
kubectl logs -l app=nginx

# Describe resources
kubectl describe deployment nginx-deployment
kubectl describe service nginx-service

Working with Custom Images

Build and Deploy Custom Application:

# Create application directory
mkdir my-gke-app
cd my-gke-app

# Create simple Node.js app
cat > server.js << 'EOF'
const express = require('express');
const app = express();
const PORT = process.env.PORT || 8080;

app.get('/', (req, res) => {
  res.json({
    message: 'Hello from GKE!',
    hostname: require('os').hostname(),
    version: process.env.APP_VERSION || '1.0.0'
  });
});

app.get('/health', (req, res) => {
  res.json({ status: 'healthy' });
});

app.listen(PORT, () => {
  console.log(`Server running on port ${PORT}`);
});
EOF

# Create package.json
cat > package.json << 'EOF'
{
  "name": "my-gke-app",
  "version": "1.0.0",
  "main": "server.js",
  "scripts": {
    "start": "node server.js"
  },
  "dependencies": {
    "express": "^4.18.2"
  }
}
EOF

# Create Dockerfile
cat > Dockerfile << 'EOF'
FROM node:18-slim
WORKDIR /app
COPY package*.json ./
RUN npm install --omit=dev
COPY . .
EXPOSE 8080
CMD ["npm", "start"]
EOF

Build and Push to Artifact Registry:

# Create Artifact Registry repository
gcloud artifacts repositories create my-repo \
    --repository-format=docker \
    --location=us-central1 \
    --description="Docker repository"

# Configure Docker authentication
gcloud auth configure-docker us-central1-docker.pkg.dev

# Build and push image (replace PROJECT_ID with your project ID)
IMAGE=us-central1-docker.pkg.dev/PROJECT_ID/my-repo/my-gke-app:v1

docker build -t $IMAGE .
docker push $IMAGE
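
If Docker is not available locally, Cloud Build can build and push the image server-side. A sketch, assuming the Cloud Build API (cloudbuild.googleapis.com) is enabled on the project:

# Build remotely and push to the same Artifact Registry path
gcloud builds submit --tag $IMAGE .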

Deploy Custom Image:

# deployment-custom.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-gke-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-gke-app
  template:
    metadata:
      labels:
        app: my-gke-app
    spec:
      containers:
      - name: my-gke-app
        image: us-central1-docker.pkg.dev/PROJECT_ID/my-repo/my-gke-app:v1
        ports:
        - containerPort: 8080
        env:
        - name: APP_VERSION
          value: "1.0.0"
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "200m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: my-gke-app-service
spec:
  type: LoadBalancer
  selector:
    app: my-gke-app
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080

# Deploy application
kubectl apply -f deployment-custom.yaml

# Watch deployment progress
kubectl rollout status deployment/my-gke-app

# Get service URL
kubectl get service my-gke-app-service
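
Once the LoadBalancer receives an external IP, a quick smoke test covers both endpoints. The jsonpath query assumes a single IP-based ingress, the normal case for a GKE LoadBalancer Service:

# Fetch the external IP and exercise both routes
EXTERNAL_IP=$(kubectl get service my-gke-app-service \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
curl "http://$EXTERNAL_IP/"
curl "http://$EXTERNAL_IP/health"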

ConfigMaps and Secrets

Create ConfigMap:

# Create ConfigMap from literals
kubectl create configmap app-config \
    --from-literal=APP_NAME=MyApp \
    --from-literal=ENVIRONMENT=production

# Create ConfigMap from file
echo "log_level=info" > config.properties
kubectl create configmap app-config-file \
    --from-file=config.properties

# View ConfigMap
kubectl get configmap app-config -o yaml

Use ConfigMap in Deployment:

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  APP_NAME: "MyApp"
  ENVIRONMENT: "production"
  LOG_LEVEL: "info"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-with-config
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: nginx:1.25
        envFrom:
        - configMapRef:
            name: app-config
        # Or mount as volume
        volumeMounts:
        - name: config-volume
          mountPath: /etc/config
      volumes:
      - name: config-volume
        configMap:
          name: app-config
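
With the volume mount above, each ConfigMap key becomes a file under /etc/config. A quick way to verify, assuming the app-with-config Deployment is running:

# Inspect the mounted ConfigMap inside a running pod
kubectl exec deploy/app-with-config -- ls /etc/config
kubectl exec deploy/app-with-config -- cat /etc/config/APP_NAME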

Create and Use Secrets:

# Create secret from literals
kubectl create secret generic db-credentials \
    --from-literal=username=admin \
    --from-literal=password=secretpassword123

# Create secret from file
echo -n 'my-secret-key' > api-key.txt
kubectl create secret generic api-secret \
    --from-file=api-key.txt

# View secret (encoded)
kubectl get secret db-credentials -o yaml
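
Secret values are base64-encoded, not encrypted, in that output. To read a single key back in plain text:

# Decode one key from the secret
kubectl get secret db-credentials \
    -o jsonpath='{.data.password}' | base64 --decode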

Use Secret in Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-with-secrets
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: myapp:1.0
        env:
        - name: DB_USERNAME
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: username
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
        # Or mount as volume
        volumeMounts:
        - name: secret-volume
          mountPath: /etc/secrets
          readOnly: true
      volumes:
      - name: secret-volume
        secret:
          secretName: db-credentials

Scaling Applications

Manual Scaling:

# Scale deployment
kubectl scale deployment nginx-deployment --replicas=5

# Verify scaling
kubectl get deployments
kubectl get pods

Horizontal Pod Autoscaler (HPA):

# Create HPA based on CPU usage
kubectl autoscale deployment nginx-deployment \
    --cpu-percent=50 \
    --min=3 \
    --max=10

# View HPA status
kubectl get hpa
kubectl describe hpa nginx-deployment

HPA with YAML:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
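
To watch the HPA react, generate artificial load against the Service from inside the cluster. A common sketch, assuming the nginx-service defined earlier (the HPA also needs resource requests on the target pods, which the deployment above sets):

# Run a throwaway load-generator pod (removed on exit via --rm)
kubectl run load-generator --rm -it --image=busybox:1.36 --restart=Never -- \
    /bin/sh -c "while true; do wget -q -O- http://nginx-service; done"

# In another terminal, watch the HPA scale up
kubectl get hpa nginx-hpa --watch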

Cluster Autoscaler:

# Enable cluster autoscaler (if not already enabled)
gcloud container clusters update my-cluster \
    --enable-autoscaling \
    --min-nodes 3 \
    --max-nodes 10 \
    --zone us-central1-a

# Update node pool autoscaling
gcloud container node-pools update default-pool \
    --cluster=my-cluster \
    --enable-autoscaling \
    --min-nodes=3 \
    --max-nodes=10 \
    --zone=us-central1-a

Persistent Storage

Create Persistent Volume Claim:

# pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: standard-rwo
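
standard-rwo is the Compute Engine persistent disk CSI class that current GKE clusters use by default. To see what your cluster actually offers:

# List available storage classes; the default is marked "(default)"
kubectl get storageclass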

Use PVC in Deployment:

# deployment-with-storage.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-with-storage
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: nginx:1.25
        volumeMounts:
        - name: data-volume
          mountPath: /data
      volumes:
      - name: data-volume
        persistentVolumeClaim:
          claimName: my-pvc

# Apply PVC and deployment
kubectl apply -f pvc.yaml
kubectl apply -f deployment-with-storage.yaml

# Verify PVC
kubectl get pvc
kubectl describe pvc my-pvc

Ingress and Load Balancing

Create Ingress:

# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    kubernetes.io/ingress.class: "gce"
    kubernetes.io/ingress.global-static-ip-name: "web-static-ip"
spec:
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-gke-app-service
            port:
              number: 80
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
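
The global-static-ip-name annotation above assumes a reserved global address named web-static-ip already exists. Reserving one is a single command:

# Reserve a global static IP for the external Ingress
gcloud compute addresses create web-static-ip --global

# Look up the address to point DNS records at
gcloud compute addresses describe web-static-ip --global \
    --format="value(address)"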

Setup with TLS:

# Create TLS secret
kubectl create secret tls my-tls-secret \
    --cert=path/to/cert.crt \
    --key=path/to/cert.key

# ingress-tls.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress-tls
spec:
  tls:
  - hosts:
    - myapp.example.com
    secretName: my-tls-secret
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-gke-app-service
            port:
              number: 80

Namespaces

Create and Use Namespaces:

# Create namespace
kubectl create namespace development
kubectl create namespace staging
kubectl create namespace production

# List namespaces
kubectl get namespaces

# Deploy to specific namespace
kubectl apply -f deployment.yaml -n development

# Set default namespace
kubectl config set-context --current --namespace=development

# View resources in namespace
kubectl get all -n development

# Delete namespace (deletes all resources)
kubectl delete namespace development

Monitoring and Logging

View Logs:

# View pod logs
kubectl logs pod-name

# View logs from specific container
kubectl logs pod-name -c container-name

# Stream logs
kubectl logs -f pod-name

# View logs from all pods with label
kubectl logs -l app=nginx

# View previous container logs
kubectl logs pod-name --previous
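
Because GKE ships container logs to Cloud Logging by default, you can also query them without kubectl. A sketch using gcloud (the filter follows the k8s_container resource schema):

# Read recent container logs from Cloud Logging
gcloud logging read \
    'resource.type="k8s_container" AND resource.labels.cluster_name="my-cluster"' \
    --limit 10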

Monitoring with Cloud Console:

  • Navigate to Kubernetes Engine → Workloads

  • Click on a workload to view details

  • View CPU, Memory, and Network metrics

  • Access Cloud Monitoring for advanced metrics

Check Resource Usage:

# View node resource usage
kubectl top nodes

# View pod resource usage
kubectl top pods

# View pod resource usage in specific namespace
kubectl top pods -n development

Cluster Management

List Clusters:

# List all clusters
gcloud container clusters list

# Describe cluster
gcloud container clusters describe my-cluster \
    --zone us-central1-a

Update Cluster:

# Upgrade the control plane to a specific version
# (--master targets the control plane; omit it to upgrade nodes instead)
gcloud container clusters upgrade my-cluster \
    --zone us-central1-a \
    --master \
    --cluster-version 1.28.3-gke.1203000

# Update node pool
gcloud container node-pools upgrade default-pool \
    --cluster my-cluster \
    --zone us-central1-a

Resize Cluster:

# Resize node pool
gcloud container clusters resize my-cluster \
    --num-nodes 5 \
    --zone us-central1-a

Add Node Pool:

# Create new node pool
gcloud container node-pools create high-mem-pool \
    --cluster my-cluster \
    --zone us-central1-a \
    --machine-type n1-highmem-4 \
    --num-nodes 2 \
    --enable-autoscaling \
    --min-nodes 2 \
    --max-nodes 8
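
GKE labels every node with its pool name, so workloads can be steered onto the new pool with a nodeSelector. A minimal pod-template fragment:

# Pod template fragment: schedule onto the high-mem pool
spec:
  nodeSelector:
    cloud.google.com/gke-nodepool: high-mem-pool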

Delete Cluster:

# Delete cluster
gcloud container clusters delete my-cluster \
    --zone us-central1-a

Rolling Updates

Update Deployment Image:

# Update image
kubectl set image deployment/nginx-deployment \
    nginx=nginx:1.26

# Watch rollout status
kubectl rollout status deployment/nginx-deployment

# View rollout history
kubectl rollout history deployment/nginx-deployment

# Rollback to previous version
kubectl rollout undo deployment/nginx-deployment

# Rollback to specific revision
kubectl rollout undo deployment/nginx-deployment --to-revision=2
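
By default the CHANGE-CAUSE column in the rollout history is empty. You can populate it by annotating the Deployment after each change:

# Record why this revision exists; it appears in 'rollout history'
kubectl annotate deployment/nginx-deployment \
    kubernetes.io/change-cause="update nginx to 1.26"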

Rolling Update Strategy:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Max pods above desired count
      maxUnavailable: 1  # Max pods unavailable during update
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.25

Best Practices

1. Cluster Design:

  • Use Autopilot mode for simplified management

  • Enable auto-upgrade and auto-repair

  • Use regional clusters for high availability

  • Implement proper node pool sizing

  • Use Spot VMs (or legacy preemptible nodes) for cost savings on fault-tolerant, non-critical workloads

2. Application Design:

  • Use health checks (liveness and readiness probes)

  • Set resource requests and limits

  • Design stateless applications when possible

  • Use init containers for initialization tasks

  • Implement graceful shutdown (see the sketch after this list)
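
A minimal graceful-shutdown sketch for a pod template (the image name is illustrative; the preStop sleep gives load balancers time to drain, and both values should be tuned to the application):

# Pod template fragment: let in-flight requests finish before SIGKILL
spec:
  terminationGracePeriodSeconds: 30
  containers:
  - name: myapp
    image: myapp:1.0
    lifecycle:
      preStop:
        exec:
          command: ["sh", "-c", "sleep 5"]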

3. Security:

  • Enable Workload Identity for service authentication

  • Use namespaces to isolate workloads

  • Implement Network Policies (see the sketch after this list)

  • Use private clusters when possible

  • Regularly update Kubernetes versions

  • Use Binary Authorization for image validation
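
A minimal NetworkPolicy sketch for the Network Policies point above (labels are illustrative; enforcement requires network policy support on the cluster, e.g. GKE Dataplane V2 or --enable-network-policy):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-only
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend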

4. Resource Management:

  • Define resource requests and limits

  • Use Horizontal Pod Autoscaler

  • Enable cluster autoscaling

  • Use node affinity and taints/tolerations

  • Implement pod disruption budgets (see the sketch after this list)
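
A PodDisruptionBudget sketch for the last point, keeping at least two nginx pods available during voluntary disruptions such as node upgrades:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: nginx-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: nginx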

5. Monitoring and Logging:

  • Use Cloud Monitoring and Logging

  • Set up alerts for critical metrics

  • Use structured logging

  • Monitor resource usage regularly

  • Implement distributed tracing

Troubleshooting

Common Issues:

1. Pods Not Starting:

# Check pod status
kubectl get pods
kubectl describe pod POD_NAME

# Check events
kubectl get events --sort-by=.metadata.creationTimestamp

# Check logs
kubectl logs POD_NAME

2. Service Not Accessible:

# Check service
kubectl get services
kubectl describe service SERVICE_NAME

# Check endpoints
kubectl get endpoints SERVICE_NAME

# Verify pod labels match service selector
kubectl get pods --show-labels

3. Node Issues:

# Check node status
kubectl get nodes
kubectl describe node NODE_NAME

# Check node conditions
kubectl get nodes -o json | jq '.items[].status.conditions'

4. Resource Issues:

# Check resource usage
kubectl top nodes
kubectl top pods

# Check resource requests and limits
kubectl describe node NODE_NAME

Cleanup

# Delete all common resources in a namespace
# (note: "all" covers pods, deployments, services, etc., but not
#  ConfigMaps, Secrets, PVCs, or Ingresses)
kubectl delete all --all -n my-namespace

# Delete specific resources
kubectl delete deployment nginx-deployment
kubectl delete service nginx-service
kubectl delete ingress my-ingress

# Delete namespace
kubectl delete namespace my-namespace

# Delete cluster
gcloud container clusters delete my-cluster --zone us-central1-a

Additional Resources