Backend & DevOps Blog

Real-world experiences with MongoDB, Docker, Kubernetes and more

Rolling Deployment for Next.js Apps on Kubernetes

Deploying new versions of our Next.js application used to be a nerve-wracking experience for our team. Despite using Kubernetes, which theoretically supports zero-downtime deployments, we often experienced 30-60 seconds of dropped requests during updates. This was particularly frustrating since one of the main reasons we migrated to Kubernetes was to eliminate deployment downtime. Our journey to truly zero-downtime deployments taught us valuable lessons about the nuances of rolling updates, health checks, and traffic management in Kubernetes.

The Initial Setup: Basic Deployment without Proper Configuration

Our initial Kubernetes deployment for our Next.js application was fairly basic. We used a standard Deployment resource without much customization:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nextjs-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nextjs-app
  template:
    metadata:
      labels:
        app: nextjs-app
    spec:
      containers:
      - name: nextjs
        image: our-registry/nextjs-app:v1.0.0
        ports:
        - containerPort: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: nextjs-app-service
spec:
  selector:
    app: nextjs-app
  ports:
  - port: 80
    targetPort: 3000
  type: ClusterIP

While this configuration worked for running the application, we experienced significant downtime during deployments. When updating the image version, we observed:

  1. All pods were terminated nearly simultaneously
  2. New pods took 15-20 seconds to become ready
  3. The service sent traffic to pods that were terminating or not yet ready

The result was a poor user experience with dropped connections and error pages during deployments.

Problem #1: Default Rolling Update Strategy

After researching Kubernetes rolling updates, we discovered that our deployment was using the default update strategy parameters (maxSurge: 25%, maxUnavailable: 25%), which weren't suitable for our application. The defaults allowed too many pods to be replaced at once.

We needed to modify the rolling update strategy to ensure that:

  • Only one pod would be taken down at a time
  • A new pod would be fully ready before another pod was replaced
  • We had extra capacity during updates to handle the ongoing traffic

Solution #1: Customized RollingUpdate Strategy

We updated our deployment configuration with a more conservative rolling update strategy:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nextjs-app
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Maximum number of pods above desired replicas
      maxUnavailable: 0  # Maximum number of pods that can be unavailable
  # ... rest of the deployment configuration ...

With this configuration:

  1. maxSurge: 1 allowed one additional pod (beyond our desired count) to be created during updates
  2. maxUnavailable: 0 ensured that we would never have fewer than our desired number of pods available

This meant that during an update, Kubernetes would first create a new pod, wait for it to become ready, and only then terminate an old pod.
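
You can watch this behavior during a rollout. The standard kubectl commands below are enough to verify it (the deployment name matches our manifest):

# Watch the rollout until it completes
kubectl rollout status deployment/nextjs-app

# In a second terminal, watch pods being created and terminated one at a time
kubectl get pods -l app=nextjs-app --watch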

Problem #2: Inaccurate Readiness Detection

After implementing the custom rolling update strategy, we still experienced intermittent downtime. Our logs showed that Kubernetes was considering new pods "ready" before they were actually able to serve traffic. In particular:

  • A Next.js server can take several seconds to finish initializing (establishing database connections, warming caches) after it starts listening on its port
  • Without proper readiness probes, Kubernetes sent traffic to pods as soon as the container started
  • Requests hitting a pod that wasn't fully initialized resulted in 503 errors

Solution #2: Implementing Proper Health Checks

We needed to implement three different types of probes to properly manage our pods' lifecycle:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nextjs-app
spec:
  # ... other configuration ...
  template:
    spec:
      containers:
      - name: nextjs
        image: our-registry/nextjs-app:v1.0.0
        ports:
        - containerPort: 3000
        
        # Check if container is alive
        livenessProbe:
          httpGet:
            path: /api/health
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 10
          timeoutSeconds: 2
          failureThreshold: 3
        
        # Check if container is ready to receive traffic
        readinessProbe:
          httpGet:
            path: /api/ready
            port: 3000
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 2
          failureThreshold: 3
          successThreshold: 1
        
        # Handle slow-starting containers
        startupProbe:
          httpGet:
            path: /api/health
            port: 3000
          failureThreshold: 30
          periodSeconds: 2

To support these probes, we had to implement the health check endpoints in our Next.js application:

// pages/api/health.js
export default function handler(req, res) {
  // Basic health check - just responds with 200 if the server is running
  res.status(200).json({ status: 'ok' });
}

// pages/api/ready.js
export default function handler(req, res) {
  // Readiness check - verifies that the application is fully ready to serve traffic
  // For Next.js, this might check that:
  // 1. Database connections are established
  // 2. Required caches are warmed up
  // 3. Any initialization tasks are complete
  
  const isReady = checkIfAppIsReady(); // Your custom check function
  
  if (isReady) {
    res.status(200).json({ status: 'ready' });
  } else {
    res.status(503).json({ status: 'not ready' });
  }
}

function checkIfAppIsReady() {
  // In a real application, you would check things like:
  // - Database connectivity
  // - Cache availability
  // - External API accessibility
  // For simplicity, we're just returning true here
  return true;
}
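
As a concrete example of what checkIfAppIsReady could look like, here is a minimal sketch for an app backed by MongoDB. The mongoClient handle and its initialization are assumptions for illustration, not our actual code, and the handler in pages/api/ready.js would need to await this async version:

// lib/readiness.js (illustrative sketch, assumes the official mongodb driver v4+)
const { MongoClient } = require('mongodb');

// Assumption: a shared client created once at startup
const mongoClient = new MongoClient(process.env.MONGODB_URI);

async function checkIfAppIsReady() {
  try {
    // connect() resolves immediately if the client is already connected
    await mongoClient.connect();
    // ping is a cheap server round-trip
    await mongoClient.db('admin').command({ ping: 1 });
    return true;
  } catch (err) {
    return false;
  }
}

module.exports = { checkIfAppIsReady };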

The difference between these probes is important:

  • Liveness Probe: Determines if the container is running. If this fails, Kubernetes restarts the container.
  • Readiness Probe: Determines if the container can receive traffic. If this fails, Kubernetes stops sending traffic to the pod but doesn't restart it.
  • Startup Probe: Gives slow-starting containers time to initialize. Once this succeeds, the liveness probe takes over.

Problem #3: Connection Draining Issues

Even with proper rolling updates and health checks, we still saw some dropped connections during deployments. This happened because existing connections to pods that were being terminated were abruptly closed when the pod was shut down.

We needed to implement proper connection draining to ensure that:

  1. Pods being terminated stopped receiving new connections
  2. Existing requests were allowed to complete
  3. The pod only terminated after all connections were properly closed

Solution #3: Pod Termination Grace Period and PreStop Hook

We implemented two key changes to handle graceful shutdown:

  1. Increased the termination grace period to give pods more time to shut down gracefully
  2. Added a preStop hook to implement a sleep delay before termination

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nextjs-app
spec:
  # ... other configuration ...
  template:
    spec:
      terminationGracePeriodSeconds: 60  # Give pods 60 seconds to shut down
      containers:
      - name: nextjs
        image: our-registry/nextjs-app:v1.0.0
        ports:
        - containerPort: 3000
        
        # ... probes configuration ...
        
        lifecycle:
          preStop:
            exec:
              # Keep serving while the endpoint removal propagates; Kubernetes sends
              # SIGTERM to the container automatically after this hook completes
              command: ["/bin/sh", "-c", "sleep 10"]

We also had to modify our Next.js application to handle graceful shutdown. We created a custom server that could process the SIGTERM signal:

// server.js
const { createServer } = require('http');
const { parse } = require('url');
const next = require('next');

const dev = process.env.NODE_ENV !== 'production';
const app = next({ dev });
const handle = app.getRequestHandler();

app.prepare().then(() => {
  const server = createServer((req, res) => {
    const parsedUrl = parse(req.url, true);
    handle(req, res, parsedUrl);
  });

  // Keep track of connections to close them gracefully on shutdown
  const connections = {};
  let connectionCounter = 0;

  server.on('connection', (conn) => {
    const id = connectionCounter++;
    connections[id] = conn;
    
    conn.on('close', () => {
      delete connections[id];
    });
  });

  server.listen(3000, (err) => {
    if (err) throw err;
    console.log('> Ready on http://localhost:3000');
  });

  // Graceful shutdown handler
  process.on('SIGTERM', () => {
    console.log('Received SIGTERM, shutting down gracefully...');
    
    server.close(() => {
      console.log('Server closed');
      process.exit(0);
    });
    
    // Force close connections after timeout
    setTimeout(() => {
      console.log('Forcing server to close after timeout');
      
      // Close any remaining connections
      Object.keys(connections).forEach((key) => {
        connections[key].destroy();
      });
      
      process.exit(0);
    }, 30000);
  });
});

We then updated our Docker image to use this custom server instead of the default Next.js start command.
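
For reference, the Dockerfile change is small. The sketch below makes assumptions about the base image, build steps, and file layout; the important detail is the exec-form CMD, which makes the node process PID 1 so it receives SIGTERM directly from Kubernetes:

# Dockerfile (illustrative sketch)
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Exec form (no shell wrapper) so node receives SIGTERM as PID 1
CMD ["node", "server.js"]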

Problem #4: Traffic Routing During Updates

After implementing proper rolling updates, health checks, and graceful shutdown, we still occasionally saw issues during deployment. We discovered that the Kubernetes Service was sometimes routing traffic to pods that were in the process of shutting down before the readiness probe had a chance to fail.

This happened because of timing issues between:

  1. When a pod received the termination signal
  2. When the readiness probe failed and the pod was removed from the service endpoints
  3. When the pod stopped accepting new connections

Solution #4: Using a Service Mesh for Traffic Management

For more precise control over traffic routing, we implemented a service mesh using Istio. This gave us finer-grained control over how traffic was routed during deployments.

First, we installed Istio in our cluster and labeled our namespace for Istio injection:

kubectl label namespace nextjs-production istio-injection=enabled
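
One gotcha: the label only affects pods created after it is applied, so existing pods have to be recreated to receive the sidecar:

kubectl rollout restart deployment/nextjs-app -n nextjs-production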

Then, we created a VirtualService and DestinationRule to control traffic routing:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: nextjs-app-vs
spec:
  hosts:
  - "nextjs-app-service"
  http:
  - route:
    - destination:
        host: nextjs-app-service
        subset: stable
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: nextjs-app-dr
spec:
  host: nextjs-app-service
  trafficPolicy:
    connectionPool:
      http:
        maxRequestsPerConnection: 10
      tcp:
        maxConnections: 100
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s
  subsets:
  - name: stable
    labels:
      app: nextjs-app

The key advantage of using Istio was its more sophisticated health checking and traffic management:

  • Istio can detect failing endpoints more quickly than Kubernetes Services
  • It provides circuit breaking for failing instances
  • It can gradually shift traffic between old and new versions, as sketched below
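
To illustrate that last point, here is a hedged sketch of a weighted split between two subsets. The canary subset and its version labels are assumptions for the example; we did not run this exact configuration in production:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: nextjs-app-vs
spec:
  hosts:
  - "nextjs-app-service"
  http:
  - route:
    - destination:
        host: nextjs-app-service
        subset: stable
      weight: 90
    - destination:
        host: nextjs-app-service
        subset: canary  # assumes a matching DestinationRule subset selecting version: canary pods
      weight: 10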

Problem #5: Long-Running Connections

After implementing all these improvements, we still had issues with WebSockets and other long-running connections. These connections would be maintained with terminating pods until they were forcibly closed, leading to errors.

Solution #5: WebSocket-Aware Deployment Strategy

For our WebSocket connections, we implemented a more specialized approach:

  1. Added a WebSocket-specific readiness check that failed immediately on shutdown notice
  2. Implemented client-side reconnection logic that could gracefully handle server switching
  3. Used Istio features to better manage WebSocket connections during deployments

For the WebSocket readiness probe, we implemented a special endpoint that would fail as soon as the pod received a shutdown signal:

// pages/api/ws-ready.js
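// Note: a module-level signal handler like this only works when Next.js runs in a
// long-lived Node process (such as our custom server), not on serverless platforms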
let isShuttingDown = false;

// Set up a handler for shutdown signals
process.on('SIGTERM', () => {
  isShuttingDown = true;
});

export default function handler(req, res) {
  if (isShuttingDown) {
    // Immediately report as not ready when shutting down
    res.status(503).json({ status: 'shutting down' });
  } else {
    res.status(200).json({ status: 'ready for websockets' });
  }
}

We then updated our deployment to use this special readiness probe for WebSocket-capable pods:

readinessProbe:
  httpGet:
    path: /api/ws-ready
    port: 3000
  periodSeconds: 1  # Check frequently
  timeoutSeconds: 1
  failureThreshold: 1  # Fail immediately on first problem

On the client side, we implemented robust reconnection logic:

// websocket-client.js
class RobustWebSocketClient {
  constructor(url, options = {}) {
    this.url = url;
    this.options = options;
    this.reconnectAttempts = 0;
    this.maxReconnectAttempts = options.maxReconnectAttempts || 10;
    this.reconnectInterval = options.reconnectInterval || 1000;
    this.shouldReconnect = true; // cleared by close() so intentional closes don't reconnect
    this.listeners = {
      message: [],
      open: [],
      close: [],
      error: [],
      reconnect: []
    };
    
    this.connect();
  }
  
  connect() {
    this.socket = new WebSocket(this.url);
    
    this.socket.onopen = (event) => {
      this.reconnectAttempts = 0;
      this.listeners.open.forEach(listener => listener(event));
    };
    
    this.socket.onmessage = (event) => {
      this.listeners.message.forEach(listener => listener(event));
    };
    
    this.socket.onclose = (event) => {
      this.listeners.close.forEach(listener => listener(event));
      this.handleReconnect();
    };
    
    this.socket.onerror = (event) => {
      this.listeners.error.forEach(listener => listener(event));
    };
  }
  
  handleReconnect() {
    if (this.shouldReconnect && this.reconnectAttempts < this.maxReconnectAttempts) {
      this.reconnectAttempts++;
      
      const delay = this.reconnectInterval * Math.pow(1.5, this.reconnectAttempts - 1);
      
      setTimeout(() => {
        this.listeners.reconnect.forEach(listener => 
          listener({ attempt: this.reconnectAttempts, maxAttempts: this.maxReconnectAttempts })
        );
        this.connect();
      }, delay);
    }
  }
  
  // Event listener methods
  on(event, callback) {
    if (this.listeners[event]) {
      this.listeners[event].push(callback);
    }
    return this;
  }
  
  send(data) {
    if (this.socket && this.socket.readyState === WebSocket.OPEN) {
      this.socket.send(typeof data === 'string' ? data : JSON.stringify(data));
      return true;
    }
    return false;
  }
  
  close() {
    this.shouldReconnect = false; // suppress automatic reconnection on intentional close
    if (this.socket) {
      this.socket.close();
    }
  }
}
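
Usage is straightforward; a brief illustrative example (the URL is a placeholder):

// Reconnects automatically when a pod behind the service is replaced
const client = new RobustWebSocketClient('wss://app.example.com/ws', {
  maxReconnectAttempts: 20,
  reconnectInterval: 500
});

client.on('message', (event) => console.log('received:', event.data));
client.on('reconnect', ({ attempt, maxAttempts }) =>
  console.log(`reconnecting (${attempt}/${maxAttempts})`)
);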

Final Production-Ready Configuration

After addressing all these issues, our final Kubernetes configuration for zero-downtime Next.js deployments looked like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nextjs-app
  namespace: nextjs-production
spec:
  replicas: 5  # Increased for better availability
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: nextjs-app
  template:
    metadata:
      labels:
        app: nextjs-app
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "3000"
        prometheus.io/path: "/api/metrics"
    spec:
      terminationGracePeriodSeconds: 75  # 15s preStop sleep plus up to 60s for graceful shutdown
      containers:
      - name: nextjs
        image: our-registry/nextjs-app:v2.0.0
        ports:
        - containerPort: 3000
          name: http
        
        # Resource limits
        resources:
          requests:
            cpu: 100m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
        
        # Environment variables
        env:
        - name: NODE_ENV
          value: "production"
        
        # Liveness probe - determine if pod is running
        livenessProbe:
          httpGet:
            path: /api/health
            port: 3000
          initialDelaySeconds: 15
          periodSeconds: 20
          timeoutSeconds: 3
          failureThreshold: 3
        
        # Readiness probe - determine if pod can receive traffic
        readinessProbe:
          httpGet:
            path: /api/ready
            port: 3000
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 2
        
        # Startup probe - for slow-starting pods
        startupProbe:
          httpGet:
            path: /api/health
            port: 3000
          failureThreshold: 15
          periodSeconds: 5
        
        # Graceful shutdown: keep serving while endpoints update; Kubernetes
        # sends SIGTERM automatically once the preStop hook completes
        lifecycle:
          preStop:
            exec:
              command:
              - "/bin/sh"
              - "-c"
              - "sleep 15"
              
      # Affinity to spread pods across nodes
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - nextjs-app
              topologyKey: "kubernetes.io/hostname"
---
apiVersion: v1
kind: Service
metadata:
  name: nextjs-app-service
  namespace: nextjs-production
spec:
  selector:
    app: nextjs-app
  ports:
  - port: 80
    targetPort: 3000
    name: http
  type: ClusterIP

Additionally, for WebSocket and HTTP/2 traffic, we added the Istio VirtualService and DestinationRule configurations mentioned earlier.

Monitoring the Deployment Process

To ensure our rollouts were truly zero-downtime, we implemented comprehensive monitoring:

  1. Added real-time error rate dashboards in Grafana
  2. Implemented synthetic tests that ran continuously during deployments
  3. Set up alerts for any spike in 5xx errors during deployment windows
  4. Created a deployment status page for our team to monitor rollouts

Our monitoring dashboard during deployments looked something like this:

# Prometheus queries for deployment monitoring

# Error rate during deployment
sum(rate(http_requests_total{status=~"5.."}[1m])) by (service) 
  / 
sum(rate(http_requests_total[1m])) by (service)

# Pod termination duration
kube_pod_deletion_timestamp - kube_pod_start_time

# Request latency during deployment
histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket[1m])) by (le, service))

# Connection drops
sum(rate(http_connections_closed_total{reason="error"}[1m])) by (service)
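
The synthetic tests were conceptually simple. As a hedged sketch of the idea (the URL and interval are placeholders), a loop that polls the service throughout a rollout and logs any failed request is enough to catch dropped traffic:

#!/bin/sh
# Poll the service once per second during a deployment; log any non-200 response
URL="https://app.example.com/api/health"
while true; do
  code=$(curl -s -o /dev/null -w "%{http_code}" --max-time 2 "$URL")
  if [ "$code" != "200" ]; then
    echo "$(date -u) got HTTP $code"
  fi
  sleep 1
done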

Lessons Learned

Our journey to zero-downtime deployments for Next.js on Kubernetes taught us several critical lessons:

  1. Rolling update strategy matters: The default Kubernetes settings aren't always appropriate for your application.
  2. Health checks are essential: Properly configured liveness, readiness, and startup probes make a huge difference.
  3. Graceful termination requires planning: You need to handle both the Kubernetes pod lifecycle and your application shutdown logic.
  4. WebSockets need special handling: Long-lived connections require additional consideration during rolling updates.
  5. Traffic management is complex: For critical applications, a service mesh provides valuable additional control.
  6. Monitor everything: Comprehensive monitoring is the only way to know if your deployments are truly zero-downtime.

Conclusion

After implementing all these changes, our deployment success rate improved dramatically. We went from frequent user-impacting deployments to truly zero-downtime updates. Our 99.9th percentile response times during deployments improved from spikes over 10 seconds to a consistent sub-500ms, even during updates.

Most importantly, our team's confidence in deploying new versions increased significantly. We now deploy multiple times a day without stress, knowing that our users won't experience any disruption.

While achieving zero-downtime deployments for Next.js on Kubernetes requires careful configuration and attention to detail, the result is worth the effort. With the approach outlined in this article, you can create a robust deployment pipeline that ensures your users never experience downtime, even as you continuously improve your application.