Service health check failure

The new service instance never became healthy enough for traffic.

health-check-failure high confidence deploy docker kubernetes

Matched signals

  • health check failed
  • liveness probe failed
  • readiness probe failed
  • readiness probe always fails
  • unhealthy
  • service is not ready
  • waiting for deployment to be ready
  • /health/ready

What this failure means

The new service instance never became healthy enough for traffic.

Symptoms

Faultline looks for one or more of these log fragments:

  • health check failed
  • liveness probe failed
  • readiness probe failed
  • readiness probe always fails
  • unhealthy
  • service is not ready
  • waiting for deployment to be ready
  • /health/ready
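
For manual triage, a fixed-string grep over the same fragments is usually enough to see which lines would trigger this playbook. The sketch below is only an approximation: the function name `match_signals` is hypothetical, and case-insensitive matching is an assumption, so Faultline's own matcher may behave differently.

```shell
# Hypothetical helper: print every line (with its line number) that contains
# one of the log fragments listed above.
# Assumption: fixed-string, case-insensitive matching.
match_signals() {
  log_file="$1"
  grep -i -n -F \
    -e 'health check failed' \
    -e 'liveness probe failed' \
    -e 'readiness probe failed' \
    -e 'readiness probe always fails' \
    -e 'unhealthy' \
    -e 'service is not ready' \
    -e 'waiting for deployment to be ready' \
    -e '/health/ready' \
    -- "$log_file"
}
```

Calling `match_signals build.log` prints the matching lines and returns a non-zero status when none of the fragments appear.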

Diagnosis

The deploy reached the runtime environment, but the process did not satisfy the configured readiness or liveness checks before the timeout window closed.

If liveness stays green while readiness returns 503, the process has usually started but is still blocked on startup work such as migrations, downstream API checks, or dependency initialization.
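
The reasoning above can be written down as a small decision helper. This is a minimal sketch, not part of Faultline: the function name `classify_probes` and its output labels are hypothetical. It treats any status from 200 to 399 as a passing probe, which is how Kubernetes HTTP probes judge success.

```shell
# Hypothetical helper: interpret the HTTP status codes returned by the
# liveness and readiness endpoints.
# Arguments: $1 = liveness status code, $2 = readiness status code.
classify_probes() {
  liveness="$1"
  readiness="$2"
  case "$liveness" in
    2??|3??) ;;                       # liveness passes, keep going
    *) echo "not-alive"; return 0 ;;  # process is down or not answering
  esac
  case "$readiness" in
    2??|3??) echo "ready" ;;
    503)     echo "started-but-not-ready" ;;  # blocked on startup work
    *)       echo "readiness-misconfigured-or-failing" ;;
  esac
}
```

The codes can be collected with `curl -s -o /dev/null -w '%{http_code}' <url>`, which prints `000` when the connection fails, so an unreachable process is reported as `not-alive`.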

Common causes:

  • missing configuration or secrets
  • startup migrations failing
  • the app listening on the wrong port
  • a crash loop during bootstrap
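
To rule out the first cause, it can help to compare the variable names a service needs against those declared on the deployment. The sketch below assumes the variables are set directly in the first container's `env` list; anything injected through `envFrom` or mounted as a file will not show up. The function name `missing_env` is hypothetical.

```shell
# Hypothetical helper: print each required variable name that is NOT declared
# in the env list of the deployment's first container.
# Arguments: $1 = deployment name, remaining args = required variable names.
# Assumption: variables are declared under .env, not injected via envFrom.
missing_env() {
  deploy="$1"
  shift
  # jsonpath prints the declared names separated by spaces
  defined="$(kubectl get deploy "$deploy" \
    -o jsonpath='{.spec.template.spec.containers[0].env[*].name}')"
  for name in "$@"; do
    case " $defined " in
      *" $name "*) ;;      # declared, nothing to report
      *) echo "$name" ;;   # missing from the deployment spec
    esac
  done
}
```

For instance, `missing_env <name> DATABASE_URL REDIS_URL` (both variable names are placeholders) prints whichever of the two is not declared, and prints nothing when both are present.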

Fix steps

  1. Inspect startup logs for the failing container:

    kubectl logs <pod> --previous
    
  2. Inspect the pod description for the probe configuration and recent events:

    kubectl describe pod <pod>
    
  3. Verify the health check path and port match what the application actually serves.

  4. Confirm all required environment variables and secrets are present in the deployment.

  5. Check that dependent services such as databases and caches are reachable from the new pod.

  6. If liveness is 200 while readiness is 503, inspect startup tasks and dependency checks that gate the ready endpoint.
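
Step 3 can be partly scripted by reading the readiness probe target out of the pod spec and comparing it with what the application is expected to serve. This is a sketch under assumptions: it only covers HTTP probes on the first container, and the function names `probe_target` and `check_probe_target` are hypothetical.

```shell
# Hypothetical helper: print "<path> <port>" for the readiness probe of the
# first container in a pod. Assumes an httpGet probe; the port may be a
# number or a named port, depending on how the probe was written.
probe_target() {
  pod="$1"
  kubectl get pod "$pod" \
    -o jsonpath='{.spec.containers[0].readinessProbe.httpGet.path}{" "}{.spec.containers[0].readinessProbe.httpGet.port}{"\n"}'
}

# Hypothetical helper: compare the probe target with the path and port the
# application is expected to serve. Returns non-zero on a mismatch.
check_probe_target() {
  pod="$1"
  want_path="$2"
  want_port="$3"
  got="$(probe_target "$pod")"
  if [ "$got" = "$want_path $want_port" ]; then
    echo "match: $got"
  else
    echo "mismatch: probe uses '$got', expected '$want_path $want_port'"
    return 1
  fi
}
```

Running `check_probe_target <pod> /health/ready 8080` (the port is a placeholder) reports a mismatch when the probe points somewhere the application does not listen.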

Validation

  • Request the readiness endpoint directly from inside a pod or from a staging environment and confirm it returns a success status.
  • Confirm the deployment becomes ready without restarts.
  • Re-run the rollout and watch the probe status until it stabilizes.
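
The first check can be automated with a short polling loop. The sketch below is one possible shape, not a Faultline feature: the function name `wait_until_ready`, the default of 30 attempts, and the 2 second delay are all assumptions, and it treats only a 200 response as ready.

```shell
# Hypothetical helper: poll a readiness URL until it returns 200 or the
# attempts run out.
# Arguments: $1 = URL, $2 = max attempts (default 30), $3 = delay in seconds
# between attempts (default 2).
wait_until_ready() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -w '%{http_code}' prints only the status code; curl prints 000 when
    # the connection itself fails, so "|| true" keeps the loop going.
    code="$(curl -s -o /dev/null -w '%{http_code}' "$url" || true)"
    if [ "$code" = "200" ]; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempt(s), last status: $code"
  return 1
}
```

A call such as `wait_until_ready http://localhost:8080/health/ready` (host and port are placeholders) exits non-zero if the endpoint never returns 200, which makes it usable as a gate in a script.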

Why it matters

Health checks are the deployment gate that protects traffic from a broken release. When they fail, the platform is telling you the new instance is not safely serving requests.

Try it locally

    kubectl logs <pod> --previous
    kubectl describe pod <pod>
    kubectl rollout status deploy/<name>
    kubectl get events --sort-by=.lastTimestamp

How Faultline detects it

Use faultline explain health-check-failure to see the full playbook.

    faultline analyze build.log
    faultline explain health-check-failure

Generated from playbooks/bundled/log/deploy/health-check-failure.yaml. Do not edit directly.
