Process killed without diagnostic output

A CI process was killed externally, producing little or no diagnostic output.

process-killed-no-logs | high confidence | runtime

Matched signals

  • Killed
  • signal: killed
  • exit status 137
  • exit code: 137
  • Exit Code: 137
  • exit code 143
  • signal 9
  • SIGKILL

What this failure means

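A CI process was killed by something outside the process itself, so it produced little or no diagnostic output. Exit code 137 (128 + 9, SIGKILL) is the most common indicator; exit code 143 (128 + 15, SIGTERM) appears when the kill began as a termination request. The usual triggers are the kernel OOM killer, a CI job timeout, or a watchdog process terminating a runaway job.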

Symptoms

Faultline looks for one or more of these log fragments:

Killed
signal: killed
exit status 137
exit code: 137
Exit Code: 137
exit code 143
signal 9
SIGKILL

Diagnosis

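A bare "Killed" line with exit code 137 means the process received SIGKILL. SIGKILL cannot be caught or handled, so the application had no chance to log anything, and the kill almost always came from outside the application code. The absence of a stack trace or panic is itself a key diagnostic clue: it points to an external kill rather than an application error.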

Exit code reference:

  • 137 = 128 + 9 (SIGKILL): OOM killer, container memory limit, or explicit kill -9
  • 143 = 128 + 15 (SIGTERM): CI job timeout, graceful shutdown request
  • 130 = 128 + 2 (SIGINT): Manual cancellation or Ctrl-C in script
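
The arithmetic in the reference above can be applied mechanically to any exit status. The following is a minimal sketch, not part of Faultline; the function name describe_exit is hypothetical, and it assumes a shell whose kill builtin supports `kill -l <number>` (bash and dash both do).

```shell
#!/usr/bin/env bash
# Hypothetical helper: translate a shell exit status into a signal name.
describe_exit() {
  local code=$1
  local name
  # By shell convention, a status above 128 means "killed by signal (code - 128)".
  # `kill -l N` prints the signal name without the SIG prefix, or fails if N is not a signal.
  if [ "$code" -gt 128 ] && name=$(kill -l "$((code - 128))" 2>/dev/null); then
    echo "exit $code = 128 + $((code - 128)) (SIG$name)"
  else
    echo "exit $code (not a recognized signal exit)"
  fi
}
```

For example, `describe_exit 137` prints `exit 137 = 128 + 9 (SIGKILL)`.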

Determine the cause:

  1. Check runner memory usage at the time of failure in CI platform dashboards.

  2. Check whether the job time limit was reached:

    # GitHub Actions: look for "The operation was canceled" in the job log
    # GitLab CI: look for "Job ... timed out" in the runner log
    
  3. Check for OOM events in the runner or container:

    # Linux kernel OOM log
    dmesg | grep -i "oom\|killed process"
    journalctl -k | grep -i "oom\|Out of memory"
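
When the job runs inside a container, the kernel log is often not readable from within it. On cgroup v2 systems, the container's own memory.events file carries an oom_kill counter that can be checked instead. This is a sketch under that assumption; the function name oom_kill_count is hypothetical, and the default path assumes cgroup v2 is mounted at /sys/fs/cgroup.

```shell
# Hypothetical helper: print the oom_kill counter from a cgroup v2 memory.events file.
# Assumes cgroup v2; the default path is the usual mount point inside a container.
oom_kill_count() {
  local events_file=${1:-/sys/fs/cgroup/memory.events}
  if [ ! -r "$events_file" ]; then
    echo "cannot read $events_file (cgroup v1 or restricted runner?)" >&2
    return 1
  fi
  # Lines look like "oom_kill 3"; print the number after the exact key.
  awk '$1 == "oom_kill" { print $2 }' "$events_file"
}
```

A non-zero count at the end of a failed job suggests that the container memory limit, rather than a timeout, caused the kill.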
    

Fix steps

If OOM-killed:

  1. Profile memory usage and reduce peak consumption (see the oom-killed playbook for detailed steps).
  2. Increase the runner or container memory limit.
  3. Reduce parallelism to lower peak memory, for example by passing --workers=1 or your test runner's equivalent flag.

If killed by job timeout:

  1. Identify and fix slow tests or steps that consistently approach the limit.

  2. Split the job into smaller parallel jobs that each complete within budget.

  3. Increase the timeout as a temporary measure while fixing the root cause:

    # GitHub Actions
    jobs:
      build:
        timeout-minutes: 60   # raise this if an explicit lower limit was set; the default is 360 minutes (6 hours)
    
    # GitLab CI
    job_name:
      timeout: 1h
    

If killed by watchdog or concurrency limit:

  1. Ensure the process respects SIGTERM by installing a signal handler and flushing output before exiting.
  2. Check whether CI concurrency limits cancel in-progress jobs when a newer run starts (GitHub Actions concurrency with cancel-in-progress: true).
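
For shell-driven jobs, the signal handler from step 1 can be sketched with trap. This is an illustration only; the function name run_with_term_trap is hypothetical. Note that it helps only with SIGTERM-style kills (exit 143): SIGKILL (exit 137) cannot be trapped.

```shell
# Hypothetical wrapper: run a command, and on SIGTERM log a message,
# forward the signal to the command, and exit with the conventional 143.
# The parenthesized body runs in a subshell so the trap and exit stay local.
run_with_term_trap() (
  on_term() {
    echo "received SIGTERM at $(date -u +%H:%M:%S), stopping child and exiting" >&2
    if [ -n "${child_pid:-}" ]; then
      kill -TERM "$child_pid" 2>/dev/null || true
    fi
    exit 143
  }
  trap on_term TERM

  # Run the real work in the background and wait on it: the shell runs a trap
  # promptly while in `wait`, but not while blocked on a foreground command.
  "$@" &
  child_pid=$!
  wait "$child_pid"
)
```

Usage would look like `run_with_term_trap make test`, where `make test` stands in for the real step.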

General:

  • Add resource monitoring to the CI job so memory and CPU trends are visible before the kill occurs.
  • Ensure test output is flushed before the process exits so partial results are not lost.
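
One way to add the resource monitoring mentioned above is a small background loop that writes memory and load figures to the job log while a heavy step runs. The sketch below assumes a Linux runner with /proc/meminfo and /proc/loadavg; the function name with_resource_log and the log line format are hypothetical.

```shell
# Hypothetical helper: log available memory and 1-minute load average every
# N seconds while a command runs. Assumes Linux (/proc/meminfo, /proc/loadavg).
with_resource_log() {
  local interval=$1; shift
  (
    while :; do
      avail_kb=$(awk '$1 == "MemAvailable:" { print $2 }' /proc/meminfo)
      load=$(cut -d' ' -f1 /proc/loadavg)
      echo "[monitor] $(date -u +%H:%M:%S) mem_available_kb=${avail_kb} load1=${load}" >&2
      sleep "$interval"
    done
  ) &
  local monitor_pid=$!

  # Run the real step and remember its exit status.
  local status=0
  "$@" || status=$?

  # Stop the monitor; its last sleep may outlive the step by up to one interval.
  kill "$monitor_pid" 2>/dev/null || true
  wait "$monitor_pid" 2>/dev/null || true
  return "$status"
}
```

With `with_resource_log 10 make test` (a stand-in command), the last monitor lines before an abrupt end of the log show whether available memory was trending toward zero.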

Validation

  • Re-run the failing job and confirm it completes without being killed.
  • Check that the exit code is 0 rather than 137 or 143.

Why it matters

A silent kill is one of the hardest CI failures to debug. There is no stack trace, the log ends abruptly, and the true cause (memory exhaustion, timeout, or external cancellation) requires out-of-band investigation. Without an explicit error message in the log, teams often misattribute the failure to a flaky test or network issue.

Prevention

  • Set explicit memory and timeout limits so kills are intentional and documented rather than platform-default surprises.
  • Enable dmesg or platform OOM logging in CI environments.
  • Add a health-check script that logs memory and CPU usage during heavy steps.
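
As one way to make a timeout kill intentional, a heavy step can be wrapped in the coreutils timeout command, which fails the step with its own exit status (124) instead of leaving a bare platform kill. This is a sketch assuming GNU coreutils timeout is available on the runner; the wrapper name run_with_budget and the 10-second grace period are arbitrary choices for illustration.

```shell
# Hypothetical wrapper: run a step under an explicit time budget (in seconds).
# Assumes GNU coreutils `timeout` is installed on the runner.
run_with_budget() {
  local budget=$1; shift
  local status=0
  # Send SIGTERM when the budget expires, then SIGKILL 10 seconds later
  # if the step is still running.
  timeout -k 10 "$budget" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "step exceeded its ${budget}s budget: $*" >&2
  fi
  return "$status"
}
```

For example, `run_with_budget 1800 make test` (with `make test` as a stand-in) gives the step 30 minutes and leaves a clear message in the log if it runs over.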

How Faultline detects it

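Faultline matches the log fragments listed under Symptoms. Run faultline analyze on a log to check for a match, and faultline explain process-killed-no-logs to see the full playbook.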

faultline analyze build.log
faultline explain process-killed-no-logs

Generated from playbooks/bundled/log/runtime/process-killed-no-logs.yaml. Do not edit directly.
