If applications were dogs living in yards secured by invisible fences, exit code 139 in Kubernetes would be the equivalent of a dog getting zapped by its electric collar when it tries to step into a neighbor's yard. It's not a pleasant event for dogs or for Kubernetes applications, but it's necessary to prevent potentially bigger issues – such as massive dog fights or programs trying to overwrite each other's memory space and causing a server to crash.

Now, we're not here to take a stance on whether invisible fences and electric collars are the best way to manage your pets. But what we can do is give you advice on how to deal with code 139 events in Kubernetes – which you need to do if you want to keep your applications running smoothly as part of an effective Kubernetes troubleshooting strategy.

What is Exit Code 139?

In Kubernetes, exit code 139 means that a container was terminated because it received the SIGSEGV signal from the operating system on its host node. (The number follows the convention of reporting fatal signals as 128 plus the signal number: SIGSEGV is signal 11, and 128 + 11 = 139.)

In Linux and other Unix-like operating systems, SIGSEGV is a type of forced termination signal that tells a process to shut down. The operating system typically generates this signal when it detects a process that is trying to access system memory that either doesn't exist or that the process lacks permission to access – an event known as a segmentation fault (or just segfault, as die-hard Linux geeks like to put it).

If a container receives SIGSEGV, it will usually terminate. That's undesirable, of course, because you typically don't want your containers to shut down unless you decide to shut them down. But the alternative to SIGSEGV is potentially having your entire server crash due to multiple processes trying to use the same memory address – which would be like all of the dogs in the neighborhood rushing into the same yard to brawl. It would be chaos, and everything would stop working because no container would be able to access memory reliably.

So, the operating system sends SIGSEGV in an effort to prevent a much bigger problem.

How to Identify Exit Code 139

You can detect whether a container stopped due to exit code 139 by running the command docker ps -a. The output will look something like this (the container ID, image, and name below are placeholders; yours will differ):
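CONTAINER ID   IMAGE           COMMAND          CREATED          STATUS                        PORTS     NAMES
8f5a1c2d3e4b   my-app:latest   "/app/server"    10 minutes ago   Exited (139) 5 minutes ago              my-app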

As you can see, the STATUS column tells us the container exited with code 139. 

Exit Code 139 vs. Exit Code 143

Exit events involving code 139 are similar to exit code 143 errors in that both types of events typically cause a container to shut down. However, the underlying reason for the shutdown is different.

With exit code 143, your container shuts down because it receives the SIGTERM signal. SIGTERM is a signal the operating system can use to request that a process shut down (although unlike SIGSEGV, it doesn't force the process to shut down). SIGTERM events may take place because a container is moving to a different node, or because a node is running out of resources and a low-priority container needs to shut down so that other containers won't crash.

This means that, on the whole, code 139 may signal a worse problem than exit code 143, such as issues with your application code. In many cases, exit code 143 doesn't result from a problem at all; it's an event that occurs normally during the process of scheduling and rescheduling containers. But memory access issues don't occur normally, so code 139 is especially important to troubleshoot.

Common Causes of Exit Code 139

Although the underlying cause of exit code 139 – a segmentation error – is always the same, there are several specific conditions that could trigger this type of event. Here's a look at the most common.

Error code 139 causes and fixes 

Problem | Description | How to fix
Incompatible libraries | The application is using a different version of a library than intended. | Update the configuration to point to the right library version.
Coding errors | Programming errors cause the application to reference memory incorrectly. | Debug the application to find the faulty code, then update it.
Hardware issues | Incompatibility between memory subsystems and an application. | Check physical memory for problems. Migrate to a different server if necessary.

Incompatible Libraries

Applications often depend on libraries, which are collections of prewritten code that applications "borrow" when they execute.

If the application in a container is configured to use a library that is incompatible with it, you might experience segmentation fault issues and see code 139 because the library the application is using doesn't manage memory in the same way as the application. This type of conflict happens most often if you update an application but forget to update the version of the library associated with it.

Coding Errors

A programming error that affects application source code is another common cause of exit code 139. If your application code includes instructions to write to memory that the application can't access, the operating system will react to the event with a SIGSEGV signal.

Coding errors are especially common for applications written in languages that lack built-in memory protection, like C. These languages don't have a native mechanism for preventing programs from trying to access memory that they are not supposed to access; instead, it's up to developers to ensure that they don't make that kind of error when writing code.
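As a simple illustration (a made-up example, not code from any real application), the following C program writes through a NULL pointer. On Linux, the kernel answers the invalid access with SIGSEGV, and if this were the main process of a container, the container would exit with code 139:

#include <stdio.h>

int main(void) {
    int *values = NULL;            /* pointer never receives a valid allocation */
    values[0] = 7;                 /* write through NULL triggers SIGSEGV */
    printf("%d\n", values[0]);     /* never reached */
    return 0;
}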

Hardware Issues

Incompatibility between memory subsystems and the applications you are running could trigger code 139. This type of issue is rare when you're running applications on modern, x86-based servers, but you may encounter it if you’re using specialized hardware to host your applications, or if your physical memory is faulty.

Typically, when hardware issues are the root cause of code 139, you'll see the error across multiple applications and containers rather than just one. So, if you're seeing repeated segmentation error events affecting different workloads, your hardware may be the culprit.

How to Fix Exit Code 139

To fix code 139 exits, you must first determine what the root cause of the error is. You can follow these steps to work through the various possibilities and gain more information.

Step 1: Check the Kernel Logs

Segmentation violation events are typically recorded in the /var/log/messages file of the node's operating system. Thus, when you see code 139, log in to the node that was hosting the failed container and check its /var/log/messages file or the equivalent. (Note that on some Linux distributions, including modern versions of Ubuntu, this file is located at /var/log/syslog, not /var/log/messages. Note also that viewing this file may require root access.)

You should see information about an event that corresponds to the segmentation violation. The exact wording varies by distribution and kernel version, but the entry usually looks something like this (the process name, PID, and memory addresses below are hypothetical):
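kernel: my-app[2317]: segfault at 0 ip 0000564d2f9a4120 sp 00007ffd8a3b2e60 error 6 in my-app[564d2f9a3000+2000]

You can filter for these entries with a command such as grep -i segfault /var/log/messages (or dmesg | grep -i segfault, which reads the kernel ring buffer directly).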

Unfortunately, this information will rarely tell you exactly why the segmentation violation happened, but it does specify the process associated with it. In addition, the log may include data about other segfault events that could provide clues to help you troubleshoot; for example, if many processes are experiencing segmentation fault issues, you may have a hardware incompatibility problem.

Step 2: Debug the Application

If only one specific container experiences segmentation fault issues, you most likely have a problem with the code in the container's image. In that case, you can take steps to debug the application and confirm that buggy code is the issue.

Details on how to debug applications for segfault issues are beyond the scope of this article, but suffice it to say that in general, you'd load the application's binary file into a debugging tool (like GNU Debugger), then perform a backtrace. The backtrace shows information about what happens within the application leading up to the segmentation error. If you can identify a specific function call linked to the segfault, you can then use that insight to figure out which part of your application code you need to fix to prevent recurring code 139 events.
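As a rough sketch, a debugging session with GNU Debugger might look like the following. The binary and core file paths are hypothetical, and your node may store core dumps elsewhere (or route them through a tool such as coredumpctl):

gdb /path/to/my-app /path/to/core
(gdb) bt

The bt (backtrace) command prints the chain of function calls that was active when the program crashed, with the faulting frame at the top. A (hypothetical) line such as "#0  0x0000564d2f9a4120 in process_request (req=0x0) at server.c:42" would point you straight at the function and source line to inspect.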

Step 3: Inspect Memory

Examining the way your system is using memory may also provide clues about exit code 139 causes and what is triggering a memory violation. On Linux, you can run a command such as free -m to view information about how the system is using memory in general. You can also run top to monitor memory usage by individual processes.

Also consider using a tool to run a check of your physical memory hardware in order to rule out hardware failure as a cause of strange memory violation behavior.
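One option among several is the memtester utility, which repeatedly writes and verifies test patterns in a block of RAM. For example, the following command tests 1024 MB of memory for one pass (run it on the node itself, not inside a container, and pick a size that fits within the host's free memory):

sudo memtester 1024M 1

For a more thorough check, bootable tools such as Memtest86+ can test memory outside the running operating system.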

Step 4: Add Code to Handle Segfaults Gracefully

If you still can't figure out why your container keeps exiting with error code 139, it may be possible to modify your application such that it will recover from SIGSEGV signals gracefully, without crashing.

Here again, this is a complex topic that requires a lot of knowledge about the programming language you are working with. But as an example, we'll point to this great code from Morten Piibeleht, which shows how a C program could handle some types of SIGSEGV signals (specifically, those triggered by an invalid pointer) gracefully:
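That code isn't reproduced here, but a minimal sketch of the same general idea – install a SIGSEGV handler and use sigsetjmp/siglongjmp to jump back to a known-safe point instead of crashing – might look like the following. This is an illustrative example rather than Piibeleht's original code, and recovering this way is only reasonable for narrow, well-understood failure cases:

#include <setjmp.h>
#include <signal.h>
#include <stdio.h>

static sigjmp_buf recovery_point;

/* When SIGSEGV arrives, jump back to the recovery point instead of letting the process die. */
static void handle_sigsegv(int sig) {
    (void)sig;
    siglongjmp(recovery_point, 1);
}

int main(void) {
    struct sigaction sa;
    sa.sa_handler = handle_sigsegv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGSEGV, &sa, NULL);

    /* sigsetjmp with a nonzero second argument saves the signal mask,
       so SIGSEGV is unblocked again when siglongjmp returns here. */
    if (sigsetjmp(recovery_point, 1) == 0) {
        int *bad_pointer = NULL;
        *bad_pointer = 42;   /* invalid write: without the handler, this ends the process with code 139 */
    } else {
        puts("recovered from segmentation fault, continuing");
    }
    return 0;
}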

If your application is written in C, adding this kind of handling and rebuilding the application should allow it to handle some segfault events gracefully instead of crashing.

This isn't a fix for exit code 139 as much as it's a Band-Aid. But if you've tried everything else and you just want your applications not to crash, this approach is worth a try.

Key Takeaways

To sum up, exit code 139 happens when a container receives the SIGSEGV signal, which instructs it to shut down due to a memory violation issue. The signal exists to prevent a process from destabilizing an entire server.

The specific causes of exit code 139 can be difficult to track down because the operating system doesn't usually produce much information about why a SIGSEGV signal was sent. But you can troubleshoot effectively by looking at operating system logs to determine whether this type of problem is associated with just one container or process (in which case it's likely triggered by buggy code), or is happening across the server (which means you more likely have a hardware issue).

If you do suspect buggy code, debugging the application may help you to pinpoint where you need to update your code. And if all else fails, it may be possible to build logic into your app that lets it handle SIGSEGV events gracefully.

FAQs

Here are answers to common questions about CrashLoopBackOff, the state a Pod enters when its containers repeatedly crash (for example, with exit code 139) and restart.

How do I delete a CrashLoopBackOff Pod?

To delete a Pod that is stuck in a CrashLoopBackOff, run:

kubectl delete pods pod-name

If the Pod won't delete – which can happen for various reasons, such as the Pod being bound to a persistent storage volume – you can run this command with the --force flag to force deletion. This tells Kubernetes to remove the Pod immediately instead of waiting for graceful termination.
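For example, the commonly documented combination for force-deleting a stuck Pod (pod-name is a placeholder for your Pod's actual name) is:

kubectl delete pods pod-name --force --grace-period=0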

How do I fix CrashLoopBackOff without logs?

If you don't have Pod or container logs, you can troubleshoot CrashLoopBackOff using the command:

kubectl describe pod pod-name

The output will include information that allows you to confirm that a CrashLoopBackOff error has occurred. In addition, the output may provide clues about why the error occurred – such as a failure to pull the container image or connect to a certain resource.

If you're still not sure what's causing the error, you can use other common troubleshooting methods – such as checking DNS settings and environment variables – to troubleshoot CrashLoopBackOff without having logs.

Once you determine the cause of the error, fix the underlying issue. For example, if you have a misconfigured file, simply update the file.

How do I fix CrashLoopBackOff containers with unready status?

If a container experiences a CrashLoopBackOff and is in the unready state, it means that it failed a readiness probe – a type of health check Kubernetes uses to determine whether a container is ready to receive traffic.

In some cases, the cause of this issue is simply that the health check is misconfigured, and Kubernetes therefore deems the container unready even if there is not actually a problem. To determine whether this might be the root cause of your issue, check which command (or commands) the readiness probe runs. This is defined in the container spec of the Pod's YAML file. Make sure the readiness checks are not attempting to connect to resources that don't actually exist.
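For reference, a typical HTTP readiness probe in a container spec looks something like this (the path and port here are hypothetical; point them at an endpoint your container actually serves):

readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10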

If your readiness probe is properly configured, you can investigate further by running:

kubectl get events

This will show events related to the Pod, including information about changes to its status. You can use this data to figure out how far the Pod progressed before getting stuck in the unready status. For example, if its container images were pulled successfully, you'll see that.

You can also run the following command to get further information about the Pod's configuration:

kubectl describe pod pod-name

Checking the Pod's logs may also provide insight into why it's unready.

For further guidance, check out our guide to Kubernetes readiness probes.
