Node Disk Pressure in Kubernetes: Causes, Detection, and Fixes
Key Takeaways
- Node disk pressure happens when a Kubernetes node runs low on disk space, triggering pod evictions and stopping new workloads from being scheduled, which can quickly destabilize a cluster.
- It’s mainly caused by buildup from container images, logs, and ephemeral storage, especially in high-churn or data-heavy environments like CI pipelines and microservices.
- Kubernetes detects this through kubelet thresholds on disk space and inodes, marking nodes with DiskPressure=True and automatically reclaiming space by evicting lower-priority pods first.
- Diagnosing the issue requires combining Kubernetes signals (node conditions) with actual node-level inspection to find what’s consuming storage (e.g., logs, images, or volumes).
- Preventing disk pressure depends on proactive controls like log rotation, image cleanup, storage limits, and monitoring—so issues are caught early before they impact reliability.
As Kubernetes scales to meet the demands of AI and distributed microservices, the volume of logs, container images, and ephemeral data has reached an all-time high. According to the 2025 CNCF Annual Survey, with 82% of organizations now running Kubernetes in production, effective storage management is no longer optional - it is a prerequisite for cluster stability. Node disk pressure is a critical signal that a node’s storage is exhausted, a condition that, if ignored, triggers aggressive pod evictions and halts new scheduling.
When a Kubernetes node runs low on available disk space, the kubelet marks the node with the `DiskPressure` condition. This state signals that the node cannot safely accept additional workloads because disk usage has exceeded safe thresholds. Left unresolved, disk pressure can lead to pod evictions, degraded application performance, and even cluster instability.
Understanding what node disk pressure is, how Kubernetes detects it, and how to resolve it is essential for keeping production clusters stable. This guide walks through the causes, how detection works, and what effective prevention looks like in practice.
What Is Node Disk Pressure in Kubernetes
Node disk pressure is a node condition in Kubernetes indicating that the node is running out of disk space or ephemeral storage resources. Kubernetes monitors disk usage through the kubelet, which continuously evaluates disk consumption for container images, log files, and ephemeral storage. When disk usage exceeds configured thresholds, Kubernetes sets the node condition `DiskPressure=True`.
What This Means in Practice
- The node has insufficient free disk space
- Kubernetes may evict pods to reclaim storage
- The scheduler may stop placing new pods on the node
- Cluster stability may degrade if disk pressure continues
Resources Contributing to Disk Pressure
- Container image layers
- Container log files
- Ephemeral volumes
- EmptyDir volumes
- Image cache
- Temporary files from workloads
Disk pressure is particularly common in clusters running high-volume logging, CI pipelines, data processing workloads, or microservices architectures.
How Kubernetes Detects and Signals Node Disk Pressure
Kubernetes detects disk pressure through kubelet eviction signals and resource thresholds. The kubelet monitors several filesystem metrics, including:
- Node filesystem usage
- Image filesystem usage
- Available ephemeral storage
- Disk inode availability
When these values fall below their configured thresholds, the kubelet marks the node with `DiskPressure=True`.
Disk Pressure Signals Monitored by kubelet
- `nodefs.available`: available space on the node's main filesystem, which holds logs and ephemeral storage
- `nodefs.inodesFree`: free inodes on the node filesystem
- `imagefs.available`: available space on the filesystem holding container images and writable layers
- `imagefs.inodesFree`: free inodes on the image filesystem
Example: Viewing Node Conditions
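One way to view the condition is to describe the node; the node name below is illustrative:

```
$ kubectl describe node worker-2
...
Conditions:
  Type             Status  Reason                      Message
  ----             ------  ------                      -------
  MemoryPressure   False   KubeletHasSufficientMemory  kubelet has sufficient memory available
  DiskPressure     True    KubeletHasDiskPressure      kubelet has disk pressure
  PIDPressure      False   KubeletHasSufficientPID     kubelet has sufficient PID available
  Ready            True    KubeletReady                kubelet is posting ready status
```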
When disk pressure occurs, Kubernetes activates eviction policies to reclaim space.
Symptoms and Impact of Node Disk Pressure on Cluster Workloads
Node disk pressure creates several operational issues in Kubernetes clusters.
Common Symptoms
- Pods entering Evicted status
- New pods failing to schedule
- Containers restarting repeatedly
- Increased application latency
- Log ingestion failures
- Node instability
Impact on Cluster Behavior
- Pod Evictions: Kubernetes removes pods consuming ephemeral storage.
- Scheduling Restrictions: The scheduler avoids nodes under disk pressure.
- Reduced Reliability: Critical workloads may fail to start.
- Observability Gap: Log pipelines may fail if log files consume disk space.
Without proactive monitoring, disk pressure can escalate quickly in large clusters.
Common Causes of Node Disk Pressure in Kubernetes Nodes
Disk pressure typically results from uncontrolled growth of container artifacts, logs, or ephemeral storage.
Understanding these root causes helps teams design effective remediation strategies.
How Node Disk Pressure Triggers Pod Evictions and Scheduling Failures
When disk pressure occurs, Kubernetes activates the eviction manager. The eviction manager attempts to reclaim disk space by removing pods based on QoS class and resource usage.
Pod Eviction Priority
Pods are evicted in the following order:
- BestEffort pods
- Burstable pods
- Guaranteed pods (last)
Pods consuming high ephemeral storage are evicted first.
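Evicted pods remain visible in the API with phase `Failed`, so a quick way to find them is a field selector (output will vary by cluster):

```
kubectl get pods --all-namespaces --field-selector=status.phase=Failed
```

Describing one of the returned pods shows the eviction reason and which resource was reclaimed.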
Scheduler Behavior
When DiskPressure=True:
- The scheduler avoids the node
- New pods are scheduled on other nodes
- If no nodes are available, pods remain Pending
Example event message:
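The exact wording varies by Kubernetes version, but the eviction and scheduling events typically look along these lines (pod names and counts are illustrative):

```
Warning  Evicted            pod/api-7d9f8   The node was low on resource: ephemeral-storage.
Warning  FailedScheduling   pod/api-7d9f8   0/3 nodes are available: 3 node(s) had untolerated taint {node.kubernetes.io/disk-pressure: }.
```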
How to Diagnose Node Disk Pressure in Kubernetes Clusters
Diagnosing node disk pressure means looking at two things together: the node conditions Kubernetes reports, and what is actually consuming storage on the affected nodes. Disk pressure occurs when a node exhausts its available disk space or ephemeral storage, so the task is to identify which nodes are affected, how their storage is being used, and which workloads or leftover artifacts are consuming the space.
In practice, this follows a logical order: first check what Kubernetes reports at the node-condition level, then inspect the node's filesystem directly for real usage numbers, and finally narrow the search to the specific culprits, whether bloated container images, unrotated log files, or volumes that have grown far beyond expectations.
Checking Node Disk Pressure Status with Kubectl
The most direct way to detect node disk pressure is by checking node conditions using kubectl. Kubernetes nodes report conditions like DiskPressure, MemoryPressure, and PIDPressure, which are updated by the kubelet. When disk usage exceeds configured thresholds, the kubelet sets DiskPressure=True, indicating that the node is running low on available disk space.
Use the following command to list nodes and confirm their status:
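For example (node names and versions are illustrative; note that a node can still report `Ready` while under disk pressure, so the condition itself must be checked):

```
$ kubectl get nodes
NAME       STATUS   ROLES           AGE   VERSION
master-1   Ready    control-plane   90d   v1.29.2
worker-1   Ready    <none>          90d   v1.29.2
worker-2   Ready    <none>          90d   v1.29.2
```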
Detailed inspection:
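Describing a node prints its full condition list and recent events (node name illustrative):

```
kubectl describe node worker-2
```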
Look for `DiskPressure=True` in the `Conditions` section, along with eviction-related events such as `EvictionThresholdMet`.
You can also inspect node conditions via JSON:
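A sketch using kubectl's `jsonpath` output to print each node's `DiskPressure` status in one pass:

```
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="DiskPressure")].status}{"\n"}{end}'
```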
Inspecting Disk Usage on Affected Nodes
After identifying a node under disk pressure, the next step is to inspect the node’s disk usage to determine where space is being consumed. Kubernetes workloads store container images, logs, and ephemeral data on the node filesystem, so reviewing filesystem utilization helps identify which directories or partitions are nearing capacity and contributing to the reduced free disk space.
SSH into the affected node and check filesystem usage.
Example output:
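For instance (device names and sizes are illustrative), pay particular attention to the partitions backing `/var/lib/containerd` and `/var/lib/kubelet`:

```
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme0n1p1  100G   92G  8.0G  92% /
/dev/nvme1n1    200G  178G   22G  90% /var/lib/containerd
```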
Check inode usage:
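Inode exhaustion can trigger disk pressure even when free space remains (figures illustrative):

```
$ df -i
Filesystem      Inodes   IUsed   IFree IUse% Mounted on
/dev/nvme0n1p1 6553600 6290800  262800   96% /
```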
High inode usage can also trigger disk pressure.
Identifying Images, Logs, and Volumes Causing Node Disk Pressure
The next step is identifying the workloads or artifacts causing the disk pressure. Common contributors include accumulated container images, large or unrotated log files, and excessive ephemeral storage used by pods. Workloads using emptyDir volumes or generating large temporary files can also consume significant disk space, making it important to pinpoint the exact source of the storage usage.
Identify Large Container Images
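On containerd or CRI-O nodes, `crictl` lists the cached images; on Docker-based nodes, `docker system df` summarizes image disk usage:

```
crictl images
# or, on Docker-based nodes:
docker system df
```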
Check Container Logs
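A sketch for finding the largest log consumers on the node (the paths are the kubelet defaults):

```
du -h --max-depth=1 /var/log/pods | sort -hr | head
# or check journal disk usage:
journalctl --disk-usage
```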
Check kubelet Directories
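Summarize space used under the kubelet's data directory (default path shown):

```
du -h --max-depth=1 /var/lib/kubelet | sort -hr | head
```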
Inspect Ephemeral Storage Usage
Look for oversized `emptyDir` directories, bloated writable container layers, and orphaned pod directories left behind by deleted workloads.
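A sketch for sizing `emptyDir` volumes directly on the node (the path follows the kubelet's default layout):

```
du -sh /var/lib/kubelet/pods/*/volumes/kubernetes.io~empty-dir/* 2>/dev/null | sort -hr | head
```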
How to Fix Node Disk Pressure in Kubernetes Environments
Fixing node disk pressure in Kubernetes environments involves freeing up disk space on affected nodes and addressing the sources of high disk usage. This can include removing unused container images, cleaning up log files, deleting temporary data from ephemeral storage, or expanding the node’s disk capacity to restore normal cluster operations.
Immediate Remediation Steps
- Remove unused container images
- Clean container logs
- Delete unused pods
- Restart kubelet if required
- Expand the node disk volume if the infrastructure allows
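On a node, the steps above might look like this (the log path placeholders are illustrative):

```
# Prune unused images via the container runtime
crictl rmi --prune

# Truncate an oversized container log
truncate -s 0 /var/log/pods/<namespace>_<pod>_<uid>/<container>/0.log

# Remove evicted (Failed) pods across all namespaces
kubectl delete pods --all-namespaces --field-selector=status.phase=Failed

# Restart the kubelet if the condition does not clear
systemctl restart kubelet
```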
Cluster-Level Solutions
- Increase node storage
- Use centralized log aggregation
- Reduce image size
- Implement automatic image garbage collection
How to Prevent Node Disk Pressure with Kubelet Configuration and Resource Limits
Preventing disk pressure requires proper kubelet configuration and resource limits.
Example kubelet configuration:
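A minimal sketch of eviction thresholds in a `KubeletConfiguration` (the hard thresholds shown are close to the kubelet defaults; tune them for your environment):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"
evictionSoft:
  nodefs.available: "15%"
evictionSoftGracePeriod:
  nodefs.available: "2m"
```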
This instructs kubelet to start evicting pods before disk space becomes critically low.
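To make the thresholds concrete, here is a simplified sketch of the comparison the kubelet performs (the real eviction logic also tracks grace periods and reclaim attempts):

```python
def disk_pressure(capacity_bytes: int, available_bytes: int, threshold: str) -> bool:
    """Return True if available space falls below an eviction threshold.

    Thresholds may be a percentage ("10%") or an absolute quantity in GiB
    ("5Gi"), mirroring the kubelet's evictionHard syntax (simplified).
    """
    if threshold.endswith("%"):
        minimum = capacity_bytes * float(threshold[:-1]) / 100
    elif threshold.endswith("Gi"):
        minimum = float(threshold[:-2]) * 1024**3
    else:
        raise ValueError(f"unsupported threshold: {threshold}")
    return available_bytes < minimum

GiB = 1024**3
# A 100 GiB disk with 8 GiB free breaches nodefs.available < 10%
print(disk_pressure(100 * GiB, 8 * GiB, "10%"))   # True
print(disk_pressure(100 * GiB, 12 * GiB, "10%"))  # False
```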
Configure Ephemeral Storage Limits
Example pod configuration:
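A minimal sketch (the image and sizes are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: nginx:1.27
      resources:
        requests:
          ephemeral-storage: "1Gi"
        limits:
          ephemeral-storage: "2Gi"
```

Exceeding the `ephemeral-storage` limit causes Kubernetes to evict that pod alone, rather than letting it push the whole node into disk pressure.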
This prevents individual pods from consuming excessive disk space.
Configure Log Rotation
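The kubelet rotates container logs based on two `KubeletConfiguration` settings; a sketch (the values shown are illustrative, the defaults are `10Mi` and `5`):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerLogMaxSize: 50Mi
containerLogMaxFiles: 3
```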
These settings prevent uncontrolled log file growth.
Best Practices for Avoiding Node Disk Pressure in Production Clusters
Maintaining stable storage availability on Kubernetes nodes requires proactive operational practices and well-defined resource management policies:
- Enforce ephemeral-storage requests and limits on all workloads
- Enable and tune kubelet image garbage collection
- Rotate container logs and ship them to centralized aggregation
- Keep container images small and prune unused images regularly
- Monitor disk and inode usage per node, alerting well below eviction thresholds
- Right-size node disks for the workloads they host
Applying consistent storage governance across workloads keeps nodes healthy and significantly reduces disk pressure incidents.
Real-Time Visibility into Node Disk Pressure with groundcover
Modern Kubernetes observability platforms help teams detect and troubleshoot node disk pressure before it impacts application performance or cluster stability. groundcover provides eBPF-based Kubernetes observability, combining kernel-level telemetry with Kubernetes metrics to give teams real-time visibility into node resources, container behavior, and storage usage across the cluster.
- Real-Time Disk Usage Monitoring Across Nodes and Containers: Platform teams can monitor node resource utilization and storage trends to identify disk pressure risks early using groundcover’s Kubernetes observability platform.
- Automatic Detection of Resource Anomalies Affecting Cluster Health: groundcover provides real-time insights and alerts that help teams detect abnormal storage usage patterns and infrastructure issues before they escalate.
- Deep Container Insights without Requiring Intrusive Instrumentation: Using eBPF-based telemetry, groundcover collects infrastructure and application signals directly from the kernel without requiring code instrumentation or sidecars.
- Faster Troubleshooting of Storage Bottlenecks: By correlating Kubernetes events, container metrics, and infrastructure telemetry, teams can quickly identify workloads responsible for abnormal disk consumption.
Because groundcover uses eBPF-based telemetry, it captures low-level system activity while maintaining minimal overhead on production clusters. This enables DevOps and platform teams to quickly determine:
- which pods are consuming excessive disk space
- which nodes are approaching disk pressure thresholds
- which workloads generate large volumes of logs or ephemeral storage
With this level of real-time visibility, teams can detect and resolve disk pressure issues earlier, improving the reliability and stability of Kubernetes workloads.
Conclusion
Node disk pressure is one of the most common operational issues affecting Kubernetes clusters. When nodes run low on disk space, Kubernetes triggers eviction policies and restricts scheduling to protect cluster stability. Understanding the causes, detection signals, and remediation strategies is essential for maintaining healthy Kubernetes environments.
By implementing proper kubelet configurations, ephemeral storage limits, log rotation policies, and observability tools, teams can proactively manage disk resources and prevent node disk pressure from disrupting workloads. Modern observability platforms such as groundcover provide the real-time insights required to detect disk pressure early and maintain reliable Kubernetes operations.