Kubernetes Observability

Datadog Pricing Explained: Why Your Bill Keeps Growing

Chris Churilo
April 18, 2026
7 min read

Datadog is one of the most powerful observability platforms available. It is also one of the most expensive, and one of the hardest to forecast. If you have submitted a budget forecast and seen the actual invoice come in 30–40% higher, you are not misconfiguring anything. The pricing model is genuinely structured in a way that makes total cost of ownership difficult to calculate upfront.

This guide breaks down how Datadog charges, which combinations drive the biggest surprises, and why the bill tends to grow faster than your infrastructure does.

The structure: 20+ products, each billed separately

Datadog offers more than 20 separately priced products. Infrastructure monitoring, APM, log management, database monitoring, LLM observability, security, real user monitoring, CI visibility, AI SRE investigations. Each requires a separate purchase and introduces its own usage-based charges.

Within each product, you pay based on usage volume: per host, per GB ingested, per million log events indexed, per million LLM requests monitored. The bill is the sum of all these dimensions, and they interact with each other in ways that are difficult to model before you have real production data.

According to Datadog's own investor presentations, roughly 50% of customers use fewer than a third of available products. That is not a product quality problem. It is a pricing consequence. When activating a new product means a new budget line and a procurement conversation, teams rationally default to using only what they have already paid for.

"Our Datadog bill is a threat, an ever-growing line item that threatens to consume what remains of our cloud spend budget."

From a real engineering team's internal job description for a "Datadog Whisperer": a role created specifically to reduce Datadog spend.

The five billing dimensions that compound

Understanding the bill means understanding how each major product category charges, and how those charges stack on top of each other.

1. Infrastructure: the base that everything else requires

Infrastructure monitoring charges per host per month: $15 on Pro, $23 on Enterprise. For a 500-node Kubernetes cluster on Enterprise, that is $11,500/month before any logs or traces are sent. Every other Datadog product (APM, database monitoring, network monitoring) requires a compatible infrastructure tier on those same hosts, so the infrastructure bill is a multiplier, not just a line item.

Custom metrics are allotted per host (100 on Pro, 200 on Enterprise). Beyond the allotment, the charge is $1 per 100 additional custom metrics per month, averaged across your entire account. A single service emitting high-cardinality metrics (tagged by user ID, request ID, or pod name) can push your whole account over the allotment without any deliberate change to instrumentation.
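The account-wide averaging is what makes this dimension sneaky, and it is easy to sketch. The following is a hypothetical illustration using the rates above (Enterprise: 200 custom metrics allotted per host, $1 per 100 excess metrics per month); the function name and scenario are invented for the example.

```python
def custom_metric_overage(hosts, avg_metrics_per_host,
                          allotment_per_host=200, overage_rate=1.00):
    """Monthly custom-metric overage in dollars, averaged account-wide."""
    allotted = hosts * allotment_per_host
    total = hosts * avg_metrics_per_host
    excess = max(0, total - allotted)
    return (excess / 100) * overage_rate   # $1 per 100 metrics over the allotment

# One service tagging by pod name pushes a 500-host Enterprise account
# from a safe 150 metrics/host average up to 350 metrics/host:
print(custom_metric_overage(500, 150))   # 0.0 -- under the allotment
print(custom_metric_overage(500, 350))   # 750.0 -- 75,000 excess metrics
```

Note that the charge appears without any contract change: one team's tagging decision moves the whole account's average.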

2. APM: per host, on top of infrastructure, with ingestion limits

APM costs $31–$40 per host per month in addition to infrastructure (or $36–$47 standalone). Each APM host includes an allotment of 150 GB of ingested spans and 1 million indexed spans per month. Overages are charged at $0.10/GB for additional span ingestion and $1.27–$2.50 per million additional indexed spans, depending on the retention period.

In practice, most teams running Datadog APM enable trace sampling at 25–50% to stay within their ingestion allotment. The consequence is that during an incident, when you need complete trace coverage most, you are working from an incomplete picture.
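Putting the APM numbers together, a rough monthly model looks like the sketch below. It assumes the Pro host rate ($31) and a mid-range indexed-span overage ($1.70/million, within the $1.27–$2.50 band in the article); the function and scenario are illustrative, not Datadog's official calculator.

```python
def apm_monthly_cost(hosts, ingested_span_gb, indexed_spans_millions,
                     host_rate=31.0, ingest_overage=0.10, index_overage=1.70):
    """Estimate the monthly APM bill (on top of infrastructure)."""
    base = hosts * host_rate
    ingest_allotment = hosts * 150   # 150 GB of ingested spans per host
    index_allotment = hosts * 1.0    # 1M indexed spans per host
    extra_gb = max(0.0, ingested_span_gb - ingest_allotment)
    extra_spans = max(0.0, indexed_spans_millions - index_allotment)
    return base + extra_gb * ingest_overage + extra_spans * index_overage

# 100 hosts, 20 TB of spans ingested, 150M spans indexed:
# base $3,100 + 5,000 GB over x $0.10 + 50M spans over x $1.70
print(apm_monthly_cost(100, 20_000, 150))   # 3685.0
```

The overage terms are why sampling is so common: keeping span volume under `hosts × 150 GB` zeroes out the most variable part of the bill.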

3. Logs: ingestion, indexing, and retention are three separate charges

Log billing is where the largest surprises occur, because it layers three distinct cost dimensions:

  • Ingestion: $0.10/GB of uncompressed data received by Datadog, regardless of whether those logs are indexed or immediately discarded. If 85% of your logs are filtered out after ingestion, you still pay for 100% of the volume that arrived.
  • Standard Indexing: $1.70 per million log events, for logs that are searchable and can trigger monitors. Default retention is 15 days. Extending to 30 days requires a plan change; anything longer means a sales conversation.
  • Flex Logs: A cheaper storage tier at $0.05/million events with retention up to 15 months. But Flex Logs removes support for monitors and Watchdog Insights. Teams that route logs to Flex to save money lose the ability to alert on those logs in real time, creating a forced choice between cost and coverage.

There is also a rehydration cost that catches teams off guard: when archived logs are pulled back into Datadog for analysis, the charge is $0.10 per compressed GB scanned, not per GB retrieved. If you need to find a specific event in a large archive, you pay for the full scan even if you retrieve only a few lines. This cost arrives at exactly the moment of an incident, when you least want a surprise.
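The three log charges can be sketched as independent terms. The events-per-GB figure below is an assumption for illustration (log event sizes vary widely); the rates are the ones quoted above.

```python
def monthly_log_bill(gb_per_day, indexed_fraction, events_per_gb_millions,
                     days=30, ingest_rate=0.10, index_rate=1.70):
    """Return (ingestion, indexing) charges in dollars per month."""
    gb = gb_per_day * days
    ingestion = gb * ingest_rate                       # 100% of volume, kept or discarded
    indexed_events = gb * indexed_fraction * events_per_gb_millions
    indexing = indexed_events * index_rate             # $1.70 per million indexed events
    return ingestion, indexing

def rehydration_cost(compressed_gb_scanned, rate=0.10):
    """Charged per compressed GB scanned, not per GB actually retrieved."""
    return compressed_gb_scanned * rate

# 7 TB/day, 15% indexed, assuming ~0.5M events per GB (assumption):
ingest, index = monthly_log_bill(7_000, 0.15, 0.5)
print(round(ingest))                  # 21000 -- paid on everything that arrives
print(rehydration_cost(500))          # 50.0 to scan a 500 GB compressed archive
```

The key property: `ingestion` does not depend on `indexed_fraction` at all. Filtering after ingestion reduces only the second term.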

4. The on-demand surcharge

Every Datadog product carries an on-demand rate approximately 50% higher than the annual committed price. Any usage above your committed volume (an incident-driven log spike, a traffic surge, a deployment that briefly inflates host count) is billed at the on-demand rate automatically. The annual committed rate is a floor, not a ceiling.
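The "floor, not a ceiling" behavior is simple to model. This sketch assumes the ~50% surcharge quoted above applies uniformly; the function is illustrative.

```python
def billed_amount(committed_units, actual_units, committed_rate,
                  surcharge=0.50):
    """Committed volume bills at the annual rate; any excess bills on-demand.
    Usage below the commitment still pays the full committed amount."""
    on_demand_rate = committed_rate * (1 + surcharge)
    overage = max(0, actual_units - committed_units)
    return committed_units * committed_rate + overage * on_demand_rate

# Commit to 1,000 GB/month of log ingestion at $0.10/GB, then an incident
# spikes usage to 1,400 GB: $100 committed + 400 GB x $0.15 on-demand ~= $160
print(billed_amount(1_000, 1_400, 0.10))
# Under-use still pays the floor: 600 GB of actual usage bills at $100
print(billed_amount(1_000, 600, 0.10))
```

The asymmetry is the point: undershooting the commitment saves nothing, while overshooting is billed at a premium.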

5. Each new product adds a new unpredictable variable

The product catalog keeps expanding. LLM Observability is now priced at $8 per 10,000 monitored LLM requests per month. The newest AI product, Bits AI SRE Investigations, runs $500/month per 20 investigations, a standalone charge with no bundling into existing contracts. Every new capability that teams want to adopt requires a separate evaluation of what it will cost at their scale.

What the log bill looks like in practice

Setup: 7 TB/day of logs. Only 15% indexed by Datadog.
 
What Datadog charges:
  Ingestion fee (100% of data, even what's discarded):
    7,000 GB/day × 30 days × $0.10/GB  =  $21,000/month
 
  Indexing fee (the 15% you actually keep and search):
    charged per million log events at $1.70/million
    15-day default retention; 30+ days requires a sales call
 
  Result: you pay to ingest logs you never see,
          then pay again to index the ones you keep.
 
Same data stored in your own S3 (via groundcover BYOC):
  210,000 GB × $0.023/GB × ~10% compression  =  ~$483/month
  100% of logs retained. Always queryable. No rehydration.
 
From the original notes of a real groundcover customer: $17,850/month was being spent ingesting logs that went straight to the trash.
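The arithmetic above is easy to verify. The S3 standard storage rate ($0.023/GB-month) and ~10:1 compression ratio are taken from the example; treat both as assumptions that vary by region and log shape.

```python
gb_per_month = 7_000 * 30                   # 7 TB/day of logs

datadog_ingest = gb_per_month * 0.10        # $0.10/GB ingested, kept or discarded
s3_storage = gb_per_month * 0.023 * 0.10    # assumed S3 rate x ~10:1 compression

print(f"Datadog ingestion: ${datadog_ingest:,.0f}/mo")   # ~$21,000
print(f"Own S3 (BYOC):     ${s3_storage:,.0f}/mo")       # ~$483
```

The ratio holds at any volume, because one side is priced per GB that moves and the other per GB that sits compressed in object storage.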

Why the total is hard to predict

None of the individual rates are secret. They are listed on the pricing page. The unpredictability comes from the fact that the inputs driving each dimension change independently, and not always in ways that are visible before the invoice arrives.

The variables that make Datadog bills hard to forecast:

→ Host count fluctuates daily with Kubernetes auto-scaling

→ Log volume spikes during incidents, precisely when you need it most

→ Custom metric cardinality grows silently when developers add tags

→ APM span volume scales with traffic; overages hit at $0.10/GB

→ LLM request volume grows with every new AI feature shipped

→ On-demand surcharge (~50%) applies automatically to any excess

The compounding effect is the core problem. Teams size their contracts based on current usage. Six months later, infrastructure has grown, a high-cardinality service has been added, an AI feature has shipped, and the bill reflects the product of all those changes multiplied across every dimension simultaneously.
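One way to see the compounding is to forecast each dimension with its own growth rate and sum them. The dollar figures and growth rates below are invented for illustration; the point is the structure, not the numbers.

```python
def forecast_bill(dimensions, months=6):
    """dimensions: {name: (current_monthly_cost, assumed_monthly_growth)}.
    Each dimension compounds independently; the bill is their sum."""
    return sum(cost * (1 + growth) ** months
               for cost, growth in dimensions.values())

current = {
    "infrastructure": (11_500, 0.03),   # auto-scaling drift
    "apm":            (15_500, 0.04),   # traffic growth
    "log_ingestion":  (21_000, 0.05),   # new services, chattier logs
    "custom_metrics": (   750, 0.10),   # cardinality creep
}

base = forecast_bill(current, months=0)
later = forecast_bill(current, months=6)
print(f"growth over 6 months: {later / base - 1:.0%}")
```

A contract sized to `base` meets an invoice shaped like `later`, and no single dimension looks alarming on its own.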

What teams typically do about it

There are three patterns that emerge when teams try to control a growing Datadog bill, and each one trades observability coverage for cost.

The first is filtering and sampling: drop logs below a severity threshold before they reach Datadog, sample traces to 25–50%, and enforce cardinality limits on metric tags. This works, but it means you may not have the log line or trace that explains the next incident.
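Conceptually, the filtering-and-sampling pattern reduces to two gates applied before telemetry leaves your infrastructure. This is a minimal sketch of the idea, not any vendor's pipeline configuration; the level set and sample rate are the commonly chosen values mentioned above.

```python
import random

KEEP_LEVELS = {"error", "warn"}   # drop info/debug before they are ingested
TRACE_SAMPLE_RATE = 0.25          # keep 1 in 4 traces (the 25-50% pattern)

def should_ship_log(record):
    """Severity gate: only warn and above ever reach the vendor."""
    return record.get("level") in KEEP_LEVELS

def should_ship_trace():
    """Head-based sampling: a coin flip per trace, blind to its content."""
    return random.random() < TRACE_SAMPLE_RATE

print(should_ship_log({"level": "error", "msg": "db timeout"}))   # True
print(should_ship_log({"level": "debug", "msg": "cache hit"}))    # False
```

The weakness is visible in the code: both gates decide before knowing whether the dropped record would have mattered in the next incident.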

The second is selective activation: only use the Datadog products already under contract and resist expanding to others. Route lower-priority logs to Flex to reduce indexing costs, accepting that you can no longer alert on them in real time. This is now the modal approach for mature Datadog customers, and it means deliberately using a fraction of the platform you are paying for.

The third is rearchitecting around the data storage model entirely. The root cause of the bill (and the reason the two approaches above are necessary) is that Datadog stores your data in their cloud and charges you a margin on every GB that flows through. Teams evaluating alternatives are increasingly looking at architectures where observability data stays in their own cloud account, eliminating the per-data-unit charge at the source.

A different pricing model

The structural reason Datadog's bill is hard to control is that the business model is built on data volume. The more telemetry you send, the more Datadog charges. That creates a direct misalignment: the platform is most valuable when you send everything, but the pricing penalizes you for doing so.

groundcover is built on a Bring Your Own Cloud (BYOC) architecture that inverts this. Because observability data lives in the customer's own S3, not in groundcover's cloud, there is no per-GB, per-event, or per-request charge. groundcover charges a flat per-node rate that covers the full platform: APM, logs, metrics, distributed traces, LLM observability, and every feature released going forward.

What changes when there is no per-byte charge:

→ 100% of customers use APM (vs. ~25% on Datadog)

→ Log retention defaults to 60–365 days, no sales conversation required

→ All logs are always queryable (no Flex trade-offs, no rehydration costs)

→ Trace sampling is off by default, full production coverage

→ LLM observability and AI features are included, no per-request billing

→ Teams send 5–10x more telemetry data than they did on Datadog

The pricing model determines how much visibility a team actually has into their systems, not just what they could have in theory. A model that charges per data unit creates a structural incentive to send less data. One that charges per node creates an incentive to instrument everything.

Side-by-side: where the cost structure differs

The table below covers the dimensions most likely to produce a surprise on a Datadog invoice. For a full feature and pricing comparison, see the groundcover vs. Datadog guide.

| Dimension | Datadog | groundcover |
| --- | --- | --- |
| Log ingestion | $0.10/GB, charged even for logs you discard | Stored in your S3; no per-GB ingestion charge |
| Log retention | 15 days default; 30+ days = sales call; monitors lost on Flex | 60–365 days, fully queryable, no extra cost |
| Log rehydration | $0.10/compressed GB scanned; pay for the search, not just the results | No rehydration; logs are always live |
| APM / traces | $31–$40/host/month; overage charged per GB and per million spans | Included; all hosts; no sampling required |
| Custom metrics | 100–200/host allotted; overages at $1/100; spikes silently | Included; no cardinality budget |
| On-demand surcharge | ~50% premium over annual rate on every product, automatically | No on-demand penalty |
| Feature access | 20+ products, each contracted and billed separately | One SKU, all features included |
| Where data lives | Datadog's cloud; you pay their margin on every GB retained | Your cloud; you pay S3 rates directly |

To put this in concrete terms: one team running ~700 Kubernetes nodes with 5 TB/day of logs and 500K custom metrics was paying $2.54M/year on Datadog. The same setup on groundcover cost $297K/year, an 87% reduction, with full APM enabled and no trace sampling.

Next steps

If you want to benchmark your current Datadog spend, the groundcover vs. Datadog comparison guide covers a full feature-by-feature and cost breakdown. If you are actively evaluating a switch, the migration guide walks through data parity, automated dashboard migration, and running both platforms in parallel before committing.

groundcover offers a free trial with unlimited seats, all features, and full BYOC, so you can run it against production data alongside your existing setup.
