Bedrock AI pricing: $3.7M saved across two years.
A mid-cap life sciences company restructured its AWS Bedrock model commitments, Provisioned Throughput allocations, and inference overlay terms after a 9x increase in AI workload spend over six months. The result: a 41% net effective reduction across the full AI footprint.
Numbers that speak.
- $3.7M in two-year Bedrock savings: cumulative reduction against the pre-negotiation Bedrock and Provisioned Throughput run rate.
- 41% net effective discount: combined model commitment, PT overlay, and EDP service overlay.
- 9-week negotiation cycle: from AI usage baseline through signed Bedrock addendum and EDP amendment.
- 3x throughput protection: reserved inference capacity for the validated production models.
The starting position.
The customer had quietly become one of AWS Bedrock's larger life sciences accounts. Their molecular property prediction pipeline ran continuous inference against Claude and a fine-tuned Anthropic model, and a second drug-target identification system used Bedrock Knowledge Bases over a 14TB scientific literature corpus. In the six months between Q1 and Q3, Bedrock spend had climbed from a $42K monthly pilot to a $410K monthly production load. The original EDP — signed two years earlier when Bedrock was barely on AWS's price sheet — had no Bedrock overlay, no Provisioned Throughput pricing, and no Knowledge Bases discount structure.
The customer's CFO had flagged the run rate as unsustainable. Engineering pushed back: the inference workload was scientifically critical and could not be paused. The procurement team had asked AWS for a Bedrock-specific overlay and had received a stock answer: pay-as-you-go pricing was “the program standard” for generative AI services. That answer accurately describes the standard program, but in practice the terms are negotiable for accounts spending more than $200K monthly on Bedrock.
What the customer needed
- A Bedrock model-commitment structure that priced the actual inference profile
- Provisioned Throughput allocation for the validated production models, with overage protection
- A Knowledge Bases overlay that addressed the embeddings, storage, and retrieval cost split
- An EDP amendment to bring Bedrock spend under the existing enterprise discount structure
How we negotiated this.
Drawing on more than $2.4B of reviewed AWS spend and 500+ engagements, the team built a model-level inference forecast across all production and pre-production Bedrock workloads. The forecast distinguished between baseline traffic, peak research bursts, and experimental capacity, because each pattern prices differently under Provisioned Throughput.
Phase 1 — Inference baseline and benchmark (weeks 1-3)
The first three weeks built the inference baseline. We instrumented Bedrock usage at the model-version level — tokens in, tokens out, average concurrency, and p95 latency — for the previous 90 days. The baseline produced three usage profiles: continuous high-volume inference (the molecular prediction pipeline), bursty research traffic (the drug-target identification system), and experimental low-volume calls (the data science sandbox). Each profile maps to a different optimal commercial structure.
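The baseline described above can be sketched in a few lines of Python. This is a minimal illustration, not the tooling used in the engagement: the `Invocation` record, the `summarize` and `classify_profile` helpers, and the `low_volume` and `burst_ratio` thresholds are all hypothetical, and the records are assumed to have been exported already from whatever usage logging the account has in place.

```python
from dataclasses import dataclass
from statistics import mean, quantiles


@dataclass
class Invocation:
    model_version: str   # hypothetical label for a model version
    tokens_in: int
    tokens_out: int
    latency_ms: float
    concurrency: int     # in-flight requests observed when the call started


def summarize(invocations):
    """Aggregate per model version: tokens in, tokens out, average concurrency, p95 latency."""
    by_model = {}
    for inv in invocations:
        by_model.setdefault(inv.model_version, []).append(inv)

    summary = {}
    for model, rows in by_model.items():
        latencies = [r.latency_ms for r in rows]
        # quantiles(n=20) yields 19 cut points; the last one is the 95th percentile.
        # It needs at least two samples, so a single sample is returned as-is.
        if len(latencies) > 1:
            p95 = quantiles(latencies, n=20, method="inclusive")[-1]
        else:
            p95 = latencies[0]
        summary[model] = {
            "tokens_in": sum(r.tokens_in for r in rows),
            "tokens_out": sum(r.tokens_out for r in rows),
            "avg_concurrency": mean(r.concurrency for r in rows),
            "p95_latency_ms": p95,
        }
    return summary


def classify_profile(daily_tokens, low_volume=1_000_000, burst_ratio=3.0):
    """Assign one of the three usage profiles from a series of daily token totals.

    The thresholds are illustrative assumptions, not values from the engagement.
    """
    avg = mean(daily_tokens)
    if avg < low_volume:
        return "experimental"          # low-volume sandbox traffic
    if max(daily_tokens) / avg > burst_ratio:
        return "bursty"                # peaks far above the average day
    return "continuous"                # steady high-volume inference
```

Separating aggregation from classification keeps the thresholds easy to tune: the same 90-day summary can be re-classified under different cutoffs without re-reading the usage data.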
In parallel, we benchmarked the Bedrock commitment terms across nine comparable life sciences and biotech EDP accounts in the $4M to $12M Bedrock spend range. The benchmark established that customers in this tier were achieving 28% to 47% effective discounts by combining a model commitment, Provisioned Throughput, and an EDP overlay.
Phase 2 — Open the Bedrock addendum (weeks 4-6)
The opening ask was a 24-month Bedrock model commitment at 80% of baseline volume, Provisioned Throughput for the two production models at a 35% discount versus on-demand, a Knowledge Bases overlay covering both embeddings and OpenSearch Serverless storage, and an EDP amendment to roll all Bedrock spend into the enterprise discount calculation. AWS's initial counter was a Bedrock commitment at 95% of baseline, PT at a 22% discount, no Knowledge Bases overlay, and EDP inclusion only for non-PT Bedrock spend.
Phase 3 — Close (weeks 7-9)
The closing weeks ran through three escalations. The PT discount moved to 32% after we presented the customer's GCP Vertex AI counterquote (genuinely solicited, used as a forecast disclosure rather than a threat). The Knowledge Bases overlay was accepted after we documented that the OpenSearch Serverless component was effectively a hidden Bedrock dependency and should not be priced separately. The final agreement landed with the Bedrock commitment at 82% of baseline volume, PT at a 32% discount, a Knowledge Bases overlay covering 70% of the embeddings and storage cost, and full EDP inclusion. Legal redlines added ten business days. Countersignature landed on day 63.
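One way to read the movement between the opening ask, AWS's counter, and the final terms is to compute how much of the gap each final term closed in the customer's favor. The sketch below uses only the figures quoted above. The `gap_closed` helper is a hypothetical illustration, not a metric the negotiation itself used.

```python
def gap_closed(ask, counter, final):
    """Fraction of the ask-to-counter gap that the final term closed toward the ask.

    Returns 1.0 if the final term equals the ask and 0.0 if it equals the counter.
    Works whether the customer wants the number higher (a discount) or lower
    (a commitment percentage), because the sign cancels out.
    """
    if ask == counter:
        raise ValueError("ask and counter are identical; there is no gap to close")
    return (final - counter) / (ask - counter)


# Provisioned Throughput discount: ask 35%, counter 22%, final 32%.
pt = gap_closed(ask=35, counter=22, final=32)

# Commitment as a share of baseline volume: ask 80%, counter 95%, final 82%.
commitment = gap_closed(ask=80, counter=95, final=82)

print(f"PT discount gap closed: {pt:.0%}")           # 77%
print(f"Commitment gap closed:  {commitment:.0%}")   # 87%
```

On both terms the final agreement sits much closer to the opening ask than to AWS's initial counter.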
What the customer actually achieved.
The restructured Bedrock commercial framework produced $3.7M in two-year savings versus the trajectory the customer was on. The savings break down across four buckets and represent 41% of the original projected Bedrock run rate.
Where the savings came from
- Bedrock model commitment — $1.6M from pricing the 82% baseline commitment versus on-demand, with the upside captured through PT
- Provisioned Throughput overlay — $1.2M from the 32% PT discount on the two production model workloads
- Knowledge Bases and OpenSearch overlay — $640K from the 70% storage-and-embeddings overlay across the 14TB scientific corpus
- EDP discount cascade — $260K incremental from rolling Bedrock spend into the enterprise discount calculation, raising the customer above the next EDP tier threshold
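The four buckets can be reconciled against the headline figures with simple arithmetic. The sketch below sums the buckets, computes each bucket's share, and back-calculates the two-year run rate implied by a 41% reduction. That implied run rate is a derived figure, not one stated in the case study, and the variable names are illustrative.

```python
# Savings buckets in USD, as listed above.
savings = {
    "Bedrock model commitment": 1_600_000,
    "Provisioned Throughput overlay": 1_200_000,
    "Knowledge Bases and OpenSearch overlay": 640_000,
    "EDP discount cascade": 260_000,
}

total = sum(savings.values())                      # 3,700,000
shares = {name: amount / total for name, amount in savings.items()}

# Back-calculated assumption: if the total is 41% of the projected run rate,
# the projected two-year run rate is total / 0.41.
net_discount = 0.41
implied_two_year_run_rate = total / net_discount   # about 9.02M

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"Total savings: ${total:,}")
print(f"Implied two-year run rate: ${implied_two_year_run_rate:,.0f}")
```

The model commitment and the Provisioned Throughput overlay together account for roughly three quarters of the total.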
The throughput protection clause
The most operationally important provision is the throughput-protection clause. It guarantees three times the customer's baseline inference capacity at the negotiated PT rate, available within 30 days of written request. This addresses the operational risk that originally drove the negotiation: production inference could not be paused, so the engineering team could not absorb throttling on the account. The clause makes the commercial structure compatible with the scientific timeline of drug discovery, which cannot be deferred for procurement reasons.
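The clause's two parameters, the 3x multiple and the 30-day window, can be modeled as a small check that a capacity planner might run before filing a request. This is a sketch under stated assumptions: the `ThroughputProtection` class is hypothetical, the 3x figure is read as a ceiling on total capacity rather than an increment on top of baseline, and the 30 days are treated as calendar days. The contract language would govern both points.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ThroughputProtection:
    baseline_units: float       # baseline capacity, in whatever unit the contract uses
    multiple: float = 3.0       # "three times the customer's baseline" (from the text)
    lead_time_days: int = 30    # "within 30 days of written request" (calendar days assumed)

    def max_protected_units(self) -> float:
        """Largest total capacity covered at the negotiated PT rate."""
        return self.baseline_units * self.multiple

    def is_covered(self, requested_units: float) -> bool:
        """True if the requested total capacity falls within the protected ceiling."""
        return 0 < requested_units <= self.max_protected_units()

    def latest_availability(self, request_date: date) -> date:
        """Latest date by which requested capacity should be available."""
        return request_date + timedelta(days=self.lead_time_days)


clause = ThroughputProtection(baseline_units=10)
print(clause.max_protected_units())                   # 30.0
print(clause.is_covered(25))                          # True
print(clause.latest_availability(date(2025, 1, 1)))   # 2025-01-31
```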
What the customer did with the savings
The $3.7M was redirected approximately as follows. Roughly $1.6M funded a fine-tuning program against a proprietary biomedical corpus, which the customer had previously been running as a contained experiment. About $1.1M went to expanding the molecular property prediction pipeline to two additional therapeutic areas. The remaining $1M was returned to the cloud budget to offset other workload growth through the end of the fiscal year. The investment in the fine-tuning program is expected to produce a second-order saving in subsequent years as the proprietary model reduces per-inference token consumption.
“We were a Bedrock pilot one quarter and a top-tier Bedrock customer the next. The negotiation locked in pricing that matched our actual research load and protected the inference capacity our scientists depend on.”
Get the same outcome.
500+ engagements. $340M+ documented client savings. We build your Bedrock and EDP negotiation strategy within 48 hours of kickoff.