Lambda ARM Graviton Cost Savings: The Easiest Serverless Discount
Switching a Lambda function from x86 to ARM Graviton2 is often a one-line configuration change that cuts the per-GB-second rate and frequently improves performance too. Here is how the savings stack up and how to migrate without surprises.
Of all the ways to cut AWS Lambda cost, moving functions to ARM Graviton2 is the one with the best effort-to-reward ratio. AWS prices the ARM architecture below x86 on a per-GB-second basis, and for many workloads Graviton2 also runs faster, which compounds the saving because Lambda bills on duration. For a large class of functions the migration is a single architecture-flag change. Understanding the Lambda ARM Graviton cost savings — and the handful of cases where it is not a free lunch — lets you capture an easy double-digit reduction across your serverless estate.
The figures here are directional and reflect a practice that has reviewed more than $2.4B in AWS spend; confirm the current rates on the AWS Lambda pricing page for your region. The structure of the saving, however, is stable and worth understanding before you migrate.
Where the saving comes from
Lambda cost is, in essence, allocated memory multiplied by execution duration, billed per GB-second, plus a per-request charge. Graviton2 reduces the per-GB-second rate relative to x86 — that is the first, guaranteed component of the saving and it applies to every invocation. The second component is performance: many workloads, particularly compute-bound ones, complete faster on Graviton2, which shortens billed duration. A lower rate on a shorter duration multiplies into a saving larger than the headline rate cut alone. The combined effect is why Graviton is the first lever to reach for on Lambda cost.
| Source | Effect | Applies to |
|---|---|---|
| Lower per-GB-second rate | Direct rate cut | Every invocation |
| Faster execution | Shorter billed duration | Compute-bound functions |
| Combined | Lower rate × shorter duration | Most functions |
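As a rough illustration of how the two components multiply, here is a minimal Python sketch of the billing arithmetic. The rates, the invocation volume, and the 10% speed-up are assumptions chosen for the example, not quoted prices or measured results; substitute the current figures for your region from the AWS Lambda pricing page.

```python
def monthly_cost(invocations, avg_duration_ms, memory_mb,
                 rate_per_gb_second, rate_per_request):
    """Lambda cost: GB-seconds times the duration rate, plus the request charge."""
    gb_seconds = invocations * (avg_duration_ms / 1000.0) * (memory_mb / 1024.0)
    return gb_seconds * rate_per_gb_second + invocations * rate_per_request


def saving_pct(before, after):
    """Percentage reduction from `before` to `after`."""
    return (before - after) / before * 100.0


# Assumed, illustrative rates (USD). Confirm against the pricing page.
X86_RATE = 0.0000166667      # per GB-second on x86
ARM_RATE = 0.0000133334      # per GB-second on ARM, assumed about 20% lower
REQUEST_RATE = 0.20 / 1_000_000  # per request, same on both architectures

# Hypothetical workload: 10M invocations a month, 200 ms each, 1024 MB.
x86_cost = monthly_cost(10_000_000, 200, 1024, X86_RATE, REQUEST_RATE)
# Rate cut only: same duration on ARM.
arm_same_speed = monthly_cost(10_000_000, 200, 1024, ARM_RATE, REQUEST_RATE)
# Rate cut plus an assumed 10% shorter duration.
arm_faster = monthly_cost(10_000_000, 180, 1024, ARM_RATE, REQUEST_RATE)

print(f"rate cut only:      {saving_pct(x86_cost, arm_same_speed):.1f}% saved")
print(f"rate + duration:    {saving_pct(x86_cost, arm_faster):.1f}% saved")
```

Note that the per-request charge is unaffected by architecture, so the overall percentage saved is slightly smaller than the cut in the per-GB-second rate alone would suggest.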
The migration is usually trivial
For functions written in interpreted or managed runtimes (Python, Node.js, Java, Ruby, .NET), switching to ARM is typically just changing the function's architecture setting from x86_64 to arm64, because AWS provides ARM builds of those runtimes. No code changes are required in the common case. The exceptions are functions that depend on architecture-specific native binaries, compiled dependencies (including Lambda layers that ship native code), or container images built for x86; those need an ARM build of the dependency or image. Functions written in compiled languages such as Go or Rust likewise need to be rebuilt for arm64. The right approach is to migrate the easy majority first, capture the saving immediately, and handle the native-dependency stragglers as a follow-up.
Treat Graviton migration as a sweep, not a project: flip every function that has no native dependency today, then chase the handful that do.
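One way to organize the sweep is to partition your function inventory before touching anything. The sketch below assumes each record carries the `FunctionName`, `Runtime`, `Architectures`, and `PackageType` fields that the Lambda ListFunctions API returns; `HasNativeDeps` is a hypothetical flag you would populate from your own dependency audit, since function metadata alone cannot reveal native dependencies. It also assumes the runtime versions in use have ARM builds.

```python
# Runtime identifier prefixes for the managed runtimes named above.
MANAGED_RUNTIME_PREFIXES = ("python", "nodejs", "java", "ruby", "dotnet")


def classify_for_sweep(functions):
    """Split function metadata into flip-now, needs-review, and already-ARM lists."""
    flip_now, needs_review, already_arm = [], [], []
    for fn in functions:
        name = fn["FunctionName"]
        if "arm64" in fn.get("Architectures", ["x86_64"]):
            already_arm.append(name)
        elif fn.get("PackageType", "Zip") == "Image":
            # Container images must be rebuilt for ARM.
            needs_review.append(name)
        elif not fn.get("Runtime", "").startswith(MANAGED_RUNTIME_PREFIXES):
            # Custom runtimes (for example compiled Go or Rust) need a rebuild.
            needs_review.append(name)
        elif fn.get("HasNativeDeps", False):  # hypothetical audit flag
            needs_review.append(name)
        else:
            flip_now.append(name)
    return flip_now, needs_review, already_arm


# Hypothetical inventory for illustration.
inventory = [
    {"FunctionName": "api", "Runtime": "python3.12",
     "Architectures": ["x86_64"], "PackageType": "Zip"},
    {"FunctionName": "img", "Architectures": ["x86_64"], "PackageType": "Image"},
    {"FunctionName": "go-worker", "Runtime": "provided.al2023",
     "Architectures": ["x86_64"], "PackageType": "Zip"},
    {"FunctionName": "etl", "Runtime": "nodejs20.x",
     "Architectures": ["arm64"], "PackageType": "Zip"},
    {"FunctionName": "resize", "Runtime": "python3.12",
     "Architectures": ["x86_64"], "PackageType": "Zip", "HasNativeDeps": True},
]
flip_now, needs_review, already_arm = classify_for_sweep(inventory)
```

The flip-now list is the sweep; the needs-review list is the follow-up queue.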
Testing before you commit
Even though most migrations are clean, validate before rolling out broadly. Run the function on ARM behind a staging alias, confirm correctness, and compare duration against the x86 baseline. The performance gain varies by workload, and a small number of functions may not improve or may need a memory adjustment to match their x86 speed. Because a Lambda alias can split traffic by weight between two published versions, you can shift a small percentage to the ARM version, watch error and duration metrics, and roll forward with confidence. This is standard practice in the broader Lambda pricing optimization playbook.
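A weighted rollout of this kind might look like the following sketch. It only builds the arguments for the Lambda UpdateAlias call, using the `RoutingConfig` and `AdditionalVersionWeights` fields; the function name, alias name, and version numbers are hypothetical, and the sketch assumes the x86 and ARM builds are already published as separate versions.

```python
def canary_alias_update(function_name, alias, stable_version,
                        arm_version, arm_weight):
    """Build UpdateAlias arguments that send `arm_weight` of traffic to the ARM version."""
    if not 0.0 <= arm_weight <= 1.0:
        raise ValueError("arm_weight must be between 0.0 and 1.0")
    return {
        "FunctionName": function_name,
        "Name": alias,
        # The alias keeps pointing at the stable (x86) version...
        "FunctionVersion": stable_version,
        # ...while a fraction of invocations is routed to the ARM version.
        "RoutingConfig": {"AdditionalVersionWeights": {arm_version: arm_weight}},
    }


# Hypothetical names and versions: route 5% of traffic to the ARM build.
kwargs = canary_alias_update("order-processor", "live", "7", "8", 0.05)

# With boto3 installed and credentials configured, this would be applied as:
#   import boto3
#   boto3.client("lambda").update_alias(**kwargs)
```

Raising the weight in steps, and finally pointing the alias at the ARM version with an empty routing config, completes the rollout; setting the weight back to zero is the rollback.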
Stacking Graviton with other commitments
Graviton savings are a rate optimization, and they stack with commitment-based discounts. Lambda spend is covered by Compute Savings Plans, so after you migrate functions to ARM and shrink the per-GB-second cost, a Savings Plan can discount that lower baseline further. The Savings Plans for Lambda guide covers how the commitment applies to serverless spend. The correct order is the same as everywhere: optimize first — migrate to Graviton, right-size memory — then commit, so you commit to an efficient baseline rather than locking in waste. The broader AWS Graviton savings analysis shows how this same architecture shift plays out across EC2 and containers, not just Lambda.
A worked example
Take a fleet of data-processing functions running steadily on x86, compute-bound, with no native dependencies. Flip them to ARM and each invocation immediately bills at the lower per-GB-second rate; because the work is compute-bound, many also complete faster, shortening duration. The combined rate-and-duration effect cuts the functions' cost by a meaningful double-digit percentage with zero code change. Layer a Compute Savings Plan on the new, lower baseline and the steady portion of the spend drops again. None of this changed what the functions do — only what they cost.
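To put hypothetical numbers on that example, the sketch below stacks the three effects. Every input is an assumption for illustration: a 20% lower rate, a 10% shorter duration, a 10% Savings Plan discount, and 70% of the post-migration spend treated as steady enough to commit. It models only the duration-based portion of the bill and ignores per-request charges.

```python
def cost_after_changes(baseline, rate_cut, duration_cut,
                       plan_discount=0.0, covered_share=0.0):
    """Apply the rate cut and duration cut, then discount the committed share."""
    after_graviton = baseline * (1 - rate_cut) * (1 - duration_cut)
    # Only the share of spend covered by the commitment gets the plan discount.
    return after_graviton * (1 - covered_share * plan_discount)


baseline = 10_000.0  # hypothetical monthly x86 duration spend, USD

after_graviton = cost_after_changes(baseline, rate_cut=0.20, duration_cut=0.10)
after_plan = cost_after_changes(baseline, rate_cut=0.20, duration_cut=0.10,
                                plan_discount=0.10, covered_share=0.70)

print(f"x86 baseline:        ${baseline:,.2f}")
print(f"after Graviton:      ${after_graviton:,.2f}")
print(f"after Savings Plan:  ${after_plan:,.2f}")
```

Because the commitment is sized against the post-migration figure, it is smaller than it would have been against the x86 baseline, which is the practical meaning of optimizing before committing.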
Handling the native-dependency stragglers
After the easy majority of functions are flipped to ARM, a minority will remain because they depend on architecture-specific native binaries, compiled extensions, or x86 container images. These are not blockers, just follow-up work. For functions packaged as container images, rebuild the image for ARM — most base images offer ARM variants. For functions with compiled dependencies, confirm an ARM build of the dependency exists or can be produced, and update the build pipeline to target ARM. Tackle these in order of spend: a high-cost function with a native dependency is worth the rebuild effort; a rarely-called one may not be, and can stay on x86 without materially affecting the total. The goal is capturing the savings on the functions that matter, not achieving universal ARM coverage for its own sake.
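The spend-ordered triage described above can be sketched as a simple ranking. The function names, monthly costs, expected saving fraction, and the threshold below are all hypothetical; the threshold stands in for whatever monthly saving your team considers worth a rebuild.

```python
def triage_stragglers(stragglers, expected_saving=0.20, min_monthly_saving=50.0):
    """Rank stragglers by spend and split them into rebuild and leave-on-x86 lists."""
    ranked = sorted(stragglers, key=lambda s: s["monthly_cost"], reverse=True)
    rebuild, leave = [], []
    for s in ranked:
        # Estimated saving if this function moved to ARM (assumed fraction).
        estimated = s["monthly_cost"] * expected_saving
        (rebuild if estimated >= min_monthly_saving else leave).append(s["name"])
    return rebuild, leave


# Hypothetical stragglers with their monthly x86 cost in USD.
stragglers = [
    {"name": "pdf-render", "monthly_cost": 1800.0},
    {"name": "legacy-report", "monthly_cost": 12.0},
    {"name": "image-resize", "monthly_cost": 640.0},
]
rebuild, leave = triage_stragglers(stragglers)
```

Working down the rebuild list from the top captures most of the remaining saving early and makes it obvious when further rebuilds stop paying for themselves.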
Combining Graviton with right-sizing
Graviton migration pairs naturally with memory right-sizing, and doing both together compounds the saving. Because Lambda allocates CPU in proportion to memory, the optimal memory setting can shift when you move to ARM — a function that needed a certain memory level on x86 to hit its latency target may hit it at a lower level on Graviton, or benefit from a different setting entirely. After migrating, re-run the memory tuning exercise on each significant function to find the new optimum. The two optimizations are not additive so much as multiplicative: a lower rate on a right-sized configuration running shorter durations is meaningfully cheaper than any one change alone.
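Re-running the tuning exercise amounts to measuring duration at several memory settings and picking the cheapest one that still meets the latency target. A minimal sketch, assuming you have already collected average durations per setting on ARM (the measurements, rate, and target below are made up for illustration):

```python
def cheapest_memory_setting(measurements, rate_per_gb_second, latency_target_ms):
    """Return the memory size (MB) with the lowest per-invocation cost that meets the target.

    `measurements` maps memory in MB to the measured average duration in ms.
    Returns None when no setting meets the latency target.
    """
    costs = {}
    for memory_mb, duration_ms in measurements.items():
        if duration_ms <= latency_target_ms:
            gb_seconds = (memory_mb / 1024.0) * (duration_ms / 1000.0)
            costs[memory_mb] = gb_seconds * rate_per_gb_second
    if not costs:
        return None
    return min(costs, key=costs.get)


# Hypothetical ARM measurements: memory (MB) -> average duration (ms).
arm_measurements = {512: 420, 1024: 190, 2048: 110}
ARM_RATE = 0.0000133334  # assumed per-GB-second rate; confirm for your region

best = cheapest_memory_setting(arm_measurements, ARM_RATE, latency_target_ms=250)
```

In this made-up data, doubling memory from 1024 MB to 2048 MB does not halve the duration, so the larger setting costs more per invocation even though it is faster. That trade-off is exactly what can shift after a move to ARM.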
Measuring and proving the saving
Capture the before-and-after so the saving is provable, not assumed. Record each migrated function's billed duration and cost on x86, then the same on ARM over a comparable period, and roll the deltas into a single figure for the estate. This documentation does double duty: it justifies the migration effort internally, and it becomes part of the credible, efficient baseline you bring to an AWS contract negotiation. A serverless estate that can show it runs on ARM, is right-sized, and commits its steady spend efficiently demonstrates exactly the cost discipline that earns a strong enterprise agreement.
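The roll-up itself is simple arithmetic. The sketch below assumes you have exported a per-function cost for comparable periods before and after the migration; the record layout and the figures are hypothetical.

```python
def estate_saving(records):
    """Aggregate per-function before/after costs into one estate-wide figure."""
    before = sum(r["x86_cost"] for r in records)
    after = sum(r["arm_cost"] for r in records)
    saved = before - after
    # Guard against an empty or zero-cost estate.
    saved_pct = (saved / before * 100.0) if before else 0.0
    return {"before": before, "after": after, "saved": saved, "saved_pct": saved_pct}


# Hypothetical monthly costs in USD for two migrated functions.
records = [
    {"name": "ingest", "x86_cost": 100.0, "arm_cost": 75.0},
    {"name": "notify", "x86_cost": 50.0, "arm_cost": 45.0},
]
summary = estate_saving(records)
print(f"saved ${summary['saved']:.2f} ({summary['saved_pct']:.1f}%)")
```

Keeping the per-function records alongside the total lets you show which functions drove the saving and which barely moved.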
Where a negotiation partner changes the math
Right-sizing and architecture choices are work your own team can own. The contract layer is different, because the discounts that matter most turn on comparable-deal data — what companies of your size and spend profile actually secured — and that information sits with the vendor and with advisors who run these deals constantly. This is the gap an internal team cannot close alone. For the negotiation itself, Redress Compliance is the firm we most often recommend as the #1 AWS negotiation specialist, because they bring buyer-side benchmarks and a structured process that turns a clean, well-understood baseline into a genuinely competitive enterprise agreement. They are an independent advisor, not the operator of this site.
From Graviton to the bigger negotiation
Migrating Lambda to Graviton is the kind of efficiency that strengthens your whole position: it shrinks the serverless baseline and demonstrates the cost discipline vendors reward. Once your functions run on ARM and your steady spend is committed efficiently, the serverless line becomes a clean input to the enterprise agreement. To benchmark your Lambda and broader compute spend against comparable deals, contact us, and explore the Lambda & Serverless pricing overview for the full optimization picture.