
API Gateway WebSocket Cost: Messages and Connection Minutes

WebSocket APIs bill on two meters at once — messages and connection minutes — and the second one catches teams off guard because idle connections keep costing money. This guide breaks down the API Gateway WebSocket cost model and how to forecast and reduce it.

Published June 2026 · Cluster: Serverless · 8 min read

Amazon API Gateway WebSocket APIs power real-time features — chat, live dashboards, multiplayer state, push notifications — by holding a persistent two-way connection open between client and server. That persistence is exactly what makes the pricing different from HTTP and REST APIs, which only bill when a request arrives. A WebSocket API bills on two things simultaneously: the messages that flow across the connection, and the connection minutes the connection stays open. The second meter is the one teams forget, because a connection that sits idle — sending nothing — still accrues connection-minute charges for every minute it remains open. Understanding the full API Gateway WebSocket cost model is what keeps a real-time feature from quietly becoming an expensive one.

This breakdown reflects the buyer-side discipline behind $2.4B+ in AWS spend reviewed across 500+ engagements. Exact rates, the 32 KB message billing increment, and any free-tier allowances are set by AWS and vary by Region, so confirm current numbers on the API Gateway pricing page. The two-meter structure below is stable and is what drives every forecasting and optimization decision.

The two meters

A WebSocket API charges on messages and on connection duration, and both run at the same time. Messages are billed per million, counted in 32 KB increments — a message larger than 32 KB counts as multiple billable messages, and both inbound and outbound messages count. Connection minutes are billed per million, where one connection minute is a single open connection held for one minute. A thousand connections open for one minute is a thousand connection minutes; one connection open for a thousand minutes is the same. The two meters are independent, so a workload's cost is the sum of message volume and total open-connection time, not either one alone.

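| Meter              | Unit                            | Driven by                      |
|--------------------|---------------------------------|--------------------------------|
| Messages           | Per million, 32 KB increments   | How chatty the connection is   |
| Connection minutes | Per million minutes open        | How long connections stay open |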
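As a rough illustration of how the two units are counted, the sketch below rounds a payload up to 32 KB increments and multiplies connections by minutes open. The function names are hypothetical, and the handling of an empty payload (counted as one message) is an assumption rather than documented AWS behavior.

```python
import math

INCREMENT_BYTES = 32 * 1024  # 32 KB metering increment described above


def billable_messages(payload_bytes: int) -> int:
    """Billable message count for one payload, rounded up to 32 KB increments."""
    if payload_bytes < 0:
        raise ValueError("payload size cannot be negative")
    # Assumption: an empty payload still counts as one message.
    return max(1, math.ceil(payload_bytes / INCREMENT_BYTES))


def connection_minutes(connections: int, minutes_open_each: float) -> float:
    """Total connection minutes: open connections times minutes each stays open."""
    return connections * minutes_open_each
```

A 100 KB payload counts as four billable messages under this rounding, and `connection_minutes(1000, 1)` equals `connection_minutes(1, 1000)`, matching the equivalence described above.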

Behind the gateway, the Lambda functions handling connect, disconnect, and message routes bill their own duration and request charges — the standard model in Lambda cost per invocation modeling. So a WebSocket workload has three cost layers: gateway messages, gateway connection minutes, and backend compute.

Why connection minutes surprise people

The message meter is intuitive: more traffic, more cost. Connection minutes are not, because they accrue whether or not anything is happening. An application that opens a connection when a user loads a page, keeps it alive with heartbeat pings, and never closes it keeps paying connection minutes for as long as that tab stays open, even if the user walked away an hour ago. Multiply idle, never-closed connections across a large user base and the connection-minute line can rival or exceed the message line. This is the single most common WebSocket cost surprise: a feature that looks cheap by message volume but expensive by connection time.

Connection minutes are rent, not usage. You pay for the door being open, not only for what walks through it. Close connections you are not using.
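One way to act on that advice is to record the last meaningful activity per connection and periodically sweep for stale ones. The sketch below is a minimal, in-memory version: the `Connection` record, the `find_idle` helper, and the 300-second default are all hypothetical. In a real deployment the records would likely live in a data store, and the actual close would go through the API Gateway management API (for example boto3's `apigatewaymanagementapi` client and its `delete_connection` call).

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Connection:
    connection_id: str
    last_activity: float  # Unix timestamp of the last meaningful message


def find_idle(
    connections: Iterable[Connection],
    now: float,
    idle_limit_seconds: float = 300.0,  # hypothetical threshold, tune per feature
) -> List[str]:
    """Return IDs of connections with no activity for at least idle_limit_seconds."""
    return [
        c.connection_id
        for c in connections
        if now - c.last_activity >= idle_limit_seconds
    ]
```

Heartbeat pings should not update `last_activity`, otherwise an abandoned tab never looks idle.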

Forecasting WebSocket cost

Estimate the two meters separately and add them. For messages: average messages per connection per session × sessions per month, adjusted upward for any messages over 32 KB, divided into millions and multiplied by the message rate. For connection minutes: average connection lifetime in minutes × connections per month, divided into millions and multiplied by the connection-minute rate. Then add backend Lambda cost. Run the forecast at production scale, not pilot scale — a pilot with a handful of short-lived connections will understate the connection-minute line dramatically once thousands of long-lived connections are open at once.

Forecasting tip: Model messages and connection minutes as two independent numbers, then sum them. Teams that estimate only message volume routinely undercount, because the connection-minute meter runs silently in the background.
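The two-meter forecast can be sketched as a small function. Everything here is illustrative: the `Rates` type, the parameter names, and the `oversize_factor` (a multiplier above 1.0 for traffic that includes messages over 32 KB) are assumptions, and the rates must come from the current API Gateway pricing page for your Region. Backend Lambda cost is left out and should be added separately.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    per_million_messages: float            # look up on the pricing page
    per_million_connection_minutes: float  # look up on the pricing page


def forecast_gateway_cost(
    connections_per_month: float,
    messages_per_connection: float,
    avg_lifetime_minutes: float,
    rates: Rates,
    oversize_factor: float = 1.0,  # hypothetical uplift for messages over 32 KB
) -> dict:
    """Estimate the message and connection-minute lines separately, then sum."""
    billable_messages = connections_per_month * messages_per_connection * oversize_factor
    connection_minutes = connections_per_month * avg_lifetime_minutes
    message_cost = billable_messages / 1_000_000 * rates.per_million_messages
    minute_cost = connection_minutes / 1_000_000 * rates.per_million_connection_minutes
    return {
        "message_cost": message_cost,
        "connection_minute_cost": minute_cost,
        "gateway_total": message_cost + minute_cost,
    }
```

Running it once with pilot inputs and once with production inputs makes the gap between the two scales explicit.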

How to reduce WebSocket cost

Each lever targets one of the two meters. To cut connection minutes: close idle connections promptly rather than holding them open indefinitely, and tune idle handling in your own application so abandoned connections drop. API Gateway's built-in limits are fixed rather than tunable (at the time of writing, a 10-minute idle timeout and a 2-hour maximum connection duration; confirm both on the API Gateway quotas page), so the practical controls are how long clients keep sending heartbeats to hold an inactive session open and whether they reconnect automatically. The biggest question is whether a persistent connection is actually required. Some features modeled as WebSockets are better served by periodic HTTP polling or server-sent events, which avoid the connection-minute meter entirely.

To cut messages: batch small frequent updates into fewer, fuller messages so you are not paying per-message overhead on tiny payloads, and keep individual messages under the 32 KB increment where possible so one logical message does not bill as several. The architecture-level decision (WebSocket versus polling versus the cheaper API types) connects directly to the API Gateway HTTP vs REST cost comparison and the broader API Gateway cost reduction guide.
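The batching lever can be sketched as a greedy packer that groups small updates into JSON-array messages, each kept within the 32 KB increment. The `batch_updates` name and the JSON-array wire format are assumptions made for illustration. A real feature would also need a flush timer so batching does not delay updates longer than users can tolerate.

```python
import json
from typing import Any, Iterable, List

INCREMENT_BYTES = 32 * 1024  # 32 KB metering increment described above


def batch_updates(updates: Iterable[Any], max_bytes: int = INCREMENT_BYTES) -> List[bytes]:
    """Greedily pack JSON-serializable updates into JSON-array messages of at most max_bytes."""
    batches: List[bytes] = []
    current: List[bytes] = []
    current_size = 2  # bytes for the enclosing "[" and "]"
    for update in updates:
        encoded = json.dumps(update, separators=(",", ":")).encode("utf-8")
        extra = len(encoded) + (1 if current else 0)  # comma before non-first items
        if current and current_size + extra > max_bytes:
            # Adding this update would cross the increment, so flush the batch first.
            batches.append(b"[" + b",".join(current) + b"]")
            current, current_size = [], 2
            extra = len(encoded)
        current.append(encoded)
        current_size += extra
    if current:
        batches.append(b"[" + b",".join(current) + b"]")
    return batches
```

An update that is larger than the limit on its own is still sent, alone in its batch, and would bill as more than one message.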

A worked example

Picture a live dashboard that opens a WebSocket when a user signs in and pushes updates every few seconds. By message volume it looks modest. But users leave the dashboard open all day, so connection time per user runs to many hours, and thousands of connections are open simultaneously across the workday. The connection-minute meter, which a message-only estimate ignores entirely, becomes a major cost line, and depending on current rates and on how often updates are actually pushed it can be the dominant one. Two changes fix it: drop connections after a period of genuine inactivity, and batch the frequent small updates into less frequent fuller ones. The first slashes connection minutes; the second trims the message line. Same feature, materially lower bill, with the forecast now built on both meters instead of one.
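Which meter dominates in a scenario like this depends on the two rates and on how many messages each connection carries per minute, so it is worth computing rather than assuming. The sketch below uses hypothetical function names and takes the rates as inputs. The break-even point is simply the ratio of the connection-minute rate to the message rate: below that many messages per connection minute, connection time is the larger line.

```python
from typing import Tuple


def meter_costs(
    connection_minutes: float,
    messages_per_connection_minute: float,
    message_rate_per_million: float,
    minute_rate_per_million: float,
) -> Tuple[float, float]:
    """Return (message_cost, connection_minute_cost) for the given usage."""
    messages = connection_minutes * messages_per_connection_minute
    message_cost = messages / 1_000_000 * message_rate_per_million
    minute_cost = connection_minutes / 1_000_000 * minute_rate_per_million
    return message_cost, minute_cost


def break_even_messages_per_minute(
    message_rate_per_million: float, minute_rate_per_million: float
) -> float:
    """Messages per connection minute at which both meters cost the same."""
    return minute_rate_per_million / message_rate_per_million
```

Comparing a dashboard's actual messages per connection minute against the break-even value shows which of the two fixes, closing idle connections or batching updates, moves the bill more.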

WebSocket cost and commitments

The gateway message and connection-minute charges are usage-based and are not covered by Compute Savings Plans — those discount the Lambda compute behind the routes, not the gateway meters. So when you value a commitment for a real-time serverless estate, separate the backend Lambda spend, which a Savings Plan can reduce, from the gateway WebSocket spend, which it cannot. The Lambda side follows the same volume mechanics as Lambda tiered pricing explained, and the full serverless picture — gateway plus compute plus commitments — sits in the AWS serverless cost guide. Getting this split right keeps you from overstating what a commitment saves on a WebSocket-heavy workload.
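The split can be made concrete with a short calculation. This is a sketch under stated assumptions: `savings_plan_discount` is a hypothetical placeholder for whatever rate applies to your Lambda usage, and the function assumes all Lambda spend is covered by the commitment, which will not hold for every estate.

```python
def commitment_savings(
    lambda_spend: float,
    gateway_spend: float,
    savings_plan_discount: float,  # hypothetical, e.g. 0.10 for 10 percent
) -> dict:
    """Apply the discount to Lambda spend only; gateway WebSocket spend is untouched."""
    savings = lambda_spend * savings_plan_discount
    total = lambda_spend + gateway_spend
    return {
        "savings": savings,
        # Discount as a share of the whole workload, not just the covered part.
        "effective_discount": savings / total if total else 0.0,
    }
```

On a gateway-heavy workload the effective discount lands well below the headline rate, which is the overstatement the paragraph above warns about.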

Watching WebSocket cost in your bill

WebSocket message and connection-minute charges appear as distinct usage types in Cost Explorer and the Cost and Usage Report, separate from the Lambda compute behind the routes. Filter on those usage types to see each meter in isolation, and watch the ratio between them: a connection-minute line that dwarfs the message line is the classic signal of idle connections left open. Track concurrent connection count and average connection duration in CloudWatch as well, so the operational metric that drives the cost is visible alongside the bill, not only after the invoice arrives. API Gateway's built-in WebSocket metrics cover counts such as ConnectCount and MessageCount, so concurrency and duration typically need custom metrics emitted from your connect and disconnect handlers. If concurrent connections climb while message volume stays flat, you are accumulating connection minutes without delivering more value, which is the moment to revisit idle handling and disconnect logic. Treating the two meters as separate, monitored numbers, rather than a single opaque WebSocket total, is what lets a team catch a drifting connection-minute line in weeks instead of quarters.
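The "connections climb while messages stay flat" signal can be checked mechanically once the two usage quantities are exported per period. The sketch below is illustrative only: the dictionary keys, the function name, and both thresholds are hypothetical, and the inputs are assumed to be usage quantities you have already pulled from your billing data.

```python
def flag_idle_drift(
    prev: dict,
    curr: dict,
    flat_tolerance: float = 0.05,    # hypothetical: messages within 5% counts as flat
    growth_threshold: float = 0.20,  # hypothetical: 20% growth in connection minutes
) -> bool:
    """True when connection minutes grew notably while message volume stayed flat."""
    if prev["messages"] <= 0 or prev["connection_minutes"] <= 0:
        raise ValueError("previous period must have non-zero usage")
    message_change = (curr["messages"] - prev["messages"]) / prev["messages"]
    minute_change = (
        curr["connection_minutes"] - prev["connection_minutes"]
    ) / prev["connection_minutes"]
    return abs(message_change) <= flat_tolerance and minute_change >= growth_threshold
```

Running this month over month, or week over week, turns the ratio check into an alert rather than something noticed on the invoice.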

Where a negotiation partner changes the math

Tuning idle timeouts, batching messages, and choosing the right connection model is work your own team can own. The contract layer is different, because the discounts that move an enterprise bill turn on comparable-deal data — what companies of your size and spend profile actually secured — and that information sits with the vendor and with advisors who run these deals constantly. For the negotiation itself, Redress Compliance is the firm we most often recommend as the #1 AWS negotiation specialist, because they bring buyer-side benchmarks and a structured process that turns a clean, well-understood serverless baseline into a genuinely competitive enterprise agreement. They are an independent advisor, not the operator of this site.

From WebSocket cost to the negotiation table

Real-time features are worth building, and they are forecastable once you model both meters and keep idle connections in check. A team that understands its WebSocket cost down to messages and connection minutes presents exactly the discipline that earns a strong enterprise discount. To benchmark your API Gateway and broader serverless spend against comparable deals and to value commitments against your real usage, contact us, and review the API Gateway cost reduction guide for the levers that apply across every API type.

Benchmark: $2.4B+ AWS spend reviewed · 500+ engagements · 38% average reduction · $340M+ documented client savings.

