AWS Rekognition Cost Strategy: Image, Video, and Custom Labels
Computer vision workloads on AWS land in one of three Rekognition pricing models, and the cost difference between them for the same business outcome can be 30x. Picking the wrong model is the single most common Rekognition cost mistake we see across 500+ engagements and $2.4B+ in AWS spend reviewed.
This guide breaks down Rekognition's pricing surface, the operational patterns that inflate the bill, and the contract-layer plays that compound the savings.
The Three Pricing Surfaces
1. Rekognition Image
Pay-per-API-call, per image. Operations include label detection, face detection, face comparison, text-in-image, content moderation, and PPE detection. Different operations bill at different rates. Volume tiers reduce per-call cost; the highest tier is roughly half the entry-tier rate.
The trap: workloads that call multiple operations per image (DetectLabels + DetectModerationLabels + DetectFaces on every upload) are billed independently for each operation. A pipeline running three operations per image pays roughly 3x what a single-operation pipeline costs, with the exact multiple depending on each operation's rate.
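To make the multiplication concrete, here is a minimal cost sketch. The tier sizes and per-image prices are placeholders we invented for illustration, not AWS list prices, and the sketch assumes each operation accrues its own volume tier, which should be checked against the current pricing page. The function names are ours.

```python
# Hypothetical tier table: (images in tier, USD per image).
# Placeholder numbers for illustration only, NOT AWS list prices.
EXAMPLE_TIERS = [
    (1_000_000, 0.0010),  # first 1M images in the month
    (4_000_000, 0.0008),  # next 4M images
    (None, 0.0005),       # everything beyond that
]


def tiered_cost(images, tiers=EXAMPLE_TIERS):
    """Monthly cost of ONE operation applied to `images` images."""
    remaining, total = images, 0.0
    for size, price in tiers:
        in_tier = remaining if size is None else min(remaining, size)
        total += in_tier * price
        remaining -= in_tier
        if remaining == 0:
            break
    return total


def pipeline_cost(images, operations, tiers_by_operation=None):
    """Sum the cost of every operation; each one is billed independently."""
    tiers_by_operation = tiers_by_operation or {}
    return sum(
        tiered_cost(images, tiers_by_operation.get(op, EXAMPLE_TIERS))
        for op in operations
    )


if __name__ == "__main__":
    ops = ["DetectLabels", "DetectModerationLabels", "DetectFaces"]
    print(pipeline_cost(2_000_000, ops[:1]))  # one operation per image
    print(pipeline_cost(2_000_000, ops))      # three operations per image
```

Swap in the real rates for each operation you call; the point is that the bill scales with operations per image, not with uploads.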
2. Rekognition Video
Pay-per-minute of video analyzed, with separate pricing for stored video analysis and streaming video analysis. Normalized to comparable content volume, video is typically 10–30x the per-image cost. Workloads that "just need to check the video too" routinely add an order of magnitude to the bill without anyone noticing.
3. Rekognition Custom Labels and Custom Moderation
A completely different commercial model: pay-per-hour to run a custom model endpoint, plus training cost. Custom Labels endpoints bill while running regardless of inferences processed. For intermittent workloads, this is brutal — a single endpoint running 24/7 is several thousand dollars per month even if it serves 100 inferences.
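A quick way to see why idle endpoints hurt is to compute the effective cost per inference. The sketch below is plain arithmetic; the hourly rate passed in the example is a placeholder we picked for illustration, so substitute the rate from your own bill.

```python
HOURS_PER_MONTH = 730  # average number of hours in a calendar month


def endpoint_monthly_cost(hourly_rate, hours_running=HOURS_PER_MONTH,
                          inference_units=1):
    """Cost of keeping an endpoint up, independent of traffic."""
    return hourly_rate * hours_running * inference_units


def cost_per_inference(hourly_rate, inferences,
                       hours_running=HOURS_PER_MONTH, inference_units=1):
    """Effective price of each inference once the hourly meter is spread out."""
    if inferences <= 0:
        raise ValueError("inferences must be positive")
    monthly = endpoint_monthly_cost(hourly_rate, hours_running, inference_units)
    return monthly / inferences


if __name__ == "__main__":
    rate = 4.0  # hypothetical USD per hour, for illustration only
    print(endpoint_monthly_cost(rate))        # always-on monthly cost
    print(cost_per_inference(rate, 100))      # 100 inferences in the month
    print(cost_per_inference(rate, 100, hours_running=4))  # started on demand
```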
Where the Real Money Goes
Across audited Rekognition workloads, four patterns dominate cost:
- Multi-operation calls per image. Five operations per upload instead of one.
- Stored Video analysis on full-length videos. When only the first 30 seconds need analysis.
- Idle Custom Labels endpoints. Especially in non-production environments.
- Face Collections growth. Large face indexes have storage costs that grow linearly forever.
The Optimization Sequence
Step 1: Audit Operation Mix
Pull CloudTrail events for Rekognition. Bucket by operation. Identify operations that are called but whose results are not consumed downstream. We routinely find DetectFaces calls on every image in pipelines where no downstream code touches the face data.
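A minimal sketch of the bucketing step, assuming the CloudTrail records have already been exported and parsed into dictionaries. `eventSource` and `eventName` are standard CloudTrail record fields; `consumed_operations` is a hypothetical set you would compile by reviewing which results downstream code actually reads.

```python
from collections import Counter

REKOGNITION_SOURCE = "rekognition.amazonaws.com"


def count_operations(events):
    """Count Rekognition API calls per operation from parsed CloudTrail records."""
    return Counter(
        event["eventName"]
        for event in events
        if event.get("eventSource") == REKOGNITION_SOURCE
    )


def unused_operations(call_counts, consumed_operations):
    """Operations that are called (and billed) but never read downstream."""
    return {
        op: count
        for op, count in call_counts.items()
        if op not in consumed_operations
    }


if __name__ == "__main__":
    sample = [
        {"eventSource": "rekognition.amazonaws.com", "eventName": "DetectLabels"},
        {"eventSource": "rekognition.amazonaws.com", "eventName": "DetectFaces"},
        {"eventSource": "rekognition.amazonaws.com", "eventName": "DetectFaces"},
        {"eventSource": "s3.amazonaws.com", "eventName": "PutObject"},
    ]
    counts = count_operations(sample)
    print(unused_operations(counts, consumed_operations={"DetectLabels"}))
```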
Step 2: Pre-Filter Before Calling
For image workloads with significant garbage input (uploads, user-submitted content), a cheap pre-filter saves Rekognition calls on images that should never have been analyzed. A simple S3 metadata check (file size, dimensions) eliminates 5–15% of calls on most upload pipelines.
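A pre-filter of this kind can be a few lines. The sketch below assumes the object size and pixel dimensions are already available (for example from object metadata written at upload time). The thresholds are illustrative values we chose, not official service limits, so tune them to your own inputs.

```python
# Illustrative thresholds, chosen for this sketch. Adjust to your pipeline.
MIN_BYTES = 1_024             # smaller files are usually broken uploads
MAX_BYTES = 15 * 1024 * 1024  # larger files get resized or rejected first
MIN_SIDE_PX = 80              # tiny thumbnails rarely yield useful results


def should_analyze(size_bytes, width, height):
    """Return True only for images worth paying to analyze."""
    if not MIN_BYTES <= size_bytes <= MAX_BYTES:
        return False
    if min(width, height) < MIN_SIDE_PX:
        return False
    return True


if __name__ == "__main__":
    uploads = [
        ("photo.jpg", 200_000, 1024, 768),
        ("broken.jpg", 300, 1024, 768),
        ("icon.png", 4_000, 32, 32),
    ]
    for name, size, w, h in uploads:
        print(name, should_analyze(size, w, h))
```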
Step 3: Sample, Don't Saturate, Video
Analyzing every frame of a video is rarely required. When frames are extracted and analyzed individually, sampling at 1 frame per second instead of 30 frames per second cuts the number of frames analyzed, and therefore the cost, by 30x, with negligible quality loss for most labeling, moderation, and detection use cases.
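The sampling arithmetic is simple enough to sketch. This assumes frames are extracted from the video by your own tooling and sent for analysis one at a time; the function names are ours.

```python
def sampled_frame_indices(total_frames, source_fps, target_fps=1.0):
    """Indices of the frames to keep when downsampling to `target_fps`."""
    if source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    # Keep every Nth frame; never step by less than one frame.
    step = max(1, round(source_fps / target_fps))
    return list(range(0, total_frames, step))


def reduction_factor(total_frames, source_fps, target_fps=1.0):
    """How many times fewer frames are analyzed after sampling."""
    kept = len(sampled_frame_indices(total_frames, source_fps, target_fps))
    return total_frames / kept if kept else 0.0


if __name__ == "__main__":
    # A 30-second clip recorded at 30 frames per second.
    print(sampled_frame_indices(90, 30))  # first three seconds only
    print(reduction_factor(900, 30))
```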
Step 4: Schedule Custom Labels Endpoints
If Custom Labels endpoints support batch workloads that run on a schedule, start them on schedule and stop them when done. We have cut Custom Labels spend 70%+ on workloads that previously ran 24/7 endpoints to support 4-hour daily batch jobs.
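One way to guarantee the endpoint stops when the batch finishes (or fails) is to wrap the job in a context manager. The sketch below takes a boto3 Rekognition client as a parameter and uses its `start_project_version`, `stop_project_version`, and `project_version_running` waiter. The ARNs, version name, and `run_batch` in the usage comment are hypothetical placeholders.

```python
from contextlib import contextmanager


@contextmanager
def running_endpoint(client, project_arn, version_name, version_arn,
                     min_inference_units=1):
    """Start a Custom Labels model version, yield while it runs, always stop it."""
    client.start_project_version(
        ProjectVersionArn=version_arn,
        MinInferenceUnits=min_inference_units,
    )
    try:
        # Block until the model version reports RUNNING before sending work.
        client.get_waiter("project_version_running").wait(
            ProjectArn=project_arn,
            VersionNames=[version_name],
        )
        yield
    finally:
        # Runs even if the batch raises, so the hourly meter stops.
        client.stop_project_version(ProjectVersionArn=version_arn)


# Usage sketch (placeholders, not real resources):
#   import boto3
#   client = boto3.client("rekognition")
#   with running_endpoint(client, PROJECT_ARN, VERSION_NAME, VERSION_ARN):
#       run_batch()  # hypothetical function that calls detect_custom_labels
```

Trigger this from whatever scheduler already runs the batch. For a 4-hour daily job, the endpoint is billed for roughly 4 hours plus startup time instead of 24.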
Step 5: Migrate Long-Tail Moderation to Foundation Models
For moderation use cases with custom policies — where Rekognition Content Moderation's stock categories do not align — foundation models with vision are often cheaper and more accurate. The architecture becomes: Rekognition for standard categories, Bedrock for custom policy checks. We cover the Bedrock side in Bedrock AI pricing strategy.
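The split can be expressed as a small routing table. In this sketch the category names are hypothetical labels we made up, not Rekognition's official moderation taxonomy, and the two route names simply stand in for whichever Rekognition and Bedrock calls your pipeline makes.

```python
# Hypothetical names for checks the stock moderation categories already cover.
STOCK_CATEGORIES = {"explicit_nudity", "violence", "hate_symbols"}


def route_policy_checks(policy_checks, stock_categories=STOCK_CATEGORIES):
    """Split policy checks between stock moderation and a foundation model."""
    routes = {"rekognition": [], "foundation_model": []}
    for check in policy_checks:
        target = "rekognition" if check in stock_categories else "foundation_model"
        routes[target].append(check)
    return routes


if __name__ == "__main__":
    checks = ["violence", "competitor_logo_on_packaging", "explicit_nudity"]
    print(route_policy_checks(checks))
```

Images that only need stock checks never reach the foundation model, which keeps the more expensive path limited to the long tail.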
The Contract Layer
Rekognition spend, like Comprehend and Textract spend, can be negotiated into the AWS Enterprise Discount Program (EDP) at the full discount tier. Three specific tactics:
- EDP inclusion at full tier. AWS will offer a carved-out smaller discount. Push back. We routinely close Rekognition spend into the full blended EDP tier.
- Custom Labels commitment pricing. For Custom Labels workloads above $30K/month, AWS will discount the hourly endpoint rate via private offer. Ask.
- Cross-service flex. Bundle Rekognition spend with Bedrock and SageMaker spend into a single AI-services pool that floats across SKUs. This protects you when workloads shift between Rekognition and foundation-model alternatives.
The Sequence That Actually Works
- Audit operation mix. Eliminate unused operations.
- Pre-filter inputs. Especially for user-uploaded content.
- Sample video. Don't analyze every frame.
- Schedule Custom Labels endpoints. Stop running them 24/7 in non-production.
- Right-size face collections. Archive or delete inactive entries.
- Negotiate Rekognition into the EDP at full tier.
For related cost work on other AI services, see Textract pricing analysis, SageMaker inference cost reduction, and EDP negotiation.
Frequently Asked Questions
How is Rekognition video pricing different from image pricing?
Rekognition Video bills per minute of video processed and is typically 10–30x the per-image cost when normalized to comparable content volumes. Stored video analysis and streaming video analysis have separate rate structures.
Is Rekognition Custom Labels worth the hourly endpoint cost?
Custom Labels endpoints bill per-hour while running, regardless of inferences processed. It is cost-effective only with sustained, high-volume inference. Intermittent workloads are dramatically cheaper on standard Rekognition or SageMaker.
Can foundation models replace Rekognition for moderation?
For custom moderation policies that don't align with Rekognition's stock categories, foundation models with vision are often cheaper and more accurate. Standard moderation categories typically remain cheaper on Rekognition.
How much can Face Collection storage cost?
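Face Collection storage is small per record but grows linearly for as long as entries are never deleted. Workloads with millions of indexed faces can quietly accumulate $1,000+ per month of storage. Face collections do not expire entries on their own, so enforce a retention (TTL) policy in your own pipeline and archive or delete inactive entries.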
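A retention policy can be sketched as two small functions. This assumes your application keeps its own record of when each face ID was last matched (the `last_seen_by_face_id` mapping is hypothetical) and that a boto3 Rekognition client is passed in. `delete_faces` is the real client call; the batch size is a conservative value we chose.

```python
from datetime import datetime, timedelta, timezone

DELETE_BATCH_SIZE = 1000  # conservative choice; check the DeleteFaces limit


def stale_face_ids(last_seen_by_face_id, max_age_days, now=None):
    """Face IDs whose last recorded match is older than the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return sorted(
        face_id
        for face_id, last_seen in last_seen_by_face_id.items()
        if last_seen < cutoff
    )


def delete_stale_faces(client, collection_id, face_ids,
                       batch_size=DELETE_BATCH_SIZE):
    """Delete faces in batches and return how many the service confirmed."""
    deleted = 0
    for start in range(0, len(face_ids), batch_size):
        batch = face_ids[start:start + batch_size]
        response = client.delete_faces(CollectionId=collection_id, FaceIds=batch)
        deleted += len(response.get("DeletedFaces", []))
    return deleted
```

Archive whatever you need from your own records before deleting, since removed faces have to be re-indexed from source images to come back.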
The Bottom Line
Rekognition cost is driven more by which API you call than by the per-call rate. Operation mix audits, video sampling, and Custom Labels endpoint scheduling routinely cut bills 40–60% before any contract work. EDP inclusion at full tier compounds whatever the technical optimization delivered.
If your Rekognition spend has crossed $20,000/month, contact us for a pipeline and contract review.