Pricing Intelligence Center

Evidence-backed LLM pricing history & analysis

[Live stats placeholders: Pricing Events, Months Tracked, Avg. Price Drop, Total Events, Price Drops, New Models, Providers]

Price Velocity Chart

Track how LLM API prices have evolved over time. Click any data point to jump to that event.

Pricing Event Timeline

Complete history of LLM pricing changes with evidence and sources.


Savings Calculator

Calculate potential savings from price evolution, batching, and caching. Order matters: savings stack multiplicatively.

Enter your details and click Calculate to see potential savings.
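A minimal sketch of how the calculator's multiplicative stacking works (all figures below are illustrative, not real rates): each discount applies to the cost remaining after the previous one, so two 50% discounts leave 25% of the original spend, not 0%.

```python
# Hypothetical sketch of multiplicative savings stacking.
# The base cost and discount values are illustrative only.

def stacked_cost(base_cost: float, discounts: list[float]) -> float:
    """Apply each discount to the cost remaining after the previous one."""
    cost = base_cost
    for d in discounts:
        cost *= (1.0 - d)
    return cost

# Example: $1,000/mo base spend, then a 50% batch discount and 50% cache savings.
# 0.5 * 0.5 = 0.25 of the original cost remains.
print(stacked_cost(1000.0, [0.50, 0.50]))  # 250.0
```

With flat percentages the product is the same in any order; ordering matters in practice because each discount typically applies only to a subset of tokens (e.g. caching to input tokens, batching to the whole request).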

Subscription vs API Break-Even

Calculate when a flat-rate subscription becomes cheaper than pay-per-token API pricing.

Example usage: 20M tokens (~4,000 conversations)

Subscription: $200/mo  vs  API (STCI-FRONTIER): $--/mo

Break-even at: -- tokens/month (roughly -- conversations)
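The break-even arithmetic behind this widget can be sketched as follows. The blended API rate below is an assumed value, not a figure from the live index; tokens-per-conversation follows the page's 20M tokens ≈ 4,000 conversations ratio.

```python
# Hypothetical break-even sketch: monthly token volume at which
# pay-per-token API spend equals a flat-rate subscription.

def break_even_tokens(subscription_per_month: float,
                      api_price_per_mtok: float) -> float:
    """Tokens/month where API cost equals the subscription price."""
    return subscription_per_month / api_price_per_mtok * 1_000_000

TOKENS_PER_CONVERSATION = 5_000  # from 20M tokens ~= 4,000 conversations

tokens = break_even_tokens(200.0, api_price_per_mtok=10.0)  # assumed $10/Mtok blended
print(f"{tokens:,.0f} tokens ~= {tokens / TOKENS_PER_CONVERSATION:,.0f} conversations")
```

Below the break-even volume, pay-per-token is cheaper; above it, the flat subscription wins. As API prices fall, the break-even volume rises.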

Break-Even Threshold Over Time

As API prices drop, you need more usage to justify subscriptions.

Current Market Snapshot

Live pricing from our daily index.

[Live tier pricing placeholders: Frontier, Efficient, Open (current and best prices load from the daily index)]

Cost Optimization Features

Beyond price drops, providers offer structural features that can reduce costs significantly.

Prompt Caching

50-90% savings

Cache static prompt prefixes. Works for system prompts, few-shot examples, and documents.

Providers: OpenAI, Anthropic, Google
Implementation checklist
  • Identify prompts with static prefixes >1024 tokens
  • Enable caching via API parameter
  • Monitor cache hit rates
  • Adjust prompt structure to maximize hits
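The checklist above can be sanity-checked with a blended-price estimate. This is a rough sketch: the cached-read multiplier and hit rate are assumptions (actual discounts and billing models vary by provider), but it shows why high hit rates on large static prefixes land in the 50-90% range.

```python
# Hypothetical estimator for prompt-caching savings.
# The cached-read multiplier and hit rate are assumed values.

def effective_input_price(base_price: float,
                          cached_read_multiplier: float,
                          cache_hit_rate: float) -> float:
    """Blended per-Mtok input price given a cache hit rate."""
    hit = cache_hit_rate * cached_read_multiplier * base_price
    miss = (1.0 - cache_hit_rate) * base_price
    return hit + miss

# Assumed: $3/Mtok base input, cached reads billed at 10% of base, 80% hit rate.
price = effective_input_price(3.0, cached_read_multiplier=0.10, cache_hit_rate=0.80)
print(round(price, 2))  # 0.84  -> a 72% reduction on input cost
```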

Batch API

50% savings

Submit requests in batches with 24-hour turnaround. Ideal for async workloads.

Providers: OpenAI, Anthropic
Implementation checklist
  • Identify non-time-sensitive requests
  • Create JSONL batch files
  • Submit via batch endpoint
  • Poll for completion
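The JSONL step above can be sketched as follows, using the OpenAI-style batch input shape (one JSON object per line with a `custom_id`, method, URL, and request body). The model name and prompts here are illustrative.

```python
import json

# Sketch of building an OpenAI-style batch input file (JSONL).
# Model name and endpoint are illustrative stand-ins.

def build_batch_lines(prompts: list[str], model: str = "gpt-4o-mini") -> list[str]:
    """One JSON-encoded request per prompt, each with a unique custom_id."""
    lines = []
    for i, prompt in enumerate(prompts):
        request = {
            "custom_id": f"req-{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {"model": model,
                     "messages": [{"role": "user", "content": prompt}]},
        }
        lines.append(json.dumps(request))
    return lines

batch = build_batch_lines(["Summarize doc A", "Summarize doc B"])
with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(batch) + "\n")
# Next: upload the file, submit it via the batch endpoint, poll for completion.
```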

Model Routing

30-70% savings

Route simple queries to cheaper models. Use frontier models only when needed.

Providers: Any
Implementation checklist
  • Classify query complexity
  • Set up routing logic
  • Monitor quality by tier
  • Adjust thresholds based on results
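A toy version of the routing logic above: classify queries by length and a few complexity markers, and only escalate to the frontier tier when they trip. The model names, markers, and threshold are hypothetical placeholders; real routers usually use a small classifier model instead of keyword rules.

```python
# Hypothetical complexity-based router. Model names, markers,
# and the length threshold are illustrative, not recommendations.

CHEAP_MODEL = "efficient-model"      # assumed cheap tier
FRONTIER_MODEL = "frontier-model"    # assumed frontier tier

COMPLEX_MARKERS = ("analyze", "prove", "debug", "step by step")

def route(query: str, max_simple_len: int = 200) -> str:
    """Send short queries without complexity markers to the cheap tier."""
    q = query.lower()
    if len(query) > max_simple_len or any(m in q for m in COMPLEX_MARKERS):
        return FRONTIER_MODEL
    return CHEAP_MODEL

print(route("What's the capital of France?"))        # efficient-model
print(route("Debug this stack trace step by step"))  # frontier-model
```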

Context Window Optimization

20-50% savings

Reduce prompt size through summarization, chunking, and selective retrieval.

Providers: Any
Implementation checklist
  • Measure current prompt sizes
  • Implement summarization for long context
  • Use RAG for selective retrieval
  • Compress examples and instructions
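The measurement and trimming steps above can be sketched as a history-trimming pass that keeps only the most recent messages within a token budget. The chars-per-token ratio is a crude heuristic (a real implementation would use the provider's tokenizer); the budget is illustrative.

```python
# Rough sketch: measure prompt size and trim history to a token budget.
# The 4-chars-per-token ratio is a heuristic, not a real tokenizer.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # ~4 chars/token for English text

def trim_history(messages: list[str], budget_tokens: int) -> list[str]:
    """Keep the most recent messages that fit within the token budget."""
    kept, used = [], 0
    for msg in reversed(messages):
        cost = estimate_tokens(msg)
        if used + cost > budget_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))

history = ["a" * 400, "b" * 400, "c" * 400]  # ~100 tokens each
print(len(trim_history(history, budget_tokens=250)))  # 2 (the two most recent)
```

Summarizing dropped messages instead of discarding them, or retrieving only relevant chunks via RAG, preserves more context for the same budget.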

Stackability Matrix

These features compound. A well-optimized implementation can combine multiple savings:

           Caching   Batching   Routing
Caching       -        Yes        Yes
Batching     Yes        -         Yes
Routing      Yes       Yes         -

Methodology & Verification

Data Collection

Events are manually curated from official provider announcements, pricing pages, and blog posts. Each event includes a direct source URL.

Verification

All events are verified against primary sources. The verifiedAt timestamp indicates when we last confirmed the data.

Confidence Levels

  • High: Official pricing page or announcement
  • Medium: Third-party aggregator or documentation
  • Low: Community reports or indirect sources

Severity Rubric

  • Critical: ≥50% price change OR new tier undercuts by ≥30%
  • High: 20-49% change OR major model launch
  • Medium: 5-19% change OR limited scope
  • Low: ≤4% change OR minor updates
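The percentage side of the rubric above can be expressed as a small classifier; the non-percentage clauses (tier undercuts, model launches, scope) are noted in comments but not modeled. This is a sketch of the rubric, not the site's actual scoring code.

```python
# Sketch of the severity rubric as a function of absolute price change.
# Tier-undercut, launch, and scope clauses are not modeled here.

def severity(pct_change: float) -> str:
    """Map an absolute price-change percentage to a severity level."""
    pct = abs(pct_change)
    if pct >= 50:   # or: new tier undercuts incumbents by >= 30%
        return "critical"
    if pct >= 20:   # or: major model launch
        return "high"
    if pct >= 5:    # or: limited-scope change
        return "medium"
    return "low"    # minor updates

print(severity(-60))  # critical
print(severity(12))   # medium
```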

Corrections Policy

If you find an error in our data, please open an issue on GitHub. We aim to correct verified errors within 24 hours.

Stay Updated

Get notified when major pricing changes happen. No spam, just signal.