Pricing Intelligence Center

Evidence-backed LLM pricing history & analysis

[Live stats placeholders: Pricing Events, Months Tracked, Avg. Price Drop, Total Events, Price Drops, New Models, Providers]

Price Velocity Chart

Track how LLM API prices have evolved over time. Click any data point to jump to that event.

Pricing Event Timeline

Complete history of LLM pricing changes with evidence and sources.


Savings Calculator

Calculate potential savings from price evolution, batching, and caching. Order matters: savings stack multiplicatively.

Enter your details and click Calculate to see potential savings.
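A minimal sketch of how the calculator's multiplicative stacking works (all figures below are illustrative, not real rates): each discount applies to the cost remaining after the previous one, so two 50% discounts leave 25% of the original spend, not 0%.

```python
# Hypothetical sketch of multiplicative savings stacking.
# The base cost and discount values are illustrative only.

def stacked_cost(base_cost: float, discounts: list[float]) -> float:
    """Apply each discount to the cost remaining after the previous one."""
    cost = base_cost
    for d in discounts:
        cost *= (1.0 - d)
    return cost

# Example: $1,000/mo base spend, then a 50% batch discount and 50% cache savings.
# 0.5 * 0.5 = 0.25 of the original cost remains.
print(stacked_cost(1000.0, [0.50, 0.50]))  # 250.0
```

With flat percentages the product is the same in any order; ordering matters in practice because each discount typically applies only to a subset of tokens (e.g. caching to input tokens, batching to the whole request).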

Subscription vs API Break-Even

Calculate when a flat-rate subscription becomes cheaper than pay-per-token API pricing.

Example usage: 20M tokens (~4,000 conversations)

Subscription: $200/mo  vs  API (STCI-FRONTIER): $--/mo

Break-even at: -- tokens/month (roughly -- conversations)
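The break-even arithmetic behind this widget can be sketched as follows. The blended API rate below is an assumed value, not a figure from the live index; tokens-per-conversation follows the page's 20M tokens ≈ 4,000 conversations ratio.

```python
# Hypothetical break-even sketch: monthly token volume at which
# pay-per-token API spend equals a flat-rate subscription.

def break_even_tokens(subscription_per_month: float,
                      api_price_per_mtok: float) -> float:
    """Tokens/month where API cost equals the subscription price."""
    return subscription_per_month / api_price_per_mtok * 1_000_000

TOKENS_PER_CONVERSATION = 5_000  # from 20M tokens ~= 4,000 conversations

tokens = break_even_tokens(200.0, api_price_per_mtok=10.0)  # assumed $10/Mtok blended
print(f"{tokens:,.0f} tokens ~= {tokens / TOKENS_PER_CONVERSATION:,.0f} conversations")
```

Below the break-even volume, pay-per-token is cheaper; above it, the flat subscription wins. As API prices fall, the break-even volume rises.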

Break-Even Threshold Over Time

As API prices drop, you need more usage to justify subscriptions.

Current Market Snapshot

Live pricing from our daily index.

[Live tier pricing placeholders: Frontier, Efficient, Open (current and best prices load from the daily index)]

Cost Optimization Features

Beyond price drops, providers offer structural features that can reduce costs significantly.

Prompt Caching

50-90% savings

Cache static prompt prefixes. Works for system prompts, few-shot examples, and documents.

Providers: OpenAI, Anthropic, Google
Implementation checklist
  • Identify prompts with static prefixes >1024 tokens
  • Enable caching via API parameter
  • Monitor cache hit rates
  • Adjust prompt structure to maximize hits
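The checklist above can be sanity-checked with a blended-price estimate. This is a rough sketch: the cached-read multiplier and hit rate are assumptions (actual discounts and billing models vary by provider), but it shows why high hit rates on large static prefixes land in the 50-90% range.

```python
# Hypothetical estimator for prompt-caching savings.
# The cached-read multiplier and hit rate are assumed values.

def effective_input_price(base_price: float,
                          cached_read_multiplier: float,
                          cache_hit_rate: float) -> float:
    """Blended per-Mtok input price given a cache hit rate."""
    hit = cache_hit_rate * cached_read_multiplier * base_price
    miss = (1.0 - cache_hit_rate) * base_price
    return hit + miss

# Assumed: $3/Mtok base input, cached reads billed at 10% of base, 80% hit rate.
price = effective_input_price(3.0, cached_read_multiplier=0.10, cache_hit_rate=0.80)
print(round(price, 2))  # 0.84  -> a 72% reduction on input cost
```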

Batch API

50% savings

Submit requests in batches with 24-hour turnaround. Ideal for async workloads.

Providers: OpenAI, Anthropic
Implementation checklist
  • Identify non-time-sensitive requests
  • Create JSONL batch files
  • Submit via batch endpoint
  • Poll for completion
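The JSONL step above can be sketched as follows, using the OpenAI-style batch input shape (one JSON object per line with a `custom_id`, method, URL, and request body). The model name and prompts here are illustrative.

```python
import json

# Sketch of building an OpenAI-style batch input file (JSONL).
# Model name and endpoint are illustrative stand-ins.

def build_batch_lines(prompts: list[str], model: str = "gpt-4o-mini") -> list[str]:
    """One JSON-encoded request per prompt, each with a unique custom_id."""
    lines = []
    for i, prompt in enumerate(prompts):
        request = {
            "custom_id": f"req-{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {"model": model,
                     "messages": [{"role": "user", "content": prompt}]},
        }
        lines.append(json.dumps(request))
    return lines

batch = build_batch_lines(["Summarize doc A", "Summarize doc B"])
with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(batch) + "\n")
# Next: upload the file, submit it via the batch endpoint, poll for completion.
```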

Model Routing

30-70% savings

Route simple queries to cheaper models. Use frontier models only when needed.

Providers: Any
Implementation checklist
  • Classify query complexity
  • Set up routing logic
  • Monitor quality by tier
  • Adjust thresholds based on results
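A toy version of the routing logic above: classify queries by length and a few complexity markers, and only escalate to the frontier tier when they trip. The model names, markers, and threshold are hypothetical placeholders; real routers usually use a small classifier model instead of keyword rules.

```python
# Hypothetical complexity-based router. Model names, markers,
# and the length threshold are illustrative, not recommendations.

CHEAP_MODEL = "efficient-model"      # assumed cheap tier
FRONTIER_MODEL = "frontier-model"    # assumed frontier tier

COMPLEX_MARKERS = ("analyze", "prove", "debug", "step by step")

def route(query: str, max_simple_len: int = 200) -> str:
    """Send short queries without complexity markers to the cheap tier."""
    q = query.lower()
    if len(query) > max_simple_len or any(m in q for m in COMPLEX_MARKERS):
        return FRONTIER_MODEL
    return CHEAP_MODEL

print(route("What's the capital of France?"))        # efficient-model
print(route("Debug this stack trace step by step"))  # frontier-model
```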

Context Window Optimization

20-50% savings

Reduce prompt size through summarization, chunking, and selective retrieval.

Providers: Any
Implementation checklist
  • Measure current prompt sizes
  • Implement summarization for long context
  • Use RAG for selective retrieval
  • Compress examples and instructions
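The measurement and trimming steps above can be sketched as a history-trimming pass that keeps only the most recent messages within a token budget. The chars-per-token ratio is a crude heuristic (a real implementation would use the provider's tokenizer); the budget is illustrative.

```python
# Rough sketch: measure prompt size and trim history to a token budget.
# The 4-chars-per-token ratio is a heuristic, not a real tokenizer.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # ~4 chars/token for English text

def trim_history(messages: list[str], budget_tokens: int) -> list[str]:
    """Keep the most recent messages that fit within the token budget."""
    kept, used = [], 0
    for msg in reversed(messages):
        cost = estimate_tokens(msg)
        if used + cost > budget_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))

history = ["a" * 400, "b" * 400, "c" * 400]  # ~100 tokens each
print(len(trim_history(history, budget_tokens=250)))  # 2 (the two most recent)
```

Summarizing dropped messages instead of discarding them, or retrieving only relevant chunks via RAG, preserves more context for the same budget.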

Stackability Matrix

These features compound. A well-optimized implementation can combine multiple savings:

           Caching   Batching   Routing
Caching       -        Yes        Yes
Batching     Yes        -         Yes
Routing      Yes       Yes         -

Methodology & Verification

Data Collection

Events are manually curated from official provider announcements, pricing pages, and blog posts. Each event includes a direct source URL.

Verification

All events are verified against primary sources. The verifiedAt timestamp indicates when we last confirmed the data.

Confidence Levels

  • High: Official pricing page or announcement
  • Medium: Third-party aggregator or documentation
  • Low: Community reports or indirect sources

Severity Rubric

  • Critical: ≥50% price change OR new tier undercuts by ≥30%
  • High: 20-49% change OR major model launch
  • Medium: 5-19% change OR limited scope
  • Low: ≤4% change OR minor updates
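The percentage side of the rubric above can be expressed as a small classifier; the non-percentage clauses (tier undercuts, model launches, scope) are noted in comments but not modeled. This is a sketch of the rubric, not the site's actual scoring code.

```python
# Sketch of the severity rubric as a function of absolute price change.
# Tier-undercut, launch, and scope clauses are not modeled here.

def severity(pct_change: float) -> str:
    """Map an absolute price-change percentage to a severity level."""
    pct = abs(pct_change)
    if pct >= 50:   # or: new tier undercuts incumbents by >= 30%
        return "critical"
    if pct >= 20:   # or: major model launch
        return "high"
    if pct >= 5:    # or: limited-scope change
        return "medium"
    return "low"    # minor updates

print(severity(-60))  # critical
print(severity(12))   # medium
```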

Corrections Policy

If you find an error in our data, please open an issue on GitHub. We aim to correct verified errors within 24 hours.

Stay Updated

Get notified when major pricing changes happen. No spam, just signal.