Price Velocity Chart
Track how LLM API prices have evolved over time. Click any data point to jump to that event.
| Date | Provider | Model | Input Price | Output Price |
|---|---|---|---|---|
Pricing Event Timeline
Complete history of LLM pricing changes with evidence and sources.
Savings Calculator
Calculate potential savings from price evolution, batching, and caching. Order matters: savings stack multiplicatively, so each reduction applies to the cost left after the previous one.
Processed locally. No data uploaded.
Enter your details and click Calculate to see potential savings.
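To make the multiplicative stacking concrete, here is a minimal sketch. The function name, step names, and discount rates are hypothetical illustrations, not the calculator's actual implementation or real provider discounts. Each step takes its percentage off the cost that remains after the previous step, so three discounts of 30%, 50%, and 40% do not add up to 120%.

```python
def stack_savings(base_cost, steps):
    """Apply fractional discounts in sequence.

    steps is a list of (name, rate) pairs with rate between 0.0 and 1.0.
    Returns (final_cost, breakdown), where breakdown lists the amount
    saved at each step in the order the steps were applied.
    """
    remaining = base_cost
    breakdown = []
    for name, rate in steps:
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"rate for {name!r} must be between 0 and 1")
        saved = remaining * rate      # discount applies to what is left
        breakdown.append((name, saved))
        remaining -= saved
    return remaining, breakdown


# Hypothetical monthly spend and discount rates, for illustration only.
final_cost, breakdown = stack_savings(
    1000.0,
    [("price drop", 0.30), ("batching", 0.50), ("caching", 0.40)],
)
for name, saved in breakdown:
    print(f"{name}: saves ${saved:.2f}")
print(f"final cost: ${final_cost:.2f}")
```

In this simple model the final cost is the same whichever order the steps run in, but the amount attributed to each step depends on its position in the sequence.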
Subscription vs API Break-Even
Calculate when a flat-rate subscription becomes cheaper than pay-per-token API pricing.
Subscription
API (STCI-FRONTIER)
Break-Even Threshold Over Time
As API prices drop, you need more usage to justify subscriptions.
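The break-even arithmetic can be sketched as follows, assuming a single blended API price per million tokens. The subscription price, API prices, and tokens-per-conversation figure are hypothetical inputs, and the function names are illustrative rather than taken from the calculator.

```python
def break_even_tokens(subscription_usd_per_month, api_usd_per_million_tokens):
    """Monthly token volume at which API spend equals the subscription fee."""
    if api_usd_per_million_tokens <= 0:
        raise ValueError("API price must be positive")
    return subscription_usd_per_month / api_usd_per_million_tokens * 1_000_000


def break_even_conversations(tokens, tokens_per_conversation=2_000):
    """Convert a token volume to conversations (assumed average size)."""
    return tokens / tokens_per_conversation


# Hypothetical $20/month subscription against a falling blended API price.
for api_price in (10.0, 5.0, 2.5):
    tokens = break_even_tokens(20.0, api_price)
    chats = break_even_conversations(tokens)
    print(f"${api_price:.2f}/M tokens: break-even at {tokens:,.0f} tokens "
          f"(about {chats:,.0f} conversations)")
```

Halving the API price doubles the break-even volume, which is why the threshold rises over time as API prices fall.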
Current Market Snapshot
Live pricing from our daily index.
Frontier
Efficient
Open
Cost Optimization Features
Beyond price drops, providers offer structural features that can reduce costs significantly.
Prompt Caching
Cache static prompt prefixes. Works for system prompts, few-shot examples, and documents.
Implementation checklist
- Identify prompts with static prefixes >1024 tokens
- Enable caching via API parameter
- Monitor cache hit rates
- Adjust prompt structure to maximize hits
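The first checklist item can be approximated with a short script. This is a rough sketch: it estimates tokens at about 4 characters each, which is an assumption and not a real tokenizer, and the function names are hypothetical. For real measurements, use your provider's tokenizer.

```python
import os

MIN_PREFIX_TOKENS = 1024  # threshold from the checklist above
CHARS_PER_TOKEN = 4       # rough assumption; use a real tokenizer in practice


def shared_prefix(prompts):
    """Longest leading string common to every prompt."""
    # os.path.commonprefix compares character by character, so it works
    # on arbitrary strings, not just file paths.
    return os.path.commonprefix(prompts)


def estimated_tokens(text):
    """Crude token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


def is_cache_candidate(prompts):
    """True if at least two prompts share a static prefix above the threshold."""
    if len(prompts) < 2:
        return False
    return estimated_tokens(shared_prefix(prompts)) > MIN_PREFIX_TOKENS


# Hypothetical prompts: a long static system prompt plus a varying question.
system_prompt = "You are a support assistant. " * 200
prompts = [system_prompt + "Question: how do I reset my password?",
           system_prompt + "Question: where is my invoice?"]
print(is_cache_candidate(prompts))
```

Placing static content first and variable content last keeps the shared prefix as long as possible, which is the point of the last checklist item.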
Batch API
Submit requests in batches with 24-hour turnaround. Ideal for async workloads.
Implementation checklist
- Identify non-time-sensitive requests
- Create JSONL batch files
- Submit via batch endpoint
- Poll for completion
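Creating the JSONL file is the step most easily shown in code. The sketch below writes one JSON object per line. The field names (`custom_id`, `model`, `prompt`) and the model name are hypothetical placeholders, since each provider defines its own batch request schema. Check your provider's batch documentation for the exact shape.

```python
import json


def build_jsonl(prompts, model):
    """Serialize one request per line, as JSONL batch files expect."""
    lines = []
    for i, prompt in enumerate(prompts):
        record = {
            "custom_id": f"request-{i}",  # hypothetical field: lets you match results to inputs
            "model": model,               # hypothetical field
            "prompt": prompt,             # hypothetical field
        }
        # json.dumps escapes embedded newlines, so each record stays on one line.
        lines.append(json.dumps(record))
    return "\n".join(lines) + "\n"


def write_batch_file(path, prompts, model):
    """Write the JSONL payload to disk, ready for upload."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(build_jsonl(prompts, model))


# Hypothetical non-time-sensitive requests.
print(build_jsonl(["Summarize document A", "Summarize document B"], "example-model"))
```

Giving every request a stable identifier matters because batch results may not come back in submission order.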
Model Routing
Route simple queries to cheaper models. Use frontier models only when needed.
Implementation checklist
- Classify query complexity
- Set up routing logic
- Monitor quality by tier
- Adjust thresholds based on results
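A minimal routing sketch along the lines of this checklist is shown below. The complexity heuristic (word count plus a few hint words), the threshold, and the model names are all hypothetical. A production router would more likely use a trained classifier and quality metrics per tier.

```python
import re

LONG_QUERY_WORDS = 60  # hypothetical threshold; tune against quality results
COMPLEX_HINTS = {"prove", "analyze", "debug", "refactor", "derive"}  # hypothetical

# Hypothetical model names standing in for a cheap tier and a frontier tier.
MODEL_BY_TIER = {"simple": "efficient-model", "complex": "frontier-model"}


def classify_complexity(query):
    """Label a query 'simple' or 'complex' using crude surface heuristics."""
    words = re.findall(r"[a-z]+", query.lower())
    if len(words) > LONG_QUERY_WORDS or COMPLEX_HINTS & set(words):
        return "complex"
    return "simple"


def route(query):
    """Pick a model name for the query based on its complexity tier."""
    return MODEL_BY_TIER[classify_complexity(query)]


print(route("What is the capital of France?"))
print(route("Debug this stack trace and refactor the handler."))
```

Keeping the thresholds and hint words in named constants makes the last checklist item, adjusting thresholds based on results, a configuration change and not a code change.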
Context Window Optimization
Reduce prompt size through summarization, chunking, and selective retrieval.
Implementation checklist
- Measure current prompt sizes
- Implement summarization for long context
- Use RAG for selective retrieval
- Compress examples and instructions
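Selective retrieval can be sketched as picking only the chunks relevant to a query until a token budget is reached. This is a simplified illustration: it ranks chunks by keyword overlap instead of embeddings, estimates tokens at about 4 characters each (an assumption), and uses hypothetical function names.

```python
import re

CHARS_PER_TOKEN = 4  # rough assumption; use a real tokenizer in practice


def estimate_tokens(text):
    """Crude token estimate based on character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def _words(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_chunks(query, chunks, token_budget):
    """Keep the chunks sharing the most words with the query, within budget."""
    query_words = _words(query)
    ranked = sorted(chunks, key=lambda c: len(query_words & _words(c)), reverse=True)
    selected, used = [], 0
    for chunk in ranked:
        if not query_words & _words(chunk):
            continue  # no overlap with the query: not worth any tokens
        cost = estimate_tokens(chunk)
        if used + cost <= token_budget:
            selected.append(chunk)
            used += cost
    return selected


# Hypothetical document chunks.
chunks = [
    "Batch API pricing is discounted for async jobs.",
    "Our office is closed on holidays.",
    "Prompt caching reduces input pricing for repeated prefixes.",
]
for chunk in select_chunks("How does caching affect pricing?", chunks, token_budget=1000):
    print(chunk)
```

Measuring prompt size before and after a change like this gives a direct view of the tokens saved per request.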
Stackability Matrix
These features compound. A well-optimized implementation can combine multiple savings:
| | Caching | Batching | Routing |
|---|---|---|---|
| Caching | - | Yes | Yes |
| Batching | Yes | - | Yes |
| Routing | Yes | Yes | - |
Methodology & Verification
Data Collection
Events are manually curated from official provider announcements, pricing pages, and blog posts. Each event includes a direct source URL.
Verification
All events are verified against primary sources. The verifiedAt timestamp indicates when we last confirmed the data.
Confidence Levels
- High: Official pricing page or announcement
- Medium: Third-party aggregator or documentation
- Low: Community reports or indirect sources
Severity Rubric
- Critical: ≥50% price change OR new tier undercuts by ≥30%
- High: 20% to <50% change OR major model launch
- Medium: 5% to <20% change OR limited scope
- Low: <5% change OR minor updates
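The numeric part of the rubric can be expressed as a small function. This is a sketch, not the site's actual classifier. It treats each threshold as a lower bound (so 19.5% counts as Medium) and leaves the non-numeric criteria, such as major model launches or limited scope, to editorial judgment. The function and parameter names are hypothetical.

```python
def severity(pct_change, undercut_pct=0.0):
    """Map a price change to a severity label using the numeric thresholds.

    pct_change: signed percentage change in price (e.g. -25.0 for a 25% cut).
    undercut_pct: how far a new tier undercuts existing pricing, in percent.
    """
    magnitude = abs(pct_change)  # cuts and increases are rated alike
    if magnitude >= 50 or undercut_pct >= 30:
        return "Critical"
    if magnitude >= 20:
        return "High"
    if magnitude >= 5:
        return "Medium"
    return "Low"


# Hypothetical events, for illustration only.
print(severity(-60.0))
print(severity(-25.0))
print(severity(3.0, undercut_pct=35.0))
print(severity(2.0))
```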
Corrections Policy
If you find an error in our data, please open an issue on GitHub. We aim to correct verified errors within 24 hours.
Stay Updated
Get notified when major pricing changes happen. No spam, just signal.
Unsubscribe anytime. See our Privacy Policy.