An open-source MCP (Model Context Protocol) plugin launched on Hacker News reduces Claude API costs by up to 90% through intelligent cache breakpoint injection. Prompt-caching.ai automatically leverages Anthropic's caching API, where cached content reads cost only 10% of normal token rates, requiring zero configuration across Claude Code, Cursor, Windsurf, and other MCP-compatible clients.
Four Intelligent Caching Modes Optimize Different Workflows
The plugin operates through four distinct modes tailored to common development scenarios. BugFix Mode detects stack traces and caches error context, while Refactor Mode identifies refactor keywords and caches style guides. File Tracking marks second reads of the same file for caching, and Conversation Freeze freezes older messages as cached prefixes after a specified number of turns.
While initial cache creation costs 1.25× normal pricing, subsequent reads provide pure savings. The tool is particularly valuable for extended coding sessions where the same files, system prompts, or documentation are referenced repeatedly.
Benchmark Results Show Dramatic Token Reductions
Real-world testing demonstrates significant cost savings across different use cases:
- Bug fix sessions: 85% token reduction
- Refactor work: 80% reduction
- General coding: 92% reduction
- Repeated file reads: 90% reduction
The tool received 62 points and 24 comments on Hacker News on March 13, 2026, with discussions focusing on practical implementation questions and cost optimization strategies for teams using Claude-based development tools.
Zero-Configuration Installation Across MCP Clients
The plugin works seamlessly with any MCP-compatible development environment without requiring manual configuration. This accessibility makes it practical for individual developers and teams looking to optimize API spending without changing their existing workflows or tooling.
Key Takeaways
- Prompt-caching.ai reduces Claude API costs by up to 90% through automatic cache breakpoint injection
- Cached content reads cost only 10% of normal token rates, though initial cache creation costs 1.25× normal pricing
- Four intelligent modes optimize caching for bug fixes, refactoring, file tracking, and conversation history
- Benchmark results show 80-92% token reduction across different development workflows
- The open-source plugin requires zero configuration and works across Claude Code, Cursor, Windsurf, and other MCP-compatible clients