Anthropic announced announced a new Prompt Caching with Claude feature that boosts Claude’s capabilities for repetitive tasks with large amounts of detailed contextual information. The new feature makes it faster, cheaper and more powerful, available today in Beta through the Anthropic API.
Prompt Caching
This new feature provides a powerful boosts for users that consistently use highly detailed instructions that use example responses and contain a large amount of background information in the prompt, enabling Claude to re-use the data with the cache. This improves the consistency of output, speeds up Claude responses by to 50% (lower latency), and it also makes it up to 90% cheaper to use.
Prompt Caching with Claude is especially useful for complex projects that rely on the same data and is useful for businesses of all sizes, not just enterprise level organizations. This feature is available in a public Beta via the Anthropic API for use with Claude 3.5 Sonnet and Claude 3 Haiku.
The announcement lists the following ways Prompt Caching improves performance:
- “Conversational agents: Reduce cost and latency for extended conversations, especially those with long instructions or uploaded documents.
- Large document processing: Incorporate complete long-form material in your prompt without increasing response latency.
- Detailed instruction sets: Share extensive lists of instructions, procedures, and examples to fine-tune Claude’s responses without incurring repeated costs.
- Coding assistants: Improve autocomplete and codebase Q&A by keeping a summarized version of the codebase in the prompt.
- Agentic tool use: Enhance performance for scenarios involving multiple tool calls and iterative code changes, where each step typically requires a new API call.”
More information about the Anthropic API here:
Build with Claude
Explore latest models – Pricing
Featured Image by Shutterstock/gguy