
Troubleshooting an Anthropic cache-cost spike: avoid injecting message_id into the system prompt prefix

Problem/scenario: cache_write suddenly surges (roughly 80-170x) in Anthropic sessions, producing abnormal cost. Preconditions: OpenClaw 2026.2.15+, using an Anthropic model with long context enabled. Implementation steps: check whether the system prompt injects a mutable message_id every turn → move volatile inbound metadata out of the system prompt (or at minimum remove message_id) → carry it in a user-message prefix instead → watch the CW/CR ratio recover. Key configuration: keep the system prompt prefix stable. Verification: cache_read hits recover and each turn writes only the incremental tokens. Risk: if per-turn dynamic fields keep being written into the system prompt, they will keep breaking the prefix cache. Source: Issue #20894 (see also Anthropic's prefix-caching mechanism).

GitHub · Discovered 2026-02-19 · Author: Rubedo-AI
Prerequisites
  • Provider is Anthropic and usage telemetry includes cache_write/cache_read metrics.
  • Session contains a large reusable prefix (system instructions plus long history).
Steps
  1. Collect baseline CW/CR ratio over several turns and identify sudden cache_write amplification.
  2. Inspect prompt assembly and confirm whether volatile inbound fields (for example message_id) sit inside the system prompt.
  3. Refactor so the system prompt keeps only stable metadata, and volatile metadata moves to a per-message user prefix.
  4. Run A/B turns post-change and compare CW/CR, average CW per call, and daily cost trend.
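Step 3 above can be sketched as follows. This is a minimal illustration, not OpenClaw's actual prompt-assembly code: the names `SYSTEM_PROMPT` and `build_messages` are hypothetical, and the message shape simply mirrors the common `{"role", "content"}` convention.

```python
# Hypothetical sketch: keep the system prompt byte-stable so the provider's
# prefix cache can hit, and carry volatile inbound metadata (message_id)
# in the newest user message instead.

SYSTEM_PROMPT = "You are the gateway assistant."  # stable -> cacheable prefix

def build_messages(history, user_text, message_id):
    # Cache-busting anti-pattern (avoid):
    #   system = f"{SYSTEM_PROMPT}\nmessage_id={message_id}"
    # Stable alternative: volatile metadata rides only in the new user turn,
    # so each turn appends a small delta after the unchanged prefix.
    prefixed = f"[message_id={message_id}]\n{user_text}"
    return SYSTEM_PROMPT, history + [{"role": "user", "content": prefixed}]

system, messages = build_messages([], "Summarize the report.", "msg_123")
assert "msg_123" not in system  # system prefix stays identical across turns
assert messages[-1]["content"].startswith("[message_id=msg_123]")
```

The design point is that prefix caches match on exact leading bytes: any per-turn field in the system prompt invalidates the entire cached prefix every call.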
Commands
openclaw gateway status
openclaw help
Verify

Cache reads recover to long-prefix hits and each turn writes only small token deltas instead of full prompt rewrites.
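The verification above can be automated with a small telemetry check. The field names follow Anthropic's usage reporting (`cache_read_input_tokens`, `cache_creation_input_tokens`), but the helper functions and the `threshold` value are illustrative assumptions, not part of OpenClaw.

```python
# Illustrative sketch: compute the cache_write / cache_read ratio per turn
# and flag turns whose ratio blows up, indicating a broken prefix cache.

def cw_cr_ratio(usage):
    # Guard against division by zero when no cache reads occurred.
    reads = usage.get("cache_read_input_tokens", 0)
    writes = usage.get("cache_creation_input_tokens", 0)
    return writes / max(reads, 1)

def flag_amplification(turns, threshold=10.0):
    # threshold is a hypothetical cutoff; tune it against your baseline.
    return [i for i, u in enumerate(turns) if cw_cr_ratio(u) > threshold]

healthy = {"cache_read_input_tokens": 40_000, "cache_creation_input_tokens": 500}
broken = {"cache_read_input_tokens": 0, "cache_creation_input_tokens": 42_000}
assert flag_amplification([healthy, broken]) == [1]
```

After the fix, a healthy turn should look like `healthy` above: large cache reads against a small write delta.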

Caveats
  • If metadata placement changes downstream routing logic, add explicit regression tests before rollout.
  • Provider-side cache TTL and billing policy can vary by model/version (needs verification).
Source attribution

This tip is aggregated from community/public sources and preserved with attribution.

Open original source ↗