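GitHub PR: web_fetch adds Cloudflare Markdown for Agents compatibility
Addresses cases where fetched web page text is low quality or noisy: web_fetch prefers Cloudflare's Markdown output, which makes knowledge collection and summarization more stable.
GITHUB | Discovered 2026-02-13 | Author lance33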
Prerequisites
- Your OpenClaw version includes the merged web_fetch enhancement from PR #15376.
- Target pages are reachable and not blocked by anti-bot/network policy.
Steps
- Upgrade to a build that includes PR #15376 and restart the gateway.
- Run paired fetch tests on the same URL (before/after) to compare markdown cleanliness and retained structure.
- Use the cleaner markdown output as input for downstream summarization/RAG ingestion.
- Keep a fallback extraction path for pages where Cloudflare conversion is unavailable.
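The fallback step above can be sketched outside OpenClaw as follows. This is a minimal illustration, not the code from PR #15376. It assumes that markdown conversion is requested through HTTP content negotiation (an `Accept: text/markdown` header) and that a converted response reports a `text/markdown` content type; the source does not spell out either detail. The names `fetch`, `select_body`, and `fallback_extract` are hypothetical.

```python
import urllib.request
from html.parser import HTMLParser

# Assumption: markdown is requested via content negotiation, HTML as second choice.
ACCEPT_HEADER = "text/markdown, text/html;q=0.8"


class _TextExtractor(HTMLParser):
    """Tiny fallback extractor: collects visible text, skipping script/style."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def fallback_extract(html):
    """Return the visible text of an HTML document, one chunk per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def is_markdown_response(content_type):
    """Compare only the media type, ignoring parameters such as charset."""
    return content_type.split(";")[0].strip().lower() == "text/markdown"


def select_body(content_type, body):
    """Return (text, path), where path is "markdown" or "fallback"."""
    if is_markdown_response(content_type):
        return body, "markdown"
    return fallback_extract(body), "fallback"


def fetch(url, timeout=10.0):
    """Fetch a URL, preferring markdown and falling back to text extraction."""
    request = urllib.request.Request(url, headers={"Accept": ACCEPT_HEADER})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        content_type = response.headers.get("Content-Type", "")
        charset = response.headers.get_content_charset() or "utf-8"
        body = response.read().decode(charset, errors="replace")
    return select_body(content_type, body)
```

Returning the path taken alongside the text makes it easy to log how often the fallback is used, which helps with the before/after comparison.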
Commands
openclaw gateway status
openclaw gateway restart
openclaw help
Verify
For the test URLs, the extracted markdown contains less boilerplate noise and preserves headings and lists for downstream processing.
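One rough way to make this check measurable is to count structural lines and boilerplate lines in each output. The sketch below is illustrative only: the function names and the boilerplate phrase list are hypothetical, and the regular expressions cover only ATX headings and simple list markers.

```python
import re

# ATX headings ("# Title") and simple bullet or numbered list items.
HEADING = re.compile(r"^#{1,6}\s+\S")
LIST_ITEM = re.compile(r"^\s*(?:[-*+]|\d+[.)])\s+\S")


def structure_stats(markdown):
    """Count non-empty lines, headings, and list items in a markdown string."""
    lines = [line for line in markdown.splitlines() if line.strip()]
    return {
        "lines": len(lines),
        "headings": sum(1 for line in lines if HEADING.match(line)),
        "list_items": sum(1 for line in lines if LIST_ITEM.match(line)),
    }


def boilerplate_ratio(markdown, boilerplate_phrases):
    """Share of non-empty lines containing any of the given phrases."""
    lines = [line for line in markdown.splitlines() if line.strip()]
    if not lines:
        return 0.0
    phrases = [phrase.lower() for phrase in boilerplate_phrases]
    noisy = sum(1 for line in lines if any(p in line.lower() for p in phrases))
    return noisy / len(lines)


# Hypothetical before/after outputs for the same URL.
before = "Home | Login\nTitle\nitem one\nitem two\nAccept cookies\n"
after = "# Title\n\n- item one\n- item two\n"
print(structure_stats(before), boilerplate_ratio(before, ["login", "cookies"]))
print(structure_stats(after), boilerplate_ratio(after, ["login", "cookies"]))
```

A cleaner extraction should show a lower boilerplate ratio and non-zero heading and list counts; the phrase list needs tuning per site.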
Caveats
- Coverage depends on Cloudflare conversion availability and site response headers (needs verification).
- Do not assume identical formatting across all websites; keep per-domain quality checks.
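Per-domain checks can be as simple as tracking how often each domain actually returned converted markdown. The sketch below assumes each fetch was recorded as a `(url, path)` pair, where `path` is `"markdown"` or `"fallback"`; that record format, the function name, and the 0.8 threshold are all hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def domains_needing_review(results, min_markdown_share=0.8):
    """Return domains whose share of markdown responses is below the threshold.

    results: iterable of (url, path) pairs, path being "markdown" or "fallback".
    """
    totals = defaultdict(int)
    converted = defaultdict(int)
    for url, path in results:
        domain = urlsplit(url).hostname or "unknown"
        totals[domain] += 1
        if path == "markdown":
            converted[domain] += 1
    return sorted(
        domain
        for domain, count in totals.items()
        if converted[domain] / count < min_markdown_share
    )


# Hypothetical fetch log.
log = [
    ("https://a.example/1", "markdown"),
    ("https://a.example/2", "markdown"),
    ("https://b.example/1", "fallback"),
    ("https://b.example/2", "markdown"),
]
print(domains_needing_review(log))
```

Domains flagged this way are candidates for manual spot checks before their output is used for ingestion.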
Source attribution
This tip is aggregated from community/public sources and preserved with attribution.