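Route low-complexity requests to cheaper models through a local routing proxy (ClawRoute)
Use this when everyday small requests are eating the budget on expensive models: add a routing proxy on the local machine that first classifies each request's complexity, then routes simple requests to a low-cost model by rule and keeps complex requests on the high-quality model.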
REDDIT | Discovered 2026-02-13 | Author u/0xatharv
Prerequisites
- OpenClaw/Cursor/VS Code traffic can be pointed to a local proxy endpoint.
- You have at least one low-cost model and one high-quality model configured.
- You can run Node.js services locally and monitor request logs.
Steps
- Deploy a local proxy process between your client and provider endpoints, keeping all request inspection on-device.
- Implement a first-pass complexity classifier (simple vs complex) using lightweight heuristics and optional dry-run mode.
- Route simple prompts to cheap models (e.g., flash/mini class), and reserve premium models for complex coding/reasoning tasks.
- Track per-request cost deltas and compare dry-run vs active routing before enabling by default.
- Add guardrails for misclassification fallback (retry on premium model when confidence is low).
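The classification, routing, and guardrail steps above can be sketched as a few pure functions. This is a minimal illustration, not ClawRoute's actual logic: the model names, keyword list, score weights, and the 0.5 confidence threshold are all assumptions chosen for the example. It also simplifies the fallback by escalating low-confidence "simple" verdicts to the premium model up front instead of retrying after a cheap-model response.

```javascript
// Hypothetical model names; substitute whatever your provider exposes.
const CHEAP_MODEL = "cheap-model";
const PREMIUM_MODEL = "premium-model";

// Assumed keyword hints that a prompt probably needs deeper reasoning.
const COMPLEX_HINTS = /\b(refactor|debug|architecture|prove|optimi[sz]e|migrate|design)\b/i;
// Assumed hint that the prompt contains source code (a line starting a declaration).
const CODE_LIKE = /^\s*(function|class|import|def)\s+\w+/m;

// Score a prompt with cheap heuristics and return a label plus a confidence.
function classify(prompt) {
  let score = 0;
  if (prompt.length > 600) score += 2;                     // long prompts tend to be harder
  if (CODE_LIKE.test(prompt)) score += 2;                  // embedded code
  if (COMPLEX_HINTS.test(prompt)) score += 2;              // reasoning/coding keywords
  if ((prompt.match(/\?/g) || []).length > 2) score += 1;  // several questions at once
  const label = score >= 2 ? "complex" : "simple";
  // Confidence grows with distance from the decision boundary (between 1 and 2).
  const confidence = Math.min(1, Math.abs(score - 1.5) / 2.5);
  return { label, confidence, score };
}

// Decide which model handles the prompt.
function route(prompt, { dryRun = false, minConfidence = 0.5 } = {}) {
  const verdict = classify(prompt);
  // Guardrail: only confident "simple" verdicts go to the cheap model.
  const wouldUse =
    verdict.label === "simple" && verdict.confidence >= minConfidence
      ? CHEAP_MODEL
      : PREMIUM_MODEL;
  // In dry-run mode the decision is only recorded; traffic stays on premium.
  const model = dryRun ? PREMIUM_MODEL : wouldUse;
  return { model, wouldUse, ...verdict };
}
```

Inside the proxy's request handler, a call such as `route(prompt, { dryRun: true })` would let you log `wouldUse` for every request while still forwarding everything to the premium model.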
Commands
node server.js
# set OpenClaw/Cursor base URL to local proxy endpoint
openclaw gateway status
Verify
Over a 1-3 day sample, most simple tasks are handled by low-cost models, complex tasks still meet quality targets, and the cost report shows a measurable drop.
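One way to produce such a cost report from dry-run logs is sketched below. The log entry shape (`wouldUse`, `inputTokens`, `outputTokens`), the model names, and the per-million-token prices are hypothetical placeholders, and the estimate assumes the cheap model would have used the same number of tokens as the premium one, which may not hold in practice.

```javascript
// Hypothetical prices in USD per million tokens; replace with your provider's rates.
const PRICE_PER_MTOK = {
  "cheap-model": { input: 0.1, output: 0.4 },
  "premium-model": { input: 3.0, output: 15.0 },
};

// Cost of a single request on the given model.
function requestCost(model, inputTokens, outputTokens) {
  const price = PRICE_PER_MTOK[model];
  return (inputTokens * price.input + outputTokens * price.output) / 1e6;
}

// Each entry is assumed to look like { wouldUse, inputTokens, outputTokens }.
function costReport(entries) {
  let baseline = 0; // everything sent to the premium model
  let routed = 0;   // what the router's decisions would have cost
  let cheapCount = 0;
  for (const e of entries) {
    baseline += requestCost("premium-model", e.inputTokens, e.outputTokens);
    routed += requestCost(e.wouldUse, e.inputTokens, e.outputTokens);
    if (e.wouldUse === "cheap-model") cheapCount += 1;
  }
  return {
    baseline,
    routed,
    savedFraction: baseline === 0 ? 0 : 1 - routed / baseline,
    cheapShare: entries.length === 0 ? 0 : cheapCount / entries.length,
  };
}
```

Running `costReport` over the dry-run sample before switching to active routing gives a projected saving to compare against the actual bill afterwards.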
Caveats
- The reported 60-90% saving is self-reported by the author and depends on workload mix (needs verification).
- Wrong complexity classification can degrade answer quality; keep fallback and periodic audit enabled.
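To see why the saving depends on workload mix, a rough back-of-the-envelope model helps. The function below is an illustrative simplification, not the author's calculation: it assumes a single price ratio between the cheap and premium model and ignores fallback traffic sent to the premium model.

```javascript
// simpleShare: fraction of baseline spend that comes from requests the router
//              sends to the cheap model (0..1).
// priceRatio:  cheap-model price divided by premium-model price (0..1).
function expectedSaving(simpleShare, priceRatio) {
  if (simpleShare < 0 || simpleShare > 1 || priceRatio < 0 || priceRatio > 1) {
    throw new RangeError("simpleShare and priceRatio must be between 0 and 1");
  }
  // Only the routed share gets cheaper, and only by (1 - priceRatio).
  return simpleShare * (1 - priceRatio);
}

// Print the projected saving for a few workload mixes at an assumed 10x price gap.
for (const share of [0.3, 0.6, 0.9]) {
  console.log(`simple share ${share}: saving ${expectedSaving(share, 0.1).toFixed(2)}`);
}
```

Under this model the saving can never exceed `simpleShare`, so a 60-90% saving would require at least that share of current spend to come from requests that can safely be routed to the cheap model.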
Source attribution
This tip is aggregated from community/public sources and preserved with attribution.