
Local routing proxy that diverts low-complexity requests to cheaper models (ClawRoute)

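For the case where small everyday requests eat up the budget on an expensive model: add a routing proxy on the local machine that first classifies each request by complexity, then applies rules to divert simple requests to a low-cost model while keeping complex requests on the high-quality model.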

Reddit · Discovered 2026-02-13 · Author u/0xatharv
Prerequisites
  • OpenClaw/Cursor/VS Code traffic can be pointed to a local proxy endpoint.
  • You have at least one low-cost model and one high-quality model configured.
  • You can run Node.js services locally and monitor request logs.
Steps
  1. Deploy a local proxy process between your client and provider endpoints, keeping all request inspection on-device.
  2. Implement a first-pass complexity classifier (simple vs complex) using lightweight heuristics, with an optional dry-run mode that records routing decisions without applying them.
  3. Route simple prompts to cheap models (e.g., flash/mini class), and reserve premium models for complex coding/reasoning tasks.
  4. Track per-request cost deltas and compare dry-run vs active routing before enabling by default.
  5. Add a fallback guardrail against misclassification: when classifier confidence is low, send or retry the request on the premium model.
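Steps 2, 3 and 5 can be sketched roughly as follows. This is a minimal illustration, not the original author's classifier: the model names (`cheap-model`, `premium-model`), the keyword list, the score weights and the `minConfidence` threshold are all assumptions to tune against your own traffic.

```javascript
// Hypothetical model names; substitute whatever your provider exposes.
const CHEAP_MODEL = "cheap-model";
const PREMIUM_MODEL = "premium-model";

// Assumed vocabulary hinting at coding or multi-step reasoning work.
const COMPLEX_HINTS = /\b(refactor|debug|implement|architecture|prove|optimi[sz]e|stack trace)\b/i;
// Assumed markers of embedded source code.
const CODE_MARKERS = /\b(function|class|def|import)\b|=>/;

function classifyPrompt(prompt) {
  let score = 0;
  if (prompt.length > 600) score += 2;                      // long prompts tend to be harder
  if (CODE_MARKERS.test(prompt)) score += 2;                // looks like it contains code
  if (COMPLEX_HINTS.test(prompt)) score += 2;               // coding/reasoning vocabulary
  if ((prompt.match(/\n/g) || []).length > 10) score += 1;  // many lines of context

  const label = score >= 2 ? "complex" : "simple";
  // Confidence grows with distance from the decision boundary (between 1 and 2).
  const confidence = Math.min(1, 0.5 + Math.abs(score - 1.5) / 4);
  return { label, score, confidence };
}

function chooseModel(prompt, { dryRun = false, minConfidence = 0.7 } = {}) {
  const verdict = classifyPrompt(prompt);
  // Low-confidence "simple" verdicts fall back to the premium model.
  const wouldUse =
    verdict.label === "simple" && verdict.confidence >= minConfidence
      ? CHEAP_MODEL
      : PREMIUM_MODEL;
  // In dry-run mode the request still goes to the premium model;
  // wouldUse is only recorded for later comparison.
  const model = dryRun ? PREMIUM_MODEL : wouldUse;
  return { model, wouldUse, ...verdict };
}
```

Calling `chooseModel(prompt, { dryRun: true })` for a few days and logging `wouldUse` gives a picture of what would be diverted before any traffic actually moves.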
Commands
node server.js
# set OpenClaw/Cursor base URL to local proxy endpoint
openclaw gateway status
Verify

Over a 1-3 day sample, most simple tasks are handled by low-cost models, complex tasks still meet your quality targets, and the cost report shows a measurable drop against the dry-run baseline.
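One way to produce that cost comparison from the request log is sketched below. The prices and the log entry shape (`inputTokens`, `outputTokens`, `wouldUse`) are assumptions for illustration; use your provider's actual rates and whatever fields your proxy records. The sketch also assumes a cheap model would emit the same number of output tokens as the premium one, which is only an approximation.

```javascript
// Hypothetical prices in USD per 1M tokens; replace with real provider rates.
const PRICES = {
  "cheap-model": { input: 0.1, output: 0.4 },
  "premium-model": { input: 3.0, output: 15.0 },
};

function requestCost(model, inputTokens, outputTokens) {
  const price = PRICES[model];
  if (!price) throw new Error(`No price configured for model: ${model}`);
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}

// entries: [{ inputTokens, outputTokens, wouldUse }] taken from the proxy log.
function summarizeSavings(entries, baselineModel = "premium-model") {
  let baseline = 0; // cost if every request went to the baseline model
  let routed = 0;   // cost if each request went where the router chose
  let diverted = 0; // number of requests routed away from the baseline
  for (const e of entries) {
    baseline += requestCost(baselineModel, e.inputTokens, e.outputTokens);
    routed += requestCost(e.wouldUse, e.inputTokens, e.outputTokens);
    if (e.wouldUse !== baselineModel) diverted += 1;
  }
  const savedPct = baseline === 0 ? 0 : ((baseline - routed) / baseline) * 100;
  const divertedShare = entries.length === 0 ? 0 : diverted / entries.length;
  return { baseline, routed, savedPct, divertedShare };
}
```

A low `divertedShare` with a small `savedPct` suggests the workload is mostly complex and routing will not help much.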

Caveats
  • The reported 60-90% saving is self-reported by the author and depends on workload mix (needs verification).
  • Wrong complexity classification can degrade answer quality; keep fallback and periodic audit enabled.
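The fallback and audit guardrails might look like the sketch below. Everything here is an assumption rather than part of the original tip: `callModel` is a hypothetical async function you supply, the "answer too short" check is a stand-in for a real quality signal, and the 5% audit rate is arbitrary.

```javascript
// Assumed heuristic for an unusable answer: missing or very short text.
function looksInadequate(text) {
  return typeof text !== "string" || text.trim().length < 20;
}

// callModel: hypothetical async (model, prompt) => string provided by the proxy.
// models: { cheap, premium } model names.
async function completeWithFallback(prompt, models, callModel, options = {}) {
  const {
    isInadequate = looksInadequate,
    auditRate = 0.05,
    random = Math.random,
  } = options;

  let model = models.cheap;
  let text = await callModel(model, prompt);
  let escalated = false;

  if (isInadequate(text)) {
    // Retry once on the premium model rather than returning a weak answer.
    model = models.premium;
    text = await callModel(model, prompt);
    escalated = true;
  }

  // Flag a random sample of cheap-model answers for periodic manual review.
  const audit = !escalated && random() < auditRate;
  return { text, model, escalated, audit };
}
```

Logging `escalated` alongside the cost data shows how often the retry fires; a high escalation rate means the classifier is too optimistic and the retries are eroding the savings.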
Source attribution

This tip is aggregated from community/public sources and preserved with attribution.
