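Local routing proxy that diverts low-complexity requests to cheaper models (ClawRoute)

For cases where everyday small requests eat up the budget on high-priced models: add a routing proxy layer on your own machine that first classifies request complexity, then routes simple requests to low-cost models by rule and reserves complex requests for high-quality models.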

Reddit · Discovered 2026-02-13 · Author u/0xatharv

Prerequisites

OpenClaw/Cursor/VS Code traffic can be pointed to a local proxy endpoint.
You have at least one low-cost model and one high-quality model configured.
You can run Node.js services locally and monitor request logs.

Steps

Deploy a local proxy process between your client and provider endpoints, keeping all request inspection on-device.
Implement a first-pass complexity classifier (simple vs complex) using lightweight heuristics and optional dry-run mode.
Route simple prompts to cheap models (e.g., flash/mini class), and reserve premium models for complex coding/reasoning tasks.
Track per-request cost deltas and compare dry-run vs active routing before enabling by default.
Add guardrails for misclassification fallback (retry on premium model when confidence is low).
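The classification and routing steps above can be sketched as a pair of pure functions. This is a minimal illustration, not the author's implementation: the model names, keyword list, score weights, and the `minConfidence` threshold are all assumptions made for the example. It also simplifies the fallback guardrail by sending low-confidence prompts to the premium model up front rather than retrying after a poor answer.

```javascript
// Hypothetical model identifiers; substitute the ones your providers expose.
const CHEAP_MODEL = "cheap-model";
const PREMIUM_MODEL = "premium-model";

// Illustrative keywords that hint at multi-step coding or reasoning work.
const COMPLEX_HINTS = /\b(refactor|debug|architecture|prove|optimi[sz]e|migrate|stack trace)\b/i;

// Heuristic first pass: returns a label, a raw score, and a rough confidence.
function classifyComplexity(prompt) {
  let score = 0;
  if (prompt.length > 1500) score += 2;                   // long prompts carry more context
  if (prompt.includes("```")) score += 2;                 // embedded code block
  if (COMPLEX_HINTS.test(prompt)) score += 2;             // task-type keyword
  if ((prompt.match(/\?/g) || []).length > 2) score += 1; // several questions at once

  const label = score >= 2 ? "complex" : "simple";
  // Scores near the decision boundary (1 to 3) get low confidence.
  const confidence = score === 0 || score >= 4 ? 0.9 : 0.5;
  return { label, score, confidence };
}

// Picks a model. In dry-run mode the premium model is always used, and
// `wouldUse` records what active routing would have chosen.
function chooseModel(prompt, { dryRun = false, minConfidence = 0.7 } = {}) {
  const result = classifyComplexity(prompt);
  const cheapIsSafe = result.label === "simple" && result.confidence >= minConfidence;
  const wouldUse = cheapIsSafe ? CHEAP_MODEL : PREMIUM_MODEL;
  return { model: dryRun ? PREMIUM_MODEL : wouldUse, wouldUse, ...result };
}
```

Logging `wouldUse` alongside `model` during a dry run lets you review what the router would have done before any traffic actually moves to the cheaper model.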

Commands

node server.js

# set OpenClaw/Cursor base URL to local proxy endpoint

openclaw gateway status

Verify

Over a 1-3 day sample, simple tasks are mostly handled by low-cost models while complex tasks still meet your quality targets, and the cost report shows a measurable drop compared with the dry-run baseline.
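One way to produce such a cost report is to price each logged request twice: once at the model it was actually routed to, and once at the premium model as a baseline. The sketch below assumes a hypothetical log record shape (`model`, `inputTokens`, `outputTokens`) and placeholder prices. It also assumes both models would consume the same token counts, which is only an approximation.

```javascript
// Placeholder prices in USD per 1M tokens; replace with your providers' real rates.
const PRICE_PER_MTOK = {
  "cheap-model": { input: 0.1, output: 0.4 },
  "premium-model": { input: 3, output: 15 },
};

// Cost of one request at a given model's rates.
function requestCost(model, inputTokens, outputTokens) {
  const price = PRICE_PER_MTOK[model];
  return (inputTokens * price.input + outputTokens * price.output) / 1e6;
}

// Compares actual routed cost against an "everything on premium" baseline.
function summarizeSavings(records, baselineModel = "premium-model") {
  let baseline = 0;
  let actual = 0;
  let divertedCount = 0;
  for (const r of records) {
    baseline += requestCost(baselineModel, r.inputTokens, r.outputTokens);
    actual += requestCost(r.model, r.inputTokens, r.outputTokens);
    if (r.model !== baselineModel) divertedCount += 1;
  }
  return {
    baseline,
    actual,
    savedFraction: baseline > 0 ? (baseline - actual) / baseline : 0,
    divertedShare: records.length > 0 ? divertedCount / records.length : 0,
  };
}
```

`savedFraction` is the figure to compare against any claimed savings range, and `divertedShare` shows how much of the traffic the router actually moved off the premium model.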

Caveats

The reported 60-90% saving is self-reported by the author and depends on workload mix (needs verification).
Wrong complexity classification can degrade answer quality; keep fallback and periodic audit enabled.
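A periodic audit could be as simple as replaying a fixed fraction of cheap-routed requests on the premium model and comparing the answers by hand. The sketch below only covers the sampling decision. The function names, the `auditRate` default, and the use of an FNV-1a style hash over the request ID are illustrative assumptions, chosen so that the same request always gets the same decision.

```javascript
// Maps a string ID to a number in [0, 1) using an FNV-1a style 32-bit hash.
function hashToUnitInterval(id) {
  let h = 2166136261;            // FNV-1a 32-bit offset basis
  for (let i = 0; i < id.length; i++) {
    h ^= id.charCodeAt(i);
    h = Math.imul(h, 16777619);  // multiply by the FNV prime, staying in 32 bits
  }
  return (h >>> 0) / 4294967296; // unsigned 32-bit value divided by 2^32
}

// Deterministically selects roughly `auditRate` of requests for manual review.
function shouldAudit(requestId, auditRate = 0.05) {
  return hashToUnitInterval(requestId) < auditRate;
}
```

Because the decision depends only on the request ID, re-running the audit over the same logs picks the same requests, which makes the reviewed samples reproducible.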

Source attribution

This tip is aggregated from community/public sources and preserved with attribution.
