Tiered model routing for cost control: lite first, then expert, with automatic escalation by task
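Problem/scenario: Sending everyday assistant tasks straight to a large model drives up cost quickly. Prerequisites: you can self-host CianaParrot (or a similar architecture) and edit its `.env`/YAML configuration. Steps: 1) enable multiple model tiers (lite → expert); 2) route requests to the lite tier by default; 3) define escalation conditions for complex tasks; 4) use scheduled tasks to monitor token usage and cost; 5) review weekly and fine-tune the thresholds. Key commands: `make build && make up`, `make gateway`. Verification: the cost of routine conversations drops while quality on complex tasks stays acceptable. Risks and limits: a routing threshold set too low causes frequent escalation and defeats cost control; one set too high hurts quality (needs verification). Source attribution: GitHub README + community discussion.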
GitHub | Discovered 2026-02-25 | Author: emanueleielo
Prerequisites
- Docker and Make are available on your host.
- At least two model tiers are configured with valid API credentials.
Steps
- Clone repo and prepare `.env` with provider keys.
- Define model tiers and routing rules in config (lite as default, expert as escalation).
- Start stack with `make build && make up`, then `make gateway` if host bridge is needed.
- Trigger mixed workload tests and record token/billing delta.
- Review weekly and tune escalation thresholds against the observed quality-cost tradeoff.
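The routing rule described in the steps above (lite as default, expert as escalation) can be sketched roughly as follows. This is a minimal illustration, not CianaParrot's actual routing code or config schema: the tier names, the `RoutingRule` fields, the length threshold, and the keyword list are all hypothetical and would need to be replaced by whatever signals your setup exposes.

```python
from dataclasses import dataclass

# Hypothetical tier labels; map them to the model names in your own config.
LITE = "lite"
EXPERT = "expert"


@dataclass(frozen=True)
class RoutingRule:
    # Assumed escalation signals, chosen only for illustration.
    max_lite_chars: int = 800
    escalation_keywords: tuple = ("refactor", "prove", "architecture", "debug")


def choose_tier(prompt: str, rule: RoutingRule = RoutingRule()) -> str:
    """Return the tier for a prompt: lite by default, expert on escalation."""
    # Long prompts are treated as complex and escalated.
    if len(prompt) > rule.max_lite_chars:
        return EXPERT
    # Prompts mentioning a known "hard task" keyword are escalated.
    text = prompt.lower()
    if any(keyword in text for keyword in rule.escalation_keywords):
        return EXPERT
    # Everything else stays on the cheaper tier.
    return LITE
```

Lowering `max_lite_chars` or widening the keyword list makes escalation more frequent, which is the "threshold too low" failure mode; raising them keeps more traffic on the lite tier at the risk of weaker answers.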
Commands
git clone https://github.com/emanueleielo/ciana-parrot.git
make build && make up
make gateway
Verify
Routine requests stay on lite tier while high-complexity prompts escalate and complete successfully.
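One way to put a number on this check is to tally tokens per tier from your own request log and compare the routed cost with an all-expert baseline. The sketch below assumes a log of `(tier, tokens)` pairs and uses made-up per-1,000-token prices; neither the log format nor the prices come from the source, so substitute your provider's real rates.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1,000 tokens; not real provider rates.
PRICE_PER_1K = {"lite": 0.0005, "expert": 0.01}


def summarize(log):
    """Summarize cost for an iterable of (tier, tokens) pairs.

    Returns (cost_by_tier, routed_cost, baseline_cost), where baseline_cost
    is what the same tokens would cost if every request used the expert tier.
    """
    tokens_by_tier = defaultdict(int)
    for tier, tokens in log:
        tokens_by_tier[tier] += tokens

    cost_by_tier = {
        tier: count / 1000 * PRICE_PER_1K[tier]
        for tier, count in tokens_by_tier.items()
    }
    routed_cost = sum(cost_by_tier.values())
    total_tokens = sum(tokens_by_tier.values())
    baseline_cost = total_tokens / 1000 * PRICE_PER_1K["expert"]
    return cost_by_tier, routed_cost, baseline_cost
```

If `routed_cost` is not clearly below `baseline_cost` for a mixed workload, too many requests are probably escalating.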
Caveats
- Model-switch logic can oscillate without hysteresis/guardrails.
- Provider-side pricing/latency changes may require periodic retuning (needs verification).
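The oscillation caveat can be illustrated with a two-threshold (hysteresis) router: escalate only when a complexity score rises above an upper bound, and drop back only when it falls below a lower one. This is a sketch under assumptions; the `HysteresisRouter` class, the 0 to 1 complexity score, and the default thresholds are hypothetical and not part of CianaParrot.

```python
class HysteresisRouter:
    """Keep the current tier until the score crosses the opposite threshold."""

    def __init__(self, escalate_at: float = 0.7, deescalate_at: float = 0.4):
        # The gap between the two thresholds is what prevents flapping.
        if deescalate_at >= escalate_at:
            raise ValueError("deescalate_at must be below escalate_at")
        self.escalate_at = escalate_at
        self.deescalate_at = deescalate_at
        self.tier = "lite"  # start on the cheaper tier

    def route(self, score: float) -> str:
        """Update and return the tier for a complexity score in [0, 1]."""
        if self.tier == "lite" and score >= self.escalate_at:
            self.tier = "expert"
        elif self.tier == "expert" and score <= self.deescalate_at:
            self.tier = "lite"
        return self.tier


router = HysteresisRouter()
tiers = [router.route(s) for s in [0.5, 0.75, 0.6, 0.5, 0.3, 0.5]]
# tiers == ["lite", "expert", "expert", "expert", "lite", "lite"]
```

With a single threshold at 0.55, the same scores would switch tiers on almost every request; the band between 0.4 and 0.7 keeps the router on its current tier while the score hovers in the middle.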
Source attribution
This tip is aggregated from community/public sources and preserved with attribution.