Tiered model routing for cost control: lite first, then expert, with automatic escalation by task

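Problem/scenario: sending everyday assistant tasks straight to a large model drives costs up quickly. Prerequisites: you can self-host CianaParrot (or a similarly architected system) and edit its `.env`/YAML configuration. Steps: 1) enable multiple model tiers (lite → expert); 2) route requests to the lite tier by default; 3) define escalation conditions for complex tasks; 4) use scheduled tasks to monitor token usage and cost; 5) review weekly and fine-tune the thresholds. Key commands: `make build && make up`, `make gateway`. Verification: the cost of routine conversations drops while quality on complex tasks stays acceptable. Risks and limits: an escalation threshold set too low triggers frequent escalation and defeats cost control; one set too high hurts quality (needs verification). Source: GitHub README plus community discussion.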

Source: GitHub | Discovered: 2026-02-25 | Author: emanueleielo

Prerequisites

Docker and Make are available on your host.
At least two model tiers are configured with valid API credentials.
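
A quick pre-flight check for the credentials prerequisite can be sketched as below. This is a minimal illustration, not part of CianaParrot: the variable names `LITE_API_KEY` and `EXPERT_API_KEY` are hypothetical placeholders, so substitute whatever names your `.env` actually uses.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env_text: str, required: list) -> list:
    """Return the required names that are absent or empty in the .env text."""
    values = parse_env(env_text)
    return [name for name in required if not values.get(name)]


# Hypothetical key names, used only for illustration.
sample = "# provider keys\nLITE_API_KEY=abc123\nEXPERT_API_KEY=\n"
print(missing_keys(sample, ["LITE_API_KEY", "EXPERT_API_KEY"]))
```

An empty value counts as missing, so a tier with a blank key is flagged before the stack starts.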

Steps

Clone the repo and prepare `.env` with your provider keys.
Define model tiers and routing rules in the config, with lite as the default and expert as the escalation target.
Start the stack with `make build && make up`, then run `make gateway` if the host bridge is needed.
Run a mixed workload of routine and complex requests and record the change in token usage and billing.
Review the quality-cost tradeoff weekly and tune the escalation thresholds accordingly.
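
The lite-first rule in the steps above can be sketched as a small routing function. This is an illustration under stated assumptions, not CianaParrot's actual routing logic or config schema: the tier names, the length threshold, and the keyword list are all hypothetical and should be replaced by whatever signals fit your workload.

```python
import re
from dataclasses import dataclass

# Hypothetical tier labels; not taken from CianaParrot's configuration.
LITE, EXPERT = "lite", "expert"


@dataclass(frozen=True)
class RoutingRule:
    # Assumed threshold: prompts longer than this many characters escalate.
    max_lite_chars: int = 1200
    # Assumed trigger words that suggest a complex task.
    escalation_keywords: tuple = ("refactor", "prove", "architecture", "debug")


def choose_tier(prompt: str, rule: RoutingRule = RoutingRule()) -> str:
    """Return lite by default and expert when any escalation rule fires."""
    if len(prompt) > rule.max_lite_chars:
        return EXPERT
    # Match whole words so that "improve" does not trigger on "prove".
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    if words & set(rule.escalation_keywords):
        return EXPERT
    return LITE


print(choose_tier("What's on my calendar today?"))
print(choose_tier("Please debug this stack trace"))
```

Keeping the thresholds in one `RoutingRule` object makes the weekly tuning step a matter of changing two values rather than editing routing code.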

Commands

git clone https://github.com/emanueleielo/ciana-parrot.git

make build && make up

make gateway

Verify

Routine requests stay on lite tier while high-complexity prompts escalate and complete successfully.
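
One way to turn this check into numbers is to summarize your own request log, as sketched below. The record format (`(tier, tokens)` tuples) and the per-1K-token prices are assumptions made for illustration; use your providers' real rates and whatever logging your deployment exposes.

```python
from collections import Counter

# Hypothetical prices in USD per 1K tokens; replace with real provider rates.
PRICE_PER_1K = {"lite": 0.0005, "expert": 0.01}


def summarize(records):
    """Summarize (tier, tokens) records into lite share and cost figures."""
    tokens = Counter()
    requests = Counter()
    for tier, n in records:
        tokens[tier] += n
        requests[tier] += 1
    total_requests = sum(requests.values())
    routed_cost = sum(tokens[t] / 1000 * PRICE_PER_1K[t] for t in tokens)
    # Baseline: the same tokens priced as if every request went to expert.
    baseline_cost = sum(tokens.values()) / 1000 * PRICE_PER_1K["expert"]
    return {
        "lite_share": requests["lite"] / total_requests if total_requests else 0.0,
        "routed_cost": routed_cost,
        "baseline_cost": baseline_cost,
    }


log = [("lite", 1000), ("lite", 1000), ("lite", 2000), ("expert", 4000)]
print(summarize(log))
```

A high `lite_share` together with `routed_cost` well below `baseline_cost` indicates the routing is doing its job; output quality on the escalated requests still has to be judged separately.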

Caveats

Model-switch logic can oscillate without hysteresis/guardrails.
Provider-side pricing/latency changes may require periodic retuning (needs verification).
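
The oscillation caveat can be addressed with two separate thresholds, so that a session escalates at a high complexity score and only drops back at a clearly lower one. The sketch below assumes a hypothetical per-request complexity score in the range 0 to 1; how that score is computed, and the threshold values, are not specified by the source.

```python
class HysteresisRouter:
    """Switch tiers with a gap between the up and down thresholds."""

    def __init__(self, escalate_at: float = 0.7, deescalate_at: float = 0.4):
        if deescalate_at >= escalate_at:
            raise ValueError("deescalate_at must be below escalate_at")
        self.escalate_at = escalate_at
        self.deescalate_at = deescalate_at
        self.tier = "lite"  # lite is the default starting tier

    def route(self, score: float) -> str:
        """Update and return the current tier for a complexity score."""
        if self.tier == "lite" and score >= self.escalate_at:
            self.tier = "expert"
        elif self.tier == "expert" and score <= self.deescalate_at:
            self.tier = "lite"
        # Scores between the two thresholds leave the tier unchanged.
        return self.tier


router = HysteresisRouter()
print([router.route(s) for s in [0.5, 0.75, 0.6, 0.5, 0.3, 0.5]])
```

With a single threshold at 0.55, the same score sequence would flip tiers on nearly every request; the gap between 0.4 and 0.7 keeps the session on one tier until the score moves decisively.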

Source attribution

This tip is aggregated from community/public sources and preserved with attribution.
