
Preventing "tool starvation" in the learning system: configure seed priors for critical tools

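Scenario: after a run of consecutive rejections, Thompson Sampling can push the posteriors of critical tools such as web/exec to near zero. Seed arms combined with min pulls preserve the ability to explore.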

Source: GitHub | Discovered: 2026-02-14 | Author: Peleke
Prerequisites
  • The learning/bandit mechanism is enabled and per-tool arm statistics are inspectable.
  • You can edit the learning config and reset the learning DB during a maintenance window.
Steps
  1. Audit current arm posteriors and identify critical tools with near-zero inclusion probability.
  2. Define `seedArms` patterns for critical categories (web, exec, fs read/write) with conservative boosts.
  3. Set a non-zero `min_pulls` guard so low-sample arms remain eligible during the recovery phase.
  4. Re-run representative tasks and compare tool selection diversity/accuracy before and after seeding.
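Steps 2 and 3 can be sketched as a small Beta-Bernoulli Thompson Sampling loop. This is a minimal illustration, not the project's implementation: the `Arm` class, the tool names, the glob patterns, and the values in `SEED_ARMS` and `MIN_PULLS` are all assumptions chosen to mirror the `seedArms` and `min_pulls` settings named above.

```python
import fnmatch
import random
from dataclasses import dataclass


@dataclass
class Arm:
    alpha: float = 1.0  # prior pseudo-successes plus observed acceptances
    beta: float = 1.0   # prior pseudo-failures plus observed rejections
    pulls: int = 0      # real observations only; the prior is not counted


# Hypothetical values mirroring the `seedArms` and `min_pulls` settings.
SEED_ARMS = {"web.*": (3.0, 1.0), "exec.*": (3.0, 1.0), "fs.*": (2.0, 1.0)}
MIN_PULLS = 5


def make_arm(tool: str) -> Arm:
    """Start from the seeded prior if a pattern matches, else from Beta(1, 1)."""
    for pattern, (alpha, beta) in SEED_ARMS.items():
        if fnmatch.fnmatchcase(tool, pattern):
            return Arm(alpha=alpha, beta=beta)
    return Arm()


def update(arm: Arm, accepted: bool) -> None:
    """Record one real observation for the arm."""
    arm.pulls += 1
    if accepted:
        arm.alpha += 1.0
    else:
        arm.beta += 1.0


def select_tools(arms: dict[str, Arm], k: int, rng: random.Random) -> list[str]:
    """Include every arm below MIN_PULLS, then fill up to k by Thompson draws."""
    forced = [name for name, arm in arms.items() if arm.pulls < MIN_PULLS]
    rest = [name for name in arms if name not in forced]
    # One posterior sample per remaining arm; higher samples rank first.
    scores = {n: rng.betavariate(arms[n].alpha, arms[n].beta) for n in rest}
    ranked = sorted(rest, key=scores.__getitem__, reverse=True)
    return forced + ranked[: max(0, k - len(forced))]
```

In this sketch the guard counts only real pulls, so a seeded arm still has to collect `MIN_PULLS` observations before it competes purely on sampled value. Whether the actual guard works this way needs verification against the source issue.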
Commands
openclaw gateway status
openclaw logs --local-time
ls ~/.qortex/learning/openclaw.db
Verify

Critical tools reappear in candidate/executed sets and task success rate recovers without broad over-triggering.
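One way to put a number on "selection diversity" is the Shannon entropy of the tools actually selected across a batch of tasks. The sketch below assumes you can export the selected tool names as a flat list. That export step, the function name `selection_entropy`, and the sample tool names are hypothetical.

```python
import math
from collections import Counter


def selection_entropy(selected_tools: list[str]) -> float:
    """Shannon entropy (bits) of tool usage; 0 means a single tool dominates."""
    counts = Counter(selected_tools)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical before/after samples taken from the same representative tasks.
before = ["chat", "chat", "chat", "chat"]
after = ["chat", "web", "exec", "chat"]
print(selection_entropy(before), selection_entropy(after))
```

Higher entropy after seeding suggests critical tools are being selected again. It does not show that they are selected appropriately, so read it together with the task success rate.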

Caveats
  • Over-boosted priors can bias selection and reduce adaptability for new tasks.
  • The source issue describes ongoing implementation work; the final config keys may differ (needs verification).
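The first caveat can be made concrete with the Beta posterior mean, assuming arms follow a Beta-Bernoulli model (the source does not confirm the exact model). A Beta(α, β) prior acts like α + β pseudo-observations, so a heavier seed needs proportionally more real rejections before the posterior moves. The prior values below are illustrative only.

```python
def posterior_mean(alpha: float, beta: float, successes: int, failures: int) -> float:
    """Mean of the Beta posterior after the given real observations."""
    return (alpha + successes) / (alpha + beta + successes + failures)


# Same prior mean (0.75), different strength, ten real rejections each.
conservative = posterior_mean(3.0, 1.0, 0, 10)   # 3 / 14
over_boosted = posterior_mean(30.0, 10.0, 0, 10)  # 30 / 50
print(conservative, over_boosted)
```

After ten rejections the conservative seed has dropped well below its starting mean, while the over-boosted one still sits at 0.6, which is why step 2 calls for conservative boosts.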
Source attribution

This tip is aggregated from community/public sources and preserved with attribution.
