Preventing "tool starvation" in the learning system: configure seed priors for critical tools

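Scenario: After a run of consecutive rejections, Thompson Sampling can push the posteriors of critical tools such as web and exec to near zero. Seed arms combined with a min pulls guard keep those tools explorable.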
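To make the starvation effect concrete, here is a minimal sketch assuming each tool is modeled as a Beta-Bernoulli arm with a uniform Beta(1, 1) prior, where every rejection adds one to the failure count. The function names and the 0.5 threshold are illustrative assumptions, not taken from the source project.

```python
def beta_mean(alpha: float, beta: float) -> float:
    """Posterior mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def prob_sample_above(threshold: float, beta: float) -> float:
    """P(X > threshold) for X ~ Beta(1, beta); closed form (1 - t) ** beta."""
    return (1.0 - threshold) ** beta


def after_rejections(alpha0: float, beta0: float, rejections: int):
    """Each rejection increments the failure parameter by one (assumed update rule)."""
    return alpha0, beta0 + rejections


for n in (0, 5, 20, 50):
    a, b = after_rejections(1.0, 1.0, n)
    # Posterior mean shrinks roughly as 1 / (n + 2).
    print(n, round(beta_mean(a, b), 3))
# 0 0.5
# 5 0.143
# 20 0.045
# 50 0.019
```

Under these assumptions, after 20 straight rejections the chance that a single Thompson draw for that arm exceeds 0.5 is `prob_sample_above(0.5, 21.0)`, which is below one in a million, so the arm effectively stops being selected and never gets a chance to recover.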

Source: GitHub | Discovered: 2026-02-14 | Author: Peleke

Prerequisites

Steps

Audit current arm posteriors and identify critical tools with near-zero inclusion probability.
Define `seedArms` patterns for critical categories (web, exec, fs read/write) with conservative boosts.
Set a non-zero `min_pulls` guard so that low-sample arms remain eligible during the recovery phase.
Re-run representative tasks and compare tool selection diversity/accuracy before and after seeding.
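The seeding and guard steps above can be sketched as follows. This is a hedged illustration, not the project's implementation: the `SEED_ARMS` patterns, the boost values, the `MIN_PULLS` constant, the tool names, and the glob-style matching are all assumptions made for the example, and the real config keys may differ.

```python
import fnmatch
import random
from dataclasses import dataclass


@dataclass
class Arm:
    alpha: float = 1.0  # pseudo-count of accepted uses
    beta: float = 1.0   # pseudo-count of rejected uses
    pulls: int = 0      # real observations, excluding the seed prior


# Hypothetical seed patterns: conservative boosts for critical categories.
SEED_ARMS = {"web_*": (3.0, 1.0), "exec": (2.0, 1.0), "fs_*": (2.0, 1.0)}
MIN_PULLS = 5  # hypothetical guard: arms below this count stay eligible


def make_arm(name: str) -> Arm:
    """Create an arm, applying the first matching seed prior if any."""
    for pattern, (a, b) in SEED_ARMS.items():
        if fnmatch.fnmatch(name, pattern):
            return Arm(alpha=a, beta=b)
    return Arm()


def select(arms: dict, k: int, rng: random.Random) -> list:
    """Pick k tools: guarded low-sample arms first, then by Thompson draw."""
    guarded = [n for n, a in arms.items() if a.pulls < MIN_PULLS]
    sampled = sorted(
        (n for n in arms if n not in guarded),
        key=lambda n: rng.betavariate(arms[n].alpha, arms[n].beta),
        reverse=True,
    )
    return (guarded + sampled)[:k]


def update(arm: Arm, accepted: bool) -> None:
    """Record one real observation on the arm."""
    arm.pulls += 1
    if accepted:
        arm.alpha += 1.0
    else:
        arm.beta += 1.0


rng = random.Random(0)
arms = {n: make_arm(n) for n in ["web_search", "exec", "fs_read", "calendar"]}
chosen = select(arms, k=2, rng=rng)
for name in chosen:
    update(arms[name], accepted=False)
```

Keeping the boosts small (a few pseudo-counts) means real feedback can still outweigh the seed after a handful of observations, which limits the bias noted under Caveats.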

Commands

openclaw gateway status

openclaw logs --local-time

ls ~/.qortex/learning/openclaw.db

Verify

Critical tools reappear in the candidate and executed sets, and the task success rate recovers without the seeded tools being triggered broadly on tasks that do not need them.
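One way to quantify the before/after comparison is the Shannon entropy of the tool selection distribution, reported alongside the success rate. The sketch below uses made-up selection logs purely for illustration; how selections are actually exported from the learning database is not covered by the source and is left as an assumption.

```python
import math
from collections import Counter


def selection_entropy(selected_tools: list) -> float:
    """Shannon entropy (bits) of the empirical tool selection distribution."""
    counts = Counter(selected_tools)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def success_rate(outcomes: list) -> float:
    """Fraction of tasks that succeeded; 0.0 for an empty list."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


# Illustrative data only, not measured results.
before = ["fs_read"] * 9 + ["web_search"]
after = ["fs_read"] * 5 + ["web_search"] * 3 + ["exec"] * 2

print(round(selection_entropy(before), 3))  # 0.469
print(round(selection_entropy(after), 3))   # 1.485
```

Higher entropy after seeding indicates that selection is spread over more tools, but it is only a good sign if the success rate recovers too; entropy rising while success stays flat would suggest over-triggering.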

Caveats

Over-boosted priors can bias selection and reduce adaptability for new tasks.
The source issue describes implementation work that is still in progress; the final config keys may differ (needs verification).

Source attribution

This tip is aggregated from community/public sources and preserved with attribution.