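Voice input observability in practice: enable `echoTranscript` to echo before executing
Problem/scenario: speech transcription errors feed directly into downstream automated execution, and users have little chance to catch them in time. Prerequisites: an audio transcription provider is enabled; `tools.media.audio` can be modified. Implementation steps: 1) enable `echoTranscript`; 2) configure `echoFormat`; 3) add manual confirmation to high-risk flows; 4) record misrecognition samples and iterate on prompts. Key configuration: `tools.media.audio.echoTranscript`, `echoFormat`. Verification: every voice message yields a readable echo before the agent executes. Risks and limits: the echo may expose sensitive dictated content, so use caution in group chats. Source attribution: PR #32150.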
Source: GitHub · Discovered: 2026-03-07 · Author: AytuncYildizli
Prerequisites
- Audio transcription provider is configured and functioning.
- You can edit media-understanding settings and redeploy the gateway.
Steps
- Set `tools.media.audio.echoTranscript: true` in config for target environments.
- Customize `echoFormat` to include a clear prefix (for example `🎙️ Heard: {transcript}`).
- For risky actions, require user confirmation after transcript echo and before tool execution.
- Collect false-transcription samples and refine prompts/language hints weekly.
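The echo-then-confirm flow in the steps above can be sketched roughly as follows. This is a minimal illustration, not the gateway's actual implementation: the `EchoSettings` class, the `render_echo` and `may_execute` helpers, and the action names are all hypothetical, and only the `echoTranscript` / `echoFormat` keys and the `{transcript}` placeholder come from the tip itself.

```python
from dataclasses import dataclass


@dataclass
class EchoSettings:
    # Hypothetical mirror of tools.media.audio.echoTranscript / echoFormat;
    # the real config schema may differ.
    echo_transcript: bool = True
    echo_format: str = "🎙️ Heard: {transcript}"


def render_echo(settings: EchoSettings, transcript: str):
    """Return the echo message, or None when echo is disabled."""
    if not settings.echo_transcript:
        return None
    # str.replace (not str.format) so braces inside the transcript are harmless.
    return settings.echo_format.replace("{transcript}", transcript.strip())


# Hypothetical action names that should never run without confirmation.
RISKY_ACTIONS = {"delete_files", "send_payment"}


def may_execute(action: str, user_confirmed: bool) -> bool:
    """Risky actions run only after the user confirms the echoed transcript."""
    return action not in RISKY_ACTIONS or user_confirmed
```

In this sketch the caller would send the result of `render_echo` to the user first, and call `may_execute` only after the echo has been delivered, so the confirmation always refers to text the user has actually seen.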
Commands
- `openclaw gateway status`
- `openclaw gateway restart`
- `openclaw help`
Verify
Every voice message produces a transcript echo before any downstream agent action runs.
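One way to check this ordering is to replay a log of events and confirm that no action appears before its message has been echoed. The sketch below assumes a hypothetical log format, a chronological list of `(message_id, kind)` tuples with `kind` being `"echo"` or `"action"`; the gateway's real logs would need to be mapped into that shape first.

```python
def echo_precedes_actions(events):
    """Return True if every action is preceded by an echo for the same message.

    events: chronological list of (message_id, kind) tuples, where kind is
    "echo" or "action". This log format is an assumption for illustration.
    """
    echoed = set()
    for message_id, kind in events:
        if kind == "echo":
            echoed.add(message_id)
        elif kind == "action" and message_id not in echoed:
            # An action ran before the user saw the transcript.
            return False
    return True
```

A message that is echoed but never triggers an action still passes, since the check is only about ordering.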
Caveats
- Avoid enabling transcript echo in sensitive group chats without consent controls.
- Accent/noise robustness varies by provider and language pack (needs verification).
Source attribution
This tip is aggregated from community/public sources and preserved with attribution.