Voice Input Observability in Practice: Enable `echoTranscript` to Echo Before Executing

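Problem/scenario: speech-to-text errors directly affect downstream automated execution, and users rarely notice them in time. Prerequisites: an audio transcription provider is enabled; you can modify `tools.media.audio`. Steps: 1) enable `echoTranscript`; 2) configure `echoFormat`; 3) add manual confirmation to high-risk flows; 4) record misrecognition samples and use them to iterate on prompts. Key configuration: `tools.media.audio.echoTranscript`, `echoFormat`. Verification: every voice message produces a readable echo before the agent acts. Risks and limits: the echo can expose sensitive dictated content, so use caution in group chats. Source attribution: PR #32150.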

Source: GitHub · Discovered: 2026-03-07 · Author: AytuncYildizli
Prerequisites
  • Audio transcription provider is configured and functioning.
  • You can edit media-understanding settings and redeploy the gateway.
Steps
  1. Set `tools.media.audio.echoTranscript: true` in the config for each target environment.
  2. Customize `echoFormat` to include a clear prefix (for example `🎙️ Heard: {transcript}`).
  3. For risky actions, require user confirmation after transcript echo and before tool execution.
  4. Collect false-transcription samples and refine prompts/language hints weekly.
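The steps above can be sketched in TypeScript as follows. This is a minimal illustration under stated assumptions, not the gateway's actual implementation: the `AudioEchoConfig` interface, `renderEcho`, `gateToolCall`, and the tool names in `RISKY_TOOLS` are all hypothetical. Only the `echoTranscript` and `echoFormat` keys and the `{transcript}` placeholder come from the tip itself.

```typescript
// Hypothetical shape mirroring the `tools.media.audio` keys named in the steps.
interface AudioEchoConfig {
  echoTranscript: boolean;
  echoFormat: string; // `{transcript}` marks where the transcribed text goes
}

const audioConfig: AudioEchoConfig = {
  echoTranscript: true,
  echoFormat: "🎙️ Heard: {transcript}",
};

// Returns the echo message, or null when echo is disabled.
function renderEcho(config: AudioEchoConfig, transcript: string): string | null {
  if (!config.echoTranscript) return null;
  // A replacer function keeps `$` sequences in the transcript literal.
  // Only the first `{transcript}` placeholder is substituted.
  return config.echoFormat.replace("{transcript}", () => transcript.trim());
}

// Illustrative tool names treated as risky (assumption, not from the source).
const RISKY_TOOLS = new Set(["shell.exec", "payments.send"]);

type Decision = "execute" | "await_confirmation";

// Risky tools wait for explicit user confirmation given after the echo.
function gateToolCall(toolName: string, userConfirmed: boolean): Decision {
  if (RISKY_TOOLS.has(toolName) && !userConfirmed) return "await_confirmation";
  return "execute";
}
```

A caller would send the result of `renderEcho` to the user first, then consult `gateToolCall` before running any tool derived from the transcript.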
Commands
openclaw gateway status
openclaw gateway restart
openclaw help
Verify

Every voice message produces a transcript echo before any downstream agent action.
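One way to check this ordering is to scan an event log for voice messages whose tool execution precedes the echo, or that never received an echo. The sketch below assumes a hypothetical `LogEvent` record with three event kinds. A real gateway log will have its own schema, so treat the field names as placeholders.

```typescript
// Hypothetical event record; adapt the fields to the actual log format.
interface LogEvent {
  messageId: string;
  kind: "voice_received" | "transcript_echo" | "tool_execution";
}

// Returns IDs of voice messages where a tool ran before the echo,
// or where no echo was ever produced. Events must be in time order.
function findEchoViolations(events: LogEvent[]): string[] {
  const voice = new Set<string>();
  const echoed = new Set<string>();
  const violations = new Set<string>();
  for (const e of events) {
    if (e.kind === "voice_received") {
      voice.add(e.messageId);
    } else if (e.kind === "transcript_echo") {
      echoed.add(e.messageId);
    } else if (voice.has(e.messageId) && !echoed.has(e.messageId)) {
      violations.add(e.messageId); // tool ran before any echo
    }
  }
  // Voice messages that never produced an echo also count as violations.
  voice.forEach((id) => {
    if (!echoed.has(id)) violations.add(id);
  });
  return Array.from(violations);
}
```

An empty result over a representative sample of voice messages is the pass condition.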

Caveats
  • Avoid enabling transcript echo in sensitive group chats without consent controls.
  • Accent/noise robustness varies by provider and language pack (needs verification).
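The group-chat caveat could be enforced with a small guard in front of the echo. This is a sketch under assumptions: `ChatContext`, its fields, and `shouldEcho` are hypothetical names, and the source does not describe any built-in consent control.

```typescript
// Hypothetical chat descriptor; field names are illustrative only.
interface ChatContext {
  isGroup: boolean;
  echoConsent: boolean; // whether participants agreed to transcript echo
}

// Decides whether a transcript may be echoed into the given chat.
function shouldEcho(echoTranscript: boolean, chat: ChatContext): boolean {
  if (!echoTranscript) return false;
  // Direct chats echo freely; group chats need explicit consent.
  return !chat.isGroup || chat.echoConsent;
}
```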
Source attribution

This tip is aggregated from community/public sources and preserved with attribution.
