
Cron scheduler error isolation: keep one bad job from taking down the whole schedule

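For the scenario where one malformed cron expression halts every scheduled job, a community PR offers an actionable remedy: isolate exceptions per job, accumulate an error count, and automatically disable high-risk jobs.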

Source: GitHub · Discovered: 2026-02-12 · Author: MarvinDontPanic
Prerequisites
  • You run multiple cron jobs, and at least one of them can be edited by humans or automation.
  • Gateway logs are retained, so consecutive scheduler failures can be observed.
Steps
  1. Audit current jobs for malformed schedules (invalid cron expression, wrong timezone, null everyMs).
  2. Upgrade to a build containing PR #14385, then restart the gateway.
  3. In staging, inject one intentionally malformed schedule to verify per-job try/catch isolation.
  4. Confirm that the job's error counter increments and that auto-disable triggers after repeated failures.
  5. Fix the schedule and verify that recovery resets the error count while other jobs stay healthy.
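
The audit in step 1 can be partly automated. The following is a minimal sketch, not the gateway's own validation: the dict layout and the `kind`, `expr`, and `tz` keys are hypothetical, and only `everyMs` is a name taken from the steps above. The cron check is deliberately shallow (field count and allowed characters), and the timezone check only catches unknown zone names, not a valid zone that is wrong for the job.

```python
import re
from zoneinfo import ZoneInfo

# Shallow per-field check: digits, names, and the usual cron punctuation.
_FIELD = re.compile(r"^[0-9A-Za-z*/,\-?]+$")


def audit_schedule(schedule):
    """Return a list of problems found in one schedule dict (empty if none).

    The dict layout is an assumption made for illustration only.
    """
    problems = []
    kind = schedule.get("kind")
    if kind == "cron":
        expr = schedule.get("expr")
        fields = expr.split() if isinstance(expr, str) else []
        if len(fields) not in (5, 6):
            problems.append(f"cron expr needs 5 or 6 fields, got {len(fields)}")
        elif not all(_FIELD.match(f) for f in fields):
            problems.append("cron expr contains unexpected characters")
        tz = schedule.get("tz")
        if tz is not None:
            try:
                ZoneInfo(tz)  # raises if the zone name cannot be resolved
            except (KeyError, ValueError, OSError, TypeError):
                problems.append(f"unknown timezone: {tz!r}")
    elif kind == "every":
        every_ms = schedule.get("everyMs")
        is_number = isinstance(every_ms, (int, float)) and not isinstance(every_ms, bool)
        if not is_number or every_ms <= 0:
            problems.append(f"everyMs must be a positive number, got {every_ms!r}")
    else:
        problems.append(f"unknown schedule kind: {kind!r}")
    return problems
```

Running this over an export of your job definitions gives a list of candidates to fix before the upgrade; an empty list means only that nothing obviously malformed was found.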
Commands
openclaw gateway status
openclaw gateway restart
openclaw help
Verify

A broken cron job no longer blocks healthy jobs: the scheduler loop keeps running, and only the faulty job is disabled.
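
The expected behavior can be pictured with a small model. This is an illustrative sketch of the general pattern (per-job try/except, a consecutive-error counter, auto-disable at a threshold, reset on success), not the code from the PR. The `Job` class, the `tick` function, and the threshold of 3 are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_CONSECUTIVE_ERRORS = 3  # hypothetical threshold, not taken from the PR


@dataclass
class Job:
    name: str
    compute_next_run: Callable[[], float]  # may raise on a malformed schedule
    enabled: bool = True
    consecutive_errors: int = 0
    last_error: Optional[str] = None


def tick(jobs):
    """Run one scheduler pass; return names of jobs scheduled successfully."""
    scheduled = []
    for job in jobs:
        if not job.enabled:
            continue
        try:
            job.compute_next_run()
        except Exception as exc:  # isolate: one bad job must not abort the loop
            job.consecutive_errors += 1
            job.last_error = str(exc)
            if job.consecutive_errors >= MAX_CONSECUTIVE_ERRORS:
                job.enabled = False  # auto-disable after repeated failures
            continue
        job.consecutive_errors = 0  # a successful pass resets the counter
        job.last_error = None
        scheduled.append(job.name)
    return scheduled


def broken_schedule():
    raise ValueError("invalid cron expression")


jobs = [Job("bad", broken_schedule), Job("good", lambda: 0.0)]
passes = [tick(jobs) for _ in range(4)]
```

In this model the healthy job is scheduled on every pass, while the broken one accumulates errors, is disabled once the threshold is reached, and is skipped afterwards. When checking a real gateway, look for the same shape in the logs rather than for these exact names.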

Caveats
  • The PR was still open at discovery time; behavior and field names may change before merge (needs verification).
  • The auto-disable threshold may need policy tuning for bursty jobs.
Source attribution

This tip is aggregated from community/public sources and preserved with attribution.
