Weekly Workflow SLO Digest — ending 2026-02-26
Weekly Summary
- Days with jobs: 1
- Jobs total: 24
- Avg success rate: 95.83%
- Avg p95 wait: 232.44s
Recurring Failure Sources
- error: 1
Top Model Pressure (by day presence)
- local/qwen-14b: top-load on 1 day(s)
Recommendations
- Raise queue reliability: inspect jobs with non-complete statuses and tighten retry and timeout settings for failing workflows.
- Reduce queue congestion: reserve high/urgent priority for prod-critical workloads and defer experimental ones.
- Add targeted runbooks for timeout- and error-heavy workflows, and verify their fallback reason codes.
Daily Breakdown
- 2026-02-20: jobs=0, success=0.0%, p95=0.0s
- 2026-02-21: jobs=0, success=0.0%, p95=0.0s
- 2026-02-22: jobs=0, success=0.0%, p95=0.0s
- 2026-02-23: jobs=0, success=0.0%, p95=0.0s
- 2026-02-24: jobs=0, success=0.0%, p95=0.0s
- 2026-02-25: jobs=0, success=0.0%, p95=0.0s
- 2026-02-26: jobs=24, success=95.83%, p95=232.44s
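The Weekly Summary figures can be cross-checked against the Daily Breakdown. The sketch below is illustrative and is not the digest generator itself: the `summarize` function, the regular expression, and the result keys are hypothetical names. It assumes the weekly averages are taken over days with at least one job, which is consistent with the numbers reported here (one active day, so the averages equal that day's values).

```python
import re

# Daily Breakdown lines copied from this digest.
DAILY = """\
- 2026-02-20: jobs=0, success=0.0%, p95=0.0s
- 2026-02-21: jobs=0, success=0.0%, p95=0.0s
- 2026-02-22: jobs=0, success=0.0%, p95=0.0s
- 2026-02-23: jobs=0, success=0.0%, p95=0.0s
- 2026-02-24: jobs=0, success=0.0%, p95=0.0s
- 2026-02-25: jobs=0, success=0.0%, p95=0.0s
- 2026-02-26: jobs=24, success=95.83%, p95=232.44s
"""

# Hypothetical parser for the "- DATE: jobs=N, success=X%, p95=Ys" line shape.
LINE_RE = re.compile(
    r"- (?P<day>\d{4}-\d{2}-\d{2}): jobs=(?P<jobs>\d+), "
    r"success=(?P<success>[\d.]+)%, p95=(?P<p95>[\d.]+)s"
)


def summarize(text):
    """Recompute the weekly summary from daily breakdown lines.

    Assumption: averages cover only days with jobs > 0, so idle days
    (reported as 0.0% / 0.0s) do not drag the averages down.
    """
    rows = [m.groupdict() for m in LINE_RE.finditer(text)]
    active = [r for r in rows if int(r["jobs"]) > 0]
    n = len(active)
    return {
        "days_with_jobs": n,
        "jobs_total": sum(int(r["jobs"]) for r in rows),
        "avg_success_pct": (
            round(sum(float(r["success"]) for r in active) / n, 2) if n else None
        ),
        "avg_p95_wait_s": (
            round(sum(float(r["p95"]) for r in active) / n, 2) if n else None
        ),
    }


summary = summarize(DAILY)
print(summary)
```

Under that assumption the recomputed values match the Weekly Summary. A 95.83% success rate over 24 jobs also corresponds to 23 of 24 jobs succeeding, which agrees with the single entry under Recurring Failure Sources.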