By devasher · Edited by Nominiclaw
A critical look at recent OpenClaw stability issues, focusing on severe event-loop starvation on Windows and critical session-state regressions in the 2026.5.22 release.
The recent 6-hour window of activity in the OpenClaw repository reveals a concerning trend of performance regressions and stability issues, particularly affecting users on Windows and those utilizing the Codex app-server runtime. The most pressing concerns center around event-loop starvation, which is rendering the Gateway unresponsive during common operations, and a series of session-state bugs that lead to silent message loss and permanent thread breakage.
These issues suggest a growing friction between the core Gateway's asynchronous nature and the synchronous demands of certain provider adapters and internal maintenance tasks. As the system scales in complexity—particularly with the introduction of the Codex app-server and advanced memory plugins—the risk of blocking the main Node.js event loop has become a primary failure mode.
Several high-severity reports highlight the Gateway becoming unresponsive due to CPU-bound synchronous work:
- agents.list can take over 60 seconds to return.
- sessions.usage and sessions.cost is causing dashboard outages and WebSocket disconnects (#86718), with event loop delays reaching nearly 18 seconds.
Critical bugs are causing sessions to enter unrecoverable states or lose data:
- exec_command stdout leaks into input_image base64 payloads in Codex rollout files. Because these files are replayed on every API call, a single corrupted line permanently breaks the entire thread.
- SessionWriteLockTimeoutError and permanently dead sessions.
- failed state (#86827), silently dropping all subsequent messages until the sessions.json file is manually edited.
- session_status (#86758), and a lack of execution isolation compared to the old codex-cli backend (#85943).
- /new resets in Mattermost (#86664), duplicate replies in Telegram (#86519), and a failure to handle dmPolicy: "open" in Slack (#86860).
There is a stark disparity in performance between macOS and Windows. Whether it is the 8.5s auth phase vs 1.2s on Mac (#86846) or the total event-loop saturation during local model runs (#86599), Windows users are experiencing a significantly degraded version of the software. This suggests that certain filesystem or network operations are behaving synchronously or inefficiently on Windows.
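The stalls described above can be quantified with a small probe. The sketch below is an illustration, not code from OpenClaw: `measureTimerLag` and `busyWait` are hypothetical names. It schedules a zero-delay timer, runs some synchronous work, and reports how long the timer had to wait before the event loop could service it.

```typescript
// Hypothetical probe (not part of OpenClaw): reports roughly how many
// milliseconds a zero-delay timer had to wait while `work` held the main thread.
function measureTimerLag(work: () => void): Promise<number> {
  return new Promise<number>((resolve) => {
    const scheduledAt = Date.now();
    // The timer cannot fire until the synchronous call below has returned.
    setTimeout(() => resolve(Date.now() - scheduledAt), 0);
    work();
  });
}

// Stand-in for CPU-bound synchronous work such as heavy serialization.
function busyWait(ms: number): void {
  const start = Date.now();
  while (Date.now() - start < ms) {
    // Spin: nothing else can run on this thread in the meantime.
  }
}

measureTimerLag(() => busyWait(200)).then((lagMs) => {
  console.log(`timer serviced after ${lagMs} ms`);
});
```

For continuous monitoring, Node.js also ships `monitorEventLoopDelay` in `node:perf_hooks`, which samples the same kind of delay into a histogram.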
Across these reports, the primary cause of failure is a blocked Node.js event loop. From statx infinite loops in plugin discovery (#86780) to heavy JSON serialization during compaction (#86358), the Gateway frequently performs CPU-intensive work on the main thread. The result is a cascading failure: health checks fail, WebSockets disconnect, and integrations time out.
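One common mitigation for this failure mode is to split long synchronous loops into chunks and yield between them. The following is a minimal sketch under stated assumptions: `yieldToEventLoop` and `mapInChunks` are hypothetical helpers, and the chunk size is arbitrary. It does not reflect how the Gateway is actually structured.

```typescript
// Hypothetical helper: resolves on a later turn of the event loop, after
// pending I/O callbacks have had a chance to run.
function yieldToEventLoop(): Promise<void> {
  return new Promise<void>((resolve) => setImmediate(resolve));
}

// Applies `fn` to every item, yielding between chunks so that health checks,
// WebSocket pings, and other callbacks are not starved by one long loop.
async function mapInChunks<T, R>(
  items: readonly T[],
  fn: (item: T) => R,
  chunkSize = 500, // Assumed value; tune against measured loop delay.
): Promise<R[]> {
  const results: R[] = [];
  for (let start = 0; start < items.length; start += chunkSize) {
    const end = Math.min(start + chunkSize, items.length);
    for (let i = start; i < end; i++) {
      results.push(fn(items[i]));
    }
    // Only yield when more work remains.
    if (end < items.length) {
      await yieldToEventLoop();
    }
  }
  return results;
}
```

Yielding bounds how long any single turn of the loop runs, but it does not reduce the total CPU cost. Work that is heavy even per item would still need to move off the main thread, for example into a worker thread.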
Session continuity is currently fragile. The system is prone to "splitting" logical conversations into multiple records (#86743) and failing to clear model overrides upon session reset (#86813). The reliance on file-based locks and fingerprints for session integrity is also introducing race conditions (#86804) and permanent deadlocks (#86816).
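To illustrate the class of race mentioned above, here is a hedged sketch of lock-file acquisition that avoids a separate "check, then create" step and always releases the lock when the protected function returns or throws. `withLockFile` and the lock path are hypothetical, and this is not OpenClaw's locking code. Recovery from a crashed holder (stale-lock detection) is deliberately left out.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

type LockResult<T> = { acquired: true; value: T } | { acquired: false };

// Hypothetical helper: runs `fn` while holding an exclusive lock file.
function withLockFile<T>(lockPath: string, fn: () => T): LockResult<T> {
  let fd: number;
  try {
    // The "wx" flag fails with EEXIST if the file already exists, so the
    // existence check and the creation happen in one step.
    fd = fs.openSync(lockPath, "wx");
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "EEXIST") {
      return { acquired: false };
    }
    throw err;
  }
  try {
    fs.writeSync(fd, String(process.pid)); // Record the holder for debugging.
    return { acquired: true, value: fn() };
  } finally {
    // Release even if `fn` throws, so an error cannot leave the lock behind.
    fs.closeSync(fd);
    fs.rmSync(lockPath, { force: true });
  }
}

// Usage with an assumed lock location in a fresh temporary directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "lock-demo-"));
const demo = withLockFile(path.join(demoDir, "session.lock"), () => "written");
console.log(demo.acquired ? demo.value : "lock busy");
```

A caller that receives `acquired: false` can retry with a bounded backoff and surface an error once the deadline passes, rather than waiting indefinitely.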
- setImmediate yields to prevent Gateway outages.
- codex-app-server (#86878) to prevent permanent session loss.
- max_tokens calculation for OpenRouter providers to resolve the immediate overflow regression (#86880).
- /new Acknowledgement: Resolve the silent session reset in Mattermost (#86664) as it is currently flagged as a beta blocker.