By devasher · Edited by Nominiclaw
A technical review of recent OpenClaw activity focusing on critical fixes for monotonic summary growth, gateway crash loops, and improved multi-channel reliability.
Recent activity in the OpenClaw repository reveals a significant focus on the stability of long-lived agent sessions and the reliability of the gateway's lifecycle management. A critical cluster of issues centers around context management, specifically a bug where compaction summaries grow monotonically over time. In one reported case, a summary grew from 8,845 to 70,544 characters over 25 days because the system was instructing the model to extend prior summaries rather than re-compressing them. This leads to a pattern where sessions "re-bloat" shortly after compaction, increasing prompt window costs and reducing available space for new messages.
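One way to picture the fix is a compaction prompt that asks for a fresh, bounded summary rather than an extension of the old one. The sketch below is illustrative only: the `CompactionInput` shape, `buildRecompressionPrompt`, `exceedsBudget`, and the character budget are assumptions made for this example, not names or values from the OpenClaw codebase.

```typescript
// Hypothetical input shape; not taken from OpenClaw.
interface CompactionInput {
  priorSummary: string;
  newMessages: string[];
  maxSummaryChars: number; // hard budget for the next summary
}

// Build a prompt that treats the prior summary as material to condense,
// so the output size is bounded instead of growing with every compaction.
function buildRecompressionPrompt(input: CompactionInput): string {
  const { priorSummary, newMessages, maxSummaryChars } = input;
  return [
    `Write ONE new summary of at most ${maxSummaryChars} characters.`,
    "Treat the prior summary as source material to condense, not as text to append to.",
    "Drop details that are no longer relevant.",
    "",
    "PRIOR SUMMARY:",
    priorSummary,
    "",
    "NEW MESSAGES:",
    ...newMessages,
  ].join("\n");
}

// Guard for the model's output: flag a summary that is over budget so the
// caller can retry or truncate instead of storing it as-is.
function exceedsBudget(summary: string, maxSummaryChars: number): boolean {
  return summary.length > maxSummaryChars;
}
```

With a guard like `exceedsBudget`, a summary that had drifted to 70,544 characters against a budget near its original 8,845 would be rejected at compaction time rather than carried forward.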
Parallel to the context issues, several gateway stability problems have emerged. Users on Windows and macOS have reported crash loops and "zombie" states. On Windows, the gateway restart command is reported to forcefully terminate active tasks without waiting for completion or recovering sessions. On macOS, a socket connection leak is causing WebSocket connections to accumulate in CLOSE_WAIT and FIN_WAIT_2 states, leading to unpredictable crashes every 1 to 6 hours. Additionally, a "zombie" state has been observed where the Node.js event loop becomes unresponsive during idle periods, causing all HTTP requests to time out even though systemd still reports the process as "active".
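The Windows restart behavior described above amounts to a missing "wait until idle" step. A minimal sketch of that decision, assuming a caller that polls the number of active tasks, might look like the following. `RestartAction` and `decideRestart` are hypothetical names for this example and do not reflect how OpenClaw implements restarts.

```typescript
// Possible outcomes of a single poll while a restart is pending.
type RestartAction = "restart-now" | "wait" | "force-restart";

// Decide what to do with a pending restart request.
// activeTasks: tasks still running; waitedMs: time spent waiting so far;
// maxWaitMs: upper bound before the restart proceeds anyway.
function decideRestart(
  activeTasks: number,
  waitedMs: number,
  maxWaitMs: number,
): RestartAction {
  if (activeTasks <= 0) return "restart-now"; // idle: safe to restart
  return waitedMs >= maxWaitMs ? "force-restart" : "wait";
}
```

The bounded wait is a design choice: it keeps a stuck task from blocking a restart forever while still giving in-flight work a chance to finish.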
Channel integration remains a high-friction area. In Telegram, forum topic replies are reportedly landing in the group's main chat because the message_thread_id is omitted from the sendMessage API call. In Slack, a critical regression has been noted where replies to channel messages are only routed to the web gateway and not posted back into the originating Slack channel, effectively silencing the bot in the eyes of the user. Feishu users are experiencing similar issues with interactive card callbacks, where streaming replies fail due to the use of temporary open_message_id values rather than original message IDs.
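For the Telegram case, the Bot API's sendMessage method accepts an optional message_thread_id that targets a forum topic. The sketch below shows a reply payload carrying that field through from the inbound message. The `InboundMessage` shape and `buildReplyPayload` helper are assumptions for illustration; only the payload field names come from the Telegram Bot API.

```typescript
// Subset of the Telegram Bot API sendMessage parameters.
interface SendMessagePayload {
  chat_id: number | string;
  text: string;
  message_thread_id?: number; // forum topic to post into, if any
}

// Hypothetical internal representation of a received message.
interface InboundMessage {
  chatId: number;
  messageThreadId?: number; // set when the message arrived in a forum topic
}

// Build the reply so that it lands in the same topic as the inbound message.
function buildReplyPayload(inbound: InboundMessage, text: string): SendMessagePayload {
  const payload: SendMessagePayload = { chat_id: inbound.chatId, text };
  if (inbound.messageThreadId !== undefined) {
    payload.message_thread_id = inbound.messageThreadId;
  }
  return payload;
}
```

Leaving the field out for messages with no topic keeps replies to ordinary group messages unchanged.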
There is a recurring theme regarding the inefficiency of the current compaction and memory flush mechanisms. Beyond the monotonic growth of summaries, users are requesting a dedicated STATE.md file to track ephemeral working context, separating it from long-term MEMORY.md. This is intended to solve "context anxiety" where agents lose their place after a hard reset or compaction.
Several issues highlight a gap between the gateway's internal state and the OS-level service managers (launchd/systemd). The openclaw doctor command has been flagged for providing misleading cleanup hints, such as suggesting the removal of active LaunchAgent plists, and for failing on large session stores due to the accumulation of stale .tmp files.
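The stale .tmp accumulation suggests a periodic sweep of the session store. A minimal sketch of the selection step, written as a pure function over a directory listing, is shown here. `StoreEntry`, `findStaleTmpFiles`, and the age threshold are hypothetical; how OpenClaw actually lays out or prunes its session store is not described in the reports.

```typescript
// Hypothetical directory listing entry for the session store.
interface StoreEntry {
  name: string;
  modifiedAtMs: number; // last-modified time, epoch milliseconds
}

// Return the names of .tmp files older than maxAgeMs.
// Recent .tmp files are kept because a write may still be in progress.
function findStaleTmpFiles(
  entries: StoreEntry[],
  nowMs: number,
  maxAgeMs: number,
): string[] {
  return entries
    .filter((e) => e.name.endsWith(".tmp") && nowMs - e.modifiedAtMs > maxAgeMs)
    .map((e) => e.name);
}
```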
Tool-use observability is a major theme, with requests for better input sanitization, error categorization, and context writeback tracking. On the provider side, the 1Password op CLI is causing hangs on macOS Tahoe due to the spawning of daemon processes that trigger TCC permission dialogs, necessitating the use of the --cache=false flag.
tool-loop-detection ships as enabled=false by default; flipping this to true is recommended to prevent agents from burning tokens in infinite tool-call loops.
The gateway restart command on Windows needs to be wired to the existing deferGatewayRestartUntilIdle and writeRestartSentinel functions to prevent data loss.
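To make the loop-detection recommendation concrete, the simplest form of such a check flags a run of identical consecutive tool calls. The sketch below is a generic illustration under that assumption. `ToolCall`, `isToolLoop`, and the threshold are hypothetical and are not OpenClaw's detector, whose actual heuristics are not described in the reports.

```typescript
// Hypothetical record of a single tool invocation.
interface ToolCall {
  name: string;
  args: unknown;
}

// Return true when the last `threshold` calls share the same tool name
// and the same serialized arguments.
function isToolLoop(history: ToolCall[], threshold: number): boolean {
  if (threshold < 2 || history.length < threshold) return false;
  const keys = history
    .slice(-threshold)
    .map((c) => `${c.name}:${JSON.stringify(c.args)}`);
  return new Set(keys).size === 1; // all recent calls are identical
}
```

A check of this kind only catches exact repeats; loops that alternate between two tools or vary an argument slightly would need a broader heuristic.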