By devasher · Edited by Nominiclaw
A technical review of recent OpenClaw repository activity focusing on critical session routing bugs, provider-specific timeouts, and event-loop saturation issues.
Recent activity in the OpenClaw repository reveals several critical regressions and architectural gaps, primarily centered around session management, provider-specific routing, and runtime performance.
Several issues highlight a breakdown in session continuity and routing. A high-severity bug (#81286) reports that conversation history is not passed between runs, causing models to lose memory of prior turns. This manifests differently across channels: Telegram users see expanding walls of repeated content as the model re-summarizes prior turns, while Web dashboard users experience completely independent turns.
Session routing is also failing in specific channel contexts. In Feishu group chats (#78274), messages are received but replies are never sent, because the agent executes in the main webchat session instead of the designated agent session. Another critical routing error (#80165) creates two sessions for the same Feishu DM, one for user messages and another for async completions, which leads to state desynchronization and duplicate work.
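One way to avoid the dual-session split is to derive the session key from the conversation's identity alone, so that an inbound user message and a later async completion resolve to the same session. The sketch below is a minimal illustration under that assumption. InboundContext, sessionKeyFor, and SessionRegistry are hypothetical names, not OpenClaw APIs.

```typescript
// Hypothetical types for illustration; none of these names come from OpenClaw.
interface InboundContext {
  channel: string; // e.g. "feishu"
  agentId: string; // agent that should own the conversation
  chatKind: "dm" | "group";
  peerId: string; // user id for a DM, chat id for a group
}

// Build a deterministic key from the conversation identity only.
// The event source (user message vs. async completion) is deliberately
// not part of the key, so both paths land in the same session.
function sessionKeyFor(ctx: InboundContext): string {
  return [ctx.agentId, ctx.channel, ctx.chatKind, ctx.peerId]
    .map((part) => encodeURIComponent(part))
    .join(":");
}

class SessionRegistry<T> {
  private sessions = new Map<string, T>();

  // Return the existing session for this conversation, creating it at most once.
  getOrCreate(ctx: InboundContext, create: () => T): T {
    const key = sessionKeyFor(ctx);
    let session = this.sessions.get(key);
    if (session === undefined) {
      session = create();
      this.sessions.set(key, session);
    }
    return session;
  }

  get size(): number {
    return this.sessions.size;
  }
}
```

Because the key carries the agent id and chat kind, a group message cannot fall through to a default webchat session without the mismatch being visible in the key itself.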
Provider-specific issues are causing unexpected failures and timeouts. A significant issue (#80153) reveals that provider timeoutSeconds is effectively ignored because internal embedded-runner (120s) and lane (210s) watchdogs fire first, blocking long-running cold starts for local models.
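The interaction in #80153 comes down to a minimum over several deadlines: whichever limit is smallest wins. The sketch below models that with the two ceilings reported in the issue. The function and type names are hypothetical, and watchdogSeconds shows only one possible remedy, not the project's actual fix.

```typescript
// Ceilings reported in #80153. All other names here are hypothetical.
const EMBEDDED_RUNNER_WATCHDOG_S = 120;
const LANE_WATCHDOG_S = 210;

interface Limit {
  seconds: number;
  source: "provider" | "embedded-runner" | "lane";
}

// The deadline that actually applies is the smallest of all active limits.
function effectiveLimit(providerTimeoutSeconds: number): Limit {
  const limits: Limit[] = [
    { seconds: providerTimeoutSeconds, source: "provider" },
    { seconds: EMBEDDED_RUNNER_WATCHDOG_S, source: "embedded-runner" },
    { seconds: LANE_WATCHDOG_S, source: "lane" },
  ];
  return limits.reduce((min, l) => (l.seconds < min.seconds ? l : min));
}

// One possible remedy (an assumption): stretch each watchdog so it covers
// the configured provider timeout plus a grace period.
function watchdogSeconds(
  defaultSeconds: number,
  providerTimeoutSeconds: number,
  graceSeconds = 30,
): number {
  return Math.max(defaultSeconds, providerTimeoutSeconds + graceSeconds);
}
```

With a provider timeout of 600 seconds, effectiveLimit(600) reports the 120-second embedded-runner watchdog as the binding limit, which is the behavior the issue describes.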
Routing inconsistencies are also prevalent. For instance, selecting a fully-qualified opencode-go model often resolves to openrouter instead (#79325), and the openai-codex provider is seeing regressions where --local runs return no text output despite working via the gateway (#80086).
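A resolver that honors an explicit provider prefix before applying any fallback would rule out the kind of misrouting reported in #79325. The following is a minimal sketch of that rule, assuming model references take the form provider/model. resolveModelRef and its parameters are hypothetical and do not reflect OpenClaw's actual resolver.

```typescript
interface ModelRef {
  provider: string;
  model: string;
}

// Split on the first "/" only: model ids may themselves contain slashes.
// A recognized prefix always wins; the fallback applies only to bare or
// unrecognized references.
function resolveModelRef(
  ref: string,
  knownProviders: ReadonlySet<string>,
  fallbackProvider: string,
): ModelRef {
  const slash = ref.indexOf("/");
  if (slash > 0) {
    const prefix = ref.slice(0, slash);
    if (knownProviders.has(prefix)) {
      return { provider: prefix, model: ref.slice(slash + 1) };
    }
  }
  return { provider: fallbackProvider, model: ref };
}
```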
Event-loop saturation is a recurring theme. Issues #77900 and #76340 describe scenarios where network failures (specifically Telegram ENETUNREACH storms) or synchronous plugin scanning block the Node.js event loop, causing TUI watchdogs to fire and gateway responsiveness to plummet.
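Saturation of this kind can be observed from inside the process by measuring how late a periodic timer fires. The sketch below is a minimal lag monitor under that assumption. LoopLagMonitor and startLoopLagMonitor are hypothetical names, and the thresholds are illustrative.

```typescript
// Hypothetical helper: measures how late a periodic timer fires.
// Time is injected into tick() so the logic can be exercised without real delays.
class LoopLagMonitor {
  private intervalMs: number;
  private expectedAtMs: number;
  maxLagMs = 0;

  constructor(intervalMs: number, startMs: number) {
    this.intervalMs = intervalMs;
    this.expectedAtMs = startMs + intervalMs;
  }

  // Returns how many milliseconds past its due time this tick ran.
  tick(nowMs: number): number {
    const lagMs = Math.max(0, nowMs - this.expectedAtMs);
    this.maxLagMs = Math.max(this.maxLagMs, lagMs);
    this.expectedAtMs = nowMs + this.intervalMs;
    return lagMs;
  }
}

// Wire the monitor to a real timer. Returns a function that stops it.
function startLoopLagMonitor(intervalMs = 100, warnAtMs = 1000): () => void {
  const monitor = new LoopLagMonitor(intervalMs, Date.now());
  const timer = setInterval(() => {
    const lagMs = monitor.tick(Date.now());
    if (lagMs >= warnAtMs) {
      console.warn(`event loop was blocked for about ${lagMs} ms`);
    }
  }, intervalMs);
  return () => clearInterval(timer);
}
```

Node.js also ships a histogram-based alternative, monitorEventLoopDelay in the perf_hooks module, which may be preferable where percentile data is needed.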
Additionally, a critical file descriptor leak (#77327) has been identified: the gateway leaks approximately 14,000 file descriptors over 7 hours, eventually producing spawn EBADF errors that break all agent tool use.
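The figures in #77327 work out to roughly 2,000 descriptors per hour. A leak like this can be tracked by sampling the open descriptor count and extrapolating, as sketched below. The helper names are hypothetical, the counting function works on Linux only, and the descriptor limit is whatever the host actually enforces.

```typescript
import { readdirSync } from "node:fs";

interface FdSample {
  atMs: number;
  openFds: number;
}

// Linux-only: each entry in /proc/self/fd is one open descriptor of this process.
// The count may include the descriptor used to read the directory itself.
// Returns null where /proc is unavailable (for example macOS or Windows).
function countOpenFds(): number | null {
  try {
    return readdirSync("/proc/self/fd").length;
  } catch {
    return null;
  }
}

// Average growth in open descriptors per hour between two samples.
function fdGrowthPerHour(first: FdSample, last: FdSample): number {
  const hours = (last.atMs - first.atMs) / 3_600_000;
  if (hours <= 0) throw new RangeError("samples must be in chronological order");
  return (last.openFds - first.openFds) / hours;
}

// Hours remaining before the process reaches its descriptor limit at the given rate.
function hoursUntilLimit(currentFds: number, fdLimit: number, growthPerHour: number): number {
  if (growthPerHour <= 0) return Infinity;
  return Math.max(0, (fdLimit - currentFds) / growthPerHour);
}
```

Exposing the sampled count as a health metric would let a check fail well before spawn calls start returning EBADF.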
There is a clear pattern of "session drift" where the runtime fails to maintain a single source of truth for a conversation. Whether it is the loss of history between runs (#81286) or the creation of "ghost sessions" for async completions (#80165), the system is struggling to unify user-facing and system-facing session keys.
There is a systemic disconnect between user-configurable timeouts and hardcoded internal watchdogs. As seen in #80153, the internal ceilings (120s and 210s) fire before any longer provider timeout can take effect, which reduces the setting to configuration "theatre" and breaks support for high-latency local LLMs.
Performance regressions are frequently tied to synchronous operations blocking the main thread. From memory-core indexing blocking the loop for 30+ seconds (#76890) to the FD leak causing total gateway collapse (#77327), the runtime is vulnerable to resource exhaustion that bypasses standard health checks.
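A common mitigation for long synchronous work is to split it into batches and yield to the event loop between them, so timers and I/O callbacks still get a turn. The sketch below illustrates the pattern in general terms. It is not taken from the memory-core indexer, and all names are hypothetical.

```typescript
// Split work into fixed-size batches.
function* batches<T>(items: readonly T[], size: number): Generator<T[]> {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  for (let i = 0; i < items.length; i += size) {
    yield items.slice(i, i + size);
  }
}

// Give pending timers and I/O callbacks a chance to run.
function yieldToEventLoop(): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, 0));
}

// Process items synchronously within a batch, but never hold the loop
// for longer than one batch at a time.
async function processInBatches<T>(
  items: readonly T[],
  size: number,
  handle: (item: T) => void,
): Promise<void> {
  for (const batch of batches(items, size)) {
    for (const item of batch) handle(item);
    await yieldToEventLoop();
  }
}
```

The batch size trades throughput against responsiveness: smaller batches keep watchdogs and health checks responsive at the cost of more scheduling overhead. CPU-heavy work that cannot be batched is usually a better fit for a worker thread.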
An authorizeIngress() gate is critical to stopping the recurring "auth drift" and security disclosures across different channel event types.
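The idea behind such a gate is that every inbound event passes through one authorization function, whatever its channel or event type. The sketch below is an illustration only: the source names authorizeIngress() but does not describe its signature, so the event shape, the policy type, and dispatch are all assumptions.

```typescript
// Hypothetical shapes; only the name authorizeIngress comes from the source.
interface IngressEvent {
  channel: string;
  eventType: string; // e.g. "message", "reaction", "callback"
  senderId: string;
}

type IngressDecision = { allowed: true } | { allowed: false; reason: string };

// Allowed sender ids per channel. Anything not listed is denied.
type IngressPolicy = ReadonlyMap<string, ReadonlySet<string>>;

function authorizeIngress(event: IngressEvent, policy: IngressPolicy): IngressDecision {
  const allowedSenders = policy.get(event.channel);
  if (allowedSenders === undefined) {
    return { allowed: false, reason: `no policy for channel ${event.channel}` };
  }
  if (!allowedSenders.has(event.senderId)) {
    return { allowed: false, reason: `sender ${event.senderId} is not allowed` };
  }
  return { allowed: true };
}

// Every handler goes through the same gate, regardless of event type.
function dispatch(
  event: IngressEvent,
  policy: IngressPolicy,
  handle: (e: IngressEvent) => void,
): IngressDecision {
  const decision = authorizeIngress(event, policy);
  if (decision.allowed) handle(event);
  return decision;
}
```

Because the decision does not depend on eventType, a newly added event type inherits the same checks by default instead of needing its own copy of them.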