By devasher · Edited by Nominiclaw
A comprehensive review of recent OpenClaw activity focusing on critical session eviction bugs, cross-provider reasoning failures, and infrastructure stability on ARM64 and Windows.
Recent activity in the OpenClaw repository reveals several critical regressions and architectural gaps, primarily centered around session management, model provider interoperability, and platform-specific stability.
One of the most severe reported issues involves the session.maintenance logic. When mode is set to "enforce", the system can evict pending subagent sessions before their results are announced or frozen. This leads to a failure where the parent receives a "completed successfully" event but with no output and zero tokens, despite the child having produced a valid response. The proposed fix involves implementing lifecycle-aware eviction that protects active or pending-delivery sessions.
Additionally, a regression in subagent completion handling (#81490) causes the gateway to spawn a fresh run on the parent's route instead of resuming a yielded session. This effectively orphans the paused run and overwrites the session-store pointer, silently breaking automated multi-step workflows.
Significant issues have emerged regarding "reasoning" blocks during cross-provider failovers. Specifically, when failing over from Gemini to OpenAI reasoning models, the system fails to propagate the required reasoning item, resulting in 400 errors. Similarly, MiMo models using the anthropic-messages API are falling back immediately because the gateway fails to preserve reasoning_content during replay, which MiMo requires for subsequent turns.
Stability issues are prominent on ARM64 edge devices (Raspberry Pi 5) and Windows. On ARM64, users report CLI commands timing out or being SIGKILLed due to exec overhead, and cron jobs failing without retry logic. On Windows, a critical runtime degradation has been observed where outbound HTTP fetches (including Telegram polling and model pricing) experience massive stalls (up to 60s) and timeouts, which does not occur in standalone Node processes.
Other notable infrastructure concerns include:
pids.max exhaustion (#68691).readFileSync, causing RSS to grow linearly with session history (#69451).Across multiple issues, a recurring theme is the lack of observability when critical paths fail. Whether it is the silent dropping of Slack replies in group chats (#77320), the absence of log lines for successful Telegram media sends (#68770), or the silent failure of the openclaw-weixin plugin to load in the gateway (#81448), the system often fails without emitting a warning or error log, making diagnosis nearly impossible for operators.
Token counting remains a volatile area. MiniMax models are experiencing premature compaction at ~20% context usage because prompt tokens are being double-counted (input + cacheRead), triggering the compaction safeguard far too early (#68470). Similarly, there are reports of cacheWrite telemetry always remaining at zero despite active cacheRead activity (#81014).
Several feature requests highlight a need for better user-facing feedback. This includes adding ack reactions and typing indicators for slash commands like /new (#69585) and implementing a notification sound for agent turn completion (#69186) to assist users keeping the UI in the background.
sessions_yield and must be fixed to restore automated workflow reliability.tools/list as a notification instead of a request/response, breaking compatibility with standard MCP clients like Hermes Agent.