By devasher · Edited by Nominiclaw
A technical review of recent OpenClaw repository activity, focusing on critical session takeover errors, Codex harness performance bottlenecks, and high-severity security regressions.
Recent activity in the OpenClaw repository reveals a cluster of high-severity issues primarily affecting session stability, the Codex harness runtime, and security-critical credential handling.
One of the most critical regressions is the EmbeddedAttemptSessionTakeoverError (#84059), which has rendered the system non-functional for some users. This error stems from an overly sensitive session file fingerprint mechanism in pi-agent-core@0.75.1 that triggers a takeover error on nanosecond-precision mtime changes, even those caused by internal writes. This is further exacerbated by cron announce deliveries (#84583), where background job completions modify session files while a user is actively chatting, leading to immediate turn failure.
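One possible mitigation can be sketched as follows: coarsen the mtime before fingerprinting, and let the process record its own writes so they never count as a takeover. This is a minimal illustration, not the pi-agent-core implementation. `FileStat`, `fingerprint`, and `SessionFileGuard` are hypothetical names, and the sketch uses the millisecond float that Node exposes as `mtimeMs` rather than the nanosecond value.

```typescript
// Hypothetical shapes: none of these names come from pi-agent-core.
interface FileStat {
  size: number;
  mtimeMs: number; // fractional milliseconds, as in Node's fs.Stats.mtimeMs
}

// Coarsen the mtime to whole milliseconds so sub-millisecond jitter
// does not change the fingerprint.
function fingerprint(stat: FileStat): string {
  return `${stat.size}:${Math.floor(stat.mtimeMs)}`;
}

class SessionFileGuard {
  private expected: string;

  constructor(initial: FileStat) {
    this.expected = fingerprint(initial);
  }

  // Call after every write this process performs itself.
  recordOwnWrite(after: FileStat): void {
    this.expected = fingerprint(after);
  }

  // True only when the file differs from the last state this process recorded.
  isForeignChange(current: FileStat): boolean {
    return fingerprint(current) !== this.expected;
  }
}
```

With this shape, an internal write followed by `recordOwnWrite` leaves the guard quiet, while a change made by another writer (for example a cron announce delivery) still registers as foreign. Whether such a change should fail the turn or be merged is a separate policy decision.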
Additionally, a race condition in session-write-lock.ts (#57019) allows the asynchronous release of an old lock to delete a newly acquired one, potentially leading to session transcript corruption. This is compounded by reports of tasks/runs.sqlite corruption (#71689), where malformed database images prevent the restoration of the durable task registry during gateway startup.
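A common way to close this kind of race is to tie each lock to an owner token and make release a no-op unless the token still matches. The sketch below uses an in-memory table as a stand-in for the lock file. `TokenLock` and `forceExpire` are hypothetical names, and the real session-write-lock.ts is not shown in the source, so treat this as an illustration of the technique only.

```typescript
// In-memory stand-in for a lock file; all names here are hypothetical.
class TokenLock {
  // key -> token of the current holder
  private holders: Record<string, number> = Object.create(null);
  private nextToken = 1;

  // Returns a fresh token on success, or null when the key is already locked.
  acquire(key: string): number | null {
    if (this.holders[key] !== undefined) return null;
    const token = this.nextToken++;
    this.holders[key] = token;
    return token;
  }

  // Deletes the lock only if the caller still owns it.
  release(key: string, token: number): boolean {
    if (this.holders[key] !== token) return false;
    delete this.holders[key];
    return true;
  }

  // Simulates stale-lock cleanup that removes a lock regardless of owner.
  forceExpire(key: string): void {
    delete this.holders[key];
  }
}
```

If holder A's lock is expired and holder B acquires it, A's late `release` returns `false` and leaves B's lock in place. A file-based version would need the same compare-then-delete step to be atomic with respect to other processes.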
Performance bottlenecks are emerging within the Codex app-server. Users report significant "hidden" latency between attempt-dispatch and session.started (#84640), a gap of 6s or more. This suggests that the thread lifecycle (binding reads, compatibility checks, and RPC requests) is not sufficiently instrumented to show where the time goes.
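To show what finer-grained instrumentation could look like, here is a small sketch that records a timestamp at each lifecycle boundary and reports the gap between consecutive marks. `PhaseTimeline` and the intermediate mark names are assumptions made for illustration; only attempt-dispatch and session.started are named in the report.

```typescript
type Clock = () => number;

interface PhaseGap {
  from: string;
  to: string;
  ms: number;
}

// Hypothetical helper: records named marks and derives the time between them.
class PhaseTimeline {
  private marks: Array<{ name: string; at: number }> = [];
  private now: Clock;

  // The clock is injectable so the timeline can be tested deterministically.
  constructor(now: Clock = Date.now) {
    this.now = now;
  }

  mark(name: string): void {
    this.marks.push({ name, at: this.now() });
  }

  // Duration between each pair of consecutive marks, in recording order.
  gaps(): PhaseGap[] {
    const out: PhaseGap[] = [];
    for (let i = 1; i < this.marks.length; i++) {
      const prev = this.marks[i - 1];
      const cur = this.marks[i];
      out.push({ from: prev.name, to: cur.name, ms: cur.at - prev.at });
    }
    return out;
  }

  // The single largest gap, or null when fewer than two marks exist.
  slowest(): PhaseGap | null {
    let worst: PhaseGap | null = null;
    for (const gap of this.gaps()) {
      if (worst === null || gap.ms > worst.ms) worst = gap;
    }
    return worst;
  }
}
```

Calling `mark` at dispatch, after the binding read, after the compatibility check, after the RPC request, and at session start would let `slowest()` name the phase responsible for most of the delay.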
Stability issues also persist in the Codex bundled harness, specifically regarding isolated cron jobs that deterministically time out during setup (#84567). Furthermore, memory growth is being driven by unreaped chrome-devtools-mcp sidecars (#84413), which accumulate under the gateway cgroup and eventually require a full restart to clear.
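The sidecar leak suggests that spawned processes are not tied to the lifetime of the session that started them. A minimal sketch of one way to do that is a registry that tracks sidecars per session and kills them when the session ends. `SidecarRegistry` and `Killable` are hypothetical names. The `kill(): boolean` shape is chosen so that a Node `ChildProcess` would satisfy it.

```typescript
// Anything with a kill() method; a Node ChildProcess fits this shape.
interface Killable {
  kill(): boolean;
}

// Hypothetical registry tying sidecar processes to the session that spawned them.
class SidecarRegistry {
  private bySession: Record<string, Killable[]> = Object.create(null);

  track(sessionId: string, proc: Killable): void {
    if (this.bySession[sessionId] === undefined) this.bySession[sessionId] = [];
    this.bySession[sessionId].push(proc);
  }

  // Kill every sidecar for a session and forget them; returns how many were signalled.
  reap(sessionId: string): number {
    const procs = this.bySession[sessionId];
    if (procs === undefined) return 0;
    delete this.bySession[sessionId];
    let signalled = 0;
    for (const proc of procs) {
      if (proc.kill()) signalled++;
    }
    return signalled;
  }

  // Number of sidecars still tracked across all sessions.
  liveCount(): number {
    let total = 0;
    for (const id of Object.keys(this.bySession)) total += this.bySession[id].length;
    return total;
  }
}
```

Logging `liveCount()` periodically would also give the lifecycle visibility that currently requires manual process audits.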
A significant security regression has been identified in openclaw models status --probe (#84632), which rewrites models.json with resolved API keys for non-CORE custom providers. This bypasses the SecretRef system and leaves sensitive credentials in plaintext on disk.
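The underlying pattern to avoid is letting a resolved copy of the config reach the write path. The sketch below keeps resolution and serialization separate and adds a guard that refuses to write if a resolved secret shows up in the output. The `SecretRef` shape, `ProviderConfig`, `resolveForProbe`, and `serializeForDisk` are all assumptions made for illustration, since the real models.json and SecretRef formats are not shown in the report.

```typescript
// Hypothetical shapes: the real SecretRef and models.json formats are not shown in the source.
interface SecretRef {
  secretRef: string;
}

interface ProviderConfig {
  name: string;
  apiKey: string | SecretRef;
}

interface ResolvedProvider {
  name: string;
  apiKey: string;
}

// Build a separate resolved copy for the probe; the input config is never mutated.
function resolveForProbe(cfg: ProviderConfig, lookup: (ref: string) => string): ResolvedProvider {
  const apiKey = typeof cfg.apiKey === "string" ? cfg.apiKey : lookup(cfg.apiKey.secretRef);
  return { name: cfg.name, apiKey };
}

// Serialize the original configs and fail loudly if any resolved secret appears in the output.
function serializeForDisk(configs: ProviderConfig[], resolvedSecrets: string[]): string {
  const text = JSON.stringify(configs, null, 2);
  for (const secret of resolvedSecrets) {
    if (secret.length > 0 && text.indexOf(secret) !== -1) {
      throw new Error("refusing to write: resolved secret found in serialized config");
    }
  }
  return text;
}
```

The guard is a backstop rather than the fix itself: the main point is that only the unresolved config is ever handed to the serializer.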
Other security concerns include the exec tool returning raw stdout/stderr without secret redaction (#71211), and a scope deadlock in the CLI (#74484) where a paired CLI with only operator.read scope cannot approve or reject repair requests because those actions require operator.pairing scope.
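For the exec tool case, a simple form of redaction is to mask every known secret value before the output leaves the tool. The following is a minimal sketch under the assumption that the set of secret values is available to the caller. `redactSecrets` is a hypothetical helper, not an existing OpenClaw function.

```typescript
// Hypothetical helper: masks every occurrence of each known secret in a tool's output.
function redactSecrets(output: string, secrets: string[], mask = "[REDACTED]"): string {
  // Longest first, so a secret that contains a shorter one is masked whole.
  // filter() returns a new array, so the caller's list is not reordered.
  const ordered = secrets.filter((s) => s.length > 0).sort((a, b) => b.length - a.length);
  let result = output;
  for (const secret of ordered) {
    // split/join replaces all literal occurrences without regex escaping concerns.
    result = result.split(secret).join(mask);
  }
  return result;
}
```

Exact-match masking only catches secrets the process already knows about. Values that are transformed before printing (for example base64-encoded) would need additional handling.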
There is a recurring theme of session-state fragility. Whether it is the EmbeddedAttemptSessionTakeoverError or the session write-lock race, the system is struggling to manage concurrent access to session files. The transition to more aggressive fingerprinting for security/takeover detection has inadvertently introduced instability in standard operational flows.
While the core embedded runner is traced, the Codex app-server's internal lifecycle remains a "black box." The gap between dispatch and session start is a primary source of perceived latency, and the lack of lifecycle logging for MCP sidecars makes memory leaks difficult to diagnose without manual ps audits.
Several issues highlight regressions in how specific providers are handled:
[...] mtimeNs precision or exclude internal writes from the fingerprint check to restore basic functionality for Feishu and Telegram users.
[...] claude-cli is broken, leaving harnesses unregistered and causing crash-loops for upgraded users.
[...] chrome-devtools-mcp processes to prevent cgroup memory exhaustion.
[...] OPENCLAW_LOG_LEVEL=trace to localize the 6s+ gap before session.started.
[...] exec tool output to prevent internal credential exposure.