By devasher · Edited by Nominiclaw
This digest covers critical updates on the Codex-vs-Pi runtime parity harness, significant UI regressions in the v2 dashboard, and emerging patterns in session memory bloat.
The current development cycle is heavily focused on the transition to Codex as the default runtime. A major effort is underway to establish a "Codex-vs-Pi" runtime parity QA harness (#80171) to ensure that tool surfaces, auth profiles, and plugin lifecycles remain consistent across runtimes. This includes a multi-phase rollout featuring a drift classifier to detect structural or tool-call differences and a token-efficiency report to prevent cost regressions during the flip.
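To make the drift-classifier idea concrete, here is a minimal sketch of how two runtimes' tool-call traces might be compared. The `ToolCall` type, the `classify_drift` function, and the exact meaning of the "structural" and "tool-call" labels are assumptions for illustration; the actual harness in #80171 may define them differently.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    # Hypothetical trace record: tool name plus its arguments.
    name: str
    args: dict = field(default_factory=dict)


def classify_drift(codex_calls, pi_calls):
    """Compare two tool-call traces and return a coarse drift label.

    Assumed labels (not taken from the harness itself):
      "structural" - the traces differ in length or in the order of tool names
      "tool-call"  - same tools in the same order, but some arguments differ
      "none"       - the traces are identical
    """
    codex_names = [call.name for call in codex_calls]
    pi_names = [call.name for call in pi_calls]
    if codex_names != pi_names:
        return "structural"
    for codex_call, pi_call in zip(codex_calls, pi_calls):
        if codex_call.args != pi_call.args:
            return "tool-call"
    return "none"


if __name__ == "__main__":
    codex = [ToolCall("read_file", {"path": "a.txt"})]
    pi = [ToolCall("read_file", {"path": "b.txt"})]
    print(classify_drift(codex, pi))
```

Checking the name sequence first keeps the two categories disjoint: an argument mismatch is only reported when the overall shape of the run already matches.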
On the user-facing side, several regressions have been reported in the v2 Control UI. The dashboard turns sluggish and becomes progressively less responsive the longer it stays open (#46598). Specific usability failures include unreadable tool bubbles (#45649), obscure session views (#45711), and a chat input that requires pressing Enter twice for certain commands (#45569).
Memory and session management also remain a point of contention. Reports indicate that sessions are accumulating skillsSnapshot and systemPromptReport fields on every run, leading to unbounded growth of sessions.json (#45718). Additionally, there is a reported gap in the daily-reset mechanism where archived session history becomes invisible in the UI, prompting requests for a restoration utility script (#45003).
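As a stopgap illustration of the bloat described in #45718, the sketch below strips the two accumulating fields from each session entry. It assumes sessions.json is a JSON object mapping a session key to an entry object; that layout, the `prune_sessions` helper, and the sample data are hypothetical and should be checked against a real file before use.

```python
import json

# Field names come from the report; everything else here is an assumption.
BLOAT_FIELDS = ("skillsSnapshot", "systemPromptReport")


def prune_sessions(sessions):
    """Return a copy of the sessions mapping without the bloat fields.

    The input is left untouched so the caller can compare sizes or
    back up the original before writing anything to disk.
    """
    pruned = {}
    for key, entry in sessions.items():
        if isinstance(entry, dict):
            pruned[key] = {k: v for k, v in entry.items() if k not in BLOAT_FIELDS}
        else:
            # Unknown entry shape: keep it as-is rather than guess.
            pruned[key] = entry
    return pruned


if __name__ == "__main__":
    sample = {
        "main": {
            "sessionId": "abc",
            "skillsSnapshot": {"skills": ["x"] * 100},
            "systemPromptReport": {"chars": 12345},
        }
    }
    before = len(json.dumps(sample))
    after = len(json.dumps(prune_sessions(sample)))
    print(f"serialized size: {before} -> {after} bytes")
```

This only treats the symptom. The systemic fix the issue asks for is to stop persisting the snapshots on every run.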
With the shift toward Codex, the community is identifying gaps in how different runtimes handle tool calls and auth. For instance, there is a reported issue where auth.order is ignored for the GitHub Copilot provider, causing the first profile in the list to always win regardless of configuration (#46031). There is also a critical concern regarding "silent stalls" in Codex ACP runs when the gateway child environment lacks proper proxy access (#44810).
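The expected behavior behind #46031 can be sketched as follows: the configured order should decide which profile is used, and list position should only matter as a fallback. The `pick_profile` function and the profile ids are hypothetical; this is a sketch of the intended semantics, not the project's resolver.

```python
def pick_profile(profiles, order):
    """Pick an auth profile id for a provider.

    profiles: profile ids available for the provider, in stored order.
    order:    the configured auth.order list (may be empty).

    The first entry of `order` that is actually available wins. Only when
    no configured entry matches does the first stored profile apply, which
    is the behavior the bug report describes as happening unconditionally.
    """
    available = set(profiles)
    for profile_id in order:
        if profile_id in available:
            return profile_id
    return profiles[0] if profiles else None


if __name__ == "__main__":
    stored = ["github-copilot:personal", "github-copilot:work"]
    print(pick_profile(stored, ["github-copilot:work", "github-copilot:personal"]))
```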
Several issues highlight inconsistencies in how messages are routed across different channels:
message send --media (#80389).
Security remains a priority, with a high-profile request to implement OS-level sandboxing for exec() calls (using bwrap on Linux and sandbox-exec on macOS) to prevent agents from executing dangerous commands with full user privileges (#58730). Similarly, there is a request to mount skill directories as read-only in sandbox containers to prevent agents from modifying their own instruction sets (#17931).
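For the Linux side of the sandboxing request, the following sketch shows one way a bwrap command line could be assembled, with the skills directory bound read-only. It only builds the argument list and does not run anything. The `build_bwrap_argv` helper and the paths are hypothetical, and the flag selection is an assumption about a reasonable baseline, not the design proposed in #58730 or #17931.

```python
import shlex


def build_bwrap_argv(command, workdir, skills_dir):
    """Build a bubblewrap argv that runs `command` with reduced privileges.

    Mount order matters: later binds are layered over earlier ones, so the
    writable workdir and the read-only skills directory are declared after
    the read-only root.
    """
    return [
        "bwrap",
        "--ro-bind", "/", "/",                # whole filesystem, read-only
        "--dev", "/dev",                      # minimal device nodes
        "--proc", "/proc",                    # fresh procfs
        "--tmpfs", "/tmp",                    # private scratch space
        "--bind", workdir, workdir,           # the only writable host path
        "--ro-bind", skills_dir, skills_dir,  # skills stay read-only
        "--unshare-all",                      # new namespaces, including network
        "--die-with-parent",                  # no orphaned sandboxed processes
        "--chdir", workdir,
        *command,                             # command must not start with "-"
    ]


if __name__ == "__main__":
    argv = build_bwrap_argv(
        ["ls", "-la"], "/home/agent/work", "/home/agent/work/skills"
    )
    print(shlex.join(argv))
```

Binding the root read-only still lets the sandboxed command read anything the user can read, so a stricter profile would bind only the directories the command needs. The macOS sandbox-exec equivalent uses a separate profile language and is not shown here.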
exec() isolation and writable skill directories represent a fundamental security risk for autonomous agents.
deleteWebhook retry loop, which is a critical failure in the boot sequence.
sessions.json can lead to context overflow errors and requires a systemic fix to how snapshots are persisted.