This update introduces a critical split in prompt surfaces to prevent instruction leakage across runtimes, stabilizes Discord realtime voice, and centralizes media processing with robust fallback chains.
Merged PRs
- Separate prompt surfaces by selected harness #83454
- fix: fall back from official ClawHub artifact blocks #83566
- Fix Discord realtime voice playback stability #80505
- fix(telegram): harden spool timeout recovery #83575
- fix: harden image metadata fallback #83579
- fix(code-mode): honor agent scoped code mode #83473
- feat(admin-http-rpc): allow web QR login methods #83259
- fix: add resilient media processing fallbacks #83568
- Fix Telegram topic media completion delivery #83556
- fix(android): use realtime relay for talk mode #83130
- fix(codex): stop forcing code-mode-only turns #83561
- Reject empty CLI subprocess replies #83421
- [Fix] Defer gateway update check startup #83520
- fix(messages): apply TTS before message-tool sends #83543
- fix(qqbot): shorten typing keepalive window #83469
- fix: harden release stability recovery and auth fallback #83503
- chore(lint): enable no-underscore-dangle with comprehensive allow list #83422
- fix(codex): hydrate queued inbound images #83533
- fix(tui): bound standalone exit #83501
- fix(messages): keep group visible replies automatic by default #83498
- Load provider owner for Codex harness runtime #83519
- fix(native-pi): pass Telegram images to Ollama #83516
- fix(qa): use supported telegram streaming config in rtt #83514
- fix(qa): use final telegram replies for rtt runs #83509
- fix(telegram): recover stalled isolated spool handlers #83505
- fix(codex): preserve sandbox egress for app-server turns #83502
- refactor(cron): centralize source delivery plan #83377
- [codex] Fix Discord progress mode dropping final replies #83443
- [Test] Add gateway restart benchmark tooling #83299
- [Perf] Overlap gateway startup work before ready #83301
Key Changes
Prompt Engineering and Runtime Isolation
One of the most significant architectural shifts is the introduction of Prompt Surface Separation. Previously, prompt fragments were shared across different runtimes, leading to "double-prompting" risks where instructions for one harness (e.g., PI) would leak into another (e.g., native Codex app-server). The new model explicitly routes prompts based on the selected harness (PI, CLI, ACP, Codex app-server, or Subagent), ensuring that each runtime receives only the guidance relevant to its specific operational context.
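The routing idea can be sketched as a filter over tagged prompt fragments. This is a minimal illustration, not the actual implementation: the harness names come from the list above, while the `PromptFragment` shape, the `selectPromptFragments` function, and the sample fragment ids are all hypothetical.

```typescript
// Hypothetical sketch: every name below is illustrative, not OpenClaw's API.
type Harness = "pi" | "cli" | "acp" | "codex-app-server" | "subagent";

interface PromptFragment {
  id: string;
  text: string;
  // Harnesses this fragment is written for. Nothing is shared implicitly.
  surfaces: readonly Harness[];
}

// Return only the fragments that explicitly target the selected harness.
function selectPromptFragments(
  fragments: readonly PromptFragment[],
  harness: Harness,
): PromptFragment[] {
  return fragments.filter((fragment) =>
    fragment.surfaces.some((surface) => surface === harness),
  );
}

// Sample data (assumed): one PI-only fragment, one Codex-only fragment,
// and one fragment deliberately listed for both.
const exampleFragments: PromptFragment[] = [
  { id: "pi-tool-guidance", text: "PI tool guidance", surfaces: ["pi"] },
  { id: "common-safety", text: "Common safety rules", surfaces: ["pi", "codex-app-server"] },
  { id: "codex-native", text: "Codex native guidance", surfaces: ["codex-app-server"] },
];
```

Under this sketch, a fragment reaches a runtime only if it names that runtime, so PI guidance cannot reach the Codex app-server by default.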
Additionally, the Codex harness received several critical updates:
- Code Mode Flexibility: Fixed a regression where Codex app-server threads were forced into code_mode_only, which stalled tool-using turns. It now defaults to code_mode=true but code_mode_only=false.
- Agent-Scoped Config: The system now honors codeMode settings defined at the per-agent level, allowing operators to test code-mode on specific agents without a fleet-wide change.
- Sandbox Egress: Fixed a critical bug where sandboxed agents lost network access; the system now derives network access from the OpenClaw sandbox egress configuration.
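One way to picture the first two items together is a resolver that prefers a per-agent setting, then a fleet-wide setting, then the new defaults. This is a sketch under stated assumptions: the defaults (code mode on, code-mode-only off) are taken from the notes above, but the config shape and the names `FleetConfig`, `AgentEntry`, and `resolveCodeMode` are hypothetical.

```typescript
// Hypothetical config shape; only the default values come from the release notes.
interface CodeModeSettings {
  codeMode: boolean;
  codeModeOnly: boolean;
}

interface AgentEntry {
  id: string;
  codeMode?: Partial<CodeModeSettings>;
}

interface FleetConfig {
  codeMode?: Partial<CodeModeSettings>;
  agents: AgentEntry[];
}

// Defaults described above: code mode enabled, but not code-mode-only.
const CODEX_DEFAULTS: CodeModeSettings = { codeMode: true, codeModeOnly: false };

// Precedence in this sketch: per-agent value, then fleet value, then default.
function resolveCodeMode(fleet: FleetConfig, agentId: string): CodeModeSettings {
  const agent = fleet.agents.find((entry) => entry.id === agentId);
  const pick = (key: keyof CodeModeSettings): boolean =>
    agent?.codeMode?.[key] ?? fleet.codeMode?.[key] ?? CODEX_DEFAULTS[key];
  return { codeMode: pick("codeMode"), codeModeOnly: pick("codeModeOnly") };
}
```

With this precedence, an operator could enable code mode on a single canary agent while the fleet-wide setting stays off.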
Realtime Voice and Integration Stability
Several stability fixes landed for voice and chat integrations:
- Discord Realtime Voice: Addressed a bug where OpenAI gpt-realtime-2 sessions would stop recognizing speech after the first reply. This was solved by disabling noise_reduction on the backend bridge and implementing raw PCM prebuffering to eliminate audio stutter.
- Android Talk Mode: Migrated from a legacy STT/TTS pipeline to the modern Gateway relay voice session API, enabling low-latency streaming audio and realtime tool-call handling.
- Telegram Reliability: Hardened the isolated-ingress spool handlers to recover from stalled updates by failing stuck claims into .failed tombstones and aborting account-scoped work before restarting.
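The raw PCM prebuffering mentioned in the Discord item can be illustrated with a small buffer that holds audio until enough has accumulated, then releases it and passes later chunks straight through. This is a generic sketch of the technique, not the code from the PR: the class name `PcmPrebuffer`, the byte threshold, and the callback shape are assumptions.

```typescript
// Hypothetical sketch of prebuffering; names and threshold are illustrative.
type ChunkSink = (chunk: Uint8Array) => void;

class PcmPrebuffer {
  private pending: Uint8Array[] = [];
  private pendingBytes = 0;
  private started = false;
  private readonly thresholdBytes: number;
  private readonly sink: ChunkSink;

  constructor(thresholdBytes: number, sink: ChunkSink) {
    this.thresholdBytes = thresholdBytes;
    this.sink = sink;
  }

  // Hold chunks until the threshold is reached, then pass everything through.
  push(chunk: Uint8Array): void {
    if (this.started) {
      this.sink(chunk);
      return;
    }
    this.pending.push(chunk);
    this.pendingBytes += chunk.byteLength;
    if (this.pendingBytes >= this.thresholdBytes) {
      this.flush();
    }
  }

  // Release whatever is buffered, e.g. at end of stream for short replies.
  flush(): void {
    this.started = true;
    for (const chunk of this.pending) {
      this.sink(chunk);
    }
    this.pending = [];
    this.pendingBytes = 0;
  }
}
```

Starting playback only after a small cushion exists trades a little initial latency for fewer underruns; the right threshold depends on sample rate and chunk size, which the notes do not specify.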
Media Processing and Vision
To resolve issues where image processing failed on fresh installs (due to missing sharp), OpenClaw has centralized media helpers into media-services. This introduces a Sharp-first backend chain with fallbacks to sips, Windows native imaging, ImageMagick, GraphicsMagick, and ffmpeg.
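A fallback chain of this kind can be sketched as an ordered list of backends tried until one succeeds. The following is illustrative only: the `ImageBackend` interface and `resizeWithFallback` function are hypothetical and do not reflect the actual media-services API. Only the idea of trying Sharp first and falling back to other tools comes from the notes.

```typescript
// Hypothetical interface; the real media-services API may look different.
interface ImageBackend {
  name: string;
  isAvailable(): Promise<boolean>;
  resize(input: Uint8Array, maxEdge: number): Promise<Uint8Array>;
}

interface ResizeResult {
  backend: string;
  output: Uint8Array;
}

// Try each backend in order; skip unavailable ones and fall through on errors.
async function resizeWithFallback(
  backends: readonly ImageBackend[],
  input: Uint8Array,
  maxEdge: number,
): Promise<ResizeResult> {
  const failures: string[] = [];
  for (const backend of backends) {
    try {
      if (!(await backend.isAvailable())) {
        failures.push(`${backend.name}: unavailable`);
        continue;
      }
      const output = await backend.resize(input, maxEdge);
      return { backend: backend.name, output };
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      failures.push(`${backend.name}: ${reason}`);
    }
  }
  // Surface every attempt so operators can see why the chain was exhausted.
  throw new Error(`all image backends failed (${failures.join("; ")})`);
}
```

Collecting the per-backend failure reasons keeps a fresh install without sharp diagnosable even when a later backend succeeds silently or the whole chain fails.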
Vision capabilities were also expanded:
- Inbound Image Hydration: Fixed a bug where the Codex app-server dropped inbound image attachments; it now correctly hydrates MediaPath into queued followup images.
- Ollama Integration: Native PI runs now properly resolve Telegram image media into image blocks for Ollama vision models, preventing the model from silently ignoring visual context.
Gateway Performance and Tooling
Gateway startup latency was reduced by overlapping independent work (such as startup logging and plugin service initialization) before the ready state is returned. To maintain these gains, a new Gateway restart benchmark tool (pnpm test:restart:gateway) was added to provide machine-readable evidence of restart readiness and resource slopes.
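The overlap technique itself is simple to show in isolation. The sketch below contrasts running independent startup tasks one after another with starting them all before awaiting any. The task type and function names are hypothetical and stand in for work such as startup logging or plugin service initialization.

```typescript
// Hypothetical sketch; StartupTask and both runners are illustrative names.
type StartupTask = () => Promise<void>;

// Sequential: each task finishes before the next one begins.
async function runSequential(tasks: readonly StartupTask[]): Promise<void> {
  for (const task of tasks) {
    await task();
  }
}

// Overlapped: every task is started first, then all are awaited together.
// Only safe when the tasks do not depend on each other's results.
async function runOverlapped(tasks: readonly StartupTask[]): Promise<void> {
  await Promise.all(tasks.map((task) => task()));
}
```

With overlapped execution, total wait time approaches the slowest task rather than the sum of all tasks, which is why the approach only applies to work that is truly independent.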
Impact
These changes collectively resolve several high-severity pain points for power users and operators:
- Reduced Hallucinations: By separating prompt surfaces and fixing image hydration, models are less likely to confabulate visual observations or follow irrelevant runtime instructions.
- Improved Reliability: The fixes for Discord voice and Telegram spooling eliminate "silent failures" where the bot appears healthy but stops responding to user input.
- Developer Experience: The addition of restart benchmarking and the fix for code_mode stalling provide operators with better visibility and more predictable behavior during agent evaluation.
- Security and Connectivity: The fix for sandbox egress ensures that research agents can maintain necessary outbound network access without compromising the security of the sandbox environment.