Long-lived gateways under heavy cron/build workloads grow steadily (~18 MB/hr
post-phantom-dispatch-fix) and eventually need a restart-or-OOM. Four retention
sites, all confirmed live on current main:
1. _evict_cached_agent() (/model, /reasoning, codex-runtime, /undo, etc.) popped
the cache entry without releasing the agent's OpenAI client, httpx transport,
SSL context, or conversation history. Only /new cleaned up first. Now releases
clients on a daemon thread, matching _enforce_agent_cache_cap.
2. _release_evicted_agent_soft() now clears _session_messages after
release_clients() — tool outputs (file reads, terminal output, search results)
can be tens of MB per 100+-tool-call session; the list is rebuilt from
persisted session JSON on resume, so dropping it on soft eviction is safe.
3. The session-expiry watcher (permanent finalization) now drops the session's
per-session control dicts (_session_model_overrides, _session_reasoning_overrides,
_pending_approvals, _update_prompt_pending, _pending_model_notes). These leaked
one entry per session per gateway lifetime. NOTE: this is the session-finalize
path, NOT idle agent-cache eviction — an idle-evicted session is still alive and
rebuilds its agent from these overrides, so pruning them there would silently
reset a user's /model choice.
4. _tool_defs_cache is now bounded (_TOOL_DEFS_CACHE_MAX=8) with oldest-first
eviction instead of growing unboundedly across the distinct toolset/config
fingerprints a gateway sees over its lifetime.
Salvaged from #25318 by Michael Steuer (@mssteuer); fix 3 redirected from the
idle-sweep to the session-finalize lifecycle, magic number 8 lifted to a named
constant, test ported.
Fixes#19251
Co-authored-by: Michael Steuer <michael@make.software>