mirror of
https://github.com/patriceckhart/zot.git
synced 2026-06-26 21:36:31 +02:00
8 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
99c9ba8062 |
feat(ext): phase 4 - full-event interception, arg rewrites, /reload-ext
Clears every deferred extension todo in one push:
1) Interception expands to three events: tool_call (already shipped),
turn_start (gate the turn before the model call, e.g. for rate
limiting or business-hours checks), and assistant_message (suppress or rewrite the
user-visible text while keeping the model's original output in
the transcript).
2) Tool-call args can now be rewritten mid-flight. An interceptor
returning modified_args replaces the JSON the tool actually
receives, without the model seeing the rewrite. Chains: each
subscriber sees the previous one's output, letting guards
successively redact / patch / augment. Invalid JSON is dropped
safely.
3) /reload-ext hot-reloads every extension without restarting zot.
The manager gracefully shuts down all running subprocesses,
re-reads extension.json from disk, respawns (including --ext
paths remembered from startup), and the host rebuilds the agent's
tool registry in-place so freshly-registered tools are callable
immediately.
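The chained rewrite described in item 2 can be sketched roughly as follows. This is a minimal illustration, not zot's actual manager code: the `Interceptor` type, `foldRewrites`, and the two sample interceptors are hypothetical names introduced here. It assumes a rewrite is accepted only when it is non-empty, valid JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Interceptor is a hypothetical stand-in for one subscribed extension.
// It returns replacement args, or nil to leave the args unchanged.
type Interceptor func(args json.RawMessage) json.RawMessage

// foldRewrites passes args through each interceptor in order, so every
// subscriber sees the previous one's output. A rewrite that is empty or
// not valid JSON is dropped and the previous value is kept.
func foldRewrites(args json.RawMessage, chain []Interceptor) json.RawMessage {
	current := args
	for _, intercept := range chain {
		modified := intercept(current)
		if len(modified) == 0 || !json.Valid(modified) {
			continue // dropped safely: nothing returned, or malformed JSON
		}
		current = modified
	}
	return current
}

// redact replaces whatever it receives with a harmless command.
func redact(args json.RawMessage) json.RawMessage {
	return json.RawMessage(`{"command":"echo REDACTED"}`)
}

// broken returns truncated JSON to show that invalid rewrites are ignored.
func broken(args json.RawMessage) json.RawMessage {
	return json.RawMessage(`{"command":`)
}

func main() {
	in := json.RawMessage(`{"command":"cat secrets.txt"}`)
	out := foldRewrites(in, []Interceptor{redact, broken})
	fmt.Println(string(out))
}
```

Here the second interceptor's malformed output is skipped, so the tool would receive the first interceptor's rewrite.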
Wire-format changes (extproto):
- EventInterceptResponseFromExt gains modified_args and replace_text
fields (both optional, ignored when block=true).
- EventInterceptFromHost gains Step (for turn_start) and Text (for
assistant_message) alongside the existing tool_call payload.
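As a rough picture of the response payload, the sketch below models the fields named above as a Go struct. The JSON keys (`id`, `block`, `reason`, `modified_args`, `replace_text`) come from the commit message; the type name `interceptResponse`, the `omitempty` choices, and clearing the rewrite fields on the sending side are assumptions made for illustration (the source only says the host ignores them when block=true).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// interceptResponse mirrors the fields named in the commit message.
// The type name and omitempty tags are assumptions, not zot's definitions.
type interceptResponse struct {
	ID           string          `json:"id"`
	Block        bool            `json:"block,omitempty"`
	Reason       string          `json:"reason,omitempty"`
	ModifiedArgs json.RawMessage `json:"modified_args,omitempty"`
	ReplaceText  string          `json:"replace_text,omitempty"`
}

// encodeResponse marshals a response. When Block is set, the rewrite
// fields are cleared first, since they would be ignored anyway.
func encodeResponse(r interceptResponse) (string, error) {
	if r.Block {
		r.ModifiedArgs = nil
		r.ReplaceText = ""
	}
	b, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	rewrite := interceptResponse{ID: "8", ModifiedArgs: json.RawMessage(`{"command":"echo GUARDED:"}`)}
	refuse := interceptResponse{ID: "9", Block: true, Reason: "rm -rf is not allowed", ReplaceText: "unused"}
	for _, r := range []interceptResponse{rewrite, refuse} {
		line, err := encodeResponse(r)
		if err != nil {
			fmt.Println("encode error:", err)
			continue
		}
		fmt.Println(line)
	}
}
```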
Core agent changes:
- BeforeToolExecute signature now returns (allowed, reason,
modifiedArgs json.RawMessage). Non-nil+valid JSON args replace
tc.Arguments before Tool.Execute runs.
- New BeforeTurn hook, invoked in runLoop before oneTurn. Blocking
cancels the turn with an EvTurnEnd{StopError} carrying the reason.
- New BeforeAssistantMessage hook, invoked after finalMsg is
assembled but before the EvAssistantMessage emit. Supports
suppress (block=true) and text rewrite (replace_text). Transcript
always gets the original; UI gets the rewritten text.
- New SetTools(reg) so /reload-ext can swap the registry on the
live agent under the agent mutex.
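The split between transcript text and UI text for BeforeAssistantMessage might look like the sketch below. The `decision` struct and `present` function are hypothetical names; only the behaviour (suppress on block, rewrite on replace_text, transcript always keeps the original) is taken from the description above.

```go
package main

import "fmt"

// decision is a hypothetical shape for what an assistant_message
// interceptor returns.
type decision struct {
	Block       bool
	ReplaceText string
}

// present returns the text stored in the transcript, the text shown in
// the UI, and whether the UI should show anything at all.
func present(original string, d decision) (transcript, ui string, show bool) {
	switch {
	case d.Block:
		return original, "", false // suppressed in the UI only
	case d.ReplaceText != "":
		return original, d.ReplaceText, true
	default:
		return original, original, true
	}
}

func main() {
	transcript, ui, show := present("the key is SECRET", decision{ReplaceText: "the key is [redacted]"})
	fmt.Println("transcript:", transcript)
	fmt.Println("ui:", ui, "shown:", show)
}
```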
Manager changes:
- InterceptToolCall now returns InterceptResult (Block, Reason,
ModifiedArgs, ReplaceText), with a chain that folds rewrites.
- New InterceptTurnStart and InterceptAssistantMessage.
- New Reload(ctx, grace) tears down and respawns everything,
returning ReloadStats{Stopped, Loaded, Ready, Errors}.
- New SetOnReload(fn) callback the host uses to rebuild the agent
tool registry after a reload.
- LoadExplicit remembers --ext paths so Reload respawns them.
- subscribe accepts "tool_call", "turn_start", "assistant_message"
under "intercept".
SDK (pkg/zotext):
- New handler types: ToolCallHandler, TurnStartHandler,
AssistantMessageHandler, and their decision structs
(ToolCallDecision with ModifiedArgs, AssistantMessageDecision
with ReplaceText).
- New registration methods: InterceptToolCallX (rich variant of
the existing InterceptToolCall), InterceptTurnStart,
InterceptAssistantMessage.
- dispatchIntercept routes per-event with panic recovery and
always emits exactly one event_intercept_response.
TUI:
- /reload-ext slash command registered in slashCatalog and
runSlash. Added to slashCancelsTurn so it waits for idle like
/compact does.
- runReloadExt shows a "reloading extensions..." status, runs the
Manager.Reload on a goroutine, and reports the resulting stats.
Tests:
- internal/core/intercept_test.go: verifies args are actually
rewritten on the way to Tool.Execute, malformed JSON is ignored,
and block surfaces the reason as an error ToolResult.
- internal/agent/extensions/intercept_test.go: end-to-end with a
bash extension subprocess that blocks rm -rf, rewrites other bash
args to "echo GUARDED:", passes through read calls, allows
turn_start, and redacts SECRET in assistant messages. Second test
verifies Reload respawns the subprocess, re-registers its command,
and fires the onReload callback.
Docs:
- docs/extensions.md: rewrote the intercept section to cover all
three events, added a table of event_intercept_response fields,
documented the /reload-ext hot-reload command, expanded the SDK
section with examples of every handler, moved the old "future"
items into a shipped Phase 4.
- README.md: extensions summary mentions intercept beyond tool_call,
/reload-ext added to the slash-commands table and to the
turn-cancel list in "Queued messages".
|
||
|
|
463c62daf3 |
docs: clarify that no extensions are installed by default
Two-line addition to docs/extensions.md and a tightening of the
README bullet point.
examples/extensions/* are reference code; a fresh `zot install` gives
you a clean agent. Users opt in by copying examples (or any other
extension) via `zot ext install` or by pointing `zot --ext PATH` at a
working directory for one session.
No code changes. |
||
|
|
7e94b0776b |
feat(extensions): --ext PATH (short -e) for ad-hoc loading
Loads an extension from any directory for one zot session without
needing to copy / install it under $ZOT_HOME. Repeatable. Resolved to
absolute before spawn so paths like "." survive a later cwd change.
Loaded BEFORE the installed-extension scan so explicit paths win on
name conflicts, letting you shadow an installed copy with a
work-in-progress version.
zot --ext ./my-extension # one extension
zot -e ./a -e ./b # multiple
zot --ext . # the cwd is itself an extension
Manager.LoadExplicit is the public entry point. Spawns happen in
parallel like the regular Discover path, with per-path errors returned
so a typo in one --ext doesn't break the others.
Wired into both interactive (cli.go) and rpc (rpc.go) modes. Help text
+ docs/extensions.md updated.
Verified end-to-end: disabling the installed scratchpad, running
`zot rpc --ext .` from its directory, and asking the model to list its
tools shows read_notes available again. |
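Resolving each --ext path to an absolute path up front, with per-path errors, could be sketched as below. `resolveExtPaths` is a hypothetical helper written for this note; it is not Manager.LoadExplicit, and it only shows the path-resolution step using the standard `filepath.Abs`.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveExtPaths converts each --ext argument to an absolute path so a
// later working-directory change cannot invalidate it. Errors are
// collected per path, so one bad entry does not discard the others.
func resolveExtPaths(paths []string) ([]string, []error) {
	var resolved []string
	var errs []error
	for _, p := range paths {
		abs, err := filepath.Abs(p)
		if err != nil {
			errs = append(errs, fmt.Errorf("--ext %q: %w", p, err))
			continue
		}
		resolved = append(resolved, abs)
	}
	return resolved, errs
}

func main() {
	resolved, errs := resolveExtPaths([]string{".", "./my-extension"})
	for _, p := range resolved {
		fmt.Println("would spawn from:", p)
	}
	for _, err := range errs {
		fmt.Println("skipped:", err)
	}
}
```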
||
|
|
83ae236571 |
feat(extensions): phase 3 — event subscriptions + tool-call interception
Two new capabilities, both riding on the existing subprocess
protocol with a couple of new frame types.
Event subscriptions (one-way notifications):
ext -> host: subscribe {events: [...], intercept: [...]}
host -> ext: event {event, ...payload}
Recognised events: session_start, turn_start, turn_end,
tool_call, assistant_message. Subscribers get fire-and-forget
notifications on each. Useful for telemetry, audit logs, custom
state widgets that follow live agent activity.
Tool-call interception (round-trip, can refuse):
host -> ext: event_intercept {id, event:"tool_call", tool_name, tool_args}
ext -> host: event_intercept_response {id, block?, reason?}
When at least one extension has subscribed to the "tool_call"
intercept, zot asks each one in turn before running every tool call. First
blocker wins; reason becomes the tool-result error text the model
sees. Per-extension 5s timeout treats unresponsive interceptors
as "allow" so a wedged extension never stalls the agent.
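The "first blocker wins, timeout means allow" rule can be sketched with a channel and `time.After`. All names here (`verdict`, `interceptor`, `askAll`, and the sample interceptors) are hypothetical; the real manager talks to subprocesses rather than calling Go functions, and the demo uses a short timeout instead of 5s.

```go
package main

import (
	"fmt"
	"time"
)

// verdict is a hypothetical interceptor reply.
type verdict struct {
	Block  bool
	Reason string
}

// interceptor stands in for one round-trip to a subscribed extension.
type interceptor func() verdict

// askAll asks each interceptor in turn. The first one that blocks wins.
// An interceptor that does not answer within timeout is treated as allow.
func askAll(chain []interceptor, timeout time.Duration) verdict {
	for _, ask := range chain {
		reply := make(chan verdict, 1) // buffered: a late reply never blocks the goroutine
		go func(f interceptor) { reply <- f() }(ask)
		select {
		case v := <-reply:
			if v.Block {
				return v
			}
		case <-time.After(timeout):
			// unresponsive interceptor: allow and move on
		}
	}
	return verdict{}
}

// wedged simulates an extension that answers far too late.
func wedged() verdict {
	time.Sleep(200 * time.Millisecond)
	return verdict{Block: true, Reason: "too late to matter"}
}

// refuse simulates a guard that blocks the call.
func refuse() verdict { return verdict{Block: true, Reason: "rm -rf is not allowed"} }

// allow simulates an extension that lets the call through.
func allow() verdict { return verdict{} }

func main() {
	v := askAll([]interceptor{allow, wedged, refuse}, 20*time.Millisecond)
	fmt.Printf("block=%v reason=%q\n", v.Block, v.Reason)
}
```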
Wire format additions (internal/extproto):
ext -> host: SubscribeFromExt, EventInterceptResponseFromExt
host -> ext: EventFromHost, EventInterceptFromHost
Manager (internal/agent/extensions):
- per-extension eventSubs / interceptSubs sets, populated by the
subscribe frame
- EmitEvent fans out to every subscribed extension on its own
goroutine (won't block the agent on slow stdin writes)
- InterceptToolCall walks subscribers serially, returning the
first refusal; 5s timeout per subscriber (allow on timeout)
- readLoop handles event_intercept_response correlations the
same way it handles command/tool responses
Core (internal/core/agent.go):
- Agent.BeforeToolExecute hook called from runOneTool right
before tool.Execute. Returning (allowed=false, reason)
short-circuits with an IsError tool result containing reason.
- Agent.OnEvent observer fires for every emitted AgentEvent;
composed transparently with the per-Prompt sink via wrapSink
so neither the existing TUI nor the rpc loop need changes.
Wiring (internal/agent/cli.go, rpc.go):
- wireAgentExt sets BeforeToolExecute -> InterceptToolCall and
OnEvent -> fanoutAgentEvent for every freshly-built agent
(initial, login rebuild, model swap)
- fanoutAgentEvent translates core AgentEvent kinds into
extproto.EventFromHost. Internal-only events (text_delta,
tool_progress) are dropped to keep the per-extension stream
sane.
- session_start emitted once after extensions come up
SDK (pkg/zotext):
- On(name, EventHandler) registers per-event observers
- InterceptToolCall(InterceptHandler) registers a single
intercept callback
- Run() now also sends a subscribe frame before the ready
sentinel, with the union of subscribed events + intercept
- Frame loop handles "event" and "event_intercept" frames,
runs the handlers (intercepts on a goroutine to avoid
head-of-line blocking)
- Capabilities advertised: commands + tools + events
Example (examples/extensions/guard):
- subscribes to session_start / turn_start / tool_call / turn_end
and writes one-line audit entries
- intercepts every bash call; refuses commands matching
rm -rf, sudo, dd of=/, mkfs, the fork bomb, chmod -R 777
- end-to-end verified live: agent -> bash("rm -rf /tmp/foo")
-> guard refuses -> model sees the refusal text and surfaces
it in its reply ("the guard blocked it, as expected — the
pattern \brm\s+-rf\b matched")
docs/extensions.md updated with the four new frame types and the
guard example.
|
||
|
|
74709a0bd9 |
feat(extensions): phase 2 — extension-defined tools
Extensions can now register tools the LLM calls directly. The model
sees them in its tool list alongside the built-ins (read, write,
edit, bash, skill); when it invokes one, zot routes the tool_call
to the owning extension subprocess and feeds the tool_result back.
Wire format additions (internal/extproto):
ext -> host:
register_tool {name, description, schema}
ready # all initial regs flushed
tool_result {id, content[], is_error} # reply to a tool_call
host -> ext:
tool_call {id, name, args} # raw json args from the model
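For illustration, an extension's registration output could be produced as newline-delimited JSON like this. The payload keys `name`, `description`, and `schema` come from the listing above; the `type` key used to carry the frame kind is an assumption, as are the `frame` struct and `registrationLines` helper.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// frame is a hypothetical envelope. The "type" key is an assumption;
// the commit message names the frame kinds but not the envelope layout.
type frame struct {
	Type        string          `json:"type"`
	Name        string          `json:"name,omitempty"`
	Description string          `json:"description,omitempty"`
	Schema      json.RawMessage `json:"schema,omitempty"`
}

// registrationLines returns one register_tool line followed by a ready
// line. Encoding fails if schema is not valid JSON.
func registrationLines(name, desc, schema string) (string, error) {
	var out strings.Builder
	enc := json.NewEncoder(&out) // Encode appends a newline after each value
	reg := frame{Type: "register_tool", Name: name, Description: desc, Schema: json.RawMessage(schema)}
	if err := enc.Encode(reg); err != nil {
		return "", err
	}
	if err := enc.Encode(frame{Type: "ready"}); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	lines, err := registrationLines("weather", "Fake weather", `{"type":"object"}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(lines)
}
```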
Manager (internal/agent/extensions):
- tracks per-extension RegisterToolFromExt frames
- validates schemas parse as JSON before registering (bad schema
skipped + logged, doesn't crash zot)
- toolIndex map for O(1) lookup
- WaitForReady(grace): blocks per extension on its readyCh until
a ready frame arrives or the grace expires; called once after
Discover so the agent's tool registry is built against the
final set
- Tools() / HasTool() / InvokeTool() public surface
- readLoop closes readyCh on stdout EOF so a wedged extension
doesn't permanently block WaitForReady
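The ready-or-grace wait can be sketched with a channel that is closed exactly once, either on a ready frame or on stdout EOF. The `ext` type, `markReady`, and `waitForReady` below are hypothetical names; using `sync.Once` to make the double close safe is one possible approach, not necessarily what the manager does.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// ext is a hypothetical per-extension record.
type ext struct {
	readyCh   chan struct{}
	readyOnce sync.Once
}

func newExt() *ext { return &ext{readyCh: make(chan struct{})} }

// markReady closes readyCh. It may be called for a ready frame and again
// on stdout EOF; sync.Once keeps the second call from panicking.
func (e *ext) markReady() { e.readyOnce.Do(func() { close(e.readyCh) }) }

// waitForReady blocks on each extension until it is ready or its grace
// period expires, and returns how many became ready in time.
func waitForReady(exts []*ext, grace time.Duration) int {
	ready := 0
	for _, e := range exts {
		select {
		case <-e.readyCh:
			ready++
		case <-time.After(grace):
			// give up on this extension and continue with the rest
		}
	}
	return ready
}

func main() {
	fast, wedged := newExt(), newExt()
	fast.markReady()
	fast.markReady() // safe: second call is a no-op
	fmt.Println("ready:", waitForReady([]*ext{fast, wedged}, 10*time.Millisecond))
}
```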
extensionTool (internal/agent/extensions/tool.go):
implements core.Tool. Execute() round-trips through
Manager.InvokeTool with a 60s default timeout, decodes
base64 image blocks, surfaces extension+tool name in
ToolResult.Details for the renderer.
internal/agent/build.go:
- new ExtensionToolSource interface (declared here to avoid the
build->extensions->core import cycle) + ExtensionToolInfo
mirror of extensions.ToolInfo
- Resolved.MergeExtensionTools(): folds extension tools into
ToolRegistry, re-renders the system prompt's tool summary
with both built-in and extension tools listed
- Resolved gains private bookkeeping fields so the rebuild
works without re-running Resolve
internal/agent/cli.go:
extension manager built BEFORE the agent in interactive mode
so MergeExtensionTools can fire before NewAgent. Same in
buildAgent + buildAgentFor closures so login / model-switch
rebuilds also include extension tools. extToolAdapter bridges
*extensions.Manager to ExtensionToolSource.
internal/agent/rpc.go:
extension lifecycle now also runs in `zot rpc` mode. Notify and
Display from extensions surface as `ext_notify` / `ext_display`
events on the rpc stream so any consumer can react.
pkg/zotext (Go SDK):
- ToolHandler, ToolResult, ToolContent types
- Tool(name, desc, schema, fn) registration method
- TextResult / TextErrorResult / Image / ImageBytes constructors
- Run() now also flushes register_tool frames + a final ready
sentinel after the last registration
examples/extensions/weather: working Go example registering one
tool. Deterministic fake weather (sha1 of city -> temp + cond) so
the demo is repeatable. Plus README explaining how to install.
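A deterministic fake of this kind can be built by hashing the city name. The mapping below (first hash byte to temperature, second to condition) and the condition list are invented for illustration, so it will not reproduce the example's actual values; it only shows why the same city always yields the same answer.

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// conditions is an invented list; the real example's values are not
// given in the commit message.
var conditions = []string{"sun", "cloud", "rain", "fog", "snow"}

// fakeWeather derives a temperature and condition from the SHA-1 of the
// city name, so repeated calls with the same city agree.
func fakeWeather(city string) (tempC int, cond string) {
	sum := sha1.Sum([]byte(city))
	tempC = int(sum[0])%45 - 10 // range -10..34
	cond = conditions[int(sum[1])%len(conditions)]
	return tempC, cond
}

func main() {
	temp, cond := fakeWeather("Berlin")
	fmt.Printf("Berlin: %d°C, %s (deterministic fake)\n", temp, cond)
}
```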
Tests:
internal/agent/extensions/tool_test.go: spawns a mock /bin/sh
extension that registers a tool, sends ready, and echoes tool
calls. Verifies registration timing, lookup via HasTool/Tools,
invoke roundtrip via InvokeTool.
End-to-end verified against the live Anthropic backend:
prompt: "What is the weather in Berlin?"
-> [tool_call] weather({"city":"Berlin"})
-> [tool_result] Berlin: 16°C, fog (deterministic fake)
-> reply: "Berlin is 16°C."
docs/extensions.md updated with phase 2 wire format, the new SDK
tool API, and the weather example.
|
||
|
|
222a62c70f |
feat: skills — reusable instructions discovered from SKILL.md files
A skill is a single SKILL.md file with a YAML frontmatter header,
discovered from well-known directories at startup. Two integration
points:
1. The system prompt gains a short manifest listing each skill's
name + one-line description. Cheap (a few dozen tokens).
2. A built-in `skill` tool lets the model load any one skill's
full body on demand and follow the instructions there.
The on-demand-load model keeps token usage cheap: only the
manifest goes into every request; the body is fetched as a tool
result in the one or two turns where the model actually needs it.
Discovery (priority order — first match wins per name):
./.zot/skills/<name>/SKILL.md project (native)
$ZOT_HOME/skills/<name>/SKILL.md global (native)
./.claude/skills/<name>/SKILL.md project (claude-compat)
~/.claude/skills/<name>/SKILL.md global (claude-compat)
./.agents/skills/<name>/SKILL.md project (agent-compat)
~/.agents/skills/<name>/SKILL.md global (agent-compat)
Compat paths are deliberate: any SKILL.md written for a related
ecosystem works in zot unchanged.
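The "first match wins per name" rule can be sketched as a scan over roots in priority order. This is an assumption-laden illustration: `discover` and `demoFS` are hypothetical, only the project-local roots are shown, and an in-memory `fstest.MapFS` stands in for the real filesystem.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// demoFS is an in-memory tree used only for this illustration.
var demoFS = fstest.MapFS{
	".zot/skills/code-review/SKILL.md":    {Data: []byte("project copy")},
	".claude/skills/code-review/SKILL.md": {Data: []byte("claude copy")},
	".claude/skills/test-fix/SKILL.md":    {Data: []byte("claude only")},
}

// discover scans roots in priority order and maps each skill name to the
// first SKILL.md found for it. Later roots cannot override earlier ones.
func discover(fsys fs.FS, roots []string) map[string]string {
	found := map[string]string{}
	for _, root := range roots {
		entries, err := fs.ReadDir(fsys, root)
		if err != nil {
			continue // missing root: nothing to discover here
		}
		for _, entry := range entries {
			if !entry.IsDir() {
				continue
			}
			if _, seen := found[entry.Name()]; seen {
				continue // a higher-priority root already provided this name
			}
			candidate := path.Join(root, entry.Name(), "SKILL.md")
			if _, err := fs.Stat(fsys, candidate); err == nil {
				found[entry.Name()] = candidate
			}
		}
	}
	return found
}

func main() {
	roots := []string{".zot/skills", ".claude/skills", ".agents/skills"}
	for name, file := range discover(demoFS, roots) {
		fmt.Println(name, "->", file)
	}
}
```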
Frontmatter fields:
name optional; defaults to directory name
description required; shown in the system prompt
allowed-tools optional list; informational (no enforcement)
permissions optional per-tool patterns; informational
allowed-tools and permissions are parsed but not enforced in this
version. They render in the body so the model can self-regulate.
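A hand-rolled parser for a flat frontmatter subset might look like the sketch below. It assumes the header is delimited by `---` lines and contains only simple `key: value` pairs; list-valued fields such as allowed-tools are not handled. `parseFrontmatter` and `skillName` are hypothetical names, not the functions in internal/skills.

```go
package main

import (
	"fmt"
	"strings"
)

// parseFrontmatter splits a SKILL.md document into header fields and
// body. Documents without a complete "---" fenced header are returned
// unchanged with no fields.
func parseFrontmatter(doc string) (map[string]string, string) {
	lines := strings.Split(doc, "\n")
	if strings.TrimSpace(lines[0]) != "---" {
		return map[string]string{}, doc
	}
	fields := map[string]string{}
	for i := 1; i < len(lines); i++ {
		line := strings.TrimSpace(lines[i])
		if line == "---" {
			return fields, strings.Join(lines[i+1:], "\n")
		}
		key, value, ok := strings.Cut(line, ":")
		if !ok {
			continue // not a key: value line; ignored in this subset
		}
		fields[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return map[string]string{}, doc // no closing fence: treat as plain body
}

// skillName applies the documented default: name falls back to the
// directory name when the header omits it.
func skillName(fields map[string]string, dir string) string {
	if name := fields["name"]; name != "" {
		return name
	}
	return dir
}

func main() {
	doc := "---\ndescription: Review a change\n---\nStep 1: run git status\n"
	fields, body := parseFrontmatter(doc)
	fmt.Println("name:", skillName(fields, "code-review"))
	fmt.Println("description:", fields["description"])
	fmt.Print(body)
}
```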
What landed:
- internal/skills: discovery + frontmatter parsing (no yaml dep —
hand-rolled subset for the limited shape skills use), the on-
demand `skill` tool implementing core.Tool, system-prompt
addendum, FindByName lookup helper. Real unit tests cover all
six locations + dedup priority + parser corner cases.
- internal/agent/build.go: Resolve discovers skills, registers the
skill tool when at least one was found, appends the manifest to
the system prompt's append list. Resolved gains a SkillTool
field so the tui can read the live set.
- internal/agent/modes/skills_dialog.go: /skills picker with two
modes — list view (cursor + paging) and body view (markdown-
rendered with scroll). Refreshes its snapshot each open via
cfg.SkillSnapshot so edits to a SKILL.md during a session are
reflected immediately.
- /skills slash command + entry in slashCatalog.
- examples/skills/code-review and examples/skills/test-fix as
starter skills demonstrating procedural style + frontmatter.
- docs/skills.md: full reference covering discovery, frontmatter,
inspection, authoring tips, and ecosystem compat.
End-to-end verified against the live Anthropic backend:
prompt: "What skills do you have available?"
-> "- code-review\n- test-fix"
prompt: "Use the skill tool to load the code-review skill,
then summarize step 1."
-> [tool_call] skill({"name":"code-review"})
-> [tool_result] body returned
-> "Step 1 is to establish what changed by running git status..."
|
||
|
|
0c92d6e914 |
feat: extension system (subprocess + json-rpc, any language)
Phase 1: extensions can register slash commands and push chat
notifications. Tools and event subscriptions land in later phases.
Architecture: each extension is its own subprocess. Zot launches
it on startup, completes a hello/hello_ack handshake over its
stdin/stdout, then routes slash commands the extension registered.
Crash isolation, language agnostic, works with any executable
that can read/write json lines.
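From the host's side, the handshake over JSON lines could be sketched as: read one line from the extension, check it is a hello, reply with hello_ack. The `type` and `name` keys, the `frame` struct, and both helper functions are assumptions for illustration; the real frame layout lives in internal/extproto.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// frame is a hypothetical envelope; the "type" and "name" keys are
// assumptions, not the documented wire format.
type frame struct {
	Type string `json:"type"`
	Name string `json:"name,omitempty"`
}

// handshake reads one JSON line from the extension, verifies it is a
// hello frame, and writes a hello_ack line back.
func handshake(fromExt io.Reader, toExt io.Writer) (string, error) {
	sc := bufio.NewScanner(fromExt)
	if !sc.Scan() {
		if err := sc.Err(); err != nil {
			return "", err
		}
		return "", io.ErrUnexpectedEOF // extension exited before saying hello
	}
	var f frame
	if err := json.Unmarshal(sc.Bytes(), &f); err != nil {
		return "", fmt.Errorf("bad hello frame: %w", err)
	}
	if f.Type != "hello" {
		return "", fmt.Errorf("expected hello, got %q", f.Type)
	}
	return f.Name, json.NewEncoder(toExt).Encode(frame{Type: "hello_ack"})
}

// handshakeLines runs handshake over in-memory strings for demonstration.
func handshakeLines(in string) (name, reply string, err error) {
	var out strings.Builder
	name, err = handshake(strings.NewReader(in), &out)
	return name, out.String(), err
}

func main() {
	name, reply, err := handshakeLines("{\"type\":\"hello\",\"name\":\"guard\"}\n")
	fmt.Printf("name=%q reply=%q err=%v\n", name, reply, err)
}
```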
What lands here:
- internal/extproto: shared wire-format types (Frame, HelloFromExt,
RegisterCommandFromExt, CommandResponseFromExt, NotifyFromExt,
HelloAckFromHost, CommandInvokedFromHost, ShutdownFromHost...).
Both the host and the SDK marshal/unmarshal the same types.
- internal/agent/extensions: discovery + lifecycle manager.
- Discover() walks $ZOT_HOME/extensions and ./.zot/extensions
(project-local first, global second; first wins for duplicates)
- Spawns each enabled extension, captures stderr to
$ZOT_HOME/logs/ext-<name>.log
- Reads frames in a goroutine, dispatches register_command and
notify, correlates command_response by id
- Stop() sends shutdown, waits 2s, then SIGTERM/SIGKILL
- HostHooks abstracts the tui callbacks (Notify/Submit/Insert/Display)
- Interactive bridge: extensions slot into the slash dispatcher
*after* the built-in catalog, so built-ins always win on conflict.
Extension-registered commands also flow into the autocomplete
popup and /help via slashSuggester.SetExtra. NotifyFromExt frames
render as muted [ext-name] notes above the editor.
- internal/agent/extcmd: `zot ext` CLI.
list / install <path|git-url> / remove / enable / disable / logs
- pkg/zotext: public Go SDK. Construct an Extension, register
Command(name, desc, fn), call Run(). Fn returns a Response built
with Prompt(), Insert(), Display(), Noop(), or Errorf(). Logf()
writes to stderr so stdout stays clean for the protocol.
- examples/extensions/hello: working Go example registering /hello
and /summon, plus README + extension.json.
- docs/extensions.md: full protocol reference, including a
~30-line raw-Python example for users who don't want the SDK.
Tests: internal/agent/extensions/manager_test.go spawns a mock
extension via /bin/sh and exercises the full handshake -> register
-> invoke -> response cycle. Verifies the hello frame ordering,
correlation-by-id, and graceful shutdown.
Verified manually: built and installed the example, drove it via
stdin pipes, confirmed clean handshake + correct frame ordering
and shutdown_ack. Builds vet-clean on darwin / linux / windows.
Editor.Insert exported (was Editor.insert) so the extension hooks
can drop text into the input.
|
||
|
|
a670442c9e |
feat: zotcore SDK + zot rpc subprocess protocol
Two new ways to embed the zot agent runtime in third-party apps:
1. pkg/zotcore - public Go SDK
- Runtime type: New(Config), Prompt(ctx,text,imgs)->chan Event,
Cancel, Compact, SetModel, State, Messages, Cost, ListModels,
Close. Concurrent-safe; one prompt at a time per Runtime,
ErrBusy if you try to overlap. Spawn multiple Runtimes for
multiple projects.
- Public types mirror the JSON-RPC wire schema 1:1 so consumers
can share parsing code with the out-of-process clients.
- Internal core/agent/provider stay internal; SDK is a thin
facade that exposes only what's stable.
2. zot rpc subcommand - newline-delimited JSON on stdin/stdout
- 'zot rpc' (or 'zot --rpc') turns the agent runtime into a
subprocess that any language can drive via pipes.
- Commands: hello, prompt, abort, compact, get_state,
get_messages, clear, set_model, get_models, ping. Each
optionally carries an id; the matching response echoes it.
- Stream notifications: turn_start, user_message,
assistant_start, text_delta, tool_call, tool_progress,
tool_result, assistant_message, usage, turn_end, done,
error, compact_done. Same shape as the existing --json mode
events (modes.EventToJSON / ContentToJSON were exported
for reuse).
- Auth: optional ZOTCORE_RPC_TOKEN env var; first command
must be hello {token: ...} when set. Without the env var
the spawning process is implicitly trusted.
- Concurrency: one prompt or compact at a time per process,
enforced by a turnMu mutex. abort fires immediately
regardless. Stdin close exits the process.
3. docs/rpc.md - full schema reference
4. examples/rpc/{python,node,shell,go} - reference clients
5. examples/sdk - in-process Go embedding example
6. README updated with a new modes entry and an embedding section
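The one-prompt-at-a-time rule can be sketched with a mutex and a non-blocking lock attempt. This combines the SDK's ErrBusy and the rpc mode's turnMu into one illustrative type; the `Runtime` struct and `Prompt` signature here are simplified assumptions, not the pkg/zotcore API, and `sync.Mutex.TryLock` requires Go 1.18 or later.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrBusy is returned when a prompt is already in flight.
var ErrBusy = errors.New("runtime busy")

// Runtime is a simplified stand-in for an agent runtime that allows one
// prompt at a time.
type Runtime struct {
	turnMu sync.Mutex
}

// Prompt runs fn as the current turn. If another turn holds the lock it
// fails immediately with ErrBusy instead of queueing.
func (r *Runtime) Prompt(fn func()) error {
	if !r.turnMu.TryLock() {
		return ErrBusy
	}
	defer r.turnMu.Unlock()
	fn()
	return nil
}

func main() {
	rt := &Runtime{}
	var inner error
	outer := rt.Prompt(func() {
		// Overlapping prompt while the first is still running.
		inner = rt.Prompt(func() {})
	})
	fmt.Println("outer:", outer)
	fmt.Println("inner:", inner)
}
```

Spawning a separate Runtime per project, as the commit message suggests, sidesteps the limit because each value carries its own mutex.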
|