mirror of https://github.com/patriceckhart/zot.git synced 2026-06-26 21:36:31 +02:00

https://www.zot.sh/


patriceckhart 83ae236571 feat(extensions): phase 3 — event subscriptions + tool-call interception		2026-04-19 14:57:03 +02:00

Two new capabilities, both ride on the existing subprocess protocol with a couple of new frame types.

Event subscriptions (one-way notifications):

  ext -> host: subscribe {events: [...], intercept: [...]}
  host -> ext: event {event, ...payload}

Recognised events: session_start, turn_start, turn_end, tool_call, assistant_message. Subscribers get fire-and-forget notifications on each. Useful for telemetry, audit logs, custom state widgets that follow live agent activity.

Tool-call interception (round-trip, can refuse):

  host -> ext: event_intercept {id, event:"tool_call", tool_name, tool_args}
  ext -> host: event_intercept_response {id, block?, reason?}

When at least one extension subscribed to "tool_call" intercept, zot asks each one in turn before running every tool call. First blocker wins; reason becomes the tool-result error text the model sees. Per-extension 5s timeout treats unresponsive interceptors as "allow" so a wedged extension never stalls the agent.

Wire format additions (internal/extproto):

  ext -> host: SubscribeFromExt, EventInterceptResponseFromExt
  host -> ext: EventFromHost, EventInterceptFromHost

Manager (internal/agent/extensions):

- per-extension eventSubs / interceptSubs sets, populated by the subscribe frame
- EmitEvent fans out to every subscribed extension on its own goroutine (won't block the agent on slow stdin writes)
- InterceptToolCall walks subscribers serially, returning the first refusal; 5s timeout per subscriber (allow on timeout)
- readLoop handles event_intercept_response correlations the same way it handles command/tool responses

Core (internal/core/agent.go):

- Agent.BeforeToolExecute hook called from runOneTool right before tool.Execute. Returning (allowed=false, reason) short-circuits with an IsError tool result containing reason.
- Agent.OnEvent observer fires for every emitted AgentEvent; composed transparently with the per-Prompt sink via wrapSink so neither the existing TUI nor the rpc loop need changes.

Wiring (internal/agent/cli.go, rpc.go):

- wireAgentExt sets BeforeToolExecute -> InterceptToolCall and OnEvent -> fanoutAgentEvent for every freshly-built agent (initial, login rebuild, model swap)
- fanoutAgentEvent translates core AgentEvent kinds into extproto.EventFromHost. Internal-only events (text_delta, tool_progress) are dropped to keep the per-extension stream sane.
- session_start emitted once after extensions come up

SDK (pkg/zotext):

- On(name, EventHandler) registers per-event observers
- InterceptToolCall(InterceptHandler) registers a single intercept callback
- Run() now also sends a subscribe frame before the ready sentinel, with the union of subscribed events + intercept
- Frame loop handles "event" and "event_intercept" frames, runs the handlers (intercepts on a goroutine to avoid head-of-line blocking)
- Capabilities advertised: commands + tools + events

Example (examples/extensions/guard):

- subscribes to session_start / turn_start / tool_call / turn_end and writes one-line audit entries
- intercepts every bash call; refuses commands matching rm -rf, sudo, dd of=/, mkfs, the fork bomb, chmod -R 777
- end-to-end verified live: agent -> bash("rm -rf /tmp/foo") -> guard refuses -> model sees the refusal text and surfaces it in its reply ("the guard blocked it, as expected — the pattern \brm\s+-rf\b matched")

Docs/extensions.md updated with all five new frame types and the guard example.
.github/workflows	release: use rare opt-out token that prose won't match	2026-04-18 11:23:02 +02:00
cmd/zot	add installation process and github workflow	2026-04-18 10:50:02 +02:00
docs	feat(extensions): phase 3 — event subscriptions + tool-call interception	2026-04-19 14:57:03 +02:00
examples	feat(extensions): phase 3 — event subscriptions + tool-call interception	2026-04-19 14:57:03 +02:00
internal	feat(extensions): phase 3 — event subscriptions + tool-call interception	2026-04-19 14:57:03 +02:00
pkg	feat(extensions): phase 3 — event subscriptions + tool-call interception	2026-04-19 14:57:03 +02:00
.goreleaser.yaml	release: remove debug job, gate brew upload on HOMEBREW_TAP_TOKEN	2026-04-18 11:21:32 +02:00
go.mod	fix ci: portable syscall.Select via x/sys/unix; gofmt pass	2026-04-18 10:55:42 +02:00
go.sum	fix ci: portable syscall.Select via x/sys/unix; gofmt pass	2026-04-18 10:55:42 +02:00
install.ps1	fix ci: portable syscall.Select via x/sys/unix; gofmt pass	2026-04-18 10:55:42 +02:00
install.sh	fix ci: portable syscall.Select via x/sys/unix; gofmt pass	2026-04-18 10:55:42 +02:00
LICENSE	initial commit	2026-04-17 20:36:38 +02:00
Makefile	add installation process and github workflow	2026-04-18 10:50:02 +02:00
README.md	feat: skills — reusable instructions discovered from SKILL.md files	2026-04-19 14:32:30 +02:00

README.md

zot

yet another coding agent harness, lightweight and written (vibe-slopped) in go.

one static binary.
two providers atm (anthropic, openai/codex).
four tools (read, write, edit, bash).
three run modes (interactive tui, print, json).
built-in telegram bot.
extensions in any language via subprocess + json-rpc; see docs/extensions.md.
reusable instructions via SKILL.md files; see docs/skills.md.
no community atm.

install

one-liner (macos + linux)

curl -fsSL https://raw.githubusercontent.com/patriceckhart/zot/main/install.sh | bash

detects your os/arch, downloads the latest release from github, verifies the sha256 against the release's checksums.txt, extracts the binary, and drops it in /usr/local/bin, ~/.local/bin, or ~/bin — whichever is writable first. pass a version or prefix to pin:

curl -fsSL https://raw.githubusercontent.com/patriceckhart/zot/main/install.sh | bash -s -- v0.0.1 ~/bin
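the checksum step described above can be sketched in go. this is a minimal illustration, not the installer's actual code (the installer is a shell script). the `<hex>  <filename>` line layout of checksums.txt and the archive name `zot_example.tar.gz` are assumptions.

```go
package main

import (
	"bufio"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// verifyChecksum hashes data with sha256 and compares the result against the
// entry for name in a checksums.txt body. The "<hex>  <filename>" layout
// (as written by sha256sum) is an assumption about zot's release files.
func verifyChecksum(data []byte, checksums, name string) error {
	sum := sha256.Sum256(data)
	got := hex.EncodeToString(sum[:])

	sc := bufio.NewScanner(strings.NewReader(checksums))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// sha256sum marks binary-mode entries with a leading "*".
		if len(fields) != 2 || strings.TrimPrefix(fields[1], "*") != name {
			continue
		}
		if strings.EqualFold(fields[0], got) {
			return nil
		}
		return fmt.Errorf("sha256 mismatch for %s: want %s, got %s", name, fields[0], got)
	}
	return fmt.Errorf("no checksum entry for %s", name)
}

func main() {
	// hypothetical archive contents and file name, for illustration only.
	archive := []byte("pretend archive bytes")
	sum := sha256.Sum256(archive)
	listing := hex.EncodeToString(sum[:]) + "  zot_example.tar.gz\n"

	fmt.Println(verifyChecksum(archive, listing, "zot_example.tar.gz"))
}
```

refusing to install when the entry is missing (rather than skipping the check) is the conservative choice for a one-liner piped into a shell.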

one-liner (windows, powershell)

iwr -useb https://raw.githubusercontent.com/patriceckhart/zot/main/install.ps1 | iex

drops zot.exe into $HOME\bin and adds it to the user PATH if missing. open a fresh terminal afterwards.

homebrew (macos + linux)

brew install patriceckhart/tap/zot

the tap lives at patriceckhart/homebrew-tap.

go install

go install github.com/patriceckhart/zot/cmd/zot@latest

from source

git clone https://github.com/patriceckhart/zot
cd zot
make build        # produces ./bin/zot
make install      # into $GOPATH/bin

prebuilt binaries

every release on the releases page ships archives for linux, macos, and windows on amd64 + arm64 (except windows/arm64), plus a checksums.txt file. download, verify, chmod +x, and drop on your $PATH.

authenticate

the easiest way is to just run zot and type /login. the tui opens even without credentials and walks you through a browser-based login flow.

credential lookup order

--api-key flag
ANTHROPIC_API_KEY / OPENAI_API_KEY env var
$ZOT_HOME/auth.json (api key or oauth token; mode 0600)

$ZOT_HOME defaults to:

macOS: ~/Library/Application Support/zot
linux: $XDG_STATE_HOME/zot or ~/.local/state/zot
windows: %LOCALAPPDATA%\zot
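the per-os defaults above could be resolved roughly like this. it is a sketch under stated assumptions: that a non-empty ZOT_HOME env var overrides the default, and the helper name `zotHome` is hypothetical rather than zot's real function.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
)

// zotHome returns the data directory for goos, following the defaults listed
// in the readme. getenv and home are injected to keep the function testable.
// Treating a non-empty ZOT_HOME as an override is an assumption.
func zotHome(goos string, getenv func(string) string, home string) string {
	if v := getenv("ZOT_HOME"); v != "" {
		return v
	}
	switch goos {
	case "darwin":
		return filepath.Join(home, "Library", "Application Support", "zot")
	case "windows":
		return filepath.Join(getenv("LOCALAPPDATA"), "zot")
	default: // linux and other unix-likes
		if v := getenv("XDG_STATE_HOME"); v != "" {
			return filepath.Join(v, "zot")
		}
		return filepath.Join(home, ".local", "state", "zot")
	}
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("cannot determine home directory:", err)
		return
	}
	fmt.Println(zotHome(runtime.GOOS, os.Getenv, home))
}
```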

`/login` flow

run zot and type /login. pick one of two methods:

api key — a small local web server starts on 127.0.0.1:<free-port>, your browser opens a form, you paste your sk-ant-... or sk-... key. zot probes the provider once and saves it to auth.json if accepted.
subscription — use your claude pro/max or chatgpt plus/pro subscription. the oauth flow pins the callback to a fixed port per provider (localhost:53692 for anthropic, localhost:1455 for openai) because those are the only ports their auth servers will redirect to.
- anthropic uses the claude code oauth flow; messages go to api.anthropic.com with a bearer token and the claude-code identity headers.
- openai uses the codex cli oauth flow; messages go to chatgpt.com/backend-api/codex/responses with the chatgpt-account-id extracted from the returned id_token.
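for the api key method, binding to `127.0.0.1:<free-port>` is commonly done by asking the os for port 0. the sketch below shows only that idiom; the function name `listenLoopback` is hypothetical and the form handling is left out.

```go
package main

import (
	"fmt"
	"net"
)

// listenLoopback binds to the loopback interface on a port picked by the OS
// (port 0 means "any free port") and returns the listener plus its base URL.
func listenLoopback() (net.Listener, string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, "", err
	}
	return ln, "http://" + ln.Addr().String(), nil
}

func main() {
	ln, url, err := listenLoopback()
	if err != nil {
		fmt.Println("listen:", err)
		return
	}
	defer ln.Close()
	// a real login flow would now serve the form on ln and open url in a browser.
	fmt.Println("login form would be served at", url)
}
```

binding to 127.0.0.1 rather than all interfaces keeps the form unreachable from other machines on the network.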

note on subscription login: the oauth client ids used are the ones published in anthropic's claude code cli and openai's codex cli. reusing them from a third-party tool is against the providers' terms of service, and access may be revoked at any time. use the subscription flow at your own risk; the api-key flow is the safe default.

token refresh

oauth access tokens are short-lived (anthropic ~8h, openai ~30d). zot refreshes them automatically:

at every credential lookup, zot checks the stored expiry and — if past it (with a 60s safety margin) — hits the provider's oauth/token endpoint with the stored refresh_token, persists the new access_token + refresh_token + expiry back to auth.json, and hands the fresh token to the client.
the telegram bridge additionally refreshes once per turn so a bot that runs for days keeps working without manual intervention.
if the refresh itself fails (the refresh_token was revoked, or the account was logged out everywhere), the error bubbles up to the caller: the tui shows it in the status line, the bot replies with it in your dm. run /login to get a fresh token pair.
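the expiry check with its 60s safety margin comes down to one comparison. a minimal sketch, assuming the expiry is stored as unix seconds (the readme does not say how auth.json encodes it):

```go
package main

import (
	"fmt"
	"time"
)

// refreshMarginSeconds is the 60s safety margin mentioned in the readme.
const refreshMarginSeconds = 60

// needsRefresh reports whether a token expiring at expiresAt should be
// refreshed at time now. Both values are unix seconds, which is an assumed
// storage format. The token counts as stale once it is within the margin.
func needsRefresh(expiresAt, now int64) bool {
	return now >= expiresAt-refreshMarginSeconds
}

func main() {
	now := time.Now().Unix()
	fmt.Println(needsRefresh(now+3600, now)) // an hour left
	fmt.Println(needsRefresh(now+30, now))   // inside the margin
}
```

the margin avoids handing out a token that expires while the request is still in flight.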

all data lives under $ZOT_HOME:

$ZOT_HOME/
├── config.json         # last-used provider/model/theme, saved automatically
├── auth.json           # api keys and oauth tokens (mode 0600)
├── sessions/           # jsonl transcripts, one dir per cwd
├── models-cache.json   # live /v1/models discovery cache (6h ttl)
└── logs/               # app log files

usage

zot                              # interactive tui
zot "fix the failing test"       # tui, pre-filled prompt
zot -p "list all go files"       # print final text, exit
zot --json "refactor main.go"    # newline-delimited json events, exit
zot --continue                   # resume the most recent session for this cwd
zot --resume                     # pick a session to resume
zot --list-models                # show supported models
zot --help

flags

flag	description
`--provider anthropic|openai`	pick the provider
`--model <id>`	pick the model (see `--list-models`)
`--api-key <key>`	override api key
`--base-url <url>`	override provider base url (tests / self-hosted)
`--system-prompt <text>`	replace the default system prompt
`--append-system-prompt <text>`	append text to the system prompt (repeatable)
`--reasoning low|medium|high`	enable reasoning on supported models
`-c`, `--continue`	resume the latest session for this cwd
`-r`, `--resume`	pick a session to resume
`--session <path>`	resume a specific session file
`--no-session`	don't read or write session files
`--cwd <path>`	use `<path>` as the working directory
`--no-tools`	disable all tools
`--tools <csv>`	only enable the listed tools
`--max-steps <n>`	cap agent loop iterations (default 50)

tools

read — read text files (or inline images: png / jpg / gif / webp)
write — create or overwrite files, making parent directories as needed
edit — one or more exact-match replacements in an existing file
bash — run a shell command in the session cwd, with merged stdout/stderr and a timeout

when the sandbox is on (see /lock), all four tools refuse paths outside the session cwd.
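running a command in a given cwd with merged stdout/stderr and a timeout, as the bash tool is described, can be sketched with the standard library. this is not zot's implementation: `runShell`, `defaultTimeout` and its 30s value are hypothetical, and `sh -c` is used here for portability.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// defaultTimeout is a hypothetical value; the readme does not state zot's.
const defaultTimeout = 30 * time.Second

// runShell runs command through "sh -c" in dir and returns stdout and stderr
// merged into one string. The process is killed once timeout elapses.
func runShell(dir, command string, timeout time.Duration) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	cmd := exec.CommandContext(ctx, "sh", "-c", command)
	cmd.Dir = dir
	// after the kill, stop waiting on pipes that grandchildren keep open.
	cmd.WaitDelay = time.Second

	out, err := cmd.CombinedOutput() // stdout and stderr share one buffer
	if errors.Is(ctx.Err(), context.DeadlineExceeded) {
		return string(out), fmt.Errorf("command timed out after %s", timeout)
	}
	return string(out), err
}

func main() {
	out, err := runShell(".", "echo to-stdout; echo to-stderr 1>&2", defaultTimeout)
	fmt.Printf("%q %v\n", out, err)
}
```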

modes

interactive (default): chat tui with streaming output, spinner, cost meter, slash commands.
print: zot -p "prompt" runs the agent to completion and writes only the final assistant text to stdout.
json: zot --json "prompt" emits one json object per agent event to stdout, newline-delimited. the schema is documented in instructions.md §8.
rpc: zot rpc runs as a long-lived child process; commands in on stdin, events + responses out on stdout, both as ndjson. designed for embedding zot in third-party apps written in any language. see docs/rpc.md for the wire schema and examples/rpc/{python,node,shell,go} for working clients.
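consuming the newline-delimited json that the json and rpc modes emit could look like the sketch below. the event schema is not reproduced in this readme, so events are decoded into generic maps, and the `type` field and its values in the sample are hypothetical.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// readEvents decodes one JSON object per line from r and passes each to
// handle. Blank lines are skipped. Decoding into map[string]any avoids
// assuming any particular event schema.
func readEvents(r io.Reader, handle func(map[string]any)) error {
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 64*1024), 8*1024*1024) // allow long event lines
	lineNo := 0
	for sc.Scan() {
		lineNo++
		line := bytes.TrimSpace(sc.Bytes())
		if len(line) == 0 {
			continue
		}
		var ev map[string]any
		if err := json.Unmarshal(line, &ev); err != nil {
			return fmt.Errorf("line %d: %w", lineNo, err)
		}
		handle(ev)
	}
	return sc.Err()
}

// collectEvents is a convenience wrapper that gathers all events from a string.
func collectEvents(ndjson string) ([]map[string]any, error) {
	var events []map[string]any
	err := readEvents(strings.NewReader(ndjson), func(ev map[string]any) {
		events = append(events, ev)
	})
	return events, err
}

func main() {
	// in real use the reader would be the stdout pipe of the zot process.
	// the "type" field and its values are made up for this sample.
	sample := `{"type":"example_start"}` + "\n" + `{"type":"example_end"}` + "\n"
	events, err := collectEvents(sample)
	fmt.Println(len(events), err)
}
```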

embedding

two ways to drive zot from another program:

go in-process: import github.com/patriceckhart/zot/pkg/zotcore. one Runtime per project; Prompt(ctx, text, images) returns a channel of Event. small example in examples/sdk/.
any language out-of-process: spawn zot rpc as a subprocess and exchange newline-delimited json over its stdin/stdout. wire format and event schema in docs/rpc.md. reference clients live under examples/rpc/.

both interfaces share the same event schema, so transcripts captured by one can be replayed through the other.

slash commands

type / in the tui to open the autocomplete popup. available commands:

command	description
`/help`	show key bindings and commands
`/login`	log in via api key or subscription (opens a dialog)
`/logout [provider]`	clear credentials for `anthropic`, `openai`, or all when omitted
`/model`	pick a model from a list (or `/model <id>` to set directly)
`/sessions`	resume a previous session for this directory
`/jump`	scroll the chat to a previous turn (or `/jump <text>` to filter)
`/btw`	side-chat with full context that doesn't add to the main thread
`/compact`	summarize the transcript into one message to free up context
`/lock`	confine tools to the current directory
`/unlock`	allow tools to touch paths outside again
`/clear`	clear the chat transcript
`/exit`	exit zot

`/sessions`

shows previous sessions for the current working directory, newest first, with timestamp, model, message count, cost, and the first user prompt. pick one with ↑/↓, enter to resume, esc to cancel. zot swaps the current session file for the selected one and replays the full transcript (including tool calls) into the agent. sessions remember the model they ended on, so resuming picks up on that exact model even if your global default changed.

`/jump`

opens a turn picker for the current session: one row per user prompt, each showing the turn number, how many tools that turn invoked, and the first line of the prompt. ↑/↓ to pick, enter to jump, esc to cancel. any printable rune typed while the picker is open extends the filter; backspace removes its last character. /jump <text> pre-applies the filter; if exactly one turn matches, zot jumps straight there without showing the picker.

jumping is non-destructive — the transcript is untouched, the viewport just scrolls so the chosen turn is at the top. a muted line at the top of the chat reads ↑ viewing turn N of M · pgdn to catch up; scroll back to the bottom with pgdn (or keep scrolling with the arrow keys) and the indicator goes away.

`/btw`

opens a side-chat overlay with the full main session as frozen context, so you can ask quick clarifying questions ("does asyncio.gather() catch exceptions?", "btw the bundle budget is 10MB", "what's the default fetch timeout?") without bloating the main thread.

each question fires a one-off model call against system + main transcript + side-chat history so far. responses render in the overlay and stay there. when you press esc to close, nothing has been added to the main session and subsequent main-thread turns don't re-read any of the side-chat exchanges — keeping the running context window lean.

/btw                              # open the overlay, type questions interactively
/btw does PUT replace the whole resource?

inside the overlay: enter sends, esc cancels an in-flight call (or closes the overlay if idle), ctrl+c closes immediately. side-chat exchanges never touch the transcript and aren't persisted to the session file.

`/compact`

sends the current transcript through the model with a structured summarization prompt. the returned summary replaces the transcript as one synthetic user message, with the last few exchanges kept verbatim for continuity. the status bar's ctx N/M (P%) meter resets. use it when the context meter creeps past ~80%.

zot also auto-compacts in the background: after any turn that leaves context usage ≥ 85% of the model's window, the agent kicks off a condense pass on its own. you'll see condensing history… (esc to cancel) above the status bar and an (auto) tag next to the context percentage; esc aborts it without touching the transcript.
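the auto-compact trigger is a simple threshold on context usage. a minimal sketch of the arithmetic, with hypothetical function names and with integer truncation for the percentage as an assumption:

```go
package main

import "fmt"

// autoCompactPercent is the 85% threshold from the readme.
const autoCompactPercent = 85

// contextPercent returns how much of the window is used, truncated to a
// whole percent. The rounding mode is an assumption.
func contextPercent(used, window int) int {
	if window <= 0 {
		return 0
	}
	return used * 100 / window
}

// shouldAutoCompact reports whether usage is at or above the threshold.
// The comparison is done in integers to avoid floating point edge cases.
func shouldAutoCompact(used, window int) bool {
	return window > 0 && used*100 >= window*autoCompactPercent
}

func main() {
	used, window := 172000, 200000
	fmt.Printf("ctx %d/%d (%d%%) auto-compact=%v\n",
		used, window, contextPercent(used, window), shouldAutoCompact(used, window))
}
```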

`/lock`

enforces a sandbox rooted at the cwd shown in the status bar. read / write / edit resolve their target path (including through symlinks) and refuse anything outside the sandbox. bash refuses obvious escape patterns such as sudo, rm -rf /, chmod -R, dd of=/, and a leading cd /, cd .., or cd ~. the status bar shows · locked · ~/your/cwd while active.

this is a guardrail against accidents, not a hard security boundary. if you need real isolation, run zot under docker or a proper sandbox.
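the path check for read / write / edit can be sketched as "resolve symlinks, then test containment". this is an illustration of the idea, not zot's sandbox code; all function names are hypothetical, and resolving the parent directory for files that do not exist yet is one possible way to handle new files.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// within reports whether target is root itself or lies beneath it.
// It is purely lexical; both paths must already be absolute and resolved.
func within(root, target string) bool {
	rel, err := filepath.Rel(root, target)
	if err != nil {
		return false
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// resolve follows symlinks in p. For a path that does not exist yet
// (e.g. a file about to be written) it resolves the parent directory instead.
func resolve(p string) (string, error) {
	abs, err := filepath.Abs(p)
	if err != nil {
		return "", err
	}
	resolved, err := filepath.EvalSymlinks(abs)
	if err == nil {
		return resolved, nil
	}
	if !os.IsNotExist(err) {
		return "", err
	}
	parent, err := filepath.EvalSymlinks(filepath.Dir(abs))
	if err != nil {
		return "", err
	}
	return filepath.Join(parent, filepath.Base(abs)), nil
}

// allowed reports whether target, interpreted relative to root when it is
// not absolute, stays inside root after symlink resolution.
func allowed(root, target string) (bool, error) {
	if !filepath.IsAbs(target) {
		target = filepath.Join(root, target)
	}
	r, err := resolve(root)
	if err != nil {
		return false, err
	}
	t, err := resolve(target)
	if err != nil {
		return false, err
	}
	return within(r, t), nil
}

func main() {
	root, _ := os.Getwd()
	for _, p := range []string{"notes.txt", "../outside.txt"} {
		ok, err := allowed(root, p)
		fmt.Println(p, ok, err)
	}
}
```

comparing with `filepath.Rel` instead of a string prefix avoids treating `/work/project` as being inside `/work/proj`.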

sessions

every interactive or print/json run (unless --no-session) writes a jsonl transcript under $ZOT_HOME/sessions/<cwd-hash>/. resume any of them with --continue, --resume, --session <path>, or interactively via /sessions inside the tui.

models

--list-models or the /model picker shows the full catalog. three sources:

catalog — models baked into zot, always available
live — ids discovered from GET /v1/models using your stored api key (cached for 6h in $ZOT_HOME/models-cache.json, refreshed in the background on startup)
speculative — ids that appear in the upstream generator but aren't live on the public api yet; they'll 404 today and start working the moment the provider ships them

the context meter in the status line (ctx N/M (P%)) uses the model's advertised context window to show how much of it your last turn consumed.

inline images

when a tool returns an image (e.g. read on a png), zot renders it inline on terminals that support it: iterm2, wezterm, kitty, ghostty. on other terminals you see a text placeholder with mime type, pixel dimensions, and byte size. control with the ZOT_INLINE_IMAGES env var:

value	effect
unset (default)	auto-detect based on `TERM_PROGRAM`
`iterm` / `iterm2`	force iterm2 osc 1337 protocol
`kitty`	force kitty graphics protocol
`off` / `none`	always use the text placeholder
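the table above maps onto a small parser. a sketch with hypothetical names; accepting values case-insensitively and returning an error for unknown values are assumptions, and the TERM_PROGRAM auto-detection itself is not shown.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// imageMode names the rendering choices listed in the readme table.
type imageMode string

const (
	modeAuto   imageMode = "auto"   // unset: detect from TERM_PROGRAM
	modeITerm2 imageMode = "iterm2" // force iterm2 osc 1337
	modeKitty  imageMode = "kitty"  // force kitty graphics protocol
	modeOff    imageMode = "off"    // always use the text placeholder
)

// parseInlineImages interprets a ZOT_INLINE_IMAGES value. Case-insensitive
// matching and the error for unknown values are assumptions of this sketch.
func parseInlineImages(v string) (imageMode, error) {
	switch strings.ToLower(strings.TrimSpace(v)) {
	case "":
		return modeAuto, nil
	case "iterm", "iterm2":
		return modeITerm2, nil
	case "kitty":
		return modeKitty, nil
	case "off", "none":
		return modeOff, nil
	default:
		return modeAuto, fmt.Errorf("unrecognised ZOT_INLINE_IMAGES value %q", v)
	}
}

func main() {
	mode, err := parseInlineImages(os.Getenv("ZOT_INLINE_IMAGES"))
	fmt.Println(mode, err)
}
```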

frames containing images are repainted in full (no differential update) so stale image pixels don't linger while scrolling. that costs one terminal flash per image-containing frame; set ZOT_INLINE_IMAGES=off if that bothers you.

queued messages

you can keep typing while the agent is working. pressing enter during a turn queues the message instead of interrupting: it shows up above the status bar as ▸ sliding in: <text> and is delivered as the next user turn the moment the current one finishes. queue as many as you want; they run in order. esc / ctrl+c cancels the active turn and drops the queue so a runaway turn doesn't flood you with stale follow-ups.
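the queueing behaviour (deliver in order, drop everything on cancel) matches a plain fifo. a minimal sketch with a hypothetical `promptQueue` type; the mutex is there because input and the agent loop would typically run on different goroutines.

```go
package main

import (
	"fmt"
	"sync"
)

// promptQueue holds messages typed while a turn is still running.
type promptQueue struct {
	mu    sync.Mutex
	items []string
}

// Push appends a message to the end of the queue.
func (q *promptQueue) Push(msg string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.items = append(q.items, msg)
}

// Pop removes and returns the oldest message, or false if the queue is empty.
func (q *promptQueue) Pop() (string, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.items) == 0 {
		return "", false
	}
	msg := q.items[0]
	q.items = q.items[1:]
	return msg, true
}

// Drop discards all queued messages (as on esc / ctrl+c) and returns how
// many were thrown away.
func (q *promptQueue) Drop() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	n := len(q.items)
	q.items = nil
	return n
}

func main() {
	var q promptQueue
	q.Push("run the tests")
	q.Push("then update the changelog")

	next, _ := q.Pop() // delivered once the current turn finishes
	fmt.Println("next turn:", next)
	fmt.Println("dropped on cancel:", q.Drop())
}
```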

slash commands also work while the agent is busy. read-only ones (/help, /jump, /btw, /sessions, /lock, /unlock, /exit) take effect immediately. destructive ones (/clear, /compact, /login, /logout, /model) cancel the active turn first and then run.

keys (interactive mode)

input

key	action
`enter`	submit (queued if the agent is busy)
`alt+enter`	newline
`tab`	complete the selected slash command
`esc`	cancel the current turn (while busy); clear input (while idle)
`ctrl+c`	clear the input + queue (or cancel the current turn). press again within 2s to exit.
`ctrl+d`	exit on empty input
`ctrl+l`	redraw the screen
`ctrl+o`	expand / collapse long tool results

key	action
`ctrl+a` / `ctrl+e`	jump to start / end of line
`alt+←` / `alt+→`	jump one word back / forward
`ctrl+u` / `ctrl+k`	delete to start / end of line
`ctrl+w` · `alt+backspace`	delete the previous word
`up` / `down` (editor non-empty)	cycle through prompt history

chat scroll

key	action
`pgup` / `pgdn`	scroll one page up / down
`up` / `down` (editor empty)	scroll three lines up / down — this is how the mouse wheel reaches the scroll logic on most terminals

telegram bot (bridge)

zot can run as a telegram bot so you can dm it from your phone. it's a built-in subcommand, not a plugin:

zot telegram-bot setup     # paste a BotFather token, verify, save
zot telegram-bot run       # foreground: long-poll in this terminal (ctrl+c to stop)
zot telegram-bot start     # background: detach and return immediately
zot telegram-bot stop      # sigterm the background bot (sigkill after 5s)
zot telegram-bot logs -f   # tail $ZOT_HOME/logs/bot.log (omit -f to just cat)
zot telegram-bot status    # config (token masked) + running/stopped
zot telegram-bot reset     # forget the token + paired user
# short alias: `zot tg ...` is accepted for every subcommand

the background flavor writes the child's pid to $ZOT_HOME/bot.pid and redirects stdout+stderr to $ZOT_HOME/logs/bot.log. zot telegram-bot stop reads that pid, sends sigterm, waits up to five seconds, then escalates to sigkill if the child is still alive. running two instances at once is refused at startup.
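the stop sequence (read pid, sigterm, wait, escalate to sigkill) can be sketched as below for unix-like systems. it is not zot's code: the function names are hypothetical, the pid file is assumed to hold a single decimal number, and `syscall.Kill` is not available on windows.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"syscall"
	"time"
)

// gracePeriod mirrors the five seconds mentioned in the readme.
const gracePeriod = 5 * time.Second

// parsePID parses pid file contents. A single decimal number, optionally
// surrounded by whitespace, is an assumed format.
func parsePID(contents string) (int, error) {
	pid, err := strconv.Atoi(strings.TrimSpace(contents))
	if err != nil || pid <= 0 {
		return 0, fmt.Errorf("invalid pid file contents %q", contents)
	}
	return pid, nil
}

// stopProcess sends SIGTERM, polls until the process is gone or grace has
// elapsed, then escalates to SIGKILL. killed reports whether it escalated.
func stopProcess(pid int, grace time.Duration) (killed bool, err error) {
	if err := syscall.Kill(pid, syscall.SIGTERM); err != nil {
		return false, fmt.Errorf("sigterm pid %d: %w", pid, err)
	}
	deadline := time.Now().Add(grace)
	for time.Now().Before(deadline) {
		// signal 0 delivers nothing; it only checks that the pid still exists.
		if err := syscall.Kill(pid, 0); err != nil {
			return false, nil
		}
		time.Sleep(100 * time.Millisecond)
	}
	if err := syscall.Kill(pid, syscall.SIGKILL); err != nil && err != syscall.ESRCH {
		return true, fmt.Errorf("sigkill pid %d: %w", pid, err)
	}
	return true, nil
}

func main() {
	// in real use the string would come from reading $ZOT_HOME/bot.pid,
	// and the result would be passed to stopProcess(pid, gracePeriod).
	pid, err := parsePID("12345\n")
	fmt.Println(pid, err)
}
```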

use the installed binary for start. go run ./cmd/zot telegram-bot start won't work — go run builds a binary in a temp directory and deletes it when it exits, which kills the detached child. run make install (or go build) first and invoke the installed binary.

setup flow:

talk to @BotFather on telegram, run /newbot, copy the token it gives you.
run zot telegram-bot setup and paste the token when prompted.
run zot telegram-bot run in the directory you want the agent to operate in.
open your bot on telegram, send /start. the first user to do this claims the bridge (stored as allowed_user_id); every other user is rejected.

from then on, any dm you send is forwarded to the agent as a user prompt. attached photos or image/* documents are downloaded and passed to vision-capable models. in-bot telegram commands: /help, /status, /stop (cancel the current turn). config lives in $ZOT_HOME/bot.json (mode 0600).

bot mode respects the usual zot flags — --provider, --model, --cwd, --reasoning, --continue, --no-session, --no-tools, etc. run zot tg run -c --model claude-opus-4-1 to resume the latest session on opus, for example.

development

make build     # build ./bin/zot
make test      # go test -race ./...
make lint      # go vet + gofmt check
make fmt       # gofmt -w .
make release   # cross-compile linux/darwin/windows × amd64/arm64

source layout:

cmd/zot/                      main()
internal/agent/               cli wiring, arg parsing, system prompt, config
internal/agent/modes/         interactive tui, print, json, dialogs
internal/agent/tools/         read, write, edit, bash, sandbox
internal/auth/                credential store, api-key probe, oauth, login server
internal/core/                agent loop, sessions, cost tracking
internal/provider/            anthropic + openai streaming clients, model catalog
internal/tui/                 terminal raw-mode, input parser, editor, renderer, markdown, view

license

MIT
