Codex explicitly reverted to system /tmp for root pcap staging in c9a8aa1
and documented the exception. No further action needed.
---
Build: pass | Tests: pass — 2456 passed (182 files)
6 KiB
Network Throughput Skill Review Handoff
From: Claude (Linux) Date: 18.may.2026 Status: IN-PROGRESS
Deletion Criteria
- All HIGH items below are resolved
- Skill follows AGENTS.md conventions (project tmp, one command per call, tmux targeting)
- Skill is portable beyond osa/debby pair
HIGH Priority
H1. Hardcoded server/client pair — not portable
File: .agent/skills/network-throughput/SKILL.md:28-32,61,78
Server (osa), client (debby), IPs, interface, and download URL are all
hardcoded. The download URL contains a dated build (15.maj.2026) that will
be superseded.
Fix: Parameterize into a Definitions block at the top:
## Parameters (set before execution)
| Var | Default | Purpose |
| ------------ | ---------------------------------- | ------------------------ |
| SERVER_HOST | osa.smilepowered.org | Server hostname |
| SERVER_IP | (auto from DNS) | Server public IP |
| SERVER_TS_IP | (auto from tailscale status) | Server Tailscale IP |
| SERVER_IF | (auto from ifconfig) | Server NIC |
| CLIENT_HOST | (auto hostname) | Client hostname |
| URL | (auto: latest .img.gz from server) | Download URL |
| DURATION_SEC | 600 | Test duration in seconds |
Resolve IPs dynamically: dig +short $SERVER_HOST and tailscale status --json.
H2. $CLIENT_IP empty → broken tcpdump filter
File: .agent/skills/network-throughput/SKILL.md:63,153
If $CLIENT_IP is not provided, the pcap filter (host and tcp port 443)
is syntactically invalid and tcpdump fails to start. No automatic fallback.
Fix: Add guard:
if [ -z "$CLIENT_IP" ]; then
echo "CLIENT_IP unknown; using unfiltered capture on port 443"
_filter="tcp port 443 or icmp or icmp6"
else
_filter="host ${CLIENT_IP} and tcp port 443"
fi
H3. System /tmp usage — operator accepted (WONTFIX)
File: .agent/skills/network-throughput/SKILL.md:185
CAPTURE_TMP="/tmp/osa-pcap-$TEST_ID" uses system /tmp for root-owned
pcap staging. Codex explicitly reverted the project-local tmp path in
c9a8aa1 and documented it as an intentional exception for root-write
scenarios. The skill now includes a note justifying the deviation.
Status: Operator accepted. The tradeoff (system /tmp for root staging,
chown+move after capture) is documented in the skill. No further action
needed unless the convention changes.
MEDIUM Priority
M1. Multi-line script blocks not decomposed for agent execution
File: Throughout (lines 90-100, 111-118, 148-154, 202-216, 231-241, etc.)
AGENTS.md mandates one command per shell call. The skill is written as a human runbook with multi-line script blocks. An agent must decompose every block, which is error-prone.
Fix: Split each step into numbered individual commands with explicit expected output. Format:
### Step 3: Start server pcap
Run:
mkdir -p /home/clawdie/clawdie-iso/tmp/network-tests/$TEST_ID
date -u '+start-pcap=%Y-%m-%dT%H:%M:%SZ' >> .../timestamps.txt
sudo tcpdump -i vtnet0 -w .../server.pcap -C 200 -W 20 -s 0 tcp port 443 &
M2. No tmux SESSION:WINDOW targeting
File: .agent/skills/network-throughput/SKILL.md:48
Says "run root commands in the visible tmux root window" without specifying
how to target it. AGENTS.md requires SESSION:WINDOW syntax.
Fix: Add explicit targeting guidance:
## Execution Context
- Server commands: `tmux send-keys -t 0:1 "..." Enter`
- Client commands: run directly (Linux agent has shell access)
- Always `tmux list-windows` before targeting
M3. No server-side latency monitor
File: .agent/skills/network-throughput/SKILL.md:202-216
Ping monitor runs only on client. No server-side ping to its gateway or the client. Makes it hard to determine if latency spikes are server-local or path-related.
Fix: Add a parallel server-side ping block targeting the client IP and server default gateway.
M4. No disk space pre-check
File: .agent/skills/network-throughput/SKILL.md:92,113
df -h is collected but never checked. A 4GB pcap ring (200MB x 20) plus
multi-GB download could fill a small disk.
Fix: Add a threshold check after preflight:
_available=$(df -k "$CLIENT_DIR" | awk 'NR==2{print $4}')
if [ "$_available" -lt 5242880 ]; then
echo "WARNING: less than 5GB free on CLIENT_DIR volume"
fi
M5. tcpdump timeout not tied to DURATION_SEC
File: .agent/skills/network-throughput/SKILL.md:149,66
Capture timeout is hardcoded at 900s, default duration is 600s. If someone
sets DURATION_SEC=900, the capture ends simultaneously with the download,
losing post-download counters.
Fix: Calculate capture timeout as DURATION_SEC + 300.
M6. Unexplained 100.103.255.41 / "domedog" IP
File: .agent/skills/network-throughput/SKILL.md:207,484
This IP appears in ping targets and the summary template but is never defined in the Definitions section.
Fix: Add to Definitions:
| DOMEDOG_TS_IP | 100.103.255.41 | Tailscale IP of third reference host |
LOW Priority
L1. User-facing date in result template has no format
File: .agent/skills/network-throughput/SKILL.md:462
Date/time UTC: field should use DD.mmm.YYYY HH:MM per AGENTS.md for
user-facing content.
L2. No error handling on any command
Server preflight (lines 90-100), client preflight (lines 111-118), and all subsequent blocks have no error checking. Silent failures produce misleading diagnostics.
Fix: Add set -e or explicit || echo "WARN: ..." after critical
commands.
L3. No DNS resolution timing
The test uses HTTPS but never captures DNS lookup time. DNS issues could masquerade as throughput problems.
L4. No uplink / ACK-path measurement
Download-only test. Upload throughput matters for ACK path quality.
Delete After
git rm doc/NETWORK-SKILL-HANDOFF.md