---
name: freebsd-truss-debug
description: Debug FreeBSD process failures with truss — trace syscalls to find the exact kernel call that fails (EACCES, ENOENT, etc.).
---

# FreeBSD truss Debugging

`truss` traces every system call a process makes to the kernel. When a command
works from a shell but fails from a daemon/service, `truss` shows exactly which
syscall fails and which errno it returns.

## Quick reference

```sh
# Trace a NEW process (follow children)
sudo truss -f -o /tmp/trace.out command [args]

# Attach to a RUNNING process
sudo truss -f -o /tmp/trace.out -p PID

# Common filters (the trailing space stops ERR#2 matching ERR#20-ERR#29, etc.)
grep 'ERR#' /tmp/trace.out                       # all errors
grep 'ERR#' /tmp/trace.out | grep -v 'ERR#2 '    # errors minus "No such file" noise
grep -E 'fork|execve' /tmp/trace.out             # process creation (also matches rfork/vfork)
grep -E 'ERR#(1|13) |EACCES|EPERM' /tmp/trace.out  # permission errors: EPERM (1), EACCES (13)
```

## When to use

Use `truss` when a command works in one context but not another. Common scenarios:

- Daemon (via `daemon(8)` or rc.d) gets EACCES but shell works fine → PATH issue
- Permission denied but `sudo -u <user>` works → staging directory ownership
- "Text file busy" on binary replacement → process still holding the file
- Silent failures with no error message → syscall trace reveals the hidden error

## Walkthrough: debugging a daemon spawn failure

### 1. Start daemon under truss

```sh
sudo service daemon_name stop
sleep 1; sudo rm -f /var/run/socket.sock /tmp/trace.out
sudo truss -f -o /tmp/trace.out \
  env COLIBRI_JAIL_PRIV_MODE=sudo \
      COLIBRI_DAEMON_SOCKET=/var/run/socket.sock \
      COLIBRI_DAEMON_DATA_DIR=/var/db/app \
  /usr/local/bin/daemon-binary &
sleep 3  # wait for socket ready
```

**Important:** pass the daemon's expected env vars explicitly so the trace
captures the real spawn path, not a misconfigured one.

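
The fixed `sleep 3` in step 1 can be too short on a loaded machine and needlessly long on a fast one. A minimal sketch of a polling alternative follows. `wait_for_path` is a hypothetical helper name, not part of `truss` or the daemon, and it only checks that the path exists; swap `-e` for `-S` if you want to require a socket.

```sh
# wait_for_path PATH [TIMEOUT_SECONDS]
# Hypothetical helper: polls once per second until PATH exists.
# Returns 0 once the path appears, 1 if the timeout (default 10s) expires.
wait_for_path() {
    path=$1
    timeout=${2:-10}
    elapsed=0
    while [ ! -e "$path" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timed out after ${timeout}s waiting for $path" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

In the walkthrough this would replace the sleep, for example `wait_for_path /var/run/socket.sock 10 || exit 1`.
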
### 2. Trigger the failing operation

```sh
client-command --socket /var/run/socket.sock trigger-failure
sleep 2
```

### 3. Stop and analyze

```sh
sudo pkill daemon-binary; wait
wc -l /tmp/trace.out  # expect hundreds to thousands of lines

# Find permission errors: EPERM (ERR#1) and EACCES (ERR#13).
# The trailing space keeps ERR#1 from also matching ERR#10-ERR#19.
grep -E 'ERR#(1|13) |EACCES|EPERM' /tmp/trace.out

# Find process creation (fork + exec); 'fork' also matches rfork and vfork
grep -E 'fork|execve' /tmp/trace.out
```

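
When the trace is large, it can help to see which errno values dominate before grepping for a specific one. The sketch below counts failures per errno. `summarize_errors` is a hypothetical helper, and it assumes failed calls are logged in the form `syscall(args) ERR#N 'message'`.

```sh
# summarize_errors TRACE_FILE
# Hypothetical helper: prints "COUNT ERR#N" per errno, most frequent first.
# Assumes each failed call carries an "ERR#N" token on its line.
summarize_errors() {
    awk '
        match($0, /ERR#[0-9]+/) {
            # Key the counter on the matched token, e.g. "ERR#13"
            count[substr($0, RSTART, RLENGTH)]++
        }
        END {
            for (e in count) print count[e], e
        }
    ' "$1" | sort -rn
}
```

Running `summarize_errors /tmp/trace.out` usually shows a large `ERR#2` count from normal path probing; the rarer codes are the ones worth reading line by line.
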
### 4. Interpret

| Pattern | Meaning |
|---------|---------|
| `fork() ERR#35` | Can't create child process (EAGAIN: process or resource limit reached) |
| `execve("/path/to/bin") ERR#13` | Binary exists but can't execute (permissions, MAC) |
| `execve("sudo") ERR#2` | Bare name: PATH doesn't include `/usr/local/bin` |
| `open("/path") ERR#13` | File exists but can't open (ownership, mode) |
| `mkdir("/path") ERR#13` | Parent directory not writable |
| No fork/exec at all | Error happens BEFORE spawn: staging/validation failure |

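
Because the trace was recorded with `-f`, lines from the daemon and its children are interleaved. Once a pattern from the table shows up, isolating the child that hit it makes the sequence easier to read. This is a sketch under the assumption that each `truss -f` line starts with the PID followed by a colon; both function names are hypothetical.

```sh
# failing_pids TRACE_FILE ERRNO
# Hypothetical helper: prints each PID that hit ERR#ERRNO, once.
# Assumes lines look like: "PID: syscall(args) ERR#N 'message'".
failing_pids() {
    awk -v pat="ERR#$2 " '
        index($0, pat) {
            pid = $1
            sub(/:$/, "", pid)          # strip the trailing colon
            if (!seen[pid]++) print pid
        }
    ' "$1"
}

# calls_for_pid TRACE_FILE PID
# Hypothetical helper: prints every traced line that belongs to PID.
calls_for_pid() {
    awk -v pid="$2" '$1 == pid ":"' "$1"
}
```

For example, `failing_pids /tmp/trace.out 13` lists the processes that saw EACCES, and `calls_for_pid /tmp/trace.out PID` then shows what that process did before and after the failure.
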
## Common daemon pitfalls caught by truss

1. **Bare command names**: a daemon started through rc.d/`service(8)` runs with a minimal PATH (typically `/sbin:/bin:/usr/sbin:/usr/bin`), not your login shell's, so `execvp("sudo")` can't find `/usr/local/bin/sudo`. Fix: use absolute paths or a fixed search list.

2. **Staging directory ownership**: the daemon runs as an unprivileged user but the staging path was created by root. Fix: pre-create it with the correct ownership in the bootstrap script.

3. **Orphaned processes holding socket**: `service stop` killed the supervisor but old background daemons still hold the socket. Fix: run `ps aux | grep 'daemon: name'` to find every supervisor and kill them all before restarting.

4. **Capsicum sandboxing**: if `cap_enter()` appears in the trace, the process entered capability mode, and later calls that reach global namespaces, such as `open()` on a path, fail with `ERR#94` (ECAPMODE). Fix: do all setup BEFORE `cap_enter()`.

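
The bare-name pitfall can be checked without a trace by resolving the command against the PATH the service is likely to see. The sketch below mimics a PATH search in plain `sh`. `find_in_path` is a hypothetical helper, and the `SERVICE_PATH` value is an assumption; confirm it against your own rc environment.

```sh
# find_in_path NAME SEARCH_PATH
# Hypothetical helper: prints the first executable match for NAME in the
# colon-separated SEARCH_PATH, or returns 1 if no directory contains it.
find_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Assumed service PATH; adjust to match what your rc environment provides.
SERVICE_PATH=/sbin:/bin:/usr/sbin:/usr/bin
find_in_path sudo "$SERVICE_PATH" || echo "sudo: not found on service PATH"
find_in_path sudo "$PATH"         || echo "sudo: not found on shell PATH"
```

If the two lookups disagree, the daemon should call the binary by the absolute path that the second lookup prints.
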
## ktrace / kdump (alternative)

For long-running processes where `truss` output would be too large:

```sh
# Record (here -f names the output file; add -i to also trace future children)
sudo ktrace -f /tmp/ktrace.out -p PID
# ... trigger the bug ...
sudo ktrace -C  # stop tracing

# Read (kdump reports failures as "errno N", not "ERR#N")
kdump -f /tmp/ktrace.out | less
kdump -f /tmp/ktrace.out | grep -E 'fork|execve|errno'
```

`ktrace` records events inside the kernel and writes them to a binary file, so it
slows the target less than `truss` for high-throughput processes. Use `kdump` to
decode. It captures the same syscalls through a different mechanism, but `kdump`
prints each failure as `errno N` on a `RET` line rather than as `ERR#N`.
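
To get a `truss`-style list of failures out of `kdump` text, the `RET` lines can be filtered directly. This is a sketch that assumes failed returns look like `PID COMMAND RET syscall -1 errno N message`; `kdump_failures` is a hypothetical helper name.

```sh
# kdump_failures
# Hypothetical helper: reads kdump text on stdin and prints
# "SYSCALL errno N" for every failed return.
# Assumes RET lines of the form: PID COMMAND RET syscall -1 errno N message
kdump_failures() {
    awk '
        {
            # Locate the RET token instead of assuming a fixed column
            for (i = 1; i <= NF - 4; i++) {
                if ($i == "RET" && $(i + 2) == "-1" && $(i + 3) == "errno") {
                    print $(i + 1), "errno", $(i + 4)
                    break
                }
            }
        }
    '
}
```

A typical use would be `kdump -f /tmp/ktrace.out | kdump_failures | sort | uniq -c`, which groups the failures by syscall and errno.
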