fix(gpu): make NVIDIA auto-detection actually read the right PCI id (Sam & Claude)

The #29 detector grepped `chip=0x10de...`, but FreeBSD's pciconf chip field is
chip=0x<device><vendor>, with the vendor id 0x10de in the LOW 16 bits. The
pattern therefore never matched, and the device-id / recommended-branch logic
was dead code.

- nvidia_device_id: match `chip=0x<4hex>10de` and strip to the device id
  (chip=0x1c8c10de -> 1c8c).
- nvidia_branch_for_device: non-overlapping architecture ranges returning the
  build's lane labels {390,470,590} so detected vs staged compare correctly;
  empty/unknown -> 590 (safe default for modern unknown hardware).
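
A minimal standalone sketch of the chip-field extraction described above. The
function name `extract_nvidia_device_id` and the sample line are hypothetical
(only the `chip=0x1c8c10de` value comes from this commit); the real function
works on the block returned by clawdie_live_gpu_nvidia_block.

```shell
#!/bin/sh
# Hypothetical helper (not the project's function): read pciconf-style text on
# stdin and print the 4-hex-digit device id of the first NVIDIA chip field.
extract_nvidia_device_id() {
    # chip=0x<device><vendor>: match 4 hex digits followed by the vendor 10de,
    # then strip the "chip=0x" prefix and the trailing vendor id.
    grep -Eo 'chip=0x[0-9a-fA-F]{4}10de' | head -1 |
        sed 's/^chip=0x//; s/10de$//'
}

# Illustrative input; every field except chip= is made up for the example.
sample='vgapci0@pci0:1:0:0: class=0x030000 chip=0x1c8c10de rev=0xa1 hdr=0x00'
printf '%s\n' "$sample" | extract_nvidia_device_id   # prints: 1c8c
```

A vendor-first string such as chip=0x10de1c8c produces no output here, which
is the inverse of the bug this commit fixes.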

Validated on Linux against representative ids: Fermi 0e22->390, Kepler 0fc8->470,
Maxwell 1380 / Pascal 1b81 / Turing 1c8c / Ada 2684 ->590, empty->590. sh -n clean.
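
The validation above can be reproduced with a sketch along these lines. The
name `branch_for_device` is a hypothetical stand-in for
clawdie_live_gpu_nvidia_branch_for_device, and sending non-hex input to 590 is
an assumption of this sketch, not something the diff itself does.

```shell
#!/bin/sh
# Hypothetical stand-in for the range mapping: device id (hex) -> lane label.
branch_for_device() {
    id="$1"
    [ -n "$id" ] || { echo "590"; return 0; }        # empty -> 590
    case "$id" in
        *[!0-9a-fA-F]*) echo "590"; return 0 ;;      # assumption: non-hex -> 590
    esac
    num=$(printf '%d' "0x$id" 2>/dev/null) || num=0
    if [ "$num" -ge 4864 ]; then
        echo "590"    # >= 0x1300: Maxwell and newer
    elif [ "$num" -ge 3840 ]; then
        echo "470"    # 0x0f00 .. 0x12ff: Kepler
    else
        echo "390"    # < 0x0f00: Fermi
    fi
}

# Representative ids and expected lanes, taken from the commit message.
for pair in 0e22:390 0fc8:470 1380:590 1b81:590 1c8c:590 2684:590; do
    id=${pair%%:*}
    want=${pair##*:}
    got=$(branch_for_device "$id")
    printf '%s -> %s (expected %s)\n' "$id" "$got" "$want"
done
```

Because the three ranges are checked from highest to lowest in one if/elif
chain, each id falls into exactly one lane.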

This is the detection logic for the universal NVIDIA auto-install lane; the
on-image NVIDIA repo and boot-time install remain FreeBSD build-side work (handoff).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sam & Claude 2026-06-04 22:07:02 +02:00
parent c8fb5826c9
commit d2ec1b7a72

@@ -44,32 +44,39 @@ clawdie_live_gpu_nvidia_device_id()
_block=$(clawdie_live_gpu_nvidia_block "$1")
[ -n "$_block" ] || return 1
echo "$_block" | grep -Eo 'chip=0x10de[0-9a-fA-F]+' | head -1 | cut -c11-14
# FreeBSD pciconf chip field is chip=0x<device><vendor>. NVIDIA's vendor id
# 0x10de sits in the LOW 16 bits, so the device id is the 4 hex digits BEFORE
# it (e.g. chip=0x1c8c10de -> device id 1c8c). The earlier "chip=0x10de..."
# form read the vendor half and never matched.
echo "$_block" | grep -Eo 'chip=0x[0-9a-fA-F]{4}10de' | head -1 |
sed 's/^chip=0x//; s/10de$//'
}
# Map an NVIDIA PCI device id (4 hex digits) to the recommended driver branch.
# Returns labels in the build's lane vocabulary {390, 470, 590} so the value can
# be compared directly with clawdie_live_gpu_nvidia_branch staged from build.sh.
# (The "590" lane installs the current nvidia-driver-580 package.)
#
# Coarse architecture heuristic by device-id range (validate against real
# hardware; refine the boundaries as needed):
# Fermi ( < 0x0f00) -> 390
# Kepler (0x0f00 .. 0x12ff) -> 470
# Maxwell+ ( >= 0x1300) -> 590 (current: Maxwell/Pascal/Turing/Ampere/Ada/...)
# Unknown/empty -> 590, the safest default for modern unknown hardware.
clawdie_live_gpu_nvidia_branch_for_device()
{
_device_id="$1"
[ -n "$_device_id" ] || return 1
[ -n "$_device_id" ] || { echo "590"; return 0; }
_device_num=$(printf '%d' "0x$_device_id" 2>/dev/null || echo "0")
if [ "$_device_num" -ge 8288 ]; then
echo "590"
return 0
if [ "$_device_num" -ge 4864 ]; then
echo "590" # >= 0x1300 Maxwell, Pascal, Turing, Ampere, Ada, ...
elif [ "$_device_num" -ge 3840 ]; then
echo "470" # 0x0f00 .. 0x12ff Kepler
else
echo "390" # < 0x0f00 Fermi
fi
if [ "$_device_num" -ge 4928 ] && [ "$_device_num" -le 8582 ]; then
echo "470"
return 0
fi
if [ "$_device_num" -ge 4480 ] && [ "$_device_num" -le 5100 ]; then
echo "390"
return 0
fi
echo "590"
}
clawdie_live_gpu_has_pci_vendor()