SeaBiscuit

Inference on Idle Hardware

SeaBiscuit Documentation

Reference for installing, operating, and tuning SeaBiscuit on‑prem.

System requirements (validated tiers)

The table below lists the lowest configurations we have tested end-to-end. Hardware below the Baseline tier may still work for smaller or quantised models, but it is not yet covered by support SLAs.

| Tier | Use case | CPU | RAM | GPU / VRAM | NIC |
|------|----------|-----|-----|------------|-----|
| Baseline | Models ≤ 13 B params, 4-bit or FP16 | 4 cores (AVX2) | 16 GB | Optional; any PCIe GPU | 10 GbE |
| Large-model | 30 B–70 B param splits | 8 cores (AVX-512) | 32 GB | ≥1 GPU with ≥16 GB VRAM (A100/H100, MI-300, etc.) | 25 GbE RDMA or HDR InfiniBand |

Both tiers assume at least 50 GB of free NVMe for model files and logs.
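As a rough sanity check, a dense checkpoint's on-disk footprint is parameter count × bytes per parameter. The sketch below (hypothetical helpers, not part of SeaBiscuit) compares a model against the 50 GB NVMe budget, reserving a few GB for logs:

```python
def checkpoint_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate checkpoint size in GB for a dense model."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

def fits_nvme_budget(params_billion: float, bits_per_param: int,
                     budget_gb: float = 50.0, log_reserve_gb: float = 5.0) -> bool:
    """True if the model file plus a log reserve fits in the NVMe budget."""
    return checkpoint_gb(params_billion, bits_per_param) + log_reserve_gb <= budget_gb

# 13 B parameters at FP16 ≈ 26 GB: fits in 50 GB with room for logs.
print(fits_nvme_budget(13, 16))   # True
# 70 B at FP16 ≈ 140 GB: needs far more than the baseline 50 GB.
print(fits_nvme_budget(70, 16))   # False
```

The 5 GB log reserve is an illustrative assumption; size it to your retention policy.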

Network guidelines

SeaBiscuit adapts its partitioning to the available fabric. Splitting one model across hosts requires sustained bandwidth of at least 3 GB/s (roughly 25 GbE line rate) or InfiniBand FDR/HDR.

| Fabric | Minimum | Preferred |
|--------|---------|-----------|
| Ethernet | 25 GbE RDMA/RoCE | 100 GbE |
| InfiniBand | FDR 56 Gb/s | HDR 200 Gb/s |

10 GbE networks are supported, but the planner keeps any tensor larger than 1 MB on the same host.
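The planner's rule can be summarised as: split across hosts only when the fabric sustains ≥ 3 GB/s; otherwise pin large tensors locally. A minimal sketch of that decision, with hypothetical function names (the real planner is internal to SeaBiscuit):

```python
SPLIT_THRESHOLD_GBPS = 3.0    # sustained GB/s needed to split one model across hosts
LOCAL_TENSOR_LIMIT = 1 << 20  # on slower fabrics, tensors above ~1 MB stay host-local

def can_split_across_hosts(sustained_gb_per_s: float) -> bool:
    """Fast fabrics (25 GbE RDMA, FDR/HDR InfiniBand) clear this bar."""
    return sustained_gb_per_s >= SPLIT_THRESHOLD_GBPS

def place_tensor(size_bytes: int, sustained_gb_per_s: float) -> str:
    """Return 'remote-ok' or 'same-host' for a tensor, mirroring the planner rule."""
    if can_split_across_hosts(sustained_gb_per_s):
        return "remote-ok"
    return "same-host" if size_bytes > LOCAL_TENSOR_LIMIT else "remote-ok"

# 10 GbE sustains roughly 1.2 GB/s, so a 4 MB tensor is pinned to one host.
print(place_tensor(4 << 20, 1.2))   # same-host
```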

Placement config

# /etc/seabiscuit/policy.yaml
prefer_same_rack: true
max_block_memory_gb: 16
cpu_block_allow:
  - layer_norm
  - softmax
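It can be useful to sanity-check a policy before deploying it. The sketch below validates the parsed policy as a plain dict; the allowed-op set and error messages are illustrative assumptions, not SeaBiscuit's actual schema:

```python
# Assumed allowlist, based on the ops shown in the example policy above.
ALLOWED_CPU_BLOCKS = {"layer_norm", "softmax"}

def validate_policy(policy: dict) -> list[str]:
    """Return a list of problems found in a placement policy dict (empty = OK)."""
    errors = []
    if not isinstance(policy.get("prefer_same_rack"), bool):
        errors.append("prefer_same_rack must be a boolean")
    mem = policy.get("max_block_memory_gb")
    if not isinstance(mem, (int, float)) or mem <= 0:
        errors.append("max_block_memory_gb must be a positive number")
    for op in policy.get("cpu_block_allow", []):
        if op not in ALLOWED_CPU_BLOCKS:
            errors.append(f"unknown CPU block: {op}")
    return errors

policy = {
    "prefer_same_rack": True,
    "max_block_memory_gb": 16,
    "cpu_block_allow": ["layer_norm", "softmax"],
}
print(validate_policy(policy))  # []
```

In practice you would load the dict from /etc/seabiscuit/policy.yaml with a YAML parser such as PyYAML before validating.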

Monitoring

SeaBiscuit exposes Prometheus-format metrics at /metrics. A prebuilt Grafana dashboard (JSON) ships in dashboards/.
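For quick checks without Prometheus, the text scraped from /metrics can be parsed directly. A minimal sketch for label-free gauges and counters; the metric name is hypothetical:

```python
def parse_prometheus(text: str) -> dict[str, float]:
    """Parse simple Prometheus text-exposition lines into {metric: value}.

    Skips comments and labelled series, which is enough for plain gauges/counters.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        if name and "{" not in name:
            try:
                metrics[name] = float(value)
            except ValueError:
                pass  # ignore NaN-like or malformed values we don't handle
    return metrics

sample = """\
# HELP seabiscuit_tokens_total Tokens generated (hypothetical metric name)
# TYPE seabiscuit_tokens_total counter
seabiscuit_tokens_total 12345
"""
print(parse_prometheus(sample))  # {'seabiscuit_tokens_total': 12345.0}
```

In practice, fetch the text with urllib.request.urlopen("http://HOST:PORT/metrics") and feed it to the parser.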

Glossary

| Term | Definition |
|------|------------|
| AVX2 / AVX-512 | CPU SIMD extensions; speed up FP16 math on CPUs. |
| GbE | Gigabit Ethernet; link speed rated in gigabits per second. |
| FDR / HDR | InfiniBand generations at 56 Gb/s and 200 Gb/s. |
| Min-cut | Graph-partitioning algorithm SeaBiscuit uses to split models. |
| RoCE | RDMA over Converged Ethernet. |
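To make the min-cut idea concrete: treat model blocks as graph nodes, activation traffic between them as edge weights, and look for the two-host split that crosses the least traffic. A toy brute-force illustration (block names and traffic figures are invented; SeaBiscuit's actual partitioner is not shown here):

```python
from itertools import combinations

# Toy layer graph: edges (a, b, activation_mb) between consecutive blocks.
edges = [("embed", "attn0", 64), ("attn0", "mlp0", 8),
         ("mlp0", "attn1", 64), ("attn1", "head", 8)]
nodes = ["embed", "attn0", "mlp0", "attn1", "head"]

def cut_cost(group_a) -> int:
    """Total activation traffic (MB) crossing the partition boundary."""
    return sum(w for a, b, w in edges if (a in group_a) != (b in group_a))

# Brute-force the minimum cut over all non-trivial splits into two hosts.
best = min((frozenset(c) for r in range(1, len(nodes))
            for c in combinations(nodes, r)), key=cut_cost)
print(sorted(best), cut_cost(best))  # the cheapest split crosses 8 MB
```

Brute force is exponential in the number of blocks; real partitioners use polynomial-time min-cut/max-flow algorithms instead.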

FAQ

Can I run on a single laptop?

Yes, for small (≤ 7 B) models, but performance will be limited and single-laptop deployments are not covered by support.

Does SeaBiscuit support training?

Not yet; SeaBiscuit currently supports inference only.