
Scale from 1 to 100 Test Stations

Architecture guide for scaling test infrastructure from 1 to 100 stations, covering naming, script distribution, config management, networking, and monitoring.

Julien Buteau
Advanced · 12 min read · March 14, 2026

Run one station or a hundred: TofuPilot handles both, but the architecture decisions you make at 10 stations determine whether you succeed at 100. This guide walks through the changes needed at each scale threshold.

Scaling Challenges by Stage

| Stations | What breaks | Root cause |
|---|---|---|
| 1 | Nothing | You're fine |
| 5-10 | Config drift | Manual setup per station |
| 10-25 | Script version mismatch | No deployment pipeline |
| 25-50 | Debugging blindness | No centralized logs |
| 50-100 | Network saturation | All stations hit cloud simultaneously |
| 100+ | Identity collision | Non-unique station names |

Architecture at Each Scale

1 Station

A single machine running your test script. No special infrastructure needed.

[DUT] -> [Test Station] -> [TofuPilot Cloud]

10 Stations

At 10 stations, the bottleneck is configuration. You need a shared config source, consistent naming, and a way to push script updates.

[10 DUTs] -> [10 Test Stations] -> [TofuPilot Cloud]
                    |
          [Git repo for scripts]

100 Stations

At 100, you need a deployment pipeline, centralized monitoring, and staggered upload scheduling.

[100 DUTs] -> [100 Test Stations] -> [TofuPilot Cloud]
                      |                      |
              [CI/CD pipeline]       [Health dashboard]
              [Config server]

Station Naming and Registration

TofuPilot identifies stations by name. A collision corrupts per-station analytics, so use a naming scheme that encodes site, line, and station number:

{site}-{line}-{station_number}

# Examples:
taipei-line1-001
munich-eol-001

Set the station name as an environment variable:

/etc/environment
TOFUPILOT_STATION_ID=taipei-line1-001
TOFUPILOT_API_KEY=tp_live_xxxxxxxxxxxx
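Because a malformed or duplicated name only shows up later as confusing analytics, it can help to check IDs before a station goes live. The following is a minimal sketch, assuming the three-part `{site}-{line}-{station_number}` scheme with a three-digit station number. The regex and the helper names `parse_station_id` and `find_collisions` are illustrative assumptions, not part of TofuPilot's API.

```python
import re
from collections import Counter

# Assumed pattern: lowercase alphanumeric site and line, three-digit station number.
STATION_ID_RE = re.compile(
    r"^(?P<site>[a-z0-9]+)-(?P<line>[a-z0-9]+)-(?P<number>\d{3})$"
)


def parse_station_id(station_id: str) -> dict:
    """Split a station ID into its parts, or raise ValueError if malformed."""
    match = STATION_ID_RE.match(station_id)
    if match is None:
        raise ValueError(f"Station ID {station_id!r} does not match site-line-NNN")
    return match.groupdict()


def find_collisions(station_ids: list[str]) -> list[str]:
    """Return the sorted IDs that appear more than once in an inventory list."""
    counts = Counter(station_ids)
    return sorted(sid for sid, n in counts.items() if n > 1)
```

A station could call `parse_station_id(os.environ["TOFUPILOT_STATION_ID"])` at startup, and `find_collisions` could run against whatever inventory file lists your stations.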

Test Script Distribution

Option 1: Git Pull (1-20 stations)

/etc/cron.d/update-test-script
0 6 * * * testuser cd /opt/testscripts && git pull origin main

Pros: Simple. Cons: Requires network access to Git host. Fails silently.
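One way to address the silent-failure drawback is to wrap the pull in a small script that reports a non-zero exit. This is a sketch under assumptions: the helper name `update_scripts` is hypothetical, the `--ff-only` flag is an addition to the cron command above (it refuses to merge if the station's checkout has diverged), and the `run` parameter exists only so the function can be exercised without a real repository.

```python
import subprocess
import sys


def update_scripts(repo_dir: str, run=subprocess.run) -> bool:
    """Run `git pull` in repo_dir and report whether it succeeded."""
    result = run(
        ["git", "-C", repo_dir, "pull", "--ff-only", "origin", "main"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Surface the failure on stderr instead of letting cron swallow it.
        print(
            f"git pull failed in {repo_dir}: {result.stderr.strip()}",
            file=sys.stderr,
        )
        return False
    return True
```

A cron entry could call a script that exits with `0 if update_scripts("/opt/testscripts") else 1`, so that cron mail or a wrapper can alert on the failure.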

Option 2: PyInstaller Binary (20-50 stations)

build_and_deploy.sh
#!/bin/bash
# Build a single-file executable named test_runner.
pyinstaller --onefile test_main.py --name test_runner

for station in taipei-line1-{001..020}; do
  # Copy under a temporary name, then rename so the swap happens in one step.
  rsync -az dist/test_runner "${station}:/opt/testscripts/test_runner_new"
  ssh "${station}" "mv /opt/testscripts/test_runner_new /opt/testscripts/test_runner"
done

Pros: No Python on stations. Cons: Slower iteration, larger binary.

Option 3: Docker (50-100 stations)

Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY test_main.py .
CMD ["python", "test_main.py"]

docker-compose.yml
version: "3.9"
services:
  test_runner:
    image: your-registry/test-runner:latest
    restart: unless-stopped
    environment:
      - TOFUPILOT_STATION_ID
      - TOFUPILOT_API_KEY
    devices:
      - /dev/ttyUSB0:/dev/ttyUSB0

Update all stations:

deploy.sh
#!/bin/bash
parallel-ssh -h stations.txt "docker compose pull && docker compose up -d"

Distribution Method Comparison

| Method | Stations | Python on station | Update speed | Rollback |
|---|---|---|---|---|
| Git pull | 1-20 | Yes | Fast | git revert |
| PyInstaller | 20-50 | No | Medium | Replace binary |
| Docker | 50-100 | No | Fast | Previous image tag |

Centralized Configuration

Never hardcode values that vary per station or product.

| Config tier | Scope | Example |
|---|---|---|
| Environment variable | Per station | TOFUPILOT_STATION_ID, TOFUPILOT_API_KEY |
| Config file | Per product | Voltage limits, timing thresholds |
| Test script | Per test phase | OpenHTF phase logic |

config/product_v2.yaml
voltage_rail_3v3:
  min: 3.2
  max: 3.4
voltage_rail_5v:
  min: 4.8
  max: 5.2
boot_time_ms:
  min: 0
  max: 3000

test_main.py
import yaml
import openhtf as htf
from openhtf.util import units
from tofupilot.openhtf import TofuPilot

# Limits come from the per-product config file, not from the script.
with open("config/product_v2.yaml") as f:
    cfg = yaml.safe_load(f)


@htf.measures(
    htf.Measurement("voltage_3v3").in_range(
        cfg["voltage_rail_3v3"]["min"],
        cfg["voltage_rail_3v3"]["max"]
    ).with_units(units.VOLT)
)
def phase_power_check(test):
    # read_voltage_rail() is the station's instrument helper; it is not defined in this snippet.
    voltage = read_voltage_rail("3v3")
    test.measurements.voltage_3v3 = voltage
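Since limits now live outside the script, a typo in the config file (a swapped min and max, or a missing key) would only surface when a phase runs. The sketch below checks the loaded config up front. It assumes every top-level entry carries `min` and `max` keys, as in `config/product_v2.yaml`; the function name `validate_limits` is hypothetical.

```python
def validate_limits(cfg: dict) -> list[str]:
    """Return a list of human-readable problems found in a limits config."""
    problems = []
    for name, limits in cfg.items():
        if not isinstance(limits, dict) or "min" not in limits or "max" not in limits:
            problems.append(f"{name}: expected 'min' and 'max' keys")
            continue
        if limits["min"] > limits["max"]:
            problems.append(f"{name}: min {limits['min']} exceeds max {limits['max']}")
    return problems


# Same structure as config/product_v2.yaml after yaml.safe_load().
cfg = {
    "voltage_rail_3v3": {"min": 3.2, "max": 3.4},
    "voltage_rail_5v": {"min": 4.8, "max": 5.2},
    "boot_time_ms": {"min": 0, "max": 3000},
}
assert validate_limits(cfg) == []
```

Calling this right after `yaml.safe_load` and refusing to start on a non-empty result keeps a bad config from producing misleading pass/fail data.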

Network Topology

Stations make only outbound connections, to TofuPilot and to your own Git host and container registry. No inbound ports are required.

| Destination | Port | Purpose |
|---|---|---|
| tofupilot.app | 443 | Test result upload |
| Your Git host | 443 | Script updates |
| Your container registry | 443 | Image pulls |
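When a new station sits behind a factory firewall, a quick reachability check against the destinations in the table above can save debugging time. This is a rough sketch: the function name `check_outbound` is hypothetical, your Git and registry hostnames would need to be added to the target list, and a successful TCP connect says nothing about TLS or API-key validity.

```python
import socket

# Only the TofuPilot host is known here; add your Git host and registry.
DEFAULT_TARGETS = [("tofupilot.app", 443)]


def check_outbound(targets=DEFAULT_TARGETS, timeout=3.0,
                   connect=socket.create_connection):
    """Return {(host, port): bool} for each outbound TCP connection attempt."""
    results = {}
    for host, port in targets:
        try:
            conn = connect((host, port), timeout=timeout)
            conn.close()
            results[(host, port)] = True
        except OSError:
            # Covers DNS failures, refused connections, and timeouts.
            results[(host, port)] = False
    return results
```

The `connect` parameter is there so the check can be tested without network access; on a station you would call `check_outbound()` with the defaults.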

At 100 stations, stagger uploads to avoid a thundering herd:

upload_utils.py
import time
import random


def upload_with_jitter(upload_fn, max_jitter_seconds=10):
    time.sleep(random.uniform(0, max_jitter_seconds))
    upload_fn()
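Random jitter spreads uploads on average, but two stations can still draw similar delays. An alternative is a stable per-station offset derived from the station ID, so each station always lands in the same slot of the window. The sketch below is one possible variant, not something TofuPilot provides; `station_offset_seconds` is a hypothetical helper.

```python
import hashlib


def station_offset_seconds(station_id: str, window_seconds: float = 10) -> float:
    """Map a station ID to a stable offset in [0, window_seconds)."""
    digest = hashlib.sha256(station_id.encode("utf-8")).digest()
    # Interpret the first 4 bytes as an unsigned integer, then scale into the window.
    bucket = int.from_bytes(digest[:4], "big")
    return (bucket / 2**32) * window_seconds
```

A station would sleep for `station_offset_seconds(os.environ["TOFUPILOT_STATION_ID"])` before uploading. This mainly helps when many stations finish at the same moment, for example at shift start; otherwise the random version is simpler.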

Monitoring Station Health

TofuPilot tracks per-station yield, throughput, and test duration automatically. Open the Analytics tab filtered by station to spot degradation.

Station Health Checklist

| Check | Interval | Action on failure |
|---|---|---|
| Heartbeat to TofuPilot | 1 min | Alert on-call, check network |
| Test script version | On start | Auto-update via pipeline |
| Disk space | 1 hour | Archive old logs, alert if under 5 GB |
| USB fixture connectivity | Before each run | Fail test with clear error |
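Two of the checks in the table, disk space and fixture connectivity, can be run locally before each test. The sketch below assumes the fixture appears at `/dev/ttyUSB0` (the device mapped in the Docker Compose file) and uses the 5 GB threshold from the table. The function names are hypothetical, and the `usage` and `exists` parameters exist only to make the checks testable.

```python
import os
import shutil

GB = 10**9


def disk_space_ok(path: str = "/", min_free_gb: float = 5.0,
                  usage=shutil.disk_usage) -> bool:
    """True when the filesystem holding `path` has at least min_free_gb free."""
    return usage(path).free >= min_free_gb * GB


def fixture_connected(device: str = "/dev/ttyUSB0", exists=os.path.exists) -> bool:
    """True when the fixture's device node is present."""
    return exists(device)


def preflight(device: str = "/dev/ttyUSB0") -> None:
    """Raise a clear error before a run if a basic check fails."""
    if not fixture_connected(device):
        raise RuntimeError(f"Fixture not found at {device}; check the USB cable")
    if not disk_space_ok():
        raise RuntimeError("Less than 5 GB free; archive old logs")
```

Calling `preflight()` at the top of the test script turns a missing fixture into an explicit error rather than a failed measurement.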

Station Bootstrap Script

bootstrap.sh
#!/bin/bash
set -e

STATION_ID=$1
API_KEY=$2

if [ -z "$STATION_ID" ] || [ -z "$API_KEY" ]; then
  echo "Usage: bootstrap.sh <station-id> <api-key>"
  exit 1
fi

echo "TOFUPILOT_STATION_ID=${STATION_ID}" >> /etc/environment
echo "TOFUPILOT_API_KEY=${API_KEY}" >> /etc/environment

curl -fsSL https://get.docker.com | sh

mkdir -p /opt/testrunner
cat > /opt/testrunner/docker-compose.yml << EOF
version: "3.9"
services:
  test_runner:
    image: your-registry/test-runner:latest
    restart: unless-stopped
    env_file: /etc/environment
EOF

docker compose -f /opt/testrunner/docker-compose.yml up -d
echo "Station ${STATION_ID} is running."

terminal
sudo bash bootstrap.sh taipei-line1-042 tp_live_xxxxxxxxxxxx
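Note that the bootstrap script appends to `/etc/environment`, so running it twice on the same machine leaves duplicate entries. If stations are likely to be re-bootstrapped, an idempotent update is safer. The following sketch shows the idea on the file's text; `upsert_env_line` is a hypothetical helper, and reading and writing the actual file is left to the caller.

```python
def upsert_env_line(text: str, key: str, value: str) -> str:
    """Return env-file text with KEY=value present exactly once."""
    new_line = f"{key}={value}"
    out = []
    replaced = False
    for line in text.splitlines():
        if line.startswith(f"{key}="):
            if not replaced:
                # Replace the first occurrence in place.
                out.append(new_line)
                replaced = True
            # Drop any further duplicates of the same key.
            continue
        out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"
```

Applied once for `TOFUPILOT_STATION_ID` and once for `TOFUPILOT_API_KEY`, this keeps the file stable no matter how many times bootstrap runs.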
