Every hardware test produces data. Voltages, pass/fail results, serial numbers, timestamps, operator IDs. The question isn't whether you have data. It's whether you can find it, trust it, and act on it six months later.
This guide covers how to structure, store, and query electronics test data with TofuPilot so you get automatic traceability, yield tracking, and process control without building any of it yourself.
## What Test Data Management Means
Test data management is the practice of organizing test results so they're queryable, traceable, and useful over time. Here's what it needs to answer:
| Question | What it requires |
|---|---|
| Did this unit pass? | A test run linked to a serial number with a clear pass/fail outcome |
| What failed and why? | Per-step measurements with limits, not just a top-level verdict |
| Is yield dropping? | Time-series aggregation of pass/fail across runs |
| Is this measurement drifting? | Historical measurement values with timestamps and limits |
| Can we trace this unit's full history? | Every test run linked to a unit, across stations and revisions |
| Which station has the worst yield? | Station-level metadata on every run |
If your current system can't answer all six, you've got a data problem, not a test problem.
## Why Spreadsheets and File Systems Fail at Scale
Most teams start with CSV exports or shared drives. That works for 10 units. It falls apart at 1,000.
| Capability | Spreadsheets / file system | Structured database (TofuPilot) |
|---|---|---|
| Schema consistency | No. Columns drift across files | Yes. Every run follows the same model |
| Query by serial number | Manual search | Instant lookup |
| Yield over time | Build it yourself in Excel | Built-in, automatic |
| Measurement traceability | Fragile, depends on naming | Enforced by data model |
| Multi-station aggregation | Copy-paste across files | Automatic, per-station metadata |
| Concurrent access | File locks, merge conflicts | Native multi-user |
| Audit trail | None | Immutable run history |
The core issue: flat files don't enforce relationships between units, runs, steps, and measurements. Without those relationships, every query is a one-off script.
## The Test Data Model
TofuPilot organizes test data into a hierarchy that maps directly to how hardware testing works:
| Entity | What it represents | Example |
|---|---|---|
| Procedure | A test definition (what you're testing) | "FCT_PowerBoard_v2" |
| Run | A single execution of a procedure against a unit | Run #4821, serial SN-0042, PASS |
| Step | A phase or stage within a run | "measure_3v3_rail" |
| Measurement | A single data point within a step, with optional limits | 3.28 V (min: 3.13, max: 3.47) |
| Unit | A physical device identified by serial number | SN-0042 |
| Sub-unit | A component tracked within a parent unit | WiFi module WF-1122 inside SN-0042 |
Every run links to a procedure, a unit, and optionally a station. Steps and measurements nest inside runs. This structure means you can query in any direction: "show me all runs for this unit," "show me all measurements for this step across 10,000 runs," or "show me yield by station for this procedure."
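To make the hierarchy concrete, here is a minimal sketch of that data model in plain Python dataclasses. This is illustrative only, not TofuPilot's internal schema: the class and field names are assumptions chosen to mirror the table above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Measurement:
    name: str
    value: float
    units: str
    lower_limit: Optional[float] = None
    upper_limit: Optional[float] = None

    @property
    def passed(self) -> bool:
        # A measurement passes when it falls within whichever limits are set.
        if self.lower_limit is not None and self.value < self.lower_limit:
            return False
        if self.upper_limit is not None and self.value > self.upper_limit:
            return False
        return True


@dataclass
class Step:
    name: str
    measurements: list[Measurement] = field(default_factory=list)


@dataclass
class Run:
    procedure_id: str
    serial_number: str
    passed: bool
    steps: list[Step] = field(default_factory=list)


# Because relationships are explicit, you can query in any direction,
# e.g. every "rail_3v3" value across all runs of a procedure:
runs = [
    Run("FCT_PowerBoard_v2", "SN-0042", True,
        [Step("measure_3v3_rail", [Measurement("rail_3v3", 3.28, "V", 3.13, 3.47)])]),
    Run("FCT_PowerBoard_v2", "SN-0043", False,
        [Step("measure_3v3_rail", [Measurement("rail_3v3", 3.02, "V", 3.13, 3.47)])]),
]
values = [m.value
          for r in runs for s in r.steps for m in s.measurements
          if m.name == "rail_3v3"]
print(values)  # [3.28, 3.02]
```

The same traversal works the other way (all runs for one serial number, or yield per station) because every link in the hierarchy is a real reference rather than a filename convention.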
## How TofuPilot Structures Test Data Automatically
If you're using OpenHTF, TofuPilot captures the full test structure (phases, measurements, limits, attachments) with zero extra code. Just wrap your test with TofuPilot:
```python
import openhtf as htf
from openhtf.util import units

from tofupilot.openhtf import TofuPilot

# read_voltage and read_current are your own instrument I/O helpers.


@htf.measures(
    htf.Measurement("rail_3v3").in_range(3.13, 3.47).with_units(units.VOLT),
)
def measure_3v3_rail(test):
    voltage = read_voltage("3V3_RAIL")
    test.measurements.rail_3v3 = voltage


@htf.measures(
    htf.Measurement("rail_5v").in_range(4.75, 5.25).with_units(units.VOLT),
)
def measure_5v_rail(test):
    voltage = read_voltage("5V_RAIL")
    test.measurements.rail_5v = voltage


@htf.measures(
    htf.Measurement("current_draw").in_range(0.05, 0.5).with_units(units.AMPERE),
)
def measure_current_draw(test):
    current = read_current("VIN")
    test.measurements.current_draw = current


def main():
    test = htf.Test(
        measure_3v3_rail,
        measure_5v_rail,
        measure_current_draw,
        procedure_id="FCT_PowerBoard_v2",
    )
    with TofuPilot(test):
        test.execute(test_start=lambda: "SN-0042")


if __name__ == "__main__":
    main()
```

That single `TofuPilot(test)` wrapper sends the full run (phases, measurements, limits, units, outcome, duration) to TofuPilot. No serialization code, no API calls, no file management.
## Using the Python Client Directly
Not using OpenHTF? The TofuPilot Python client lets you log structured test data from any test framework or custom script:
```python
from tofupilot import TofuPilotClient

client = TofuPilotClient()

client.create_run(
    procedure_id="FCT_PowerBoard_v2",
    unit_under_test={"serial_number": "SN-0042"},
    run_passed=True,
    steps=[
        {
            "name": "measure_3v3_rail",
            "step_passed": True,
            "measurements": [
                {
                    "name": "rail_3v3",
                    "measured_value": 3.28,
                    "units": "V",
                    "lower_limit": 3.13,
                    "upper_limit": 3.47,
                }
            ],
        },
        {
            "name": "measure_5v_rail",
            "step_passed": True,
            "measurements": [
                {
                    "name": "rail_5v",
                    "measured_value": 5.01,
                    "units": "V",
                    "lower_limit": 4.75,
                    "upper_limit": 5.25,
                }
            ],
        },
        {
            "name": "measure_current_draw",
            "step_passed": True,
            "measurements": [
                {
                    "name": "current_draw",
                    "measured_value": 0.12,
                    "units": "A",
                    "lower_limit": 0.05,
                    "upper_limit": 0.5,
                }
            ],
        },
    ],
)
```

Same data model, same queryability. The client handles validation, batching, and retries.
## Built-in Analytics
TofuPilot computes FPY, Cpk, and failure Pareto charts automatically from your test data. Open the Analytics tab on any procedure to see yield trends, or create a custom Report for cross-procedure analysis.
First Pass Yield (FPY) shows the percentage of units that pass on the first attempt, plotted over time. You can filter by station, date range, or unit revision. A sudden FPY drop tells you something changed: a new component lot, a fixture problem, or a test script regression.
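The arithmetic behind FPY is simple: count only each unit's first attempt, so retests never inflate the number. A minimal sketch (the `first_pass_yield` helper and its input format are illustrative, not a TofuPilot API):

```python
def first_pass_yield(runs):
    """FPY: fraction of units whose *first* run passed.

    `runs` is a chronologically ordered list of (serial_number, passed) tuples.
    """
    first_result = {}
    for serial, passed in runs:
        first_result.setdefault(serial, passed)  # keep only the first attempt
    return sum(first_result.values()) / len(first_result)


runs = [
    ("SN-0042", True),
    ("SN-0043", False),  # failed first...
    ("SN-0043", True),   # ...passed on retest: still an FPY miss
    ("SN-0044", True),
]
print(first_pass_yield(runs))  # ≈ 0.667: 2 of 3 units passed first time
```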
Process Capability (Cpk) is calculated per measurement across all runs. TofuPilot plots the measurement distribution against your spec limits and computes Cp and Cpk automatically. A Cpk below 1.33 means your process is too close to the limits.
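Cpk is the distance from the process mean to the nearest spec limit, expressed in units of three standard deviations: Cpk = min(USL − μ, μ − LSL) / 3σ. A back-of-the-envelope version in Python (sample data invented for illustration):

```python
from statistics import mean, stdev


def cpk(values, lsl, usl):
    """Process capability: distance from the mean to the nearest
    spec limit, divided by three standard deviations."""
    mu, sigma = mean(values), stdev(values)
    return min(usl - mu, mu - lsl) / (3 * sigma)


# Tightly clustered 3.3 V rail readings against the 3.13-3.47 V limits:
rail_3v3 = [3.28, 3.30, 3.27, 3.31, 3.29, 3.30, 3.28, 3.32]
print(round(cpk(rail_3v3, lsl=3.13, usl=3.47), 2))  # well above 1.33
```

A low mean (off-center process) or a wide sigma (noisy process) both pull Cpk down, which is why it catches problems a simple pass/fail rate misses.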
Failure Pareto ranks test steps by failure count so you can focus on the biggest contributor first. TofuPilot builds this chart for any procedure, any time range.
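The underlying ranking is just a frequency count of failing steps, sorted descending. A sketch with invented failure data:

```python
from collections import Counter

# One entry per failed run: the name of the step that failed.
failed_steps = [
    "measure_3v3_rail", "measure_current_draw", "measure_3v3_rail",
    "measure_5v_rail", "measure_3v3_rail", "measure_current_draw",
]

pareto = Counter(failed_steps).most_common()
for step, count in pareto:
    print(f"{step}: {count}")
# measure_3v3_rail tops the list with 3 failures
```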
Custom Reports let you combine FPY, Cpk, and failure data across multiple procedures into a single view. Use these for weekly quality reviews or to compare yield across production lines.
## Comparison: Approaches to Test Data Management
| Capability | CSV / shared drive | Custom database | TofuPilot |
|---|---|---|---|
| Structured data model | No | You build it | Built in |
| Query by serial number | Manual | SQL queries | Instant search |
| FPY tracking | Spreadsheet formulas | You build it | Automatic |
| Cpk analysis | Export to Minitab | You build it | Built in per measurement |
| Failure Pareto | Manual sorting | You build it | Automatic |
| Multi-station support | Separate files | You build it | Native, per-station metadata |
| Unit traceability | Fragile | You build it | Full history per serial number |
| Sub-unit tracking | Not practical | You build it | Built in |
| Audit trail | None | You build it | Immutable |
| Setup time | Minutes | Weeks to months | Minutes |
| Maintenance | Low (until it breaks) | Ongoing | Zero |
The pattern is clear. You can build all of this yourself, and many teams have. But every hour spent on test infrastructure is an hour not spent on the product you're actually testing.