Concepts & Methodology

Data-Driven Manufacturing Testing

Learn how to replace opinion-based test decisions with data-driven manufacturing quality control using TofuPilot's analytics platform.

Julien Buteau
Intermediate · 10 min read · March 14, 2026

Data-Driven Manufacturing Testing with TofuPilot

Most hardware test decisions are still made on gut feeling. "I think yield dropped last week." "This component lot seems worse." "The night shift probably isn't following the procedure." TofuPilot replaces opinions with measurements, so every quality decision is backed by data.

The Problem with Opinion-Based Testing

Hardware test operations generate enormous amounts of data. But in most organizations, that data goes into local files and never gets analyzed systematically. Decisions are made based on:

  • The last failure someone remembers
  • Anecdotal reports from operators
  • Monthly summary reports that are outdated by the time they're reviewed
  • Institutional knowledge that walks out the door when engineers leave

Data-driven testing means every decision (change limits, adjust process, qualify a supplier, release a batch) is backed by actual measurement data.

What Data-Driven Testing Looks Like

Setting Test Limits

Opinion-based: "The spec says 3.3V +/- 5%, so let's use 3.135V to 3.465V as test limits."

Data-driven: Measure 1,000 units. The distribution is centered at 3.31V with a standard deviation of 0.015V. Set test limits at 3.25V to 3.37V (4-sigma). This gives a Cpk of 1.33, balancing quality with false failure rates.

calculate_limits.py
import numpy as np

# Production measurement data from TofuPilot
values = [3.31, 3.30, 3.32, 3.29, 3.31, 3.30, 3.33, 3.31, 3.28, 3.32]
# ... (1000 values in practice)

mean = np.mean(values)
std = np.std(values, ddof=1)

# Set limits at mean +/- 4 sigma for Cpk ~ 1.33
limit_low = round(mean - 4 * std, 3)
limit_high = round(mean + 4 * std, 3)

print(f"Mean: {mean:.3f} V")
print(f"Std:  {std:.4f} V")
print(f"Recommended limits: {limit_low} V to {limit_high} V")

Qualifying a New Supplier

Opinion-based: "The samples looked fine. Let's approve the supplier."

Data-driven: Run 50 units with the new supplier's components through your full test suite. Compare measurement distributions against the baseline from your current supplier.

| Measurement | Current supplier (mean) | New supplier (mean) | Shift | Verdict |
| --- | --- | --- | --- | --- |
| vcc_3v3 | 3.310 V | 3.305 V | -0.005 V | OK |
| current_idle | 42.1 mA | 48.3 mA | +6.2 mA | Investigate |
| boot_time | 340 ms | 335 ms | -5 ms | OK |

The idle-current shift stays within the test limits, but a +6.2 mA jump is far too large to be noise. Investigate the root cause before full qualification.
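A comparison like this can be scripted once the measurements for both suppliers are exported. The sketch below uses hypothetical sample arrays (in practice you would pull these from TofuPilot) and flags any mean shift larger than about one pooled standard deviation:

```python
import numpy as np

# Hypothetical idle-current samples in mA; in practice, export these from TofuPilot
baseline = np.array([42.0, 42.3, 41.9, 42.2, 42.1, 42.0, 42.4, 42.1])   # current supplier
candidate = np.array([48.1, 48.5, 48.0, 48.4, 48.3, 48.2, 48.6, 48.3])  # new supplier

shift = candidate.mean() - baseline.mean()
pooled_std = np.sqrt((baseline.var(ddof=1) + candidate.var(ddof=1)) / 2)
effect_size = shift / pooled_std  # shift expressed in pooled-sigma units

# Flag shifts larger than ~1 sigma even if every unit still passes its limits
verdict = "Investigate" if abs(effect_size) > 1 else "OK"

print(f"Mean shift:  {shift:+.1f} mA")
print(f"Effect size: {effect_size:+.1f} sigma")
print(f"Verdict:     {verdict}")
```

The 1-sigma threshold is a starting point, not a standard; tighten it for measurements that correlate strongly with field failures.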

Deciding When to Service a Fixture

Opinion-based: "It's been 6 months. Time for scheduled maintenance."

Data-driven: Track station-specific measurement variance over time. Service the fixture when variance exceeds a threshold, not on a calendar.

| Station | Variance (week 1) | Variance (week 8) | Variance (week 16) | Action |
| --- | --- | --- | --- | --- |
| STN-01 | 0.008 | 0.009 | 0.011 | OK |
| STN-02 | 0.007 | 0.012 | 0.025 | Service now |
| STN-03 | 0.009 | 0.010 | 0.010 | OK |

Station 2's variance tripled. Service it now. Stations 1 and 3 are fine. Skip their scheduled maintenance.
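This rule is easy to automate. A minimal sketch, assuming weekly per-station variance figures exported from TofuPilot, that flags any station whose variance has at least doubled since its baseline week:

```python
# Hypothetical weekly measurement variance per station, oldest to newest
variance_history = {
    "STN-01": [0.008, 0.009, 0.011],
    "STN-02": [0.007, 0.012, 0.025],
    "STN-03": [0.009, 0.010, 0.010],
}

SERVICE_RATIO = 2.0  # flag a fixture when variance doubles vs. its baseline

actions = {}
for station, history in variance_history.items():
    ratio = history[-1] / history[0]
    actions[station] = "Service now" if ratio >= SERVICE_RATIO else "OK"
    print(f"{station}: latest variance is {ratio:.1f}x baseline -> {actions[station]}")
```

Keyed on condition rather than the calendar, this skips unnecessary maintenance on healthy stations and catches degrading fixtures between scheduled visits.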

Building a Data-Driven Test Culture

Step 1: Centralize Everything

No data-driven decisions are possible when test data lives in 15 different places. TofuPilot centralizes all test results from all stations, all procedures, all sites.

centralize.py
from tofupilot import TofuPilotClient

client = TofuPilotClient()

# Every station, every test, every site pushes to TofuPilot
client.create_run(
    procedure_id="ICT-BOARD-V4",
    unit_under_test={"serial_number": "PCB-20251087"},
    run_passed=True,
    steps=[...],
)

Step 2: Define Key Metrics

Pick the metrics that matter most to your operation:

| Metric | What it tells you | How often to review |
| --- | --- | --- |
| FPY by procedure | Which test is your bottleneck | Daily |
| Cpk by measurement | Process capability | Weekly |
| Failure pareto | Top quality issues | Daily |
| Station FPY comparison | Equipment health | Weekly |
| Yield trend | Direction of quality | Daily |
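First-pass yield by procedure is the simplest of these to compute yourself. A sketch with hypothetical run records (the field names here are illustrative, not TofuPilot's export schema):

```python
from collections import Counter

# Hypothetical run records; "passed" means passed on the first attempt
runs = [
    {"procedure_id": "ICT-BOARD-V4", "passed": True},
    {"procedure_id": "ICT-BOARD-V4", "passed": True},
    {"procedure_id": "ICT-BOARD-V4", "passed": False},
    {"procedure_id": "FCT-BOARD-V4", "passed": True},
    {"procedure_id": "FCT-BOARD-V4", "passed": True},
]

totals, passes = Counter(), Counter()
for run in runs:
    totals[run["procedure_id"]] += 1
    passes[run["procedure_id"]] += run["passed"]

fpy = {proc: passes[proc] / totals[proc] for proc in totals}
for proc, value in fpy.items():
    print(f"{proc}: FPY {value:.1%}")
```

The same grouping pattern extends to per-station FPY and daily yield trends by changing the grouping key.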

Step 3: Make Decisions from Dashboards

Replace meeting-based decisions with dashboard-based decisions. Instead of a weekly quality review where someone presents slides, open TofuPilot's dashboard and look at the actual data.

Questions the dashboard answers directly:

  • "Should we release this batch?" Check FPY and measurement distributions.
  • "Is Station 3 performing?" Compare its metrics to other stations.
  • "Did the process change improve yield?" Compare before and after.
  • "Is this supplier's quality acceptable?" Compare measurement distributions.

Step 4: Automate Decisions Where Possible

Some decisions can be fully automated. Use TofuPilot's API to build automated gates:

  • Auto-release batches with FPY above 98%
  • Auto-flag stations with yield below 95%
  • Auto-reject units with any critical measurement failure
  • Auto-escalate when a new failure mode appears in the top 3
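The first two gates above can be sketched as a small decision function. The thresholds match the rules listed; the function and its inputs are illustrative, not a TofuPilot API:

```python
# Hypothetical automated batch gate; thresholds match the rules above
FPY_RELEASE_THRESHOLD = 0.98   # auto-release batches at or above this FPY
STATION_FLAG_THRESHOLD = 0.95  # auto-flag stations below this yield

def batch_gate(batch_fpy, station_yields):
    """Return automated release/flag decisions for one production batch."""
    return {
        "release": batch_fpy >= FPY_RELEASE_THRESHOLD,
        "flagged_stations": sorted(
            station for station, y in station_yields.items()
            if y < STATION_FLAG_THRESHOLD
        ),
    }

decision = batch_gate(0.983, {"STN-01": 0.97, "STN-02": 0.92})
print(decision)  # releases the batch, flags STN-02 for review
```

Wired to TofuPilot's API on a schedule, a gate like this turns routine decisions into background jobs and reserves engineering attention for the escalations.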

The Payoff

Data-driven testing doesn't require new test equipment or radical process changes. It requires centralizing the data you're already collecting and building the habit of making decisions from dashboards instead of opinions.

The typical result: yield improves 2-5% in the first quarter, not because you changed anything fundamental, but because you started seeing problems sooner and fixing them faster.
