Low first pass yield costs more than scrap. It means retest cycles, debug time, and late shipments. This guide covers practical strategies to find what's failing, why it's failing, and how to fix it using production data from TofuPilot.
## Prerequisites
- A TofuPilot account with test runs already flowing in
- OpenHTF test scripts with measurements and limits defined
- At least a few hundred runs for meaningful analysis
## Step 1: Understand Where You're Losing Yield
Before fixing anything, you need to know what's actually failing. Most teams have a gut feeling about their worst tests, but the data usually tells a different story.
TofuPilot's Analytics tab shows your FPY over time, broken down by procedure. Start there. Look for:
- Procedures with FPY below 95%: these are your top targets
- FPY drops that correlate with a date: usually a process change, new component lot, or test script update
- Stations with lower FPY than others running the same procedure: points to fixturing or calibration issues
The Pareto chart on the Analytics tab ranks failure modes by frequency. This is where you focus. Fixing the top two or three failure modes typically recovers most of your lost yield.
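If you also export run data, the same ranking is a few lines of code. A minimal sketch, assuming each exported run is a dict with a `failed_phase` key naming the first failing phase (the export shape here is an assumption for illustration, not TofuPilot's actual schema):

```python
from collections import Counter

def pareto_failures(runs):
    """Rank failure modes by frequency, with cumulative share of all failures.

    Assumes each run is a dict whose "failed_phase" is the name of the
    first failing phase, or None for a passing run (hypothetical format).
    """
    counts = Counter(r["failed_phase"] for r in runs if r["failed_phase"])
    total = sum(counts.values())
    ranked, cumulative = [], 0
    for phase, n in counts.most_common():
        cumulative += n
        # Cumulative share shows how much of your yield loss the top modes explain
        ranked.append((phase, n, cumulative / total))
    return ranked

runs = [
    {"failed_phase": "tx_power_dbm"}, {"failed_phase": "tx_power_dbm"},
    {"failed_phase": "vdd_3v3"}, {"failed_phase": None},
]
print(pareto_failures(runs))
```

The cumulative column is the point of a Pareto: once it crosses ~80%, everything below that row can wait.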
## Step 2: Run Root Cause Analysis on Top Failures
Once you've identified your worst-performing measurements, dig into the data. There are three common root cause categories.
### Test limit issues
Limits that are too tight cause false failures. Limits that are too loose let bad units through. Both hurt yield, just in different ways.
TofuPilot's control charts show measurement trends with 3-sigma limits automatically. Use the Cpk view to identify measurements with poor process capability. A Cpk below 1.0 means your process spread is wider than your spec limits, and you'll keep failing units until you fix either the process or the limits.
| Cpk Range | Interpretation | Action |
|---|---|---|
| < 1.0 | Process not capable | Widen limits or improve process |
| 1.0 - 1.33 | Barely capable | Monitor closely, plan improvement |
| 1.33 - 1.67 | Capable | Acceptable for most production |
| > 1.67 | Highly capable | Consider tightening limits |
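Cpk is the distance from the process mean to the nearer spec limit, in units of three standard deviations: Cpk = min(USL − μ, μ − LSL) / 3σ. A small sketch for checking it yourself from exported readings (the sample values are illustrative):

```python
import statistics

def cpk(values, lsl, usl):
    """Process capability index: distance from the mean to the nearer
    spec limit, expressed in units of 3 standard deviations."""
    mean = statistics.fmean(values)
    sigma = statistics.stdev(values)
    return min(usl - mean, mean - lsl) / (3 * sigma)

# Tight cluster around nominal: highly capable against a 3.3V +/- 5% spec
tight = [3.29, 3.30, 3.31, 3.30, 3.30]
# Wide spread: the same spec window can no longer contain the process
wide = [3.20, 3.30, 3.40, 3.25, 3.35]
print(cpk(tight, lsl=3.135, usl=3.465))
print(cpk(wide, lsl=3.135, usl=3.465))
```

Note that Cpk penalizes off-center means as well as wide spreads, so a shifted-but-tight distribution can still score poorly.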
### Process issues
If a measurement's distribution shifts over time or between stations, that's a process issue, not a limit issue. Common causes:
- Component lot variation (especially passives and connectors)
- Fixture wear or contamination
- Environmental changes (temperature, humidity in the test area)
- Operator-dependent steps with inconsistent execution
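A station-to-station shift is easy to screen for once you have per-station readings. A sketch, assuming you can export (station, value) pairs for one measurement (the export shape and the threshold are assumptions to adapt):

```python
import statistics
from collections import defaultdict

def flag_shifted_stations(records, threshold=3.0):
    """Flag stations whose mean sits more than `threshold` within-station
    standard deviations away from the typical (median) station mean.

    `records` is a list of (station, value) pairs -- a hypothetical
    export format, not TofuPilot's actual schema.
    """
    by_station = defaultdict(list)
    for station, value in records:
        by_station[station].append(value)
    means = {s: statistics.fmean(v) for s, v in by_station.items()}
    # Typical within-station noise, and a robust reference across stations
    sigma = statistics.fmean(statistics.stdev(v) for v in by_station.values())
    reference = statistics.median(means.values())
    return {s: (m - reference) / sigma
            for s, m in means.items()
            if abs(m - reference) > threshold * sigma}

records = [("A", 3.30), ("A", 3.31), ("A", 3.29),
           ("B", 3.40), ("B", 3.41), ("B", 3.39),
           ("C", 3.30), ("C", 3.29), ("C", 3.31)]
print(flag_shifted_stations(records))
```

A station flagged here with a tight spread points at fixturing or calibration; a station flagged with a wide spread points at the station's measurement chain.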
### Test script issues
Sometimes the test itself is the problem. Flaky measurements, missing settling time, or incorrect instrument configuration can all show up as yield loss. Look for measurements with bimodal distributions or high variance that doesn't correlate with the DUT.
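A quick way to separate test-system noise from real unit-to-unit variation is to re-read a single DUT repeatedly without re-fixturing: any spread you see is the measurement, not the population. A sketch with a simulated noisy instrument standing in for real hardware (the `read_fn` callable and the simulation are assumptions):

```python
import random
import statistics

def repeatability(read_fn, n=10):
    """Gauge test-system noise by re-reading one DUT without re-fixturing.

    If the spread across repeated reads is a large fraction of the spec
    window, the measurement -- not the DUT population -- is driving failures.
    `read_fn` is any zero-argument callable returning one reading.
    """
    readings = [read_fn() for _ in range(n)]
    return statistics.fmean(readings), statistics.stdev(readings)

# Hypothetical usage against a simulated noisy voltage reading
random.seed(0)
mean, sigma = repeatability(lambda: 3.30 + random.gauss(0, 0.02))
spec_window = 3.465 - 3.135
print(f"measurement noise spans {6 * sigma / spec_window:.0%} of the spec window")
```

As a rule of thumb, if the 6-sigma noise band eats more than about 10% of the spec window, fix the measurement before touching limits.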
## Step 3: Refine Your Measurement Limits
Good limits come from a combination of the component datasheet, your design margins, and actual production data. Here's the workflow.
### Start with datasheet limits
Your initial limits should reflect the design intent. Pull minimum and maximum values from the relevant datasheets and your design specs.
```python
import openhtf as htf
from openhtf.util import units

@htf.measures(
    htf.Measurement("vdd_3v3")
    .with_units(units.VOLT)
    .in_range(minimum=3.135, maximum=3.465)  # 3.3V +/- 5% from regulator datasheet
    .doc("Main 3.3V rail, measured at C12"),
    htf.Measurement("vdd_1v8")
    .with_units(units.VOLT)
    .in_range(minimum=1.746, maximum=1.854)  # 1.8V +/- 3% from LDO datasheet
    .doc("Core 1.8V rail, measured at C45"),
)
def measure_power_rails(test):
    # Instrument reads go here
    test.measurements.vdd_3v3 = read_voltage("3V3_TP")
    test.measurements.vdd_1v8 = read_voltage("1V8_TP")
```

### Tighten with production data
After collecting a few hundred units, check the actual distribution in TofuPilot's control charts. If your measurements cluster tightly around the nominal with Cpk > 1.67, you've got room to tighten. Tighter limits catch marginal units before they become field returns.
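One way to pick candidate limits is to invert the Cpk formula: choose the symmetric window around the observed mean that would leave the current distribution at your target Cpk. A sketch, with the caveat that the suggested limits are only usable if they stay inside the datasheet limits:

```python
import statistics

def limits_for_cpk(values, target_cpk=1.67):
    """Suggest symmetric limits around the observed mean that would put
    the current distribution at exactly `target_cpk`.

    Only adopt these if they fall inside the datasheet limits; production
    data can justify tightening, never loosening past design intent.
    """
    mean = statistics.fmean(values)
    sigma = statistics.stdev(values)
    half_window = 3 * sigma * target_cpk  # from Cpk = (USL - mean) / (3 * sigma)
    return mean - half_window, mean + half_window

readings = [3.29, 3.30, 3.31, 3.30, 3.30]  # illustrative vdd_3v3 data
low, high = limits_for_cpk(readings)
print(low, high)
```

With a tight distribution like this, the suggested window lands well inside the original 3.135-3.465 V datasheet limits, so tightening is justified.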
### Add marginal limits for early warning
OpenHTF supports marginal limits that flag units as "marginal pass" without outright failing them. This gives you early warning when a measurement is drifting toward spec limits.
```python
import openhtf as htf
from openhtf.util import units

@htf.measures(
    htf.Measurement("vdd_3v3")
    .with_units(units.VOLT)
    .in_range(
        minimum=3.135, maximum=3.465,
        marginal_minimum=3.168, marginal_maximum=3.432,  # +/- 4% band inside the +/- 5% spec
    )
    .doc("Main 3.3V rail with marginal detection"),
)
def measure_power_rails(test):
    test.measurements.vdd_3v3 = read_voltage("3V3_TP")
```

Marginal units pass but get flagged in TofuPilot. When marginal rates climb, you know a process shift is underway before it starts causing hard failures.
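Tracking that leading indicator is straightforward arithmetic. A sketch using a simplified per-run outcome list ("PASS", "MARGINAL", "FAIL") as a stand-in for real run data (the outcome encoding and the 1.5x alert ratio are assumptions):

```python
def marginal_rate(outcomes):
    """Fraction of passing units that were flagged marginal.

    `outcomes` is a list of "PASS", "MARGINAL", or "FAIL" strings --
    a simplified stand-in for per-run outcome data.
    """
    passing = [o for o in outcomes if o != "FAIL"]
    if not passing:
        return 0.0
    return sum(o == "MARGINAL" for o in passing) / len(passing)

this_week = ["PASS"] * 90 + ["MARGINAL"] * 8 + ["FAIL"] * 2
last_week = ["PASS"] * 95 + ["MARGINAL"] * 3 + ["FAIL"] * 2
if marginal_rate(this_week) > 1.5 * marginal_rate(last_week):
    print("marginal rate climbing: investigate before hard failures follow")
```

Note that FPY is identical in both weeks here; the marginal rate is the only signal that something moved.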
## Step 4: Improve Test Structure for Reliable Results
Flaky tests kill yield numbers. A few structural practices make a big difference.
### Separate measurements cleanly
Each measurement should test one thing. If a single phase measures five different voltages, a failure in any one of them makes it harder to identify the root cause.
```python
import openhtf as htf
from openhtf.util import units
import time

# Good: one measurement per phase, clear naming
@htf.measures(
    htf.Measurement("supply_current_idle")
    .with_units(units.AMPERE)
    .in_range(minimum=0.010, maximum=0.025)
    .doc("Board idle current draw at 3.3V input"),
)
def measure_idle_current(test):
    set_load("idle")
    time.sleep(0.5)  # Settling time matters
    test.measurements.supply_current_idle = read_current("ISENSE")

@htf.measures(
    htf.Measurement("supply_current_active")
    .with_units(units.AMPERE)
    .in_range(minimum=0.080, maximum=0.150)
    .doc("Board current draw during active processing"),
)
def measure_active_current(test):
    set_load("active")
    time.sleep(1.0)  # Active mode needs longer settling
    test.measurements.supply_current_active = read_current("ISENSE")
```

### Add settling time and retries for noisy measurements
If a measurement is inherently noisy (RF power, current draw during transitions), average multiple readings or add a retry with a short delay. Don't just hope for the best.
```python
import openhtf as htf
import time

@htf.measures(
    htf.Measurement("tx_power_dbm")
    .in_range(minimum=18.0, maximum=22.0)
    .doc("Transmit power at 2.4 GHz, averaged over 5 readings"),
)
def measure_tx_power(test):
    readings = []
    for _ in range(5):
        readings.append(read_rf_power("TX_OUT"))
        time.sleep(0.1)
    test.measurements.tx_power_dbm = sum(readings) / len(readings)
```

### Use descriptive measurement names
Names like `m1`, `test_3`, or `voltage` make it impossible to do root cause analysis at scale. Use names that describe what's being measured and where.
| Bad Name | Good Name |
|---|---|
| `voltage_1` | `vdd_3v3_at_c12` |
| `test_pass` | `wifi_association_2g4` |
| `current` | `supply_current_idle` |
| `temp` | `pcb_temp_post_burn_in` |
## Step 5: Build a Continuous Improvement Workflow
Improving FPY isn't a one-time project. It's a weekly practice.
### Weekly review process
- Check FPY trend in TofuPilot's Analytics tab. Is it improving, flat, or declining?
- Review the Pareto chart for top failure modes. Have the top failures changed since last week?
- Inspect control charts for any measurements showing drift or increased variance
- Update limits if production data shows Cpk values that justify tightening or loosening
- Track marginal rates as a leading indicator of future yield loss
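The FPY trend in step 1 of this review can also be reproduced from raw run history: for each serial number, only the first attempt counts, and FPY is the fraction of first attempts that pass. A sketch, assuming runs export as (serial, date, passed) tuples sorted by time (a hypothetical shape, not TofuPilot's actual schema):

```python
from collections import defaultdict
from datetime import date

def weekly_fpy(runs):
    """First pass yield per ISO week: of the units first tested that week,
    the fraction that passed on their first attempt.

    `runs` is a time-ordered list of (serial, day, passed) tuples --
    an assumed export shape for illustration.
    """
    first_attempt = {}
    for serial, day, passed in runs:
        if serial not in first_attempt:  # only the first run per unit counts
            first_attempt[serial] = (day, passed)
    weeks = defaultdict(lambda: [0, 0])
    for day, passed in first_attempt.values():
        week = day.isocalendar()[:2]  # (ISO year, ISO week number)
        weeks[week][0] += passed
        weeks[week][1] += 1
    return {week: passes / total for week, (passes, total) in weeks.items()}

runs = [
    ("SN001", date(2024, 3, 4), True),
    ("SN002", date(2024, 3, 5), False),
    ("SN002", date(2024, 3, 5), True),   # retest: excluded from FPY
    ("SN003", date(2024, 3, 6), True),
]
print(weekly_fpy(runs))
```

Deduplicating by serial number is the part teams most often get wrong: counting retests inflates FPY and hides exactly the rework cost this metric exists to expose.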
### When to act vs. when to monitor
| Signal | Action |
|---|---|
| FPY drops > 2% in a week | Investigate immediately |
| New failure mode enters top 3 | Root cause within 48 hours |
| Cpk drops below 1.33 | Plan process improvement |
| Marginal rate increases > 50% | Investigate component lots or fixture |
| FPY stable above target | Monitor weekly, no action needed |
## Summary
Improving FPY follows a repeatable pattern: identify the top failures with Pareto analysis, run root cause analysis using TofuPilot's control charts and Cpk data, refine limits based on production distributions, and monitor weekly for regressions. The teams that sustain high yield are the ones that treat this as a continuous loop, not a one-time fix.