What Is Test Observability
Test observability is the ability to understand what your test systems are doing, why units are failing, and how station performance is changing, all from the data the systems produce. It borrows from software engineering's observability concept (logs, metrics, traces) and applies it to manufacturing test. This guide covers what test observability means, how it differs from test data management, and what it looks like in practice.
Observability vs Data Management
| Aspect | Test Data Management | Test Observability |
|---|---|---|
| Focus | Storing and retrieving test records | Understanding system behavior in real time |
| Question answered | "What happened to this unit?" | "Why is yield dropping right now?" |
| Data use | Compliance, traceability, audits | Debugging, optimization, early warning |
| Timeliness | After the fact (batch reports) | Real-time (streaming dashboards) |
| Scope | Individual test results | System-wide patterns and correlations |
Test data management answers: did this unit pass? Test observability answers: why are 15% of units failing phase_voltage_check on station 3 since Tuesday?
The Three Pillars of Test Observability
Borrowing from software observability, test observability has three pillars:
1. Measurements (Metrics)
Structured measurement data with units, limits, and timestamps. This is the equivalent of application metrics in software.
| Metric | What It Shows |
|---|---|
| First pass yield per station | Station health |
| Measurement distribution per phase | Process stability |
| Cycle time per unit | Throughput efficiency |
| Failure rate per test step | Where defects concentrate |
| Marginal rate per measurement | Early warning of drift |
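Several of these metrics can be computed directly from run records. A minimal sketch, assuming a flat record shape (the `station`, `serial`, `passed`, and `failed_phase` fields are illustrative, not an OpenHTF or TofuPilot schema):

```python
from collections import defaultdict

# Hypothetical run records, one per test run, oldest first. The field
# names are illustrative, not an OpenHTF or TofuPilot schema.
runs = [
    {"station": "ST-1", "serial": "A1", "passed": True,  "failed_phase": None},
    {"station": "ST-1", "serial": "A2", "passed": False, "failed_phase": "phase_voltage_check"},
    {"station": "ST-2", "serial": "A3", "passed": True,  "failed_phase": None},
    {"station": "ST-2", "serial": "A4", "passed": False, "failed_phase": "phase_voltage_check"},
    {"station": "ST-2", "serial": "A5", "passed": False, "failed_phase": "phase_signal_check"},
]

def first_pass_yield(runs):
    """First pass yield per station: first-run passes / first-run totals."""
    totals, passes = defaultdict(int), defaultdict(int)
    seen = set()
    for run in runs:
        if run["serial"] in seen:
            continue  # retests do not count toward first pass yield
        seen.add(run["serial"])
        totals[run["station"]] += 1
        passes[run["station"]] += run["passed"]
    return {s: passes[s] / totals[s] for s in totals}

def failure_pareto(runs):
    """Failure count per test step, ranked by frequency."""
    counts = defaultdict(int)
    for run in runs:
        if run["failed_phase"]:
            counts[run["failed_phase"]] += 1
    return sorted(counts.items(), key=lambda kv: -kv[1])
```

The same aggregation pattern extends to cycle time and marginal rate once those fields are present in the record.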
2. Test Records (Logs)
Detailed records of every test run: serial number, phases executed, measurements collected, pass/fail decisions, timestamps, and errors.
| Record Field | Why It Matters |
|---|---|
| Serial number | Links test data to the physical unit |
| Phase sequence | Shows exactly what ran and in what order |
| Measurement values | The actual data for analysis |
| Error messages | What went wrong when a phase failed |
| Station ID | Which equipment ran the test |
| Firmware/software version | Which test script version was used |
3. Traces (Correlation)
The ability to trace a unit's journey across test stages: IQC, assembly, FCT, EOL, ORT. Traces connect upstream events to downstream outcomes.
| Trace | What It Reveals |
|---|---|
| Unit history | Every test this serial number has ever been through |
| Lot traceability | All units from the same component lot |
| Station correlation | Whether failures cluster on specific stations |
| Temporal correlation | Whether failures cluster at specific times |
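In practice, station and lot correlation often reduce to joining failure events against genealogy data and checking whether one value dominates each dimension. A small sketch, with assumed field names:

```python
from collections import Counter

# Hypothetical failure events joined with component genealogy; the
# field names are assumptions for illustration.
failures = [
    {"serial": "A2", "station": "ST-3", "component_lot": "LOT-77"},
    {"serial": "A7", "station": "ST-3", "component_lot": "LOT-77"},
    {"serial": "B1", "station": "ST-1", "component_lot": "LOT-75"},
    {"serial": "B4", "station": "ST-3", "component_lot": "LOT-77"},
]

# Count failures along each candidate dimension; a dominant value
# suggests the failures cluster on that station or component lot.
by_station = Counter(f["station"] for f in failures)
by_lot = Counter(f["component_lot"] for f in failures)
```

The same grouping over a timestamp field (by hour or shift) gives the temporal correlation in the last row.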
What Observable Test Systems Look Like
Without Observability
The test engineer gets a call: "yield dropped on line 2." They walk to the station, review logs, export data to Excel, build charts, and three hours later identify the root cause.
With Observability
The test engineer sees an alert: "yield on station 3 dropped below 90% in the last hour. Top failure: phase_voltage_check. Measurement distribution shifted 200mV compared to yesterday." They open the dashboard, see the shift, correlate it with a supplier lot change at IQC, and contain the issue in minutes.
Prerequisites
- Python 3.10+
- OpenHTF installed (`pip install openhtf`)
- TofuPilot Python SDK installed (`pip install tofupilot`)
Step 1: Instrument Your Tests
Every test phase should produce structured measurements. This is the foundation of observability.
```python
import openhtf as htf
from openhtf.util import units


@htf.measures(
    htf.Measurement("supply_voltage_V")
    .in_range(minimum=4.9, maximum=5.1)
    .with_units(units.VOLT),
    htf.Measurement("supply_current_mA")
    .in_range(minimum=90, maximum=110)
    .with_units(units.MILLIAMPERE),
)
def phase_power_check(test):
    """Measure and validate power supply characteristics."""
    test.measurements.supply_voltage_V = 5.01
    test.measurements.supply_current_mA = 98.5


@htf.measures(
    htf.Measurement("signal_amplitude_V")
    .in_range(minimum=1.8, maximum=2.2)
    .with_units(units.VOLT),
)
def phase_signal_check(test):
    """Measure output signal amplitude."""
    test.measurements.signal_amplitude_V = 2.01
```

Step 2: Stream to TofuPilot
TofuPilot provides the observability layer. Every run streams in real time with full measurement detail.
```python
from tofupilot.openhtf import TofuPilot

test = htf.Test(
    phase_power_check,
    phase_signal_check,
)

with TofuPilot(test):
    test.execute(test_start=lambda: input("Scan serial: "))
```

Step 3: Monitor and Respond
Once runs are streaming, TofuPilot surfaces them through these observability tools:
| Tool | What It Shows |
|---|---|
| Yield dashboard | Real-time FPY across stations and procedures |
| Failure Pareto | Which test steps fail most, ranked by frequency |
| Measurement distributions | Histograms with limit overlays for every measurement |
| Control charts | SPC charts detecting out-of-control conditions |
| Station comparison | Side-by-side performance across stations |
| Unit history | Full test trace for any serial number |
| Alerts | Notifications when yield drops below threshold |
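The alerting behavior in the last row can be sketched as a simple windowed threshold check. This is an illustration of the idea, not TofuPilot's implementation; the window shape and threshold are assumptions.

```python
def check_yield_alert(window_runs, threshold=0.90):
    """Return an alert message when windowed yield falls below threshold.

    `window_runs` is a list of pass/fail booleans for the recent window
    (e.g. the last hour); shape and threshold are illustrative.
    """
    if not window_runs:
        return None
    y = sum(window_runs) / len(window_runs)
    if y < threshold:
        return f"yield {y:.0%} below {threshold:.0%} in the current window"
    return None
```

Evaluating this check on a sliding window turns stored results into the "is something going wrong right now?" signal described in the maturity table below.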
Observability Maturity Levels
| Level | Capability | Question Answered |
|---|---|---|
| 1. Logging | Test results are stored | "Did this unit pass?" |
| 2. Monitoring | Dashboards show yield and throughput | "How is the line performing?" |
| 3. Alerting | Notifications on yield drops or measurement drift | "Is something going wrong right now?" |
| 4. Debugging | Drill down into failures, correlate across stations and lots | "Why is this happening?" |
| 5. Predicting | Use patterns to forecast future failures | "What will happen next?" |
Most manufacturing operations are at level 1 or 2. Levels 3-4 are where test observability delivers the highest ROI. Level 5 is where predictive quality begins.
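As an example of what level 3 looks like in code, a crude mean-shift check can flag measurement drift against a baseline window. This is a simplified stand-in for real SPC rules (such as the Western Electric rules), shown only to make the idea concrete.

```python
import statistics

def shift_alert(baseline, recent, n_sigma=3.0):
    """Flag a mean shift of the recent window versus the baseline.

    Compares the recent mean against the baseline mean, scaled by the
    standard error of the recent window. A toy SPC-style check, not a
    full control-chart implementation.
    """
    mu = statistics.fmean(baseline)
    sigma = statistics.pstdev(baseline)
    if sigma == 0:
        return False  # no baseline variation to judge against
    return abs(statistics.fmean(recent) - mu) > n_sigma * sigma / len(recent) ** 0.5
```

Applied to the scenario above, a 200 mV shift in `phase_voltage_check` measurements would trip this check long before yield visibly collapses.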