
Validation of Medical Device in the Field
UX Researcher | Human Factors Engineer | Insights Lead

Verification & Validation Study Plan 

1) Overview

  • Device: Class II medical device (blinded model)

  • Purpose: Demonstrate that the device meets predefined performance and usability requirements in a controlled real-world environment (e.g., clinic, sports facility, rehabilitation center), with performance comparable to laboratory benchmarks.

  • Regulatory framing: Aligns with design control expectations for V&V (requirements-based verification; clinical/real-world validation), risk management per ISO 14971 principles, and human factors engineering per FDA human factors guidance and IEC 62366.

  • Program structure: Evidence collected across multiple complementary studies: bench/verification, controlled-environment validation, usability & workflow, and repeatability/reproducibility.

 

2) Requirements & Acceptance Criteria

  • Status: Requirements and acceptance criteria were predefined; each study below is assessed against them.

 

3) Multi-Study Design

 

Study 1 — Bench/Engineering Verification

  • Objective: Verify hardware and firmware meet written requirements (accuracy, response time, data integrity).

  • Design: Controlled benchtop tests using traceable standards or simulator inputs across the full specified operating range.

  • Endpoints: Accuracy vs. standard, latency, data loss rate, calibration curve fit (R²), battery/runtime specs.

  • Analysis: Requirement-by-requirement pass/fail; regression fits; tolerance stacks.
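
The calibration curve fit endpoint can be illustrated with a minimal sketch, assuming a straight-line calibration model fitted by ordinary least squares. The function name `calibration_fit`, the sample values, and the `R2_MIN` threshold are hypothetical placeholders, not values from the study.

```python
from statistics import mean

# Hypothetical acceptance threshold; the real value comes from the written requirement.
R2_MIN = 0.99

def calibration_fit(reference, measured):
    """Ordinary least-squares line through (reference, measured).

    Returns (slope, intercept, r_squared).
    """
    if len(reference) != len(measured) or len(reference) < 3:
        raise ValueError("need at least three paired points")
    mx, my = mean(reference), mean(measured)
    sxx = sum((x - mx) ** 2 for x in reference)
    sxy = sum((x - mx) * (y - my) for x, y in zip(reference, measured))
    slope = sxy / sxx
    intercept = my - slope * mx
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(reference, measured))
    ss_tot = sum((y - my) ** 2 for y in measured)
    return slope, intercept, 1 - ss_res / ss_tot

# Illustrative values only (not study data).
slope, intercept, r2 = calibration_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.1, 4.9, 7.0])
passed = r2 >= R2_MIN  # requirement-level pass/fail against the hypothetical threshold
```

A nonlinear calibration model would need a different fit; the pass/fail comparison against the requirement stays the same.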

 

Study 2 — Controlled Real-World Validation (Primary)

  • Objective: Validate device performance in a non-lab but controlled setting mimicking intended use (e.g., clinic/gym with controlled protocols).

  • Design: Prospective, cross-over comparison against a reference standard; randomized measurement order; pre-/post-calibration checks.

  • Participants: N=XX adults meeting inclusion/exclusion criteria relevant to intended users.

  • Protocol: Standardized activities/postures/workloads; environmental controls (temperature, humidity); scripted setup and calibration; triplicate measurements per condition.

  • Endpoints:

    • Accuracy/Agreement: MAPE, RMSE, Bland-Altman bias & limits of agreement.

    • Precision/Reliability: ICC(2,k), CV across repeated trials and operators.

    • Calibration Stability: Pre- vs. post-session drift.

  • Analysis: Mixed-effects models (condition, operator, order as factors), equivalence margins tied to clinical requirements.
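
The accuracy and agreement endpoints above can be sketched as follows. This is a minimal illustration, assuming one paired value per participant and condition (for example, the mean of the triplicate measurements) and nonzero reference values for MAPE. The function name `agreement_metrics` is hypothetical, and the sketch does not replace the mixed-effects analysis.

```python
import math
from statistics import mean, stdev

def agreement_metrics(device, reference):
    """MAPE, RMSE, and Bland-Altman bias with 95% limits of agreement.

    Assumes paired readings of equal length and nonzero reference values.
    """
    if len(device) != len(reference) or len(device) < 2:
        raise ValueError("need at least two paired readings")
    diffs = [d - r for d, r in zip(device, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return {
        "mape_pct": 100 * mean([abs(e) / abs(r) for e, r in zip(diffs, reference)]),
        "rmse": math.sqrt(mean([e * e for e in diffs])),
        "bias": bias,
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
    }
```

The limits of agreement here assume roughly normal differences with no proportional bias; the proportional bias check belongs to the primary analysis.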

 

Study 3 — Usability & Workflow (Human Factors Validation)

  • Objective: Confirm that representative users can set up, calibrate, operate, and troubleshoot the device safely and effectively.

  • Design: Moderated formative → summative sessions with think-aloud; realistic time pressures and distractions.

  • Participants: Intended users (e.g., clinicians, researchers, coaches).

  • Tasks: Unbox/setup, donning/positioning, calibration, data collection, basic troubleshooting, data export.

  • Endpoints: Critical task success (0 critical use errors), task completion time, error taxonomy, SUS score, NASA-TLX workload.

  • Outputs: Risk controls/usability specs; labeling/IFU updates; training guidance.
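
For the SUS endpoint, scoring follows the standard System Usability Scale rule: odd-numbered items contribute (response minus 1), even-numbered items contribute (5 minus response), and the sum is multiplied by 2.5. A minimal sketch, with the hypothetical function name `sus_score`:

```python
def sus_score(responses):
    """Score one participant's SUS questionnaire (10 items, each rated 1-5).

    Returns a value between 0 and 100.
    """
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("SUS needs ten responses, each between 1 and 5")
    total = 0
    for item, r in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (r - 1) if item % 2 == 1 else (5 - r)
    return total * 2.5
```

For example, `sus_score([3] * 10)` returns 50.0.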

 

Study 4 — Repeatability & Reproducibility (R&R)

  • Objective: Quantify variability across days, operators, and devices.

  • Design: Gage R&R with at least 2 (preferably 3) device units and operators, with repeated sessions across multiple days; analyzed with a balanced ANOVA.

  • Endpoints: %Contribution by part/operator/device/day; overall ICC; CV.

  • Acceptance: %R&R within the predefined threshold.
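
The %Contribution endpoint can be sketched with the usual ANOVA-method variance components for a balanced, crossed design. This is a simplified illustration, assuming only two factors (part and operator) with repeated readings; device unit and day would enter as additional factors in the full analysis. The function name `gage_rr_contributions` and the data layout are hypothetical.

```python
from statistics import mean

def gage_rr_contributions(data):
    """%Contribution of each variance component in a balanced crossed study.

    data[p][o] is the list of repeated readings for part p by operator o.
    """
    n_p, n_o, n_r = len(data), len(data[0]), len(data[0][0])
    if n_p < 2 or n_o < 2 or n_r < 2:
        raise ValueError("need >= 2 parts, operators, and repeats")
    if any(len(part) != n_o or any(len(c) != n_r for c in part) for part in data):
        raise ValueError("design must be balanced")

    grand = mean([x for part in data for cell in part for x in cell])
    part_means = [mean([x for cell in part for x in cell]) for part in data]
    oper_means = [mean([x for part in data for x in part[o]]) for o in range(n_o)]
    cell_means = [[mean(cell) for cell in part] for part in data]

    # Sums of squares for the two-way crossed ANOVA with interaction.
    ss_part = n_o * n_r * sum((m - grand) ** 2 for m in part_means)
    ss_oper = n_p * n_r * sum((m - grand) ** 2 for m in oper_means)
    ss_cells = n_r * sum((m - grand) ** 2 for row in cell_means for m in row)
    ss_inter = ss_cells - ss_part - ss_oper
    ss_error = sum((x - cell_means[p][o]) ** 2
                   for p in range(n_p) for o in range(n_o) for x in data[p][o])

    ms_part = ss_part / (n_p - 1)
    ms_oper = ss_oper / (n_o - 1)
    ms_inter = ss_inter / ((n_p - 1) * (n_o - 1))
    ms_error = ss_error / (n_p * n_o * (n_r - 1))

    # Variance components; negative estimates are truncated to zero.
    var = {
        "repeatability": ms_error,
        "operator": max(0.0, (ms_oper - ms_inter) / (n_p * n_r)),
        "interaction": max(0.0, (ms_inter - ms_error) / n_r),
        "part": max(0.0, (ms_part - ms_inter) / (n_o * n_r)),
    }
    total = sum(var.values())
    if total == 0:
        raise ValueError("no variation in data")
    return {name: 100 * v / total for name, v in var.items()}
```

Under these assumptions, the total gage R&R contribution is 100 minus the `part` entry, which is the figure to compare against the predefined threshold.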

 

4) Calibration & Troubleshooting Protocol

  • Calibration schedule: Initial, mid-session spot checks, end-session verification; documented with pass/fail thresholds.

  • Failure handling: Recalibrate immediately, log the root cause, and swap the device if the issue remains unresolved.

  • Drift tracking: Control charts across sessions; trigger limits for maintenance or firmware review.
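
Drift tracking with control charts can be sketched as an individuals (I-MR) chart, one common choice that the plan itself does not specify. The sketch assumes one drift value per session and estimates sigma from the average moving range divided by the standard constant 1.128 (d2 for subgroups of two). Function names and values are hypothetical.

```python
from statistics import mean

def individuals_chart_limits(baseline):
    """Return (lcl, center, ucl) for an individuals control chart.

    baseline: per-session drift values from sessions considered in control.
    """
    if len(baseline) < 2:
        raise ValueError("need at least two baseline sessions")
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    center = mean(baseline)
    sigma_hat = mean(moving_ranges) / 1.128  # d2 constant for n = 2
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat

def sessions_beyond_limits(values, lcl, ucl):
    """Indices of sessions whose drift falls outside the control limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]
```

Sessions returned by `sessions_beyond_limits` would be candidates for the maintenance or firmware review trigger.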

 

5) Data Integrity & Documentation

  • Pre-registration: Protocol and primary endpoints registered internally (or in a public registry if desired).

  • Data handling: Time-synced devices; immutable logs; audit trail; blinded analyst for primary outcomes.

  • Quality controls: Duplicate data exports, checksum verification, version-controlled analysis scripts.
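
Checksum verification of duplicate exports can be sketched as below, assuming SHA-256 as the hash (the plan does not name an algorithm). The function names are hypothetical.

```python
import hashlib

def sha256_of(path, chunk_size=65536):
    """Hex SHA-256 digest of a file, read in chunks to handle large exports."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def exports_match(primary_path, duplicate_path):
    """True when the two export files are byte-for-byte identical."""
    return sha256_of(primary_path) == sha256_of(duplicate_path)
```

Recording the digest in the audit trail alongside each export lets later copies be checked against the original.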

 

6) Statistical Plan 

  • Sample size: Powered for equivalence on primary accuracy metric (two one-sided tests) with α=0.05, 1–β=0.80, equivalence bounds tied to clinical requirements.

  • Primary analysis: Equivalence vs. reference; Bland-Altman with proportional bias check; ICC for reliability.

  • Secondary analyses: Subgroup (operator, environment band), sensitivity (exclude calibration failures), robustness (device unit effects).

  • Missing data: Predefined rules for outliers, dropouts, and device faults.
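
The equivalence approach can be illustrated with a minimal sketch, assuming paired device-minus-reference differences, a symmetric margin, a true mean difference of zero for the sample-size formula, and a large-sample normal approximation in place of the t-distribution. Function names and input values are hypothetical; the actual margin comes from the clinical requirements.

```python
import math
from statistics import NormalDist, mean, stdev

def tost_sample_size(sd_diff, margin, alpha=0.05, power=0.80):
    """Approximate number of paired differences needed for TOST.

    Assumes the true mean difference is zero and uses the normal approximation
    n = ((z_{1-alpha} + z_{1-beta/2}) * sd / margin)^2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(1 - (1 - power) / 2)
    return math.ceil(((z_alpha + z_beta) * sd_diff / margin) ** 2)

def tost_paired(diffs, margin, alpha=0.05):
    """Two one-sided tests on paired differences (z approximation).

    Returns (p_value, equivalent), where p_value is the larger one-sided p.
    """
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two differences")
    d = mean(diffs)
    se = stdev(diffs) / math.sqrt(n)
    z = NormalDist()
    p_lower = 1 - z.cdf((d + margin) / se)  # H0: mean difference <= -margin
    p_upper = z.cdf((d - margin) / se)      # H0: mean difference >= +margin
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha
```

With a hypothetical standard deviation of 2.0 and margin of 1.0, `tost_sample_size(2.0, 1.0)` gives 35 pairs. For small samples a t-based test would be the more appropriate choice.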

 

7) Risk Management & Labeling Inputs

  • HF/Risk links: Map observed use errors to FMEA; implement mitigations (UI prompts, setup guides, quick-start labels).

  • Labeling updates: Clarify calibration cadence, environmental constraints, and troubleshooting flow.

  • Training: Role-based modules; performance checklist for competency.

 

8) Deliverables

  • Verification Report (Study 1)

  • Validation Report (Study 2) with full stats & traceability to requirements

  • HF/Usability Report (Study 3) with residual risk rationale

  • R&R Report (Study 4) and manufacturing/QA recommendations

