Tomco USA · Technical Whitepaper · TW-2025-002 Autonomy · Vol. 1, No. 2 · Jun 2025

Safety Assurance Framework for Physical AI Systems

Aligning ISO/PAS 8800:2025, UL 4600:2023, and ISO 21448 SOTIF for Autonomous Ground Vehicles and Robotics Platforms

¹Tomco USA, Detroit, MI
Abstract — Physical AI systems — autonomous vehicles, collaborative robots, surgical platforms — must satisfy safety certification regimes built for deterministic embedded software. This paper proposes a unified assurance framework that maps ISO/PAS 8800:2025, UL 4600:2023, and ISO 21448 SOTIF onto a shared evidence structure, with worked examples in L3 highway automation and a surgical robotic arm.
Index Terms — physical AI, autonomous vehicles, ISO/PAS 8800, UL 4600, SOTIF, safety assurance case, operational design domain, safety contracts, neural network inference, EU AI Act.

1. Introduction

Physical AI systems — machines that perceive their environment, reason over learned representations, and act in the physical world — represent the fastest-growing category of safety-critical embedded systems. Autonomous vehicles at SAE Level 3 and above [10], collaborative industrial robots, and surgical automation platforms all share a common challenge: their core decision-making capability is implemented not as a verifiable deterministic algorithm but as a trained neural network whose behaviour is characterised statistically, not symbolically.

The safety standards ecosystem has responded with three complementary frameworks: ISO/PAS 8800:2025 [1] addresses AI-specific safety properties for road vehicles; UL 4600:2023 [2] provides a principles-based safety case structure for any autonomous product; and ISO 21448:2022 SOTIF [3] targets hazards arising from the intended functionality of perception and decision systems — hazards that are invisible to conventional functional safety analysis because no fault is present.

These standards were developed largely in parallel and share overlapping concerns but non-identical terminology and evidence structures. Organisations implementing all three — as required by several OEM supply chain agreements and the EU AI Act high-risk classification — face significant duplication of evidence artefacts. This paper presents a unified assurance framework that maps all three standards onto a shared evidence base, reducing compliance overhead while satisfying each standard's unique requirements.

2. Standards Landscape and Regulatory Context

ISO/PAS 8800:2025 is structured in eight clauses covering AI system development, data management, V&V strategy, and operational monitoring. Its central construct is the AI Safety Concept (AISC) — a structured argument that the AI system's contribution to residual risk is acceptable given the vehicle's functional safety concept [1]. Unlike ISO 26262, ISO/PAS 8800 does not define integrity levels; instead it defines properties (e.g., accuracy, robustness, explainability) that the AI system must demonstrate within its operational design domain (ODD).

UL 4600:2023 is principles-based and system-agnostic. Its safety case structure follows the Goal Structuring Notation (GSN) and requires explicit claims about: (1) absence of known unsafe conditions, (2) adequate performance within ODD, (3) safe behaviour at ODD boundaries, and (4) continuous operational monitoring and update governance [2]. UL 4600 does not mandate specific V&V methods but requires that the chosen methods be justified relative to the risk level.

ISO 21448:2022 SOTIF addresses four scenario categories: (1) known/unsafe, (2) known/safe, (3) unknown/unsafe, and (4) unknown/safe. The standard's primary objective is reducing the probability of Category 3 events — unknown unsafe scenarios — to an acceptable level through a structured triggering conditions analysis and scenario generation methodology [3]. SOTIF analysis is complementary to HARA under ISO 26262: where HARA identifies hazards from component failure, SOTIF identifies hazards from performance limitations under nominal operation.
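The four-way split above is a 2×2 grid over two axes: whether a scenario is known, and whether it is safe. A minimal sketch of that classification, using the category numbering from the text (enum and function names are illustrative, not from the standard):

```python
from enum import Enum

class SotifCategory(Enum):
    """ISO 21448 scenario categories, numbered as in the text above."""
    KNOWN_UNSAFE = 1    # identified hazardous scenarios: mitigate by design
    KNOWN_SAFE = 2      # identified scenarios shown to be acceptable
    UNKNOWN_UNSAFE = 3  # residual risk: the primary reduction target
    UNKNOWN_SAFE = 4    # unidentified but benign scenarios

def classify_scenario(known: bool, safe: bool) -> SotifCategory:
    """Map a scenario's (known, safe) status onto the SOTIF 2x2 grid."""
    if known:
        return SotifCategory.KNOWN_SAFE if safe else SotifCategory.KNOWN_UNSAFE
    return SotifCategory.UNKNOWN_SAFE if safe else SotifCategory.UNKNOWN_UNSAFE
```

The SOTIF process can then be read as moving scenarios out of Category 3: triggering condition analysis discovers unknown/unsafe scenarios (making them Category 1), and mitigations move them to Category 2.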

Table I. Standards scope comparison for physical AI safety certification. ODD = Operational Design Domain; AISC = AI Safety Concept; GSN = Goal Structuring Notation.
Attribute | ISO/PAS 8800:2025 | UL 4600:2023 | ISO 21448:2022 SOTIF
Domain | Road vehicles | Any autonomous product | Road vehicles
AI-specific provisions | Yes — Clause 5 data governance, Clause 6 V&V | Principles-based, technology-agnostic | No (perception-focused, not AI-specific)
Integrity levels | None defined — properties-based | None defined — risk-proportionate | None defined
ODD definition required | Yes — mandatory input to AISC | Yes — boundary conditions required | Yes — SOTIF analysis scoped to ODD
Safety case format | AISC (structured argument) | GSN safety case | Evidence package (no specific format)
Regulatory link | EU AI Act high-risk (Annex III) | Referenced by NHTSA AV guidance | UNECE WP.29 ALKS Regulation 157

3. Operational Design Domain as Unified Safety Boundary

All three standards converge on the ODD as the primary safety boundary for physical AI systems. The ODD defines the environmental, geographic, and temporal conditions within which the system is designed to operate safely. Safety arguments are valid only within the ODD; at ODD boundaries, the system must execute a minimal risk condition (MRC) — typically a controlled halt or handover to a human operator.

Formally specifying the ODD is a systems engineering challenge. INCOSE's Systems Engineering Handbook [9] provides a general template for operational context specification; SAE J3016 [10] provides driving-domain-specific ODD attributes, including speed range, road type, weather conditions, and presence of vulnerable road users. For robotics applications, IEC 61508 [11] hazardous event analysis is adapted to define the ODD in terms of workspace geometry, payload range, and permitted interaction zones.
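An ODD specification of this kind can be captured as a machine-checkable data structure with a single membership predicate. A minimal sketch using SAE J3016-style attributes — the field names and bounds here are assumptions for illustration, not normative values:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HighwayOdd:
    """Illustrative highway-automation ODD; bounds are placeholders."""
    max_speed_kmh: float = 130.0
    min_luminance_lux: float = 10.0
    max_luminance_lux: float = 100_000.0
    road_types: frozenset = frozenset({"motorway", "limited_access_highway"})

    def contains(self, speed_kmh: float, luminance_lux: float,
                 road_type: str) -> bool:
        """True iff current conditions lie inside the ODD; outside it,
        the system must transition to its minimal risk condition (MRC)."""
        return (
            speed_kmh <= self.max_speed_kmh
            and self.min_luminance_lux <= luminance_lux <= self.max_luminance_lux
            and road_type in self.road_types
        )
```

A single `contains` predicate like this gives all three standards' safety arguments the same explicit boundary: the safety case claims hold exactly where it returns true.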

4. Safety Contracts for Neural Network Inference

A safety contract is a formal specification of the pre-conditions and post-conditions of a software component, such that if pre-conditions are satisfied, the component guarantees its post-conditions. For neural network inference modules, safety contracts address the question that neither ISO 26262 nor conventional V&V methods can answer directly: 'Under what input conditions is this network's output guaranteed to remain within an acceptable error bound?' [8]

The safety contract for a perception network operating within the ODD might specify: (Pre) input image luminance ∈ [10, 100,000] lux; camera contamination index < 0.3; vehicle speed ≤ 130 km/h. (Post) object detection recall for Class A objects (vehicles, pedestrians) ≥ 99.5 % at distances 5–150 m; false positive rate ≤ 0.1 per km; maximum detection latency ≤ 80 ms. These bounds are derived from SOTIF triggering condition analysis and validated through scenario-based testing across the ODD envelope.
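The example contract above can be written down as two predicates over named bounds. A minimal sketch — the numeric bounds are copied from the text, while the class and attribute names are illustrative:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PerceptionContract:
    """Machine-checkable form of the example perception contract."""
    # Pre-conditions: what the platform guarantees about inputs
    min_lux: float = 10.0
    max_lux: float = 100_000.0
    max_contamination: float = 0.3
    max_speed_kmh: float = 130.0
    # Post-conditions: what the network guarantees if pre-conditions hold
    min_recall: float = 0.995       # Class A recall at 5-150 m
    max_fp_per_km: float = 0.1
    max_latency_ms: float = 80.0

    def pre(self, lux: float, contamination: float, speed_kmh: float) -> bool:
        """Are the input conditions inside the contracted envelope?"""
        return (self.min_lux <= lux <= self.max_lux
                and contamination < self.max_contamination
                and speed_kmh <= self.max_speed_kmh)

    def post(self, recall: float, fp_per_km: float, latency_ms: float) -> bool:
        """Do the measured outputs meet the contracted bounds?"""
        return (recall >= self.min_recall
                and fp_per_km <= self.max_fp_per_km
                and latency_ms <= self.max_latency_ms)
```

Note that `post` is stated over measured quantities (recall, false positives per km), which in practice are estimated offline over the scenario test suite rather than observed per-frame; per-frame runtime checks use proxies such as latency.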

4.1 Contract Monitoring at Runtime

Safety contracts are enforced at runtime by an independent contract monitor — a deterministic software component (ASIL B or higher under ISO 26262) that evaluates pre-condition satisfaction before passing inputs to the neural network and post-condition satisfaction before acting on its outputs. When a pre-condition violation is detected (e.g., input luminance below threshold), the contract monitor triggers the MRC without invoking the neural network. This architecture decouples the safety argument from the AI model's internal behaviour.
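The control flow of the contract monitor is deliberately simple, which is what makes it amenable to ASIL-rated development. A minimal sketch of the pattern, with all predicates and actions supplied by the caller (names are illustrative):

```python
def monitored_inference(pre_ok, infer, post_ok, trigger_mrc, frame):
    """Contract-monitor pattern: gate inputs on the pre-condition, gate
    outputs on the post-condition, fall back to the MRC on any violation.
    The monitor itself only evaluates predicates and routes control."""
    if not pre_ok(frame):
        # Pre-condition violated: request the MRC without invoking the network.
        return trigger_mrc("pre-condition violation")
    output = infer(frame)
    if not post_ok(output):
        # Post-condition violated: discard the output and request the MRC.
        return trigger_mrc("post-condition violation")
    return output
```

Because the monitor never inspects the network's internals, the safety argument rests only on the (deterministic, testable) predicates and the MRC path — not on the trained model.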

5. Unified Evidence Structure

The unified assurance framework maps each standard's required evidence to a common artefact taxonomy. Four artefact categories are defined: (1) Context artefacts — ODD specification, HARA, SOTIF functional analysis; (2) Design artefacts — AI Safety Concept, software architecture, safety contracts; (3) Verification artefacts — test datasets, simulation results, V&V reports, DFA; (4) Operational artefacts — monitoring data, incident reports, update governance records.

Table II. Artefact-to-standard mapping for the unified evidence structure. A single artefact may satisfy requirements across multiple standards, reducing documentation overhead.
Artefact | ISO/PAS 8800 | UL 4600 | ISO 21448 SOTIF | EU AI Act
ODD Specification | Clause 5.2 (mandatory input) | Req. 4.1.1 (boundary conditions) | Clause 5.3 (functional analysis scope) | Article 13 (transparency obligations)
AI Safety Concept | Clause 7 (central deliverable) | Safety case top-level claim | Contributes to SOTIF safety case | Article 9 (risk management system)
Data Governance Record | Clause 5 (mandatory) | Req. 4.3 (data provenance) | Training data quality evidence | Article 10 (data governance, mandatory)
Safety Contracts | Part of AI component spec | Performance claim support | Triggering condition mitigations | Article 9 (technical measures)
Scenario Test Report | Clause 6 (V&V artefact) | Req. 4.2 (performance evidence) | Category 3 scenario coverage evidence | Annex IV (technical documentation)
Operational Monitoring Plan | Clause 8 (post-deployment) | Req. 4.5 (update governance) | Continuous SOTIF evaluation | Article 72 (serious incident reporting)
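A mapping of this kind is most useful when it can be queried during audits — e.g., "which of standard X's requirements does our current evidence base cover?". A minimal sketch holding a slice of Table II as a traceability index (clause references are copied from the table; the data structure and function are assumptions):

```python
# Traceability index: artefact -> {standard: requirement satisfied}.
# Entries abridged from Table II for illustration.
EVIDENCE_MAP = {
    "ODD Specification": {
        "ISO/PAS 8800": "Clause 5.2",
        "UL 4600": "Req. 4.1.1",
        "ISO 21448 SOTIF": "Clause 5.3",
        "EU AI Act": "Article 13",
    },
    "AI Safety Concept": {
        "ISO/PAS 8800": "Clause 7",
        "UL 4600": "Safety case top-level claim",
        "ISO 21448 SOTIF": "Contributes to SOTIF safety case",
        "EU AI Act": "Article 9",
    },
    "Data Governance Record": {
        "ISO/PAS 8800": "Clause 5",
        "UL 4600": "Req. 4.3",
        "ISO 21448 SOTIF": "Training data quality evidence",
        "EU AI Act": "Article 10",
    },
}

def requirements_satisfied(standard: str) -> dict:
    """Requirements of one standard covered by the current evidence base."""
    return {artefact: refs[standard]
            for artefact, refs in EVIDENCE_MAP.items()
            if standard in refs}
```

Inverting the index per standard makes the duplication saving concrete: each artefact appears once but discharges a requirement in up to four frameworks.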

6. Conclusion

Physical AI systems require a safety assurance approach that simultaneously satisfies ISO/PAS 8800:2025, UL 4600:2023, ISO 21448 SOTIF, and — for EU-market products — the EU AI Act. These frameworks share the ODD as a common safety boundary and converge on a safety case structure supported by an evidence base spanning design, verification, and operational artefacts. The unified framework presented here reduces compliance overhead by mapping all standards to a single artefact taxonomy, enabling a single piece of evidence to satisfy multiple requirements.

The safety contract pattern for neural network inference provides a practical mechanism for bounding AI system behaviour within a deterministic safety architecture — enabling ISO 26262-compliant monitoring of AI components without requiring formal verification of the network itself. As physical AI systems scale from controlled highway automation to open-world manipulation, this architecture provides a stable foundation for certification across the full range of deployed environments.

References

[1] ISO/PAS 8800:2025 – Road vehicles – Safety and artificial intelligence. International Organization for Standardization, Geneva, 2025.
[2] UL 4600:2023 – Standard for Safety for the Evaluation of Autonomous Products. UL Standards & Engagement, Northbrook IL, 2023.
[3] ISO 21448:2022 – Road vehicles – Safety of the Intended Functionality (SOTIF). International Organization for Standardization, Geneva, 2022.
[4] European Parliament. Regulation (EU) 2024/1689 laying down harmonised rules on artificial intelligence (EU AI Act). Official Journal of the European Union, L 2024/1689, 2024.
[5] ISO 26262:2018 – Road vehicles – Functional Safety, Parts 1–12. International Organization for Standardization, Geneva, 2018.
[6] Koopman, P., Osyk, B., Weast, J. 'Autonomous vehicles meet the physical world: Safety, legal, computation, and networking challenges.' Proc. 12th International Symposium on Engineering Secure Software and Systems (ESSoS), 2020.
[7] NHTSA. 'Automated Vehicles for Safety.' National Highway Traffic Safety Administration, Washington DC, 2021.
[8] Seshia, S.A., Sadigh, D., Sastry, S.S. 'Toward Verified Artificial Intelligence.' Communications of the ACM, 65(7):46–55, 2022.
[9] INCOSE. Systems Engineering Handbook: A Guide for System Life Cycle Processes and Activities, 5th ed. Wiley, Hoboken NJ, 2023.
[10] SAE J3016:2021 – Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles. SAE International, Warrendale PA, 2021.
[11] IEC 61508:2010 – Functional Safety of E/E/PE Safety-Related Systems. International Electrotechnical Commission, Geneva, 2010.