Tomco USA · Technical Whitepaper · TW-2025-002 Autonomy · Vol. 1, No. 2 · Jun 2025

Safety Assurance Framework for Physical AI Systems

Aligning ISO/PAS 8800:2025, UL 4600:2023, and ISO 21448 SOTIF for Autonomous Ground Vehicles and Robotics Platforms

¹Tomco USA, Detroit, MI
Abstract — Physical AI systems — autonomous vehicles, collaborative robots, surgical platforms — must satisfy safety certification regimes built for deterministic embedded software. This paper proposes a unified assurance framework that maps ISO/PAS 8800:2025, UL 4600:2023, and ISO 21448 SOTIF onto a shared evidence structure, with worked examples in L3 highway automation and a surgical robotic arm.
Index Terms — physical AI, autonomous vehicles, ISO/PAS 8800, UL 4600, SOTIF, safety assurance case, operational design domain, safety contracts, neural network inference, EU AI Act.

1. Introduction

Physical AI systems — machines that perceive their environment, reason over learned representations, and act in the physical world — represent the fastest-growing category of safety-critical embedded systems. Autonomous vehicles at SAE Level 3 and above [10], collaborative industrial robots, and surgical automation platforms all share a common challenge: their core decision-making capability is implemented not as a verifiable deterministic algorithm but as a trained neural network whose behaviour is characterised statistically, not symbolically.

The safety standards ecosystem has responded with three complementary frameworks: ISO/PAS 8800:2025 [1] addresses AI-specific safety properties for road vehicles; UL 4600:2023 [2] provides a principles-based safety case structure for any autonomous product; and ISO 21448:2022 SOTIF [3] targets hazards arising from the intended functionality of perception and decision systems — hazards that are invisible to conventional functional safety analysis because no fault is present.

These standards were developed largely in parallel and share overlapping concerns but non-identical terminology and evidence structures. Organisations implementing all three — as required by several OEM supply chain agreements and the EU AI Act high-risk classification — face significant duplication of evidence artefacts. This paper presents a unified assurance framework that maps all three standards onto a shared evidence base, reducing compliance overhead while satisfying each standard's unique requirements.

2. Standards Landscape and Regulatory Context

ISO/PAS 8800:2025 is structured in eight clauses covering AI system development, data management, V&V strategy, and operational monitoring. Its central construct is the AI Safety Concept (AISC) — a structured argument that the AI system's contribution to residual risk is acceptable given the vehicle's functional safety concept [1]. Unlike ISO 26262, ISO/PAS 8800 does not define integrity levels; instead it defines properties (e.g., accuracy, robustness, explainability) that the AI system must demonstrate within its operational design domain (ODD).

UL 4600:2023 is principles-based and system-agnostic. Its safety case structure follows the Goal Structuring Notation (GSN) and requires explicit claims about: (1) absence of known unsafe conditions, (2) adequate performance within ODD, (3) safe behaviour at ODD boundaries, and (4) continuous operational monitoring and update governance [2]. UL 4600 does not mandate specific V&V methods but requires that the chosen methods be justified relative to the risk level.

ISO 21448:2022 SOTIF addresses four scenario categories: (1) known/unsafe, (2) known/safe, (3) unknown/unsafe, and (4) unknown/safe. The standard's primary objective is reducing the probability of Category 3 events — unknown unsafe scenarios — to an acceptable level through a structured triggering conditions analysis and scenario generation methodology [3]. SOTIF analysis is complementary to HARA under ISO 26262: where HARA identifies hazards from component failure, SOTIF identifies hazards from performance limitations under nominal operation.
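The four-way split above is a 2×2 grid over two axes: whether a scenario is known, and whether it is safe. A minimal sketch of that classification, using the category numbering from the text (enum and function names are illustrative, not from the standard):

```python
from enum import Enum

class SotifCategory(Enum):
    """ISO 21448 scenario categories, numbered as in the text above."""
    KNOWN_UNSAFE = 1    # identified hazardous scenarios: mitigate by design
    KNOWN_SAFE = 2      # identified scenarios shown to be acceptable
    UNKNOWN_UNSAFE = 3  # residual risk: the primary reduction target
    UNKNOWN_SAFE = 4    # unidentified but benign scenarios

def classify_scenario(known: bool, safe: bool) -> SotifCategory:
    """Map a scenario's (known, safe) status onto the SOTIF 2x2 grid."""
    if known:
        return SotifCategory.KNOWN_SAFE if safe else SotifCategory.KNOWN_UNSAFE
    return SotifCategory.UNKNOWN_SAFE if safe else SotifCategory.UNKNOWN_UNSAFE
```

The SOTIF process can then be read as moving scenarios out of Category 3: triggering condition analysis discovers unknown/unsafe scenarios (making them Category 1), and mitigations move them to Category 2.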

Table I. Standards scope comparison for physical AI safety certification. ODD = Operational Design Domain; AISC = AI Safety Concept; GSN = Goal Structuring Notation.
Attribute | ISO/PAS 8800:2025 | UL 4600:2023 | ISO 21448:2022 SOTIF
Domain | Road vehicles | Any autonomous product | Road vehicles
AI-specific provisions | Yes — Clause 5 data governance, Clause 6 V&V | Principles-based, technology-agnostic | No (perception-focused, not AI-specific)
Integrity levels | None defined — properties-based | None defined — risk-proportionate | None defined
ODD definition required | Yes — mandatory input to AISC | Yes — boundary conditions required | Yes — SOTIF analysis scoped to ODD
Safety case format | AISC (structured argument) | GSN safety case | Evidence package (no specific format)
Regulatory link | EU AI Act high-risk (Annex III) | Referenced by NHTSA AV guidance | UNECE WP.29 ALKS Regulation 157

3. Operational Design Domain as Unified Safety Boundary

All three standards converge on the ODD as the primary safety boundary for physical AI systems. The ODD defines the environmental, geographic, and temporal conditions within which the system is designed to operate safely. Safety arguments are valid only within the ODD; at ODD boundaries, the system must execute a minimal risk condition (MRC) — typically a controlled halt or handover to a human operator.

Formally specifying the ODD is a systems engineering challenge. INCOSE's Systems Engineering Handbook [9] provides a general template for operational context specification; SAE J3016 [10] provides driving-domain-specific ODD attributes, including speed range, road type, weather conditions, and presence of vulnerable road users. For robotics applications, IEC 61508 [11] hazardous event analysis is adapted to define the ODD in terms of workspace geometry, payload range, and permitted interaction zones.
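An ODD specification of this kind can be captured as a machine-checkable data structure with a single membership predicate. A minimal sketch using SAE J3016-style attributes — the field names and bounds here are assumptions for illustration, not normative values:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HighwayOdd:
    """Illustrative highway-automation ODD; bounds are placeholders."""
    max_speed_kmh: float = 130.0
    min_luminance_lux: float = 10.0
    max_luminance_lux: float = 100_000.0
    road_types: frozenset = frozenset({"motorway", "limited_access_highway"})

    def contains(self, speed_kmh: float, luminance_lux: float,
                 road_type: str) -> bool:
        """True iff current conditions lie inside the ODD; outside it,
        the system must transition to its minimal risk condition (MRC)."""
        return (
            speed_kmh <= self.max_speed_kmh
            and self.min_luminance_lux <= luminance_lux <= self.max_luminance_lux
            and road_type in self.road_types
        )
```

A single `contains` predicate like this gives all three standards' safety arguments the same explicit boundary: the safety case claims hold exactly where it returns true.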

4. Safety Contracts for Neural Network Inference

A safety contract is a formal specification of the pre-conditions and post-conditions of a software component, such that if pre-conditions are satisfied, the component guarantees its post-conditions. For neural network inference modules, safety contracts address the question that neither ISO 26262 nor conventional V&V methods can answer directly: 'Under what input conditions is this network's output guaranteed to remain within an acceptable error bound?' [8]

The safety contract for a perception network operating within the ODD might specify: (Pre) input image luminance ∈ [10, 100,000] lux; camera contamination index < 0.3; vehicle speed ≤ 130 km/h. (Post) object detection recall for Class A objects (vehicles, pedestrians) ≥ 99.5 % at distances 5–150 m; false positive rate ≤ 0.1 per km; maximum detection latency ≤ 80 ms. These bounds are derived from SOTIF triggering condition analysis and validated through scenario-based testing across the ODD envelope.
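The example contract above can be written down as two predicates over named bounds. A minimal sketch — the numeric bounds are copied from the text, while the class and attribute names are illustrative:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PerceptionContract:
    """Machine-checkable form of the example perception contract."""
    # Pre-conditions: what the platform guarantees about inputs
    min_lux: float = 10.0
    max_lux: float = 100_000.0
    max_contamination: float = 0.3
    max_speed_kmh: float = 130.0
    # Post-conditions: what the network guarantees if pre-conditions hold
    min_recall: float = 0.995       # Class A recall at 5-150 m
    max_fp_per_km: float = 0.1
    max_latency_ms: float = 80.0

    def pre(self, lux: float, contamination: float, speed_kmh: float) -> bool:
        """Are the input conditions inside the contracted envelope?"""
        return (self.min_lux <= lux <= self.max_lux
                and contamination < self.max_contamination
                and speed_kmh <= self.max_speed_kmh)

    def post(self, recall: float, fp_per_km: float, latency_ms: float) -> bool:
        """Do the measured outputs meet the contracted bounds?"""
        return (recall >= self.min_recall
                and fp_per_km <= self.max_fp_per_km
                and latency_ms <= self.max_latency_ms)
```

Note that `post` is stated over measured quantities (recall, false positives per km), which in practice are estimated offline over the scenario test suite rather than observed per-frame; per-frame runtime checks use proxies such as latency.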

4.1 Contract Monitoring at Runtime

Safety contracts are enforced at runtime by an independent contract monitor — a deterministic software component (ASIL B or higher under ISO 26262) that evaluates pre-condition satisfaction before passing inputs to the neural network and post-condition satisfaction before acting on its outputs. When a pre-condition violation is detected (e.g., input luminance below threshold), the contract monitor triggers the MRC without invoking the neural network. This architecture decouples the safety argument from the AI model's internal behaviour.
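The control flow of the contract monitor is deliberately simple, which is what makes it amenable to ASIL-rated development. A minimal sketch of the pattern, with all predicates and actions supplied by the caller (names are illustrative):

```python
def monitored_inference(pre_ok, infer, post_ok, trigger_mrc, frame):
    """Contract-monitor pattern: gate inputs on the pre-condition, gate
    outputs on the post-condition, fall back to the MRC on any violation.
    The monitor itself only evaluates predicates and routes control."""
    if not pre_ok(frame):
        # Pre-condition violated: request the MRC without invoking the network.
        return trigger_mrc("pre-condition violation")
    output = infer(frame)
    if not post_ok(output):
        # Post-condition violated: discard the output and request the MRC.
        return trigger_mrc("post-condition violation")
    return output
```

Because the monitor never inspects the network's internals, the safety argument rests only on the (deterministic, testable) predicates and the MRC path — not on the trained model.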

5. Unified Evidence Structure

The unified assurance framework maps each standard's required evidence to a common artefact taxonomy. Four artefact categories are defined: (1) Context artefacts — ODD specification, HARA, SOTIF functional analysis; (2) Design artefacts — AI Safety Concept, software architecture, safety contracts; (3) Verification artefacts — test datasets, simulation results, V&V reports, DFA; (4) Operational artefacts — monitoring data, incident reports, update governance records.

Table II. Artefact-to-standard mapping for the unified evidence structure. A single artefact may satisfy requirements across multiple standards, reducing documentation overhead.
Artefact | ISO/PAS 8800 | UL 4600 | ISO 21448 SOTIF | EU AI Act
ODD Specification | Clause 5.2 (mandatory input) | Req. 4.1.1 (boundary conditions) | Clause 5.3 (functional analysis scope) | Article 13 (transparency obligations)
AI Safety Concept | Clause 7 (central deliverable) | Safety case top-level claim | Contributes to SOTIF safety case | Article 9 (risk management system)
Data Governance Record | Clause 5 (mandatory) | Req. 4.3 (data provenance) | Training data quality evidence | Article 10 (data governance, mandatory)
Safety Contracts | Part of AI component spec | Performance claim support | Triggering condition mitigations | Article 9 (technical measures)
Scenario Test Report | Clause 6 (V&V artefact) | Req. 4.2 (performance evidence) | Category 3 scenario coverage evidence | Annex IV (technical documentation)
Operational Monitoring Plan | Clause 8 (post-deployment) | Req. 4.5 (update governance) | Continuous SOTIF evaluation | Article 72 (serious incident reporting)
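A mapping of this kind is most useful when it can be queried during audits — e.g., "which of standard X's requirements does our current evidence base cover?". A minimal sketch holding a slice of Table II as a traceability index (clause references are copied from the table; the data structure and function are assumptions):

```python
# Traceability index: artefact -> {standard: requirement satisfied}.
# Entries abridged from Table II for illustration.
EVIDENCE_MAP = {
    "ODD Specification": {
        "ISO/PAS 8800": "Clause 5.2",
        "UL 4600": "Req. 4.1.1",
        "ISO 21448 SOTIF": "Clause 5.3",
        "EU AI Act": "Article 13",
    },
    "AI Safety Concept": {
        "ISO/PAS 8800": "Clause 7",
        "UL 4600": "Safety case top-level claim",
        "ISO 21448 SOTIF": "Contributes to SOTIF safety case",
        "EU AI Act": "Article 9",
    },
    "Data Governance Record": {
        "ISO/PAS 8800": "Clause 5",
        "UL 4600": "Req. 4.3",
        "ISO 21448 SOTIF": "Training data quality evidence",
        "EU AI Act": "Article 10",
    },
}

def requirements_satisfied(standard: str) -> dict:
    """Requirements of one standard covered by the current evidence base."""
    return {artefact: refs[standard]
            for artefact, refs in EVIDENCE_MAP.items()
            if standard in refs}
```

Inverting the index per standard makes the duplication saving concrete: each artefact appears once but discharges a requirement in up to four frameworks.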

6. Conclusion

Physical AI systems require a safety assurance approach that simultaneously satisfies ISO/PAS 8800:2025, UL 4600:2023, ISO 21448 SOTIF, and — for EU-market products — the EU AI Act. These frameworks share the ODD as a common safety boundary and converge on a safety case structure supported by an evidence base spanning design, verification, and operational artefacts. The unified framework presented here reduces compliance overhead by mapping all standards to a single artefact taxonomy, enabling a single piece of evidence to satisfy multiple requirements.

The safety contract pattern for neural network inference provides a practical mechanism for bounding AI system behaviour within a deterministic safety architecture — enabling ISO 26262-compliant monitoring of AI components without requiring formal verification of the network itself. As physical AI systems scale from controlled highway automation to open-world manipulation, this architecture provides a stable foundation for certification across the full range of deployed environments.

References

[1] ISO/PAS 8800:2025 – Road vehicles – Safety and artificial intelligence. International Organization for Standardization, Geneva, 2025.
[2] UL 4600:2023 – Standard for Safety for the Evaluation of Autonomous Products. UL Standards & Engagement, Northbrook IL, 2023.
[3] ISO 21448:2022 – Road vehicles – Safety of the Intended Functionality (SOTIF). International Organization for Standardization, Geneva, 2022.
[4] European Parliament. Regulation (EU) 2024/1689 laying down harmonised rules on artificial intelligence (EU AI Act). Official Journal of the European Union, L 2024/1689, 2024.
[5] ISO 26262:2018 – Road vehicles – Functional Safety, Parts 1–12. International Organization for Standardization, Geneva, 2018.
[6] Koopman, P., Osyk, B., Weast, J. 'Autonomous vehicles meet the physical world: Safety, legal, computation, and networking challenges.' Proc. 12th International Symposium on Engineering Secure Software and Systems (ESSoS), 2020.
[7] NHTSA. 'Automated Vehicles for Safety.' National Highway Traffic Safety Administration, Washington DC, 2021.
[8] Seshia, S.A., Sadigh, D., Sastry, S.S. 'Toward Verified Artificial Intelligence.' Communications of the ACM, 65(7):46–55, 2022.
[9] INCOSE. Systems Engineering Handbook: A Guide for System Life Cycle Processes and Activities, 5th ed. Wiley, Hoboken NJ, 2023.
[10] SAE J3016:2021 – Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles. SAE International, Warrendale PA, 2021.
[11] IEC 61508:2010 – Functional Safety of E/E/PE Safety-Related Systems. International Electrotechnical Commission, Geneva, 2010.