AI’s performance is often dubious, and deploying it in safety-critical applications without continuous monitoring or iterative adaptation is perhaps the worst possible way to use it
Machines don’t have morality. They can’t philosophize, solve a moral quandary, or understand causality the way humans do, and that is AI’s Achilles’ heel.
Expectation vs. reality
A KPMG survey of 17,000 respondents across 17 countries shows that public trust in and acceptance of AI are low. Curiously, the survey finds that attitudes shift widely with the application in question: acceptance of AI is lowest when it is used for human resources and highest in healthcare matters.
But here’s the real gut punch: AI’s outputs are often not validated against empirical evidence. In high-stakes situations, that omission can translate into fatal consequences. Imagine a self-driving car in a high-speed lane. Countless conditions can present themselves on the road, and if the AI system behind the wheel does not account for each of them within microseconds, things can very easily go sideways.
The task can be overwhelming for AI. The evidence: Tesla’s autonomous vehicles have a troubling history of crashes, and ChatGPT has confidently whipped up lies and half-truths in response to questions it could not answer. These instances have sparked a heated debate over the integrity of AI systems.
“AI is always approximating,” said Sophie Gerken, Solutions Manager at Keysight, in an interview with RCR Wireless News. “And it is important to keep in mind that AI will nearly always provide an answer, even if this answer is wrong or delivered with a low prediction confidence.”
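To see what that looks like in practice, consider a minimal sketch, purely illustrative and unrelated to any vendor’s tooling, of a classifier wrapper that abstains instead of answering when its prediction confidence falls below a threshold; the model, the classes, and the 0.7 cutoff are all assumptions made for the example.

```python
# Hypothetical illustration: a model that always returns an answer, plus a thin
# wrapper that abstains when prediction confidence is low. The model, classes,
# and threshold are illustrative assumptions, not any vendor's API.
import numpy as np

rng = np.random.default_rng(0)

def predict_proba(x: np.ndarray) -> np.ndarray:
    """Stand-in for a trained model: returns one probability per class."""
    logits = rng.normal(size=3)           # pretend model output for input x
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def predict_with_abstention(x: np.ndarray, threshold: float = 0.7):
    """Return (label, confidence), or (None, confidence) if the model is unsure."""
    probs = predict_proba(x)
    label, confidence = int(probs.argmax()), float(probs.max())
    if confidence < threshold:
        return None, confidence           # flag for human review instead of guessing
    return label, confidence

print(predict_with_abstention(np.zeros(4)))
```

The point of the wrapper is simply that a low-confidence answer is surfaced as “unsure” rather than passed along as fact.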
One might ask what pre-deployment trials and simulations are for. Granted, they exist to ensure that the model delivers as promised, but there is a “reality gap”.
“AI systems often deliver strong performance in the lab, but in deployment, they encounter data distributions, edge cases, and environmental variations that were not fully represented during training,” Gerken said.
“Even high-fidelity simulations cannot perfectly reproduce sensor characteristics, actuator effects, environmental variability, rare corner cases, or domain-specific interactions,” she added.
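A rough way to picture that reality gap is to compare the distribution of a feature the model saw in the lab with what it encounters in the field. The sketch below uses synthetic speed data and a standard two-sample Kolmogorov–Smirnov test; the numbers and the 0.01 cutoff are assumptions, and this is not Keysight’s software.

```python
# Minimal sketch of the "reality gap": comparing the distribution of a sensor
# feature seen in the lab against what the deployed system actually observes.
# Data and threshold are synthetic assumptions; this is not Keysight's product.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
lab_speeds = rng.normal(loc=60.0, scale=5.0, size=5_000)     # training/simulation data
field_speeds = rng.normal(loc=72.0, scale=9.0, size=5_000)   # what deployment sees

stat, p_value = ks_2samp(lab_speeds, field_speeds)           # two-sample KS test
print(f"KS statistic={stat:.3f}, p-value={p_value:.2e}")

if p_value < 0.01:
    print("Field data no longer matches the training distribution: retest or retrain.")
```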
Making models transparent and trustworthy
At CES 2026, Keysight launched new software that aims to close this gap: the AI Software Integrity Builder, a lifecycle tool designed to establish trust and transparency in AI systems.
The black-box nature of AI systems poses serious hazards in safety-critical industries such as automotive, industrial automation, and transportation. A small error stemming from low explainability can be the difference between life and death. Standards like ISO/PAS 8800 and the EU AI Act are clear on outcomes but vague on methods, so an AI system with an explainability problem is, in practice, a broken technology.
Keysight positions the new software as an AI assurance solution that lets engineers compare a model’s behavior in the lab with its behavior in the field. Where most solutions stop at dataset analysis and performance validation, the AI Software Integrity Builder supports safety by providing insights into core areas such as data integrity, model reasoning, real-world behavior, and conformance.
It gives developers a view into the processes behind the model’s decision-making and answers questions such as: What is happening inside the model? Are the training datasets complete, balanced, and high quality? Is the model behaving as it should in training, and reliably thereafter?
The solution tells developers about gaps, biases, and inconsistencies in data, and helps them understand model limitations by surfacing underlying patterns and correlations.
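As a generic illustration of the kind of checks described, and not the product’s actual interface, the snippet below scans a toy dataset for missing values, duplicate rows, and label imbalance; the column names and values are invented.

```python
# Generic illustration of dataset integrity checks (gaps, duplicates, imbalance).
# Column names and values are invented; this is not the AI Software Integrity
# Builder's actual interface.
import pandas as pd

df = pd.DataFrame({
    "frame_id": [1, 2, 3, 4, 4, 5],
    "label":    ["pedestrian", "car", "car", "car", "car", None],
    "lux":      [120.0, None, 80.0, 95.0, 95.0, 110.0],   # scene brightness
})

report = {
    "missing_values": df.isna().sum().to_dict(),            # gaps in the data
    "duplicate_rows": int(df.duplicated().sum()),           # inconsistencies
    "label_balance":  df["label"].value_counts(normalize=True).round(2).to_dict(),
}
print(report)
```

Even a simple report like this makes it obvious when one class dominates the training set or when sensor readings are silently missing.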
As for who Keysight’s target end users are, Gerken responded: “Any environment that must demonstrate compliance, reliability, and safe AI behavior under diverse operating conditions can benefit from the AI Software Integrity Builder. Beyond automotive, this includes, for example, domains such as industrial automation, robotics, rail and transportation systems, semiconductor and electronics manufacturing, and other industries where AI interacts with safety‑relevant physical processes. The solution is designed to adapt to different operational domains.”
Do more with less
One of the highlights is inference-based testing, a capability that sets it apart from point solutions, Gerken said. The feature allows engineers to detect deviations and drifts and get recommendations on how to fix them in future iterations.
“Since most tools stop at model evaluation and do not include inference‑based testing, customers often need to combine multiple tools themselves, resulting in fragmented processes and incomplete conformance,” she said.
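For a flavor of what drift detection on inference outputs can look like, here is a hedged sketch that compares live prediction confidences against a validation baseline using a population stability index; the metric choice, the data, and the 0.25 cutoff are illustrative assumptions, not the product’s built-in method.

```python
# Hedged sketch of inference-based drift monitoring: comparing live prediction
# confidences against a validation baseline. Metric, data, and cutoff are
# illustrative assumptions, not the product's built-in method.
import numpy as np

def population_stability_index(expected: np.ndarray, observed: np.ndarray, bins: int = 10) -> float:
    """Compare two score distributions; a larger PSI means more drift."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected) + 1e-6
    o_pct = np.histogram(observed, bins=edges)[0] / len(observed) + 1e-6
    return float(np.sum((o_pct - e_pct) * np.log(o_pct / e_pct)))

rng = np.random.default_rng(7)
baseline_scores = rng.beta(8, 2, size=10_000)     # confidences seen during validation
live_scores = rng.beta(5, 3, size=10_000)         # confidences seen in the field

psi = population_stability_index(baseline_scores, live_scores)
print(f"PSI={psi:.3f}")                           # > 0.25 is a common 'retrain' signal
if psi > 0.25:
    print("Inference behavior has drifted: schedule retesting before the next release.")
```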
Keysight’s broader goal with the AI Software Integrity Builder is to take a fragmented testing workflow and turn it into a seamless sequence of tasks in which trustworthiness is established from the start, not deferred to future iterations.
The networks of the future will rely on AI-enabled edge intelligence and a massive inflow of uplink data from IoT devices, creating new safety-critical contexts. In that future, real-world AI assurance becomes essential, not optional. So before we get there, AI systems need to get better at what they do, especially when operating in safety-critical environments.
AI systems may or may not learn causality in the future, but for now the responsibility lies with the makers to feed it quality data, to understand why it does what it does, and to know what it can and cannot do: in short, to make it trustworthy while guiding it toward higher performance thresholds. Because, as New York Times columnist Thomas L. Friedman rightly said, without trust, AI has the potential to be a “nuclear bazooka”.
