Does AI really do what you ask? Student develops control method

How do you know whether artificial intelligence does exactly what you ask? That question is becoming increasingly important now that AI is used more often in areas such as self-driving cars and medical equipment. Panagiotis Kalogeropoulos (Panos), a Master’s student at Fontys University of Applied Sciences in Eindhoven, developed a method to test the reliability of AI. Earlier this month, he presented his research during a workshop of the NASA Formal Methods Symposium in Los Angeles.

“Companies with very expensive and/or dangerous equipment want to make use of the benefits of artificial intelligence, without entrusting human lives or the safety of the equipment to an unreliable LLM,” says Kalogeropoulos. “You cannot rely on a ‘black box’. You must be able to check whether AI does what you think it does.”

The problem the Fontys student is working on affects a growing number of organizations. The rise of AI is fundamentally changing the way we work, including in high-risk environments. Organizations in those environments also want the benefits of generative AI, which makes it crucial to accelerate research into getting sustainable, safe, and reliable output from LLMs.

Control method for safe AI

Together with lecturer and researcher Herman Jurjus, Kalogeropoulos developed a method to verify AI outputs before systems become operational. The research was carried out within the Fontys High Tech Embedded Systems research group.

Kalogeropoulos’ method works as a double safety check. Before an AI system carries out an instruction, his framework checks two things: has the AI understood the instruction correctly, and is the action safe to perform? The AI first generates code from the human instruction. The system then evaluates that code and produces a risk assessment from different stakeholder perspectives. Based on that assessment, people can approve or reject the use of the code.

At the same time, a panel of multiple AI systems assesses from different perspectives whether the proposed action is dangerous and assigns a risk factor to each potential failure scenario, for example whether a robot could collide with something. The action is allowed to proceed only when both checks remain below a certain risk threshold. In case of doubt or danger, the system blocks the action and requests human intervention. In this way, organizations can deploy AI systems without blindly relying on a ‘black box’.
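
The gating logic described above can be illustrated with a minimal Python sketch. This is not the researchers' framework. The names (`Assessment`, `gate`), the 0-to-1 risk scale, the threshold value, and the choice to take the worst panel score are all assumptions made for illustration. The article only states that both checks must stay below a threshold and that the system otherwise blocks the action and involves a human.

```python
from dataclasses import dataclass

# Hypothetical threshold; the article does not state the real value or scale.
RISK_THRESHOLD = 0.3


@dataclass
class Assessment:
    """One panel member's risk score for one failure scenario (illustrative)."""
    reviewer: str   # perspective of the assessing AI system, e.g. "safety"
    scenario: str   # failure scenario, e.g. "robot collides with obstacle"
    risk: float     # assumed scale: 0.0 (harmless) to 1.0 (certain harm)


def gate(understanding_risk: float,
         panel: list[Assessment],
         threshold: float = RISK_THRESHOLD) -> str:
    """Allow the action only if both checks stay below the threshold."""
    # Assumption: the panel is summarized by its worst score.
    # An empty panel counts as maximum risk, so a missing review never passes.
    worst_panel_risk = max((a.risk for a in panel), default=1.0)

    if understanding_risk < threshold and worst_panel_risk < threshold:
        return "proceed"
    # Doubt or danger: block the action and hand over to a human.
    return "blocked: human review required"


if __name__ == "__main__":
    panel = [
        Assessment("safety", "robot collides with obstacle", 0.10),
        Assessment("operations", "equipment overheats", 0.20),
    ]
    print(gate(understanding_risk=0.05, panel=panel))
    panel.append(Assessment("safety", "arm leaves work envelope", 0.80))
    print(gate(understanding_risk=0.05, panel=panel))
```

Taking the worst score rather than an average is a deliberately conservative choice in this sketch: a single reviewer flagging a dangerous scenario is enough to stop the action.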

By: Team IO+
