European Commission - Directorate General for Communications Networks, Content and Technology

07/17/2025 | News release | Distributed by Public on 07/18/2025 01:02

AI Office contributes to the third joint testing exercise of the International Network of AI Safety Institutes

As a member of the Network, the AI Office is actively participating and contributing deep technical expertise on the evaluation of general-purpose AI models. It was one of three Network members that ran the agentic evaluations for the cybersecurity strand.


To accompany the convening of the International Network of AI Safety Institutes in Vancouver, Canada, the Network published the results of its third joint testing exercise. The exercise focused on evaluating "agents", a class of advanced AI programmes that autonomously reason, plan, use tools, and execute tasks. The goal is to advance global understanding of how to safely and reliably test these emerging agentic systems, which pose novel risks due to reduced human oversight.

This third test focused on two priority risk areas:

  • Leakage of sensitive information and fraud
  • Cybersecurity

This exercise builds on insights from two earlier joint testing exercises conducted by the Network in San Francisco (November 2024) and in Paris (February 2025). The objective of these exercises is to allow the Network to further refine best practices for testing advanced AI systems.

Traditional evaluation methods have proven insufficient to capture the complexity of autonomous agent behaviour. To address this, participating members brought together their collective technical and linguistic expertise. The emphasis of this exercise was not only on testing outcomes but also on improving methodologies, recognising that small changes in evaluation design can significantly impact results.

This collaborative effort marks an important step in advancing the science of agentic evaluation and represents a critical investment in the safe and trustworthy development of advanced AI systems.

Related topics

Artificial intelligence