PRESS RELEASE: Airside Labs' Pre-Flight Benchmark Added to AISI Community Evaluations Package

updated on 04 May 2025

FOR IMMEDIATE RELEASE
April 11, 2025

Aviation AI Benchmark Now Available Through UK's AI Security Institute's inspect_evals Framework

LONDON, UK - Airside Labs is pleased to announce that its Pre-Flight benchmark for aviation AI systems has been accepted into inspect_evals, the community evaluations package for the Inspect framework hosted on GitHub, making the specialized aviation testing tool more accessible to AI developers and researchers worldwide.

This integration means that Pre-Flight can now be run through the widely used Inspect framework, allowing organizations to evaluate AI systems against aviation-specific criteria using familiar tools and methodologies.

"Having Pre-Flight included in the community evaluations package represents an important step forward for aviation AI safety testing," said Alexis Brooker, founder of Airside Labs. "This integration makes it significantly easier for developers to include aviation-specific testing in their AI evaluation pipelines."

The inspect_evals framework provides standardized tools for AI system evaluation across various domains. With Pre-Flight now available as part of the community package, developers can run aviation-specific tests alongside other evaluations using a consistent interface and methodology.
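
As a rough illustration of that consistent interface, the sketch below assembles the command line a developer might use to run Pre-Flight through Inspect's `inspect eval` command. It is not taken from Airside Labs' materials. The task name `inspect_evals/pre_flight` is an assumption inferred from the benchmark's directory name in the repository, the model identifier is only a placeholder, and the helper function `build_inspect_command` is hypothetical. The script only prints the command, so it runs without Inspect installed.

```python
import shlex
from typing import List, Optional

# Assumption: the task is registered under the same name as its directory
# in the inspect_evals repository (src/inspect_evals/pre_flight).
PRE_FLIGHT_TASK = "inspect_evals/pre_flight"

# Placeholder model identifier; substitute any model your setup supports.
EXAMPLE_MODEL = "openai/gpt-4o"


def build_inspect_command(
    task: str, model: str, limit: Optional[int] = None
) -> List[str]:
    """Hypothetical helper: assemble an `inspect eval` command for one task.

    `limit`, when given, caps the number of samples evaluated, which is
    handy for a quick smoke test before a full run.
    """
    command = ["inspect", "eval", task, "--model", model]
    if limit is not None:
        command += ["--limit", str(limit)]
    return command


if __name__ == "__main__":
    # Print the command instead of executing it, so no API key is needed.
    print(shlex.join(build_inspect_command(PRE_FLIGHT_TASK, EXAMPLE_MODEL, limit=10)))
```

Other benchmarks in the community package follow the same pattern, so swapping the task string should be the only change needed to run a different evaluation alongside Pre-Flight.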

Pre-Flight's test suite includes hundreds of validated questions covering air traffic management, flight operations, airport procedures, and regulatory compliance. These tests are designed to determine whether AI systems can accurately reason through complex aviation scenarios and identify potentially dangerous misunderstandings before deployment.

"As AI systems become more prevalent in aviation applications, the need for accessible, domain-specific testing becomes increasingly important," noted Brooker. "The inclusion in the community evaluations package means more developers can incorporate aviation safety testing into their workflows with minimal additional setup."

Developers and organizations can access Pre-Flight through the inspect_evals repository on GitHub. Airside Labs will continue to maintain and expand the aviation benchmark, with regular updates based on industry feedback and evolving AI capabilities.

For more information about Pre-Flight and how to use it within the inspect_evals framework, see the src/inspect_evals/pre_flight directory on the main branch of the UKGovernmentBEIS/inspect_evals repository on GitHub, or access the helper repository directly.

About Airside Labs

Airside Labs is a specialised software, data, and AI experimentation startup focused on the aviation sector. The company conducts innovative research and development projects, validates AI performance, and assists clients in achieving product-market fit more efficiently. Through a range of professional services including data analysis, software development, and AI model creation and testing, Airside Labs helps aviation-related businesses accelerate their innovation processes and optimise operations through data-driven insights and cutting-edge technology solutions.

Contact:
Alexis Brooker
Founder, Airside Labs
