At Airside Labs, we combine industry-leading evaluation techniques with customised testing protocols to provide assessments of chatbots and AI systems. Our methodology is built on three core pillars: rigorous technical evaluation, regulatory alignment, and practical risk assessment.
Strategy | Description | Attack Success Rate |
---|---|---|
Single-shot Optimization | Single-shot optimization of safety bypass techniques | 41.7% |
Multi-Vector Safety Bypass | Combines multiple jailbreak techniques for enhanced effectiveness | 0.3% |
Baseline Testing | Original plugin tests without any additional strategies or optimizations | 0.3% |
Resource Hijacking
Privacy Violation
PII via API/Database
PII via Direct Exposure
PII via Session Data
PII via Social Engineering
Unauthorized Commitments
WMD Content
IP Violations
Malicious Code
Cybercrime
Illegal Activity
Methamphetamine Content
Drug-Related Content
Weapons Content
IP Theft
Non-Violent Crime
Sexual Crime Content
Unauthorized Advice
Dangerous Activity Content
Violent Crime Content
Improvised Explosive Devices
Child Exploitation
Graphic Content
Harassment
Hate Speech
Personal Attacks
Profanity
Extremist Content
Self-Harm
Explicit Content
Excessive Agency
False Information (Hallucination)
Disinformation Campaigns
Resource Hijacking
Political Bias
Type | Description | Attack Success Rate | Severity |
---|---|---|---|
Harassment | Tests handling of harassment and bullying content | 12.9% | low |
Political Bias | Tests handling of political content and bias | 12.9% | low |
Cybercrime | Tests handling of cybercrime-related content | 11.4% | low |
Illegal Activity | Tests handling of general illegal activities | 11.4% | medium |
Non-Violent Crime | Tests handling of non-violent criminal content | 11.4% | medium |
Unauthorized Advice | Tests handling of unauthorized professional advice | 11.4% | medium |
Excessive Agency | Tests for unauthorized actions beyond defined system boundaries | 11.4% | medium |
Disinformation Campaigns | Tests handling of false information campaigns | 11.4% | medium |
Resource Hijacking | Tests for unauthorized resource usage and purpose deviation | 10.0% | high |
Personal Attacks | Tests handling of personal attacks and insults | 10.0% | low |
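The table above reports an attack success rate and a severity for each test type. As a rough illustration of how such figures can be derived and triaged, the sketch below computes a rate from raw counts and orders the ten reported rows by severity, then by rate. The names `Finding`, `attack_success_rate`, and `triage` are hypothetical, and the assumption that a rate is simply successful attacks divided by attempts is ours; the report does not state how the rates were calculated or how many attempts were run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    type: str
    rate_pct: float  # attack success rate as printed in the table
    severity: str    # "high", "medium", or "low"


def attack_success_rate(successes: int, attempts: int) -> float:
    """Assumed definition: share of attempts that succeeded, in percent,
    rounded to one decimal place."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return round(100.0 * successes / attempts, 1)


# Rows copied from the table above.
FINDINGS = [
    Finding("Harassment", 12.9, "low"),
    Finding("Political Bias", 12.9, "low"),
    Finding("Cybercrime", 11.4, "low"),
    Finding("Illegal Activity", 11.4, "medium"),
    Finding("Non-Violent Crime", 11.4, "medium"),
    Finding("Unauthorized Advice", 11.4, "medium"),
    Finding("Excessive Agency", 11.4, "medium"),
    Finding("Disinformation Campaigns", 11.4, "medium"),
    Finding("Resource Hijacking", 10.0, "high"),
    Finding("Personal Attacks", 10.0, "low"),
]

# Only the severity levels that appear in the table.
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def triage(findings):
    """Order findings by severity first, then by descending success rate.
    sorted() is stable, so ties keep their table order."""
    return sorted(findings, key=lambda f: (SEVERITY_ORDER[f.severity], -f.rate_pct))


if __name__ == "__main__":
    for f in triage(FINDINGS):
        print(f"{f.severity:<6} {f.rate_pct:>5.1f}%  {f.type}")
```

Sorting by severity before rate puts Resource Hijacking first despite its lower rate, which is one reasonable way to read the table when deciding what to fix first; other weightings are equally defensible.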
Type | Attack Success Rate |
---|---|
Self-Harm | 9% |
Resource Hijacking | 10% |
WMD Content | 9% |
Sexual Crime Content | 7% |
PII via Social Engineering | 6% |
PII via Direct Exposure | 4% |
Violent Crime Content | 4% |
PII via API/Database | 3% |
Excessive Agency | 11% |
Illegal Activity | 11% |
Disinformation Campaigns | 11% |
Non-Violent Crime | 11% |
Unauthorized Advice | 11% |
False Information (Hallucination) | 10% |
Graphic Content | 9% |
Drug-Related Content | 4% |
Weapons Content | 4% |
IP Theft | 1% |
Harassment | 13% |
Cybercrime | 11% |
Personal Attacks | 10% |
Profanity | 10% |
Improvised Explosive Devices | 9% |
Dangerous Activity Content | 6% |
Methamphetamine Content | 1% |
Child Exploitation | 0% |
Hate Speech | 0% |
Privacy Violation | 0% |
PII via Session Data | 0% |
Explicit Content | 0% |
IP Violations | 0% |
Malicious Code | 0% |
Extremist Content | 0% |
Indirect Prompt Injection | Not Tested |
Debug Interface Exposure | Not Tested |
Command Injection | Not Tested |
SQL Injection | Not Tested |
SSRF Vulnerability | Not Tested |
Policy Compliance | Not Tested |
RBAC Implementation | Not Tested |
System Prompt Disclosure | Not Tested |
ASCII Smuggling | Not Tested |
Entity Impersonation | Not Tested |
Competitors | Not Tested |
Function-Level Authorization Bypass | Not Tested |
Object-Level Authorization Bypass | Not Tested |
Overreliance | Not Tested |
Cross-Session Data Leakage | Not Tested |
Gender Bias | Not Tested |
Divergent Repetition | Not Tested |
Reasoning DoS | Not Tested |
Intent | Not Tested |

Report generated by Airside Labs - Professional AI Testing & Security Evaluation
Legal Entity: Brooker Consulting Limited, registered in England and Wales, company number 5172364