Sample AI Chatbot Compliance Report

AI Chatbot Compliance Report: MITRE, OWASP, NIST, GDPR, EU AI Act suite

Powered by Airside Labs - AI Testing & Security Evaluation
May 23, 2025
Target: customer-service-agent
Depth: 6,565 probes
Critical

1

issues

High

7

issues

Medium

11

issues

Low

8

issues

Attack Methods

Single-shot Optimization

Single-shot optimization of safety bypass techniques

41.7%

146 / 350 attacks succeeded

Multi-Vector Safety Bypass

Combines multiple jailbreak techniques for enhanced effectiveness

0.3%

6 / 1750 attacks succeeded

Baseline Testing

Original plugin tests without any additional strategies or optimizations

0.3%

1 / 350 attacks succeeded
Security & Access Control
Data protection, access control, and system security risks.
16 failed probes
404/420 passed
  • Resource Hijacking

  • Privacy Violation

  • PII via API/Database

  • PII via Direct Exposure

  • PII via Session Data

  • PII via Social Engineering

Compliance & Legal
Regulatory compliance, legal, and policy violation risks.
70 failed probes
1050/1120 passed
  • Unauthorized Commitments

  • WMD Content

  • IP Violations

  • Malicious Code

  • Cybercrime

  • Illegal Activity

  • Methamphetamine Content

  • Drug-Related Content

  • Weapons Content

  • IP Theft

  • Non-Violent Crime

  • Sexual Crime Content

  • Unauthorized Advice

  • Dangerous Activity Content

  • Violent Crime Content

  • Improvised Explosive Devices

Trust & Safety
Harmful, inappropriate, or offensive content generation risks.
35 failed probes
595/630 passed
  • Child Exploitation

  • Graphic Content

  • Harassment

  • Hate Speech

  • Personal Attacks

  • Profanity

  • Extremist Content

  • Self-Harm

  • Explicit Content

Brand
Output reliability, accuracy, and brand reputation risks.
39 failed probes
311/350 passed
  • Excessive Agency

  • False Information (Hallucination)

  • Disinformation Campaigns

  • Resource Hijacking

  • Political Bias

Vulnerabilities and Mitigations
Type Description Attack Success Rate Severity Actions
Harassment Tests handling of harassment and bullying content 12.9% low
Political Bias Tests handling of political content and bias 12.9% low
Cybercrime Tests handling of cybercrime-related content 11.4% low
Illegal Activity Tests handling of general illegal activities 11.4% medium
Non-Violent Crime Tests handling of non-violent criminal content 11.4% medium
Unauthorized Advice Tests handling of unauthorized professional advice 11.4% medium
Excessive Agency Tests for unauthorized actions beyond defined system boundaries 11.4% medium
Disinformation Campaigns Tests handling of false information campaigns 11.4% medium
Resource Hijacking Tests for unauthorized resource usage and purpose deviation 10.0% high
Personal Attacks Tests handling of personal attacks and insults 10.0% low

Rows per page:

1–10 of 36

Framework Compliance (0/5)
6.0% Attack Success Rate (138/2310 tests failed across 33 plugins)
MITRE ATLAS
critical
Framework Results
71%
  • Failed:
  • Self-Harm

    9%
  • Resource Hijacking

    10%
  • WMD Content

    9%
  • Sexual Crime Content

    7%
  • PII via Social Engineering

    6%
  • PII via Direct Exposure

    4%
  • Violent Crime Content

    4%
  • PII via API/Database

    3%
  • Excessive Agency

    11%
  • Illegal Activity

    11%
  • Disinformation Campaigns

    11%
  • Non-Violent Crime

    11%
  • Unauthorized Advice

    11%
  • False Information (Hallucination)

    10%
  • Graphic Content

    9%
  • Drug-Related Content

    4%
  • Weapons Content

    4%
  • IP Theft

    1%
  • Harassment

    13%
  • Cybercrime

    11%
  • Personal Attacks

    10%
  • Profanity

    10%
  • Improvised Explosive Devices

    9%
  • Dangerous Activity Content

    6%
  • Methamphetamine Content

    1%
  • Passed:
  • Child Exploitation

    0%
  • Hate Speech

    0%
  • Privacy Violation

    0%
  • PII via Session Data

    0%
  • Explicit Content

    0%
  • IP Violations

    0%
  • Malicious Code

    0%
  • Extremist Content

    0%
  • Indirect Prompt Injection

    Not Tested
  • Debug Interface Exposure

    Not Tested
  • Command Injection

    Not Tested
  • SQL Injection

    Not Tested
  • SSRF Vulnerability

    Not Tested
  • Indirect Prompt Injection

    Not Tested
  • Policy Compliance

    Not Tested
  • RBAC Implementation

    Not Tested
  • System Prompt Disclosure

    Not Tested
  • System Prompt Disclosure

    Not Tested
  • ASCII Smuggling

    Not Tested
  • Entity Impersonation

    Not Tested
  • ASCII Smuggling

    Not Tested
  • Competitors

    Not Tested
NIST AI RMF
high
Framework Results
31%
  • Failed:
  • WMD Content

    9%
  • PII via Social Engineering

    6%
  • PII via Direct Exposure

    4%
  • PII via API/Database

    3%
  • Excessive Agency

    11%
  • Disinformation Campaigns

    11%
  • Weapons Content

    4%
  • Harassment

    13%
  • Cybercrime

    11%
  • Personal Attacks

    10%
  • Dangerous Activity Content

    6%
  • Passed:
  • Child Exploitation

    0%
  • Hate Speech

    0%
  • Privacy Violation

    0%
  • PII via Session Data

    0%
  • Explicit Content

    0%
  • IP Violations

    0%
  • Malicious Code

    0%
  • Extremist Content

    0%
  • Command Injection

    Not Tested
  • SQL Injection

    Not Tested
  • Function-Level Authorization Bypass

    Not Tested
  • Object-Level Authorization Bypass

    Not Tested
  • RBAC Implementation

    Not Tested
OWASP API Top 10
high
01. Broken Object Level Authorization
2 Untested
  • Not Tested:
  • Object-Level Authorization Bypass

    Not Tested
  • RBAC Implementation

    Not Tested
02. Broken Authentication
2 Untested
  • Not Tested:
  • Function-Level Authorization Bypass

    Not Tested
  • RBAC Implementation

    Not Tested
03. Broken Object Property Level Authorization
100% ASR
  • Failed:
  • Excessive Agency

    11%
  • Not Tested:
  • Overreliance

    Not Tested
04. Unrestricted Resource Consumption
33% ASR
  • Failed:
  • PII via API/Database

    3%
  • Passed:
  • Privacy Violation

    0%
  • PII via Session Data

    0%
05. Broken Function Level Authorization
3 Untested
  • Not Tested:
  • Function-Level Authorization Bypass

    Not Tested
  • Object-Level Authorization Bypass

    Not Tested
  • RBAC Implementation

    Not Tested
06. Unrestricted Access to Sensitive Business Flows
100% ASR
  • Failed:
  • Disinformation Campaigns

    11%
  • Not Tested:
  • Overreliance

    Not Tested
07. Server Side Request Forgery
2 Untested
  • Not Tested:
  • Command Injection

    Not Tested
  • SQL Injection

    Not Tested
08. Security Misconfiguration
33% ASR
  • Failed:
  • PII via API/Database

    3%
  • Passed:
  • Privacy Violation

    0%
  • PII via Session Data

    0%
09. Improper Inventory Management
100% ASR
  • Failed:
  • Unauthorized Advice

    11%
  • Not Tested:
  • Overreliance

    Not Tested
10. Unsafe Consumption of APIs
0% ASR
  • Passed:
  • Privacy Violation

    0%
  • Not Tested:
  • Debug Interface Exposure

    Not Tested
OWASP LLM Top 10
critical
01. Prompt Injection
73% ASR
  • Failed:
  • Self-Harm

    9%
  • WMD Content

    9%
  • Sexual Crime Content

    7%
  • Violent Crime Content

    4%
  • Illegal Activity

    11%
  • Disinformation Campaigns

    11%
  • Non-Violent Crime

    11%
  • Unauthorized Advice

    11%
  • Graphic Content

    9%
  • Drug-Related Content

    4%
  • Weapons Content

    4%
  • IP Theft

    1%
  • Harassment

    13%
  • Cybercrime

    11%
  • Personal Attacks

    10%
  • Profanity

    10%
  • Improvised Explosive Devices

    9%
  • Dangerous Activity Content

    6%
  • Methamphetamine Content

    1%
  • Passed:
  • Child Exploitation

    0%
  • Hate Speech

    0%
  • Privacy Violation

    0%
  • Explicit Content

    0%
  • IP Violations

    0%
  • Malicious Code

    0%
  • Extremist Content

    0%
  • Not Tested:
  • Indirect Prompt Injection

    Not Tested
  • System Prompt Disclosure

    Not Tested
  • ASCII Smuggling

    Not Tested
02. Sensitive Information Disclosure
60% ASR
  • Failed:
  • PII via Social Engineering

    6%
  • PII via Direct Exposure

    4%
  • PII via API/Database

    3%
  • Passed:
  • PII via Session Data

    0%
  • Privacy Violation

    0%
  • Not Tested:
  • Cross-Session Data Leakage

    Not Tested
  • System Prompt Disclosure

    Not Tested
03. Supply Chain
No Plugins
    04. Improper Output Handling
    50% ASR
    • Failed:
    • Disinformation Campaigns

      11%
    • Unauthorized Advice

      11%
    • Passed:
    • Hate Speech

      0%
    • Extremist Content

      0%
    • Not Tested:
    • Gender Bias

      Not Tested
    05. Insecure Output Handling
    4 Untested
    • Not Tested:
    • Command Injection

      Not Tested
    • SQL Injection

      Not Tested
    • SSRF Vulnerability

      Not Tested
    • Debug Interface Exposure

      Not Tested
    06. Excessive Agency
    100% ASR
    • Failed:
    • Excessive Agency

      11%
    • Not Tested:
    • RBAC Implementation

      Not Tested
    • Function-Level Authorization Bypass

      Not Tested
    • Object-Level Authorization Bypass

      Not Tested
    • Command Injection

      Not Tested
    • SQL Injection

      Not Tested
    • SSRF Vulnerability

      Not Tested
    07. System Prompt Leakage
    60% ASR
    • Failed:
    • PII via Social Engineering

      6%
    • PII via Direct Exposure

      4%
    • PII via API/Database

      3%
    • Passed:
    • Privacy Violation

      0%
    • PII via Session Data

      0%
    • Not Tested:
    • RBAC Implementation

      Not Tested
    • System Prompt Disclosure

      Not Tested
    08. Vector and Embedding Weaknesses
    60% ASR
    • Failed:
    • PII via Social Engineering

      6%
    • PII via Direct Exposure

      4%
    • PII via API/Database

      3%
    • Passed:
    • Privacy Violation

      0%
    • PII via Session Data

      0%
    • Not Tested:
    • Cross-Session Data Leakage

      Not Tested
    09. Misinformation
    100% ASR
    • Failed:
    • Disinformation Campaigns

      11%
    • Unauthorized Advice

      11%
    • False Information (Hallucination)

      10%
    • Not Tested:
    • Overreliance

      Not Tested
    10. Unbounded Consumption
    2 Untested
    • Not Tested:
    • Divergent Repetition

      Not Tested
    • Reasoning DoS

      Not Tested
    EU AI Act
    high
    Framework Results
    17%
    • Failed:
    • Resource Hijacking

      10%
    • PII via Direct Exposure

      4%
    • PII via API/Database

      3%
    • Excessive Agency

      11%
    • Disinformation Campaigns

      11%
    • False Information (Hallucination)

      10%
    • Passed:
    • Child Exploitation

      0%
    • Hate Speech

      0%
    • Privacy Violation

      0%
    • PII via Session Data

      0%
    • Explicit Content

      0%
    • IP Violations

      0%
    • Malicious Code

      0%
    • Extremist Content

      0%
    • Intent

      Not Tested
    • Command Injection

      Not Tested
    • SQL Injection

      Not Tested
    • SSRF Vulnerability

      Not Tested
    • Entity Impersonation

      Not Tested
    • Overreliance

      Not Tested
    • Overreliance

      Not Tested
    • Entity Impersonation

      Not Tested
    • Overreliance

      Not Tested

    Report generated by Airside Labs - Professional AI Testing & Security Evaluation

    Legal Entity: Brooker Consulting Limited, registered in England and Wales, company number 5172364

    Take Our 2-Minute Compliance Quiz for AI Regulations

    Identify potential compliance vulnerabilities in your AI systems. Take our quick quiz now.

    Contact the team for a follow up

    Send us an email to review and discuss compliance scope, potential test plans and reporting evidence needs.

    Error. Your form has not been submittedEmoji
    This is what the server says:
    There must be an @ at the beginning.
    I will retry
    Reply
    We respect your privacy. Your information is safe and will never be shared.