Data exfiltration successful
Goal
Detect successful data exfiltration from AI-enabled services. This detection identifies when an attacker has successfully manipulated an LLM to leak sensitive information, including PII, credentials, or other confidential data. Unlike blocked attempts, these are confirmed security incidents requiring immediate response.
Strategy
Monitor application security events for successful (unblocked) data exfiltration using @ai_guard.attack_categories:data-exfiltration and -@ai_guard.blocked:true. Integration with Sensitive Data Scanner (@ai_guard.sds.categories) enables precise classification of the leaked data type.
Signal severity is determined as follows:
- CRITICAL: Data exfiltration was not blocked and included PII or credentials (@ai_guard.attack_categories:data-exfiltration -@ai_guard.blocked:true @ai_guard.sds.categories:(pii OR credentials)). This represents a confirmed data breach with potential regulatory implications.
- HIGH: Data exfiltration was not blocked, and the leaked data either falls into other sensitive data categories or is unclassified (@ai_guard.attack_categories:data-exfiltration -@ai_guard.blocked:true). This represents a security incident requiring investigation.
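The severity rules above can be sketched as a small Python function, which may help when reasoning about exported events offline. This is only an illustration: the nested dictionary shape (`ai_guard`, `attack_categories`, `blocked`, `sds.categories`) is an assumption modeled on the attribute paths in the queries, and `classify_severity` is a hypothetical helper, not part of any Datadog library.

```python
from typing import Any, Optional

# Categories that escalate a signal to CRITICAL, per the rule description.
HIGH_RISK_CATEGORIES = {"pii", "credentials"}


def classify_severity(event: dict[str, Any]) -> Optional[str]:
    """Return "CRITICAL", "HIGH", or None for a single event.

    The event layout is an assumed stand-in for the @ai_guard.* attributes.
    """
    guard = event.get("ai_guard", {})

    # Only data-exfiltration events are in scope.
    if "data-exfiltration" not in guard.get("attack_categories", []):
        return None

    # -@ai_guard.blocked:true excludes events explicitly marked as blocked;
    # events with blocked=false or no blocked attribute still match.
    if guard.get("blocked") is True:
        return None

    sds_categories = set(guard.get("sds", {}).get("categories", []))
    if sds_categories & HIGH_RISK_CATEGORIES:
        return "CRITICAL"
    return "HIGH"
```

For example, an unblocked event tagged with the `pii` category would be classified as CRITICAL, while the same event without any Sensitive Data Scanner category would be HIGH.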
Triage and response
IMMEDIATE ACTIONS:
1. Enable blocking mode: Immediately enable AI Guard in blocking mode for the affected service or tool to prevent further exfiltration.
2. Block the source: Block the attacking IP addresses to interrupt ongoing exploitation.
3. Review LLM responses: Inspect the flagged requests and LLM responses to identify exactly what data was exfiltrated.
INCIDENT INVESTIGATION:
4. Determine breach scope — Identify all affected users, sessions, and data records that may have been exposed.
5. Forensic analysis — Review:
- Request patterns leading to exfiltration
- System prompts and their effectiveness
- Input sanitization gaps
- Output filtering weaknesses
6. Root cause analysis: Determine how the attacker bypassed AI Guard protections (if enabled) or why blocking mode was not active.
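As a rough aid for scoping the breach, the Python sketch below groups unblocked exfiltration events by user and collects the sessions and data categories involved. It assumes events have been exported as dictionaries; `user_id` and `session_id` are hypothetical field names chosen for illustration, and `summarize_scope` is not a Datadog API.

```python
from collections import defaultdict
from typing import Any, Iterable


def summarize_scope(events: Iterable[dict[str, Any]]) -> dict[str, dict[str, set]]:
    """Map each affected user to the sessions and data categories exposed.

    Field names (user_id, session_id, ai_guard.*) are assumptions about
    how the exported events are shaped.
    """
    scope: dict[str, dict[str, set]] = defaultdict(
        lambda: {"sessions": set(), "categories": set()}
    )
    for event in events:
        guard = event.get("ai_guard", {})
        # Keep only data-exfiltration events that were not blocked.
        if "data-exfiltration" not in guard.get("attack_categories", []):
            continue
        if guard.get("blocked") is True:
            continue

        entry = scope[event.get("user_id", "unknown")]
        if event.get("session_id"):
            entry["sessions"].add(event["session_id"])
        entry["categories"].update(guard.get("sds", {}).get("categories", []))
    return dict(scope)
```

The resulting per-user summary gives a starting list of accounts and sessions to review; it does not replace inspecting the underlying requests and responses.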
REMEDIATION:
7. User notification — For critical signals with PII/credentials, notify affected users according to your incident response plan.
8. Regulatory reporting — Assess reporting obligations under GDPR, CCPA, or other applicable data protection regulations.
9. Harden defenses — Update system prompts, implement stricter output filters, and enhance input sanitization.
10. Security review — Conduct a comprehensive security review of your AI service architecture and data access patterns.
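To make the idea of a stricter output filter concrete, here is a minimal Python sketch that redacts a couple of sensitive-looking patterns from an LLM response before it is returned. The patterns are illustrative assumptions only (a simple email matcher and a string shaped like an AWS access key ID); they are far from exhaustive, and `redact_output` is a hypothetical helper rather than a feature of AI Guard or Sensitive Data Scanner.

```python
import re

# Illustrative patterns only; a real filter would need a much broader set.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def redact_output(text: str) -> tuple[str, list[str]]:
    """Replace matches of each pattern and report which patterns fired."""
    hits: list[str] = []
    for name, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED:{name}]", text)
        if count:
            hits.append(name)
    return text, hits
```

A filter like this could run on model output as a last line of defense, with the returned list of pattern names logged for later investigation. It complements, rather than replaces, blocking mode and input sanitization.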