
Data exfiltration successful

Goal

Detect successful data exfiltration from AI-enabled services. This detection identifies when an attacker has successfully manipulated an LLM to leak sensitive information, including PII, credentials, or other confidential data.

Strategy

Monitor application security events for successful (unblocked) data exfiltration by matching @ai_guard.attack_categories:data-exfiltration and excluding blocked requests with -@ai_guard.blocked:true. Integration with Sensitive Data Scanner (@ai_guard.sds.categories) enables precise classification of the leaked data type.
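As a rough illustration of what the two filters select, the following Python sketch applies the same logic to a single event. The nested dictionary shape and the function name `is_unblocked_exfiltration` are assumptions made for this example; they are not part of the product's API, and the rule itself is evaluated by the detection engine, not by code like this.

```python
from typing import Any, Mapping


def is_unblocked_exfiltration(event: Mapping[str, Any]) -> bool:
    """Return True when an event would match both rule filters.

    Assumes a hypothetical event shape: {"ai_guard": {...}}.
    """
    ai_guard = event.get("ai_guard", {})
    categories = ai_guard.get("attack_categories", [])
    # @ai_guard.attack_categories:data-exfiltration
    is_exfiltration = "data-exfiltration" in categories
    # -@ai_guard.blocked:true excludes only events explicitly marked as
    # blocked, so an event without the attribute still matches.
    not_blocked = ai_guard.get("blocked") is not True
    return is_exfiltration and not_blocked


sample = {"ai_guard": {"attack_categories": ["data-exfiltration"], "blocked": False}}
print(is_unblocked_exfiltration(sample))
```

Treating a missing `blocked` attribute as "not blocked" follows from the negated filter: it removes events where the attribute is true and leaves everything else in scope.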

Signal severity is determined as follows:

  • CRITICAL: Data exfiltration was not blocked and included PII or credentials (@ai_guard.sds.categories:(pii OR credentials)). This represents a confirmed data breach with potential regulatory implications.
  • HIGH: Data exfiltration was not blocked and contained either other sensitive data categories or unclassified data. This represents a security incident requiring investigation.
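The severity split above can be sketched as a small Python function. This is a minimal illustration that assumes the event has already matched the unblocked data-exfiltration filters; the function name `signal_severity` and the list-of-strings input are hypothetical, chosen only to mirror the @ai_guard.sds.categories attribute.

```python
from typing import Iterable, Optional

# Categories that raise the signal to CRITICAL, per the rule description.
CRITICAL_CATEGORIES = {"pii", "credentials"}


def signal_severity(sds_categories: Optional[Iterable[str]]) -> str:
    """Map Sensitive Data Scanner categories to a signal severity.

    Assumes the event is already an unblocked data-exfiltration match.
    """
    found = set(sds_categories or [])
    if found & CRITICAL_CATEGORIES:
        return "CRITICAL"
    # Other sensitive categories, or no classification at all.
    return "HIGH"


print(signal_severity(["credentials"]))
print(signal_severity([]))
```

Because every matching event is at least HIGH, the function has no lower tier: the only question it answers is whether PII or credentials were among the classified categories.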

Triage and response

  1. Enable AI Guard in blocking mode for the affected service or tool to prevent further exfiltration.
  2. Block the attacking IP addresses to interrupt ongoing exploitation.
  3. Review the flagged requests and LLM responses to identify exactly what data was exfiltrated.
  4. Determine the breach scope: identify all affected users, sessions, and data records that may have been exposed.
  5. For critical signals with PII or credentials, assess reporting obligations under GDPR, CCPA, or other applicable data protection regulations.
  6. Review system prompts, input sanitization, and output filtering to harden defenses against future attempts.