Incident AI

Overview

Incident AI transforms how your team manages incidents by automating coordination tasks and providing intelligent insights throughout the incident lifecycle. Built into Datadog Incident Management, it works in Slack and the Datadog platform to help you respond faster and learn from every incident.

Key capabilities include:

  • Incident summaries: Get context-aware summaries when you join incident channels.
  • Related incident detection: Automatically detect related incidents to identify systemic and recurring issues.
  • Request information or take action: Declare incidents, update severity and status, search incident history, and more—all through conversational prompts in Slack.
  • AI-enhanced notifications: Dynamically populate stakeholder updates with AI-generated summaries of contributing factors, impact, and remediation across email, MS Teams, Slack, and other channels.
  • Automated follow-ups: Incident AI collects action items mentioned during incidents and suggests them as follow-up tasks when the incident is resolved.
  • Intelligent postmortems: Generate comprehensive first drafts with AI-powered sections covering executive summaries, timelines, customer impact, and lessons learned—giving responders a strong foundation to build on.

Get started with incident coordination

Incident AI helps coordinate incidents—especially those involving multiple teams—by suggesting next steps throughout the incident lifecycle. This streamlines communication and improves overall process management.

  1. Connect Datadog to Slack.
    1. In any Slack channel, run the /dd connect command.
    2. Follow the on-screen prompts to complete the connection process.
  2. Enable the Slack integration in Datadog Incident Management.
    1. In the Integrations section of the Incidents settings page, find the Slack settings.
    2. Enable the following toggles:
      • Push Slack channel messages to the incident timeline
      • Activate Incident AI features in incident Slack channels for your organization
        Note: Incident AI’s incident management features can only be activated for one Datadog organization within a single Slack workspace.
  3. To interact with Incident AI in a Slack channel, invite it to the channel by mentioning @Datadog.

Customize stakeholder notifications

Incident AI can dynamically populate key details in stakeholder notifications, delivering clearer, faster updates across the tools your team already uses. Notification rules support delivery to a wide variety of destinations, including email, Datadog On-Call, MS Teams, Slack, and more, ensuring AI-enhanced updates reach the right people, on the right platform, at the right time.

  1. In your Incidents settings, go to Notification Templates.
  2. Create a new template or edit an existing one.
  3. In the message body, insert any of the following AI variables:
    Field                    Variable
    AI Contributing Factors  {{incident.ai_contributing_factors}}
    AI Impact                {{incident.ai_impact}}
    AI Issue                 {{incident.ai_issue}}
    AI Remediation           {{incident.ai_remediation}}
  4. Click Save to save the template.
  5. Go to your incident Notification Rules.
  6. Click New Rule.
  7. Under With template…, select the message template you just created.
  8. Click Save to save the notification rule.
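
Before saving a template, it can help to check that the message body only uses the AI variables listed above. The following is a minimal sketch of such a check, not part of Datadog's tooling. The function name, the sample message body, and the deliberately invalid `{{incident.ai_root_cause}}` placeholder are all hypothetical, and tolerating whitespace inside the braces is an assumption.

```python
import re

# AI variables listed in this section for notification templates.
KNOWN_AI_VARIABLES = {
    "incident.ai_contributing_factors",
    "incident.ai_impact",
    "incident.ai_issue",
    "incident.ai_remediation",
}

# Matches {{ variable }} placeholders; optional inner whitespace is an assumption.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def unknown_ai_variables(template_body: str) -> list[str]:
    """Return AI-style placeholders in the body that are not in the known set."""
    found = PLACEHOLDER.findall(template_body)
    return sorted(
        {
            name
            for name in found
            if name.startswith("incident.ai_") and name not in KNOWN_AI_VARIABLES
        }
    )


# Hypothetical message body; the last line uses a deliberately invalid variable.
draft = (
    "Issue: {{incident.ai_issue}}\n"
    "Impact: {{incident.ai_impact}}\n"
    "Contributing factors: {{incident.ai_contributing_factors}}\n"
    "Remediation: {{incident.ai_remediation}}\n"
    "Root cause: {{incident.ai_root_cause}}\n"
)

print(unknown_ai_variables(draft))  # ['incident.ai_root_cause']
```

The known set covers only the notification variables from this section, so a template that also uses other incident variables needs a wider list.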

Proactive incident summaries

When you join an incident Slack channel that is connected to Datadog Incident Management, Incident AI automatically posts a summary of key information about the incident, such as the contributing factors, impact, issue, and remediation. This summary is visible only to you.

When an incident's status changes to resolved, Incident AI posts a final summary that is visible to everyone in the channel.

Proactive follow-up task suggestion

After an incident is resolved, Incident AI collects any follow-up tasks responders mentioned during the incident. It then prompts you to review and create them with a single click. These tasks are saved as Incident Follow-Ups in Datadog Incident Management. For more information, see Incident Follow-ups.

To view suggested follow-up tasks:

  1. Navigate to the relevant incident in Datadog.
  2. Open the Remediation tab to view a list of all follow-up tasks you’ve saved from Slack.

Related incident detection

Incident AI automatically flags incidents as related when they are declared within 20 minutes of each other, helping you identify broader systemic issues.
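
To make the 20-minute window concrete, here is a small sketch that pairs up incidents by declaration time. It only illustrates the time criterion stated above and is not Incident AI's actual detection logic, which may consider other signals. The incident IDs, timestamps, function name, and the inclusive boundary are assumptions.

```python
from datetime import datetime, timedelta
from itertools import combinations

RELATED_WINDOW = timedelta(minutes=20)  # window stated in the text above


def related_pairs(declared_at: dict[str, datetime]) -> list[tuple[str, str]]:
    """Return pairs of incident IDs declared within the window of each other."""
    pairs = []
    for (id_a, t_a), (id_b, t_b) in combinations(sorted(declared_at.items()), 2):
        # Treating exactly 20 minutes as "related" is an assumption.
        if abs(t_a - t_b) <= RELATED_WINDOW:
            pairs.append((id_a, id_b))
    return pairs


# Hypothetical incidents and declaration times.
incidents = {
    "incident-101": datetime(2024, 5, 1, 9, 0),
    "incident-102": datetime(2024, 5, 1, 9, 15),
    "incident-103": datetime(2024, 5, 1, 10, 30),
}

print(related_pairs(incidents))  # [('incident-101', 'incident-102')]
```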

Chat with Incident AI

Use natural language prompts to request information or take action from Slack:

Functionality                 Example prompt
Declare an incident           @Datadog Declare an incident
Change severity               @Datadog Update this incident to SEV-3
Change status                 @Datadog Mark this incident as stable
                              @Datadog Resolve this incident
Request new summary           @Datadog Give me a summary of this incident
                              @Datadog Summarize incident-262
                              Note: Private incidents are not summarized.
Search incident history       @Datadog How many incidents are currently ongoing?
                              @Datadog Show me all Sev-1 incidents that occurred in the past week.
Dive into specific incidents  @Datadog What was the root cause of incident-123?
                              @Datadog What remediation actions did the responders take in incident-123?
Find related incidents        @Datadog Are there any related incidents?
                              @Datadog Find me incidents related to DDoS attacks from the past month
Early detection inquiry       @Datadog A customer is unable to check out. Is there an incident?
                              @Datadog Are there any incidents now impacting the payments service?

Customize postmortem templates with AI incident variables

  1. In Datadog, navigate to your incident Postmortem Templates.
  2. Click New Postmortem Template.
  3. Customize your template using the following AI variables for dynamic AI-generated content:
    Description                          Variable
    Executive summary                    {{incident.ai_summary}}
    System context and dependencies      {{incident.ai_system_overview}}
    Key event timeline                   {{incident.ai_key_timeline}}
    Summary of customer impact           {{incident.ai_customer_impact}}
    Follow-up actions                    {{incident.ai_action_items}}
    Key takeaways for future prevention  {{incident.ai_lessons_learned}}

    Note: AI variables must be preceded by a section header.

  4. Click Save. Your new template appears as a template option during postmortem generation.
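
The note that each AI variable must be preceded by a section header can be sketched as a template skeleton plus a simple check. This is an illustrative sketch only. The Markdown-style `## ` header prefix is an assumption about the template editor's syntax, and both function names are hypothetical.

```python
# Description -> AI variable, copied from the table above.
POSTMORTEM_SECTIONS = {
    "Executive summary": "{{incident.ai_summary}}",
    "System context and dependencies": "{{incident.ai_system_overview}}",
    "Key event timeline": "{{incident.ai_key_timeline}}",
    "Summary of customer impact": "{{incident.ai_customer_impact}}",
    "Follow-up actions": "{{incident.ai_action_items}}",
    "Key takeaways for future prevention": "{{incident.ai_lessons_learned}}",
}


def build_template(sections: dict[str, str], header_prefix: str = "## ") -> str:
    """Emit one section header per variable, followed by the variable itself."""
    blocks = [f"{header_prefix}{title}\n{variable}" for title, variable in sections.items()]
    return "\n\n".join(blocks) + "\n"


def variables_missing_header(template: str, header_prefix: str = "## ") -> list[str]:
    """Return AI variables whose closest preceding non-blank line is not a header."""
    missing = []
    previous = ""
    for line in template.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("{{incident.ai_") and not previous.startswith(header_prefix):
            missing.append(stripped)
        previous = stripped
    return missing


template = build_template(POSTMORTEM_SECTIONS)
print(template)
print(variables_missing_header(template))  # []
```

The generated text can be pasted into a new template and adjusted. If the editor uses a different header syntax, change `header_prefix` to match.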

Generate a first draft of the incident postmortem

To generate an AI-assisted postmortem draft:

  1. In Datadog, navigate to the resolved incident you’d like to generate a postmortem for.
  2. Ensure the incident timeline contains at least 10 messages.
  3. Click Generate Postmortem.
  4. Under Choose Template, select either the out-of-the-box General incident with AI content template, or a custom template that you’ve created.
  5. Click Generate. Allow up to one minute for the postmortem to be generated. Do not close the tab during this time.
  6. Review the AI-generated postmortem draft. It serves as a starting point for your incident responders. Datadog recommends reviewing and refining the draft before sharing it.
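
The preconditions in steps 1 and 2 can be expressed as a quick checklist. The sketch below does not call any Datadog API. The function name, the `"resolved"` status string, and the way the message count is supplied are hypothetical.

```python
MIN_TIMELINE_MESSAGES = 10  # minimum stated in step 2 above


def postmortem_blockers(status: str, timeline_message_count: int) -> list[str]:
    """List the reasons an incident is not ready for AI postmortem generation."""
    blockers = []
    if status != "resolved":
        blockers.append("incident is not resolved")
    if timeline_message_count < MIN_TIMELINE_MESSAGES:
        blockers.append(
            f"timeline has {timeline_message_count} messages; "
            f"at least {MIN_TIMELINE_MESSAGES} are needed"
        )
    return blockers


print(postmortem_blockers("resolved", 12))  # []
print(postmortem_blockers("stable", 4))
```

An empty list means both preconditions from the steps above are met.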
