
Overview

A New Flaky Test PR Gate blocks pull requests that introduce at least one new flaky test. This helps keep flaky tests out of your default branch.

A new flaky test is a test that:

  • was added in the pull request branch and has not run in any other branch.
  • has exhibited flaky behavior for the first time.

Early Flake Detection and automatic fix verification with Attempt To Fix are supported only with native libraries, not JUnit XML uploads. If you use JUnit XML uploads, verify fixes manually and mark the test as Fixed in Flaky Tests Management.

Before you begin:

How it works

If Datadog detects at least one new flaky test in the feature branch, the PR gate fails.
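
The rule above can be sketched as a small predicate plus a gate function. This is only an illustration, not Datadog's implementation: the `ObservedTest` record and its fields are hypothetical, and it assumes both conditions in the definition of a new flaky test must hold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservedTest:
    """Hypothetical record of what is known about one test."""
    name: str
    branches_run_on: frozenset[str]  # every branch the test has run on
    first_time_flaky: bool           # flaky behavior seen for the first time


def is_new_flaky(test: ObservedTest, pr_branch: str) -> bool:
    # Assumption: a test is "new" only if it has run on the PR branch
    # and on no other branch.
    only_on_pr_branch = test.branches_run_on == frozenset({pr_branch})
    return only_on_pr_branch and test.first_time_flaky


def evaluate_gate(tests: list[ObservedTest], pr_branch: str) -> tuple[str, list[str]]:
    """Fail the gate if at least one new flaky test is present."""
    new_flaky = [t.name for t in tests if is_new_flaky(t, pr_branch)]
    return ("fail" if new_flaky else "pass", new_flaky)
```

In this sketch, a test that is flaky but has already run on another branch does not fail the gate, because it is not new to the pull request.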

Set up the PR gate

  1. Go to Create Rule and select New Flaky Tests.

A PR Gate rule that fails when a pull request introduces at least one new flaky test

  2. Select the repositories where the rule should be evaluated. By default, it applies to all repositories in your organization.

  3. Optional: enable Early Flake Detection to improve detection of new flaky tests.

After you create the PR gate, a new status check appears in pull requests from the selected repositories.

Preview of the New Flaky Test PR gate

In a pull request

The following screenshots use GitHub as an example. The status check and optional PR comment appear in the pull request.

If Datadog detects a new flaky test, the status check fails:

GitHub pull request check failing because a new flaky test is detected

If PR comments are enabled, Datadog adds a comment listing the new flaky tests and their error messages:

GitHub pull request comment showing the new flaky test

When you open the failed check, Datadog shows the PR gate details in the Datadog UI:

Datadog PR gate detail view

Make the PR gate pass

The PR gate stays red until the flaky test is marked as Fixed in Flaky Tests Management.

There are two ways to mark the new flaky test as Fixed:

  1. (Recommended) Automatically triggered Attempt To Fix
  2. Manually mark the test as Fixed

Automatically triggered Attempt To Fix

If the new flaky test is reported by one of the supported native libraries, the next test execution after detection automatically triggers Attempt To Fix. The native library retries the test to verify whether the flakiness has been resolved. The expected flow is:

  1. A developer pushes a new test.
  2. CI runs and the test is detected as newly flaky.
  3. The New Flaky Test PR gate fails.
  4. A developer pushes a fix.
  5. The native library retries the newly flaky test to verify the fix.
    • If the fix worked, the PR gate passes.
    • If the fix did not work, the PR gate fails.
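
The outcome of step 5 can be modeled as a transition between the Active and Fixed statuses. The sketch below is a simplified illustration: the function names are hypothetical, and the rule that every retry must pass is an assumption, since the retry count and pass criteria used by the native libraries are not described here.

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "Active"
    FIXED = "Fixed"


def status_after_retries(retry_passed: list[bool]) -> Status:
    """Decide the test status from the retry outcomes of one CI run."""
    # Assumption: the fix counts as verified only if there was at least
    # one retry and every retry passed.
    if retry_passed and all(retry_passed):
        return Status.FIXED
    return Status.ACTIVE


def gate_result(status: Status) -> str:
    # The gate passes only once the test is marked as Fixed.
    return "pass" if status is Status.FIXED else "fail"
```

A single failing retry keeps the test Active in this model, so the gate continues to fail until a later push produces a clean set of retries.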

Manually mark the test as Fixed

Note: This is not necessary if the test is reported with a supported native library.

If the new flaky test is reported with JUnit XML upload, you must verify the fix yourself and then mark the test as fixed manually. The expected flow is:

  1. A developer pushes a new test.
  2. CI runs and the test is detected as newly flaky.
  3. The New Flaky Test PR gate fails.
  4. A developer pushes a fix.
  5. Verify the fix manually, for example by retrying CI or using programmatic retries.
    • If the fix worked, mark the test as Fixed manually. The PR gate passes.
    • If the fix did not work, keep iterating. The PR gate fails.
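
One way to approach the programmatic retries mentioned in step 5 is to run the test body repeatedly and stop at the first failure. This is a minimal sketch: `passes_consistently` and the default of 20 attempts are illustrative choices, not part of any Datadog tooling.

```python
from typing import Callable


def passes_consistently(test_fn: Callable[[], None], attempts: int = 20) -> bool:
    """Run test_fn up to `attempts` times; return False on the first failure."""
    for _ in range(attempts):
        try:
            test_fn()
        except Exception:
            # AssertionError and other test errors both count as a failure.
            return False
    return True
```

A run of consecutive passes raises confidence but does not prove the flakiness is gone, so choose the number of attempts based on how rarely the original failure appeared.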

How to manually mark the new flaky test as Fixed

In the Datadog UI, follow the link from the PR gate details:

Datadog PR gate detail view with a link to Flaky Tests Management

This link opens Flaky Tests Management in the Datadog UI with the test selected:

Datadog Flaky Tests Management view with the selected test

Change the status from Active to Fixed:

Datadog Flaky Tests Management status menu showing Fixed

Require the PR gate in GitHub

By default, the PR gate is optional, so pull requests can still be merged when a new flaky test is detected.

If you use GitHub, you can block merging by marking the PR gate as a required status check in branch protection:

GitHub branch protection rule requiring the PR gate status check

For more information, see the GitHub documentation for status checks.
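
If you manage branch protection through the GitHub REST API instead of the UI, the request might look like the sketch below. It only builds the request and does not send it. It assumes branch protection with status checks is already enabled on the branch, and the check name passed in is a placeholder: copy the exact status check name from one of your pull requests. The list of checks sent is assumed to replace the existing one, so include any checks you already require.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_required_checks_request(
    owner: str, repo: str, branch: str, check_names: list[str], token: str
) -> urllib.request.Request:
    """Build a PATCH request for the branch's required status checks."""
    url = (
        f"{GITHUB_API}/repos/{owner}/{repo}"
        f"/branches/{branch}/protection/required_status_checks"
    )
    # Each entry names one status check that must pass before merging.
    body = {"checks": [{"context": name} for name in check_names]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` with a token that has permission to administer the repository.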
