Early Flake Detection


Overview

Early Flake Detection is Datadog’s test flakiness solution that enhances code quality by identifying flaky tests early in the development cycle. For more information about flaky tests, see Flaky Test Management.

By running newly added tests multiple times, Datadog can detect flakiness before these tests are merged into the default branch. A study shows that up to 75% of flaky tests can be identified with this approach.

Known Tests: Datadog’s backend stores unique tests for a given test service. Before a test session runs, the Datadog library fetches the list of these known tests.
Detection of New Tests: If a test is not in the list of known tests, it is considered new and is automatically retried up to ten times.
Flakiness Identification: Running a test multiple times helps uncover issues like race conditions, which may cause the test to pass and fail intermittently. If any of the test attempts fail, the test is automatically tagged as flaky.
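
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the Datadog library's implementation: `run_with_early_flake_detection`, `MAX_RETRIES`, and the `run_test` callable are hypothetical names. The sketch assumes that "retried up to ten times" means up to ten runs after the initial one, and that every retry is executed.

```python
from typing import Callable, Dict, Set

MAX_RETRIES = 10  # assumption: "retried up to ten times" = up to 10 extra runs


def run_with_early_flake_detection(
    test_id: str,
    run_test: Callable[[], bool],
    known_tests: Set[str],
) -> Dict[str, object]:
    """Run one test and return a small result record (hypothetical shape)."""
    # Detection of new tests: anything missing from the known list is new.
    is_new = test_id not in known_tests

    results = [run_test()]  # initial execution
    if is_new:
        for _ in range(MAX_RETRIES):
            results.append(run_test())

    # Flakiness identification: a new test is tagged if any attempt fails.
    is_flaky = is_new and not all(results)
    return {"is_new": is_new, "attempts": len(results), "is_flaky": is_flaky}


# Usage: a new test that fails on its third execution only.
known = {"test_login"}
outcomes = iter([True, True, False] + [True] * 8)
record = run_with_early_flake_detection(
    "test_checkout", lambda: next(outcomes), known
)
```

Here `record` reports a new test with eleven attempts that is tagged as flaky, because one attempt failed. A test already in `known` would run once and never be tagged by this mechanism.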

Running a test multiple times increases the likelihood of exposing random conditions that cause flakiness. Early Flake Detection helps ensure that only stable, reliable tests are integrated into the default branch:

How Early Flake Detection works in your commits
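
To see why repetition helps, consider a simplified model in which each run of a flaky test fails independently with a fixed probability. This model is an assumption made for illustration and is not taken from Datadog's documentation. Under it, the chance that at least one of `runs` executions fails is `1 - (1 - p_fail) ** runs`. The helper name `detection_probability` is hypothetical.

```python
def detection_probability(p_fail: float, runs: int) -> float:
    """Chance that at least one of `runs` independent executions fails."""
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return 1.0 - (1.0 - p_fail) ** runs


# A test that fails 10% of the time, run once versus eleven times
# (one initial run plus ten retries, an assumed reading of the retry count).
single_run = detection_probability(0.10, 1)
with_retries = detection_probability(0.10, 11)
```

With these inputs, a single run exposes the failure about 10% of the time, while eleven runs expose it about 69% of the time. Real flaky tests rarely fail independently from run to run, so treat these figures as intuition rather than a guarantee.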

You can choose to block the merge of the feature branch with a Quality Gate. For more information, see the Quality Gates documentation.

Setup

Before implementing Early Flake Detection, you must configure Test Optimization for your development environment. If you are reporting data through the Datadog Agent, use Agent v6.40 or later, or v7.40 or later.

Configuration

After you have set up your Datadog library for Test Optimization, you can configure Early Flake Detection from the Test Optimization Settings page.

Early Flake Detection in Test Service Settings.

Navigate to Software Delivery > Test Optimization > Settings.
Click Configure on the Early Flake Detection column for a test service.
Click the toggle to enable Early Flake Detection and add or modify the list of Excluded Branches from Early Flake Detection.

Enabling Early Flake Detection and defining excluded branches in the test service configuration

Compatibility

The required test framework and dd-trace versions are:

dd-trace-js:

>=5.23.0 for the 5.x release.
>=4.47.0 for the 4.x release.

The test framework compatibility is the same as Test Optimization Compatibility, with the exception of Playwright, which is only supported from >=1.38.0.

dd-trace-java >= 1.34.0

The test framework compatibility is the same as Test Optimization Compatibility, with the exception of Scala Weaver.

dd-trace-dotnet >= 2.51.0

dd-trace-py >= 3.0.0 (pytest >= 7.2.0)

datadog-ci-rb >= 1.5.0

orchestrion >= 0.9.4 and dd-trace-go >= 1.69.1

dd-sdk-swift-testing >= 2.5.2
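
For a Python project, the minimums above can be checked against the installed packages. The following is a rough sketch, assuming that dd-trace-py is installed under the PyPI distribution name `ddtrace`. The helper names are hypothetical, and the version parser is deliberately simple: it ignores pre-release suffixes, so `3.0.0rc1` is treated like `3.0.0`.

```python
import re
from importlib import metadata
from typing import Dict, Optional, Tuple

# Minimums for the Python library, taken from the list above.
# Assumption: dd-trace-py is distributed on PyPI as "ddtrace".
MINIMUMS = {"ddtrace": "3.0.0", "pytest": "7.2.0"}


def parse_version(text: str) -> Tuple[int, ...]:
    """Keep only the leading dotted integers, e.g. '3.0.0rc1' -> (3, 0, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version: {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare versions after padding with zeros so '7.2' equals '7.2.0'."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment() -> Dict[str, Optional[bool]]:
    """Map each package to True/False, or None if it is not installed."""
    report: Dict[str, Optional[bool]] = {}
    for package, minimum in MINIMUMS.items():
        try:
            report[package] = meets_minimum(metadata.version(package), minimum)
        except metadata.PackageNotFoundError:
            report[package] = None
    return report
```

Calling `check_environment()` in the environment that runs your tests returns one entry per package, which makes a missing or outdated dependency easy to spot before enabling the feature.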

Manage Excluded Branches

Tests on excluded branches are never retried by Early Flake Detection, and tests that run on these branches are not considered new. You can manage the list of excluded branches on the Test Optimization Settings page to fit your workflow and branch structure.

Explore results in the Test Optimization Explorer

You can use the following facets to query sessions that run Early Flake Detection and new tests in the Test Optimization Explorer.

Test Session: Test sessions running Early Flake Detection have the @test.early_flake.enabled tag set to true.
New Tests: New tests have the @test.is_new tag set to true, and retries of a new test have the @test.is_retry tag set to true.
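
The facets above can be combined into search queries. The sketch below builds the query strings in Python so they can be reused, for example in scripts or saved links. It assumes the usual `@attribute:value` search syntax, with space-separated terms combined as AND. The `build_query` helper is hypothetical.

```python
from typing import Mapping


def build_query(facets: Mapping[str, bool]) -> str:
    """Join facet filters with spaces (assumed to be combined as AND)."""
    return " ".join(
        f"{name}:{str(value).lower()}" for name, value in facets.items()
    )


# Test sessions that ran Early Flake Detection.
SESSIONS_WITH_EFD = build_query({"@test.early_flake.enabled": True})

# Retries of new tests.
NEW_TEST_RETRIES = build_query({"@test.is_new": True, "@test.is_retry": True})
```

The first constant evaluates to `@test.early_flake.enabled:true` and the second to `@test.is_new:true @test.is_retry:true`. Either string can be pasted into the search bar of the Test Optimization Explorer.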

Troubleshooting

If you suspect there are issues with Early Flake Detection, navigate to the Test Optimization Settings page, look for your repository, and click Configure. Disable Early Flake Detection by clicking on the toggle.

A new test is not being retried

This can happen for one of two reasons:

The test has run previously, so it is already in the list of known tests.
The test takes longer than five minutes to run. Early Flake Detection skips tests that are too slow, because retrying them could cause significant delays in CI pipelines.
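
The two checks above can be expressed as a small diagnostic helper. This is a sketch under stated assumptions, not the library's logic: the function and constant names are hypothetical, and the only threshold used is the five-minute limit mentioned above.

```python
from typing import Optional, Set

SLOW_TEST_THRESHOLD_SECONDS = 5 * 60  # the five-minute limit from the text


def why_not_retried(
    test_id: str,
    duration_seconds: float,
    known_tests: Set[str],
) -> Optional[str]:
    """Return a reason the test was not retried, or None if none applies."""
    if test_id in known_tests:
        return "known test: it has run previously"
    if duration_seconds > SLOW_TEST_THRESHOLD_SECONDS:
        return "slow test: it takes longer than five minutes"
    return None


# Usage with made-up test names and durations.
known = {"test_login"}
reason_known = why_not_retried("test_login", 2.0, known)
reason_slow = why_not_retried("test_full_export", 420.0, known)
reason_none = why_not_retried("test_checkout", 1.5, known)
```

If the helper returns `None` for a test that was still not retried, neither documented cause applies, and it is worth checking whether the test ran on an excluded branch.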

A test was retried that is not new

If the Datadog library can’t fetch the full list of known tests, it may retry tests that are not new. There is a mechanism to prevent this error from slowing down the CI pipeline, but if it happens, contact Datadog Support.