Setting up Deployment Gates involves two steps: configuring a gate and its rules, and evaluating the gate from your deployment pipeline.

Each gate has the following settings:
- Service (example: transaction-backend).
- Environment (example: dev).
- Identifier (optional, defaults to default): Unique name for multiple gates on the same service/environment. This can be used to tell apart gates such as fast-deploy vs default, pre-deploy vs post-deploy, or pre-deploy vs canary-20pct.
- Evaluation mode: Set to Dry Run to test gate behavior without impacting deployments. The evaluation of a dry run gate always responds with a pass status, but the in-app result is the real status based on rules evaluation. This is particularly useful when performing an initial evaluation of the gate behavior without impacting the deployment pipeline.

Each gate requires one or more rules. All rules must pass for the gate to succeed. For each rule, specify:
- Name (example: Check all P0 monitors).
- Type: Monitor or Faulty Deployment Detection.
- Evaluation mode: Set to Dry Run to test rule evaluation without affecting the gate result.

The Monitor rule type evaluates the state of your monitors. The evaluation fails if:
- A matching monitor is in the ALERT or NO_DATA state.

In the Query field, enter a monitor query using Search Monitor syntax. Use the following syntax to filter on specific tags:
- service:transaction-backend
- scope:"service:transaction-backend"
- group:"service:transaction-backend"

Example queries:
- env:prod service:transaction-backend
- env:prod (service:transaction-backend OR group:"service:transaction-backend" OR scope:"service:transaction-backend")
- tag:"use_deployment_gates" team:payment
- tag:"use_deployment_gates" AND (NOT group:("team:frontend"))

group filters evaluate only matching groups.

(muted:false).

The Faulty Deployment Detection rule type uses Watchdog’s APM Faulty Deployment Detection to compare the deployed version against previous versions of the same service. It can detect:

database or inferred service.

After a gate is configured with at least one rule, you can evaluate the gate while deploying the related service with an API call:
# "identifier" is optional and defaults to "default".
# "version" is required for APM Faulty Deployment Detection rules.
# "primary_tag" is optional and scopes down APM Faulty Deployment Detection rules analysis to the selected primary tag.
curl -X POST "https://api.<YOUR_DD_SITE>/api/unstable/deployments/gates/evaluate" \
  -H "Content-Type: application/json" \
  -H "DD-API-KEY: <YOUR_API_KEY>" \
  -d @- << EOF
{
  "data": {
    "type": "deployment_gates_evaluation_request",
    "attributes": {
      "service": "transaction-backend",
      "env": "staging",
      "identifier": "my-custom-identifier",
      "version": "v123-456",
      "primary_tag": "region:us-central-1"
    }
  }
}
EOF
Note: A 404 HTTP response can mean either that the gate was not found, or that the gate was found but has no rules.
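Because one status code can have more than one cause, it can help to turn the code into a readable message before deciding what to do next. The following is a minimal sketch: describe_status is a hypothetical helper name and the message wording is illustrative. Only the 404 meaning comes from the note above; treating 5xx as a retryable server error is an assumption shared with the retry script on this page.

```shell
# Hypothetical helper: maps an HTTP status code returned by the evaluate
# endpoint to a short explanation. The message wording is illustrative.
describe_status() {
  case "$1" in
    200) echo "evaluated: check data.attributes.gate_status" ;;
    404) echo "gate not found, or gate has no rules" ;;
    5??) echo "server error: retry" ;;
    *)   echo "unexpected status code: $1" ;;
  esac
}

describe_status 404
```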
If a 200 HTTP status code is returned, the response is in the following format:
{
  "data": {
    "id": "<random_response_uuid>",
    "type": "deployment_gates_evaluation_response",
    "attributes": {
      "dry_run": false,
      "evaluation_id": "e9d2f04f-4f4b-494b-86e5-52f03e10c8e9",
      "evaluation_url": "https://app.<YOUR_DD_SITE>/ci/deployment-gates/evaluations?index=cdgates&query=level%3Agate+%40evaluation_id%3Ae9d2f04f-4f4b-494b-86e5-52f03e10c8e9",
      "gate_id": "e140302e-0cba-40d2-978c-6780647f8f1c",
      "gate_status": "pass",
      "rules": [
        {
          "name": "Check service monitors",
          "status": "fail",
          "reason": "One or more monitors in ALERT state: https://app.<YOUR_DD_SITE>/monitors/34330981",
          "dry_run": true
        }
      ]
    }
  }
}
If the field data.attributes.dry_run is true, the field data.attributes.gate_status is always pass.
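The sample response above shows a gate that passes even though one rule failed, because that rule is in dry run. To see such would-be failures, the per-rule results can be read directly. The following is a sketch under two assumptions: jq is installed, and the response body has been saved to a file. failing_rules and sample_response.json are hypothetical names used only for illustration.

```shell
# Hypothetical helper: prints "<rule name>: <reason>" for every rule whose
# status is "fail" in a saved evaluation response. Requires jq.
failing_rules() {
  jq -r '.data.attributes.rules[] | select(.status == "fail") | "\(.name): \(.reason)"' "$1"
}

# Sample response, trimmed from the format shown above.
cat > sample_response.json <<'EOF'
{
  "data": {
    "attributes": {
      "dry_run": false,
      "gate_status": "pass",
      "rules": [
        {
          "name": "Check service monitors",
          "status": "fail",
          "reason": "One or more monitors in ALERT state",
          "dry_run": true
        }
      ]
    }
  }
}
EOF

failing_rules sample_response.json
```

For this sample, the helper prints the single failing rule, "Check service monitors", followed by its reason, even though gate_status is pass.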
Use this script as a starting point. For the API_URL variable, be sure to replace <YOUR_DD_SITE> with your Datadog site name.
#!/bin/sh

# Configuration
MAX_RETRIES=3
DELAY_SECONDS=5
API_URL="https://api.<YOUR_DD_SITE>/api/unstable/deployments/gates/evaluate"
API_KEY="<YOUR_API_KEY>"

# Request body built from the positional arguments: service, env, version
PAYLOAD=$(cat <<EOF
{
  "data": {
    "type": "deployment_gates_evaluation_request",
    "attributes": {
      "service": "$1",
      "env": "$2",
      "version": "$3"
    }
  }
}
EOF
)

current_attempt=0
while [ "$current_attempt" -lt "$MAX_RETRIES" ]; do
  current_attempt=$((current_attempt + 1))
  # The response body goes to response.txt; only the status code is captured in RESPONSE
  RESPONSE=$(curl -s -w "%{http_code}" -o response.txt -X POST "$API_URL" \
    -H "Content-Type: application/json" \
    -H "DD-API-KEY: $API_KEY" \
    -d "$PAYLOAD")

  # Extracts the last 3 digits of the status code
  HTTP_CODE=$(echo "$RESPONSE" | tail -c 4)
  RESPONSE_BODY=$(cat response.txt)

  if [ "$HTTP_CODE" -ge 500 ] && [ "$HTTP_CODE" -le 599 ]; then
    # Status code 5xx indicates a server error, so the call is retried
    echo "Attempt $current_attempt: 5xx Error ($HTTP_CODE). Retrying in $DELAY_SECONDS seconds..."
    sleep "$DELAY_SECONDS"
    continue
  elif [ "$HTTP_CODE" -ne 200 ]; then
    # Only 200 is an expected status code
    echo "Unexpected HTTP Code ($HTTP_CODE): $RESPONSE_BODY"
    exit 1
  fi

  # At this point, we have received a 200 status code. So, we check the gate status returned
  GATE_STATUS=$(echo "$RESPONSE_BODY" | jq -r '.data.attributes.gate_status')

  # POSIX test syntax, so the script also runs under shells without [[ ]]
  if [ "$GATE_STATUS" = "pass" ]; then
    echo "Gate evaluation PASSED"
    exit 0
  else
    echo "Gate evaluation FAILED"
    exit 1
  fi
done

# If we arrive here, it means that we received several 5xx errors from the API. To not block deployments, we can treat this case as a success
echo "All retries exhausted, but treating 5xx errors as success."
exit 0
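The script has the following characteristics:
- It takes three inputs: service, environment, and version (optionally add identifier and primary_tag if needed). The version is only required if one or more APM Faulty Deployment Detection rules are evaluated.
- It sends the request to the API and writes the response body to the response.txt file.
- It retries on 5xx responses (up to 3 attempts, 5 seconds apart) and fails on any other status code that is not 200.
- On a 200 response, it checks the gate evaluation status (data.attributes.gate_status) and passes or fails the script based on its value.
- If all retries are exhausted, it exits successfully so that API errors do not block deployments.

This is a general behavior, and you should change it based on your personal use case and preferences. The script uses curl (to perform the request) and jq (to process the returned JSON). If those commands are not available, install them at the beginning of the script (for example, by adding apk add --no-cache curl jq).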
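The optional identifier and primary_tag fields can be added without editing the heredoc by hand. The following is a minimal sketch and not part of the original script: build_payload is a hypothetical helper name, and it assumes the values contain no characters that need JSON escaping (such as quotes or backslashes).

```shell
# Hypothetical helper (not from the original script): prints the request body,
# adding "identifier" and "primary_tag" only when they are provided.
# Assumption: values contain no characters that need JSON escaping.
# Usage: build_payload <service> <env> <version> [identifier] [primary_tag]
build_payload() {
  attrs=$(printf '"service": "%s", "env": "%s", "version": "%s"' "$1" "$2" "$3")
  if [ -n "${4:-}" ]; then
    attrs=$(printf '%s, "identifier": "%s"' "$attrs" "$4")
  fi
  if [ -n "${5:-}" ]; then
    attrs=$(printf '%s, "primary_tag": "%s"' "$attrs" "$5")
  fi
  printf '{"data": {"type": "deployment_gates_evaluation_request", "attributes": {%s}}}\n' "$attrs"
}

build_payload transaction-backend staging v123-456 my-custom-identifier region:us-central-1
```

In the script above, PAYLOAD=$(build_payload "$1" "$2" "$3" "$4" "$5") could then take the place of the heredoc, with the fourth and fifth arguments left out when they are not needed.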
To call Deployment Gates from an Argo Rollouts Kubernetes Resource, you can create an AnalysisTemplate or a ClusterAnalysisTemplate. The template should contain a Kubernetes job that is used to perform the analysis.
Use this template as a starting point. For the API_URL variable, be sure to replace <YOUR_DD_SITE> with your Datadog site name.
apiVersion: argoproj.io/v1alpha1
kind: ClusterAnalysisTemplate
metadata:
  name: datadog-job-analysis
spec:
  args:
    - name: service
    - name: env
    - name: version # Referenced in the payload below
  metrics:
    - name: datadog-job
      provider:
        job:
          spec:
            ttlSecondsAfterFinished: 300
            backoffLimit: 0
            template:
              spec:
                restartPolicy: Never
                containers:
                  - name: datadog-check
                    image: alpine:latest
                    command: ["/bin/sh", "-c"]
                    args:
                      - |
                        apk add --no-cache curl jq

                        # Configuration
                        MAX_RETRIES=3
                        DELAY_SECONDS=5
                        API_URL="https://api.<YOUR_DD_SITE>/api/unstable/deployments/gates/evaluate"
                        API_KEY="<YOUR_API_KEY>"

                        PAYLOAD='{
                          "data": {
                            "type": "deployment_gates_evaluation_request",
                            "attributes": {
                              "service": "{{ args.service }}",
                              "env": "{{ args.env }}",
                              "version": "{{ args.version }}"
                            }
                          }
                        }'

                        current_attempt=0
                        while [ "$current_attempt" -lt "$MAX_RETRIES" ]; do
                          current_attempt=$((current_attempt + 1))
                          RESPONSE=$(curl -s -w "%{http_code}" -o response.txt -X POST "$API_URL" \
                            -H "Content-Type: application/json" \
                            -H "DD-API-KEY: $API_KEY" \
                            -d "$PAYLOAD")

                          # Extracts the last 3 digits of the status code
                          HTTP_CODE=$(echo "$RESPONSE" | tail -c 4)
                          RESPONSE_BODY=$(cat response.txt)

                          if [ "$HTTP_CODE" -ge 500 ] && [ "$HTTP_CODE" -le 599 ]; then
                            # Status code 5xx indicates a server error, so the call is retried
                            echo "Attempt $current_attempt: 5xx Error ($HTTP_CODE). Retrying in $DELAY_SECONDS seconds..."
                            sleep "$DELAY_SECONDS"
                            continue
                          elif [ "$HTTP_CODE" -ne 200 ]; then
                            # Only 200 is an expected status code
                            echo "Unexpected HTTP Code ($HTTP_CODE): $RESPONSE_BODY"
                            exit 1
                          fi

                          # At this point, we have received a 200 status code. So, we check the gate status returned
                          GATE_STATUS=$(echo "$RESPONSE_BODY" | jq -r '.data.attributes.gate_status')

                          if [ "$GATE_STATUS" = "pass" ]; then
                            echo "Gate evaluation PASSED"
                            exit 0
                          else
                            echo "Gate evaluation FAILED"
                            exit 1
                          fi
                        done

                        # If we arrive here, it means that we received several 5xx errors from the API. To not block deployments, we can treat this case as a success
                        echo "All retries exhausted, but treating 5xx errors as success."
                        exit 0
- The template takes the arguments service, env, and any other optional fields needed (such as version). For more information, see the official Argo Rollouts docs.
- The ttlSecondsAfterFinished field removes the finished jobs after 5 minutes.
- The backoffLimit field is set to 0 as the job might fail if the gate evaluation fails, and it should not be retried in that case.

After you have created the analysis template, reference it from the Argo Rollouts strategy:
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
  labels:
    tags.datadoghq.com/service: transaction-backend
    tags.datadoghq.com/env: dev
spec:
  replicas: 5
  strategy:
    canary:
      steps:
        ...
        - analysis:
            templates:
              - templateName: datadog-job-analysis
                clusterScope: true # Only needed for cluster analysis
            args:
              - name: env
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.labels['tags.datadoghq.com/env']
              - name: service
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.labels['tags.datadoghq.com/service']
              - name: version # Only required if one or more APM Faulty Deployment Detection rules are evaluated
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.labels['tags.datadoghq.com/version']
        - ...
When integrating Deployment Gates into your Continuous Delivery workflow, an evaluation phase is recommended to confirm the product is working as expected before it impacts deployments. You can do this using the Dry Run evaluation mode and the Deployment Gates Evaluations page:
- Create the gate with the evaluation mode set to Dry Run.
- Evaluate the gate from your deployment process. In this mode, the API always returns a pass status and the deployments are not impacted by the gate result.
- Review the results on the Deployment Gates Evaluations page, where the status shown does not depend on the evaluation mode (Dry Run or Active). It means you can understand when the gate would have failed and what was the reason behind it.
- When the gate behaves as expected, switch the evaluation mode from Dry Run to Active. Afterwards, the API starts returning the “real” status and deployments start getting promoted or rolled back based on the gate result.