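This check collects metrics from your YARN ResourceManager, including the cluster-wide, per-application, per-node, and per-queue metrics listed in this document.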
The yarn.apps.<METRIC> metrics are deprecated because they are incorrectly reported as RATE instead of GAUGE. Use the yarn.apps.<METRIC>_gauge metrics instead.
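When updating dashboards or monitors that still reference the deprecated names, the rename is mechanical: append `_gauge` to the metric name. The following is a minimal sketch of that rule; `to_gauge_metric` is a hypothetical helper written for illustration and is not part of the Agent.

```python
DEPRECATED_PREFIX = "yarn.apps."
GAUGE_SUFFIX = "_gauge"


def to_gauge_metric(name: str) -> str:
    """Return the `_gauge` replacement for a deprecated yarn.apps.* metric name.

    Hypothetical helper: names outside yarn.apps.* and names that already
    end in `_gauge` are returned unchanged.
    """
    if not name.startswith(DEPRECATED_PREFIX):
        return name
    if name.endswith(GAUGE_SUFFIX):
        return name
    return name + GAUGE_SUFFIX


for old in ("yarn.apps.elapsed_time", "yarn.apps.allocated_mb"):
    print(old, "->", to_gauge_metric(old))
```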
The YARN check is included in the Datadog Agent package. No additional installation is needed on your YARN ResourceManager.
To configure this check for an Agent running on a host:
Edit the yarn.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory.
init_config:

instances:

    ## @param resourcemanager_uri - string - required
    ## The YARN check retrieves metrics from YARN's ResourceManager. This
    ## check must be run from the Master Node and the ResourceManager URI must
    ## be specified below. The ResourceManager URI is composed of the
    ## ResourceManager's hostname and port.
    ## The ResourceManager hostname can be found in the yarn-site.xml conf file
    ## under the property yarn.resourcemanager.address
    ##
    ## The ResourceManager port can be found in the yarn-site.xml conf file under
    ## the property yarn.resourcemanager.webapp.address
    #
  - resourcemanager_uri: http://localhost:8088

    ## @param cluster_name - string - required - default: default_cluster
    ## A friendly name for the cluster.
    #
    cluster_name: default_cluster
For a list and description of all check options, see the example check configuration.
Restart the Agent to start sending YARN metrics to Datadog.
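Before restarting the Agent, it can help to confirm that the configured URI is reachable from the Agent host. The sketch below assumes the ResourceManager exposes the Hadoop YARN REST API cluster metrics endpoint at `/ws/v1/cluster/metrics` on the web application port; the helper names are hypothetical, and only the Python standard library is used.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumption: the ResourceManager REST API cluster metrics path.
CLUSTER_METRICS_PATH = "ws/v1/cluster/metrics"


def cluster_metrics_url(resourcemanager_uri: str) -> str:
    """Build the cluster metrics URL from the resourcemanager_uri setting."""
    return urljoin(resourcemanager_uri.rstrip("/") + "/", CLUSTER_METRICS_PATH)


def fetch_cluster_metrics(resourcemanager_uri: str, timeout: float = 5.0) -> dict:
    """Fetch the `clusterMetrics` object, or an empty dict if it is absent."""
    with urlopen(cluster_metrics_url(resourcemanager_uri), timeout=timeout) as resp:
        payload = json.load(resp)
    return payload.get("clusterMetrics", {})
```

Calling `fetch_cluster_metrics("http://localhost:8088")` on the host that runs the Agent should return a dictionary of cluster values if the URI is correct; a connection error suggests that the hostname or port in `resourcemanager_uri` needs to be revisited.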
For containerized environments, see the Autodiscovery Integration Templates guide and apply the following parameters:
Parameter | Value |
---|---|
<INTEGRATION_NAME> | yarn |
<INIT_CONFIG> | blank or {} |
<INSTANCE_CONFIG> | {"resourcemanager_uri": "http://%%host%%:%%port%%", "cluster_name": "<CLUSTER_NAME>"} |
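As an illustration of how the three parameters fit together, the sketch below serializes them into Docker container labels. It assumes the label-based Autodiscovery format with the `com.datadoghq.ad.check_names`, `com.datadoghq.ad.init_configs`, and `com.datadoghq.ad.instances` keys; confirm the exact format for your Agent version and platform in the Autodiscovery guide. The function name and the `my_cluster` value are placeholders.

```python
import json


def yarn_autodiscovery_labels(cluster_name: str) -> dict:
    """Build Autodiscovery label values for the yarn check (illustrative only)."""
    instance = {
        # %%host%% and %%port%% are template variables resolved by the Agent.
        "resourcemanager_uri": "http://%%host%%:%%port%%",
        "cluster_name": cluster_name,
    }
    return {
        "com.datadoghq.ad.check_names": json.dumps(["yarn"]),
        "com.datadoghq.ad.init_configs": json.dumps([{}]),
        "com.datadoghq.ad.instances": json.dumps([instance]),
    }


for key, value in yarn_autodiscovery_labels("my_cluster").items():
    print(f"{key}={value}")
```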
Log collection is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:
logs_enabled: true
Uncomment and edit the logs configuration block in your yarn.d/conf.yaml file. Change the type, path, and service parameter values based on your environment. See the sample yarn.d/conf.yaml for all available configuration options.
logs:
  - type: file
    path: <LOG_FILE_PATH>
    source: yarn
    service: <SERVICE_NAME>
    # To handle multi line that starts with yyyy-mm-dd use the following pattern
    # log_processing_rules:
    #   - type: multi_line
    #     pattern: \d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2},\d{3}
    #     name: new_log_start_with_date
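To see what the commented multi_line pattern does, the sketch below applies the same regular expression in Python and folds continuation lines (such as a Java stack trace) into the entry that precedes them. It only approximates the Agent's behavior, and the sample log lines are invented for illustration.

```python
import re

# Same pattern as in the log_processing_rules example above.
NEW_LOG_START = re.compile(r"\d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def group_multiline(lines):
    """Merge lines that do not start with a timestamp into the previous entry."""
    entries = []
    for line in lines:
        if NEW_LOG_START.match(line) or not entries:
            entries.append(line)          # a new log entry begins here
        else:
            entries[-1] += "\n" + line    # continuation of the previous entry
    return entries


# Hypothetical log lines, not taken from a real cluster.
sample = [
    "2024-01-15 10:32:07,123 ERROR ResourceManager: Failed to start",
    "java.lang.IllegalStateException: example",
    "\tat org.example.Main.run(Main.java:42)",
    "2024-01-15 10:32:08,001 INFO ResourceManager: Retrying",
]
print(len(group_multiline(sample)))  # prints 2
```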
To enable logs for Docker environments, see Docker Log Collection.
Run the Agent's status subcommand and look for yarn under the Checks section.
yarn.metrics.apps_submitted (gauge) | The number of submitted apps Shown as task |
yarn.metrics.apps_completed (gauge) | The number of completed apps Shown as task |
yarn.metrics.apps_pending (gauge) | The number of pending apps Shown as task |
yarn.metrics.apps_running (gauge) | The number of running apps Shown as task |
yarn.metrics.apps_failed (gauge) | The number of failed apps Shown as task |
yarn.metrics.apps_killed (gauge) | The number of killed apps Shown as task |
yarn.metrics.reserved_mb (gauge) | The size of reserved memory Shown as mebibyte |
yarn.metrics.available_mb (gauge) | The amount of available memory Shown as mebibyte |
yarn.metrics.allocated_mb (gauge) | The amount of allocated memory Shown as mebibyte |
yarn.metrics.total_mb (gauge) | The amount of total memory Shown as mebibyte |
yarn.metrics.reserved_virtual_cores (gauge) | The number of reserved virtual cores Shown as core |
yarn.metrics.available_virtual_cores (gauge) | The number of available virtual cores Shown as core |
yarn.metrics.allocated_virtual_cores (gauge) | The number of allocated virtual cores Shown as core |
yarn.metrics.total_virtual_cores (gauge) | The total number of virtual cores Shown as core |
yarn.metrics.containers_allocated (gauge) | The number of containers allocated |
yarn.metrics.containers_reserved (gauge) | The number of containers reserved |
yarn.metrics.containers_pending (gauge) | The number of containers pending |
yarn.metrics.total_nodes (gauge) | The total number of nodes Shown as node |
yarn.metrics.active_nodes (gauge) | The number of active nodes Shown as node |
yarn.metrics.lost_nodes (gauge) | The number of lost nodes Shown as node |
yarn.metrics.unhealthy_nodes (gauge) | The number of unhealthy nodes Shown as node |
yarn.metrics.decommissioned_nodes (gauge) | The number of decommissioned nodes Shown as node |
yarn.metrics.decommissioning_nodes (gauge) | The number of decommissioning nodes Shown as node |
yarn.metrics.rebooted_nodes (gauge) | The number of rebooted nodes Shown as node |
yarn.apps.progress_gauge (gauge) | The progress of the application, displayed as 0, 10, & 100, which represent the 3 states: hasn't started, in progress, & completed Shown as percent |
yarn.apps.started_time_gauge (gauge) | The time in which application started (in ms since epoch) Shown as millisecond |
yarn.apps.finished_time_gauge (gauge) | The time in which the application finished (in ms since epoch) Shown as millisecond |
yarn.apps.elapsed_time_gauge (gauge) | The elapsed time since the application started (in ms) Shown as millisecond |
yarn.apps.allocated_mb_gauge (gauge) | The sum of memory in MB allocated to the application's running containers Shown as mebibyte |
yarn.apps.allocated_vcores_gauge (gauge) | The sum of virtual cores allocated to the application's running containers Shown as core |
yarn.apps.running_containers_gauge (gauge) | The number of containers currently running for the application Shown as container |
yarn.apps.memory_seconds_gauge (gauge) | The amount of memory the application has allocated (megabyte-seconds) Shown as mebibyte |
yarn.apps.vcore_seconds_gauge (gauge) | The amount of CPU resources the application has allocated (virtual core-seconds) Shown as core |
yarn.apps.progress (rate) | Deprecated; use yarn.apps.progress_gauge instead Shown as percent |
yarn.apps.started_time (rate) | Deprecated; use yarn.apps.started_time_gauge instead Shown as second |
yarn.apps.finished_time (rate) | Deprecated; use yarn.apps.finished_time_gauge instead Shown as second |
yarn.apps.elapsed_time (rate) | Deprecated; use yarn.apps.elapsed_time_gauge instead Shown as second |
yarn.apps.allocated_mb (rate) | Deprecated; use yarn.apps.allocated_mb_gauge instead Shown as mebibyte |
yarn.apps.allocated_vcores (rate) | Deprecated; use yarn.apps.allocated_vcores_gauge instead Shown as core |
yarn.apps.running_containers (rate) | Deprecated; use yarn.apps.running_containers_gauge instead |
yarn.apps.memory_seconds (rate) | Deprecated; use yarn.apps.memory_seconds_gauge instead Shown as second |
yarn.apps.vcore_seconds (rate) | Deprecated; use yarn.apps.vcore_seconds_gauge instead Shown as second |
yarn.node.last_health_update (gauge) | The last time the node reported its health (in ms since epoch) Shown as millisecond |
yarn.node.used_memory_mb (gauge) | The total amount of memory currently used on the node (in MB) Shown as mebibyte |
yarn.node.avail_memory_mb (gauge) | The total amount of memory currently available on the node (in MB) Shown as mebibyte |
yarn.node.used_virtual_cores (gauge) | The total number of vCores currently used on the node Shown as core |
yarn.node.available_virtual_cores (gauge) | The total number of vCores available on the node Shown as core |
yarn.node.num_containers (gauge) | The total number of containers currently running on the node |
yarn.queue.root.max_capacity (gauge) | The configured maximum queue capacity in percentage for root queue Shown as percent |
yarn.queue.root.used_capacity (gauge) | The used queue capacity in percentage for root queue Shown as percent |
yarn.queue.root.capacity (gauge) | The configured queue capacity in percentage for root queue Shown as percent |
yarn.queue.num_pending_applications (gauge) | The number of pending applications in this queue Shown as task |
yarn.queue.user_am_resource_limit.memory (gauge) | The maximum memory resources a user can use for Application Masters (in MB) Shown as mebibyte |
yarn.queue.user_am_resource_limit.vcores (gauge) | The maximum vCPUs a user can use for Application Masters Shown as core |
yarn.queue.absolute_capacity (gauge) | The absolute capacity percentage this queue can use of entire cluster Shown as percent |
yarn.queue.user_limit_factor (gauge) | The user limit factor set in the configuration |
yarn.queue.user_limit (gauge) | The minimum user limit percent set in the configuration |
yarn.queue.num_applications (gauge) | The number of applications currently in the queue Shown as task |
yarn.queue.used_am_resource.memory (gauge) | The memory resources used for Application Masters (in MB) Shown as mebibyte |
yarn.queue.used_am_resource.vcores (gauge) | The vCPUs used for Application Masters Shown as core |
yarn.queue.absolute_used_capacity (gauge) | The absolute used capacity percentage this queue is using of the entire cluster Shown as percent |
yarn.queue.resources_used.memory (gauge) | The total memory resources this queue is using (in MB) Shown as mebibyte |
yarn.queue.resources_used.vcores (gauge) | The total vCPUs this queue is using Shown as core |
yarn.queue.am_resource_limit.vcores (gauge) | The maximum vCPUs this queue can use for Application Masters Shown as core |
yarn.queue.am_resource_limit.memory (gauge) | The maximum memory resources this queue can use for Application Masters (in MB) Shown as mebibyte |
yarn.queue.capacity (gauge) | The configured queue capacity in percentage relative to its parent queue Shown as percent |
yarn.queue.num_active_applications (gauge) | The number of active applications in this queue Shown as task |
yarn.queue.absolute_max_capacity (gauge) | The absolute maximum capacity percentage this queue can use of the entire cluster Shown as percent |
yarn.queue.used_capacity (gauge) | The used queue capacity in percentage Shown as percent |
yarn.queue.num_containers (gauge) | The number of containers being used |
yarn.queue.max_capacity (gauge) | The configured maximum queue capacity in percentage relative to its parent queue Shown as percent |
yarn.queue.max_applications (gauge) | The maximum number of applications this queue can have Shown as task |
yarn.queue.max_applications_per_user (gauge) | The maximum number of applications per user this queue can have Shown as task |
yarn.queue.max_active_applications (gauge) | The maximum number of active applications this queue can have Shown as task |
yarn.queue.max_active_applications_per_user (gauge) | The maximum number of active applications per user this queue can have Shown as task |
The YARN check does not include any events.
yarn.can_connect
Returns CRITICAL if the Agent cannot connect to the ResourceManager URI to collect metrics, otherwise OK.
Statuses: ok, critical
yarn.application.status
By default, returns OK if the YARN application state is NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, or FINISHED; UNKNOWN if the application state is ALL; and CRITICAL if the YARN application state is FAILED or KILLED.
Statuses: ok, unknown, critical
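The default mapping described for yarn.application.status can be written as a lookup table. This is a sketch of the documented defaults only, not the check's actual implementation; the function name is hypothetical, and falling back to `unknown` for unlisted states is an assumption made for this example.

```python
# Default application state to service check status, as described above.
DEFAULT_STATUS_BY_STATE = {
    "ALL": "unknown",
    "NEW": "ok",
    "NEW_SAVING": "ok",
    "SUBMITTED": "ok",
    "ACCEPTED": "ok",
    "RUNNING": "ok",
    "FINISHED": "ok",
    "FAILED": "critical",
    "KILLED": "critical",
}


def application_status(state: str) -> str:
    """Map a YARN application state to ok, unknown, or critical.

    Assumption: states missing from the table map to "unknown".
    """
    return DEFAULT_STATUS_BY_STATE.get(state.upper(), "unknown")


for state in ("RUNNING", "KILLED", "ALL"):
    print(state, "->", application_status(state))
```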
Need help? Contact Datadog support.