Amazon SageMaker is a fully managed machine learning service. With Amazon SageMaker, data scientists and developers can build and train machine learning models, and then deploy them directly into a production-ready hosted environment.
Enable this integration to see all your SageMaker metrics in Datadog.
If you haven't already, set up the Amazon Web Services integration first.
Turn on SageMaker.
Configure Amazon SageMaker to send logs either to an S3 bucket or to CloudWatch.
Note: If you send logs to an S3 bucket, make sure that the Target prefix is set to amazon_sagemaker.
If you haven't already, set up the Datadog log collection AWS Lambda function.
Once the Lambda function is installed, manually add a trigger in the AWS console on the S3 bucket or CloudWatch log group that contains your Amazon SageMaker logs.
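The step above adds the trigger by hand in the AWS console. For a CloudWatch log group, a roughly equivalent trigger can be sketched programmatically with the CloudWatch Logs `put_subscription_filter` call in boto3. This is a minimal sketch, not the method this page prescribes: the helper names `build_subscription_filter` and `add_trigger`, the filter name `datadog-forwarder`, and the check that SageMaker log groups start with `/aws/sagemaker/` are all assumptions made for illustration.

```python
from typing import Dict


def build_subscription_filter(log_group: str, forwarder_arn: str) -> Dict[str, str]:
    """Build keyword arguments for the CloudWatch Logs put_subscription_filter call."""
    # Assumption: SageMaker writes to log groups under /aws/sagemaker/.
    if not log_group.startswith("/aws/sagemaker/"):
        raise ValueError(f"not a SageMaker log group: {log_group}")
    return {
        "logGroupName": log_group,
        "filterName": "datadog-forwarder",  # hypothetical name, pick your own
        "filterPattern": "",  # an empty pattern matches every log event
        "destinationArn": forwarder_arn,  # ARN of the Datadog log collection Lambda
    }


def add_trigger(log_group: str, forwarder_arn: str) -> None:
    """Create the subscription filter. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    logs = boto3.client("logs")
    logs.put_subscription_filter(**build_subscription_filter(log_group, forwarder_arn))
```

The Lambda function also needs a resource policy that allows CloudWatch Logs to invoke it. This sketch does not create one.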
aws.sagemaker.invocation_4xx_errors (count) | The average number of InvokeEndpoint requests where the model returned a 4xx HTTP response code. |
aws.sagemaker.invocation_4xx_errors.sum (count) | The sum of the number of InvokeEndpoint requests where the model returned a 4xx HTTP response code. |
aws.sagemaker.invocation_5xx_errors (count) | The average number of InvokeEndpoint requests where the model returned a 5xx HTTP response code. |
aws.sagemaker.invocation_5xx_errors.sum (count) | The sum of the number of InvokeEndpoint requests where the model returned a 5xx HTTP response code. |
aws.sagemaker.invocations (count) | The sum of the number of InvokeEndpoint requests sent to a model endpoint. |
aws.sagemaker.invocations.sample_count (count) | The sample count of the number of InvokeEndpoint requests sent to a model endpoint. |
aws.sagemaker.invocations_per_instance (count) | The number of invocations sent to a model normalized by InstanceCount in each ProductionVariant. |
aws.sagemaker.model_latency (count) | The average interval of time taken by a model to respond as viewed from Amazon SageMaker. Shown as microsecond |
aws.sagemaker.model_latency.sum (count) | The sum of the interval of time taken by a model to respond as viewed from Amazon SageMaker. Shown as microsecond |
aws.sagemaker.model_latency.minimum (count) | The minimum interval of time taken by a model to respond as viewed from Amazon SageMaker. Shown as microsecond |
aws.sagemaker.model_latency.maximum (count) | The maximum interval of time taken by a model to respond as viewed from Amazon SageMaker. Shown as microsecond |
aws.sagemaker.model_latency.sample_count (count) | The sample count of the interval of time taken by a model to respond as viewed from Amazon SageMaker. Shown as microsecond |
aws.sagemaker.overhead_latency (count) | The average interval of time added to the time taken to respond to a client request by Amazon SageMaker overheads. Shown as microsecond |
aws.sagemaker.overhead_latency.sum (count) | The sum of the interval of time added to the time taken to respond to a client request by Amazon SageMaker overheads. Shown as microsecond |
aws.sagemaker.overhead_latency.minimum (count) | The minimum interval of time added to the time taken to respond to a client request by Amazon SageMaker overheads. Shown as microsecond |
aws.sagemaker.overhead_latency.maximum (count) | The maximum interval of time added to the time taken to respond to a client request by Amazon SageMaker overheads. Shown as microsecond |
aws.sagemaker.overhead_latency.sample_count (count) | The sample count of the interval of time added to the time taken to respond to a client request by Amazon SageMaker overheads. Shown as microsecond |
aws.sagemaker.cpu_utilization (count) | The percentage of CPU units that are used by the containers on an instance. Shown as percent |
aws.sagemaker.memory_utilization (count) | The percentage of memory that is used by the containers on an instance. Shown as percent |
aws.sagemaker.gpu_utilization (count) | The percentage of GPU units that are used by the containers on an instance. Shown as percent |
aws.sagemaker.gpu_memory_utilization (count) | The percentage of GPU memory used by the containers on an instance. Shown as percent |
aws.sagemaker.disk_utilization (count) | The percentage of disk space used by the containers on an instance. Shown as percent |
aws.sagemaker.dataset_objects_auto_annotated (count) | The number of dataset objects auto-annotated in a labeling job. |
aws.sagemaker.dataset_objects_human_annotated (count) | The number of dataset objects annotated by a human in a labeling job. |
aws.sagemaker.dataset_objects_labeling_failed (count) | The number of dataset objects that failed labeling in a labeling job. |
aws.sagemaker.jobs_failed (count) | The sum of the number of labeling jobs that failed. |
aws.sagemaker.jobs_failed.sample_count (count) | The sample count of the number of labeling jobs that failed. |
aws.sagemaker.jobs_succeeded (count) | The sum of the number of labeling jobs that succeeded. |
aws.sagemaker.jobs_succeeded.sample_count (count) | The sample count of the number of labeling jobs that succeeded. |
aws.sagemaker.jobs_stopped (count) | The sum of the number of labeling jobs that were stopped. |
aws.sagemaker.jobs_stopped.sample_count (count) | The sample count of the number of labeling jobs that were stopped. |
aws.sagemaker.total_dataset_objects_labeled (count) | The maximum number of dataset objects labeled successfully in a labeling job. |
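The invocation and latency metrics listed above are often combined into derived values. The following is a minimal sketch of two such calculations, assuming you have already retrieved the values of `aws.sagemaker.invocations`, `aws.sagemaker.invocation_4xx_errors.sum`, and `aws.sagemaker.invocation_5xx_errors.sum` for the same time window. The function names are hypothetical and are not part of any Datadog or AWS API.

```python
def error_rate(invocations: float, errors_4xx: float, errors_5xx: float) -> float:
    """Fraction of InvokeEndpoint requests that returned a 4xx or 5xx code."""
    if invocations <= 0:
        # No traffic in the window, so there is no meaningful rate to report.
        return 0.0
    return (errors_4xx + errors_5xx) / invocations


def microseconds_to_milliseconds(value_us: float) -> float:
    """Convert a latency metric (reported in microseconds) to milliseconds."""
    return value_us / 1000.0
```

For example, 200 invocations with 6 responses in the 4xx range and 4 in the 5xx range give an error rate of 0.05, and a `model_latency` value of 2500 microseconds is 2.5 milliseconds.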
The Amazon SageMaker integration does not include any events.
The Amazon SageMaker integration does not include any service checks.
Need help? Contact Datadog support.