gcp_tpu_instance
accelerator_type
Type: STRING
Provider name: acceleratorType
Description: Required. The type of hardware accelerators associated with this node.
ancestors
Type: UNORDERED_LIST_STRING
api_version
Type: STRING
Provider name: apiVersion
Description: Output only. The API version that created this Node.
Possible values:
API_VERSION_UNSPECIFIED
- API version is unknown.
V1_ALPHA1
- TPU API V1Alpha1 version.
V1
- TPU API V1 version.
V2_ALPHA1
- TPU API V2Alpha1 version.
cidr_block
Type: STRING
Provider name: cidrBlock
Description: The CIDR block that the TPU node will use when selecting an IP address. This CIDR block must be a /29 block; the Compute Engine networks API forbids a smaller block, and using a larger block would be wasteful (a node can only consume one IP address). Errors will occur if the CIDR block has already been used for a currently existing TPU node, the CIDR block conflicts with any subnetworks in the user’s provided network, or the provided network is peered with another network that is using that CIDR block.
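The constraints in the cidr_block description can be checked before a node is requested. The following is a minimal sketch using Python's standard ipaddress module. The function name check_tpu_cidr_block and the existing_ranges argument (the ranges already used by subnetworks, peered networks, or other TPU nodes) are hypothetical and are not part of any Google Cloud or Datadog API.

```python
import ipaddress


def check_tpu_cidr_block(cidr_block, existing_ranges):
    """Return a list of problems found with a candidate TPU CIDR block.

    existing_ranges is a caller-supplied list of CIDR strings that are
    already in use (an assumption made for this sketch).
    """
    problems = []
    # strict=True raises ValueError if host bits are set, e.g. "10.0.0.1/29".
    block = ipaddress.ip_network(cidr_block, strict=True)

    # The description requires exactly a /29 block.
    if block.prefixlen != 29:
        problems.append(
            f"{cidr_block} is a /{block.prefixlen} block, expected /29"
        )

    # Any overlap with a range that is already in use is reported.
    for other in existing_ranges:
        if block.overlaps(ipaddress.ip_network(other)):
            problems.append(f"{cidr_block} overlaps {other}")

    return problems
```

An empty list means the candidate block passed both checks. The sketch only covers the conditions stated in the description and cannot detect conflicts that are not present in existing_ranges.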
create_time
Type: TIMESTAMP
Provider name: createTime
Description: Output only. The time when the node was created.
description
Type: STRING
Provider name: description
Description: The user-supplied description of the TPU. Maximum of 512 characters.
health
Type: STRING
Provider name: health
Description: The health status of the TPU node.
Possible values:
HEALTH_UNSPECIFIED
- Health status is unknown: not initialized or failed to retrieve.
HEALTHY
- The resource is healthy.
DEPRECATED_UNHEALTHY
- The resource is unhealthy.
TIMEOUT
- The resource is unresponsive.
UNHEALTHY_TENSORFLOW
- The in-guest ML stack is unhealthy.
UNHEALTHY_MAINTENANCE
- The node is being rescheduled because of maintenance or a priority boost and will resume running once rescheduled.
health_description
Type: STRING
Provider name: healthDescription
Description: Output only. If this field is populated, it contains a description of why the TPU Node is unhealthy.
ip_address
Type: STRING
Provider name: ipAddress
Description: Output only. DEPRECATED! Use network_endpoints instead. The network address for the TPU Node as visible to Compute Engine instances.
labels
Type: UNORDERED_LIST_STRING
name
Type: STRING
Provider name: name
Description: Output only. Immutable. The name of the TPU.
network
Type: STRING
Provider name: network
Description: The name of the network to peer the TPU node to. It must be a preexisting Compute Engine network inside the project on which this API has been activated. If none is provided, “default” is used.
network_endpoints
Type: UNORDERED_LIST_STRUCT
Provider name: networkEndpoints
Description: Output only. The network endpoints where TPU workers can be accessed and sent work. It is recommended that TensorFlow clients of the node reach out to the 0th entry in this list first.
ip_address
Type: STRING
Provider name: ipAddress
Description: The IP address of this network endpoint.
port
Type: INT32
Provider name: port
Description: The port of this network endpoint.
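As an illustration of how a client might choose an address from network_endpoints, the sketch below takes the 0th entry and falls back to the deprecated top-level ip_address and port fields only when no endpoints are present. It assumes the resource is available as a plain Python dict keyed by the field names on this page, and that the list order is preserved, which the UNORDERED_LIST_STRUCT type does not guarantee. The function name primary_endpoint is hypothetical.

```python
def primary_endpoint(node):
    """Return (ip_address, port) for a TPU node record, or None.

    node is assumed to be a dict keyed by the field names on this page.
    """
    endpoints = node.get("network_endpoints") or []
    if endpoints:
        # Recommended: contact the 0th entry first.
        first = endpoints[0]
        return first["ip_address"], int(first["port"])

    # Fallback: deprecated top-level fields. The top-level port is a
    # STRING, so it is converted to an integer here.
    if node.get("ip_address") and node.get("port"):
        return node["ip_address"], int(node["port"])

    return None
```

For example, a record with a single endpoint of 10.0.0.2 on port 8470 yields the tuple ("10.0.0.2", 8470).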
organization_id
Type: STRING
parent
Type: STRING
port
Type: STRING
Provider name: port
Description: Output only. DEPRECATED! Use network_endpoints instead. The network port for the TPU Node as visible to Compute Engine instances.
project_id
Type: STRING
project_number
Type: STRING
region_id
Type: STRING
resource_name
Type: STRING
scheduling_config
Type: STRUCT
Provider name: schedulingConfig
Description: The scheduling options for this node.
preemptible
Type: BOOLEAN
Provider name: preemptible
Description: Defines whether the node is preemptible.
reserved
Type: BOOLEAN
Provider name: reserved
Description: Whether the node is created under a reservation.
service_account
Type: STRING
Provider name: serviceAccount
Description: Output only. The service account used to run the TensorFlow services within the node. To share resources, including Google Cloud Storage data, with the TensorFlow job running in the node, this account must have permissions to that data.
state
Type: STRING
Provider name: state
Description: Output only. The current state for the TPU Node.
Possible values:
STATE_UNSPECIFIED
- TPU node state is not known/set.
CREATING
- TPU node is being created.
READY
- TPU node has been created.
RESTARTING
- TPU node is restarting.
REIMAGING
- TPU node is undergoing reimaging.
DELETING
- TPU node is being deleted.
REPAIRING
- TPU node is being repaired and may be unusable. Details can be found in the health_description field.
STOPPED
- TPU node is stopped.
STOPPING
- TPU node is currently stopping.
STARTING
- TPU node is currently starting.
PREEMPTED
- TPU node has been preempted. Only applies to Preemptible TPU Nodes.
TERMINATED
- TPU node has been terminated due to maintenance or has reached the end of its life cycle (for preemptible nodes).
HIDING
- TPU node is currently hiding.
HIDDEN
- TPU node has been hidden.
UNHIDING
- TPU node is currently unhiding.
UNKNOWN
- TPU node has unknown state after a failed repair.
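When filtering or alerting on the state field, it can help to collapse the values above into a few coarse buckets. The grouping below is one possible reading of the value descriptions and is an assumption of this sketch, not something defined by the TPU API. The function name classify_state and the bucket labels are hypothetical.

```python
# States that describe an operation still in flight (assumed grouping).
IN_PROGRESS_STATES = {
    "CREATING", "RESTARTING", "REIMAGING", "DELETING", "REPAIRING",
    "STOPPING", "STARTING", "HIDING", "UNHIDING",
}

# States in which the node exists but is not running (assumed grouping).
INACTIVE_STATES = {"STOPPED", "PREEMPTED", "TERMINATED", "HIDDEN"}


def classify_state(state):
    """Map a TPU node state string to a coarse bucket label."""
    if state == "READY":
        return "ready"
    if state in IN_PROGRESS_STATES:
        return "in_progress"
    if state in INACTIVE_STATES:
        return "inactive"
    # STATE_UNSPECIFIED, UNKNOWN, and unrecognized values land here.
    return "unknown"
```

A READY state only says that the node has been created. The health field still needs to be checked separately to confirm that the node is healthy.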
symptoms
Type: UNORDERED_LIST_STRUCT
Provider name: symptoms
Description: Output only. The Symptoms that have occurred to the TPU Node.
create_time
Type: TIMESTAMP
Provider name: createTime
Description: Timestamp when the Symptom is created.
details
Type: STRING
Provider name: details
Description: Detailed information of the current Symptom.
symptom_type
Type: STRING
Provider name: symptomType
Description: Type of the Symptom.
Possible values:
SYMPTOM_TYPE_UNSPECIFIED
- Unspecified symptom.
LOW_MEMORY
- TPU VM memory is low.
OUT_OF_MEMORY
- TPU runtime is out of memory.
EXECUTE_TIMED_OUT
- TPU runtime execution has timed out.
MESH_BUILD_FAIL
- TPU runtime fails to construct a mesh that recognizes each TPU device’s neighbors.
HBM_OUT_OF_MEMORY
- TPU HBM is out of memory.
PROJECT_ABUSE
- Abusive behaviors have been identified on the current project.
worker_id
Type: STRING
Provider name: workerId
Description: A string used to uniquely distinguish a worker within a TPU node.
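Because each symptom carries a worker_id, the symptoms list can be grouped per worker to see which worker is reporting which problem. The sketch below assumes the resource is a plain Python dict keyed by the field names on this page. The function name symptoms_by_worker is hypothetical.

```python
from collections import defaultdict


def symptoms_by_worker(node):
    """Group symptom types by worker_id for one TPU node record."""
    grouped = defaultdict(list)
    for symptom in node.get("symptoms") or []:
        # Missing keys fall back to an empty worker id and the
        # unspecified symptom type, so partial records do not raise.
        worker = symptom.get("worker_id", "")
        kind = symptom.get("symptom_type", "SYMPTOM_TYPE_UNSPECIFIED")
        grouped[worker].append(kind)
    return dict(grouped)
```

The result maps each worker_id to the symptom types recorded for it, in the order they appear in the input list.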
tags
Type: UNORDERED_LIST_STRING
tensorflow_version
Type: STRING
Provider name: tensorflowVersion
Description: Required. The version of TensorFlow running in the node.
use_service_networking
Type: BOOLEAN
Provider name: useServiceNetworking
Description: Whether the VPC peering for the node is set up through the Service Networking API. The VPC peering should be set up before provisioning the node. If this field is set, the cidr_block field should not be specified. If the network that you want to peer the TPU node to is a Shared VPC network, the node must be created with this field enabled.
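The two rules in this description can be expressed as a small consistency check. The sketch below assumes the resource is a plain Python dict keyed by the field names on this page. The function name networking_conflicts and the network_is_shared_vpc argument are hypothetical, since this resource does not itself record whether the network is a Shared VPC network.

```python
def networking_conflicts(node, network_is_shared_vpc=False):
    """Return a list of conflicts between networking-related fields."""
    problems = []
    uses_service_networking = bool(node.get("use_service_networking"))

    # Rule 1: cidr_block should not be set together with this field.
    if uses_service_networking and node.get("cidr_block"):
        problems.append(
            "cidr_block should not be specified when "
            "use_service_networking is set"
        )

    # Rule 2: Shared VPC networks require this field to be enabled.
    if network_is_shared_vpc and not uses_service_networking:
        problems.append(
            "use_service_networking must be enabled for a Shared VPC network"
        )

    return problems
```

An empty list means neither rule is violated for the given record.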
zone_id
Type: STRING