gcp_tpu_instance

accelerator_type

Type: STRING
Provider name: acceleratorType
Description: Required. The type of hardware accelerators associated with this node.

ancestors

Type: UNORDERED_LIST_STRING

api_version

Type: STRING
Provider name: apiVersion
Description: Output only. The API version that created this Node.
Possible values:

  • API_VERSION_UNSPECIFIED - API version is unknown.
  • V1_ALPHA1 - TPU API V1Alpha1 version.
  • V1 - TPU API V1 version.
  • V2_ALPHA1 - TPU API V2Alpha1 version.

cidr_block

Type: STRING
Provider name: cidrBlock
Description: The CIDR block that the TPU node will use when selecting an IP address. This CIDR block must be a /29 block; the Compute Engine networks API forbids a smaller block, and using a larger block would be wasteful (a node can only consume one IP address). Errors occur if the CIDR block is already in use by an existing TPU node, conflicts with any subnetwork in the user’s provided network, or is in use by another network that is peered with the provided network.
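
The /29 requirement and the subnetwork conflict rule above can be checked locally before a request is sent. The following is a minimal sketch using Python's standard ipaddress module; the function names are hypothetical, and treating the block as IPv4 is an assumption that the field description does not state.

```python
import ipaddress
from typing import List


def is_valid_tpu_cidr_block(cidr_block: str) -> bool:
    """Return True if cidr_block parses as an IPv4 network with a /29 prefix.

    The IPv4 restriction is an assumption; the description only requires /29.
    """
    try:
        # strict=True rejects values with host bits set, such as 10.2.0.1/29.
        network = ipaddress.ip_network(cidr_block, strict=True)
    except ValueError:
        return False
    return network.version == 4 and network.prefixlen == 29


def conflicting_subnetworks(cidr_block: str, subnet_cidrs: List[str]) -> List[str]:
    """Return the subnetwork CIDRs that overlap with cidr_block."""
    block = ipaddress.ip_network(cidr_block, strict=True)
    return [
        cidr
        for cidr in subnet_cidrs
        if block.overlaps(ipaddress.ip_network(cidr, strict=True))
    ]
```

This only covers the checks that can be done from the CIDR strings themselves. Whether a block is already used by an existing TPU node or by a peered network still has to be determined by the API.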

create_time

Type: TIMESTAMP
Provider name: createTime
Description: Output only. The time when the node was created.

description

Type: STRING
Provider name: description
Description: The user-supplied description of the TPU. Maximum of 512 characters.

health

Type: STRING
Provider name: health
Description: The health status of the TPU node.
Possible values:

  • HEALTH_UNSPECIFIED - Health status is unknown: not initialized or failed to retrieve.
  • HEALTHY - The resource is healthy.
  • DEPRECATED_UNHEALTHY - The resource is unhealthy.
  • TIMEOUT - The resource is unresponsive.
  • UNHEALTHY_TENSORFLOW - The in-guest ML stack is unhealthy.
  • UNHEALTHY_MAINTENANCE - The node is being rescheduled because of maintenance or a priority boost and will resume running once rescheduled.

health_description

Type: STRING
Provider name: healthDescription
Description: Output only. If this field is populated, it contains a description of why the TPU Node is unhealthy.

ip_address

Type: STRING
Provider name: ipAddress
Description: Output only. DEPRECATED! Use network_endpoints instead. The network address for the TPU Node as visible to Compute Engine instances.

labels

Type: UNORDERED_LIST_STRING

name

Type: STRING
Provider name: name
Description: Output only. Immutable. The name of the TPU.

network

Type: STRING
Provider name: network
Description: The name of the network to peer the TPU node to. It must be a preexisting Compute Engine network inside the project on which this API has been activated. If none is provided, “default” will be used.

network_endpoints

Type: UNORDERED_LIST_STRUCT
Provider name: networkEndpoints
Description: Output only. The network endpoints where TPU workers can be accessed and sent work. It is recommended that TensorFlow clients of the node reach out to the 0th entry in this list first.

  • ip_address
    Type: STRING
    Provider name: ipAddress
    Description: The IP address of this network endpoint.
  • port
    Type: INT32
    Provider name: port
    Description: The port of this network endpoint.
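
The recommendation to contact the 0th entry first could be applied as in the following sketch. It assumes the instance has been loaded into a Python dict keyed by the field names in this listing; the helper name is hypothetical. Because the type is listed as UNORDERED_LIST_STRUCT, it is also an assumption that the provider's original ordering is preserved in the collected data.

```python
from typing import Optional


def primary_endpoint(instance: dict) -> Optional[str]:
    """Return "ip:port" for the first network endpoint, or None if there is none."""
    endpoints = instance.get("network_endpoints") or []
    if not endpoints:
        return None
    first = endpoints[0]  # assumed to correspond to the provider's 0th entry
    return f"{first['ip_address']}:{first['port']}"
```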

organization_id

Type: STRING

parent

Type: STRING

port

Type: STRING
Provider name: port
Description: Output only. DEPRECATED! Use network_endpoints instead. The network port for the TPU Node as visible to Compute Engine instances.

project_id

Type: STRING

project_number

Type: STRING

resource_name

Type: STRING

scheduling_config

Type: STRUCT
Provider name: schedulingConfig
Description: The scheduling options for this node.

  • preemptible
    Type: BOOLEAN
    Provider name: preemptible
    Description: Defines whether the node is preemptible.
  • reserved
    Type: BOOLEAN
    Provider name: reserved
    Description: Whether the node is created under a reservation.

service_account

Type: STRING
Provider name: serviceAccount
Description: Output only. The service account used to run the TensorFlow services within the node. To share resources, including Google Cloud Storage data, with the TensorFlow job running in the node, this account must have permissions to that data.

state

Type: STRING
Provider name: state
Description: Output only. The current state for the TPU Node.
Possible values:

  • STATE_UNSPECIFIED - TPU node state is not known/set.
  • CREATING - TPU node is being created.
  • READY - TPU node has been created.
  • RESTARTING - TPU node is restarting.
  • REIMAGING - TPU node is undergoing reimaging.
  • DELETING - TPU node is being deleted.
  • REPAIRING - TPU node is being repaired and may be unusable. Details can be found in the health_description field.
  • STOPPED - TPU node is stopped.
  • STOPPING - TPU node is currently stopping.
  • STARTING - TPU node is currently starting.
  • PREEMPTED - TPU node has been preempted. Only applies to Preemptible TPU Nodes.
  • TERMINATED - TPU node has been terminated due to maintenance or has reached the end of its life cycle (for preemptible nodes).
  • HIDING - TPU node is currently hiding.
  • HIDDEN - TPU node has been hidden.
  • UNHIDING - TPU node is currently unhiding.
  • UNKNOWN - TPU node has unknown state after a failed repair.

symptoms

Type: UNORDERED_LIST_STRUCT
Provider name: symptoms
Description: Output only. The Symptoms that have occurred to the TPU Node.

  • create_time
    Type: TIMESTAMP
    Provider name: createTime
    Description: Timestamp when the Symptom is created.
  • details
    Type: STRING
    Provider name: details
    Description: Detailed information of the current Symptom.
  • symptom_type
    Type: STRING
    Provider name: symptomType
    Description: Type of the Symptom.
    Possible values:
    • SYMPTOM_TYPE_UNSPECIFIED - Unspecified symptom.
    • LOW_MEMORY - TPU VM memory is low.
    • OUT_OF_MEMORY - TPU runtime is out of memory.
    • EXECUTE_TIMED_OUT - TPU runtime execution has timed out.
    • MESH_BUILD_FAIL - TPU runtime fails to construct a mesh that recognizes each TPU device’s neighbors.
    • HBM_OUT_OF_MEMORY - TPU HBM is out of memory.
    • PROJECT_ABUSE - Abusive behaviors have been identified on the current project.
  • worker_id
    Type: STRING
    Provider name: workerId
    Description: A string used to uniquely distinguish a worker within a TPU node.

tags

Type: UNORDERED_LIST_STRING

tensorflow_version

Type: STRING
Provider name: tensorflowVersion
Description: Required. The version of TensorFlow running in the node.

use_service_networking

Type: BOOLEAN
Provider name: useServiceNetworking
Description: Whether the VPC peering for the node is set up through the Service Networking API. The VPC peering should be set up before provisioning the node. If this field is set, the cidr_block field should not be specified. If the network that you want to peer the TPU node to is a Shared VPC network, the node must be created with this field enabled.
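
The rule that cidr_block should not be specified when use_service_networking is set can be expressed as a simple consistency check. This is a minimal sketch that assumes the instance is available as a Python dict keyed by the field names in this listing; the function name and message text are hypothetical.

```python
from typing import List


def networking_config_issues(instance: dict) -> List[str]:
    """Return human-readable issues with the node's networking fields."""
    issues = []
    if instance.get("use_service_networking") and instance.get("cidr_block"):
        issues.append(
            "cidr_block should not be specified when use_service_networking is set"
        )
    return issues
```

An empty list means that no conflict between the two fields was found. The check does not verify that the VPC peering itself exists.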