vSphere

Agent Check

Supported OS: Linux Mac OS Windows

Overview

This check collects resource usage metrics from your vSphere cluster: CPU, disk, memory, and network usage. It also watches your vCenter server for events and emits them to Datadog.

Setup

Installation

The vSphere check is included in the Datadog Agent package, so you don’t need to install anything else on your vCenter server.

Configuration

In the Administration section of vCenter, add a read-only user called datadog-readonly.

Then, edit the vsphere.d/conf.yaml file in the conf.d/ folder at the root of your Agent’s configuration directory. See the sample vsphere.d/conf.yaml for all available configuration options:

init_config:

instances:
  - name: main-vcenter # how metrics are tagged, i.e. 'vcenter_server:main-vcenter'
    host: <VCENTER_HOSTNAME>          # e.g. myvcenter.example.com
    username: <USER_YOU_JUST_CREATED> # e.g. datadog-readonly@vsphere.local
    password: <PASSWORD>

Restart the Agent to start sending vSphere metrics and events to Datadog.

Note: The Datadog Agent doesn’t need to be on the same server as the vSphere appliance software. An Agent with the vSphere check enabled can be set up, no matter what OS it’s running on, to point to a vSphere appliance server. Update your <VCENTER_HOSTNAME> accordingly.

Compatibility

Starting with version 3.3.0 of the check, shipped in Agent version 6.5.0/5.27.0, a new optional parameter collection_level is available to select which metrics to collect from vCenter, and the optional parameter all_metrics was deprecated. Along with this change, the names of the metrics sent to Datadog by the integration have changed, with the addition of a suffix specifying the rollup type of the metric exposed by vCenter (.avg, .sum, etc.).

By default, starting with version 3.3.0, the collection_level is set to 1 and the new metric names with the additional suffix are sent by the integration.

The following scenarios are possible when using the vSphere integration:

  1. You never used the integration before, and you just installed an Agent with version 6.5.0+ / 5.27.0+. No special action is needed in this case. Use the integration, configure the collection_level, and view your metrics in Datadog.

  2. You used the integration with an Agent older than 6.5.0/5.27.0, and upgraded to a newer version:

    • If your configuration specifically set the all_metrics parameter to either true or false, nothing changes (the same metrics are sent to Datadog). You should then update your dashboards and monitors to use the new metric names before switching to the new collection_level parameter, since all_metrics is deprecated and will eventually be removed.
    • If your configuration did not specify the all_metrics parameter, upon upgrade the integration defaults to the collection_level parameter set to 1 and sends the metrics with the new name to Datadog. Warning: this breaks your dashboard graphs and monitors scoped on the deprecated metrics, which stop being sent. To prevent this, you should explicitly set all_metrics: false in your configuration to continue reporting the same metrics, then update your dashboards and monitors to use the new metrics before switching back to using collection_level.
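
The second scenario above can be handled in two phases. The following is a sketch of the relevant instance settings, reusing the placeholders from the Configuration section; adapt it to your own vsphere.d/conf.yaml rather than treating it as the canonical layout.

```yaml
init_config:

instances:
  - name: main-vcenter
    host: <VCENTER_HOSTNAME>
    username: <USER_YOU_JUST_CREATED>
    password: <PASSWORD>
    # Phase 1: pin the deprecated parameter so the pre-3.3.0 metric names
    # keep being reported while dashboards and monitors are updated.
    all_metrics: false
    # Phase 2 (once dashboards and monitors use the new, suffixed names):
    # remove all_metrics above and uncomment the line below.
    # collection_level: 1
```

Restart the Agent after each change so the new settings take effect.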

Configuration Options

Option | Required | Description
--- | --- | ---
ssl_verify | No | Set to false to disable SSL verification when connecting to vCenter.
ssl_capath | No | Set to the absolute file path of a directory containing CA certificates in PEM format.
host_include_only_regex | No | Use a regex to fetch metrics only for the matching ESXi hosts and the VMs running on them.
vm_include_only_regex | No | Use a regex to include only the VMs that match this pattern.
collection_level | No | A number between 1 and 4 that specifies how many metrics are sent: 1 means only important monitoring metrics, and 4 means every metric available.
all_metrics | No | (Deprecated) When set to true, this collects EVERY metric from vCenter, which means a LOT of metrics. When set to false, this collects a subset of metrics we selected that are interesting to monitor.
event_config | No | Event config is a dictionary. For now the only switch you can flip is collect_vcenter_alarms, which sends the alarms set in vCenter as events.
include_only_marked | No | Set to true if you’d like to only collect metrics on vSphere VMs that are marked by a custom field with the value ‘DatadogMonitored’. To set this custom field, you can use the UI to apply a tag, or the CLI with PowerCLI. An example working on vSphere 5.1 is: `Get-VM VM
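
As an illustration of how these options combine, here is a sketch of an instance that sets several of them at once. The regex patterns and the certificate directory are hypothetical values chosen for the example, not defaults shipped with the check; see the sample vsphere.d/conf.yaml for the authoritative list.

```yaml
init_config:

instances:
  - name: main-vcenter
    host: <VCENTER_HOSTNAME>
    username: <USER_YOU_JUST_CREATED>
    password: <PASSWORD>
    ssl_verify: true
    ssl_capath: /etc/ssl/vcenter-cas          # hypothetical directory of PEM CA certificates
    host_include_only_regex: '^esxi-prod-.*'  # hypothetical ESXi host name pattern
    vm_include_only_regex: '^web-.*'          # hypothetical VM name pattern
    include_only_marked: false
    collection_level: 2                       # 1 = key metrics only, 4 = every metric
    event_config:
      collect_vcenter_alarms: true            # send vCenter alarms as Datadog events
```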

Validation

Run the Agent’s status subcommand and look for vsphere under the Checks section.

Data Collected

Metrics

vsphere.clusterServices.cpufairness.latest
(gauge)
Fairness of distributed CPU resource allocation
vsphere.clusterServices.effectivecpu.avg
(gauge)
Total available CPU resources of all hosts within a cluster
Shown as megahertz
vsphere.clusterServices.effectivemem.avg
(gauge)
Total amount of machine memory of all hosts in the cluster that is available for use for virtual machine memory (physical memory for use by the Guest OS) and virtual machine overhead memory
Shown as mebibyte
vsphere.clusterServices.failover.latest
(gauge)
vSphere HA number of failures that can be tolerated
vsphere.clusterServices.memfairness.latest
(gauge)
Fairness of distributed memory resource allocation
vsphere.cpu.coreUtilization.avg
(gauge)
CPU utilization of the corresponding core (if hyper-threading is enabled) as a percentage
Shown as percent
vsphere.cpu.costop.sum
(gauge)
Time the virtual machine is ready to run, but is unable to run due to co-scheduling constraints
Shown as millisecond
vsphere.cpu.cpuentitlement.latest
(gauge)
Amount of CPU resources allocated to the virtual machine or resource pool
Shown as megahertz
vsphere.cpu.demand.avg
(gauge)
The amount of CPU resources a virtual machine would use if there were no CPU contention or CPU limit
Shown as megahertz
vsphere.cpu.demandEntitlementRatio.latest
(gauge)
CPU resource entitlement to CPU demand ratio (in percent)
Shown as percent
vsphere.cpu.entitlement.latest
(gauge)
CPU resources devoted by the ESXi scheduler
Shown as megahertz
vsphere.cpu.extra
(gauge)
Milliseconds of extra CPU time
Shown as millisecond
vsphere.cpu.idle.sum
(gauge)
Total time that the CPU spent in an idle state
Shown as millisecond
vsphere.cpu.latency.avg
(gauge)
Percent of time the virtual machine is unable to run because it is contending for access to the physical CPU(s)
Shown as percent
vsphere.cpu.maxlimited.sum
(gauge)
Time the virtual machine is ready to run, but is not running because it has reached its maximum CPU limit setting
Shown as millisecond
vsphere.cpu.overlap.sum
(gauge)
Time the virtual machine was interrupted to perform system services on behalf of itself or other virtual machines.
Shown as millisecond
vsphere.cpu.readiness.avg
(gauge)
Percentage of time that the virtual machine was ready, but could not get scheduled to run on the physical CPU.
Shown as percent
vsphere.cpu.ready
(gauge)
Milliseconds of CPU time spent in ready state
Shown as millisecond
vsphere.cpu.ready.sum
(gauge)
Milliseconds of CPU time spent in ready state
Shown as millisecond
vsphere.cpu.reservedCapacity.avg
(gauge)
Total CPU capacity reserved by virtual machines
Shown as megahertz
vsphere.cpu.run.sum
(gauge)
Time the virtual machine is scheduled to run
Shown as millisecond
vsphere.cpu.swapwait.sum
(gauge)
CPU time spent waiting for swap-in
Shown as millisecond
vsphere.cpu.system.sum
(gauge)
Amount of time spent on system processes on each virtual CPU in the virtual machine. This is the host view of the CPU usage, not the guest operating system view.
Shown as millisecond
vsphere.cpu.totalCapacity.avg
(gauge)
Total CPU capacity reserved by and available for virtual machines
Shown as megahertz
vsphere.cpu.totalmhz.avg
(gauge)
Total megahertz of CPU being used
Shown as megahertz
vsphere.cpu.usage
(gauge)
Percentage of CPU capacity being used
Shown as percent
vsphere.cpu.usage.avg
(gauge)
Percentage of CPU capacity being used
Shown as percent
vsphere.cpu.usagemhz
(gauge)
Total megahertz of CPU being used
Shown as megahertz
vsphere.cpu.usagemhz.avg
(gauge)
CPU usage, as measured in megahertz
Shown as megahertz
vsphere.cpu.used.sum
(gauge)
Time accounted to the virtual machine. If a system service runs on behalf of this virtual machine, the time spent by that service (represented by cpu.system) should be charged to this virtual machine. If not, the time spent (represented by cpu.overlap) should not be charged against this virtual machine.
Shown as millisecond
vsphere.cpu.utilization.avg
(gauge)
CPU utilization as a percentage during the interval (CPU usage and CPU utilization might be different due to power management technologies or hyper-threading)
Shown as percent
vsphere.cpu.wait.sum
(gauge)
Total CPU time spent in wait state. The wait total includes time spent in the CPU Idle, CPU Swap Wait, and CPU I/O Wait states.
Shown as millisecond
vsphere.datastore.busResets.sum
(gauge)
Number of SCSI-bus reset commands issued
Shown as command
vsphere.datastore.commandsAborted.sum
(gauge)
Number of SCSI commands aborted
Shown as command
vsphere.datastore.datastoreIops.avg
(gauge)
Storage I/O Control aggregated IOPS
Shown as operation
vsphere.datastore.datastoreMaxQueueDepth.latest
(gauge)
Storage I/O Control datastore maximum queue depth
Shown as command
vsphere.datastore.datastoreNormalReadLatency.latest
(gauge)
Storage DRS datastore normalized read latency
Shown as millisecond
vsphere.datastore.datastoreNormalWriteLatency.latest
(gauge)
Storage DRS datastore normalized write latency
Shown as millisecond
vsphere.datastore.datastoreReadBytes.latest
(gauge)
Storage DRS datastore bytes read
Shown as millisecond
vsphere.datastore.datastoreReadIops.latest
(gauge)
Storage DRS datastore read I/O rate
Shown as operation
vsphere.datastore.datastoreReadLoadMetric.latest
(gauge)
Storage DRS datastore metric for read workload model
vsphere.datastore.datastoreReadOIO.latest
(gauge)
Storage DRS datastore outstanding read requests
Shown as request
vsphere.datastore.datastoreVMObservedLatency.latest
(gauge)
The average datastore latency as seen by virtual machines
Shown as microsecond
vsphere.datastore.datastoreWriteBytes.latest
(gauge)
Storage DRS datastore bytes written
Shown as millisecond
vsphere.datastore.datastoreWriteIops.latest
(gauge)
Storage DRS datastore write I/O rate
Shown as operation
vsphere.datastore.datastoreWriteLoadMetric.latest
(gauge)
Storage DRS datastore metric for write workload model
vsphere.datastore.datastoreWriteOIO.latest
(gauge)
Storage DRS datastore outstanding write requests
Shown as request
vsphere.datastore.maxTotalLatency.latest
(gauge)
Highest latency value across all datastores used by the host
Shown as millisecond
vsphere.datastore.numberReadAveraged.avg
(gauge)
Average number of read commands issued per second to the datastore
Shown as command
vsphere.datastore.numberWriteAveraged.avg
(gauge)
Average number of write commands issued per second to the datastore during the collection interval
vsphere.datastore.read.avg
(gauge)
Rate of reading data from the datastore
Shown as kibibyte
vsphere.datastore.siocActiveTimePercentage.avg
(gauge)
Percentage of time Storage I/O Control actively controlled datastore latency
Shown as percent
vsphere.datastore.sizeNormalizedDatastoreLatency.avg
(gauge)
Storage I/O Control size-normalized I/O latency
Shown as microsecond
vsphere.datastore.throughput.contention.avg
(gauge)
Average amount of time for an I/O operation to the datastore or LUN across all ESX hosts accessing it.
Shown as millisecond
vsphere.datastore.throughput.usage.avg
(gauge)
The current bandwidth usage for the datastore or LUN.
Shown as kibibyte
vsphere.datastore.totalReadLatency.avg
(gauge)
Average amount of time for a read operation from the datastore
Shown as millisecond
vsphere.datastore.totalWriteLatency.avg
(gauge)
Average amount of time for a write operation to the datastore
Shown as millisecond
vsphere.datastore.write.avg
(gauge)
Rate of writing data to the datastore
Shown as kibibyte
vsphere.disk.busResets.sum
(gauge)
Number of SCSI-bus reset commands issued
Shown as command
vsphere.disk.capacity.contention.avg
(gauge)
The amount of storage capacity overcommitment for the entity, measured in percent.
Shown as percent
vsphere.disk.capacity.latest
(gauge)
Configured size of the datastore
Shown as kibibyte
vsphere.disk.capacity.provisioned.avg
(gauge)
Provisioned size of the entity
Shown as kibibyte
vsphere.disk.capacity.usage.avg
(gauge)
The amount of storage capacity currently being consumed by or on the entity.
Shown as kibibyte
vsphere.disk.commands.sum
(gauge)
Number of SCSI commands issued
Shown as command
vsphere.disk.commandsAborted
(gauge)
Number of SCSI commands aborted
Shown as occurrence
vsphere.disk.commandsAborted.sum
(gauge)
Number of SCSI commands aborted
Shown as command
vsphere.disk.commandsAveraged.avg
(gauge)
Average number of SCSI commands issued per second
Shown as command
vsphere.disk.deltaused.latest
(gauge)
Storage overhead of a virtual machine or a datastore due to delta disk backings
Shown as kibibyte
vsphere.disk.deviceLatency
(gauge)
Average amount of time it takes to complete an SCSI command from physical device
Shown as millisecond
vsphere.disk.deviceLatency.avg
(gauge)
Average amount of time it takes to complete an SCSI command from physical device
Shown as millisecond
vsphere.disk.deviceReadLatency
(gauge)
Average amount of time it takes to complete read from physical device
Shown as millisecond
vsphere.disk.deviceReadLatency.avg
(gauge)
Average amount of time to read from the physical device
Shown as millisecond
vsphere.disk.deviceWriteLatency
(gauge)
Average amount of time it takes to complete write to the physical device (LUN)
Shown as millisecond
vsphere.disk.deviceWriteLatency.avg
(gauge)
Average amount of time to write to the physical device
Shown as millisecond
vsphere.disk.kernelLatency.avg
(gauge)
Average amount of time spent by VMkernel to process each SCSI command
Shown as millisecond
vsphere.disk.kernelReadLatency.avg
(gauge)
Average amount of time spent by VMkernel to process each SCSI read command
Shown as millisecond
vsphere.disk.kernelWriteLatency.avg
(gauge)
Average amount of time spent by VMkernel to process each SCSI write command
Shown as millisecond
vsphere.disk.maxQueueDepth.avg
(gauge)
Maximum queue depth
Shown as command
vsphere.disk.maxTotalLatency.latest
(gauge)
Highest latency value across all disks used by the host
Shown as millisecond
vsphere.disk.numberRead.sum
(gauge)
Number of disk reads during the collection interval.
vsphere.disk.numberReadAveraged.avg
(gauge)
Average number of read commands issued per second to the datastore
Shown as command
vsphere.disk.numberWrite.sum
(gauge)
Number of disk writes during the collection interval.
vsphere.disk.numberWriteAveraged.avg
(gauge)
Average number of write commands issued per second to the datastore
Shown as command
vsphere.disk.provisioned.latest
(gauge)
Amount of storage set aside for use by a datastore or a virtual machine. Files on the datastore and the virtual machine can expand to this size but not beyond it
Shown as kibibyte
vsphere.disk.queueLatency
(gauge)
Average amount of time spent in VMkernel queue (per SCSI command)
Shown as millisecond
vsphere.disk.queueLatency.avg
(gauge)
Average amount of time spent in the VMkernel queue per SCSI command
Shown as millisecond
vsphere.disk.queueReadLatency.avg
(gauge)
Average amount of time spent in the VMkernel queue per SCSI read command
Shown as millisecond
vsphere.disk.queueWriteLatency.avg
(gauge)
Average amount of time spent in the VMkernel queue per SCSI write command
Shown as millisecond
vsphere.disk.read.avg
(gauge)
Average number of kilobytes read from the disk each second
Shown as kibibyte
vsphere.disk.scsiReservationCnflctsPct.avg
(gauge)
Number of SCSI reservation conflicts for the LUN as a percent of total commands during the collection interval
Shown as percent
vsphere.disk.scsiReservationConflicts.sum
(gauge)
Number of SCSI reservation conflicts for the LUN during the collection interval
vsphere.disk.totalLatency
(gauge)
Sum of average amount of time (in kernel and device) to process an SCSI command issued by the Guest OS to the vm
Shown as millisecond
vsphere.disk.totalLatency.avg
(gauge)
Average amount of time taken during the collection interval to process a SCSI command issued by the guest OS to the virtual machine.
Shown as millisecond
vsphere.disk.totalReadLatency.avg
(gauge)
Average amount of time taken to process a SCSI read command issued from the guest OS to the virtual machine
Shown as millisecond
vsphere.disk.totalWriteLatency.avg
(gauge)
Average amount of time taken to process a SCSI write command issued by the guest OS to the virtual machine
Shown as millisecond
vsphere.disk.unshared.latest
(gauge)
Amount of space associated exclusively with a virtual machine
Shown as kibibyte
vsphere.disk.usage.avg
(gauge)
Aggregated disk I/O rate
Shown as kibibyte
vsphere.disk.used.latest
(gauge)
Amount of space actually used by the virtual machine or the datastore. May be less than the amount provisioned at any given time, depending on whether the virtual machine is powered-off, whether snapshots have been created or not, and other such factors
Shown as kibibyte
vsphere.disk.write.avg
(gauge)
Average number of kilobytes written to the disk each second
Shown as kibibyte
vsphere.hbr.hbrNetRx.avg
(gauge)
Kilobytes per second of incoming host-based replication network traffic (for this virtual machine or host).
Shown as kibibyte
vsphere.hbr.hbrNetTx.avg
(gauge)
Average amount of data transmitted per second
Shown as kibibyte
vsphere.hbr.hbrNumVms.avg
(gauge)
Number of powered-on virtual machines running on this host that currently have host-based replication protection enabled.
vsphere.mem.active
(gauge)
Kilobytes of memory that the VMkernel estimates is being actively used based on recently touched memory pages
Shown as kibibyte
vsphere.mem.active.avg
(gauge)
Amount of memory that is actively used, as estimated by VMkernel based on recently touched memory pages
Shown as kibibyte
vsphere.mem.activewrite.avg
(gauge)
Estimate for the amount of memory actively being written to by the virtual machine
Shown as kibibyte
vsphere.mem.compressed
(gauge)
Kilobytes of memory that have been compressed
Shown as kibibyte
vsphere.mem.compressed.avg
(gauge)
Amount of memory that has been compressed
Shown as kibibyte
vsphere.mem.compressionRate.avg
(gauge)
Rate of memory compression for the virtual machine
Shown as kibibyte
vsphere.mem.consumed
(gauge)
Kilobytes of used memory
Shown as kibibyte
vsphere.mem.consumed.avg
(gauge)
Amount of host physical memory consumed by a virtual machine, host, or cluster
Shown as kibibyte
vsphere.mem.consumed.userworlds.avg
(gauge)
Amount of physical memory consumed by userworlds on this host
Shown as kibibyte
vsphere.mem.consumed.vms.avg
(gauge)
Amount of physical memory consumed by VMs on this host.
Shown as kibibyte
vsphere.mem.decompressionRate.avg
(gauge)
Rate of memory decompression for the virtual machine
Shown as kibibyte
vsphere.mem.entitlement.avg
(gauge)
Amount of host physical memory the virtual machine is entitled to, as determined by the ESX scheduler
Shown as kibibyte
vsphere.mem.granted.avg
(gauge)
Amount of host physical memory or physical memory that is mapped for a virtual machine or a host
Shown as kibibyte
vsphere.mem.heap.avg
(gauge)
VMkernel virtual address space dedicated to VMkernel main heap and related data
Shown as kibibyte
vsphere.mem.heapfree.avg
(gauge)
Free address space in the VMkernel main heap. Varies based on number of physical devices and configuration options. There is no direct way for the user to increase or decrease this statistic. For informational purposes only: not useful for performance monitoring.
Shown as kibibyte
vsphere.mem.latency.avg
(gauge)
Percentage of time the virtual machine is waiting to access swapped or compressed memory
Shown as percent
vsphere.mem.llSwapIn.avg
(gauge)
Amount of memory swapped-in from host cache
Shown as kibibyte
vsphere.mem.llSwapInRate.avg
(gauge)
Rate at which memory is being swapped from host cache into active memory
Shown as kibibyte
vsphere.mem.llSwapOut.avg
(gauge)
Amount of memory swapped-out to host cache
Shown as kibibyte
vsphere.mem.llSwapOutRate.avg
(gauge)
Rate at which memory is being swapped from active memory to host cache
Shown as kibibyte
vsphere.mem.llSwapUsed.avg
(gauge)
Space used for caching swapped pages in the host cache
Shown as kibibyte
vsphere.mem.lowfreethreshold.avg
(gauge)
Threshold of free host physical memory below which ESX/ESXi will begin reclaiming memory from virtual machines through ballooning and swapping
Shown as kibibyte
vsphere.mem.mementitlement.latest
(gauge)
Memory allocation as calculated by the VMkernel scheduler based on current estimated demand and reservation, limit, and shares policies set for all virtual machines and resource pools in the host or cluster
Shown as mebibyte
vsphere.mem.overhead
(gauge)
Kilobytes of memory allocated to a vm beyond its reserved amount
Shown as kibibyte
vsphere.mem.overhead.avg
(gauge)
Host physical memory consumed by the virtualization infrastructure for running the virtual machine
Shown as kibibyte
vsphere.mem.overheadMax.avg
(gauge)
Host physical memory reserved for use as the virtualization overhead for the virtual machine
Shown as kibibyte
vsphere.mem.overheadTouched.avg
(gauge)
Actively touched overhead host physical memory (KB) reserved for use as the virtualization overhead for the virtual machine
Shown as kibibyte
vsphere.mem.reservedCapacity.avg
(gauge)
Total amount of memory reservation used by powered-on virtual machines and vSphere services on the host
Shown as mebibyte
vsphere.mem.shared.avg
(gauge)
Amount of guest physical memory that is shared with other virtual machines, relative to a single virtual machine or to all powered-on virtual machines on a host
Shown as kibibyte
vsphere.mem.sharedcommon.avg
(gauge)
Amount of machine memory that is shared by all powered-on virtual machines and vSphere services on the host
Shown as kibibyte
vsphere.mem.state.latest
(gauge)
One of four threshold levels representing the percentage of free memory on the host. The counter value determines swapping and ballooning behavior for memory reclamation
Shown as kibibyte
vsphere.mem.swapin.avg
(gauge)
Amount of memory swapped-in from disk
Shown as kibibyte
vsphere.mem.swapinRate.avg
(gauge)
Rate at which memory is swapped from disk into active memory
Shown as kibibyte
vsphere.mem.swapout.avg
(gauge)
Amount of memory swapped-out to disk
Shown as kibibyte
vsphere.mem.swapoutRate.avg
(gauge)
Rate at which memory is being swapped from active memory to disk
Shown as kibibyte
vsphere.mem.swapped.avg
(gauge)
Current amount of guest physical memory swapped out to the virtual machine swap file by the VMkernel. Swapped memory stays on disk until the virtual machine needs it. This statistic refers to VMkernel swapping and not to guest OS swapping
Shown as kibibyte
vsphere.mem.swaptarget.avg
(gauge)
Target size for the virtual machine swap file. The VMkernel manages swapping by comparing swaptarget against swapped
Shown as kibibyte
vsphere.mem.swapused.avg
(gauge)
Amount of memory that is used by swap. Sum of memory swapped of all powered on VMs and vSphere services on the host
Shown as kibibyte
vsphere.mem.sysUsage.avg
(gauge)
Amount of host physical memory used by VMkernel for core functionality, such as device drivers and other internal uses. Does not include memory used by virtual machines or vSphere services
Shown as kibibyte
vsphere.mem.totalCapacity.avg
(gauge)
Total amount of memory reservation used by and available for powered-on virtual machines and vSphere services on the host
Shown as mebibyte
vsphere.mem.totalmb.avg
(gauge)
Total amount of host physical memory of all hosts in the cluster that is available for virtual machine memory (physical memory for use by the guest OS) and virtual machine overhead memory
Shown as kibibyte
vsphere.mem.unreserved.avg
(gauge)
Amount of memory that is unreserved. Memory reservation not used by the Service Console, VMkernel, vSphere services and other powered on VMs user-specified memory reservations and overhead memory
Shown as kibibyte
vsphere.mem.usage.avg
(gauge)
Memory usage as percent of total configured or available memory
Shown as percent
vsphere.mem.vmfs.pbc.capMissRatio.latest
(gauge)
Trailing average of the ratio of capacity misses to compulsory misses for the VMFS PB Cache
Shown as percent
vsphere.mem.vmfs.pbc.overhead.latest
(gauge)
Amount of VMFS heap used by the VMFS PB Cache
Shown as kibibyte
vsphere.mem.vmfs.pbc.size.latest
(gauge)
Space used for holding VMFS Pointer Blocks in memory
Shown as mebibyte
vsphere.mem.vmfs.pbc.sizeMax.latest
(gauge)
Maximum size the VMFS Pointer Block Cache can grow to
Shown as mebibyte
vsphere.mem.vmfs.pbc.workingSet.latest
(gauge)
Amount of file blocks whose addresses are cached in the VMFS PB Cache
Shown as tebibyte
vsphere.mem.vmfs.pbc.workingSetMax.latest
(gauge)
Maximum amount of file blocks whose addresses are cached in the VMFS PB Cache
Shown as tebibyte
vsphere.mem.vmmemctl
(gauge)
Kilobytes of memory allocated by the virtual machine memory control driver (vmmemctl)
Shown as kibibyte
vsphere.mem.vmmemctl.avg
(gauge)
Amount of memory allocated by the virtual machine memory control driver (vmmemctl)
Shown as kibibyte
vsphere.mem.vmmemctltarget.avg
(gauge)
Target value set by VMkernel for the virtual machine's memory balloon size. In conjunction with the vmmemctl metric, this metric is used by VMkernel to inflate and deflate the balloon for a virtual machine
Shown as kibibyte
vsphere.mem.zero.avg
(gauge)
Memory that contains 0s only. Included in shared amount. Through transparent page sharing, zero memory pages can be shared among virtual machines that run the same operating system
Shown as kibibyte
vsphere.mem.zipSaved.latest
(gauge)
Memory saved due to memory zipping
Shown as kibibyte
vsphere.mem.zipped.latest
(gauge)
Memory zipped
Shown as kibibyte
vsphere.net.broadcastRx.sum
(gauge)
Number of broadcast packets received
Shown as packet
vsphere.net.broadcastTx.sum
(gauge)
Number of broadcast packets transmitted
Shown as packet
vsphere.net.bytesRx.avg
(gauge)
Average amount of data received per second
Shown as kibibyte
vsphere.net.bytesTx.avg
(gauge)
Average amount of data transmitted per second
Shown as kibibyte
vsphere.net.droppedRx.sum
(gauge)
Number of received packets dropped
Shown as packet
vsphere.net.droppedTx.sum
(gauge)
Number of transmitted packets dropped
Shown as packet
vsphere.net.errorsRx.sum
(gauge)
Number of packets with errors received
Shown as packet
vsphere.net.errorsTx.sum
(gauge)
Number of packets with errors transmitted
Shown as packet
vsphere.net.multicastRx.sum
(gauge)
Number of multicast packets received
Shown as packet
vsphere.net.multicastTx.sum
(gauge)
Number of multicast packets transmitted
Shown as packet
vsphere.net.packetsRx.sum
(gauge)
Number of packets received
Shown as packet
vsphere.net.packetsTx.sum
(gauge)
Number of packets transmitted
Shown as packet
vsphere.net.pnicBytesRx.avg
(gauge)
vsphere.net.pnicBytesTx.avg
(gauge)
vsphere.net.received.avg
(gauge)
Average rate at which data was received during the interval. This represents the bandwidth of the network
Shown as kibibyte
vsphere.net.transmitted.avg
(gauge)
Average rate at which data was transmitted during the interval. This represents the bandwidth of the network
Shown as kibibyte
vsphere.net.unknownProtos.sum
(gauge)
Number of frames with unknown protocol received
Shown as kibibyte
vsphere.net.usage.avg
(gauge)
Network utilization (combined transmit- and receive-rates)
Shown as kibibyte
vsphere.network.received
(rate)
Number of kilobytes received by the host
Shown as kibibyte
vsphere.network.transmitted
(rate)
Number of kilobytes transmitted by the host
Shown as kibibyte
vsphere.power.energy.sum
(gauge)
Total energy (in joules) used since last stats reset.
vsphere.power.power.avg
(gauge)
Current power usage
Shown as watt
vsphere.power.powerCap.avg
(gauge)
Maximum allowed power usage.
Shown as watt
vsphere.rescpu.actav1.latest
(gauge)
CPU active average over 1 minute
Shown as percent
vsphere.rescpu.actav15.latest
(gauge)
CPU active average over 15 minutes
Shown as percent
vsphere.rescpu.actav5.latest
(gauge)
CPU active average over 5 minutes
Shown as percent
vsphere.rescpu.actpk1.latest
(gauge)
CPU active peak over 1 minute
Shown as percent
vsphere.rescpu.actpk15.latest
(gauge)
CPU active peak over 15 minutes
Shown as percent
vsphere.rescpu.actpk5.latest
(gauge)
CPU active peak over 5 minutes
Shown as percent
vsphere.rescpu.maxLimited1.latest
(gauge)
Amount of CPU resources over the limit that were refused, average over 1 minute
Shown as percent
vsphere.rescpu.maxLimited15.latest
(gauge)
Amount of CPU resources over the limit that were refused, average over 15 minutes
Shown as percent
vsphere.rescpu.maxLimited5.latest
(gauge)
Amount of CPU resources over the limit that were refused, average over 5 minutes
Shown as percent
vsphere.rescpu.runav1.latest
(gauge)
CPU running average over 1 minute
Shown as percent
vsphere.rescpu.runav15.latest
(gauge)
CPU running average over 15 minutes
Shown as percent
vsphere.rescpu.runav5.latest
(gauge)
CPU running average over 5 minutes
Shown as percent
vsphere.rescpu.runpk1.latest
(gauge)
CPU running peak over 1 minute
Shown as percent
vsphere.rescpu.runpk15.latest
(gauge)
CPU running peak over 15 minutes
Shown as percent
vsphere.rescpu.runpk5.latest
(gauge)
CPU running peak over 5 minutes
Shown as percent
vsphere.rescpu.sampleCount.latest
(gauge)
Group CPU sample count.
vsphere.rescpu.samplePeriod.latest
(gauge)
Group CPU sample period.
Shown as millisecond
vsphere.storageAdapter.commandsAveraged.avg
(gauge)
Average number of commands issued per second by the storage adapter
Shown as command
vsphere.storageAdapter.maxTotalLatency.latest
(gauge)
Highest latency value across all storage adapters used by the host
Shown as millisecond
vsphere.storageAdapter.numberReadAveraged.avg
(gauge)
Average number of read commands issued per second by the storage adapter
Shown as command
vsphere.storageAdapter.numberWriteAveraged.avg
(gauge)
Average number of write commands issued per second by the storage adapter
Shown as command
vsphere.storageAdapter.outstandingIOs.avg
(gauge)
The number of I/Os that have been issued but have not yet completed
Shown as command
vsphere.storageAdapter.queueDepth.avg
(gauge)
The maximum number of I/Os that can be outstanding at a given time
Shown as command
vsphere.storageAdapter.queueLatency.avg
(gauge)
Average amount of time spent in the VMkernel queue per SCSI command
Shown as millisecond
vsphere.storageAdapter.queued.avg
(gauge)
The current number of I/Os that are waiting to be issued
Shown as command
vsphere.storageAdapter.read.avg
(gauge)
Rate of reading data by the storage adapter
Shown as kibibyte
vsphere.storageAdapter.totalReadLatency.avg
(gauge)
Average amount of time for a read operation by the storage adapter
Shown as millisecond
vsphere.storageAdapter.totalWriteLatency.avg
(gauge)
Average amount of time for a write operation by the storage adapter
Shown as millisecond
vsphere.storageAdapter.write.avg
(gauge)
Rate of writing data by the storage adapter
Shown as kibibyte
vsphere.storagePath.busResets.sum
(gauge)
Number of SCSI-bus reset commands issued
Shown as command
vsphere.storagePath.commandsAborted.sum
(gauge)
Number of SCSI commands aborted
Shown as command
vsphere.storagePath.commandsAveraged.avg
(gauge)
Average number of commands issued per second on the storage path during the collection interval
Shown as command
vsphere.storagePath.maxTotalLatency.latest
(gauge)
Highest latency value across all storage paths used by the host
Shown as millisecond
vsphere.storagePath.numberReadAveraged.avg
(gauge)
Average number of read commands issued per second on the storage path during the collection interval
vsphere.storagePath.numberWriteAveraged.avg
(gauge)
Average number of write commands issued per second on the storage path during the collection interval
vsphere.storagePath.read.avg
(gauge)
Rate of reading data on the storage path
Shown as kibibyte
vsphere.storagePath.totalReadLatency.avg
(gauge)
Average amount of time for a read issued on the storage path. Total latency = kernel latency + device latency.
Shown as millisecond
vsphere.storagePath.totalWriteLatency.avg
(gauge)
Average amount of time for a write issued on the storage path. Total latency = kernel latency + device latency.
Shown as millisecond
vsphere.storagePath.write.avg
(gauge)
Rate of writing data on the storage path
Shown as kibibyte
vsphere.sys.heartbeat.latest
(gauge)
Number of heartbeats issued per virtual machine
vsphere.sys.heartbeat.sum
(gauge)
Number of heartbeats issued per virtual machine
vsphere.sys.osUptime.latest
(gauge)
Total time elapsed, in seconds, since last operating system boot-up
Shown as second
vsphere.sys.resourceCpuAct1.latest
(gauge)
CPU active average over 1 minute of the system resource group
Shown as percent
vsphere.sys.resourceCpuAct5.latest
(gauge)
CPU active average over 5 minutes of the system resource group
Shown as percent
vsphere.sys.resourceCpuAllocMax.latest
(gauge)
CPU allocation limit (in MHz) of the system resource group
Shown as megahertz
vsphere.sys.resourceCpuAllocMin.latest
(gauge)
CPU allocation reservation (in MHz) of the system resource group
Shown as megahertz
vsphere.sys.resourceCpuAllocShares.latest
(gauge)
CPU allocation shares of the system resource group
vsphere.sys.resourceCpuMaxLimited1.latest
(gauge)
CPU maximum limited over 1 minute of the system resource group
Shown as percent
vsphere.sys.resourceCpuMaxLimited5.latest
(gauge)
CPU maximum limited over 5 minutes of the system resource group
Shown as percent
vsphere.sys.resourceCpuRun1.latest
(gauge)
CPU running average over 1 minute of the system resource group
Shown as percent
vsphere.sys.resourceCpuRun5.latest
(gauge)
CPU running average over 5 minutes of the system resource group
Shown as percent
vsphere.sys.resourceCpuUsage.avg
(gauge)
Amount of CPU used by the Service Console and other applications during the interval
Shown as megahertz
vsphere.sys.resourceFdUsage.latest
(gauge)
Number of file descriptors used by the system resource group
vsphere.sys.resourceMemAllocMax.latest
(gauge)
Memory allocation limit (in KB) of the system resource group
Shown as kibibyte
vsphere.sys.resourceMemAllocMin.latest
(gauge)
Memory allocation reservation (in KB) of the system resource group
Shown as kibibyte
vsphere.sys.resourceMemAllocShares.latest
(gauge)
Memory allocation shares of the system resource group
vsphere.sys.resourceMemConsumed.latest
(gauge)
Memory consumed by the system resource group
Shown as kibibyte
vsphere.sys.resourceMemCow.latest
(gauge)
Memory shared by the system resource group
Shown as kibibyte
vsphere.sys.resourceMemMapped.latest
(gauge)
Memory mapped by the system resource group
Shown as kibibyte
vsphere.sys.resourceMemOverhead.latest
(gauge)
Overhead memory consumed by the system resource group
Shown as kibibyte
vsphere.sys.resourceMemShared.latest
(gauge)
Memory saved due to sharing by the system resource group
Shown as kibibyte
vsphere.sys.resourceMemSwapped.latest
(gauge)
Memory swapped out by the system resource group
Shown as kibibyte
vsphere.sys.resourceMemTouched.latest
(gauge)
Memory touched by the system resource group
Shown as kibibyte
vsphere.sys.resourceMemZero.latest
(gauge)
Zero-filled memory used by the system resource group
Shown as kibibyte
vsphere.sys.uptime.latest
(gauge)
Total time elapsed since last system startup
Shown as second
vsphere.virtualDisk.busResets.sum
(gauge)
Number of SCSI-bus reset commands issued
Shown as command
vsphere.virtualDisk.commandsAborted.sum
(gauge)
Number of SCSI commands aborted
Shown as command
vsphere.virtualDisk.largeSeeks.latest
(gauge)
Number of seeks during the interval that were greater than 8192 LBNs apart
vsphere.virtualDisk.mediumSeeks.latest
(gauge)
Number of seeks during the interval that were between 64 and 8192 LBNs apart
vsphere.virtualDisk.numberReadAveraged.avg
(gauge)
Average number of read commands issued per second to the virtual disk
Shown as command
vsphere.virtualDisk.numberWriteAveraged.avg
(gauge)
Average number of write commands issued per second to the virtual disk
Shown as command
vsphere.virtualDisk.read.avg
(gauge)
Average number of kilobytes read from the virtual disk each second
Shown as kibibyte
vsphere.virtualDisk.readIOSize.latest
(gauge)
Average read request size in bytes
vsphere.virtualDisk.readLatencyUS.latest
(gauge)
Read latency in microseconds
Shown as microsecond
vsphere.virtualDisk.readLoadMetric.latest
(gauge)
Storage DRS virtual disk metric for the read workload model
vsphere.virtualDisk.readOIO.latest
(gauge)
Average number of outstanding read requests to the virtual disk
Shown as request
vsphere.virtualDisk.smallSeeks.latest
(gauge)
Number of seeks during the interval that were less than 64 LBNs apart
vsphere.virtualDisk.totalReadLatency.avg
(gauge)
Average amount of time for a read operation from the virtual disk
Shown as millisecond
vsphere.virtualDisk.totalWriteLatency.avg
(gauge)
Average amount of time for a write operation to the virtual disk
Shown as millisecond
vsphere.virtualDisk.write.avg
(gauge)
Average number of kilobytes written to the virtual disk each second
Shown as kibibyte
vsphere.virtualDisk.writeIOSize.latest
(gauge)
Average write request size in bytes
vsphere.virtualDisk.writeLatencyUS.latest
(gauge)
Write latency in microseconds
Shown as microsecond
vsphere.virtualDisk.writeLoadMetric.latest
(gauge)
Storage DRS virtual disk metric for the write workload model
vsphere.virtualDisk.writeOIO.latest
(gauge)
Average number of outstanding write requests to the virtual disk
Shown as request
vsphere.vm.count
(gauge)
Timeseries with value 1 for each VM. Use 'sum by {X}' queries to count all the VMs with the tag X.
vsphere.host.count
(gauge)
Timeseries with value 1 for each ESXi host. Use 'sum by {X}' queries to count all the hosts with the tag X.
vsphere.datastore.count
(gauge)
Timeseries with value 1 for each datastore. Use 'sum by {X}' queries to count all the datastores with the tag X.
vsphere.datacenter.count
(gauge)
Timeseries with value 1 for each datacenter. Use 'sum by {X}' queries to count all the datacenters with the tag X.
vsphere.cluster.count
(gauge)
Timeseries with value 1 for each cluster. Use 'sum by {X}' queries to count all the clusters with the tag X.
vsphere.vmop.numChangeDS.latest
(gauge)
Number of datastore change operations for powered-off and suspended virtual machines
Shown as operation
vsphere.vmop.numChangeHost.latest
(gauge)
Number of host change operations for powered-off and suspended virtual machines
Shown as operation
vsphere.vmop.numChangeHostDS.latest
(gauge)
Number of host and datastore change operations for powered-off and suspended virtual machines
Shown as operation
vsphere.vmop.numClone.latest
(gauge)
Number of virtual machine clone operations
Shown as operation
vsphere.vmop.numCreate.latest
(gauge)
Number of virtual machine create operations
Shown as operation
vsphere.vmop.numDeploy.latest
(gauge)
Number of virtual machine template deploy operations
Shown as operation
vsphere.vmop.numDestroy.latest
(gauge)
Number of virtual machine delete operations
Shown as operation
vsphere.vmop.numPoweroff.latest
(gauge)
Number of virtual machine power off operations
Shown as operation
vsphere.vmop.numPoweron.latest
(gauge)
Number of virtual machine power on operations
Shown as operation
vsphere.vmop.numRebootGuest.latest
(gauge)
Number of virtual machine guest reboot operations
Shown as operation
vsphere.vmop.numReconfigure.latest
(gauge)
Number of virtual machine reconfigure operations
Shown as operation
vsphere.vmop.numRegister.latest
(gauge)
Number of virtual machine register operations
Shown as operation
vsphere.vmop.numReset.latest
(gauge)
Number of virtual machine reset operations
Shown as operation
vsphere.vmop.numSVMotion.latest
(gauge)
Number of migrations with Storage vMotion (datastore change operations for powered-on VMs)
Shown as operation
vsphere.vmop.numShutdownGuest.latest
(gauge)
Number of virtual machine guest shutdown operations
Shown as operation
vsphere.vmop.numStandbyGuest.latest
(gauge)
Number of virtual machine standby guest operations
Shown as operation
vsphere.vmop.numSuspend.latest
(gauge)
Number of virtual machine suspend operations
Shown as operation
vsphere.vmop.numUnregister.latest
(gauge)
Number of virtual machine unregister operations
Shown as operation
vsphere.vmop.numVMotion.latest
(gauge)
Number of migrations with vMotion (host change operations for powered-on VMs)
Shown as operation
vsphere.vmop.numXVMotion.latest
(gauge)
Number of host and datastore change operations for powered-on and suspended virtual machines
Shown as operation
datadog.vsphere.query_metrics.time.avg
(gauge)
Time required to run a query_metrics operation (avg)
Shown as second
datadog.vsphere.query_metrics.time.max
(gauge)
Time required to run a query_metrics operation (max)
Shown as second
datadog.vsphere.query_metrics.time.count
(gauge)
Time required to run a query_metrics operation (count)
Shown as second
datadog.vsphere.query_metrics.time.median
(gauge)
Time required to run a query_metrics operation (median)
Shown as second
datadog.vsphere.query_metrics.time.95percentile
(gauge)
Time required to run a query_metrics operation (95th percentile)
Shown as second
datadog.vsphere.query_tags.time
(gauge)
Time required to query vSphere tags
Shown as second
datadog.vsphere.collect_events.time
(gauge)
Time required to collect events
Shown as second
datadog.vsphere.refresh_infrastructure_cache.time
(gauge)
Time required to refresh the infrastructure cache
Shown as second
datadog.vsphere.refresh_metrics_metadata_cache.time
(gauge)
Time required to refresh the metrics metadata cache
Shown as second
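
The vsphere.vm.count, vsphere.host.count, vsphere.datastore.count, vsphere.datacenter.count, and vsphere.cluster.count metrics listed above each report a value of 1 per resource, so summing them by a tag gives a resource count per tag value. The following is a minimal sketch of how such a query string could be assembled. The helper name count_query and the tag vsphere_host are illustrative assumptions, not part of the integration; substitute whichever tag your resources actually carry.

```python
# Hypothetical helper: builds a Datadog metric query string that counts
# vSphere resources grouped by a tag. It does not call any Datadog API.
RESOURCE_TYPES = {"vm", "host", "datastore", "datacenter", "cluster"}


def count_query(resource: str, group_by: str, scope: str = "*") -> str:
    """Return a 'sum by' query over vsphere.<resource>.count."""
    if resource not in RESOURCE_TYPES:
        raise ValueError(f"unknown resource type: {resource!r}")
    # Doubled braces emit literal '{' and '}' around the scope and tag.
    return f"sum:vsphere.{resource}.count{{{scope}}} by {{{group_by}}}"


if __name__ == "__main__":
    # 'vsphere_host' is an assumed tag name used only for illustration.
    print(count_query("vm", "vsphere_host"))
```

Because every timeseries has the value 1, the sum over a group is simply the number of resources in that group.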

Events

This check watches vCenter’s Event Manager for events and emits them to Datadog. It does NOT emit the following event types:

  • AlarmStatusChangedEvent:Gray
  • VmBeingHotMigratedEvent
  • VmResumedEvent
  • VmReconfiguredEvent
  • VmPoweredOnEvent
  • VmMigratedEvent
  • TaskEvent:Initialize powering On
  • TaskEvent:Power Off virtual machine
  • TaskEvent:Power On virtual machine
  • TaskEvent:Reconfigure virtual machine
  • TaskEvent:Relocate virtual machine
  • TaskEvent:Suspend virtual machine
  • TaskEvent:Migrate virtual machine
  • VmMessageEvent
  • VmSuspendedEvent
  • VmPoweredOffEvent
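
The exclusion list above can be pictured as a simple set lookup. The sketch below is only an illustration of that idea and is not the check's actual implementation; the function name is_emitted and the way an event type is combined with a detail string (for example a task description or alarm color) are assumptions made for the example.

```python
from typing import Optional

# Event types the check does not emit, copied from the list above.
EXCLUDED_EVENTS = {
    "AlarmStatusChangedEvent:Gray",
    "VmBeingHotMigratedEvent",
    "VmResumedEvent",
    "VmReconfiguredEvent",
    "VmPoweredOnEvent",
    "VmMigratedEvent",
    "TaskEvent:Initialize powering On",
    "TaskEvent:Power Off virtual machine",
    "TaskEvent:Power On virtual machine",
    "TaskEvent:Reconfigure virtual machine",
    "TaskEvent:Relocate virtual machine",
    "TaskEvent:Suspend virtual machine",
    "TaskEvent:Migrate virtual machine",
    "VmMessageEvent",
    "VmSuspendedEvent",
    "VmPoweredOffEvent",
}


def is_emitted(event_type: str, detail: Optional[str] = None) -> bool:
    """Return False when the event (or its type:detail form) is excluded."""
    if event_type in EXCLUDED_EVENTS:
        return False
    if detail is not None and f"{event_type}:{detail}" in EXCLUDED_EVENTS:
        return False
    return True


if __name__ == "__main__":
    print(is_emitted("VmPoweredOnEvent"))
    print(is_emitted("TaskEvent", "Power Off virtual machine"))
    print(is_emitted("AlarmStatusChangedEvent", "Red"))
```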

Service Checks

vcenter.can_connect:
Returns CRITICAL if the Agent cannot connect to vCenter to collect metrics, otherwise OK.
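
As a rough illustration of the rule above, the following sketch maps a connection attempt to a service check status. It assumes a caller-supplied connect callable that raises on failure; that callable and the function name can_connect_status are hypothetical and do not reflect how the check is implemented. The numeric values are the standard Datadog service check status codes (0 for OK, 2 for CRITICAL).

```python
from typing import Callable

# Datadog service check status codes.
OK = 0
CRITICAL = 2


def can_connect_status(connect: Callable[[], object]) -> int:
    """Return CRITICAL if the connection attempt raises, otherwise OK."""
    try:
        connect()
    except Exception:
        return CRITICAL
    return OK


if __name__ == "__main__":
    def failing_connect() -> None:
        # Stand-in for a vCenter connection that cannot be established.
        raise ConnectionError("vCenter unreachable")

    print(can_connect_status(lambda: None))
    print(can_connect_status(failing_connect))
```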

Troubleshooting

Further Reading

See our blog post on monitoring vSphere environments with Datadog.