Microsoft Exchange Server

Supported OS Windows

Integration version 2.1.1

Overview

Get metrics from Microsoft Exchange Server to:

  • Visualize and monitor Exchange server performance

Setup

Installation

The Exchange check is included in the Datadog Agent package, so you don’t need to install anything else on your servers.

Configuration

  1. To start collecting your Exchange Server performance data, edit the exchange_server.d/conf.yaml file in the conf.d/ folder at the root of your Agent’s configuration directory.

  2. Restart the Agent.
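As a rough illustration of where the file in step 1 usually lives, the sketch below builds the expected path on a Windows host. It assumes the Agent's configuration root is C:\ProgramData\Datadog, which is the usual Windows default but may differ on your servers; the function name exchange_conf_path is hypothetical and not part of the Agent.

```python
from pathlib import PureWindowsPath

# Assumption: usual default Agent configuration root on Windows.
# Pass a different root if your installation differs.
DEFAULT_CONFIG_ROOT = r"C:\ProgramData\Datadog"


def exchange_conf_path(config_root: str = DEFAULT_CONFIG_ROOT) -> PureWindowsPath:
    """Return the expected location of exchange_server.d/conf.yaml under config_root."""
    return PureWindowsPath(config_root) / "conf.d" / "exchange_server.d" / "conf.yaml"


if __name__ == "__main__":
    # Prints the path to open in an editor before restarting the Agent.
    print(exchange_conf_path())
```

PureWindowsPath is used so the path is formatted with Windows separators even when the script runs on another operating system.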

Note: Versions 1.11.0 or later of this check use a new implementation for metric collection, which requires Python 3. For hosts that are unable to use Python 3, or if you would like to use a legacy version of this check, refer to the following config.

Log collection

  1. Log collection is disabled by default in the Datadog Agent. Enable it in datadog.yaml:

    logs_enabled: true
    
  2. Add this configuration block to your exchange_server.d/conf.yaml file to start collecting your Exchange Server logs:

    logs:
      - type: file
        path: "C:\\Program Files\\Microsoft\\Exchange Server\\V15\\TransportRoles\\Logs\\CommonDiagnosticsLog\\*"
        source: exchange-server
      - type: file
        path: "C:\\Program Files\\Microsoft\\Exchange Server\\V15\\TransportRoles\\Logs\\ThrottlingService\\*"
        source: exchange-server
      - type: file
        path: "C:\\Program Files\\Microsoft\\Exchange Server\\V15\\TransportRoles\\Logs\\Hub\\Connectivity\\*"
        source: exchange-server
    

    Note: Because Exchange Server outputs many different types of logs, only the CommonDiagnosticsLog, ThrottlingService, and Connectivity logs are supported. Contact Datadog support to request other log formats.

    Change the path parameter values to match your environment. See the sample exchange_server.d/conf.yaml for all available configuration options.

  3. Restart the Agent.
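Before restarting, it can help to confirm that the path patterns actually match something on the host. The sketch below is one way to do that, assuming the default V15 install location used in the configuration block; the helper name count_log_matches is hypothetical, and the patterns should be edited to mirror whatever you put in conf.yaml.

```python
import glob

# The three patterns from the configuration block, written as raw strings.
# Assumption: Exchange is installed under the default V15 location.
DEFAULT_PATTERNS = [
    r"C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\Logs\CommonDiagnosticsLog\*",
    r"C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\Logs\ThrottlingService\*",
    r"C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\Logs\Hub\Connectivity\*",
]


def count_log_matches(patterns):
    """Map each glob pattern to the number of paths it currently matches."""
    return {pattern: len(glob.glob(pattern)) for pattern in patterns}


if __name__ == "__main__":
    for pattern, count in count_log_matches(DEFAULT_PATTERNS).items():
        # A count of 0 suggests the path needs adjusting for this host.
        print(f"{count:5d}  {pattern}")
```

A pattern that reports 0 matches is a hint that the path differs in your environment. It says nothing about whether the Agent has permission to read the files.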

Validation

Run the Agent’s status subcommand and look for exchange_server under the Checks section.
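If you want to script this check, the sketch below runs the status subcommand and looks for a line beginning with exchange_server. It assumes the Agent executable is at its usual default Windows location; the function names are hypothetical, and the match is a loose text search rather than a parser for the Checks section.

```python
import subprocess

# Assumption: usual default location of the Agent executable on Windows.
AGENT_EXE = r"C:\Program Files\Datadog\Datadog Agent\bin\agent.exe"


def mentions_check(status_text: str, check_name: str = "exchange_server") -> bool:
    """Return True if any line of the status output starts with check_name."""
    return any(
        line.strip().startswith(check_name) for line in status_text.splitlines()
    )


def agent_status(agent_exe: str = AGENT_EXE) -> str:
    """Run the Agent's status subcommand and return its standard output."""
    result = subprocess.run(
        [agent_exe, "status"], capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    try:
        found = mentions_check(agent_status())
    except FileNotFoundError:
        print(f"Agent executable not found at {AGENT_EXE}")
    else:
        print("exchange_server found" if found else "exchange_server not found")
```

Reading the status output by eye remains the authoritative check. The script is only a convenience for repeating it across several hosts.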

Data Collected

Metrics

exchange.activemanager.database_mounted
(gauge)
Shows the number of active database copies on the server.
exchange.activesync.ping_pending
(gauge)
Shows the number of ping commands currently pending in the queue.
Shown as command
exchange.activesync.requests_persec
(gauge)
Shows the number of HTTP requests received from the client via ASP.NET per second. Determines the current Exchange ActiveSync request rate. Used only to determine current user load.
Shown as request
exchange.activesync.sync_persec
(gauge)
Shows the number of sync commands processed per second. Clients use this command to synchronize items within a folder.
Shown as command
exchange.adaccess_domain_controllers.ldap_read
(gauge)
Shows the time in milliseconds (ms) to send an LDAP read request to the specified domain controller and receive a response.
Shown as millisecond
exchange.adaccess_domain_controllers.ldap_search
(gauge)
Shows the time (in ms) to send an LDAP search request and receive a response.
Shown as millisecond
exchange.adaccess_processes.ldap_read
(gauge)
Shows the time (in ms) to send an LDAP read request to the specified domain controller and receive a response.
Shown as millisecond
exchange.adaccess_processes.ldap_search
(gauge)
Shows the time (in ms) to send an LDAP search request and receive a response.
Shown as millisecond
exchange.autodiscover.requests_persec
(gauge)
Shows the number of Autodiscover service requests processed each second. Determines current user load.
Shown as request
exchange.database.io_db_reads_attached_persec
(gauge)
Shows the number of database read operations per second for each attached database instance.
Shown as read
exchange.database.io_db_reads_recovery_avg_latency
(gauge)
Shows the average length of time, in ms, per passive database read operation.
Shown as millisecond
exchange.database.io_db_writes_attached_persec
(gauge)
Shows the number of database write operations per second for each attached database instance.
Shown as write
exchange.database.io_db_writes_recovery_avg_latency
(gauge)
Shows the average length of time, in ms, per passive database write operation.
Shown as millisecond
exchange.database.io_log_writes_avg_latency
(gauge)
Shows the average length of time, in ms, per Log write operation.
Shown as millisecond
exchange.database.io_log_writes_persec
(gauge)
Shows the number of log writes per second for each database instance.
Shown as write
exchange.database.io_reads_avg_latency
(gauge)
Shows the average length of time, in milliseconds (ms), per database read operation.
Shown as millisecond
exchange.database.io_writes_avg_latency
(gauge)
Shows the average length of time, in ms, per database write operation.
Shown as millisecond
exchange.httpproxy.avg_auth_latency
(gauge)
Shows the average time spent authenticating CAS requests over the last 200 samples.
Shown as millisecond
exchange.httpproxy.clientaccess_processing_latency
(gauge)
Shows the average latency (ms) of CAS processing time (does not include time spent proxying) over the last 200 requests.
Shown as millisecond
exchange.httpproxy.mailbox_proxy_failure_rate
(gauge)
Shows the percentage of connectivity related failures between this Client Access Server and MBX servers over the last 200 samples.
Shown as percent
exchange.httpproxy.outstanding_requests
(gauge)
Shows the number of concurrent outstanding proxy requests.
Shown as request
exchange.httpproxy.proxy_requests_persec
(gauge)
Shows the number of proxy requests processed each second.
Shown as request
exchange.httpproxy.requests_persec
(gauge)
Shows the number of requests processed each second.
Shown as request
exchange.httpproxy.server_locator_latency
(gauge)
Shows the average latency (ms) of MailboxServerLocator web service calls.
Shown as millisecond
exchange.is.clienttype.rpc_latency
(gauge)
Shows a server RPC latency, in ms, averaged for the past 1,024 packets for a particular client protocol.
Shown as millisecond
exchange.is.clienttype.rpc_ops_persec
(gauge)
Shows the number of RPC operations per second for each client type connection.
Shown as operation
exchange.is.store.rpc_latency
(gauge)
RPC Latency average (msec) is the average latency in milliseconds of RPC requests per database. Average is calculated over all RPCs since exrpc32 was loaded.
Shown as millisecond
exchange.is.store.rpc_ops_persec
(gauge)
Shows the number of RPC operations per second for each database instance.
Shown as operation
exchange.is.store.rpc_requests
(gauge)
Indicates the overall RPC requests currently executing within the information store process.
Shown as request
exchange.memory.available
(gauge)
Shows the amount of physical memory, in megabytes (MB), immediately available for allocation to a process or for system use. It's equal to the sum of memory assigned to the standby (cached), free, and zero page lists. [Not Exchange Server specific metric]
Shown as mebibyte
exchange.memory.committed
(gauge)
Shows the ratio of Memory\Committed Bytes to the Memory\Commit Limit. [Not Exchange Server specific metric]
Shown as percent
exchange.netlogon.semaphore_acquires
(count)
The total number of times that the semaphore has been obtained over the lifetime of the security channel connection, or since system startup for _Total.
exchange.netlogon.semaphore_hold_time
(gauge)
The average time (in seconds) that the semaphore is held over the last sample.
Shown as second
exchange.netlogon.semaphore_holders
(gauge)
The number of threads that are holding the semaphore.
Shown as thread
exchange.netlogon.semaphore_timeouts
(count)
The total number of times that a thread has timed out while it waited for the semaphore over the lifetime of the security channel connection, or since system startup for _Total.
Shown as timeout
exchange.netlogon.semaphore_waiters
(gauge)
The number of threads that are waiting to obtain the semaphore.
Shown as thread
exchange.network.outbound_errors
(gauge)
Indicates the number of outbound packets that couldn't be transmitted because of errors.
Shown as error
exchange.network.tcpv4.conns_reset
(count)
Shows the number of times TCP connections have made a direct transition to the CLOSED state from either the ESTABLISHED state or the CLOSE-WAIT state. [Not Exchange Server specific metric]
Shown as connection
exchange.network.tcpv6.connection_failures
(gauge)
Shows the number of TCP connections for which the current state is either ESTABLISHED or CLOSE-WAIT. [Not Exchange Server specific metric]
Shown as error
exchange.network.tcpv6.conns_reset
(count)
Shows the number of times TCP connections have made a direct transition to the CLOSED state from either the ESTABLISHED state or the CLOSE-WAIT state. [Not Exchange Server specific metric]
Shown as connection
exchange.owa.requests_persec
(gauge)
Shows the number of requests handled by Outlook Web App per second. Determines current user load.
Shown as request
exchange.owa.unique_users
(gauge)
Shows the number of unique users currently logged on to Outlook Web App. This value monitors the number of unique active user sessions, so that users are only removed from this counter after they log off or their session times out. Determines current user load.
Shown as user
exchange.processor.cpu_privileged
(gauge)
Shows the percentage of processor time spent in privileged mode. Privileged mode is a processing mode designed for operating system components and hardware-manipulating drivers. It allows direct access to hardware and all memory.
Shown as percent
exchange.processor.cpu_time
(gauge)
Shows the percentage of time that the processor is executing application or operating system processes. This is when the processor isn't idle.
Shown as percent
exchange.processor.cpu_user
(gauge)
Shows the percentage of processor time spent in user mode. User mode is a restricted processing mode designed for applications, environment subsystems, and integral subsystems.
Shown as percent
exchange.processor.queue_length
(gauge)
Indicates the number of threads each processor is servicing. Processor Queue Length shows the number of threads that are delayed in the Processor Ready Queue and are waiting to be scheduled for execution. The value listed is the last observed value at the time the measurement was taken. [Not Exchange Server specific metric]
Shown as thread
exchange.rpc.active_user_count
(gauge)
Shows the number of unique users that have shown some activity in the last 2 minutes.
Shown as user
exchange.rpc.averaged_latency
(gauge)
Shows the latency, in milliseconds (ms), averaged for the past 1,024 packets.
Shown as millisecond
exchange.rpc.conn_count
(gauge)
Shows the total number of client connections maintained.
Shown as connection
exchange.rpc.ops_persec
(gauge)
Shows the rate at which RPC operations occur, per second.
Shown as operation
exchange.rpc.requests
(gauge)
Shows the number of client requests currently being processed by the RPC Client Access service.
Shown as request
exchange.rpc.user_count
(gauge)
Shows the number of users connected to the service.
Shown as user
exchange.workload_management.active_tasks
(gauge)
Shows the number of active tasks currently running in the background for workload management.
Shown as task
exchange.workload_management.completed_tasks
(gauge)
Shows the number of workload management tasks that have been completed.
Shown as task
exchange.workload_management.queued_tasks
(gauge)
Shows the number of workload management tasks that are currently queued up waiting to be processed.
Shown as task
exchange.ws.connection_attempts
(gauge)
Shows the rate at which connections to the Web service are being attempted. Determines current user load.
Shown as connection
exchange.ws.current_connections_default_website
(gauge)
Shows the current number of connections established to the Default website, which corresponds to the number of connections hitting the Front End CAS server role. Determines current user load.
Shown as connection
exchange.ws.current_connections_total
(gauge)
Shows the current number of connections established with the Web service. Determines current user load.
Shown as connection
exchange.ws.other_attempts
(gauge)
Shows the rate at which HTTP requests are made that don't use the OPTIONS, GET, HEAD, POST, PUT, DELETE, TRACE, MOVE, COPY, MKCOL, PROPFIND, PROPPATCH, SEARCH, LOCK, or UNLOCK methods. Determines current user load.
Shown as connection
exchange.ws.requests_persec
(gauge)
Shows the number of requests processed each second. Determines current user load.
Shown as request

Events

The Exchange server check does not include any events.

Service Checks

The Exchange server check does not include any service checks.

Troubleshooting

Need help? Contact Datadog support.