IBM MQ

Supported OS Windows Mac OS

Integration version 6.3.0

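Overview

This check monitors IBM MQ versions 9.1 and newer.

Setup

Installation

The IBM MQ check is included in the Datadog Agent package.

To use the IBM MQ check, make sure that IBM MQ client version 9.1 or later is installed (unless a compatible version of the IBM MQ server is already installed on the Agent host). For example, the 9.3 Redistributable client. The IBM MQ check does not support connecting to an IBM MQ server on z/OS.

On Linux

Update LD_LIBRARY_PATH to include the location of the libraries. Create the environment variable if it does not exist yet. For example, if you installed the client in /opt: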

export LD_LIBRARY_PATH=/opt/mqm/lib64:/opt/mqm/lib:$LD_LIBRARY_PATH
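
The export above can leave a trailing colon when LD_LIBRARY_PATH was previously unset, and the dynamic linker treats an empty entry as the current directory. The following is a minimal sketch, not part of the integration, of composing the value without empty or duplicate entries. The helper name prepend_library_dirs is hypothetical.

```python
def prepend_library_dirs(current, new_dirs):
    """Return an LD_LIBRARY_PATH value with new_dirs first, skipping empty and duplicate entries."""
    # Drop empty entries such as the one left by a trailing colon.
    existing = [p for p in current.split(":") if p]
    # Keep the MQ directories first so they win the lookup order.
    ordered = list(new_dirs) + [p for p in existing if p not in new_dirs]
    return ":".join(ordered)


# Directories taken from the example above (client installed under /opt).
mq_dirs = ["/opt/mqm/lib64", "/opt/mqm/lib"]
print(prepend_library_dirs("", mq_dirs))
print(prepend_library_dirs("/usr/local/lib:/opt/mqm/lib", mq_dirs))
```

The first call models an unset variable; the second shows that a directory already present is not repeated.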

Note: Agent v6 and later uses upstart, systemd, or launchd to orchestrate the datadog-agent service. Environment variables may need to be added to the service configuration files at the following default locations:

  • Upstart (Linux): /etc/init/datadog-agent.conf
  • Systemd (Linux): /lib/systemd/system/datadog-agent.service
  • Launchd (macOS): ~/Library/LaunchAgents/com.datadoghq.agent.plist

Example systemd configuration:

[Unit]
Description="Datadog Agent"
After=network.target
Wants=datadog-agent-trace.service datadog-agent-process.service
StartLimitIntervalSec=10
StartLimitBurst=5

[Service]
Type=simple
PIDFile=/opt/datadog-agent/run/agent.pid
Environment="LD_LIBRARY_PATH=/opt/mqm/lib64:/opt/mqm/lib:$LD_LIBRARY_PATH"
User=dd-agent
Restart=on-failure
ExecStart=/opt/datadog-agent/bin/agent/agent run -p /opt/datadog-agent/run/agent.pid

[Install]
WantedBy=multi-user.target

Example upstart configuration:

description "Datadog Agent"

start on started networking
stop on runlevel [!2345]

respawn
respawn limit 10 5
normal exit 0

console log
env DD_LOG_TO_CONSOLE=false
env LD_LIBRARY_PATH=/opt/mqm/lib64:/opt/mqm/lib:$LD_LIBRARY_PATH

setuid dd-agent

script
  exec /opt/datadog-agent/bin/agent/agent start -p /opt/datadog-agent/run/agent.pid
end script

post-stop script
  rm -f /opt/datadog-agent/run/agent.pid
end script

Example launchd configuration:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
    <dict>
        <key>KeepAlive</key>
        <dict>
            <key>SuccessfulExit</key>
            <false/>
        </dict>
        <key>Label</key>
        <string>com.datadoghq.agent</string>
        <key>EnvironmentVariables</key>
        <dict>
            <key>DD_LOG_TO_CONSOLE</key>
            <string>false</string>
            <key>LD_LIBRARY_PATH</key>
            <string>/opt/mqm/lib64:/opt/mqm/lib</string>
        </dict>
        <key>ProgramArguments</key>
        <array>
            <string>/opt/datadog-agent/bin/agent/agent</string>
            <string>run</string>
        </array>
        <key>StandardOutPath</key>
        <string>/var/log/datadog/launchd.log</string>
        <key>StandardErrorPath</key>
        <string>/var/log/datadog/launchd.log</string>
        <key>ExitTimeOut</key>
        <integer>10</integer>
    </dict>
</plist>

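Each time the Agent is updated, these files are wiped and must be updated again.

Alternatively, if you are using Linux, make sure the runtime linker can find the libraries after the MQ client is installed. For example, using ldconfig:

Put the library location in an ld configuration file.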

sudo sh -c "echo /opt/mqm/lib64 > /etc/ld.so.conf.d/mqm64.conf"
sudo sh -c "echo /opt/mqm/lib > /etc/ld.so.conf.d/mqm.conf"

Update the bindings.

sudo ldconfig
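
To confirm that the linker cache picked up the MQ libraries, the output of ldconfig -p can be scanned for them. The sketch below parses that output format. It assumes the client library is named libmqic_r (an assumption, check your client installation), and the sample text is illustrative rather than captured from a real host.

```python
def cached_library_paths(ldconfig_output, name_prefix):
    """Return resolved paths for cache entries whose library name starts with name_prefix."""
    paths = []
    for line in ldconfig_output.splitlines():
        # Entries look like: "<name> (<flags>) => <path>"
        entry, sep, path = line.strip().partition(" => ")
        if sep and entry.startswith(name_prefix):
            paths.append(path.strip())
    return paths


# Illustrative sample in the `ldconfig -p` layout; library name and paths are assumptions.
sample = (
    "2 libs found in cache `/etc/ld.so.cache'\n"
    "\tlibmqic_r.so (libc6,x86-64) => /opt/mqm/lib64/libmqic_r.so\n"
    "\tlibc.so.6 (libc6,x86-64) => /lib/x86_64-linux-gnu/libc.so.6\n"
)
print(cached_library_paths(sample, "libmqic_r"))
```

On a real host, the same function could be fed the captured output of ldconfig -p; an empty result would suggest the configuration files above were not picked up.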

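On Windows

There is a file called mqclient.ini in the IBM MQ data directory, which is normally C:\ProgramData\IBM\MQ. Configure the environment variable MQ_FILE_PATH to point at the data directory.

Permissions and authentication

There are many ways to set up permissions in IBM MQ. Depending on how your setup works, create a datadog user within MQ with read-only permissions and, optionally, +chg permissions. The +chg permissions are required to collect metrics for reset queue statistics (MQCMD_RESET_Q_STATS). If you do not want to collect these metrics, you can disable the collect_reset_queue_metrics option. Note that collecting reset queue statistics performance data also resets that performance data.

Note: "Queue Monitoring" must be enabled on the MQ server and set to at least "Medium". This can be done using the MQ UI or with an mqsc command on the server host: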

> /opt/mqm/bin/runmqsc
5724-H72 (C) Copyright IBM Corp. 1994, 2018.
Starting MQSC for queue manager datadog.


ALTER QMGR MONQ(MEDIUM) MONCHL(MEDIUM)
     1 : ALTER QMGR MONQ(MEDIUM) MONCHL(MEDIUM)
AMQ8005I: IBM MQ queue manager changed.

       :
One MQSC command read.
No commands have a syntax error.
All valid MQSC commands were processed.

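Configuration

Host

To configure this check for an Agent running on a host:

Metric collection

  1. Edit the ibm_mq.d/conf.yaml file, in the conf.d/ folder at the root of your Agent's configuration directory, to start collecting your IBM MQ performance data. See the sample ibm_mq.d/conf.yaml for all available configuration options. There are many options for configuring IBM MQ, depending on how you use it.

    • channel: The IBM MQ channel
    • queue_manager: The configured queue manager
    • host: The host where IBM MQ is running
    • port: The port that IBM MQ has exposed
    • convert_endianness: Enable this option if your MQ server is running on AIX or IBM i

    If you are using a username and password setup, you can set the username and password options. If no username is set, the Agent process owner (dd-agent) is used.

    Note: The check only monitors the queues you have set with the queues parameter.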

    queues:
      - APP.QUEUE.1
      - ADMIN.QUEUE.1
    
  2. Restart the Agent.
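
The options from step 1 can be assembled as follows. This is a minimal sketch, not the integration's own validation: the helper build_instance is hypothetical, the host and port values are assumptions for illustration, and the channel and queue manager names are the ones used elsewhere on this page. The output is JSON, which YAML parsers also accept, so it shows the shape of an ibm_mq.d/conf.yaml file.

```python
import json

# Connection options named in the list above.
CONNECTION_OPTIONS = ("channel", "queue_manager", "host", "port")


def build_instance(channel, queue_manager, host, port, queues, username=None, password=None):
    """Build one instance mapping and reject empty connection options."""
    instance = {
        "channel": channel,
        "queue_manager": queue_manager,
        "host": host,
        "port": port,
        "queues": list(queues),
    }
    missing = [key for key in CONNECTION_OPTIONS if not instance[key]]
    if missing:
        raise ValueError("missing connection options: " + ", ".join(missing))
    if username is not None:
        # Credentials are optional; without them the Agent process owner is used.
        instance["username"] = username
        instance["password"] = password
    return instance


# Assumed values: localhost and 1414 are placeholders for your own listener.
instance = build_instance("DEV.ADMIN.SVRCONN", "datadog", "localhost", 1414,
                          ["APP.QUEUE.1", "ADMIN.QUEUE.1"])
print(json.dumps({"init_config": {}, "instances": [instance]}, indent=2))
```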

Log collection

Available for Agent versions 6.0 and later

  1. Collecting logs is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

    logs_enabled: true
    
  2. Next, point the configuration file to the proper MQ log files. You can uncomment the lines at the bottom of the MQ integration's configuration file and amend them as needed:

      logs:
        - type: file
          path: '/var/mqm/log/<APPNAME>/active/AMQERR01.LOG'
          service: '<APPNAME>'
          source: ibm_mq
          log_processing_rules:
            - type: multi_line
              name: new_log_start_with_date
              pattern: '\d{2}/\d{2}/\d{4}'
    
  3. Restart the Agent.
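
The multi_line rule in step 2 groups continuation lines with the dated line that precedes them. The sketch below illustrates that grouping, assuming a new entry starts whenever a line begins with the pattern. It is not the Agent's implementation, and the sample lines are made-up placeholders, not real MQ log output.

```python
import re

# Same pattern as in the configuration above: a date such as 03/21/2024.
NEW_ENTRY = re.compile(r"\d{2}/\d{2}/\d{4}")


def group_multiline(lines, pattern=NEW_ENTRY):
    """Merge lines into entries; a line that starts with the pattern opens a new entry."""
    entries = []
    for line in lines:
        if pattern.match(line) or not entries:
            entries.append(line)
        else:
            # Continuation line: attach it to the entry opened by the last dated line.
            entries[-1] += "\n" + line
    return entries


# Placeholder content only.
lines = [
    "03/21/2024 first line of entry A",
    "continuation of entry A",
    "more of entry A",
    "03/22/2024 first line of entry B",
]
for entry in group_multiline(lines):
    print(repr(entry))
```

If the dates in your log files use a different layout, the pattern needs to change accordingly, otherwise every line is folded into a single entry.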

Containerized

For containerized environments, see the Autodiscovery Integration Templates for guidance on applying the parameters below.

Metric collection

Parameter: Value
<INTEGRATION_NAME>: ibm_mq
<INIT_CONFIG>: blank or {}
<INSTANCE_CONFIG>: {"channel": "DEV.ADMIN.SVRCONN", "queue_manager": "datadog", "host":"%%host%%", "port":"%%port%%", "queues":["<QUEUE_NAME>"]}
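
As an illustration of where these three parameters end up, the sketch below renders them as Kubernetes pod annotations. It assumes the annotation-based form of Autodiscovery templates (ad.datadoghq.com/<container>.check_names, .init_configs, .instances). The container name ibm-mq and the helper name are hypothetical.

```python
import json


def autodiscovery_annotations(container_name, queue_names):
    """Render the integration name, init config, and instance config as pod annotations."""
    instance = {
        "channel": "DEV.ADMIN.SVRCONN",
        "queue_manager": "datadog",
        "host": "%%host%%",   # template variable resolved by the Agent
        "port": "%%port%%",   # template variable resolved by the Agent
        "queues": list(queue_names),
    }
    prefix = "ad.datadoghq.com/" + container_name
    return {
        prefix + ".check_names": json.dumps(["ibm_mq"]),
        prefix + ".init_configs": json.dumps([{}]),
        prefix + ".instances": json.dumps([instance]),
    }


# "ibm-mq" is an assumed container name; use the name from your pod spec.
annotations = autodiscovery_annotations("ibm-mq", ["APP.QUEUE.1"])
for key, value in annotations.items():
    print(key + ": " + value)
```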

Log collection

Available for Agent versions 6.0 and later

Collecting logs is disabled by default in the Datadog Agent. To enable it, see Kubernetes Log Collection.

Parameter: Value
<LOG_CONFIG>: {"source": "ibm_mq", "service": "<SERVICE_NAME>", "log_processing_rules": [{"type":"multi_line","name":"new_log_start_with_date", "pattern":"\\d{2}/\\d{2}/\\d{4}"}]}

Validation

Run the Agent's status subcommand and look for ibm_mq under the Checks section.

Data Collected

Metrics

ibm_mq.channel.batch_interval
(gauge)
This attribute is a period during which the channel keeps a batch open even if there are no messages on the transmission queue (parameter identifier: BATCHINT).
Shown as second
ibm_mq.channel.batch_size
(gauge)
This attribute is the maximum number of messages to be sent before a sync point is taken (parameter identifier: BATCHSZ).
Shown as resource
ibm_mq.channel.batches
(gauge)
This attribute specifies the number of completed batches (parameter identifier: MQIACH_BATCHES).
ibm_mq.channel.buffers_rcvd
(gauge)
This attribute specifies the number of buffers received (parameter identifier: MQIACH_BUFFERS_RCVD).
Shown as buffer
ibm_mq.channel.buffers_sent
(gauge)
This attribute specifies the number of buffers sent (parameter identifier: MQIACH_BUFFERS_SENT).
Shown as buffer
ibm_mq.channel.bytes_rcvd
(gauge)
This attribute specifies the number of bytes received (parameter identifier: MQIACH_BYTES_RCVD).
Shown as byte
ibm_mq.channel.bytes_sent
(gauge)
This attribute specifies the number of bytes sent (parameter identifier: MQIACH_BYTES_SENT).
Shown as byte
ibm_mq.channel.channel_status
(gauge)
This attribute specifies the channel status (parameter identifier: MQIACH_CHANNEL_STATUS).
ibm_mq.channel.channels
(gauge)
The number of active channels.
Shown as resource
ibm_mq.channel.count
(gauge)
Sum by status to count channels. Filter by channel and status tags to create notifications.
ibm_mq.channel.current_msgs
(gauge)
This attribute specifies the number of messages in-doubt (parameter identifier: MQIACH_CURRENT_MSGS).
Shown as message
ibm_mq.channel.disc_interval
(gauge)
This attribute is the length of time after which a channel closes down, if no message arrives during that period (parameter identifier: DISCINT).
Shown as second
ibm_mq.channel.hb_interval
(gauge)
This attribute specifies the approximate time between heartbeat flows that are to be passed from a sending MCA when there are no messages on the transmission queue (parameter identifier: HBINT).
Shown as second
ibm_mq.channel.indoubt_status
(gauge)
This attribute specifies whether the channel is currently in doubt (parameter identifier: MQIACH_INDOUBT_STATUS).
ibm_mq.channel.keep_alive_interval
(gauge)
This attribute is used to specify a timeout value for a channel (parameter identifier: KAINT).
Shown as second
ibm_mq.channel.long_retry
(gauge)
This attribute specifies the maximum number of times that the channel is to try allocating a session to its partner (parameter identifier: LONGRTY).
Shown as time
ibm_mq.channel.long_timer
(gauge)
This attribute is the approximate interval in seconds that the channel is to wait before retrying to establish connection, during the long retry mode (parameter identifier: LONGTMR).
Shown as second
ibm_mq.channel.max_message_length
(gauge)
This attribute specifies the maximum length of a message that can be transmitted on the channel (parameter identifier: MAXMSGL).
Shown as byte
ibm_mq.channel.mca_status
(gauge)
This attribute specifies the MCA status (parameter identifier: MQIACH_MCA_STATUS).
ibm_mq.channel.mr_count
(gauge)
This attribute specifies the number of times the channel tries to redeliver the message (parameter identifier: MRRTY).
ibm_mq.channel.mr_interval
(gauge)
This attribute specifies the minimum interval of time that must pass before the channel can retry the MQPUT operation (parameter identifier: MRTMR).
Shown as second
ibm_mq.channel.msgs
(gauge)
This attribute specifies the number of messages sent or received, or number of MQI calls handled (parameter identifier: MQIACH_MSGS).
Shown as message
ibm_mq.channel.network_priority
(gauge)
This attribute specifies the priority for the network connection. Distributed queuing chooses the path with the highest priority if there are multiple paths available. The value must be in the range 0 through 9; 0 is the lowest priority (parameter identifier: NETPRTY).
ibm_mq.channel.npm_speed
(gauge)
This attribute specifies the speed at which non-persistent messages are to be sent (parameter identifier: NPMSPEED).
ibm_mq.channel.sharing_conversations
(gauge)
This attribute specifies the maximum number of conversations that can share a channel instance associated with this channel (parameter identifier: SHARECNV).
ibm_mq.channel.short_retry
(gauge)
This attribute specifies the maximum number of attempts that are made by a sender or server channel to establish a connection to the remote machine (parameter identifier: MQIACH_SHORT_RETRY).
ibm_mq.channel.short_timer
(gauge)
This attribute specifies the short retry wait interval for a sender or server channel that is started automatically by the channel initiator (parameter identifier: MQIACH_SHORT_TIMER).
Shown as second
ibm_mq.channel.ssl_key_resets
(gauge)
The value represents the total number of unencrypted bytes that are sent and received on the channel before the secret key is renegotiated (parameter identifier: SSLRSTCNT).
ibm_mq.queue.backout_threshold
(gauge)
Backout threshold (parameter identifier: MQIA_BACKOUT_THRESHOLD). That is, the number of times a message can be backed out before it is transferred to the backout queue specified by BackoutRequeueName.
Shown as resource
ibm_mq.queue.depth_current
(gauge)
The number of messages currently in the queue (parameter identifier: MQIA_CURRENT_Q_DEPTH).
Shown as message
ibm_mq.queue.depth_high_event
(gauge)
High limit for queue depth (parameter identifier: MQIA_Q_DEPTH_HIGH_LIMIT). This event indicates that an application has put a message to a queue, and this has caused the number of messages on the queue to become greater than or equal to the queue depth high threshold.
Shown as event
ibm_mq.queue.depth_high_limit
(gauge)
This attribute specifies the threshold against which the queue depth is compared before generating a queue depth high event (parameter identifier: MQIA_Q_DEPTH_HIGH_LIMIT).
Shown as resource
ibm_mq.queue.depth_low_event
(gauge)
Low limit for queue depth (parameter identifier: MQIA_Q_DEPTH_LOW_LIMIT). This event indicates that an application has retrieved a message from a queue, and this has caused the number of messages on the queue to become less than or equal to the queue depth low threshold.
Shown as event
ibm_mq.queue.depth_low_limit
(gauge)
This attribute specifies the low limit for queue depth. This indicates that an application has retrieved a message from a queue, and this has caused the number of messages on the queue to become less than or equal to the queue depth low threshold (parameter identifier: MQIA_Q_DEPTH_LOW_LIMIT).
Shown as item
ibm_mq.queue.depth_max
(gauge)
Maximum queue depth (parameter identifier: MQIA_MAX_Q_DEPTH). The maximum number of messages allowed on the queue. Note that other factors may cause the queue to be treated as full; for example, it will appear to be full if there is no storage available for a message.
Shown as message
ibm_mq.queue.depth_max_event
(gauge)
Controls whether Queue Full events are generated (parameter identifier: MQIA_Q_DEPTH_MAX_EVENT).
Shown as event
ibm_mq.queue.depth_percent
(gauge)
The percent of the queue that is currently utilized.
Shown as percent
ibm_mq.queue.harden_get_backout
(gauge)
Whether to harden backout count. Specifies whether the count of backed out messages should be saved (hardened) across restarts of the message queue manager (parameter identifier: MQIA_HARDEN_GET_BACKOUT).
Shown as request
ibm_mq.queue.high_q_depth
(gauge)
This attribute specifies the maximum number of messages on a queue (parameter identifier: MQIA_HIGH_Q_DEPTH).
Shown as message
ibm_mq.queue.inhibit_get
(gauge)
Whether get operations are allowed (parameter identifier: MQIA_INHIBIT_GET).
Shown as occurrence
ibm_mq.queue.inhibit_put
(gauge)
This attribute specifies whether put operations are allowed (parameter identifier: MQIA_INHIBIT_PUT).
Shown as occurrence
ibm_mq.queue.input_open_option
(gauge)
Specifies the default share option for applications opening this queue for input (parameter identifier: MQIA_DEF_INPUT_OPEN_OPTION).
Shown as resource
ibm_mq.queue.last_get_time
(gauge)
The elapsed time in seconds since the last message get from a queue.
Shown as second
ibm_mq.queue.last_put_time
(gauge)
The elapsed time in seconds since the last message put to a queue.
Shown as second
ibm_mq.queue.max_channels
(gauge)
This attribute is the maximum number of channels that can be current (parameter identifier: MQIA_MAX_CHANNELS).
Shown as connection
ibm_mq.queue.max_message_length
(gauge)
This attribute specifies the maximum message length that can be transmitted on the channel (parameter identifier: MQIACH_MAX_MSG_LENGTH).
Shown as resource
ibm_mq.queue.message_delivery_sequence
(gauge)
The order in which messages will be returned after a get operation (parameter identifier: MQIA_MSG_DELIVERY_SEQUENCE).
Shown as resource
ibm_mq.queue.msg_deq_count
(count)
This attribute specifies the number of messages dequeued (parameter identifier: MQIA_MSG_DEQ_COUNT).
Shown as message
ibm_mq.queue.msg_enq_count
(count)
This attribute specifies the number of messages enqueued (parameter identifier: MQIA_MSG_ENQ_COUNT).
Shown as message
ibm_mq.queue.oldest_message_age
(gauge)
The age, in seconds, of the oldest message on the queue (parameter identifier: MSGAGE).
Shown as second
ibm_mq.queue.open_input_count
(gauge)
Number of MQOPEN calls that have the queue open for input (parameter identifier: MQIA_OPEN_INPUT_COUNT).
Shown as connection
ibm_mq.queue.open_output_count
(gauge)
Number of MQOPEN calls that have the queue open for output (parameter identifier: MQIA_OPEN_OUTPUT_COUNT).
Shown as connection
ibm_mq.queue.persistence
(gauge)
Specifies the default for message-persistence on the queue. Message persistence determines whether or not messages are preserved across restarts of the queue manager (parameter identifier: MQIA_DEF_PERSISTENCE).
Shown as resource
ibm_mq.queue.priority
(gauge)
Specifies the default priority of messages put on the queue (parameter identifier: MQIA_DEF_PRIORITY).
Shown as resource
ibm_mq.queue.retention_interval
(gauge)
The number of hours for which the queue may be needed, based on the date and time when the queue was created (parameter identifier: MQIA_RETENTION_INTERVAL).
Shown as hour
ibm_mq.queue.scope
(gauge)
Scope of the queue definition (parameter identifier: MQIA_SCOPE). On OS/400, this is valid for receipt by MQSeries for AS/400 V4R2, or later. Specifies whether the scope of the queue definition does not extend beyond the queue manager which owns the queue, or whether the queue name is contained in a cell directory, so that it is known to all of the queue managers within the cell.
Shown as resource
ibm_mq.queue.service_interval
(gauge)
This attribute specifies the target for queue service interval. This is used for comparison to generate Queue Service Interval High and Queue Service Interval OK events (parameter identifier: MQIA_Q_SERVICE_INTERVAL).
Shown as millisecond
ibm_mq.queue.service_interval_event
(gauge)
Controls whether Service Interval High or Service Interval OK events are generated (parameter identifier: MQIA_Q_SERVICE_INTERVAL_EVENT).
Shown as occurrence
ibm_mq.queue.time_since_reset
(count)
This attribute specifies the time since statistics reset in seconds (parameter identifier: MQIA_TIME_SINCE_RESET).
Shown as second
ibm_mq.queue.trigger_control
(gauge)
This attribute specifies whether trigger messages are written to the initiation queue (parameter identifier: MQIA_TRIGGER_CONTROL).
Shown as method
ibm_mq.queue.trigger_depth
(gauge)
This attribute specifies the number of messages that will initiate a trigger message to the initiation queue (parameter identifier: MQIA_TRIGGER_DEPTH).
Shown as resource
ibm_mq.queue.trigger_message_priority
(gauge)
Threshold message priority for triggers (parameter identifier: MQIA_TRIGGER_MSG_PRIORITY). Specifies the minimum priority that a message must have before it can cause, or be counted for, a trigger event. The value must be in the range of priority values that are supported (0 through 9).
Shown as resource
ibm_mq.queue.trigger_type
(gauge)
The conditions under which trigger messages are written as a result of messages arriving on this queue (parameter identifier: MQIA_TRIGGER_TYPE).
Shown as resource
ibm_mq.queue.type
(gauge)
Type of queue to which the alias resolves (parameter identifier: MQIA_Q_TYPE).
Shown as resource
ibm_mq.queue.uncommitted_msgs
(gauge)
Specifies the maximum number of uncommitted messages. That is, the number of messages that can be retrieved, the number of messages that can be put, and any trigger messages generated within this unit of work (parameter identifier: MQIA_MAX_UNCOMMITTED_MSGS).
Shown as message
ibm_mq.queue.usage
(gauge)
This attribute specifies whether the queue is for normal usage or for transmitting messages to a remote message queue manager (parameter identifier: MQIA_USAGE).
Shown as resource
ibm_mq.queue_manager.dist_lists
(gauge)
Specifies whether distribution-list messages can be placed on the queue (parameter identifier: MQIA_DIST_LISTS).
Shown as resource
ibm_mq.queue_manager.max_msg_list
(gauge)
Specifies the maximum message length that can be transmitted on the channel. This is compared with the value for the remote channel and the actual maximum is the lower of the two values (parameter identifier: MQIACH_MAX_MSG_LENGTH).
Shown as byte
ibm_mq.stats.channel.avg_batch_size
(gauge)
The average batch size of batches processed by the channel (parameter identifier: MQIAMO_AVG_BATCH_SIZE).
Shown as message
ibm_mq.stats.channel.bytes
(count)
The number of bytes sent or received for persistent and nonpersistent messages (parameter identifier: QCSTNBYT).
Shown as message
ibm_mq.stats.channel.full_batches
(count)
The number of batches processed by the channel that were sent because the value of the channel attributes BATCHSZ or BATCHLIM was reached (parameter identifier: MQIAMO_FULL_BATCHES).
Shown as message
ibm_mq.stats.channel.incomplete_batches
(count)
The number of batches processed by the channel, that were sent without the value of the channel attribute BATCHSZ being reached (parameter identifier: MQIAMO_INCOMPLETE_BATCHES).
Shown as message
ibm_mq.stats.channel.msgs
(count)
The number of persistent and nonpersistent messages sent or received (parameter identifier: QCSTNMSG).
Shown as message
ibm_mq.stats.channel.put_retries
(count)
The number of times in the time interval that a message failed to be put, and entered a retry loop (parameter identifier: MQIAMO_PUT_RETRIES).
Shown as message
ibm_mq.stats.queue.avg_q_time
(gauge)
The average latency, in microseconds, of messages destructively retrieved from the queue during the monitoring period for persistent and non-persistent messages (parameter identifier: MQIAMO64_AVG_Q_TIME).
Shown as message
ibm_mq.stats.queue.browse_bytes
(gauge)
The number of bytes read in non-destructive get requests for persistent and non-persistent messages (parameter identifier: MQIAMO64_BROWSE_BYTES).
Shown as message
ibm_mq.stats.queue.browse_count
(count)
The number of successful non-destructive get requests for persistent and non-persistent messages (parameter identifier: MQIAMO_BROWSES).
Shown as message
ibm_mq.stats.queue.browse_fail_count
(count)
The number of unsuccessful non-destructive get requests (parameter identifier: MQIAMO_BROWSES_FAILED).
Shown as message
ibm_mq.stats.queue.expired_msg_count
(count)
The number of persistent and non-persistent messages that were discarded because they had expired before they could be retrieved (parameter identifier: MQIAMO_MSGS_EXPIRED).
Shown as message
ibm_mq.stats.queue.get_bytes
(count)
The number of bytes read in destructive get requests for persistent and non-persistent messages (parameter identifier: MQIAMO64_GET_BYTES).
Shown as message
ibm_mq.stats.queue.get_count
(count)
The number of successful destructive get requests for persistent and non-persistent messages (parameter identifier: MQIAMO_GETS).
Shown as message
ibm_mq.stats.queue.get_fail_count
(count)
The number of unsuccessful destructive get requests (parameter identifier: MQIAMO_GETS_FAILED).
Shown as message
ibm_mq.stats.queue.non_queued_msg_count
(count)
The number of messages that bypassed the queue and were transferred directly to a waiting application. This number represents how many times WebSphere MQ was able to bypass the queue, and not the number of times an application was waiting (parameter identifier: MQIAMO_MSGS_NOT_QUEUED).
Shown as message
ibm_mq.stats.queue.purge_count
(count)
The number of messages purged (parameter identifier: MQIAMO_MSGS_PURGED).
Shown as message
ibm_mq.stats.queue.put1_count
(count)
The number of persistent and non-persistent messages successfully put to the queue using MQPUT1 calls (parameter identifier: MQIAMO_PUT1S).
Shown as message
ibm_mq.stats.queue.put1_fail_count
(count)
The number of unsuccessful attempts to put a message using MQPUT1 calls (parameter identifier: MQIAMO_PUT1S_FAILED).
Shown as message
ibm_mq.stats.queue.put_bytes
(count)
The number of bytes written in put requests to the queue for persistent and non-persistent messages (parameter identifier: MQIAMO64_PUT_BYTES).
Shown as message
ibm_mq.stats.queue.put_count
(count)
The number of persistent and non-persistent messages successfully put to the queue, with exception of MQPUT1 requests (parameter identifier: MQIAMO_PUTS).
Shown as message
ibm_mq.stats.queue.put_fail_count
(count)
The number of unsuccessful attempts to put a message to the queue (parameter identifier: MQIAMO_PUTS_FAILED).
Shown as message
ibm_mq.stats.queue.q_max_depth
(gauge)
The maximum queue depth during the monitoring period (parameter identifier: MQIAMO_Q_MAX_DEPTH).
Shown as message
ibm_mq.stats.queue.q_min_depth
(gauge)
The minimum queue depth during the monitoring period (parameter identifier: MQIAMO_Q_MIN_DEPTH).
Shown as message
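
Unlike most metrics above, ibm_mq.queue.depth_percent has no MQ parameter identifier, which suggests it is derived. A minimal sketch of one plausible derivation, assuming it is the ratio of ibm_mq.queue.depth_current to ibm_mq.queue.depth_max (an assumption, not confirmed by this page):

```python
def depth_percent(depth_current, depth_max):
    """Percentage of the queue in use, or None when the maximum depth is not positive."""
    if depth_max <= 0:
        # Avoid dividing by zero for queues that report no usable maximum.
        return None
    return 100 * depth_current / depth_max


# 250 messages on a queue that holds at most 5000.
print(depth_percent(250, 5000))
```

Alerting on this percentage rather than on the raw depth keeps one threshold meaningful across queues with different maximum depths.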

Events

The IBM MQ check does not include any events.

Service Checks

ibm_mq.can_connect
Returns CRITICAL if the Agent cannot connect to the MQ server for any reason or UNKNOWN if the configured queue manager is not matched by the queue_manager_process option. Returns OK otherwise.
Statuses: ok, critical, unknown

ibm_mq.queue_manager
Returns CRITICAL if the Agent cannot retrieve stats from the queue manager or UNKNOWN if the configured queue manager is not matched by the queue_manager_process option. Returns OK otherwise.
Statuses: ok, critical, unknown

ibm_mq.queue
Returns CRITICAL if the Agent cannot retrieve queue stats. Returns OK otherwise.
Statuses: ok, critical

ibm_mq.channel
Returns CRITICAL if the Agent cannot retrieve channel stats. Returns OK otherwise.
Statuses: ok, critical

ibm_mq.channel.status
Returns CRITICAL if the status is INACTIVE/STOPPED/STOPPING. Returns OK if the status is RUNNING. Returns WARNING if the status might lead to running.
Statuses: ok, critical, warning, unknown
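
The ibm_mq.channel.status rules can be read as a lookup from channel status to service check state. The sketch below is one interpretation, not the integration's code. The text does not list which statuses "might lead to running", so the WARNING set here is an assumption, as is returning unknown for anything unrecognized.

```python
CRITICAL_STATUSES = {"INACTIVE", "STOPPED", "STOPPING"}
OK_STATUSES = {"RUNNING"}
# Assumption: transitional MQ channel statuses treated as "might lead to running".
WARNING_STATUSES = {"BINDING", "STARTING", "INITIALIZING", "REQUESTING", "RETRYING"}


def channel_service_check(status):
    """Map a channel status name to a service check state."""
    name = status.upper()
    if name in CRITICAL_STATUSES:
        return "critical"
    if name in OK_STATUSES:
        return "ok"
    if name in WARNING_STATUSES:
        return "warning"
    # Assumption: any status not covered above is reported as unknown.
    return "unknown"


for status in ("RUNNING", "STOPPED", "STARTING", "SOMETHING_ELSE"):
    print(status, "->", channel_service_check(status))
```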

Troubleshooting

Reset queue statistics MQRC_NOT_AUTHORIZED permission warning

If you see the following warning:

Warning: Error getting pcf queue reset metrics for SAMPLE.QUEUE.1: MQI Error. Comp: 2, Reason 2035: FAILED: MQRC_NOT_AUTHORIZED

This happens because the datadog user does not have the +chg permission needed to collect reset queue metrics. To fix this, either grant the datadog user +chg permissions using setmqaut and collect the reset queue metrics, or disable collect_reset_queue_metrics:

    collect_reset_queue_metrics: false
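
For the first option, the sketch below composes a setmqaut invocation without running it. It assumes the common form of the command (-m queue manager, -n object name, -t object type, -p principal); verify the flags against the IBM MQ documentation for your version. The queue manager name datadog and the queue name are taken from the examples on this page, and the helper name is hypothetical.

```python
import shlex


def setmqaut_grant_chg(queue_manager, queue_name, principal="datadog"):
    """Return the argument list for granting +chg on one queue to a principal."""
    return [
        "setmqaut",
        "-m", queue_manager,   # queue manager that owns the queue
        "-n", queue_name,      # object name
        "-t", "queue",         # object type
        "-p", principal,       # user receiving the authority
        "+chg",
    ]


command = setmqaut_grant_chg("datadog", "SAMPLE.QUEUE.1")
# Print the command for review instead of executing it.
print(shlex.join(command))
```

Building the command as a list keeps queue names with unusual characters safe if it is later passed to a process launcher.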

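High resource utilization

The IBM MQ check performs queries on the server. Sometimes these queries are expensive and degrade the performance of the check.

If you observe that the check takes a long time to run or consumes many resources on your host, you can potentially reduce the scope of the check by trying the following:

  • If you are using auto_discover_queues, try using queue_patterns or queue_regex instead to discover only certain queues. This is particularly relevant if your system creates dynamic queues.
  • If you are autodiscovering queues with queue_patterns or queue_regex, try tightening the pattern or regex so that it matches fewer queues.
  • Disable auto_discover_channels if you have too many channels.
  • Disable collect_statistics_metrics.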
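
The effect of tightening a queue_regex can be previewed offline against a list of queue names. The sketch below assumes the regexes are matched from the start of each queue name, which may differ from how the integration applies them. The queue names, including the AMQ.TEMP placeholders standing in for dynamic queues, are illustrative.

```python
import re


def select_queues(discovered, regexes):
    """Keep the queue names matched from the start by at least one regex."""
    compiled = [re.compile(expr) for expr in regexes]
    return [name for name in discovered if any(c.match(name) for c in compiled)]


# Illustrative names only.
discovered = ["APP.QUEUE.1", "APP.QUEUE.2", "ADMIN.QUEUE.1", "AMQ.TEMP.0001", "AMQ.TEMP.0002"]

broad = select_queues(discovered, [r".*"])
narrow = select_queues(discovered, [r"^APP\.QUEUE\.\d+$"])
print(len(broad), "queues with the broad regex")
print(narrow)
```

Fewer matched queues means fewer queries per check run, which is the goal of the first two suggestions above.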

Errors in the logs

  • Unpack for type ((67108864,)) not implemented: If you see errors like this and your MQ server is running on an IBM OS, enable convert_endianness and restart the Agent.

Warnings in the logs

  • Error getting [...]: MQI Error. Comp: 2, Reason 2085: FAILED: MQRC_UNKNOWN_OBJECT_NAME: If you see messages like this, the integration is trying to collect metrics from a queue that does not exist. This can be caused by a misconfiguration or, if you are using auto_discover_queues, by the integration discovering a dynamic queue that no longer exists by the time it tries to collect its metrics. In that case, you can mitigate the issue by providing a stricter queue_patterns or queue_regex, or you can ignore the warning.

Other

Need help? Contact Datadog support.
