Oracle Cloud Infrastructure


Overview

Oracle Cloud Infrastructure (OCI) is an infrastructure-as-a-service (IaaS) and platform-as-a-service (PaaS) offering used by enterprise-scale companies, with a full suite of managed services for hosting, storage, networking, databases, and more.

Use Datadog’s OCI integration to forward your logs and metrics to Datadog, where they can power dashboards, help with troubleshooting, and be monitored for security and compliance posture.

Setup

Metric collection

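To forward your OCI metrics to Datadog, complete the following steps, each described in its own section below:

  1. Enter tenancy info.
  2. Create the OCI policy stack.
  3. Enter DatadogROAuthUser info.
  4. Create the OCI metric forwarding stack.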

For a visual representation of this architecture, see the Architecture section.

Enter tenancy info

  • Your OCI user account needs the Cloud Administrator role to complete these steps.

  • Tenancy OCID

  • Home Region

Enter the OCID and home region of the tenancy you want to monitor in the Datadog OCI integration tile.
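
A quick format check before pasting the value into the tile can catch copy-and-paste mistakes. The sketch below assumes the OCID syntax described in Oracle's documentation (ocid1.<resource type>.<realm>.[region][.future use].<unique ID>). The function name and the sample value in the usage note are hypothetical, and the check validates shape only; it cannot confirm that the tenancy exists.

```python
def looks_like_tenancy_ocid(value: str) -> bool:
    """Return True if value has the general shape of a tenancy OCID.

    Assumed layout: ocid1.tenancy.<realm>.[region][.future use].<unique id>
    """
    parts = value.strip().split(".")
    # Even when the optional region segment is empty, the dots remain,
    # so a well-formed OCID splits into at least five parts.
    if len(parts) < 5:
        return False
    version, resource_type, realm = parts[0], parts[1], parts[2]
    unique_id = parts[-1]
    return (
        version == "ocid1"
        and resource_type == "tenancy"
        and realm != ""
        and unique_id != ""
    )
```

For example, looks_like_tenancy_ocid("ocid1.tenancy.oc1..aaaaexampleuniqueid") returns True, while a user OCID or an empty string returns False.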

Create OCI policy stack

Ensure that the home region of the tenancy is selected in the top right of the screen.

This policy stack should only be deployed once per tenancy.

  1. Click the Create Policy Stack button on the Datadog OCI integration tile.
  2. Accept the Oracle Terms of Use.
  3. Leave the option to use custom Terraform providers unchecked.
  4. Use the default name and compartment for the stack, or optionally provide your own descriptive name or compartment.
  5. Click Next.
  6. Leave the tenancy field and current user field as-is.
  7. Click Next.
  8. Click Create.

Enter DatadogROAuthUser info

  • OCID of the DatadogROAuthUser

  • OCI API key and fingerprint value

  1. In the OCI console search bar, search for DatadogROAuthUser and click on the User resource that appears.
  2. Copy the user’s OCID value.
  3. Paste the value into the User OCID field in the Datadog OCI integration tile.
  4. Returning to the OCI console, generate an API key with these steps:
     a. In the bottom left corner of the screen, under Resources, click API keys.
     b. Click Add API key.
     c. Click Download private key.
     d. Click Add.
     e. A Configuration file preview popup appears, but no action is needed; close the popup.

The Add API Key page in the OCI console

  5. Copy the fingerprint value, and paste it into the Fingerprint field on the Datadog OCI integration tile.
  6. Copy the private key value with these steps:
     a. Open the downloaded private key .pem file in a text editor, or use a terminal command such as cat to display the file’s contents.
     b. Copy the entire contents, including -----BEGIN PRIVATE KEY----- and -----END PRIVATE KEY-----.
  7. Paste the private key value into the Private Key field on the Datadog OCI integration tile.
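
Incomplete copies are a common source of authentication failures, so it can help to sanity-check both values before pasting them. The following is a minimal sketch, not part of the integration: it assumes the fingerprint is displayed as 16 colon-separated hex pairs, and it only confirms that the copied key text contains both boundary lines named above. Both function names are hypothetical.

```python
import re

# Assumption: fingerprints appear as 16 colon-separated hex byte pairs.
FINGERPRINT_RE = re.compile(r"^([0-9a-f]{2}:){15}[0-9a-f]{2}$")

BEGIN_MARKER = "-----BEGIN PRIVATE KEY-----"
END_MARKER = "-----END PRIVATE KEY-----"


def fingerprint_is_well_formed(fingerprint: str) -> bool:
    """Check the fingerprint's shape only, not whether it matches the key."""
    return FINGERPRINT_RE.match(fingerprint.strip().lower()) is not None


def private_key_is_complete(pem_text: str) -> bool:
    """Check that the copied text contains both PEM boundary lines, in order."""
    begin = pem_text.find(BEGIN_MARKER)
    end = pem_text.find(END_MARKER)
    return begin != -1 and end > begin
```

Neither check touches the key material itself, so the private key never needs to leave your machine to be validated this way.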

Create OCI metric forwarding stack

  • Your user account must be able to create resources in the compartment.
  • Datadog API Key value
  • Username and auth token for a user with the REPOSITORY_READ and REPOSITORY_UPDATE permissions to pull and push images to a Docker repo.

Note: To verify the Docker registry login is correct, see Logging in to Oracle Cloud Infrastructure Registry.

The metric forwarding stack must be deployed for each combination of tenancy and region to be monitored. For the simplest setup, Datadog recommends creating all the necessary OCI resources with the ORM stack provided below. Alternatively, you can use your existing OCI networking infrastructure.

All resources created by Datadog’s ORM stack are deployed to the compartment specified, and for the region currently selected in the top right of the screen.

  1. Click the Create Metric Stack button on the Datadog OCI integration tile.
  2. Accept the Oracle Terms of Use.
  3. Leave the Custom providers option unchecked.
  4. Name the stack and select the compartment to deploy it to.
  5. Click Next.
  6. In the Datadog API Key field, enter your Datadog API key value.
  7. In the Network options section, leave Create VCN checked to have the stack create the networking resources, or use an existing VCN as described below.

If using an existing VCN, the subnet’s OCID must be provided to the stack. Make sure that the VCN:

  • Is allowed to make HTTP egress calls through a NAT gateway.
  • Is able to pull images from the OCI container registry using a service gateway.
  • Has the route table rules to allow the NAT gateway and service gateway.
  • Has the security rules to send HTTP requests.

To use an existing VCN, uncheck the Create VCN option in the Network options section and enter your VCN information:

     a. In the vcnCompartment field, select your compartment.
     b. In the existingVcn section, select your existing VCN.
     c. In the Function Subnet OCID section, enter the OCID of the subnet to be used.

  8. In the Metrics settings section, optionally remove any metric namespaces from collection.
  9. In the Metrics compartments section, enter a comma-separated list of compartment OCIDs to monitor. Any metric namespace filters selected in the previous step are applied to each compartment.
  10. In the Function settings section, select GENERIC_ARM. Select GENERIC_X86 if deploying in a Japan region.
  11. Click Next.
  12. Click Create.
  13. Return to the Datadog OCI integration tile and click Create Configuration.
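
The comma-separated compartment list for the Metrics compartments section can be assembled with a small helper like the one below. This is a sketch under stated assumptions: the function name is hypothetical, the prefix check relies on compartment OCIDs starting with ocid1.compartment. and on the root compartment being identified by the tenancy OCID, and it assumes the field takes values without spaces after the commas.

```python
def build_compartment_list(ocids):
    """Join compartment OCIDs into a comma-separated string.

    Blank entries and duplicates are dropped; order is preserved.
    """
    seen = []
    for raw in ocids:
        ocid = raw.strip()
        if not ocid:
            continue
        # Assumption: the root compartment is addressed by the tenancy OCID.
        if not ocid.startswith(("ocid1.compartment.", "ocid1.tenancy.")):
            raise ValueError(f"not a compartment or tenancy OCID: {ocid!r}")
        if ocid not in seen:
            seen.append(ocid)
    return ",".join(seen)
```

Rejecting unexpected prefixes early is a design choice: a stray user or instance OCID in this list is easier to catch here than after the stack has been applied.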

Notes:

  • By default, only the root compartment is selected, and all of the metric namespaces supported by the Datadog OCI integration are enabled (up to 50 namespaces are supported per connector hub). If you choose to monitor additional compartments, any metric namespace exclusion filters are applied to each compartment.
  • You should manage who has access to the Terraform state files of the resource manager stacks. See the Terraform State Files section of the Securing Resource Manager page for more information.

Validation

View oci.* metrics in the OCI integration overview dashboard or Metrics Explorer page in Datadog.
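
Metrics can also be checked programmatically. The sketch below builds a request for Datadog's timeseries query endpoint (GET /api/v1/query) without sending it. The function name, the default site of datadoghq.com, and the one-hour window are assumptions for illustration; adjust the site to match your Datadog account.

```python
import time
import urllib.parse
import urllib.request


def build_metric_query_request(api_key, app_key, query,
                               site="datadoghq.com",
                               window_seconds=3600, now=None):
    """Build (but do not send) a timeseries query for the given metric query."""
    now = int(time.time()) if now is None else int(now)
    params = urllib.parse.urlencode({
        "from": now - window_seconds,  # start of the window, epoch seconds
        "to": now,                     # end of the window, epoch seconds
        "query": query,
    })
    url = f"https://api.{site}/api/v1/query?{params}"
    return urllib.request.Request(url, headers={
        "DD-API-KEY": api_key,
        "DD-APPLICATION-KEY": app_key,
    })


# Usage (performs a network call, so it is left commented out):
# req = build_metric_query_request(API_KEY, APP_KEY, "avg:oci.apigateway.latency{*}")
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

An empty result for a namespace you expect to see usually means that no data has arrived yet or that the namespace was excluded in the stack's Metrics settings.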

OCI function metrics (oci.faas namespace) and container instance metrics (oci_computecontainerinstance namespace) are in Preview.

Configuration

Add regions

To monitor an additional region in a tenancy, navigate to that tenancy in the OCI Integration tile.

  1. In the Configure an Additional Region section, click Create Metric Stack.
  2. Switch to the region you wish to monitor in the top right of the screen.
  3. Complete the steps in Create OCI metric forwarding stack for the new region.

Add compartments or metric namespaces

To add compartments or edit the list of metric namespaces enabled, click Edit on the newly created Connector Hub.

  • Click + Another compartment to add compartments.
  • In the Configure source section, add or remove namespaces from the Namespaces dropdown.

Architecture

Metric forwarding resources

A diagram of the OCI resources mentioned in this page and displaying the flow of data

This integration creates an OCI connector hub, function app, and secure networking infrastructure to forward OCI metrics to Datadog. The ORM stack for these resources creates a function container repository for the region in the tenancy, and the Docker image is pushed to it to be used by the function.

IAM resources

A diagram of the OCI resources and workflow used for integration authentication

This integration creates:

  • A dynamic group with resource.type = 'serviceconnectors', to enable access to the connector hub.
  • A user called DatadogROAuthUser, which Datadog uses to read tenancy resources.
  • A group to which the created user is added for policy access.
  • A user called DatadogAuthWriteUser, which is used to push Docker images for the function.
  • A write access group that the DatadogAuthWriteUser is added to, for pushing images through policy access.
  • A policy in the root compartment to allow connector hubs to read metrics and invoke functions. This policy also gives the created user group read access to the tenancy resources, and gives the write access group the repository permissions needed to push images. The following statements are added to the policy:

    Allow dynamic-group Default/<GROUP_NAME> to read metrics in tenancy
    Allow dynamic-group Default/<GROUP_NAME> to use fn-function in tenancy
    Allow dynamic-group Default/<GROUP_NAME> to use fn-invocation in tenancy
    Allow group Default/<USER_GROUP_NAME> to read all-resources in tenancy
    Allow group Default/<WRITE_USER_GROUP_NAME> to manage repos in tenancy where ANY {request.permission = 'REPOSITORY_READ', request.permission = 'REPOSITORY_UPDATE', request.permission = 'REPOSITORY_CREATE'}
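
When auditing a tenancy, it can be useful to render these statements with the actual group names and compare them against the deployed policy. The sketch below fills in the placeholders from the statements above; the function name and the group names used in the usage note are hypothetical, and the real names are chosen by the policy stack.

```python
# Templates copied from the statements above. Literal braces in the last
# statement are doubled so that str.format leaves them intact.
POLICY_TEMPLATES = [
    "Allow dynamic-group Default/{dynamic_group} to read metrics in tenancy",
    "Allow dynamic-group Default/{dynamic_group} to use fn-function in tenancy",
    "Allow dynamic-group Default/{dynamic_group} to use fn-invocation in tenancy",
    "Allow group Default/{user_group} to read all-resources in tenancy",
    "Allow group Default/{write_user_group} to manage repos in tenancy where ANY "
    "{{request.permission = 'REPOSITORY_READ', "
    "request.permission = 'REPOSITORY_UPDATE', "
    "request.permission = 'REPOSITORY_CREATE'}}",
]


def render_policy_statements(dynamic_group, user_group, write_user_group):
    """Return the five policy statements with the group names filled in."""
    values = {
        "dynamic_group": dynamic_group,
        "user_group": user_group,
        "write_user_group": write_user_group,
    }
    return [template.format(**values) for template in POLICY_TEMPLATES]
```

Calling render_policy_statements("dd-dynamic", "dd-read", "dd-write") returns five strings that can be compared line by line with the statements shown for the policy in the OCI console.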

Metric namespaces

| Integration | Metric Namespace |
|---|---|
| API Gateway | oci_apigateway |
| Autonomous Database | oci_autonomous_database |
| Block Storage | oci_blockstore |
| Compute | oci_computeagent, rdma_infrastructure_health, gpu_infrastructure_health, oci_compute_infrastructure_health |
| Container Instances (Preview) | oci_computecontainerinstance |
| Database | oci_database, oci_database_cluster |
| Dynamic Routing Gateway | oci_dynamic_routing_gateway |
| E-Business Suite (EBS) | oracle_appmgmt |
| FastConnect | oci_fastconnect |
| File Storage | oci_filestorage |
| Functions (Preview) | oci_faas |
| GPU | gpu_infrastructure_health |
| HeatWave MySQL | oci_mysql_database |
| Kubernetes Engine | oci_oke |
| Load Balancer | oci_lbaas, oci_nlb |
| MediaStreams | oci_mediastreams |
| NAT Gateway | oci_nat_gateway |
| Network Firewall | oci_network_firewall |
| Object Storage | oci_objectstorage |
| PostgreSQL | oci_postgresql |
| Queue | oci_queue |
| Service Connector Hub | oci_service_connector_hub |
| Service Gateway | oci_service_gateway |
| VCN | oci_vcn |
| VPN | oci_vpn |
| Web Application Firewall | oci_waf |

Log collection

Send logs from Oracle Cloud Infrastructure to Datadog by following either of the two processes below. The first forwards logs through an OCI Service Connector:

  1. Configure an OCI log.
  2. Create an OCI function.
  3. Set up an OCI Service Connector.

The instructions below use the OCI portal to set up the integration.

OCI logging

  1. In the OCI portal, navigate to Logging -> Log Groups.
  2. Select your compartment and click Create Log Group. A side panel opens.
  3. Enter data_log_group for the name, and optionally provide a description and tags.
  4. Click Create to set up your new Log Group.
  5. Under Resources, click Logs.
  6. Click Create custom log or Enable service log as desired.
  7. Click Enable Log to create your new OCI Log.

For more information on OCI Logs, see Enabling Logging for a Resource.

OCI function

  1. In the OCI portal, navigate to Functions.
  2. Select an existing application or click Create Application.
  3. Create a new OCI function within your application. See the Oracle Overview of Functions for details.
  4. Datadog recommends creating a boilerplate Python function first, then replacing the auto-generated files with Datadog’s source code:
    • Replace func.py with code from the Datadog OCI repo.
    • Replace func.yaml with code from the Datadog OCI repo. DATADOG_TOKEN and DATADOG_HOST must be replaced with your Datadog API key and region logs intake link.
    • Replace requirements.txt with code from the Datadog OCI repo.
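
To illustrate how the two configuration values are used, the sketch below builds an HTTP request that posts a batch of log records to Datadog. It is not the code from the Datadog OCI repo. It assumes that DATADOG_HOST holds the full logs intake URL (for example, https://http-intake.logs.datadoghq.com/api/v2/logs on the datadoghq.com site) and that DATADOG_TOKEN holds the API key; the function name and the record fields in the usage note are hypothetical.

```python
import json
import os
import urllib.request


def build_log_request(log_records, host=None, token=None):
    """Build (but do not send) a POST request carrying a batch of log records.

    host and token fall back to the DATADOG_HOST and DATADOG_TOKEN
    environment variables when not passed explicitly.
    """
    host = host or os.environ["DATADOG_HOST"]
    token = token or os.environ["DATADOG_TOKEN"]
    body = json.dumps(log_records).encode("utf-8")
    return urllib.request.Request(
        host,
        data=body,
        headers={
            "Content-Type": "application/json",
            "DD-API-KEY": token,  # the API key authenticates the request
        },
        method="POST",
    )


# Usage (performs a network call, so it is left commented out):
# req = build_log_request([{"ddsource": "oci", "message": "example log line"}])
# urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes the function easy to test locally before it is deployed.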

OCI service connector hub

  1. In the OCI portal, navigate to Logging -> Service Connectors.
  2. Click Create Service Connector to be directed to the Create Service Connector page.
  3. Select the Source as Logging and Target as Functions.
  4. Under Configure Source Connection, select a Compartment name, Log Group, and Log (the Log Group and Log created in the OCI logging section).
  5. If you also want to send Audit Logs, click +Another Log and select the same Compartment, choosing “_Audit” as your Log Group.
  6. Under Configure target, select a Compartment, Function application, and Function (the Function Application and Function created in the OCI function section).
  7. If you are prompted to create a policy, click Create from the prompt.
  8. Click Create at the bottom to finish creating your Service Connector.

For more information on OCI Service Connectors, see Oracle’s Service Connector blog post.

To send logs through an OCI object store instead:

  1. Configure an OCI log.
  2. Create an OCI object store and enable read/write access for OCI logs.
  3. Create an OCI function.
  4. Set up an OCI event.

The instructions below use the OCI portal to set up the integration.

OCI logging

  1. In the OCI portal, navigate to Solutions and Platform -> Logging -> Logs.
  2. Click Create Custom Log to be directed to the Create Custom Log page.
  3. Give your new OCI log a name.
  4. Select a Compartment and Log Group. These selections remain consistent across the entire installation.
  5. Click Create Custom Log to be directed to the Create Agent Config page.
  6. Click Create new configuration.
  7. Give your new configuration a name. Your compartment is preselected for you.
  8. Set the group type to Dynamic Group and group to one of your existing groups.
  9. Set the input type to Log Path, enter your preferred input name and use “/” for file paths.
  10. Click Create Custom Log. Your OCI log is created and available on the logs page.

For more information on OCI Logs, see Enabling Logging for a Resource.

OCI object storage

  1. In the OCI portal, navigate to Core Infrastructure -> Object Storage -> Object Storage.
  2. Click Create Bucket to be directed to the Create Bucket form.
  3. Select Standard for your storage tier and check Emit Object Events.
  4. Complete the rest of the form based on your preference.
  5. Click Create Bucket. Your bucket is created and available in the bucket list.
  6. Select your new bucket from the active bucket list and click Logs under Resources.
  7. Toggle Read to Enabled, which opens an Enable Log side menu.
  8. Select a Compartment and Log Group (use the same selections as your OCI log).
  9. Enter a name for the Log Name and select your preferred log retention.

For more information on OCI Object Storage, see Putting Data into Object Storage.

OCI function

  1. In the OCI portal, navigate to Solutions and Platform -> Developer Services -> Functions.
  2. Select an existing application or click Create Application.
  3. Create a new OCI function within your application. See the Oracle Overview of Functions for more details.
  4. Datadog recommends creating a boilerplate Python function first, then replacing the auto-generated files with Datadog’s source code:
    • Replace func.py with code from the Datadog OCI repo.
    • Replace func.yaml with code from the Datadog OCI repo. DATADOG_TOKEN and DATADOG_HOST must be replaced with your Datadog API key and region logs intake link.
    • Replace requirements.txt with code from the Datadog OCI repo.

OCI event

  1. In the OCI portal, navigate to Solutions and Platform -> Application Integration -> Event Service.
  2. Click Create Rule to be directed to the Create Rule page.
  3. Give your event rule a name and description.
  4. Set your condition as Event Type, service name as Object Storage, and event type as Object - Create.
  5. Set your action type as Functions.
  6. Ensure that your function compartment is the same one you selected for the OCI Log, OCI Bucket, and OCI Function.
  7. Select your function application and function (the ones created in the previous installation step).
  8. Click Create Rule. Your rule is created and available in the rules list.
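
Once the rule is active, the function is invoked with an event payload for each new object. The sketch below shows how such a payload could be filtered and unpacked. The event type string and the field names (eventType, data.resourceName, data.additionalDetails.bucketName) are assumptions based on Oracle's published Object Storage event examples, and the function name is hypothetical; verify the payload shape against a real event before relying on it.

```python
# Assumed event type for the "Object - Create" selection in the rule.
OBJECT_CREATE_EVENT = "com.oraclecloud.objectstorage.createobject"


def object_from_event(event):
    """Return (bucket_name, object_name) for an object-create event, else None."""
    if event.get("eventType") != OBJECT_CREATE_EVENT:
        return None
    data = event.get("data", {})
    bucket = data.get("additionalDetails", {}).get("bucketName")
    name = data.get("resourceName")
    if not bucket or not name:
        # Payload is missing the fields this sketch relies on.
        return None
    return bucket, name
```

Returning None for anything unexpected keeps the function from failing on events it was not written for, such as object updates or deletions.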

For more information on OCI Events, see Getting Started with Events.

Data Collected

Metrics

oci.apigateway.backend_http_responses
(count)
Count of the HTTP responses returned by the back-end services.
Shown as response
oci.apigateway.bytes_received
(count)
Number of bytes received by the API gateway from API clients.
Shown as byte
oci.apigateway.bytes_sent
(count)
Number of bytes sent by the API gateway to API clients.
Shown as byte
oci.apigateway.http_requests
(count)
Number of incoming API client requests to the API gateway.
Shown as request
oci.apigateway.http_responses
(count)
Number of HTTP responses that the API gateway has sent back.
Shown as response
oci.apigateway.integration_latency
(gauge)
Time between the API gateway sending a request to the back-end service and receiving a response from the back-end service.
Shown as second
oci.apigateway.internal_latency
(gauge)
Time spent internally in the API gateway to process the request.
Shown as second
oci.apigateway.latency
(gauge)
Average time that it takes for a request to be processed and its response to be sent. This is calculated from the time the API gateway receives the first byte of an HTTP request to the time when the response send operation is completed.
Shown as second
oci.apigateway.response_cache_action
(gauge)
The action taken by the response cache.
oci.apigateway.response_cache_availability
(gauge)
Availability of the response cache as seen by the API gateway.
oci.apigateway.response_cache_latency
(gauge)
Total time taken for connect, read, and store operations on the response cache.
Shown as millisecond
oci.apigateway.subscriber_quota_proportion_used
(gauge)
Proportion of an entitlement's quota that has been consumed by a subscriber. Emitted per request. Calculated as: number of requests made / entitlement's quota.
Shown as fraction
oci.apigateway.subscriber_rate_limit_proportion_used
(gauge)
Proportion of an entitlement's rate limit that has been consumed by a subscriber. Emitted per request. Calculated as: number of requests made / entitlement's rate limit.
Shown as fraction
oci.apigateway.subscriber_requests
(count)
Number of requests made by a subscriber. Emitted per request.
Shown as request
oci.apigateway.usage_plan_requests
(count)
Number of requests to a given entitlement. Emitted per request.
Shown as request
oci.goldengate.cpu_utilization
(gauge)
Total CPU usage percentage by all consumer groups.
Shown as percent
oci.goldengate.deployment_health
(gauge)
Overall percentage health of deployment services. There are four services: Administration service, Distribution service, Receiver service, and Performance Metric service. If all four are running healthy, the expected score is 100%. If Deployment Health is 50%, then only two of the services are running healthy.
Shown as percent
oci.goldengate.deployment_inbound_lag
(gauge)
Average lag, in seconds, for all inbound streams critical to deployment health.
Shown as second
oci.goldengate.deployment_outbound_lag
(gauge)
Average lag, in seconds, for all outbound streams critical to deployment health.
Shown as second
oci.goldengate.distribution_path_lag
(gauge)
Average lag, in seconds, of a Distribution Path process in the deployment. For example, if the source and target deployments are running in two different data centers, network latency issues could impact lag.
Shown as second
oci.goldengate.distribution_path_status
(gauge)
Health percentage of a Distribution Path process in the deployment. 100% when process is Running. 0% when process is Abended or Stopped.
Shown as percent
oci.goldengate.extract_lag
(gauge)
The difference, in seconds, between the time the Extract processed a record (based on the system clock) and the time stamp of that record in the data source.
Shown as second
oci.goldengate.extract_status
(gauge)
Health percentage of an Extract process in the deployment. 100% when process is Running. 0% when process is Abended or Stopped.
Shown as percent
oci.goldengate.file_system_usage
(gauge)
Percentage of File System Space used by the deployment.
Shown as percent
oci.goldengate.heartbeat_lag
(gauge)
Replication lag, in seconds, from the source endpoint to the target endpoint.
Shown as second
oci.goldengate.memory_utilization
(gauge)
Percentage of available memory used. The need for memory is aligned with the size of the data replicated. If enough memory is allocated, then each open transaction is kept in memory until a commit record is received.
Shown as percent
oci.goldengate.ocpu_consumption
(count)
Total number of OCPUs used by the deployment. When the count is lower than the minimum number of OCPUs, the minimum is shown. When the number of OCPUs is greater than the minimum number, the actual number of OCPUs used is shown.
Shown as cpu
oci.goldengate.pipeline_health
(gauge)
Overall health percentage of a Stream Analytics pipeline. 100% when a pipeline is healthy during the time range. 0% when the pipeline is unhealthy or not running during the time range. Between 0% and 100% when a pipeline was unhealthy and is recovering or going to terminate within the time range and needs attention.
Shown as percent
oci.goldengate.pipeline_memory_usage
(gauge)
Memory usage in megabytes (MB) of pipeline drivers and executors in the deployment.
Shown as megabyte
oci.goldengate.pipeline_processing_rate
(gauge)
Average number of events processed per second by pipelines in the deployment.
Shown as event
oci.goldengate.pipeline_scheduling_delay
(gauge)
Average scheduling delay in milliseconds (ms) of pipelines in the deployment.
Shown as millisecond
oci.goldengate.pipeline_total_delay
(gauge)
Average total delay in milliseconds (ms) of pipelines in the deployment.
Shown as millisecond
oci.goldengate.receiver_path_lag
(gauge)
Average lag, in seconds, of a Receiver Path process in the deployment.
Shown as second
oci.goldengate.receiver_path_status
(gauge)
Health percentage of a Receiver Path process in the deployment. 100% when process is Running. 0% when process is Abended or Stopped.
Shown as percent
oci.goldengate.replicat_lag
(gauge)
The difference, in seconds, between the time the Replicat processed the last record (based on the system clock) and the time stamp of the record in the trail.
Shown as second
oci.goldengate.replicat_status
(gauge)
Health percentage of a Replicat process in the deployment. 100% when process is Running. 0% when process is Abended or Stopped.
Shown as percent
oci.goldengate.swap_space_usage
(gauge)
Percentage of Swap Space used by the deployment. Because OCI GoldenGate writes only committed transactions to the trail files, all the uncommitted transactions are cached in memory. Cache uses both physical memory and swap space (virtual memory). Swap space is located on hard drives to provide additional memory when the physical memory (RAM) is full.
Shown as percent
oci.goldengate.temp_space_usage
(gauge)
Percentage of temporary space used by the deployment. When total cached transaction data exceeds the Cachesize setting, Extract writes cache data to temporary files. It is more efficient for the operating system to swap to disk than it is for Extract to write temporary files.
Shown as percent
oci.oracle_appmgmt.active_requests_by_application
(gauge)
Number of executions active grouped by category.
Shown as request
oci.oracle_appmgmt.active_user_sessions
(gauge)
Current Active User Sessions by username.
Shown as session
oci.oracle_appmgmt.active_user_sessions_by_responsibility
(gauge)
Current Active User Sessions grouped by responsibility.
Shown as session
oci.oracle_appmgmt.capacity_utilization_of_concurrent_managers
(gauge)
Utilized capacity of the concurrent manager.
Shown as percent
oci.oracle_appmgmt.completed_requests_by_application
(gauge)
Percentage of executions completed grouped by category.
Shown as percent
oci.oracle_appmgmt.concurrent_processing_component_status
(gauge)
Status of the component. Values are: 1 = Up, 0 = Down.
Shown as resource
oci.oracle_appmgmt.concurrent_requests_by_status
(count)
Concurrent requests by status.
Shown as request
oci.oracle_appmgmt.deferred_records
(count)
Deferred records grouped by status.
Shown as record
oci.oracle_appmgmt.executed_programs_by_running_time
(gauge)
Running time of each execution of the program (raw data).
Shown as millisecond
oci.oracle_appmgmt.forms_database_sessions_per_application
(count)
Number of Forms sessions per application.
Shown as session
oci.oracle_appmgmt.forms_database_sessions_per_user
(count)
Number of Forms sessions per user.
Shown as session
oci.oracle_appmgmt.hourly_completed_concurrent_requests_rate
(gauge)
Concurrent Requests Completed by category.
Shown as percent
oci.oracle_appmgmt.inbound_notifications
(count)
Inbound records grouped by status.
Shown as record
oci.oracle_appmgmt.internal_concurrent_manager_status
(gauge)
Status of the resource. Values are: 1 = Up, 0 = Down.
Shown as resource
oci.oracle_appmgmt.long_active_concurrent_requests
(gauge)
For pending requests, pending time. For running requests, running time.
Shown as millisecond
oci.oracle_appmgmt.monitoring_status
(gauge)
Status of the resource. Values are: 1 = Up, 0 = Down.
Shown as resource
oci.oracle_appmgmt.outbound_notifications
(count)
Outbound records grouped by status.
Shown as record
oci.oracle_appmgmt.queue_details
(count)
Requests grouped by status.
Shown as request
oci.oracle_appmgmt.users_with_most_pending_requests
(gauge)
Number of requests.
Shown as user
oci.oracle_appmgmt.users_with_most_running_requests
(gauge)
Number of requests.
Shown as user
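
As a worked example of reading one of the percentage metrics above: the description of oci.goldengate.deployment_health says that four services contribute to the score and that 50% means only two are healthy. The sketch below turns a reported percentage into a service count. It assumes the four services are weighted equally, which the description implies but does not state outright; the function name is hypothetical.

```python
# Administration, Distribution, Receiver, and Performance Metric services.
GOLDENGATE_SERVICES = 4


def healthy_service_count(deployment_health_percent):
    """Estimate how many of the four services are healthy from the percentage."""
    if not 0 <= deployment_health_percent <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return round(deployment_health_percent / 100 * GOLDENGATE_SERVICES)
```

With equal weighting, a reading of 75 maps to three healthy services and a reading of 100 maps to all four.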

Service Checks

The OCI integration does not include any service checks.

Events

The OCI integration does not include any events.

Troubleshooting

Need help? Contact Datadog support.
