mirror of https://github.com/hashicorp/consul
Backport of Consul Kubernetes Datadog Integration Docs Update into release/1.18.x (#20947)
* backport of commitpull/20953/head8fbe164ce9
* backport of commita451d3e338
* backport of commit6435d67ea9
* backport of commit45e5073628
* backport of commite4a0fa97a1
* backport of commit8ac1b1bd26
* backport of commitc8084f4205
* backport of commit41577dab22
* backport of commitc5b41671d3
* backport of commitbddbd0d922
* backport of commit57993715ae
* backport of commita926b12686
* backport of commitb1d3c3e6a9
* backport of commit83ca3690ad
* backport of commitc821820b34
* backport of commit54ecce7ffc
* backport of commit96252cc15f
* backport of commit87e8d65bdc
* backport of commitfc4c8f72bd
--------- Co-authored-by: natemollica-dev <nathan.mollica@hashicorp.com> Co-authored-by: natemollica-dev <57850649+natemollica-nm@users.noreply.github.com>
parent
1eb243f9cf
commit
f1fdeba214
@ -0,0 +1,468 @@
|
||||
---
|
||||
layout: docs
|
||||
page_title: Configure Datadog metrics collection for Consul on Kubernetes
|
||||
description: >-
|
||||
Consul can integrate with external platforms such as Datadog to stream metrics about its operations. Learn how to enable Consul monitoring with Datadog by configuring the `metrics.datadog` Helm value override options.
|
||||
---
|
||||
|
||||
# Configure Datadog metrics collection for Consul on Kubernetes
|
||||
|
||||
This page describes the processes for integrating Datadog metrics collection in your Consul on Kubernetes deployment. The Helm chart includes automated configuration options to simplify the integration process.
|
||||
|
||||
## Datadog Metrics Integration Methods
|
||||
|
||||
- [DogstatsD](#dogstatsd)
|
||||
- [Datadog Checks: Official Consul Integration](#datadog-checks-official-consul-integration)
|
||||
- [Openmetrics Prometheus](#openmetrics-prometheus)
|
||||
|
||||
Users should choose **_one_** integration method from the three described below that best suites the intent for metrics collection. **[DogStatsD](https://docs.datadoghq.com/developers/dogstatsd/?tab=kubernetes)**, **[Consul Integration](https://docs.datadoghq.com/integrations/consul/?tab=containerized)**, and **[Openmetrics Prometheus](https://docs.datadoghq.com/containers/kubernetes/prometheus/?tab=kubernetesadv2)** methods of integration are **_mutually exclusive_**.
|
||||
|
||||
**Reasoning:** _The consul-k8s helm chart automated configuration implements Datadog's [Consul Integration](https://docs.datadoghq.com/integrations/consul/?tab=containerized) method using the [`use_prometheus_endpoint`](https://github.com/DataDog/integrations-core/blob/07c04c5e9465ba1f3e0198830896d05923e81283/consul/datadog_checks/consul/data/conf.yaml.example#L59) configuration parameter. **DogstatsD**, **Consul Integration**, and **Openmetrics Prometheus** Metrics **by design** share the same [metric name](https://docs.datadoghq.com/integrations/consul/?tab=host#data-collected) syntax for collection, and would therefore cause a conflict. The [consul.py](https://github.com/DataDog/integrations-core/blob/07c04c5e9465ba1f3e0198830896d05923e81283/consul/datadog_checks/consul/consul.py#L55-L61) integration source code, as well as the [consul-k8s helm chart](https://github.com/hashicorp/consul-k8s/blob/4cac70496788f50354f96e9331003fcf338f419c/charts/consul/templates/_helpers.tpl#L595-L598) prohibit the enablement of more that one integration at a time._
|
||||
|
||||
|
||||
## DogstatsD
|
||||
|
||||
This method of implementation leverages the [hashicorp/go-metrics DogstatsD client library](https://github.com/hashicorp/go-metrics/tree/master/datadog) to manage metrics collection.
|
||||
Metrics are aggregated and sent via UDP or UDS transports to a Datadog Agent that runs on the same Kube Node as the Consul servers.
|
||||
|
||||
Enabling this method of metrics collection allows Consul to control the delivery of metrics traffic directly to a Datadog agent rather
|
||||
than a Datadog agent attempting to reach Consul and scrape the `/v1/agent/metrics` API endpoint.
|
||||
|
||||
This is accomplished by updating each server agent's configuration telemetry stanza.
|
||||
|
||||
### Helm Chart Configuration
|
||||
|
||||
<Tabs>
|
||||
<CodeBlockConfig heading={"DogstatsD (UDS)"}>
|
||||
|
||||
Consul Helm Chart Overrides
|
||||
|
||||
```yaml
|
||||
metrics:
|
||||
enabled: true
|
||||
enableAgentMetrics: true
|
||||
datadog:
|
||||
enabled: true
|
||||
namespace: "datadog"
|
||||
dogstatsd:
|
||||
enabled: true
|
||||
socketTransportType: "UDS"
|
||||
dogstatsdAddr: "/var/run/datadog/dsd.socket"
|
||||
```
|
||||
|
||||
Resulting server agent telemetry configuration
|
||||
|
||||
```json
|
||||
{
|
||||
"telemetry": {
|
||||
"dogstatsd_addr": "unix:///var/run/datadog/dsd.socket"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
</CodeBlockConfig>
|
||||
|
||||
<CodeBlockConfig heading={"DogstatsD (UDP:KubeService)"}>
|
||||
|
||||
Consul Helm Chart Overrides
|
||||
|
||||
```yaml
|
||||
metrics:
|
||||
enabled: true
|
||||
enableAgentMetrics: true
|
||||
datadog:
|
||||
enabled: true
|
||||
namespace: "datadog"
|
||||
dogstatsd:
|
||||
enabled: true
|
||||
socketTransportType: "UDP"
|
||||
# Set `dogstatsdPort` to `0` (default) to omit port number append to address.
|
||||
dogstatsdPort: 0
|
||||
dogstatsdAddr: "datadog-agent.datadog.svc.cluster.local"
|
||||
```
|
||||
|
||||
Resulting server agent telemetry configuration
|
||||
|
||||
```json
|
||||
{
|
||||
"telemetry": {
|
||||
"dogstatsd_addr": "datadog-agent.datadog.svc.cluster.local"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
</CodeBlockConfig>
|
||||
|
||||
<CodeBlockConfig heading={"DogstatsD (UDP: hostPort)"}>
|
||||
|
||||
Consul Helm Chart Overrides
|
||||
|
||||
```yaml
|
||||
metrics:
|
||||
enabled: true
|
||||
enableAgentMetrics: true
|
||||
datadog:
|
||||
enabled: true
|
||||
namespace: "datadog"
|
||||
dogstatsd:
|
||||
enabled: true
|
||||
socketTransportType: "UDP"
|
||||
dogstatsdPort: 8125
|
||||
dogstatsdAddr: "172.20.180.10"
|
||||
```
|
||||
|
||||
Resulting server agent telemetry configuration
|
||||
|
||||
```json
|
||||
{
|
||||
"telemetry": {
|
||||
"dogstatsd_addr": "172.20.180.10:8125",
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
</CodeBlockConfig>
|
||||
</Tabs>
|
||||
|
||||
### UDS/UDP Advantages and Disadvantages
|
||||
|
||||
This integration method accomplishes metrics collection by leveraging either [Unix Domain Sockets](https://docs.datadoghq.com/developers/dogstatsd/unix_socket/?tab=kubernetes) (**UDS**) or [User Datagram Protocol](https://docs.datadoghq.com/developers/dogstatsd/?tab=kubernetes#agent) (**UDP**) transport.
|
||||
Practitioners who manage their Kubernetes infrastructure and/or service-mesh should take into account the implications outlined in the tables below.
|
||||
|
||||
#### UDS
|
||||
|
||||
**Packet Transport**: Unix Domain Socket File
|
||||
|
||||
| Advantages | Disadvantages |
|
||||
|-------------------------------------------------------|------------------------------------------------------------------------------------------------------|
|
||||
| No IP or DNS resolution requirement for Datadog Agent | Requires [hostPath](https://kubernetes.io/docs/concepts/storage/volumes/#hostpath) Volume attachment |
|
||||
| Improved network performance | Datadog Agent must run on every host you send metrics from |
|
||||
| Higher throughput capacity | |
|
||||
| Packet error handling | |
|
||||
| Automatic container ID tagging | |
|
||||
|
||||
#### UDP
|
||||
|
||||
**Packet Transport**:
|
||||
* Kubernetes Service `IP:Port`
|
||||
* Container Host Port
|
||||
|
||||
| Advantages | Disadvantages |
|
||||
|------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------|
|
||||
| Does **not** require [hostPath](https://kubernetes.io/docs/concepts/storage/volumes/#hostpath) Volume attachment | **No** packet error handling |
|
||||
| (**_KubeDNS_**) Does **not** require Hostport exposure if accessible from cluster | (**_Hostport_**) Requires a networking provider that adheres to the CNI specification, such as Calico, Canal, or Flannel. |
|
||||
| Similar `IP:Port` configuration as Virtual Machine hosts | (**_Hostport_**) Requires port to be exposed on host using `hostNetwork` |
|
||||
| | (**_Hostport_**) Requires firewall access controls to permit access |
|
||||
| | (**_Hostport_**) Network Namespace sharing is required |
|
||||
|
||||
|
||||
#### Verifying DogstatsD Metric Collection
|
||||
|
||||
To confirm you're Datadog agent is receiving traffic, the `status` subcommand can be ran from the Datadog Agent expecting to receive DogstatsD traffic from Consul.
|
||||
|
||||
There should be an increase in either UDP or UDS traffic packet counts from the resultant output after the configuration has been properly established.
|
||||
|
||||
|
||||
| Transport | Command | Pod | Container |
|
||||
|:---------------|----------------------------------------------------------------------------|---------------|-----------|
|
||||
| `UDP`\|\|`UDS` | `agent status` | datadog-agent | agent |
|
||||
|
||||
|
||||
<Tabs>
|
||||
<CodeBlockConfig heading={"UDP"}>
|
||||
|
||||
```shell
|
||||
# Example: UDP Packet and Metric Packet Traffic Increase
|
||||
=========
|
||||
DogStatsD
|
||||
=========
|
||||
Event Packets: 0
|
||||
Event Parse Errors: 0
|
||||
Metric Packets: 5,908
|
||||
Metric Parse Errors: 0
|
||||
Service Check Packets: 0
|
||||
Service Check Parse Errors: 0
|
||||
Udp Bytes: 636,872
|
||||
Udp Packet Reading Errors: 0
|
||||
Udp Packets: 3,300
|
||||
Uds Bytes: 0
|
||||
Uds Origin Detection Errors: 0
|
||||
Uds Packet Reading Errors: 0
|
||||
Uds Packets: 0
|
||||
Unterminated Metric Errors: 0
|
||||
```
|
||||
</CodeBlockConfig>
|
||||
<CodeBlockConfig heading={"UDS"}>
|
||||
|
||||
```shell
|
||||
# Example: UDS Packet and Metric Packet Traffic Increase
|
||||
=========
|
||||
DogStatsD
|
||||
=========
|
||||
Event Packets: 0
|
||||
Event Parse Errors: 0
|
||||
Metric Packets: 30,523
|
||||
Metric Parse Errors: 0
|
||||
Service Check Packets: 0
|
||||
Service Check Parse Errors: 0
|
||||
Udp Bytes: 124,635
|
||||
Udp Packet Reading Errors: 0
|
||||
Udp Packets: 731
|
||||
Uds Bytes: 2,957,433
|
||||
Uds Origin Detection Errors: 0
|
||||
Uds Packet Reading Errors: 0
|
||||
Uds Packets: 11,563
|
||||
Unterminated Metric Errors: 0
|
||||
```
|
||||
|
||||
</CodeBlockConfig>
|
||||
</Tabs>
|
||||
|
||||
Traffic verification can also be accomplished using the `netstat` command line utility from a consul-server expected to be submitting
|
||||
metrics data to Datadog.
|
||||
|
||||
<Note>
|
||||
Using <code>netstat</code> requires privileged container permissions to install <code>open-bsd</code> networking tools on the consul-server for testing.
|
||||
</Note>
|
||||
|
||||
| Transport | Command | Pod | Container |
|
||||
|:-----------------|-----------|---------------|-----------|
|
||||
| `UDP` \|\| `UDS` | `netstat` | consul-server | consul |
|
||||
|
||||
<Tabs>
|
||||
<CodeBlockConfig heading={"UDP"}>
|
||||
|
||||
```shell
|
||||
$ netstat -nup | grep "172.28.13.12:8125.*ESTABLISHED
|
||||
udp 0 0 127.0.0.1:53874 127.0.0.1:8125 ESTABLISHED 23176/consul
|
||||
```
|
||||
|
||||
</CodeBlockConfig>
|
||||
|
||||
<CodeBlockConfig heading={"UDS"}>
|
||||
|
||||
```shell
|
||||
$ netstat -x
|
||||
Active UNIX domain sockets (w/o servers)
|
||||
Proto RefCnt Flags Type State I-Node Path
|
||||
unix 2 [ ] DGRAM CONNECTED 15952473
|
||||
unix 2 [ ] DGRAM 15652537 @9d10c
|
||||
```
|
||||
|
||||
</CodeBlockConfig>
|
||||
|
||||
</Tabs>
|
||||
|
||||
UDS provides the additional capability for verification by sending a test metrics packet to the Unix Socket configured.
|
||||
|
||||
<Note>
|
||||
Using <code>netcat</code> (nc) requires privileged container permissions to install <code>open-bsd</code> networking tools on the consul-server for testing.
|
||||
</Note>
|
||||
|
||||
| Transport | Command | Pod | Container |
|
||||
|:----------|---------|---------------|-----------|
|
||||
| `UDS` | `nc` | consul-server | consul |
|
||||
|
||||
<CodeBlockConfig heading={"UDS"}>
|
||||
|
||||
```shell
|
||||
$ echo -n "custom.metric.name:1|c" | nc -U -u -w1 /var/run/datadog/dsd.socket
|
||||
Bound on /tmp/nc-IjJkoG/recv.sock
|
||||
```
|
||||
|
||||
</CodeBlockConfig>
|
||||
|
||||
#### Use Case
|
||||
|
||||
DogstatsD integration provides full-scope metrics collection from Consul, and minimizes access control configuration requirements as traffic
|
||||
flow is outbound (toward the Datadog Agent) as opposed to inbound (toward the `/v1/agent/metrics/` API endpoint).
|
||||
|
||||
|
||||
#### Metrics Data Collected
|
||||
|
||||
- Full list of metrics sent via DogstatsD consists of those listed in the [Agent Telemetry](https://developer.hashicorp.com/consul/docs/agent/telemetry) documentation.
|
||||
|
||||
|
||||
## Datadog Checks: Official Consul Integration
|
||||
|
||||
The Datadog Agent package includes official third-party integrations for built-in availability upon agent deployment.
|
||||
|
||||
The Consul Integration Datadog checks provided some additional metric verification checks that leverage Consul's built-in feature-set, and help monitor Consul
|
||||
during normal operation beyond that of Consul's available metrics.
|
||||
|
||||
See the below [table](#additional-integration-checks-performed) for an outline of the features added by the official integration.
|
||||
|
||||
<Note>
|
||||
Currently, the annotations configured by the Helm overrides with Consul RPC TLS enabled
|
||||
assume server and ca certificate secrets are shared with the Datadog agent release namespace and mount the valid <code>tls.crt</code>, <code>tls.key</code>,
|
||||
and <code>ca.crt</code> secret volumes at the <code>/etc/datadog-agent/conf.d/consul.d/certs</code> path on the Datadog Agent, agent container.
|
||||
</Note>
|
||||
|
||||
### Helm Chart Configuration
|
||||
|
||||
<Tabs>
|
||||
<CodeBlockConfig heading={"Datadog Consul Checks"}>
|
||||
|
||||
Consul Helm Chart Overrides
|
||||
|
||||
```yaml
|
||||
global:
|
||||
tls:
|
||||
enabled: true
|
||||
enableAutoEncrypt: true
|
||||
acls:
|
||||
manageSystemACLs: true
|
||||
metrics:
|
||||
enabled: true
|
||||
enableAgentMetrics: true
|
||||
datadog:
|
||||
enabled: true
|
||||
namespace: "datadog"
|
||||
```
|
||||
|
||||
|
||||
Consul `server-statefulset.yaml` annotations
|
||||
|
||||
```yaml
|
||||
"ad.datadoghq.com/consul.checks": |
|
||||
{
|
||||
"consul": {
|
||||
"init_config": {},
|
||||
"instances": [
|
||||
{
|
||||
"url": "https://consul-server.consul.svc:8501",
|
||||
"tls_cert": "/etc/datadog-agent/conf.d/consul.d/certs/tls.crt",
|
||||
"tls_private_key": "/etc/datadog-agent/conf.d/consul.d/certs/tls.key",
|
||||
"tls_ca_cert": "/etc/datadog-agent/conf.d/consul.d/ca/tls.crt",
|
||||
"use_prometheus_endpoint": true,
|
||||
"acl_token": "ENC[k8s_secret@consul/consul-datadog-agent-metrics-acl-token/token]",
|
||||
"new_leader_checks": true,
|
||||
"network_latency_checks": true,
|
||||
"catalog_checks": true,
|
||||
"auth_type": "basic"
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
</CodeBlockConfig>
|
||||
</Tabs>
|
||||
|
||||
### Additional Integration Checks Performed
|
||||
|
||||
| Consul Component | Description | API Endpoint(s) |
|
||||
|------------------|-----------------------------------------------------|----------------------------------------------------------------------|
|
||||
| Agent | Agent Metadata (i.e., version) | `/v1/agent/self` |
|
||||
| Metrics | Prometheus formatted metrics | `/v1/agent/metrics` |
|
||||
| Serf | Events and Membership Flaps | `/v1/health/service/consul` `/v1/agent/self` |
|
||||
| Raft | Monitors Raft peer information and leader elections | `/v1/status/leader` `/v1/status/peers` |
|
||||
| Catalog Services | Service Health Status and Node Count | `/v1/catalog/services` `/v1/health/state/any` |
|
||||
| Catalog Nodes | Node Service Count and Health Status | `/v1/health/state/any` `/v1/health/service/<service>` |
|
||||
| Consul Latency | Consul LAN + WAN Coordinate Latency Calculations | `/v1/agent/self` `/v1/coordinate/nodes` `/v1/coordinate/datacenters` |
|
||||
|
||||
#### Use Case
|
||||
|
||||
This integration is primarily for basic Consul monitoring with focus on the service discovery.
|
||||
|
||||
#### Metrics Data Collected
|
||||
|
||||
The list of Consul's Prometheus metrics scraped and mapped by this method are listed in the latest [metrics.py](https://github.com/DataDog/integrations-core/blob/master/consul/datadog_checks/consul/metrics.py) of the integration source code.
|
||||
|
||||
To understand how Consul Latency metrics are calculated, review the [Consul Network Coordinates](https://developer.hashicorp.com/consul/docs/architecture/coordinates) documentation.
|
||||
|
||||
Review the [Datadog Documentation](https://docs.datadoghq.com/integrations/consul/?tab=containerized#data-collected) for the full description of Metrics data collected.
|
||||
|
||||
## Openmetrics Prometheus
|
||||
|
||||
For Datadog agents at or above v6.5.0, OpenMetrics and Prometheus checks are available to scrape Kubernetes application Prometheus endpoints.
|
||||
|
||||
This method implements the collection via Openmetrics as that is fully supported for Prometheus text format and is accomplished using pod annotations as demonstrated below.
|
||||
|
||||
<Note>
|
||||
Enabling OpenMetrics collection via Datadog <strong><em>by design</em></strong> removes the <code>prometheus.io/path</code> and <code>prometheus.io/port</code> annotations from the consul-server statefulset deployment to allow Datadog
|
||||
to scrape the agent's metrics API endpoint using either RPC TLS and Consul ACLs as necessary.
|
||||
</Note>
|
||||
|
||||
<Note>
|
||||
Currently, the annotations configured by the Helm overrides with Consul RPC TLS enabled
|
||||
assume server and ca certificate secrets are shared with the Datadog agent release namespace and mount the valid <code>tls.crt</code>, <code>tls.key</code>,
|
||||
and <code>ca.crt</code> secret volumes at the <code>/etc/datadog-agent/conf.d/consul.d/certs</code> path on the Datadog Agent, agent container.
|
||||
</Note>
|
||||
|
||||
### Helm Chart Configuration
|
||||
|
||||
<Tabs>
|
||||
<CodeBlockConfig heading={"OpenMetrics Prometheus"}>
|
||||
|
||||
Consul Helm Chart Overrides
|
||||
|
||||
```yaml
|
||||
global:
|
||||
tls:
|
||||
enabled: true
|
||||
enableAutoEncrypt: true
|
||||
acls:
|
||||
manageSystemACLs: true
|
||||
metrics:
|
||||
enabled: true
|
||||
enableAgentMetrics: true
|
||||
datadog:
|
||||
enabled: true
|
||||
namespace: "datadog"
|
||||
openMetricsPrometheus:
|
||||
enabled: true
|
||||
```
|
||||
|
||||
Consul `server-statefulset.yaml` annotations
|
||||
|
||||
```yaml
|
||||
ad.datadoghq.com/consul.checks: |
|
||||
{
|
||||
"openmetrics": {
|
||||
"init_config": {},
|
||||
"instances": [
|
||||
{
|
||||
"openmetrics_endpoint": "https://consul-server.consul.svc:8501/v1/agent/metrics?format=prometheus",
|
||||
"tls_cert": "/etc/datadog-agent/conf.d/consul.d/certs/tls.crt",
|
||||
"tls_private_key": "/etc/datadog-agent/conf.d/consul.d/certs/tls.key",
|
||||
"tls_ca_cert": "/etc/datadog-agent/conf.d/consul.d/ca/tls.crt",
|
||||
"headers": {
|
||||
"X-Consul-Token": "ENC[k8s_secret@consul/consul-datadog-agent-metrics-acl-token/token]"
|
||||
},
|
||||
"namespace": "consul",
|
||||
"metrics": [ ".*" ]
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
</CodeBlockConfig>
|
||||
</Tabs>
|
||||
|
||||
#### Use Case
|
||||
|
||||
This method of integration is useful for Prometheus-enabled scrapes with further customization of the collected data.
|
||||
|
||||
By default, all metrics pulled using this method scrape Consul metrics using the `/v1/agent/metrics?format=prometheus` API query, and are considered to be custom metrics.
|
||||
|
||||
Use of this method maps to Datadog as described in [Mapping Prometheus Metrics to Datadog Metrics](https://docs.datadoghq.com/integrations/guide/prometheus-metrics/?tab=latestversion). The following table summarizing how these metrics map to each other.
|
||||
|
||||
| OpenMetrics metric type | Datadog metric type |
|
||||
|:------------------------|:-----------------------------------|
|
||||
| `Gauge` | `gauge` |
|
||||
| `Counter` | `count` |
|
||||
| Histogram: `_count ` | `count.count` |
|
||||
| Histogram: `_sum` | `count.sum` |
|
||||
| Histogram: `_bucket` | `count.bucket` \|\| `distribution` |
|
||||
| Summary: `_count` | `count.count` |
|
||||
| Summary: `_sum` | `count.sum` |
|
||||
| Summary: `sample` | `gauge.quantile` |
|
||||
|
||||
|
||||
#### Metrics Data Collected
|
||||
|
||||
The integration, by default, uses a wildcard (`".*"`) to collect **_all_** metrics emitted from the `/v1/agent/metrics` endpoint.
|
||||
|
||||
Please refer to the [Agent Telemetry](https://developer.hashicorp.com/consul/docs/agent/telemetry) documentation for a full list and desription of the metrics data collected.
|
Loading…
Reference in new issue