mirror of https://github.com/hashicorp/consul
Language touch-ups for the checks docs.
parent
3c85d7e231
commit
8afcf9f152
|
@ -3,39 +3,39 @@ layout: "docs"
|
||||||
page_title: "Check Definition"
|
page_title: "Check Definition"
|
||||||
sidebar_current: "docs-agent-checks"
|
sidebar_current: "docs-agent-checks"
|
||||||
description: |-
|
description: |-
|
||||||
One of the primary roles of the agent is the management of system and application level health checks. A health check is considered to be application level if it associated with a service. A check is defined in a configuration file, or added at runtime over the HTTP interface.
|
One of the primary roles of the agent is management of system- and application-level health checks. A health check is considered to be application-level if it is associated with a service. A check is defined in a configuration file or added at runtime over the HTTP interface.
|
||||||
---
|
---
|
||||||
|
|
||||||
# Checks
|
# Checks
|
||||||
|
|
||||||
One of the primary roles of the agent is the management of system and
|
One of the primary roles of the agent is management of system- and application-level health
|
||||||
application level health checks. A health check is considered to be application
|
checks. A health check is considered to be application-level if it is associated with a
|
||||||
level if it associated with a service. A check is defined in a configuration file,
|
service. A check is defined in a configuration file or added at runtime over the HTTP interface.
|
||||||
or added at runtime over the HTTP interface.
|
|
||||||
|
|
||||||
There are three different kinds of checks:
|
There are three different kinds of checks:
|
||||||
|
|
||||||
* Script + Interval - These checks depend on invoking an external application
|
* Script + Interval - These checks depend on invoking an external application
|
||||||
that does the health check and exits with an appropriate exit code, potentially
|
that performs the health check, exits with an appropriate exit code, and potentially
|
||||||
generating some output. A script is paired with an invocation interval (e.g.
|
generates some output. A script is paired with an invocation interval (e.g.
|
||||||
every 30 seconds). This is similar to the Nagios plugin system.
|
every 30 seconds). This is similar to the Nagios plugin system.
|
||||||
|
|
||||||
* HTTP + Interval - These checks make an `HTTP GET` request every Interval (e.g.
|
* HTTP + Interval - These checks make an HTTP `GET` request every Interval (e.g.
|
||||||
every 30 seconds) to the specified URL. The status of the service depends on the HTTP Response Code.
|
every 30 seconds) to the specified URL. The status of the service depends on the HTTP response code:
|
||||||
any `2xx` code is passing, `429 Too Many Requests` is warning and anything else is failing.
|
any `2xx` code is considered passing, a `429 Too Many Requests` is a warning, and anything else is a failure.
|
||||||
This type of check should be preferred over a script that for example uses `curl`.
|
This type of check should be preferred over a script that uses `curl` or another external process
|
||||||
|
to check a simple HTTP operation.
|
||||||
|
|
||||||
* Time to Live (TTL) - These checks retain their last known state for a given TTL.
|
* Time to Live (TTL) - These checks retain their last known state for a given TTL.
|
||||||
The state of the check must be updated periodically over the HTTP interface. If an
|
The state of the check must be updated periodically over the HTTP interface. If an
|
||||||
external system fails to update the status within a given TTL, the check is
|
external system fails to update the status within a given TTL, the check is
|
||||||
set to the failed state. This mechanism is used to allow an application to
|
set to the failed state. This mechanism is used to allow an application to
|
||||||
directly report its health. For example, a web app can periodically curl the
|
directly report its health. For example, a healthy web app can periodically `PUT` a status
|
||||||
endpoint, and if the app fails, then the TTL will expire and the health check
|
update to the HTTP endpoint; if the app fails, the TTL will expire and the health check
|
||||||
enters a critical state. This is conceptually similar to a dead man's switch.
|
enters a critical state. This is conceptually similar to a dead man's switch.
|
||||||
|
|
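The HTTP and TTL rules described in the list above can be sketched in a few lines of Python. This is an illustrative sketch only, not Consul's implementation: the names `http_check_status` and `TTLCheck`, the state strings, and the choice to treat a TTL check as critical before its first update are all assumptions made for the example.

```python
import time


def http_check_status(status_code: int) -> str:
    """Map an HTTP response code to a check state (hypothetical helper)."""
    if 200 <= status_code <= 299:
        return "passing"   # any 2xx code is considered passing
    if status_code == 429:
        return "warning"   # 429 Too Many Requests is a warning
    return "critical"      # anything else is a failure


class TTLCheck:
    """Dead man's switch: the last reported state is kept only for `ttl_seconds`."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock          # injectable so the sketch can be tested
        self._status = "critical"    # assumption: critical until first update
        self._last_update = None

    def update(self, status: str) -> None:
        """Record a status report from the application and restart the TTL."""
        self._status = status
        self._last_update = self._clock()

    def status(self) -> str:
        """Return the last reported state, or critical once the TTL has expired."""
        if self._last_update is None:
            return "critical"
        if self._clock() - self._last_update > self.ttl_seconds:
            return "critical"
        return self._status
```

In this sketch the application calls `update("passing")` periodically; if it stops doing so, `status()` falls back to critical once the TTL has elapsed.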
||||||
## Check Definition
|
## Check Definition
|
||||||
|
|
||||||
A check definition that is a script looks like:
|
A script check:
|
||||||
|
|
||||||
```javascript
|
```javascript
|
||||||
{
|
{
|
||||||
|
@ -48,7 +48,7 @@ A check definition that is a script looks like:
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
An HTTP based check looks like:
|
An HTTP check:
|
||||||
|
|
||||||
```javascript
|
```javascript
|
||||||
{
|
{
|
||||||
|
@ -61,7 +61,7 @@ An HTTP based check looks like:
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
A TTL based check is very similar:
|
A TTL check:
|
||||||
|
|
||||||
```javascript
|
```javascript
|
||||||
{
|
{
|
||||||
|
@ -74,18 +74,19 @@ A TTL based check is very similar:
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Each type of definitions must include a `name`, and may optionally
|
Each type of definition must include a `name` and may optionally
|
||||||
provide an `id` and `notes` field. The `id` is set to the `name` if not
|
provide an `id` and `notes` field. The `id` is set to the `name` if not
|
||||||
provided. It is required that all checks have a unique ID per node, so if names
|
provided. It is required that all checks have a unique ID per node: if names
|
||||||
might conflict then unique ID's should be provided.
|
might conflict, unique IDs should be provided.
|
||||||
|
|
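The ID rule above (the `id` defaults to the `name`, and IDs must be unique per node) can be illustrated with a short Python sketch. The function name `resolve_check_ids` and the decision to raise an error on a duplicate are assumptions made for the example; the text does not say how Consul itself reacts to a conflict.

```python
def resolve_check_ids(definitions):
    """Return the effective check IDs for one node.

    Each definition is a dict with a required `name` and an optional `id`.
    Hypothetical helper for illustration; not part of Consul.
    """
    seen = set()
    ids = []
    for definition in definitions:
        # The `id` is set to the `name` if it is not provided.
        check_id = definition.get("id") or definition["name"]
        if check_id in seen:
            # Assumption: this sketch rejects duplicates outright.
            raise ValueError(f"duplicate check ID on this node: {check_id!r}")
        seen.add(check_id)
        ids.append(check_id)
    return ids
```

Two checks that share a name can coexist as long as at least one of them supplies a distinct `id`.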
||||||
The `notes` field is opaque to Consul, but may be used for human
|
|
||||||
readable descriptions. The field is set to any output that a script
|
The `notes` field is opaque to Consul but can be used to provide a human-readable
|
||||||
generates, and similarly the TTL update hooks can update the `notes`
|
description. With a script check, the field is set to any output generated by the
|
||||||
as well.
|
script. Similarly, an external process updating a TTL check via the HTTP interface
|
||||||
|
can set the `notes` value.
|
||||||
|
|
||||||
To configure a check, either provide it as a `-config-file` option to the
|
To configure a check, either provide it as a `-config-file` option to the
|
||||||
agent, or place it inside the `-config-dir` of the agent. The file must
|
agent or place it inside the `-config-dir` of the agent. The file must
|
||||||
end in the ".json" extension to be loaded by Consul. Check definitions can
|
end in the ".json" extension to be loaded by Consul. Check definitions can
|
||||||
also be updated by sending a `SIGHUP` to the agent. Alternatively, the
|
also be updated by sending a `SIGHUP` to the agent. Alternatively, the
|
||||||
check can be registered dynamically using the [HTTP API](/docs/agent/http.html).
|
check can be registered dynamically using the [HTTP API](/docs/agent/http.html).
|
||||||
|
@ -93,8 +94,8 @@ check can be registered dynamically using the [HTTP API](/docs/agent/http.html).
|
||||||
## Check Scripts
|
## Check Scripts
|
||||||
|
|
||||||
A check script is generally free to do anything to determine the status
|
A check script is generally free to do anything to determine the status
|
||||||
of the check. The only limitations placed are that the exit codes must convey
|
of the check. The only limitations placed are that the exit codes must obey
|
||||||
a specific meaning. Specifically:
|
this convention:
|
||||||
|
|
||||||
* Exit code 0 - Check is passing
|
* Exit code 0 - Check is passing
|
||||||
* Exit code 1 - Check is warning
|
* Exit code 1 - Check is warning
|
||||||
|
@ -106,7 +107,7 @@ by human operators.
|
||||||
|
|
||||||
## Service-bound checks
|
## Service-bound checks
|
||||||
|
|
||||||
Health checks may also be optionally bound to a specific service. This ensures
|
Health checks may optionally be bound to a specific service. This ensures
|
||||||
that the status of the health check will only affect the health status of the
|
that the status of the health check will only affect the health status of the
|
||||||
given service instead of the entire node. Service-bound health checks may be
|
given service instead of the entire node. Service-bound health checks may be
|
||||||
provided by adding a `service_id` field to a check configuration:
|
provided by adding a `service_id` field to a check configuration:
|
||||||
|
@ -123,12 +124,12 @@ provided by adding a `service_id` field to a check configuration:
|
||||||
```
|
```
|
||||||
|
|
||||||
In the above configuration, if the web-app health check begins failing, it will
|
In the above configuration, if the web-app health check begins failing, it will
|
||||||
only affect the availability of the web-app service and no other services
|
only affect the availability of the web-app service. All other services
|
||||||
provided by the node.
|
provided by the node will remain unchanged.
|
||||||
|
|
||||||
## Multiple Check Definitions
|
## Multiple Check Definitions
|
||||||
|
|
||||||
Multiple check definitions can be provided at once using the `checks` (plural)
|
Multiple check definitions can be defined using the `checks` (plural)
|
||||||
key in your configuration file.
|
key in your configuration file.
|
||||||
|
|
||||||
```javascript
|
```javascript
|
||||||
|
|