---
layout: intro
page_title: Registering Health Checks
sidebar_current: gettingstarted-checks
description: >-
  We've now seen how simple it is to run Consul, add nodes and services, and
  query those nodes and services. In this step, we will continue our tour by
  adding health checks to both nodes and services. Health checks are a critical
  component of service discovery that prevents the use of unhealthy services.
---

# Health Checks

We've now seen how simple it is to run Consul, add nodes and services, and
query those nodes and services. In this section, we will continue our tour
by adding health checks to both nodes and services. Health checks are a
critical component of service discovery that prevents the use of
unhealthy services.

This step builds upon [the Consul cluster created previously](join.html).
At this point, you should have a two-node cluster running.

## Defining Checks

Similar to a service, a check can be registered either by providing a
[check definition](/docs/agent/checks.html) or by making the
appropriate calls to the [HTTP API](/api/health.html).

We will use the check definition approach because, just like with
services, definitions are the most common way to set up checks.

In Consul 0.9.0 and later, the agent must be configured with
`enable_script_checks` set to `true` in order to enable script checks.

Create two definition files in the Consul configuration directory of
the second node:

```text
vagrant@n2:~$ echo '{"check": {"name": "ping",
  "args": ["ping", "-c1", "google.com"], "interval": "30s"}}' \
  >/etc/consul.d/ping.json

vagrant@n2:~$ echo '{"service": {"name": "web", "tags": ["rails"], "port": 80,
  "check": {"args": ["curl", "localhost"], "interval": "10s"}}}' \
  >/etc/consul.d/web.json
```
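For readability, the same `ping` definition can also be written with a heredoc instead of a one-line `echo`. The following is a minimal sketch, not the guide's own method: the `CONSUL_CONFIG_DIR` variable and its `./consul.d` default are assumptions made for illustration, whereas the walkthrough itself writes to `/etc/consul.d`.

```shell
#!/bin/sh
# Sketch: write the "ping" check definition as pretty-printed JSON.
# CONSUL_CONFIG_DIR is a hypothetical variable for this example; point it at
# the directory your agent actually loads (the walkthrough uses /etc/consul.d).
CONFIG_DIR="${CONSUL_CONFIG_DIR:-./consul.d}"
mkdir -p "$CONFIG_DIR"

# The quoted 'EOF' delimiter keeps the shell from expanding anything in the JSON.
cat > "$CONFIG_DIR/ping.json" <<'EOF'
{
  "check": {
    "name": "ping",
    "args": ["ping", "-c1", "google.com"],
    "interval": "30s"
  }
}
EOF

echo "wrote $CONFIG_DIR/ping.json"
```

The content is equivalent to the one-line version; only the layout differs, which can make larger definitions easier to review.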

The first definition adds a host-level check named "ping". This check runs
on a 30-second interval, invoking `ping -c1 google.com`. A `script`-based
health check runs as the same user that started the Consul process.
If the command exits with an exit code >= 2, then the check will be flagged as
failing and the service will be considered unhealthy. An exit code of 0 means
the check is passing, and an exit code of 1 is treated as a warning state.
This is the contract for any
[`script`-based health check](/docs/agent/checks.html#check-scripts).

The second command modifies the service named `web`, adding a check that sends a
request every 10 seconds via curl to verify that the web server is accessible.
As with the host-level health check, if the script exits with an exit code >= 2,
the check will be flagged as failing and the service will be considered unhealthy.
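To make the exit-code contract concrete, here is a minimal sketch of a custom check script. The `disk_status` function and both thresholds are hypothetical and exist only for illustration; the part that matters is the mapping of outcomes to exit codes 0, 1, and 2.

```shell
#!/bin/sh
# Sketch of a script check that follows the exit-code contract:
#   0 = passing, 1 = warning, 2 or higher = critical.
# The function name and both thresholds are hypothetical.
WARN_PCT=80
CRIT_PCT=90

# disk_status PERCENT: print a short message and return the matching code.
disk_status() {
  used="$1"
  if [ "$used" -ge "$CRIT_PCT" ]; then
    echo "critical: disk ${used}% full"
    return 2
  elif [ "$used" -ge "$WARN_PCT" ]; then
    echo "warning: disk ${used}% full"
    return 1
  fi
  echo "ok: disk ${used}% full"
  return 0
}

# In a real check script, measure the value and exit with the result, e.g.:
#   used=$(df -P / | awk 'NR==2 { sub("%", "", $5); print $5 }')
#   disk_status "$used"; exit $?
```

A script like this could be referenced from the `args` array of a check definition in the same way that `ping` and `curl` are used above.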

Now, restart the second agent, reload it with `consul reload`, or send it a
`SIGHUP` signal. You should see the following log lines:

```text
==> Starting Consul agent...
...
[INFO] agent: Synced service 'web'
[INFO] agent: Synced check 'service:web'
[INFO] agent: Synced check 'ping'
[WARN] Check 'service:web' is now critical
```

The first few lines indicate that the agent has synced the new
definitions. The last line indicates that the check we added for
the `web` service is critical. This is because we're not actually running
a web server, so the curl test is failing!

## Checking Health Status

Now that we've added some simple checks, we can use the HTTP API to inspect
them. First, we can look for any failing checks using this command (note that
it can be run on either node):

```text
vagrant@n1:~$ curl http://localhost:8500/v1/health/state/critical
[{"Node":"agent-two","CheckID":"service:web","Name":"Service 'web' check","Status":"critical","Notes":"","ServiceID":"web","ServiceName":"web","ServiceTags":["rails"]}]
```
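When many checks are failing, the raw JSON becomes hard to scan. As a rough sketch, the response can be reduced to a list of check IDs with standard text tools. The `check_ids` helper is hypothetical, and it assumes the compact, single-line JSON shown in the sample output; a JSON-aware tool would be more robust.

```shell
#!/bin/sh
# Sketch: list the CheckID of every check in a /v1/health/state/<state> reply.
# check_ids is a hypothetical helper; it reads the JSON array on stdin and
# relies on simple pattern matching rather than a real JSON parser.
check_ids() {
  grep -o '"CheckID":"[^"]*"' | cut -d'"' -f4
}

# Example against a live agent (same address as in the command above):
#   curl -s http://localhost:8500/v1/health/state/critical | check_ids
```

For the sample response shown here, this would print the single ID `service:web`.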

We can see that there is only a single check, our `web` service check, in the
`critical` state.

Additionally, we can attempt to query the web service using DNS. Consul
will not return any results since the service is unhealthy:

```text
dig @127.0.0.1 -p 8600 web.service.consul
...

;; QUESTION SECTION:
;web.service.consul. IN A
```

## Next Steps

In this section, you learned how easy it is to add health checks. Check definitions
can be updated by changing configuration files and sending a `SIGHUP` to the agent.
Alternatively, the HTTP API can be used to add, remove, and modify checks dynamically.
The API also allows for a "dead man's switch", a
[TTL-based check](/docs/agent/checks.html#TTL). TTL checks can be used to integrate an
application more tightly with Consul, enabling business logic to be evaluated as part
of assessing the state of the check.
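As an illustration of the "dead man's switch" idea, the sketch below registers a TTL check through the agent HTTP API and then reports it as passing at regular intervals. The helper names, the check name `app-heartbeat`, and the `30s` TTL are hypothetical; the endpoints used are the agent's `check/register` and `check/pass` routes, and the agent address is the one used earlier in this guide.

```shell
#!/bin/sh
# Sketch: a "dead man's switch" using a TTL check and the agent HTTP API.
# The helper names, the check name "app-heartbeat", and the 30s TTL are
# hypothetical values chosen for this example.
CONSUL_ADDR="http://localhost:8500"

# ttl_check_payload NAME TTL: print the JSON body for a TTL check.
ttl_check_payload() {
  printf '{"Name": "%s", "TTL": "%s"}\n' "$1" "$2"
}

# register_ttl_check NAME TTL: register the check with the local agent.
register_ttl_check() {
  ttl_check_payload "$1" "$2" |
    curl -s -X PUT --data @- "$CONSUL_ADDR/v1/agent/check/register"
}

# heartbeat CHECK_ID: mark the check as passing and restart its TTL timer.
heartbeat() {
  curl -s -X PUT "$CONSUL_ADDR/v1/agent/check/pass/$1"
}

# Typical use from an application wrapper:
#   register_ttl_check app-heartbeat 30s
#   while true; do heartbeat app-heartbeat; sleep 10; done
```

If the application stops sending heartbeats, the TTL expires and the check moves to the critical state, which is what makes this pattern useful for tying application-level logic to service health.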

Next, we will explore [Consul's K/V store](kv.html).