website: document checks and services

2014-02-18 18:05:18 -08:00 · 2014-02-18 18:05:18 -08:00 · 1eb51ee663
parent 0fc401b14e
commit 1eb51ee663
4 changed files with 127 additions and 3 deletions
--- a/website/source/docs/agent/checks.html.markdown
+++ b/website/source/docs/agent/checks.html.markdown
@@ -0,0 +1,62 @@
+---
+layout: "docs"
+page_title: "Check Definition"
+sidebar_current: "docs-agent-checks"
+---
+
+# Checks
+
+One of the primary roles of the agent is the management of system and
+application level health checks. A health check is considered to be application
+level if it is associated with a service. A check is defined in a configuration file,
+or added at runtime over the HTTP interface.
+
+There are two different kinds of checks:
+
+ * Script + Interval - These checks depend on invoking an external application
+ which does the health check and exits with an appropriate exit code, potentially
+ generating some output. A script is paired with an invocation interval (e.g.
+ every 30 seconds). This is similar to the Nagios plugin system.
+
+ * TTL - These checks retain their last known state for a given TTL. The state
+ of the check must be updated periodically over the HTTP interface. If an
+ external system fails to update the status within a given TTL, the check is
+ set to the critical state. This mechanism is used to allow an application to
+ directly report its health. For example, a web app can periodically curl the
+ endpoint, and if the app fails, then the TTL will expire and the health check
+ enters a critical state.
+
+## Check Definition
+
+A check definition that is a script looks like:
+
+    {
+        "check": {
+            "id": "mem-util",
+            "name": "Memory utilization",
+            "script": "/usr/local/bin/check_mem.py",
+            "interval": "10s"
+        }
+    }
+
+A TTL based check is very similar:
+
+    {
+        "check": {
+            "id": "web-app",
+            "name": "Web App Status",
+            "notes": "Web app does a curl internally every 10 seconds",
+            "ttl": "30s"
+        }
+    }
+
+Both types of definitions must include a `name`, and may optionally
+provide an `id` and `notes` field. The `id` is set to the `name` if not
+provided. It is required that all checks have a unique ID, so if names
+might conflict, then unique IDs should be provided.
+
+The `notes` field is opaque to Consul, but may be used for human
+readable descriptions. For a script check, the field is set to any output
+that the script generates, and similarly the TTL update hooks can update
+the `notes` as well.
+
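
Reviewer sketch for the script check type documented in checks.html.markdown above. It is a hypothetical stand-in for `/usr/local/bin/check_mem.py`, not the script the docs refer to. It assumes Nagios-style exit codes (0 for OK, 1 for warning, 2 for critical), which the page implies but does not spell out, and the thresholds and function names are illustrative only.

```python
#!/usr/bin/env python
"""Hypothetical stand-in for /usr/local/bin/check_mem.py (Linux only)."""
import sys

# Assumed Nagios-style exit codes; confirm what the agent actually expects.
OK, WARNING, CRITICAL = 0, 1, 2


def used_fraction(meminfo_text):
    """Return the fraction of memory in use, parsed from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])  # first token is the number
    total = fields["MemTotal"]
    # MemAvailable is missing on older kernels; fall back to MemFree.
    available = fields.get("MemAvailable", fields["MemFree"])
    return 1.0 - float(available) / total


def classify(fraction, warn=0.80, crit=0.95):
    """Map a usage fraction to (exit_code, output). Thresholds are illustrative."""
    message = "memory used: %.1f%%" % (fraction * 100)
    if fraction >= crit:
        return CRITICAL, message
    if fraction >= warn:
        return WARNING, message
    return OK, message


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        code, output = classify(used_fraction(f.read()))
    print(output)  # script output, which the page says ends up in `notes`
    sys.exit(code)
```

Keeping the parsing and the threshold logic in separate functions lets the script be exercised without a live `/proc/meminfo`; the agent would only see the printed line and the exit code.
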
--- a/website/source/docs/agent/options.html.markdown
+++ b/website/source/docs/agent/options.html.markdown
@@ -32,7 +32,7 @@ The options below are all specified on the command-line.
  IP address. Consul uses both TCP and UDP and use the same port for both, so if you
  have any firewalls be sure to allow both protocols.

- * `-server-addr` - The address that the agent will bind to for handling RPC calls
+* `-server-addr` - The address that the agent will bind to for handling RPC calls
 if running in server mode. This does not affect clients running in client mode.
 By default this is "0.0.0.0:8300". This port is used for TCP communications so any
 firewalls must be configured to allow this.
@@ -127,8 +127,8 @@ at a single JSON object with configuration within it.
 Configuration files are used for more than just setting up the agent,
 they are also used to provide check and service definitions. These are used
 to announce the availability of system servers to the rest of the cluster.
-They are documented seperately under [check configuration](#) and
-[service configuration](#) respectively.
+They are documented separately under [check configuration](/docs/agent/checks.html) and
+[service configuration](/docs/agent/services.html) respectively.

 #### Example Configuration File

--- a/website/source/docs/agent/services.html.markdown
+++ b/website/source/docs/agent/services.html.markdown
@@ -0,0 +1,54 @@
+---
+layout: "docs"
+page_title: "Service Definition"
+sidebar_current: "docs-agent-services"
+---
+
+# Services
+
+One of the main goals of service discovery is to provide a catalog of available
+services. To that end, the agent provides a simple service definition format
+to declare the availability of a service, and to potentially associate it with
+a health check. A health check is considered to be application level if it
+is associated with a service. A service is defined in a configuration file,
+or added at runtime over the HTTP interface.
+
+## Service Definition
+
+A service definition with a script check looks like:
+
+    {
+        "service": {
+            "name": "redis",
+            "tag": "master",
+            "port": 8000,
+            "check": {
+                "script": "/usr/local/bin/check_redis.py",
+                "interval": "10s"
+            }
+        }
+    }
+
+A service definition must include a `name`, and may optionally provide
+an `id`, `tag`, `port`, and `check`.  The `id` is set to the `name` if not
+provided. It is required that all services have a unique ID, so if names
+might conflict, then unique IDs should be provided.
+
+The `tag` is an opaque value to Consul, but can be used to distinguish
+between "master" or "slave" nodes, or any other service level labels.
+The `port` can be used as well to make a service oriented architecture
+simpler to configure. This way the address and port of a service can
+be discovered.
+
+Lastly, a service can have an associated health check. This is a powerful
+feature as it allows a load balancer to gracefully remove failing nodes, or
+a database to replace a failed slave, etc. The health check is tightly integrated
+with the DNS interface as well. If a service is failing its health check or
+a node has any failing system-level check, the DNS interface will omit that
+node from any service query.
+
+There is more information about [checks here](/docs/agent/checks.html). The
+check must be of the script or TTL type. If it is a script type, `script` and
+`interval` must be provided. If it is a TTL type, then only `ttl` must be
+provided. The check name is automatically generated as "service:<service-id>".
+
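
Reviewer sketch of the defaulting rules stated in services.html.markdown above: `name` is required, `id` falls back to `name`, a check needs either `ttl` or both `script` and `interval`, and the check name is generated as `service:<service-id>`. The helper `normalize_service` is hypothetical and only mirrors the prose; it is not how the agent implements this.

```python
def normalize_service(definition):
    """Return a copy of the service with the documented defaults applied."""
    service = dict(definition["service"])
    if not service.get("name"):
        raise ValueError("a service definition must include a name")
    # The id is set to the name if not provided.
    service.setdefault("id", service["name"])

    if "check" in service:
        check = dict(service["check"])
        is_ttl = "ttl" in check
        is_script = "script" in check and "interval" in check
        if not (is_ttl or is_script):
            raise ValueError("check needs `ttl`, or both `script` and `interval`")
        # The check name is generated from the service id.
        check["name"] = "service:" + service["id"]
        service["check"] = check
    return service


redis = normalize_service({
    "service": {
        "name": "redis",
        "tag": "master",
        "port": 8000,
        "check": {"script": "/usr/local/bin/check_redis.py", "interval": "10s"},
    }
})
print(redis["id"], redis["check"]["name"])
```

For the redis definition from the page, the id falls back to `redis` and the check is named `service:redis`.
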
--- a/website/source/layouts/docs.erb
+++ b/website/source/layouts/docs.erb
@@ -79,6 +79,14 @@

 					<li<%= sidebar_current("docs-agent-config") %>>
 					<a href="/docs/agent/options.html">Configuration</a>
+                    </li>
+
+					<li<%= sidebar_current("docs-agent-services") %>>
+					<a href="/docs/agent/services.html">Service Definitions</a>
+                    </li>
+
+					<li<%= sidebar_current("docs-agent-checks") %>>
+					<a href="/docs/agent/checks.html">Check Definitions</a>
 					</li>

 					<li<%= sidebar_current("docs-agent-encryption") %>>