open-consul/website/content/api-docs/operator/autopilot.mdx
Jared Kirschner daf5e3ea10 docs: split HTTP API params into sections by type
Path parameters, query parameters, and request body parameters are now shown in
separate sections rather than combined into one general parameters section.
This makes it much easier to understand quickly where a parameter should be
provided.
2022-05-25 14:45:47 -07:00

488 lines
18 KiB
Plaintext

---
layout: api
page_title: Autopilot - Operator - HTTP API
description: |-
The /operator/autopilot endpoints allow for automatic operator-friendly
management of Consul servers including cleanup of dead servers, monitoring
the state of the Raft cluster, and stable server introduction.
---
# Autopilot Operator HTTP API
The `/operator/autopilot` endpoints allow for automatic operator-friendly
management of Consul servers including cleanup of dead servers, monitoring
the state of the Raft cluster, and stable server introduction.
Please check the [Autopilot tutorial](https://learn.hashicorp.com/tutorials/consul/autopilot-datacenter-operations) for more details.
## Read Configuration
This endpoint retrieves its latest Autopilot configuration.
| Method | Path | Produces |
| ------ | ----------------------------------- | ------------------ |
| `GET` | `/operator/autopilot/configuration` | `application/json` |
The table below shows this endpoint's support for
[blocking queries](/api-docs/features/blocking),
[consistency modes](/api-docs/features/consistency),
[agent caching](/api-docs/features/caching), and
[required ACLs](/api#authentication).
| Blocking Queries | Consistency Modes | Agent Caching | ACL Required |
| ---------------- | ----------------- | ------------- | --------------- |
| `NO` | `none` | `none` | `operator:read` |
The corresponding CLI command is [`consul operator autopilot get-config`](/commands/operator/autopilot#get-config).
### Query Parameters
- `dc` `(string: "")` - Specifies the datacenter to query. This will default to
the datacenter of the agent being queried.
- `stale` `(bool: false)` - If the cluster does not currently have a leader an
error will be returned. You can use the `?stale` query parameter to read the
Raft configuration from any of the Consul servers.
### Sample Request
```shell-session
$ curl \
http://127.0.0.1:8500/v1/operator/autopilot/configuration
```
### Sample Response
```json
{
"CleanupDeadServers": true,
"LastContactThreshold": "200ms",
"MaxTrailingLogs": 250,
"ServerStabilizationTime": "10s",
"RedundancyZoneTag": "",
"DisableUpgradeMigration": false,
"UpgradeVersionTag": "",
"CreateIndex": 4,
"ModifyIndex": 4
}
```
For more information about the Autopilot configuration options, see the
[agent configuration section](/docs/agent/config/config-files#autopilot).
## Update Configuration
This endpoint updates the Autopilot configuration of the cluster.
| Method | Path | Produces |
| ------ | ----------------------------------- | ------------------ |
| `PUT` | `/operator/autopilot/configuration` | `application/json` |
The table below shows this endpoint's support for
[blocking queries](/api-docs/features/blocking),
[consistency modes](/api-docs/features/consistency),
[agent caching](/api-docs/features/caching), and
[required ACLs](/api#authentication).
| Blocking Queries | Consistency Modes | Agent Caching | ACL Required |
| ---------------- | ----------------- | ------------- | ---------------- |
| `NO` | `none` | `none` | `operator:write` |
The corresponding CLI command is [`consul operator autopilot set-config`](/commands/operator/autopilot#set-config).
### Query Parameters
- `dc` `(string: "")` - Specifies the datacenter to query. This will default to
the datacenter of the agent being queried.
- `cas` `(int: 0)` - Specifies to use a Check-And-Set operation. The update will
only happen if the given index matches the `ModifyIndex` of the configuration
at the time of writing.
### JSON Request Body Schema
- `CleanupDeadServers` `(bool: true)` - Specifies automatic removal of dead
server nodes periodically and whenever a new server is added to the cluster.
- `LastContactThreshold` `(string: "200ms")` - Specifies the maximum amount of
time a server can go without contact from the leader before being considered
unhealthy. Must be a duration value such as `10s`.
- `MaxTrailingLogs` `(int: 250)` specifies the maximum number of log entries
that a server can trail the leader by before being considered unhealthy.
- `MinQuorum` `(int: 0)` - specifies the minimum number of servers needed before
Autopilot can prune dead servers.
- `ServerStabilizationTime` `(string: "10s")` - Specifies the minimum amount of
time a server must be stable in the 'healthy' state before being added to the
cluster. Only takes effect if all servers are running Raft protocol version 3
or higher. Must be a duration value such as `30s`.
- `RedundancyZoneTag` `(string: "")` - Controls the node-meta key to use when
Autopilot is separating servers into zones for redundancy. Only one server in
each zone can be a voting member at one time. If left blank, this feature will
be disabled.
- `DisableUpgradeMigration` `(bool: false)` - Disables Autopilot's upgrade
migration strategy in Consul Enterprise of waiting until enough
newer-versioned servers have been added to the cluster before promoting any of
them to voters.
- `UpgradeVersionTag` `(string: "")` - Controls the node-meta key to use for
version info when performing upgrade migrations. If left blank, the Consul
version will be used.
### Sample Payload
```json
{
"CleanupDeadServers": true,
"LastContactThreshold": "200ms",
"MaxTrailingLogs": 250,
"MinQuorum": 3,
"ServerStabilizationTime": "10s",
"RedundancyZoneTag": "",
"DisableUpgradeMigration": false,
"UpgradeVersionTag": "",
"CreateIndex": 4,
"ModifyIndex": 4
}
```
## Read Health
This endpoint queries the health of the autopilot status.
| Method | Path | Produces |
| ------ | ---------------------------- | ------------------ |
| `GET` | `/operator/autopilot/health` | `application/json` |
The table below shows this endpoint's support for
[blocking queries](/api-docs/features/blocking),
[consistency modes](/api-docs/features/consistency),
[agent caching](/api-docs/features/caching), and
[required ACLs](/api#authentication).
| Blocking Queries | Consistency Modes | Agent Caching | ACL Required |
| ---------------- | ----------------- | ------------- | --------------- |
| `NO` | `none` | `none` | `operator:read` |
### Query Parameters
- `dc` `(string: "")` - Specifies the datacenter to query. This will default to
the datacenter of the agent being queried.
### Sample Request
```shell-session
$ curl \
http://127.0.0.1:8500/v1/operator/autopilot/health
```
### Sample response
```json
{
"Healthy": true,
"FailureTolerance": 0,
"Servers": [
{
"ID": "e349749b-3303-3ddf-959c-b5885a0e1f6e",
"Name": "node1",
"Address": "127.0.0.1:8300",
"SerfStatus": "alive",
"Version": "0.7.4",
"Leader": true,
"LastContact": "0s",
"LastTerm": 2,
"LastIndex": 46,
"Healthy": true,
"Voter": true,
"StableSince": "2017-03-06T22:07:51Z"
},
{
"ID": "e36ee410-cc3c-0a0c-c724-63817ab30303",
"Name": "node2",
"Address": "127.0.0.1:8205",
"SerfStatus": "alive",
"Version": "0.7.4",
"Leader": false,
"LastContact": "27.291304ms",
"LastTerm": 2,
"LastIndex": 46,
"Healthy": true,
"Voter": false,
"StableSince": "2017-03-06T22:18:26Z"
}
]
}
```
- `Healthy` is whether all the servers are currently healthy.
- `FailureTolerance` is the number of redundant healthy servers that could be
fail without causing an outage (this would be 2 in a healthy cluster of 5
servers).
- `Servers` holds detailed health information on each server:
- `ID` is the Raft ID of the server.
- `Name` is the node name of the server.
- `Address` is the address of the server.
- `SerfStatus` is the SerfHealth check status for the server.
- `Version` is the Consul version of the server.
- `Leader` is whether this server is currently the leader.
- `LastContact` is the time elapsed since this server's last contact with the leader.
- `LastTerm` is the server's last known Raft leader term.
- `LastIndex` is the index of the server's last committed Raft log entry.
- `Healthy` is whether the server is healthy according to the current Autopilot configuration.
- `Voter` is whether the server is a voting member of the Raft cluster.
- `StableSince` is the time this server has been in its current `Healthy` state.
The HTTP status code will indicate the health of the cluster. If `Healthy` is true, then a
status of 200 will be returned. If `Healthy` is false, then a status of 429 will be returned.
## Read the Autopilot State
This endpoint queries the health of the autopilot status.
| Method | Path | Produces |
| ------ | --------------------------- | ------------------ |
| `GET` | `/operator/autopilot/state` | `application/json` |
The table below shows this endpoint's support for
[blocking queries](/api-docs/features/blocking),
[consistency modes](/api-docs/features/consistency),
[agent caching](/api-docs/features/caching), and
[required ACLs](/api#authentication).
| Blocking Queries | Consistency Modes | Agent Caching | ACL Required |
| ---------------- | ----------------- | ------------- | --------------- |
| `NO` | `none` | `none` | `operator:read` |
The corresponding CLI command is [`consul operator autopilot state`](/commands/operator/autopilot#state).
### Query Parameters
- `dc` `(string: "")` - Specifies the datacenter to query. This will default to
the datacenter of the agent being queried.
### Sample Request
```shell-session
$ curl \
http://127.0.0.1:8500/v1/operator/autopilot/state
```
### Response Format
```json
{
"Healthy": true,
"FailureTolerance": 1,
"OptimisticFailureTolerance": 4,
"Servers": {
"5e26a3af-f4fc-4104-a8bb-4da9f19cb278": {},
"10b71f14-4b08-4ae5-840c-f86d39e7d330": {},
"1fd52e5e-2f72-47d3-8cfc-2af760a0c8c2": {},
"63783741-abd7-48a9-895a-33d01bf7cb30": {},
"6cf04fd0-7582-474f-b408-a830b5471285": {}
},
"Leader": "5e26a3af-f4fc-4104-a8bb-4da9f19cb278",
"Voters": [
"5e26a3af-f4fc-4104-a8bb-4da9f19cb278",
"10b71f14-4b08-4ae5-840c-f86d39e7d330",
"1fd52e5e-2f72-47d3-8cfc-2af760a0c8c2"
],
"RedundancyZones": {
"az1": {},
"az2": {},
"az3": {}
},
"ReadReplicas": [
"63783741-abd7-48a9-895a-33d01bf7cb30",
"6cf04fd0-7582-474f-b408-a830b5471285"
],
"Upgrade": {}
}
```
- `Healthy` is whether all the servers are currently healthy.
- `FailureTolerance` is the number of redundant healthy servers that could be
fail without causing an outage (this would be 2 in a healthy cluster of 5
servers).
- `OptimisticFailuretolerance` <EnterpriseAlert inline /> is the maximum number
of servers that could fail in the right order over the right period of time
without causing an outage. This value is only useful when using the [Redundancy
Zones feature](/docs/enterprise/redundancy) with autopilot.
- `Servers` is a mapping of server ID to an object holding detailed information about that server.
The format of the detailed info is documented in its own section.
- `Leader` is the server ID of current leader. This value can be used as an index into the `Servers` object.
- `Voters` is a list of server IDs that are voters. These values can be used as indexes into the `Servers` object.
- `RedundancyZones` <EnterpriseAlert inline /> is mapping of redundancy zone name to redundancy zone information.
The format of the redundancy zone information is documented in its own section.
- `ReadReplicas` <EnterpriseAlert inline /> is a list of server IDs that autopilot has identified as read replicas.
These will never be promoted. These values can be used as indexes into the `Servers` map.
- `Upgrade` <EnterpriseAlert inline /> is an object holding all the information about any ongoing automated upgrade.
The format of this object is detailed in its own section.
### Server Response Format
```json
{
"ID": "1c3e3278-3f88-4a97-9f6a-1058584e8058",
"Name": "node1",
"Address": "198.18.0.1:8300",
"NodeStatus": "alive",
"Version": "1.9.0+ent",
"LastContact": "1.321ms",
"LastTerm": 4,
"LastIndex": 42,
"Healthy": true,
"StableSince": "2020-08-12T12:13:14Z",
"RedundancyZone": "az1",
"UpgradeVersion": "1.2.3",
"ReadReplica": false,
"Status": "voter",
"Meta": {
"build": "1.2.3",
"zone": "az1"
},
"NodeType": "redundancy-zone-voter"
}
```
- `ID` is the Raft ID of the server.
- `Name` is the node name of the server.
- `Address` is the address of the server.
- `NodeStatus` is the SerfHealth check status for the server.
- `Version` is the Consul version of the server.
- `LastContact` is the time elapsed since this server's last contact with the leader.
- `LastTerm` is the server's last known Raft leader term.
- `LastIndex` is the index of the server's last committed Raft log entry.
- `Healthy` is whether the server is healthy according to the current Autopilot configuration.
- `StableSince` is the time this server has been in its current `Healthy` state.
- `RedundancyZone` <EnterpriseAlert inline /> is the name of the redundancy zone this server is within.
- `UpgradeVersion` <EnterpriseAlert inline /> is the version that will be used for automated upgrade calculations.
- `ReadReplica` <EnterpriseAlert inline /> indicates whether this server is a read replica or not.
- `Status` indicates the current Raft status of this server. Possible values are:
`leader`, `voter`, `non-voter`, or `staging`.
- `Meta` is the node metadata of this server. Values within this map are used for determining a server's
redundancy zone and upgrade version.
- `NodeType` is the desired type autopilot thinks this server should have. In Consul OSS the only possible
value is `voter` as all present servers should having voting rights. In Consul Enterprise the possible values also
include `read-replica`, `zone-voter`, `zone-standby` and `zone-extra-voter`. `zone-voter` indicates that autopilot
wants this server to be the voter for a particular redundancy zone. When a zone has no voter all nodes will be typed
as this until one is promoted. When that happens the other non-voters in the zone will be typed as `zone-standby`.
This indicates that they are currently desired to be standby servers in case the voter from the zone fails. Finally,
the `zone-extra-voter` status indicates that autopilot wants this server to be a voter due to a failure of all servers
in another zone and that when one of the servers in that failed zone are restored, this server will be demoted.
### Redundancy Zone Response Format <EnterpriseAlert inline />
```json
{
"Servers": [
"10b71f14-4b08-4ae5-840c-f86d39e7d330",
"b007061c-6d15-4c90-b3d6-2fef276a0650"
],
"Voters": ["b007061c-6d15-4c90-b3d6-2fef276a0650"],
"FailureTolerance": 1
}
```
Each zone in the responses `RedundancyZones` mapping will have this structure.
- `Servers` is a list of server IDs of all the servers in this zone. These values can be used as indexes
into the top level response's `Servers` mapping.
- `Voters` is a list of server IDs of all servers in this zone that have voting rights. Typically this will
be a list with 1 value but in some failure scenarios or upgrade scenarios the size could increase. These
values can be used as indexes into the top level response's `Servers` mapping.
- `FailureTolerance` is the number of servers in this zone that could fail without causing a total zone failure
and subsequent promotion of a server from another zone as a fallback.
### Upgrade Information Response Format <EnterpriseAlert inline />
```json
{
"Status": "awaiting-new-servers",
"TargetVersion": "1.9.1+ent",
"TargetVersionVoters": ["f0344689-3e1f-4125-b55d-e888d3abf514"],
"TargetVersionNonVoters": [
"619a4ba6-1a0b-476e-8a1a-28aeee7735a2",
"fd683fe6-541f-4ebf-bc5a-6eae51571ddb"
],
"TargetVersionReadReplicas": ["9f1e27ae-1129-45ef-97dd-6d8c3ec47e6a"],
"OtherVersionVoters": [
"0cbdd493-235f-48f2-98d9-1bf2443b9d72",
"21812bd7-2f21-4565-9892-2fdd3d4e1a99",
"c654ba5c-cc76-4056-a5ca-6e78d95f27ad"
],
"OtherVersionNonVoters": [
"6d973f11-6bdb-4f7d-8a90-c1300066da4c",
"6241ab45-371e-4b2a-a0f1-d847c3b7b1b0"
],
"OtherVersionReadReplicas": ["42d10fc3-581b-4403-832d-945b3a0d8841"]
}
```
- `Status` is the automated upgrade status. Possible values are:
- `disabled` indicates that automated upgrades are disabled either from user configuration or due to being unlicensed.
- `idle` indicates that there is no ongoing upgrade and that all servers are running the same Consul version.
- `await-new-voters` indicates that a newer versioned server has been added but that autopilot is waiting for more servers
of that version to be added before proceeding with the upgrade.
- `promoting` indicates that enough servers of the target version have been added and autopilot will now promote them
to voters.
- `demoting` indicates that autopilot is currently demoting the servers not running the target version.
- `leader-transfer` indicates that autopilot is in the process of transferring leadership to a server running
the target version.
- `await-new-servers` indicates that the majority of the upgrade is complete but that more servers running the target
version need to be added to completely replace all of the previous servers.
- `await-server-removal` indicates that the upgrade is complete and it is now safe to remove the previous servers.
- `TargetVersion` is the version that Autopilot is upgrading to. This will be the maximum version of all servers
`UpgradeVersion` field in the top level `Servers` mapping.
- `TargetVersionVoters` is a list of IDs of servers running the target version and that currently have voting rights.
- `TargetVersionNonVoters` is a list of IDs of servers running the target version and that currently do not have voting rights.
- `TargetVersionReadReplicas` is a list of IDs of servers running the target version and are read replicas.
- `OtherVersionVoters` is a list of IDs of servers not running the target version and that currently have voting rights.
- `OtherVersionNonVoters` is a list of IDs of servers not running the target version and that currently do not have voting rights.
- `OtherVersionReadReplicas` is a list of IDs of servers not running the target version and are read replicas.