Commit Graph

321 Commits

Author SHA1 Message Date
Kyle Havlovitz 37ea20cb44
Add advanced autopilot features 2017-03-22 15:25:16 -07:00
James Phillips 9b43b31900
Forces user-supplied node IDs to lower case for consistency. 2017-03-13 19:51:56 -07:00
Kyle Havlovitz 8bcab6c6d7
Add autopilot server health tracking
This adds two goroutines to perform autopilot tasks on the leader - one
to monitor the health of servers and another to periodically clean up
dead servers with a limit on removal count. Also adds a new http endpoint,
`/v1/operator/autopilot/health`, for querying this information through an
operator RPC endpoint.
2017-03-06 16:00:10 -08:00
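A minimal sketch of the pattern this commit describes: two periodic goroutines, one refreshing server health and one removing dead servers with a cap on removals per pass. The names `serverHealth`, `pickRemovals`, and `runEvery` are hypothetical and chosen for illustration only; they are not the identifiers used in Consul.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// serverHealth is a hypothetical record; the real autopilot types differ.
type serverHealth struct {
	ID      string
	Healthy bool
}

// pickRemovals returns the IDs of unhealthy servers, capped at maxRemovals
// so that a single cleanup pass cannot remove too many servers at once.
func pickRemovals(servers []serverHealth, maxRemovals int) []string {
	var dead []string
	for _, s := range servers {
		if len(dead) >= maxRemovals {
			break
		}
		if !s.Healthy {
			dead = append(dead, s.ID)
		}
	}
	return dead
}

// runEvery calls task on each tick until stop is closed.
func runEvery(interval time.Duration, stop <-chan struct{}, task func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			task()
		case <-stop:
			return
		}
	}
}

func main() {
	var mu sync.Mutex
	servers := []serverHealth{{"a", true}, {"b", false}, {"c", false}}

	stop := make(chan struct{})
	var wg sync.WaitGroup
	wg.Add(2)

	// Goroutine 1: stand-in for health monitoring (here it only re-marks "a").
	go func() {
		defer wg.Done()
		runEvery(10*time.Millisecond, stop, func() {
			mu.Lock()
			defer mu.Unlock()
			servers[0].Healthy = true
		})
	}()

	// Goroutine 2: periodic cleanup that selects at most one dead server.
	go func() {
		defer wg.Done()
		runEvery(25*time.Millisecond, stop, func() {
			mu.Lock()
			defer mu.Unlock()
			fmt.Println("would remove:", pickRemovals(servers, 1))
		})
	}()

	time.Sleep(60 * time.Millisecond)
	close(stop)
	wg.Wait()
}
```

The cap keeps a burst of failures from shrinking the server set all at once; the sketch does not attempt to model quorum checks.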
Kyle Havlovitz 44f0b08db7 Merge pull request #2771 from hashicorp/f-autopilot
Autopilot dead server cleanup, config, and raft version compatibility
2017-02-28 15:04:16 -08:00
Kyle Havlovitz 23c492a74e
Rename DeadServerCleanup and make wording adjustments 2017-02-28 14:45:21 -08:00
Kyle Havlovitz 3b67c50c1d
Remove the RPC client interface and update docs 2017-02-28 13:41:09 -08:00
Kyle Havlovitz 56e22a719f
Add state store table and endpoints for autopilot 2017-02-23 20:32:13 -08:00
Kyle Havlovitz f9588b8d7f
Add raft version 2/3 compatibility 2017-02-22 12:53:32 -08:00
Kyle Havlovitz f48cd78bc5 Merge pull request #2699 from hashicorp/f-tls-min-version
Add TLSMinVersion to config options
2017-02-01 16:31:53 -05:00
Kyle Havlovitz 086975ac0a
Add TLSMinVersion to config options 2017-02-01 16:20:33 -05:00
Sean Chittenden 9e2557c53c
Re-cherry-pick 71d807f607589f2eb4fea4e83e3876d122c8afc0 and e2320d69b6b155d8223758415aabafc60a0e9d3b. 2017-02-01 10:27:04 -08:00
James Phillips d1824ce361 Revert "Adds gopsutil in the loop when trying to make the node ID." 2017-01-31 19:13:49 -08:00
James Phillips e7b683dad8
Gets rid of a goto. 2017-01-31 19:02:25 -08:00
James Phillips d9b999109a
Adds gopsutil into node ID process and attempts to use host ID, if available. 2017-01-31 08:51:33 -08:00
Kyle Havlovitz cc271aa436
Merge branch 'master' into f-prepared-query-nodemeta 2017-01-23 20:17:48 -05:00
Kyle Havlovitz 3cd054e81f
Add tests for node meta in prepared queries and update docs 2017-01-23 19:17:30 -05:00
James Phillips 96bff003b7
Adds basic support for node IDs. 2017-01-17 22:47:59 -08:00
Kyle Havlovitz 51a1834667
Add -node-meta to agent command line options 2017-01-11 16:09:04 -05:00
Kyle Havlovitz d77890a011
Validate metadata config earlier and handle multiple filters 2017-01-11 15:12:03 -05:00
Kyle Havlovitz 6b5cf20b1c
Fix formatting 2017-01-09 13:49:33 -08:00
Kyle Havlovitz aee766baba
Add meta key validations and more tests 2017-01-09 11:21:49 -08:00
Kyle Havlovitz b25f4c7d35
Add tests for node metadata functionality 2017-01-05 17:21:56 -08:00
Kyle Havlovitz 2a423c6e2c
Add support for setting node metadata fields 2017-01-05 14:10:26 -08:00
James Phillips 927aa3789a
Adds ACL management support to the agent. 2016-12-14 07:07:41 -08:00
James Phillips 8db53b4ac2
Adds complete ACL coverage for /v1/coordinate/nodes and Coordinate.Update RPC. 2016-12-12 14:52:27 -08:00
James Phillips 67f1fe8d0b
Adds support for a new "acl_agent_token" which is used for internal
catalog operations.
2016-12-12 14:52:27 -08:00
James Phillips 2b19a5340c
Removes the exception for the "consul" service in the catalog. 2016-12-07 17:58:23 -08:00
Sean Chittenden 1ca285221a
Run all known addresses through go-sockaddr/template.
The following is now possible:

```
$ consul agent -dev -client="{{GetPrivateIP}}" -bind='{{GetInterfaceIP "en0"}}'
```
2016-12-02 16:35:38 +11:00
Kyle Havlovitz dd05afb32e Add reload/leave http endpoints (#2516) 2016-11-30 13:29:42 -05:00
Seth Vargo 1c55429a38
Add an API method for determining the best status
Given a list of HealthChecks, this determines the "best" status for the
collective group. This is useful for nodes and services, which may have
multiple checks associated with them.
2016-11-29 18:41:46 -05:00
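One way to collapse several checks into a single status is sketched below. It assumes the most severe status wins (critical over warning over passing) and that an empty list counts as passing; both rules are assumptions for illustration, and `HealthCheck` and `aggregatedStatus` are hypothetical names rather than the API this commit added.

```go
package main

import "fmt"

// HealthCheck is a simplified stand-in for a check record.
type HealthCheck struct {
	Name   string
	Status string
}

// aggregatedStatus reduces a group of checks to one status.
// Assumed precedence: critical > warning > passing.
func aggregatedStatus(checks []HealthCheck) string {
	var warning, critical bool
	for _, c := range checks {
		switch c.Status {
		case "critical":
			critical = true
		case "warning":
			warning = true
		}
	}
	switch {
	case critical:
		return "critical"
	case warning:
		return "warning"
	default:
		return "passing"
	}
}

func main() {
	checks := []HealthCheck{{"disk", "passing"}, {"http", "warning"}}
	fmt.Println(aggregatedStatus(checks)) // prints "warning"
}
```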
Kyle Havlovitz ccab51b07c Add logWriter to agent Create() method 2016-11-28 18:36:26 -05:00
Kyle Havlovitz d91854f3b0 Add monitor http endpoint 2016-11-28 18:36:26 -05:00
Kyle Havlovitz 9b6847d13f Use uuids in persist temp files to avoid race (#2494) 2016-11-09 15:22:53 -08:00
Kyle Havlovitz b170282d0f Improve logging when deregistering a nonexistent service (#2492)
Log a warning instead of a success message when attempting to deregister a nonexistent service. In Consul 0.8 this can be changed to giving an error outright, but for now we can keep the idempotent delete behavior.
2016-11-09 16:56:54 -05:00
Kyle Havlovitz d71cdf9dc2 Merge pull request #2480 from hashicorp/b-atomic-writes
Atomic writes for persisting service/check state
2016-11-07 15:36:35 -05:00
Kyle Havlovitz 06fac8a918
Add a note about not calling sync for persistCheckState 2016-11-07 15:24:31 -05:00
Kyle Havlovitz 6b6601093c
Call fsync() for saving check/service state 2016-11-07 13:51:03 -05:00
Kyle McCullough 697ea0f8eb Add setting to skip ssl certificate verification for HTTP checks (#1984)
* http check: add setting to skip ssl certificate verification

* update http check documentation

* fix typo in documentation

* Add TLSSkipVerify to agent api
2016-11-03 13:17:30 -07:00
James Phillips 18926728d4 Supports WAN and LAN Serf Bind Addresses. (#2468)
* * adding cli config and config file support for specifying the serf wan and lan bind addresses
* updating documentation for serf wan and lan options
Fixes #2007

* Cleans up some small things from #2380.

* Uses the bind default for the agent test for Serf WAN and LAN.
2016-11-03 12:58:58 -07:00
James Phillips bc29610124 Adds support for snapshots and restores. (#2396)
* Updates Raft library to get new snapshot/restore API.

* Basic backup and restore working, but need some cleanup.

* Breaks out a snapshot module and adds a SHA256 integrity check.

* Adds snapshot ACL and fills in some missing comments.

* Require a consistent read for snapshots.

* Make sure snapshot works if ACLs aren't enabled.

* Adds a bit of package documentation.

* Returns an empty response from restore to avoid EOF errors.

* Adds API client support for snapshots.

* Makes internal file names match on-disk file snapshots.

* Adds DC and token coverage for snapshot API test.

* Adds missing documentation.

* Adds a unit test for the snapshot client endpoint.

* Moves the connection pool out of the client for easier testing.

* Fixes an incidental issue in the prepared query unit test.

I realized I had two servers in bootstrap mode so this wasn't a good setup.

* Adds a half close to the TCP stream and fixes panic on error.

* Adds client and endpoint tests for snapshots.

* Moves the pool back into the snapshot RPC client.

* Adds a TLS test and fixes half-closes for TLS connections.

* Tweaks some comments.

* Adds a low-level snapshot test.

This is independent of Consul so we can pull this out into a library
later if we want to.

* Cleans up snapshot and archive and completes archive tests.

* Sends a clear error for snapshot operations in dev mode.

Snapshots require the Raft snapshots to be readable, which isn't supported
in dev mode. Send a clear error instead of a deep-down Raft one.

* Adds docs for the snapshot endpoint.

* Adds a stale mode and index feedback for snapshot saves.

This gives folks a way to extract data even if the cluster has no
leader.

* Changes the internal format of a snapshot from zip to tgz.

* Pulls in Raft fix to cancel inflight before a restore.

* Pulls in new Raft restore interface.

* Adds metadata to snapshot saves and a verify function.

* Adds basic save and restore snapshot CLI commands.

* Gets rid of tarball extensions and adds restore message.

* Fixes an incidental bad link in the KV docs.

* Adds documentation for the snapshot CLI commands.

* Scuttle any request body when a snapshot is saved.

* Fixes archive unit test error message check.

* Allows for nil output writers in snapshot RPC handlers.

* Renames hash list Decode to DecodeAndVerify.

* Closes the client connection for snapshot ops.

* Lowers timeout for restore ops.

* Updates Raft vendor to get new Restore signature and integrates with Consul.

* Bounces the leader's internal state when we do a restore.
2016-10-25 19:20:24 -07:00
James Phillips dace771f06 Merge pull request #2389 from hashicorp/jbs-2019
Lower Service tag DNS warning to DEBUG for #2019
2016-10-24 17:05:02 -07:00
Brian Shumate 36e1a194c2
Lower Service tag DNS warning to DEBUG for #2019 2016-10-05 08:45:01 -04:00
Adam Wolfe Gordon ae5bd0f2cc agent: Stop reaping child processes (resolves #1988)
The consul docker image now uses dumb-init to reap child processes, so
there's no need to reap them ourselves.
2016-10-04 09:36:41 -06:00
James Phillips b339b0d2fc
Adds performance tuning capability for Raft, detuned defaults, and supplemental docs. 2016-08-24 21:58:37 -07:00
James Phillips db9a72dde1
Merge branch 'master' into f-deregister-critical 2016-08-16 12:53:21 -07:00
James Phillips 607595f99e
Cleans up based on code review feedback. 2016-08-16 12:52:30 -07:00
James Phillips 817d7e93e4
Adds a "lan" tagged address so we have a way to get them all.
If we didn't have this, then there would be no way to know the LAN
address if address translation was turned on.
2016-08-16 10:49:03 -07:00
James Phillips b4f981c837
Adds ability to deregister a service based on critical check state longer than a timeout. 2016-08-16 01:00:26 -07:00
James Phillips 9cece515c0
Adds basic ACL replication plumbing. 2016-08-03 21:24:04 -07:00
Cameron Davison 40d1c279dd
atomic write service state and checks files, fixes #1221 2016-08-03 10:34:12 -05:00
Sean Chittenden c90fc057d4
Give log reviewers a hint as to which check is failing 2016-06-20 15:25:21 -07:00
Sean Chittenden ff45f8c8ff
Revert "Move `structs.CheckID` to a new top-level package, `types`."
This reverts commit 2bbd52e3b44ff1b60939a8400264d534662d6d51.
2016-06-07 16:59:02 -04:00
Sean Chittenden a4554b945c
Move `structs.CheckID` to a new top-level package, `types`.
Per discussion w/ @slackpad, move this type to its own top-level package
2016-06-07 16:59:02 -04:00
Sean Chittenden cd68cd3868
Move `structs.CheckID` to a new top-level package, `types`.
Per discussion w/ @slackpad, move this type to its own top-level package
2016-06-07 16:59:02 -04:00
Sean Chittenden 0857e93d0b
Float a type balloon. Some strings are square pegs in round holes.
This experiment was brought about because of variable naming
confusion where name and checkIDs were interchanged.  Gave CheckID
a Qualified Type Name and chased downstream changes.
2016-06-07 16:59:02 -04:00
James Phillips 0f7f07d2f5 Fixes some bad error returns in the persist service and check paths. 2016-04-26 15:03:26 -07:00
James Phillips 03b0c196e0 Merge pull request #1762 from mshean/script-timeout
Add Timeout field to CheckMonitor
2016-04-24 23:08:06 -07:00
Matt Shean 7e617d6086 add Timeout field to CheckMonitor 2016-04-20 11:41:30 -07:00
James Phillips 32389a9822 Makes reap time configurable for LAN and WAN. 2016-04-11 00:38:25 -07:00
Wim 508bc796a8 Allow [::] as a bind address (binds to first public IPv6 address) 2016-03-18 23:59:44 +01:00
James Phillips 79bd1fd4bb Sets up config for more address tags down the road, renames struct members. 2016-02-07 10:37:34 -08:00
Evan Gilman d123b4dece Rectify value of `AdvertiseAddrWan` when set elsewhere
`AdvertiseAddrs` has been introduced as a configuration option, which
duplicates a few other options, namely `AdvertiseAddrWan`. We need to
use this value elsewhere, so rather than doing a precedence check every
time we need to access it, rectify the value of `AdvertiseAddrWan` to
match
2016-02-06 23:01:45 -08:00
Sean Chittenden c470553b6b Factor out duplicate functions into a lib package
Consolidate code duplication and tests into a single lib package.  Most of these functions were from various **/util.go functions that couldn't be imported due to cyclic imports.  The consul/lib package is intended to be a terminal node in an import DAG and a place to stash various consul-only helper functions.  Pulled in hashicorp/go-uuid instead of consolidating UUID access.
2016-01-29 16:57:45 -08:00
James Phillips d22fe5ba62 Adds support for the reap lock. 2016-01-12 21:10:25 -08:00
Ryan Uber ce4cf8a542 consul: dev mode works 2015-12-26 20:19:36 -05:00
James Phillips 23536f2b97 Adds Docker checks support to client API.
Also changed `DockerContainerId` to `DockerContainerID`, and updated the agent
API docs to reflect their support for Docker checks.
2015-11-18 07:40:02 -08:00
James Phillips 38daaea503 Adds a slightly more flexible mock system so we can test DNS. 2015-11-15 17:06:00 -08:00
James Phillips c1305a08ea Makes the version upshift code look at the correct version field. 2015-10-27 14:44:34 -07:00
Diptanu Choudhury 8af6ac8f16 Making an explicit check to test whether a check is of type Monitor 2015-10-26 19:52:32 -07:00
Diptanu Choudhury fd6d57ccac Not adding the docker check if we couldn't create the client 2015-10-26 16:45:12 -07:00
Diptanu Choudhury 0de0abcaec Defaulting to Monitor check 2015-10-26 15:02:23 -07:00
Diptanu Choudhury c4b85010fb Implemented Docker health checks 2015-10-26 10:23:57 -07:00
James Phillips aea2194ce3 Makes the default protocol 2 and lets 3 interoperate with 2. 2015-10-23 15:23:01 -07:00
James Phillips 9ba9a708f6 Scales coordinate sends to hit a fixed aggregate rate across the cluster. 2015-10-23 15:23:01 -07:00
James Phillips d8b8a3719f Simplifies the batching function and adds some comments. 2015-10-23 15:23:01 -07:00
James Phillips f71c79c53f Does some small cleanups based on PR feedback.
* Holds coordinate updates in map and gets rid of the update channel.
* Cleans up config variables a bit.
2015-10-23 15:23:01 -07:00
James Phillips acb0dce829 Moves batching down into the state store and changes it to fail-fast.
* A batch of updates is done all in a single transaction.
* We no longer need to get an update to kick things, there's a periodic flush.
* If incoming updates overwhelm the configured flush rate they will be dumped with an error.
2015-10-23 15:23:01 -07:00
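The fail-fast batching described in this commit might look roughly like the sketch below: updates accumulate in a map, a periodic flush drains the whole batch at once, and updates beyond a configured limit are rejected with an error instead of queued. The `batcher` type, its limit, and the error text are assumptions for illustration; the single state-store transaction is represented only by the map that `Flush` returns.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errRateExceeded = errors.New("coordinate update rate exceeded, dropping update")

// batcher holds the latest pending coordinate per node until the next flush.
type batcher struct {
	mu      sync.Mutex
	pending map[string][]float64
	max     int // hypothetical cap on distinct nodes per flush interval
}

func newBatcher(max int) *batcher {
	return &batcher{pending: make(map[string][]float64), max: max}
}

// Add records an update, failing fast when the batch is already full.
// A repeat update for a node already in the batch replaces the old value.
func (b *batcher) Add(node string, coord []float64) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	if _, ok := b.pending[node]; !ok && len(b.pending) >= b.max {
		return errRateExceeded
	}
	b.pending[node] = coord
	return nil
}

// Flush hands back the whole batch and starts a new one; a caller would
// apply the returned map in one transaction on a periodic timer.
func (b *batcher) Flush() map[string][]float64 {
	b.mu.Lock()
	defer b.mu.Unlock()
	out := b.pending
	b.pending = make(map[string][]float64)
	return out
}

func main() {
	b := newBatcher(2)
	fmt.Println(b.Add("node1", []float64{0.1, 0.2}))
	fmt.Println(b.Add("node2", []float64{0.3, 0.4}))
	fmt.Println(b.Add("node3", []float64{0.5, 0.6}))
	fmt.Println(len(b.Flush()))
}
```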
James Phillips b6c31bdf2f Flips the sense of the coordinate enable option. 2015-10-23 15:23:01 -07:00
James Phillips edb9a119e2 Does a clean up pass on the Consul side. 2015-10-23 15:23:01 -07:00
Derek Chiang 139c9240ea Address comments 2015-10-23 15:23:01 -07:00
Derek Chiang 840474f170 Add a test case 2015-10-23 15:23:01 -07:00
Derek Chiang 23c08aeeb4 Use IndexedCoordinate instead 2015-10-23 15:23:01 -07:00
Derek Chiang 530e73212a Some fixes 2015-10-23 15:23:01 -07:00
Derek Chiang b2cff43bb5 Complete logic for sending coordinates 2015-10-23 15:23:01 -07:00
Derek Chiang 66b210afcb Some fixes 2015-10-23 15:23:01 -07:00
Derek Chiang b5bbe2bcfa Adding tests and stuff 2015-10-23 15:23:01 -07:00
James Phillips 263c7e3fd3 Deletes the old state store and all its accoutrements. 2015-10-15 14:59:09 -07:00
Dale Wijnand c5168e1263 Fix a bunch of typos. 2015-09-15 13:22:08 +01:00
Ryan Uber e129a59316 agent: thread tokens through for maintenance mode 2015-09-10 11:43:59 -07:00
Peter Fern 86edd7169c Add TCP check type
Adds the ability to simply check whether a TCP socket accepts
connections to determine if it is healthy.  This is a light-weight -
though less comprehensive than scripting - method of checking network
service health.

The check parameter `tcp` should be set to the `address:port`
combination for the service to be tested.  Supports both IPv6 and IPv4;
in the case of a hostname that resolves to both, connections will be
attempted via both protocol versions, with the first successful
connection returning a successful check result.

Example check:

```json
{
  "check": {
    "id": "ssh",
    "name": "SSH (TCP)",
    "tcp": "example.com:22",
    "interval": "10s"
  }
}
```
2015-07-24 14:06:05 +10:00
Ryan Uber 715f69cce8 Merge pull request #1004 from i0rek/advertise_addrs
Add advertise_addrs.
2015-06-23 12:32:07 -07:00
Hans Hasselberg 4598087df3 Implement advertise_addrs for SerfLan, SerfWan and RPC.
Fixes #550.
This will make it possible to configure the advertised addresses for
SerfLan, SerfWan and RPC. It will enable multiple consul clients on a
single host which is very useful in a container environment.

This option might override advertise_addr and advertise_addr_wan
depending on the configuration.

It will be configurable with advertise_addrs. Example:

{
  "advertise_addrs": {
    "serf_lan": "10.0.120.91:4424",
    "serf_wan": "201.20.10.61:4423",
    "rpc": "10.20.10.61:4424"
  }
}
2015-06-23 21:23:45 +02:00
Ryan Uber e4f937f71d agent: use persist/load/purge convention for function names 2015-06-08 09:35:10 -07:00
Ryan Uber bd448d27fe agent: check state is purged if expired 2015-06-05 16:59:41 -07:00
Ryan Uber 7d8993e0f0 agent: purge check state when checks are deregistered 2015-06-05 16:57:14 -07:00
Ryan Uber 8c9facbff0 agent: testing state persistence, recovery, and expiration 2015-06-05 16:45:05 -07:00
Ryan Uber f1eb739949 agent: first stab at persisting check state 2015-06-05 16:17:07 -07:00
Ryan Uber 7e82b3c32c agent: refactor loadChecks/loadServices, fixes a few minor bugs 2015-06-04 14:33:30 -07:00
Armon Dadgar 430100a0c8 Merge pull request #927 from hashicorp/f-tls
Add new `verify_server_hostname` to mitigate possibility of MITM
2015-05-11 18:15:16 -07:00
Armon Dadgar f797130228 Fixing merge conflict 2015-05-11 16:48:10 -07:00
Armon Dadgar 1952083354 agent: copy config into consul config 2015-05-11 15:16:13 -07:00
Ryan Uber d0a28482ff agent: use service ID field to determine associated health checks during deregister 2015-05-07 15:30:01 -07:00
Ryan Uber e7b00723f4 agent: restore check status when re-registering (updating) services 2015-05-06 12:28:42 -07:00
Ryan Uber ed067fd03c agent: allow persisted services to be updated on disk 2015-05-05 22:36:45 -07:00
Ryan Uber 9acc42b86e Merge pull request #891 from hashicorp/f-token
ACL tokens for service/check registration
2015-05-05 22:17:31 -07:00
Ryan Uber 3c577a0069 agent: use an additional parameter for passing tokens 2015-05-04 17:48:05 -07:00
Ryan Uber 67e9a04f48 agent: restore tokens for services and checks in config 2015-04-28 12:44:46 -07:00
Ryan Uber 802b4793df agent: backwards compat for persisted services from pre-0.5.1 2015-04-28 12:18:41 -07:00
Ryan Uber f069db21e3 agent: safer read methods for tokens 2015-04-28 11:53:53 -07:00
Ryan Uber 54b5f17629 agent: test coverage loading service/check tokens from persisted files 2015-04-27 22:46:01 -07:00
Ryan Uber 75d182296f agent: add service/check token methods to reduce invasiveness 2015-04-27 22:01:01 -07:00
Ryan Uber 30f6f1142e agent: persist tokens from API registrations 2015-04-27 19:01:02 -07:00
Ryan Uber 545f3db3fe agent: initial pass threading through tokens for services/checks 2015-04-27 18:33:46 -07:00
Ryan Mills 370853d7ff Allow specifying a status field in the agent/service/register and agent/check/register endpoints.
This status must be one of the valid check statuses: 'passing', 'warning', 'critical', 'unknown'.
If the status field is not present or the empty string, the default of 'critical' is used.
2015-04-12 02:00:31 +00:00
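The validation rule in this commit can be sketched as a small function: an empty status defaults to 'critical', the four listed statuses pass through, and anything else is an error. `initialStatus` is a hypothetical name, and the error wording is an assumption.

```go
package main

import "fmt"

// initialStatus validates the status supplied at registration time.
func initialStatus(s string) (string, error) {
	switch s {
	case "":
		// Not present or empty: fall back to the default.
		return "critical", nil
	case "passing", "warning", "critical", "unknown":
		return s, nil
	default:
		return "", fmt.Errorf("invalid check status %q", s)
	}
}

func main() {
	for _, in := range []string{"", "passing", "bogus"} {
		status, err := initialStatus(in)
		fmt.Printf("%q -> %q, err=%v\n", in, status, err)
	}
}
```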
pepov e66301ab99 Add configuration option to specify a separate address for advertising on the wan 2015-03-28 16:08:19 +01:00
Michael Fraenkel c00c4ebaaa Support SessionTTLMin configuration
- Allow setting SessionTTLMin
- Validate on the Server
2015-03-27 05:13:57 -07:00
Ryan Uber 66d89ab4df agent: warn and purge checks which cannot be restored from agent state 2015-03-11 16:13:31 -07:00
Armon Dadgar 60f69224ea agent: Snapshot and restore health state on reload. Fixes #693 2015-02-17 12:00:04 -08:00
Ryan Uber 60dd1a073e agent: clarify the valid characters for dns 2015-02-09 09:59:27 -08:00
Ryan Uber 5da078f989 agent: warn on service tags with invalid chars 2015-02-09 09:30:06 -08:00
Ryan Uber fe80d28664 agent: Warn on dns-incompatible characters during service registration. Fixes #683. 2015-02-09 09:23:17 -08:00
arnaud briche 4d105487a8 - add Timeout field to CheckType and CheckHTTP to make http request timeout configurable by the client 2015-01-29 13:37:48 +07:00
foostan 524cb0d0c3 Validate ServiceID/CheckID when deregistering. 2015-01-27 01:06:57 +09:00
Armon Dadgar e5e519df4b Merge pull request #625 from hashicorp/f-maintcmd
New "maint" command
2015-01-22 11:56:49 -08:00
Ryan Uber 17d2443f91 command/maint: clean up 2015-01-22 11:14:28 -08:00
Ryan Uber fa19d09390 command/maint: display active maintenance when no args are passed 2015-01-22 10:26:17 -08:00
Armon Dadgar 14d55cb40a Merge pull request #622 from hashicorp/f-sockets
Unix domain sockets
2015-01-21 16:30:03 -08:00
Ryan Uber 8c1c230ce3 agent: use const for default maintenance reason strings 2015-01-21 14:53:52 -08:00
Ryan Uber d65f4035b9 agent: support passing ?reason= for custom notes field values on maintenance checks 2015-01-21 12:21:57 -08:00
Ryan Uber 40cfd527cb agent: fix maintenance check ID 2015-01-21 11:03:42 -08:00
Ryan Uber 27de0adf3b agent: support multiple checks per service 2015-01-20 21:48:42 -08:00
Ryan Uber c01277725a agent: beginning socket user/group/mode support as discussed in #612 2015-01-20 13:44:27 -08:00
Ryan Uber f29ea9f637 agent: maintenance logging + unique service check IDs 2015-01-16 15:38:13 -08:00
Ryan Uber 8ea9eb87ea agent: node maintenance mode works 2015-01-16 15:38:13 -08:00
Ryan Uber 3b815cd0aa agent: maintenance mode api's are idempotent 2015-01-16 15:37:52 -08:00
Ryan Uber 8a4a8eb023 agent: maintenance mode is persistent 2015-01-16 15:37:52 -08:00
Ryan Uber 7d4801db29 agent: test agent service maintenance mode 2015-01-16 15:37:51 -08:00
Ryan Uber 4f4eb204f6 agent: first pass at service maintenance mode 2015-01-16 15:37:51 -08:00
Ryan Uber 263614d596 agent: error if binding to existing socket file 2015-01-16 12:39:15 -08:00
Nicholas Capo e116c03f86 command/agent: HTTP Check: Create httpClient in Start()
For long (>10s) interval checks the http timeout is 10s, otherwise the
timeout is the interval. This means that a check *should* return
before the next check begins.
2015-01-13 00:01:15 +00:00
Nicholas Capo f26a79c325 command/agent: Stop HTTP checks when the check is removed 2015-01-12 22:34:39 +00:00
Nicholas Capo f2649edcfc command/agent: Add simple HTTP check type
These checks make an `HTTP GET` request every Interval to the specified URL.
The status of the service depends on the HTTP Response Code.
`200` is passing, `503` is warning and anything else is failing.
2015-01-09 16:43:24 -06:00
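The response-code mapping from this commit message, sketched as a function. It follows the message literally (`200` passing, `503` warning, everything else failing) and assumes that "failing" corresponds to the `critical` check status; `statusForCode` is a hypothetical name.

```go
package main

import (
	"fmt"
	"net/http"
)

// statusForCode maps an HTTP response code to a check status.
func statusForCode(code int) string {
	switch code {
	case http.StatusOK:
		return "passing"
	case http.StatusServiceUnavailable:
		return "warning"
	default:
		// Assumption: "failing" in the commit message means critical.
		return "critical"
	}
}

func main() {
	for _, code := range []int{200, 503, 404} {
		fmt.Println(code, statusForCode(code))
	}
}
```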
Ryan Uber 86bcef81b5 agent: test service and check unloading 2015-01-07 22:32:14 -08:00
Ryan Uber 770bb60af8 agent: consolidate service loading code, better logging 2015-01-07 22:01:43 -08:00
Ryan Uber 33bd157a61 agent: separate service and check loading/unloading concerns 2015-01-07 22:01:20 -08:00
Ryan Uber dea0070a39 agent: persist service/check data using hashed service/check IDs (fixes #573) 2015-01-07 19:11:21 -08:00
Veres Lajos 850d5bdc32 typofixes - https://github.com/vlajos/misspell_fixer 2014-12-04 23:25:06 +00:00
Ryan Uber 9e52588cec agent: persist CheckType with health checks 2014-11-29 12:54:37 -08:00
Ryan Uber 4f14ba6326 agent: allow config reload to modify checks/services persistence
This change consolidates loading services and checks from both config
and persisted state into methods on the agent. As part of this, we
introduce optional persistence when calling RemoveCheck/RemoveService.

Fixes a bug where config reloads would kill persisted services/checks.
Also fixes an edge case:

1. A service or check is registered via the HTTP API
2. A new service or check definition with the same ID is added to config
3. Config is reloaded

The desired behavior (which this implements) is:

1. All services and checks deregistered in memory
2. All services and checks in config are registered first
3. All persisted checks are restored using the same logic as the agent
   start sequence, which prioritizes config over persisted, and removes
   any persistence files if new config counterparts are present.
2014-11-26 12:46:42 -08:00
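The precedence rule described above (config wins over persisted state, and a persisted entry with a config counterpart is purged) can be sketched as a merge over two maps keyed by ID. The `reconcile` function and the string-valued definitions are simplifications for illustration, not the agent's data structures.

```go
package main

import (
	"fmt"
	"sort"
)

// reconcile returns the definitions to register and the IDs whose
// persistence files should be removed because config supersedes them.
func reconcile(config, persisted map[string]string) (map[string]string, []string) {
	final := make(map[string]string, len(config)+len(persisted))

	// Config entries are registered first.
	for id, def := range config {
		final[id] = def
	}

	// Persisted entries are restored only when config has no counterpart.
	var purge []string
	for id, def := range persisted {
		if _, ok := config[id]; ok {
			purge = append(purge, id)
			continue
		}
		final[id] = def
	}
	sort.Strings(purge) // stable order for logging
	return final, purge
}

func main() {
	config := map[string]string{"web": "from-config"}
	persisted := map[string]string{"web": "from-api", "db": "from-api"}

	final, purge := reconcile(config, persisted)
	fmt.Println(final["web"], final["db"], purge)
}
```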
Ryan Uber fe119a1bd2 agent: prefer config over persisted services/checks (#497) 2014-11-24 19:40:53 -08:00