Commit Graph

949 Commits

Author SHA1 Message Date
Diptanu Choudhury b35836fdad Adding a debug log to indicate the exit code of failed check 2015-10-26 10:35:51 -07:00
Diptanu Choudhury 1b1518512c Registering the Exec with Docker Daemon every time the check is invoked 2015-10-26 10:23:57 -07:00
Diptanu Choudhury c4b85010fb Implemented Docker health checks 2015-10-26 10:23:57 -07:00
James Phillips 2a51d55ce3 Runs go fmt (sorts new go-cleanhttp imports). 2015-10-23 17:14:35 -07:00
James Phillips b91e2d3a97 Completes rebase of network coordinates to new memdb. 2015-10-23 15:23:01 -07:00
James Phillips b9732e3c39 Gives RTT class a more Go-like name. 2015-10-23 15:23:01 -07:00
James Phillips c48e3e1179 Defaults second node to agent if not given. Removes -short option and tweaks output formatting. 2015-10-23 15:23:01 -07:00
James Phillips a51e170eef Cleans up after code review, adds a -short option to "consul rtt" command. 2015-10-23 15:23:01 -07:00
James Phillips 43df5c6784 Adds a "consul rtt" command. 2015-10-23 15:23:01 -07:00
James Phillips aea2194ce3 Makes the default protocol 2 and lets 3 interoperate with 2. 2015-10-23 15:23:01 -07:00
James Phillips baa223ab4a Runs go fmt after latest rebase. 2015-10-23 15:23:01 -07:00
James Phillips 8e67c37398 Adds a test for the `DisableCoordinate` config. 2015-10-23 15:23:01 -07:00
James Phillips 3f11bfaea4 Adds coordinate of agent to self endpoint. 2015-10-23 15:23:01 -07:00
James Phillips a74bdcba49 Moves sorting up into coordinate endpoint HTTP handlers. 2015-10-23 15:23:01 -07:00
James Phillips 0e7d5871c5 Installs a friendly handler for coordinate endpoints when coordinates are disabled. 2015-10-23 15:23:01 -07:00
James Phillips def8c6087d Fixes config merge fn for disabling coordinates and adds it to JSON. 2015-10-23 15:23:01 -07:00
James Phillips a2bcef00a0 Adds endpoints for raw network coordinates. 2015-10-23 15:23:01 -07:00
James Phillips fecd639b02 Changes ?near=self to a safer ?near=_agent, which is also clearer about what it does. 2015-10-23 15:23:01 -07:00
James Phillips ee1cf1e13b Adds distance sorting to health endpoint. Cleans up unit tests. 2015-10-23 15:23:01 -07:00
James Phillips 1724b9a6be Adds a magic "self" node name to distance queries. 2015-10-23 15:23:01 -07:00
James Phillips 59b710894d Adds tests for HTTP interface. Removes a stray mark. 2015-10-23 15:23:01 -07:00
James Phillips b63909cf67 Adds coordinate sorting support to catalog queries for nodes and service nodes. 2015-10-23 15:23:01 -07:00
James Phillips d7bac0e808 Turns down the coordinate sync rate a bit more. 2015-10-23 15:23:01 -07:00
James Phillips 9ba9a708f6 Scales coordinate sends to hit a fixed aggregate rate across the cluster. 2015-10-23 15:23:01 -07:00
James Phillips d8b8a3719f Simplifies the batching function and adds some comments. 2015-10-23 15:23:01 -07:00
James Phillips f71c79c53f Does some small cleanups based on PR feedback.
* Holds coordinate updates in map and gets rid of the update channel.
* Cleans up config variables a bit.
2015-10-23 15:23:01 -07:00
James Phillips 1222772452 Hardens Consul from bad coordinates from other nodes. 2015-10-23 15:23:01 -07:00
James Phillips acb0dce829 Moves batching down into the state store and changes it to fail-fast.
* A batch of updates is done all in a single transaction.
* We no longer need to get an update to kick things, there's a periodic flush.
* If incoming updates overwhelm the configured flush rate they will be dumped with an error.
2015-10-23 15:23:01 -07:00
James Phillips b6c31bdf2f Flips the sense of the coordinate enable option. 2015-10-23 15:23:01 -07:00
James Phillips 9c069c5031 Removes one more WAN leftover. 2015-10-23 15:23:01 -07:00
James Phillips edb9a119e2 Does a clean up pass on the Consul side. 2015-10-23 15:23:01 -07:00
James Phillips ac4185b888 Merges config changes after rebase. 2015-10-23 15:23:01 -07:00
Derek Chiang 139c9240ea Address comments 2015-10-23 15:23:01 -07:00
Derek Chiang 840474f170 Add a test case 2015-10-23 15:23:01 -07:00
Derek Chiang 23c08aeeb4 Use IndexedCoordinate instead 2015-10-23 15:23:01 -07:00
Derek Chiang 530e73212a Some fixes 2015-10-23 15:23:01 -07:00
Derek Chiang 2ef802b8b3 Fix a comment 2015-10-23 15:23:01 -07:00
Derek Chiang b2cff43bb5 Complete logic for sending coordinates 2015-10-23 15:23:01 -07:00
Derek Chiang 66b210afcb Some fixes 2015-10-23 15:23:01 -07:00
Derek Chiang b5bbe2bcfa Adding tests and stuff 2015-10-23 15:23:01 -07:00
Armon Dadgar d035dbd43b Merge pull request #1318 from daveadams/f-http-header-token
Allow specifying Consul token in an HTTP request header
2015-10-22 13:33:47 -07:00
Jeff Mitchell 9267f956a2 Update cleanhttp repo location 2015-10-22 14:14:22 -04:00
Jeff Mitchell 06bb9d5f36 Use cleanhttp to get rid of DefaultTransport 2015-10-22 10:47:50 -04:00
David Adams 5f175add40 Add HTTP request header X-Consul-Token
Add support for an X-Consul-Token HTTP request header to specify the
token with which this request should be fulfilled. The header would have
precedence over the responding Agent's default token, but would have
lower precedence than a token specified in the query string.
2015-10-19 11:26:01 -05:00
James Phillips 263c7e3fd3 Deletes the old state store and all its accoutrements. 2015-10-15 14:59:09 -07:00
James Phillips bcdabe4606 Knocks out the Raft indexes before doing compare. 2015-10-15 14:59:09 -07:00
James Phillips cbcd977a39 Gets new structs changes to compile, adds some corner case handling and extra unit tests. 2015-10-15 14:59:09 -07:00
Ryan Uber b46f878747 Merge pull request #1309 from hashicorp/f-remove-migrate
Removes consul-migrate for 0.6
2015-10-15 14:50:19 -07:00
Jeff Mitchell 9cddc187b5 Don't use http.DefaultClient
Two of the changes are in tests; the one of consequence is in the API.
As explained in #1308 this can cause conflicts with downstream programs.

Fixes #1308.
2015-10-15 17:49:35 -04:00
Ryan Uber aba1b26015 agent: consolidates data dir checker 2015-10-15 14:21:35 -07:00
Ryan Uber 8bc51eb237 agent: test mdb dir protection 2015-10-15 14:15:41 -07:00
Ryan Uber 2a7609d6bc agent: remove migrator, refuse to start if mdb dir found 2015-10-15 14:15:08 -07:00
Armon Dadgar 0b77960349 agent: remove an N^2 check. See #1265 2015-10-12 20:30:11 -07:00
Michael Puncel a94589ad67 Add http method to log output 2015-10-02 18:33:06 -07:00
James Phillips 26eadcd95c Merge pull request #1235 from wuub/master
fix conflict between handleReload and antiEntropy critical sections
2015-09-17 07:28:39 -07:00
Wojciech Bederski 9a1b52171f panic when unbalanced localState.Resume() is detected 2015-09-17 11:32:08 +02:00
Dale Wijnand c5168e1263 Fix a bunch of typos. 2015-09-15 13:22:08 +01:00
James Phillips b25797a808 Merge pull request #1187 from sfncook/enable_tag_drift_03
Enable tag drift 03
2015-09-11 15:35:32 -07:00
Anthony Scalisi 8d733b7fca remove various typos 2015-09-11 12:29:54 -07:00
Wojciech Bederski 4cd1b09ad7 make Pause()/Resume()/isPaused() behave more like a semaphore
see: https://github.com/hashicorp/consul/issues/1173 #1173

Reasoning: somewhere during consul development Pause()/Resume() and
PauseSync()/ResumeSync() were added to protect larger changes to
agent's localState.  A few of the places that it tries to protect are:

- (a *Agent) AddService(...)      # part of the method
- (c *Command) handleReload(...)  # almost the whole method
- (l *localState) antiEntropy(...)# isPaused() prevents syncChanges()

The main problem is that in the middle of handleReload(...)'s
critical section it indirectly (via loadServices()) calls AddService(...).
AddService() in turn calls Pause() to protect itself against
syncChanges(). At the end of AddService() a deferred call to Resume() is
made.

With the current implementation, this releases
isPaused() "lock" in the middle of handleReload() allowing antiEntropy
to kick in while configuration reload is still in progress.
Specifically, almost all services and probably all checks are unloaded
when syncChanges() is allowed to run.

This in turn can cause massive service/check de-/re-registration,
and since checks are by default registered in the critical state,
a majority of services on a node can be marked as failing.
It's made worse with automation, often calling `consul reload` in close
proximity on many nodes in the cluster.

This change basically turns Pause()/Resume() into P()/V() of
a garden-variety semaphore, allowing Pause() to be called multiple times
and releasing isPaused() only after all matching/deferred Resume() calls
have been made as well.

TODO/NOTE: as with many semaphore implementations, it might be reasonable
to panic() if l.paused ever becomes negative.
2015-09-11 18:28:06 +02:00
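
A minimal sketch of the counting behaviour the commit above describes, under stated assumptions: `pauser` is a hypothetical stand-in for the agent's `localState`, and the mutex-guarded counter is one possible implementation, not necessarily the one in the commit. The panic on a negative count follows the TODO in the message.

```go
package main

import (
	"fmt"
	"sync"
)

// pauser is a hypothetical stand-in for the agent's localState.
// paused counts outstanding Pause() calls, like a semaphore.
type pauser struct {
	mu     sync.Mutex
	paused int
}

// Pause may be called any number of times; each call must be
// matched by one Resume().
func (p *pauser) Pause() {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.paused++
}

// Resume undoes one Pause() and panics if the calls are unbalanced.
func (p *pauser) Resume() {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.paused--
	if p.paused < 0 {
		panic("unbalanced Resume() detected")
	}
}

// isPaused reports whether any Pause() is still outstanding.
func (p *pauser) isPaused() bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.paused > 0
}

func main() {
	p := &pauser{}
	p.Pause()                 // outer critical section, e.g. handleReload
	p.Pause()                 // nested call, e.g. AddService
	p.Resume()                // nested deferred Resume
	fmt.Println(p.isPaused()) // still paused: the outer section is open
	p.Resume()                // outer Resume
	fmt.Println(p.isPaused()) // no longer paused
}
```

With a plain boolean flag, the nested Resume() would have cleared the pause while the outer section was still running, which is the early release the commit fixes.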
Wojciech Bederski 24ac26b3c1 failing test showing that nested Pause()/Resume() release too early
see: #1173 / https://github.com/hashicorp/consul/issues/1173
2015-09-11 17:52:57 +02:00
Shawn Cook 99be758411 Rename EnableTagOverride and update formatting 2015-09-11 08:35:29 -07:00
Shawn Cook f448a62826 Remove debug lines 2015-09-11 08:32:59 -07:00
Shawn Cook 2f04917261 Merge remote-tracking branch 'hashicorp/master' into enable_tag_drift_03 2015-09-10 14:55:30 -07:00
James Phillips d00889c3a4 Adds missing token to maint unit test. 2015-09-10 14:53:00 -07:00
Shawn Cook 8a86eee9fb Add test cases TestAgentAntiEntropy_EnableTagDrift 2015-09-10 14:08:16 -07:00
Ryan Uber 08d12e978f Merge pull request #1230 from hashicorp/f-maintfix
Respect tokens in maintenance mode
2015-09-10 12:30:07 -07:00
Ryan Uber 948bd57d6a agent: testing node/service maintenance using tokens 2015-09-10 12:08:08 -07:00
Ryan Uber e129a59316 agent: thread tokens through for maintenance mode 2015-09-10 11:43:59 -07:00
Wim 3d7c3725d8 Allow AAAA queries for nodeLookup 2015-09-08 16:54:36 +02:00
Ryan Breen d63749b30e Merge pull request #1217 from 42wim/fix-rfc2308-part3
No NXDOMAIN when the answer is empty
2015-09-04 10:42:38 -04:00
Armon Dadgar 56efa4958b Merge pull request #1214 from zendesk/fix_lock_race_2
lock.go: fix another race condition
2015-09-02 16:04:55 -07:00
Wim 2336c6a4bd No NXDOMAIN when the answer is empty 2015-09-02 16:12:22 +02:00
Ryan Breen a013095f62 Merge pull request #1167 from railsguru/master
Add -http-port option to change the HTTP API port
2015-09-02 01:15:55 -04:00
Armon Dadgar 655666170a agent: Always enable the UI endpoints 2015-09-01 18:28:32 -07:00
Michael S. Fischer 01ec256c7e lock.go: fix another race condition
The previous fix to `consul lock` (commit 6875e8d) didn't completely
eliminate the race that could occur if the lock was acquired around the
same time SIGTERM was received:  It was still possible for
Run() to spawn the process via startChild() after killChild() had
released the shared mutex.

Now, when SIGTERM is received, we acquire a mutex that prevents
spawning a new process and never release it.

We've tested this fix pretty thoroughly and believe it completely
resolves the issue.
2015-09-01 14:27:23 -07:00
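
The idea in the commit above, that nothing may be spawned once termination has begun, can be illustrated with the sketch below. It uses a shutdown flag guarded by a mutex, a variation on the commit's approach of holding a mutex forever, chosen so the example can run to completion. The `runner` type and its method names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// runner is a hypothetical stand-in for the lock command's state.
type runner struct {
	mu       sync.Mutex
	shutdown bool // set once on termination and never cleared
	started  int  // number of children spawned (for illustration)
}

// startChild refuses to spawn once termination has begun. The check
// and the spawn happen under the same mutex, so a termination signal
// cannot slip in between them.
func (r *runner) startChild() bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.shutdown {
		return false
	}
	r.started++ // a real implementation would launch the process here
	return true
}

// terminate marks the runner as shutting down permanently.
func (r *runner) terminate() {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.shutdown = true
}

func main() {
	r := &runner{}
	fmt.Println(r.startChild()) // spawn allowed before termination
	r.terminate()
	fmt.Println(r.startChild()) // spawn refused afterwards
}
```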
Wim e97973c1e1 Limit the DNS responses after getting the NodeRecords 2015-09-01 23:23:05 +02:00
Ryan Breen 56d2fa4c17 Merge pull request #1195 from 42wim/fix-rfc2308-part2
Return SOA/NXDOMAIN when the answer is empty
2015-09-01 17:08:31 -04:00
Wim b806aceef4 Return SOA/not found when the answer is empty 2015-09-01 22:28:12 +02:00
James Phillips 0f49e1c3a9 Merge pull request #1200 from ryotarai/lock-pass-stdin
command/lock: Pass stdin to child process when -pass-stdin passed.
2015-08-31 21:14:45 -07:00
Ryan Uber d6b71de3f4 agent: reload SCADA client if endpoint changes 2015-08-27 13:29:07 -07:00
Ryan Uber 5bd7a5f239 command: atlas endpoint can be passed 2015-08-27 11:11:05 -07:00
Ryan Uber cda2bf6975 agent: atlas_endpoint is configurable 2015-08-27 11:08:01 -07:00
Ryota Arai c45f2971e7 command/lock: Pass stdin to child process when -pass-stdin passed. 2015-08-26 16:27:21 +09:00
Ryan Uber 00d78963bf agent: log a message when making a new scada connection 2015-08-25 21:03:16 -07:00
Ryan Uber 33cadcf925 agent: don't reload scada client if there is no config change 2015-08-25 20:43:57 -07:00
Ryan Uber 8eea77d58f agent: testing scada client creation in command 2015-08-25 20:22:22 -07:00
Ryan Uber 495cc41ba4 agent: test scada HTTP server creation 2015-08-25 18:51:04 -07:00
Ryan Uber e3cd2f2c0d agent: clean up scada connection manager 2015-08-25 18:27:07 -07:00
Ryan Uber bc96c14a6f agent: document the scada http creation func 2015-08-25 17:19:11 -07:00
Ryan Uber 1378fd93b0 agent: scada client and HTTP server are tracked separately 2015-08-25 16:59:53 -07:00
Andy Lo-A-Foe 325b54649a Remove duplicate code 2015-08-20 20:46:20 +02:00
Andy Lo-A-Foe 3d133ab78c Use Ports.HTTP directly 2015-08-20 20:27:20 +02:00
Andy Lo-A-Foe 7e2ecf6a3c Add documentation for http-port option 2015-08-20 20:19:35 +02:00
Shawn Cook d4ec6aa630 Update tests - NodeService init needs bool 2015-08-20 09:09:26 -07:00
Shawn Cook 854ff1eb41 Add EnableTagDrift logic to command/agent/local.go 2015-08-18 14:03:48 -07:00
Shawn Cook 3a740ac07b Remove from command/agent/config_test.go 2015-08-18 10:42:25 -07:00
Shawn Cook f6814c89ed EnableTagDrift in NodeService struct 2015-08-18 10:34:55 -07:00
Ryan Uber 5024e7c3c7 Merge pull request #1166 from hashicorp/f-dns-log
Log network address of DNS clients
2015-08-13 18:32:32 -07:00
Ryan Uber 07299a61dc agent: log network address of DNS clients 2015-08-11 10:33:27 -07:00
Andy Lo-A-Foe 7a3da7db26 Add -http-port option to change the HTTP API port
This is useful when pushing consul to PaaS like
Cloud Foundry making the HTTP API easily routable.
2015-08-11 14:14:21 +02:00
Armon Dadgar c2a865686c Merge pull request #1158 from mfischer-zd/fix_1155
lock.go: fix race condition
2015-08-05 14:56:13 -07:00
Michael S. Fischer 9e0e9e389d lock.go: fix race condition
Fix a race condition between startChild() and killChild() that could
lead to an orphaned managed process.

Fixes #1155
2015-08-05 09:06:51 -07:00
J.R. Garcia f95ba4b8eb Remove trailing slash from lock
Lock command will remove trailing slash from path (as it is invalid).
Fixes #1136.
2015-07-30 12:14:17 -05:00
Ryan Breen b13ba78651 Merge pull request #1143 from hashicorp/GH-1142
Check NXDOMAIN after filtering nodes
2015-07-29 18:56:08 -04:00
Ryan Breen 20f17a0cc8 Test for GH-1142. 2015-07-29 18:21:16 -04:00
Armon Dadgar 0a853bca2e Merge pull request #1137 from 42wim/fix-1124
Recurse when PTR answer is empty
2015-07-29 14:39:04 -07:00
Ryan Breen dbb5ad1316 Check NXDOMAIN after filtering nodes
Move the check for NXDOMAIN below the service health filter.
2015-07-29 17:16:48 -04:00
Ryan Uber 6e38497ce9 Merge pull request #1141 from hashicorp/f-travis
Try moving to newer Travis-CI infrastructure
2015-07-28 10:42:56 -07:00
Ryan Uber f48bc65e62 travis-ci: skip syslog tests for container-based travis infra 2015-07-28 09:58:43 -07:00
Wim a6734ab4b8 Recurse when PTR answer is empty 2015-07-27 23:22:36 +02:00
Armon Dadgar 2017c95d13 Merge pull request #1130 from pdf/check_socket
Add Socket check type
2015-07-27 14:21:24 -07:00
Ryan Uber 6d38027689 Merge pull request #1090 from hashicorp/f-keyring-acl
Keyring ACLs
2015-07-24 10:23:18 -07:00
Peter Fern 86edd7169c Add TCP check type
Adds the ability to simply check whether a TCP socket accepts
connections to determine if it is healthy.  This is a light-weight -
though less comprehensive than scripting - method of checking network
service health.

The check parameter `tcp` should be set to the `address:port`
combination for the service to be tested.  Supports both IPv6 and IPv4.
In the case of a hostname that resolves to both, connections will be
attempted via both protocol versions, with the first successful
connection returning a successful check result.

Example check:

```json
{
  "check": {
    "id": "ssh",
    "name": "SSH (TCP)",
    "tcp": "example.com:22",
    "interval": "10s"
  }
}
```
2015-07-24 14:06:05 +10:00
Ryan Uber a339ca19a6 agent: disable ACLs for RPC client tests 2015-07-23 17:09:33 -07:00
Armon Dadgar 6f42f6640c command/lock: Check for shutdown during lock acquisition. Fixes #800 2015-07-22 16:07:44 -07:00
Benjamin Abbott-Scott eece6e4848 Return every time lock acquisition fails 2015-07-22 10:44:47 -07:00
Ryan Uber f424f94d45 agent: vet fixes 2015-07-14 11:42:51 -07:00
Ryan Uber 8e834ba976 Merge pull request #995 from 42wim/rfc2308-soa-ttl
Send SOA with negative responses (RFC2308)
2015-07-13 08:49:25 -07:00
Ryan Uber 0a09b11cba agent: testing keyring ACLs 2015-07-07 15:14:06 -06:00
Ryan Uber e764eb62bd agent: write-level keyring ACLs work 2015-07-07 10:36:51 -06:00
Ryan Uber 665551a182 agent: read-level keyring ACLs work 2015-07-07 10:30:34 -06:00
Ryan Uber 503fa1eed1 Merge pull request #1046 from hashicorp/f-event-acl
Event ACLs
2015-07-02 07:02:07 -07:00
Ryan Uber 715f69cce8 Merge pull request #1004 from i0rek/advertise_addrs
Add advertise_addrs.
2015-06-23 12:32:07 -07:00
Hans Hasselberg 4598087df3 Implement advertise_addrs for SerfLan, SerfWan and RPC.
Fixes #550.
This will make it possible to configure the advertised addresses for
SerfLan, SerfWan and RPC. It will enable multiple consul clients on a
single host which is very useful in a container environment.

This option might override advertise_addr and advertise_addr_wan
depending on the configuration.

It will be configurable with advertise_addrs. Example:

{
  "advertise_addrs": {
    "serf_lan": "10.0.120.91:4424",
    "serf_wan": "201.20.10.61:4423",
    "rpc": "10.20.10.61:4424"
  }
}
2015-06-23 21:23:45 +02:00
Ryan Uber 5866588d68 command: remote exec takes -token parameter 2015-06-22 17:16:28 -07:00
Ryan Uber 1f68b5268b command: event command supports -token arg 2015-06-22 16:59:41 -07:00
Ryan Uber 5c9cf8cc69 agent: avoid masking errors when ACLs deny a request 2015-06-18 18:13:29 -07:00
Ryan Uber 4a95f29e03 agent: testing user event endpoint ACLs 2015-06-18 18:13:29 -07:00
Ryan Uber e6923a4832 consul: always fire events from server nodes 2015-06-18 18:13:29 -07:00
Ryan Uber 713f1781c8 agent: enforce event policy during event fire 2015-06-18 18:13:29 -07:00
Wim e70e57a0dc Send SOA with negative responses 2015-06-14 00:03:44 +02:00
Ryan Uber b7674a2d35 Merge pull request #1028 from sebastianmarkow/master
Remove unreachable error handling in AgentRPC.listen()
2015-06-12 22:28:10 -07:00
Ryan Uber b2556bd4d0 agent: testing dns when acls are in use 2015-06-12 16:01:57 -07:00
Ryan Uber 48cb636e8d agent: dns uses the configured token during queries 2015-06-12 16:01:57 -07:00
Sebastian Klatt 34611c8599 consul: Remove unreachable error handling 2015-06-12 20:21:32 +02:00
Ryan Uber 47073c87c1 agent: don't replace empty tokens in the logs, fixes #1020 2015-06-12 00:11:37 -07:00
Ryan Uber b5fff74c8a agent: fix failing test 2015-06-11 15:13:10 -07:00
Ryan Uber e4f937f71d agent: use persist/load/purge convention for function names 2015-06-08 09:35:10 -07:00
Ryan Uber 08a12ffd5b agent: test check state restoration from AddCheck 2015-06-05 17:33:34 -07:00
Ryan Uber bd448d27fe agent: check state is purged if expired 2015-06-05 16:59:41 -07:00
Ryan Uber 7d8993e0f0 agent: purge check state when checks are deregistered 2015-06-05 16:57:14 -07:00
Ryan Uber 8c9facbff0 agent: testing state persistence, recovery, and expiration 2015-06-05 16:45:05 -07:00
Ryan Uber f1eb739949 agent: first stab at persisting check state 2015-06-05 16:17:07 -07:00
Ryan Uber 7e82b3c32c agent: refactor loadChecks/loadServices, fixes a few minor bugs 2015-06-04 14:33:30 -07:00
Ryan Uber 359abdb9a8 agent: don't replace config on SIGHUP if parsing fails 2015-05-30 22:50:24 -07:00
Emil Hessman c746d218f4 command/agent: skip unix file permissions test on windows 2015-05-29 21:12:45 +02:00
Adam Renberg 263fae2292 Sort members by name for consul members 2015-05-22 10:37:54 +02:00
Adam Renberg bec5ce6af9 Sort tags in consul members -detailed output 2015-05-22 10:27:47 +02:00
Ryan Uber b5cd115795 agent: flush progress info to console during migrations 2015-05-19 18:47:44 -07:00