Commit Graph

18874 Commits

Author SHA1 Message Date
Ryan Breen ec5f5623b4 Merge pull request #1234 from jovandeginste/quote-variable
Add quotes to locations in case pwd contains spaces
2015-09-11 12:31:47 -04:00
Wojciech Bederski 4cd1b09ad7 make Pause()/Resume()/isPaused() behave more like a semaphore
see: https://github.com/hashicorp/consul/issues/1173 #1173

Reasoning: somewhere during consul development Pause()/Resume() and
PauseSync()/ResumeSync() were added to protect larger changes to
agent's localState.  A few of the places that it tries to protect are:

- (a *Agent) AddService(...)      # part of the method
- (c *Command) handleReload(...)  # almost the whole method
- (l *localState) antiEntropy(...)# isPaused() prevents syncChanges()

The main problem is that, in the middle of handleReload(...)'s
critical section, it indirectly (via loadServices()) calls AddService(...).
AddService() in turn calls Pause() to protect itself against
syncChanges(). At the end of AddService() a deferred call to Resume() is
made.

With the current implementation, this releases the
isPaused() "lock" in the middle of handleReload(), allowing antiEntropy
to kick in while the configuration reload is still in progress.
Specifically, almost all services and probably all checks are unloaded
when syncChanges() is allowed to run.

This in turn can cause massive service/check de-/re-registration,
and since checks are by default registered in the critical state,
a majority of services on a node can be marked as failing.
It's made worse with automation, often calling `consul reload` in close
proximity on many nodes in the cluster.

This change basically turns Pause()/Resume() into P()/V() of
a garden-variety semaphore, allowing Pause() to be called multiple times
and releasing isPaused() only after all matching/deferred Resume() calls
have been made as well.

TODO/NOTE: as with many semaphore implementations, it might be reasonable
to panic() if l.paused ever becomes negative.
2015-09-11 18:28:06 +02:00
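
The counting behaviour described in the commit message above can be sketched as follows. This is a minimal illustration, not the code from the commit: the type pauseState and the helpers addService and reload are hypothetical stand-ins for localState, AddService() and handleReload(), and the use of sync/atomic is an assumption. The panic on a negative counter follows the TODO/NOTE in the message.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// pauseState is a hypothetical stand-in for the agent's localState.
// paused counts outstanding Pause() calls instead of holding a boolean.
type pauseState struct {
	paused int32
}

// Pause increments the counter (the semaphore's P operation).
func (p *pauseState) Pause() { atomic.AddInt32(&p.paused, 1) }

// Resume decrements the counter (the semaphore's V operation) and
// panics if it is called more often than Pause.
func (p *pauseState) Resume() {
	if atomic.AddInt32(&p.paused, -1) < 0 {
		panic("unbalanced Resume()")
	}
}

// isPaused reports true until every Pause has a matching Resume.
func (p *pauseState) isPaused() bool { return atomic.LoadInt32(&p.paused) > 0 }

// addService mimics the nested critical section: it pauses, does its
// work, and resumes via defer.
func addService(p *pauseState) {
	p.Pause()
	defer p.Resume()
}

// reload mimics the outer critical section. It returns whether the
// state is still paused after the nested call has returned.
func reload(p *pauseState) bool {
	p.Pause()
	defer p.Resume()
	addService(p)
	return p.isPaused()
}

func main() {
	p := &pauseState{}
	fmt.Println("still paused inside reload:", reload(p))
	fmt.Println("paused after reload:", p.isPaused())
}
```

With a plain boolean flag, the deferred Resume() in addService would clear the pause while reload is still running; with the counter, the state stays paused until the outermost Resume().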
Jo Vandeginste 29a1f07c07 Add quotes to locations in case pwd contains spaces 2015-09-11 18:19:22 +02:00
Wojciech Bederski 24ac26b3c1 failing test showing that nested Pause()/Resume() release too early
see: #1173 / https://github.com/hashicorp/consul/issues/1173
2015-09-11 17:52:57 +02:00
Shawn Cook 99be758411 Rename EnableTagOverride and update formatting 2015-09-11 08:35:29 -07:00
Shawn Cook f448a62826 Remove debug lines 2015-09-11 08:32:59 -07:00
James Phillips 51141558fd Merge pull request #1233 from hashicorp/b-maint-test
Adds missing token to maint unit test.
2015-09-10 15:07:44 -07:00
Shawn Cook 2f04917261 Merge remote-tracking branch 'hashicorp/master' into enable_tag_drift_03 2015-09-10 14:55:30 -07:00
James Phillips d00889c3a4 Adds missing token to maint unit test. 2015-09-10 14:53:00 -07:00
Shawn Cook 8a86eee9fb Add test cases TestAgentAntiEntropy_EnableTagDrift 2015-09-10 14:08:16 -07:00
Ryan Uber 7bcb88820e Update CHANGELOG.md 2015-09-10 12:31:13 -07:00
Ryan Uber 08d12e978f Merge pull request #1230 from hashicorp/f-maintfix
Respect tokens in maintenance mode
2015-09-10 12:30:07 -07:00
Ryan Uber 948bd57d6a agent: testing node/service maintenance using tokens 2015-09-10 12:08:08 -07:00
Ryan Uber e129a59316 agent: thread tokens through for maintenance mode 2015-09-10 11:43:59 -07:00
Ryan Breen 40e6527720 Merge pull request #1222 from 42wim/node-aaaa-queries
Allow AAAA queries for nodeLookup
2015-09-08 11:01:49 -04:00
Wim 3d7c3725d8 Allow AAAA queries for nodeLookup 2015-09-08 16:54:36 +02:00
Ryan Breen d63749b30e Merge pull request #1217 from 42wim/fix-rfc2308-part3
No NXDOMAIN when the answer is empty
2015-09-04 10:42:38 -04:00
Ryan Breen 42ee984dce Merge pull request #1218 from hashicorp/b-typo
Fixes a typo in the telemetry docs.
2015-09-03 10:08:39 -04:00
James Phillips c601f64c3f Fixes a typo in the telemetry docs. 2015-09-02 21:37:31 -07:00
Armon Dadgar 56efa4958b Merge pull request #1214 from zendesk/fix_lock_race_2
lock.go: fix another race condition
2015-09-02 16:04:55 -07:00
James Phillips f51f61a7d3 Merge pull request #1216 from hashicorp/sethvargo/update_middleman
Update Middleman
2015-09-02 09:11:10 -07:00
Seth Vargo 292acffcd9 Update Middleman 2015-09-02 10:14:06 -04:00
Wim 2336c6a4bd No NXDOMAIN when the answer is empty 2015-09-02 16:12:22 +02:00
William Tisäter 4ce2af3bd4 Bind to loopback only as last resort 2015-09-02 12:24:44 +02:00
William Tisäter 1013f2f494 Don't try to bind on address from inactive interface 2015-09-02 12:24:36 +02:00
William Tisäter 4267814e5b Treat 127.0.0.0/8 and 169.254.0.0/16 as private network 2015-09-02 12:24:14 +02:00
Ryan Breen a013095f62 Merge pull request #1167 from railsguru/master
Add -http-port option to change the HTTP API port
2015-09-02 01:15:55 -04:00
Ryan Uber c17d838d8a Merge pull request #1215 from hashicorp/f-ui-endpoint
agent: Always enable the UI endpoints
2015-09-01 21:31:47 -07:00
Andy Lo-A-Foe 46e614ae70 Position it alphabetically 2015-09-02 06:28:55 +02:00
Andy Lo-A-Foe e9ee1f9bea Update agent options section on the website 2015-09-02 05:36:09 +02:00
Armon Dadgar 655666170a agent: Always enable the UI endpoints 2015-09-01 18:28:32 -07:00
Ryan Breen 0bd7211eab Merge pull request #1194 from 42wim/fix-maxServiceResponses
Limit the DNS responses after getting the NodeRecords (fixes 0 A/AAAA responses)
2015-09-01 17:41:39 -04:00
Michael S. Fischer 01ec256c7e lock.go: fix another race condition
The previous fix to `consul lock` (commit 6875e8d) didn't completely
eliminate the race that could occur if the lock was acquired around the
same time SIGTERM was received:  It was still possible for
Run() to spawn the process via startChild() after killChild() had
released the shared mutex.

Now, when SIGTERM is received, we acquire a mutex that prevents
spawning a new process and never release it.

We've tested this fix pretty thoroughly and believe it completely
resolves the issue.
2015-09-01 14:27:23 -07:00
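
The idea in the commit message above, taking a mutex on termination and never releasing it so that no child can be spawned afterwards, might look roughly like this. It is a sketch under stated assumptions, not the code from lock.go: the supervisor type and its methods are hypothetical, and TryLock (available since Go 1.18) is used only so the demonstration returns instead of blocking; real code could simply block on Lock().

```go
package main

import (
	"fmt"
	"sync"
)

// supervisor is a hypothetical stand-in for the lock command's state.
type supervisor struct {
	spawnLock sync.Mutex // guards spawning of the child process
	spawned   int        // number of children started (demo only)
}

// startChild spawns a child only if the spawn lock can be taken.
// It reports whether a child was started.
func (s *supervisor) startChild() bool {
	if !s.spawnLock.TryLock() {
		return false // termination already holds the lock
	}
	defer s.spawnLock.Unlock()
	s.spawned++ // placeholder for actually starting a process
	return true
}

// onTerminate models the SIGTERM path: it takes the spawn lock and
// deliberately never releases it, so later spawns cannot happen.
func (s *supervisor) onTerminate() {
	s.spawnLock.Lock()
}

func main() {
	s := &supervisor{}
	fmt.Println("spawn before SIGTERM:", s.startChild())
	s.onTerminate()
	fmt.Println("spawn after SIGTERM:", s.startChild())
}
```

Because the termination path keeps the lock for the rest of the process lifetime, there is no window in which a late lock acquisition can start a child after shutdown has begun.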
Wim e97973c1e1 Limit the DNS responses after getting the NodeRecords 2015-09-01 23:23:05 +02:00
Ryan Breen 56d2fa4c17 Merge pull request #1195 from 42wim/fix-rfc2308-part2
Return SOA/NXDOMAIN when the answer is empty
2015-09-01 17:08:31 -04:00
Ryan Breen 6b6e79dadf Merge pull request #1211 from kikitux/master
add consul-do to community tools
2015-09-01 16:57:35 -04:00
Alvaro Miranda 2c3634ca57 Update downloads_tools.html.erb 2015-09-02 08:50:57 +12:00
Wim b806aceef4 Return SOA/not found when the answer is empty 2015-09-01 22:28:12 +02:00
Ryan Breen 300078db4d Merge pull request #1213 from mainframe/nodefabric-patch-1
Adding NodeFabric reference to Community Tools
2015-09-01 16:17:39 -04:00
Andres Toomsalu 11db6aba88 Adding NodeFabric reference to Community Tools 2015-09-01 23:09:34 +03:00
Alvaro Miranda 7e32c6a1de add consul-do to community tools
Adding consul-do: do something based on leadership status

https://github.com/zeroXten/consul-do

From README.md

Useful for running cronjobs in HA mode.

Run something like this on two or more servers:

* * * * * /usr/bin/consul-do JOB-1 $(/bin/hostname) && /path/to/job1
*/10 * * * * /usr/bin/consul-do JOB-2 $(/bin/hostname) && /path/to/job2

Only one of the servers will be elected leader and will therefore run the job. Should the leader fail, a follower will take over.
2015-09-02 00:52:20 +12:00
James Phillips 0f49e1c3a9 Merge pull request #1200 from ryotarai/lock-pass-stdin
command/lock: Pass stdin to child process when -pass-stdin passed.
2015-08-31 21:14:45 -07:00
James Phillips 5c839f73f5 Removes incorrect protocol version in change log. 2015-08-31 21:11:50 -07:00
Ryota Arai 50914e2388 website: description of -pass-stdin option 2015-09-01 11:00:26 +09:00
James Phillips 6f29337782 Merge pull request #1209 from hashicorp/f-downgrade
Bumps protocol version back down as we've made memberlist smarter.
2015-08-31 14:52:27 -07:00
James Phillips c0127e9932 Bumps protocol version back down as we've made memberlist smarter. 2015-08-31 11:16:34 -07:00
Ryan Breen 2ee887ecc3 Cleanup for guides/forwarding.html 2015-08-30 12:01:49 -04:00
Ryan Breen 9f99b2e7cc Merge pull request #1204 from tamsky/docs/forwarding-dnsmasq-example
add a dnsmasq example, explain the utility of 'recursors'
2015-08-30 01:17:21 -04:00
Marc Tamsky baede06a9e Explain 'recursors' behavior with an example. 2015-08-28 18:27:26 -07:00
Marc Tamsky 76aa082d5b add dnsmasq example, add pointer to 'recursors' 2015-08-28 18:10:37 -07:00