Commit Graph

57 Commits

Author SHA1 Message Date
WangFenjin dcecbf4653 Fix typo 2015-12-08 17:45:01 +08:00
James Phillips eb4bfa3411 Prevents agents from considering Raft information when doing sync checks. 2015-10-28 14:32:00 -07:00
Armon Dadgar 0b77960349 agent: remove an N^2 check. See #1265 2015-10-12 20:30:11 -07:00
James Phillips 26eadcd95c Merge pull request #1235 from wuub/master
fix conflict between handleReload and antiEntropy critical sections
2015-09-17 07:28:39 -07:00
Wojciech Bederski 9a1b52171f panic when unbalanced localState.Resume() is detected 2015-09-17 11:32:08 +02:00
Wojciech Bederski 4cd1b09ad7 make Pause()/Resume()/isPaused() behave more like a semaphore
see: https://github.com/hashicorp/consul/issues/1173

Reasoning: somewhere during consul development Pause()/Resume() and
PauseSync()/ResumeSync() were added to protect larger changes to
agent's localState.  A few of the places that it tries to protect are:

- (a *Agent) AddService(...)      # part of the method
- (c *Command) handleReload(...)  # almost the whole method
- (l *localState) antiEntropy(...) # isPaused() prevents syncChanges()

The main problem is that, in the middle of handleReload(...)'s
critical section, it indirectly (via loadServices()) calls AddService(...).
AddService() in turn calls Pause() to protect itself against
syncChanges(). At the end of AddService() a deferred call to Resume() is
made.

With the current implementation, this releases the
isPaused() "lock" in the middle of handleReload(), allowing antiEntropy
to kick in while the configuration reload is still in progress.
Specifically, almost all services and probably all checks are unloaded
when syncChanges() is allowed to run.

This in turn can cause massive service/check de-/re-registration,
and since checks are by default registered in the critical state,
a majority of services on a node can be marked as failing.
It's made worse by automation, which often calls `consul reload` in close
proximity on many nodes in the cluster.

This change basically turns Pause()/Resume() into P()/V() of
a garden-variety semaphore. It allows Pause() to be called multiple times,
and releases isPaused() only after all matching/deferred Resume() calls
have been made as well.

TODO/NOTE: as with many semaphore implementations, it might be reasonable
to panic() if l.paused ever becomes negative.
2015-09-11 18:28:06 +02:00
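
The counting behaviour described in 4cd1b09ad7, together with the panic on an unbalanced Resume() from 9a1b52171f, can be sketched roughly as follows. This is a minimal illustration, not the code from the commits: the stripped-down localState struct, the mutex-guarded counter, the panic message, and the addService helper are all assumptions made for the example. Only the names Pause(), Resume(), isPaused() and the l.paused counter come from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// localState is a hypothetical, stripped-down stand-in for the agent's
// local state. Only the pause counter is modelled here.
type localState struct {
	mu     sync.Mutex
	paused int // number of Pause() calls not yet matched by Resume()
}

// Pause increments the counter, so nested callers stack up.
func (l *localState) Pause() {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.paused++
}

// Resume decrements the counter and panics if it would go negative,
// which signals a Resume() without a matching Pause().
func (l *localState) Resume() {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.paused--
	if l.paused < 0 {
		panic("unbalanced Resume() detected") // message is illustrative
	}
}

// isPaused reports whether at least one Pause() is still outstanding.
func (l *localState) isPaused() bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.paused > 0
}

// addService is a hypothetical inner critical section that pauses and
// resumes on its own, as the commit message describes for AddService().
func addService(l *localState) {
	l.Pause()
	defer l.Resume()
	// ... mutate local state here ...
}

func main() {
	l := &localState{}

	l.Pause()                 // outer critical section, e.g. a reload
	addService(l)             // nested Pause()/Resume() pair
	fmt.Println(l.isPaused()) // true: the outer Pause() is still held

	l.Resume()
	fmt.Println(l.isPaused()) // false: every Pause() has been matched
}
```

With a plain boolean flag, the deferred Resume() inside addService would have cleared the pause while the outer section was still running; with a counter, the pause ends only when the outermost Resume() runs.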
Shawn Cook 99be758411 Rename EnableTagOverride and update formatting 2015-09-11 08:35:29 -07:00
Shawn Cook f448a62826 Remove debug lines 2015-09-11 08:32:59 -07:00
Shawn Cook 854ff1eb41 Add EnableTagDrift logic to command/agent/local.go 2015-08-18 14:03:48 -07:00
Shawn Cook f6814c89ed EnableTagDrift in NodeService struct 2015-08-18 10:34:55 -07:00
Ryan Uber 9acc42b86e Merge pull request #891 from hashicorp/f-token
ACL tokens for service/check registration
2015-05-05 22:17:31 -07:00
Ryan Uber 3c577a0069 agent: use an additional parameter for passing tokens 2015-05-04 17:48:05 -07:00
Ryan Uber 2496a70452 agent: more tests 2015-04-28 13:06:02 -07:00
Ryan Uber f069db21e3 agent: safer read methods for tokens 2015-04-28 11:53:53 -07:00
Ryan Uber 23eab5ebcc agent: fix deadlock reading tokens from state 2015-04-27 22:26:03 -07:00
Ryan Uber 75d182296f agent: add service/check token methods to reduce invasiveness 2015-04-27 22:01:01 -07:00
Ryan Uber 545f3db3fe agent: initial pass threading through tokens for services/checks 2015-04-27 18:33:46 -07:00
artushin 68cb8fda58 remove config 2015-04-24 09:51:40 -05:00
artushin 88bfdd45ef use existing randomStagger 2015-04-23 17:08:17 -05:00
artushin 1e13acba68 adding check_update_stagger 2015-04-23 16:27:42 -05:00
Ryan Uber 8af963348c agent: handle nil node services in anti-entropy 2015-04-10 11:15:31 -07:00
Ryan Uber c3820b4c28 agent: fix anti-entropy check sync 2015-04-09 10:40:05 -07:00
Ryan Uber 80a6d7caca agent: anti-entropy sync services/checks if they don't exist in the catalog 2015-04-08 12:21:01 -07:00
foostan a75b727391 Validate ServiceID/CheckID when deleting in deleteService() in local.go 2015-01-27 18:11:57 +09:00
Ryan Uber ced676ffc2 agent: comments for new anti-entropy functionality 2015-01-20 21:48:46 -08:00
Ryan Uber 72e750fb00 agent: simplify anti-entropy of services with multiple checks, add tests 2015-01-20 21:48:46 -08:00
Ryan Uber 001d2e246d agent: only send service with check sync if it is out of sync 2015-01-20 21:48:46 -08:00
Ryan Uber 3b2e9d1b90 agent: refactor syncChecks 2015-01-20 21:48:46 -08:00
Ryan Uber 27de0adf3b agent: support multiple checks per service 2015-01-20 21:48:42 -08:00
Veres Lajos 850d5bdc32 typofixes - https://github.com/vlajos/misspell_fixer 2014-12-04 23:25:06 +00:00
Armon Dadgar b10159a87b agent: Handle service ACLs when doing anti-entropy 2014-12-01 11:43:01 -08:00
Ryan Uber b7a75a2314 formatting 2014-10-15 14:56:15 -07:00
Ryan Uber f83f80cf22 agent: remove special case of consul service, adjust tests 2014-10-15 14:52:00 -07:00
Armon Dadgar e972de44ad agent: Cleanup handling of defer checks 2014-06-10 10:42:55 -07:00
Armon Dadgar 801a472042 agent: Prevent anti-entropy from doing early sync of check output 2014-06-09 16:00:25 -07:00
Armon Dadgar bc9ea2af9a agent: leave inSync until the defer runs 2014-06-09 12:57:50 -07:00
Armon Dadgar 33c2132949 agent: Defer sync based on CheckUpdateInterval 2014-06-09 12:46:29 -07:00
Armon Dadgar bf009097ae agent: Ensure we don't retry too often 2014-06-06 14:38:01 -07:00
Armon Dadgar 7e27ded378 agent: Adding debug log messages 2014-04-23 12:21:47 -07:00
Armon Dadgar e210f9b541 agent: Adding random stagger to anti-entropy. Fixes #72. 2014-04-23 12:21:34 -07:00
Armon Dadgar 2845d24cfa Store check output in dedicated field. Fixes #59. 2014-04-21 16:20:22 -07:00
Armon Dadgar 293733b74f agent: Simplify the local state sync 2014-04-14 12:57:54 -07:00
Armon Dadgar b4a9b4301b agent: Handle API changes 2014-03-05 15:03:23 -08:00
Armon Dadgar 7e6c8ac3a8 agent: simplify a select block 2014-02-19 12:39:03 -08:00
Armon Dadgar fb34815562 agent: adding ability to reload services and checks 2014-02-07 12:19:56 -08:00
Armon Dadgar 7534346103 agent: Adding server up callback to make state sync faster 2014-02-07 12:11:34 -08:00
Armon Dadgar ec2b10c006 agent: Adding support to edge trigger consul server coming up for state sync 2014-02-07 12:03:31 -08:00
Armon Dadgar 1c2a853831 agent: adding ability to pause syncing 2014-02-07 11:58:24 -08:00
Armon Dadgar 500d2d2282 agent: RPC changes and blocking query support 2014-02-05 14:36:13 -08:00
Armon Dadgar 344ed9569c agent: syncCheck provides the relevant check to prevent a race condition 2014-01-30 13:17:34 -08:00