Commit Graph

164 Commits

Author SHA1 Message Date
Preetha Appan ce6d4a8d7a
Fix tests and move isClient to constructor 2018-06-01 15:59:53 -05:00
Preetha Appan a5bfaa098c
Fix unnecessary deregistration in consul sync
This commit fixes an issue where if a nomad client and server shared the same consul instance, the server would deregister any services and checks registered by clients for running tasks.
2018-06-01 14:48:25 -05:00
Alex Dadgar 51e67daf69 Use Tags when CanaryTags isn't specified
This PR fixes a bug where we weren't defaulting to `tags` when
`canary_tags` was empty and adds documentation.
2018-05-23 13:07:47 -07:00
Michael Schurter f1d13683e6
consul: remove services with/without canary tags
Guard against Canary being set to false at the same time as an
allocation is being stopped: this could cause RemoveTask to be called
with the wrong Canary value and leaking a service.

Deleting both Canary values is the safest route.
2018-05-07 14:55:01 -05:00
Michael Schurter 50e04c976e
consul: support canary tags for services
Also refactor Consul ServiceClient to take a struct instead of a massive
set of arguments. Meant updating a lot of code but it should be far
easier to extend in the future as you will only need to update a single
struct instead of every single call site.

Adds an e2e test for canary tags.
2018-05-07 14:55:01 -05:00
Michael Schurter f6a4713141 consul: make grpc checks more like http checks 2018-05-04 11:08:11 -07:00
Michael Schurter 382caec1e1 consul: initial grpc implementation
Needs to be more like http.
2018-05-04 11:08:11 -07:00
Michael Schurter cfcbb9fa21 consul: periodically reconcile services/checks
Periodically sync services and checks from Nomad to Consul. This is
mostly useful when testing with the Consul dev agent which does not
persist state across restarts. However, this is a reasonable safety
measure to prevent skew between Consul's state and Nomad's
services+checks.

Also modernized the test suite a bit.
2018-04-19 15:45:42 -07:00
Michael Schurter d3650fb2cd test: build with mock_driver by default
`make release` and `make prerelease` set a `release` tag to disable
enabling the `mock_driver`
2018-04-18 14:45:33 -07:00
Michael Schurter 86f562be3a Remove unnecessary conversions 2018-03-16 16:32:59 -07:00
Michael Schurter c3e8f6319c gofmt -s (simplify) files 2018-03-16 16:31:16 -07:00
Michael Schurter 0971114f0c Replace Consul TLSSkipVerify handling
Instead of checking Consul's version on startup to see if it supports
TLSSkipVerify, assume that it does and only log in the job service
handler if we discover Consul does not support TLSSkipVerify.

The old code would break TLSSkipVerify support if Nomad started before
Consul (such as on system boot) as TLSSkipVerify would default to false
if Consul wasn't running. Since TLSSkipVerify has been supported since
Consul 0.7.2, it's safe to relax our handling.
2018-03-14 17:43:06 -07:00
Josh Soref 1359fd2c3d spelling: unexpected 2018-03-11 19:08:07 +00:00
Josh Soref 42d7f19861 spelling: supports 2018-03-11 19:00:11 +00:00
Josh Soref 05305afcd9 spelling: services 2018-03-11 18:53:58 +00:00
Josh Soref ad55e85e73 spelling: registrations 2018-03-11 18:40:53 +00:00
Josh Soref 3c1ce6d16d spelling: otherwise 2018-03-11 18:34:27 +00:00
Josh Soref 85fabc63c8 spelling: expected 2018-03-11 17:57:01 +00:00
Josh Soref 7a6dfa4b1a spelling: current 2018-03-11 17:52:32 +00:00
Josh Soref 5f87691df1 spelling: asynchronously 2018-03-11 17:41:50 +00:00
Michael Schurter 8a0cf66822 Improve invalid port error message for services
Related to #3681

If a user specifies an invalid port *label* when using
address_mode=driver they'll get an error message about the label being
an invalid number which is very confusing.

I also added a bunch of testing around Service.AddressMode validation
since I was concerned by the linked issue that there were cases I was
missing. Unfortunately when address_mode=driver is used there's only so
much validation that can be done as structs/structs.go validation never
peeks into the driver config which would be needed to verify the port
labels/map.
2018-01-18 15:35:24 -08:00
Michael Schurter 447dc5bbd3 Fix test 2018-01-18 15:35:24 -08:00
Michael Schurter 583e17fad5 Always advertise driver IP when in driver mode
Fixes #3681

When in drive address mode Nomad should always advertise the driver's IP
in Consul even when no network exists. This matches the 0.6 behavior.

When in host address mode Nomad advertises the alloc's network's IP if
one exists. Otherwise it lets Consul determine the IP.

I also added some much needed logging around Docker's network discovery.
2018-01-18 15:35:24 -08:00
Michael Schurter 714eb0b266 Services should not require a port
Fixes #3673
2017-12-19 15:50:23 -08:00
Michael Schurter cdcefd0908 Use the Service.Hash() method in agent service ids
The allocID and taskName parameters are useless for agents, but it's
still nice to reuse the same hash method for agent and task services.
This brings in the lowercase mode for the agent hash as well.
2017-12-11 16:50:15 -08:00
Michael Schurter 4f1002c1a8 Be more defensive in port checks 2017-12-08 12:27:57 -08:00
Michael Schurter d613e0aaf5 Move service hash logic to Service.Hash method 2017-12-08 12:03:43 -08:00
Michael Schurter b71edf846f Hash fields used in task service IDs
Fixes #3620

Previously we concatenated tags into task service IDs. This could break
deregistration of tag names that contained double //s like some Fabio
tags.

This change breaks service ID backward compatibility so on upgrade all
users services and checks will be removed and re-added with new IDs.

This change has the side effect of including all service fields in the
ID's hash, so we no longer have to track PortLabel and AddressMode
changes independently.
2017-12-08 12:03:43 -08:00
Michael Schurter 91282315d1 Prevent using port 0 with address_mode=driver 2017-12-08 12:03:43 -08:00
Michael Schurter 4b20441eef Validate port label for host address mode
Also skip getting an address for script checks which don't use them.

Fixed a weird invalid reserved port in a TaskRunner test helper as well
as a problem with our mock Alloc/Job. Hopefully the latter doesn't cause
other tests to fail, but we were referencing an invalid PortLabel and
just not catching it before.
2017-12-08 12:03:43 -08:00
Michael Schurter 4347026f83 Test Consul from TaskRunner thoroughly
Rely less on the mockConsulServiceClient because the real
consul.ServiceClient needs all the testing it can get!
2017-12-08 12:03:00 -08:00
Michael Schurter 4ae115dc59 Allow custom ports for services and checks
Fixes #3380

Adds address_mode to checks (but no auto) and allows services and checks
to set literal port numbers when using address_mode=driver.

This allows SDNs, overlays, etc to advertise internal and host addresses
as well as do checks against either.
2017-12-08 12:03:00 -08:00
Jens Herrmann 5680fcccc2 Fix typos in metric names. #3610 2017-12-01 15:24:14 +01:00
Michael Schurter 0aace3d749 Don't set Interval on TTL health checks 2017-10-16 17:35:47 -07:00
Alex Dadgar 4173834231 Enable more linters 2017-09-26 15:26:33 -07:00
Michael Schurter a844fba8d2 Fix comments: task -> check 2017-09-15 15:19:53 -07:00
Michael Schurter 0f2a3dcec9 Test check watch updates 2017-09-14 16:48:39 -07:00
Michael Schurter 847fe080f6 Rename unhealthy var and fix test indeterminism 2017-09-14 16:48:39 -07:00
Michael Schurter 573a0df03d Watched -> TriggersRestart
Watched was a silly name
2017-09-14 16:48:39 -07:00
Michael Schurter 4ea19baa52 Handle multiple failing checks on a single task
Before this commit if a task had 2 checks cause restarts at the same
time, both would trigger restarts of the task! This change removes all
checks for a task whenever one of them is restarted.
2017-09-14 16:48:39 -07:00
Michael Schurter 73fb71ca10 RestartDelay isn't needed as checks are re-added on restarts
@dadgar made the excellent observation in #3105 that TaskRunner removes
and re-registers checks on restarts. This means checkWatcher doesn't
need to do *any* internal restart tracking. Individual checks can just
remove themselves and be re-added when the task restarts.
2017-09-14 16:48:39 -07:00
Michael Schurter 448ad3945f Simplify from 2 select loops to one 2017-09-14 16:48:39 -07:00
Michael Schurter 550e631eea Wrap check watch updates in a struct
Reusing checkRestart for both adds/removes and the main check restarting
logic was confusing.
2017-09-14 16:48:39 -07:00
Michael Schurter 72e5c0c0aa Fix whitespace 2017-09-14 16:47:41 -07:00
Michael Schurter ade29ecbed Improve check watcher logging and add tests
Also expose a mock Consul Agent to allow testing ServiceClient and
checkWatcher from TaskRunner without actually talking to a real Consul.
2017-09-14 16:47:41 -07:00
Michael Schurter a137676358 Add comments and move delay calc to TaskRunner 2017-09-14 16:46:54 -07:00
Michael Schurter a180c00fc3 on_warning=false -> ignore_warnings=false
Treat warnings as unhealthy by default
2017-09-14 16:46:54 -07:00
Michael Schurter 8a87475498 Use existing restart policy infrastructure 2017-09-14 16:46:54 -07:00
Michael Schurter 22690c5f4c Add check watcher for restarting unhealthy tasks 2017-09-14 16:46:54 -07:00
Michael Schurter 7f6e1f3a9c Initializing embedded structs is weird 2017-08-17 16:49:14 -07:00