Commit graph

3039 commits

Author SHA1 Message Date
Michael Schurter f1d13683e6
consul: remove services with/without canary tags
Guard against Canary being set to false at the same time as an
allocation is being stopped: this could cause RemoveTask to be called
with the wrong Canary value and leaking a service.

Deleting both Canary values is the safest route.
2018-05-07 14:55:01 -05:00
Michael Schurter 50e04c976e
consul: support canary tags for services
Also refactor Consul ServiceClient to take a struct instead of a massive
set of arguments. Meant updating a lot of code but it should be far
easier to extend in the future as you will only need to update a single
struct instead of every single call site.

Adds an e2e test for canary tags.
2018-05-07 14:55:01 -05:00
Alex Dadgar df8fce4347
Ensure canaries tags are interpolated 2018-05-07 14:50:01 -05:00
Alex Dadgar 552604451c
rework where time gets set 2018-05-07 14:50:01 -05:00
Alex Dadgar ee50789c22
Initial implementation 2018-05-07 14:50:01 -05:00
Michael Schurter 0d534d30d6
Merge pull request #4251 from hashicorp/f-grpc-checks
Support Consul gRPC Health Checks
2018-05-04 14:55:16 -07:00
Michael Schurter f6a4713141 consul: make grpc checks more like http checks 2018-05-04 11:08:11 -07:00
Michael Schurter 382caec1e1 consul: initial grpc implementation
Needs to be more like http.
2018-05-04 11:08:11 -07:00
Michael Schurter 526af6a246 framer: fix early exit/truncation in framer 2018-05-02 10:46:16 -07:00
Michael Schurter f1a6aa103a framer: fix race and remove unused error var
In the old code `sending` in the `send()` method shared the Data slice's
underlying backing array with its caller. Clearing StreamFrame.Data
didn't break the reference from the sent frame to the StreamFramer's
data slice.
2018-05-02 10:46:16 -07:00
Michael Schurter 7360fe3a6d client: squelch errors on cleanly closed pipes 2018-05-02 10:46:16 -07:00
Michael Schurter ffff97e25f client: don't spin on read errors 2018-05-02 10:46:16 -07:00
Michael Schurter 5ef0a82e6e client: reset encoders between uses
According to go/codec's docs, Reset(...) should be called on
Decoders/Encoders before reuse:

https://godoc.org/github.com/ugorji/go/codec

I could find no evidence that *not* calling Reset() caused bugs, but
might as well do what the docs say?
2018-05-02 10:46:16 -07:00
Alex Dadgar de4af37249 version bump and remove generated 2018-04-27 11:10:00 -07:00
Alex Dadgar 845a43864a generated files 2018-04-27 10:45:40 -07:00
Alex Dadgar 35e06ddb31 Remove generated and version bump 2018-04-26 16:49:19 -07:00
Alex Dadgar 43192cefae generated files 2018-04-26 16:28:58 -07:00
Michael Schurter 0e602d4779
Merge pull request #4188 from hashicorp/f-rkt-stats
rkt: create parent cgroup to enable stats
2018-04-24 14:54:36 -07:00
Michael Schurter d687761ebf rkt: test Stats() and always run tests
Remove the NOMAD_TEST_RKT flag as a guard for rkt tests. Still require
Linux, root, and rkt to be installed. Only check for rkt installation
once in hopes of speeding up rkt tests a bit.
2018-04-24 11:05:42 -07:00
Javier Palomo Almena 3e6c01ffa1 docker tests: Fix usage of NewDriverContext 2018-04-23 22:51:06 +02:00
Javier Palomo Almena 74d3c5df07 DriverContext: Add the TaskGroup and the Job name
Adding this fields to the DriverContext object, will allow us to pass
them to the drivers.

An use case for this, will be to emit tagged metrics in the drivers,
which contain all relevant information:
- Job
- TaskGroup
- Task
- ...

Ref: https://github.com/hashicorp/nomad/pull/4185
2018-04-23 00:15:29 +02:00
Michael Schurter 4cee6cca6c rkt: create parent cgroup to enable stats
Having the Nomad executor create parent cgroups that rkt is launched
within allows the stats collection code used for the exec driver to Just
Work. The only downside is that now the Nomad executor's resource
utilization counts against the cgroups resource limits just as it does
for the exec driver.
2018-04-19 15:14:56 -07:00
Michael Schurter 1a85d0c990 run goimports 2018-04-19 11:16:28 -07:00
Michael Schurter d77c265d1f
Merge pull request #4168 from ninoles/b-2117-windows-group-process
B 2117 windows group process
2018-04-19 11:10:51 -07:00
Michael Schurter fdbcbd4e5b
Merge pull request #4058 from hashicorp/f-mock-by-default
[Post-0.8] test: build with mock_driver by default
2018-04-18 15:57:00 -07:00
Michael Schurter d3650fb2cd test: build with mock_driver by default
`make release` and `make prerelease` set a `release` tag to disable
enabling the `mock_driver`
2018-04-18 14:45:33 -07:00
Michael Schurter a991923389 tests: fix race in alloc_runner_test.go
I could not reproduce the failure locally even with `stress -cpu ...`
eating all the cpu it could on my machine.

But I think the race was in one of two places:
* The task could restart which could create new events
* I think there could be a race between the updater's version of events
  and alloc runners as updates are async

I fixed both. Here's hoping that fixes this flaky test.
2018-04-17 17:14:59 -07:00
Fabien Ninoles c81bec48c9 Merge branch 'master' into b-2117-windows-group-process 2018-04-17 13:47:25 -04:00
Fabien Ninoles 35cf641416 Update based on PR request. 2018-04-17 13:43:04 -04:00
Alex Dadgar c4ad76091d
Merge pull request #4166 from hashicorp/b-panic-fix-update
Fixes races accessing node and updating it during fingerprinting
2018-04-17 10:02:19 -07:00
Chelsea Holland Komlo 9b8a079558 fix up comments 2018-04-17 11:53:08 -04:00
Alex Dadgar 9d612c8cb0 Cleanup 2018-04-16 15:48:34 -07:00
Alex Dadgar 32adaf9dfc Copy the config given to the alloc runner 2018-04-16 15:45:52 -07:00
Alex Dadgar 3ff2d4d795 fix race node access 2018-04-16 15:45:51 -07:00
Alex Dadgar 4f2a7b6949 Fix copying drivers 2018-04-16 15:45:51 -07:00
Alex Dadgar 0b799822ff Operate on copy 2018-04-16 15:45:49 -07:00
Fabien Ninoles 27cf4995ce - Clean up for windows compilation.
- Set CREATE_NEW_PROCESS_GROUP for Windows subprocess.
- Ensure we only kill actual process that need to.
2018-04-14 13:58:42 -04:00
Michael Schurter 3836b8a335
Merge pull request #3572 from emate/master
Create new process group on process startup.
2018-04-13 11:56:38 -07:00
Alex Dadgar adaf4fa7e0 Remove generated structs 2018-04-12 16:35:31 -07:00
Alex Dadgar 663c4d0433 Version bump and generated files 2018-04-12 16:21:50 -07:00
Alex Dadgar ff1a1a63e8 Move where attribute for driver detection is set 2018-04-12 15:50:25 -07:00
Chelsea Holland Komlo 5291788b40 delete driver name from only health check attributes 2018-04-12 18:24:41 -04:00
Alex Dadgar 3d53d380f7 Fix tests 2018-04-12 14:29:30 -07:00
Alex Dadgar f24ce2c50c Driver health detection cleanups
This PR does:

1. Health message based on detection has format "Driver XXX detected"
and "Driver XXX not detected"
2. Set initial health description based on detection status and don't
wait for the first health check.
3. Combine updating attributes on the node, fingerprint and health
checking update for drivers into a single call back.
4. Condensed driver info in `node status` only shows detected drivers
and make the output less wide by removing spaces.
2018-04-12 12:46:40 -07:00
Charlie Voiselle ba88f00ccb Changed "til" to "until"
Should be "till" or "until"; chose "until" because it is unambiguous as to meaning.
2018-04-11 12:36:28 -05:00
Chelsea Komlo eb5aac16e6
Merge pull request #4111 from hashicorp/b-undetected-set-health-to-false
Immediately set driver health status to false when driver moves to undetected
2018-04-10 18:30:31 -04:00
Chelsea Holland Komlo d58b3e473c update comment for when the fingerprinter setting health status 2018-04-10 16:53:00 -04:00
Chelsea Holland Komlo f7ef13cc64 fingerprinter should set health check status if health check is not periodic 2018-04-10 15:29:51 -04:00
Chelsea Holland Komlo ede4f518bd add setters for access to the fingerprint manager's node
refactor extracting driver info
2018-04-10 15:29:51 -04:00
Chelsea Holland Komlo f479da19f5 guard against overwriting health status 2018-04-10 15:29:51 -04:00