Commit graph

16904 commits

Author SHA1 Message Date
Drew Bailey 65aa6a674d
Merge pull request #6872 from hashicorp/dep-gohclog
update go-hclog to latest release
2019-12-18 13:24:56 -05:00
Drew Bailey 31777c3fdb
update go-hclog to latest release 2019-12-18 12:53:33 -05:00
John Schlederer 8b35c75206 Making pull activity timeout configurable in Docker
* Making pull activity timeout configurable in Docker plugin config, first pass

* Fixing broken function call

* Fixing broken tests

* Fixing linter suggestion

* Adding documentation on new parameter in Docker plugin config

* Adding unit test

* Setting min value for pull_activity_timeout, making pull activity duration a private var
2019-12-18 12:58:53 +01:00
Lang Martin 06f441f562 test: quota: relax multierror message matching to Contains 2019-12-17 13:20:14 -05:00
Lang Martin fb6c27b828 test: build quota_apply_test, remove the tests that require ent 2019-12-17 13:20:14 -05:00
Mahmood Ali 504ce7f12c
Merge pull request #6865 from hashicorp/b-cli-error-formatting
cli: sequence cli.Ui operations
2019-12-16 16:06:13 -05:00
Drew Bailey 2030a9bd16
update changelog for group shutdown_delay 2019-12-16 11:41:37 -05:00
Drew Bailey d9e41d2880
docs for shutdown delay
update docs, address pr comments

ensure pointer is not nil

use pointer for diff tests, set vs unset
2019-12-16 11:38:35 -05:00
Drew Bailey ae145c9a37
allow only positive shutdown delay
more explicit test case, remove select statement
2019-12-16 11:38:30 -05:00
Drew Bailey 24929776a2
shutdown delay for task groups
copy struct values

ensure groupserviceHook implements RunnerPreKillhook

run deregister first

test that shutdown times are delayed

move magic number into variable
2019-12-16 11:38:16 -05:00
Mahmood Ali 76be9b4afb cli: sequence cli.Ui operations
Fixes a bug where, if command flag parsing fails, the resulting error
and help usage messages get interleaved in an unexpected and
user-unfriendly way.

The reason is that the flag parsing library effectively writes to
ui.Error in a goroutine. This is problematic: first, we lose the
sequencing between the help usage and the error message; second, cli.Ui
methods are not concurrency-safe.

Here, we introduce a custom error writer that buffers the result and
calls ui.Error() in the write method and in the same goroutine.

For context, we need to wrap ui.Error because it's line-oriented, while
the flags library expects an io.Writer, which is byte-oriented.
2019-12-16 10:08:17 -05:00
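The adapter described in the commit above (a byte-oriented `io.Writer` in front of a line-oriented sink) can be sketched in Go as follows. This is a minimal illustration, not the code from the commit: `lineWriter`, `emit`, `Flush`, and `collectLines` are hypothetical names, and `emit` merely stands in for a `ui.Error`-style function.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// lineWriter (hypothetical) buffers arbitrary byte chunks and calls emit
// once per complete line, synchronously, in the caller's goroutine.
type lineWriter struct {
	buf  bytes.Buffer
	emit func(line string)
}

// Write appends p to the buffer and emits every complete line found so far.
func (w *lineWriter) Write(p []byte) (int, error) {
	w.buf.Write(p)
	for {
		i := bytes.IndexByte(w.buf.Bytes(), '\n')
		if i < 0 {
			break
		}
		line := string(w.buf.Bytes()[:i]) // copy before advancing the buffer
		w.buf.Next(i + 1)                 // drop the line and its newline
		w.emit(line)
	}
	return len(p), nil
}

// Flush emits any trailing text that was not newline-terminated.
func (w *lineWriter) Flush() {
	if w.buf.Len() > 0 {
		w.emit(w.buf.String())
		w.buf.Reset()
	}
}

// collectLines feeds chunks through a lineWriter and returns the lines
// in the order they were emitted.
func collectLines(chunks ...string) []string {
	var lines []string
	w := &lineWriter{emit: func(l string) { lines = append(lines, l) }}
	for _, c := range chunks {
		io.WriteString(w, c)
	}
	w.Flush()
	return lines
}

func main() {
	// Chunk boundaries deliberately do not match line boundaries.
	for _, l := range collectLines("flag provided but ", "not defined: -x\nUsage: ", "example [options]\n") {
		fmt.Println(l)
	}
}
```

Because `emit` is called inside `Write`, output order follows write order, and no extra goroutine touches the sink.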
Danielle fc85ec7295
env_aws: Disable Retries and set Session cfg (#6860)
env_aws: Disable Retries and set Session cfg
2019-12-16 15:25:29 +01:00
Danielle b006be623d
Update client/fingerprint/env_aws.go
Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>
2019-12-16 14:48:52 +01:00
Seth Hoenig 90716f13d3
Merge pull request #6848 from hashicorp/b-log-less-in-node-drainer
tests: remove trace statements from nodeDrainWatcher.watch
2019-12-16 07:47:52 -06:00
Tim Gross 9b2b4da3a4
e2e: run client/allocs metrics nightly tests vs Windows (#6850)
Adds Windows targets to the client/allocs metrics tests. Removes the
`allocstats` test, which covers less than these tests and is now
redundant.

Adds a firewall rule to our Windows instances so that the prometheus
server can scrape the Nomad HTTP API for metrics.
2019-12-16 08:34:17 -05:00
Michel Vocks 5cb462fd13 Add raw field for ClientCert and ClientKey 2019-12-16 14:30:00 +01:00
Seth Hoenig 270233e23d tests: remove trace statements from nodeDrainWatcher.watch
Avoid logging in the `watch` function as much as possible, since
it is not waited on during a server shutdown. When the logger
logs after a test passes, it may or may not cause the testing
framework to panic.

More info in:
https://github.com/golang/go/issues/29388#issuecomment-453648436
2019-12-16 07:08:11 -06:00
Michel Vocks 6e413b3929 Update go mod 2019-12-16 12:47:10 +01:00
Michel Vocks 3864d91d03 Add option to set certificate in-memory via SDK 2019-12-16 10:59:27 +01:00
Danielle Lancashire 5a87b3ab4b
env_aws: Disable Retries and set Session cfg
Previously, Nomad used hand rolled HTTP requests to interact with the
EC2 metadata API. Recently however, we switched to using the AWS SDK for
this fingerprinting.

The default behaviour of the AWS SDK is to perform retries with
exponential backoff when a request fails. This is problematic for Nomad,
because interacting with the EC2 API is in our client start path.

Here we revert to our pre-existing behaviour of not performing retries
in the fast path, as if the metadata service is unavailable, it's likely
that nomad is not running in AWS.
2019-12-16 10:56:32 +01:00
Michael Schurter 9b08968c2e
Merge pull request #6855 from hashicorp/b-interp-connect-task
connect: canonicalize before adding sidecar
2019-12-13 09:26:44 -08:00
Mahmood Ali 4a1cc67f58
Merge pull request #6820 from hashicorp/f-skip-docker-logging-knob
driver: allow disabling log collection
2019-12-13 11:41:20 -05:00
Mahmood Ali a7361612b6
Merge pull request #6556 from hashicorp/c-vendor-multierror-20191025
Update go-multierror library
2019-12-13 11:32:42 -05:00
Mahmood Ali 46bc3b57e6 address review comments 2019-12-13 11:21:00 -05:00
Mahmood Ali b3a1e571e5 tests: fix error format assertion
multierror library changed formatting slightly.
2019-12-13 11:01:20 -05:00
Buck Doyle d1b689c0f4 Update changelog with #6817 2019-12-13 09:16:31 -06:00
Mahmood Ali ea30ab9c56 Update go-multierror to 72917a1
To pick up https://github.com/hashicorp/go-multierror/pull/28
2019-12-13 10:13:31 -05:00
Buck Doyle 6ff6a41df8
Fix flapping status light test (#6852)
I unintentionally introduced a flapping test in #6817. The
draining status of the node will be randomly chosen and
that flag takes precedence over eligibility. This forces
the draining flag to be false rather than random so the
test should no longer flap.

See here for an example failure:
https://circleci.com/gh/hashicorp/nomad/26368
2019-12-13 09:02:02 -06:00
Preetha Appan 5f7cfa5050 update changelog 2019-12-13 08:22:14 -06:00
Mahmood Ali 5a81cd3309
Merge pull request #6839 from hashicorp/b-cgroup-cleanup
executor: stop joining executor to container cgroup
2019-12-13 09:05:09 -05:00
Michael Schurter 5a25766714 docs: add #6855 to changelog
Also make Connect related fixes more consistent in the changelog. I
suspect users won't care if a Connect related fix is in the server's
admission controller or in the client's groupservice hook or somewhere
else, so I think grouping them by `consul/connect:` makes the most
sense.
2019-12-12 20:58:49 -08:00
Michael Schurter 95fd2643d7 connect: canonicalize before adding sidecar
Fixes #6853

Canonicalize jobs first before adding any sidecars. This fixes a bug
where sidecar tasks were added without interpolated names and broke
validation. Sidecar tasks must be canonicalized independently.

Also adds a group network to the mock connect job because it wasn't a
valid connect job before!
2019-12-12 20:55:56 -08:00
Mahmood Ali 570d81966f
Merge pull request #6854 from hashicorp/update-changelog
Add notarization details to changelog
2019-12-12 20:56:53 -05:00
Michele 161780ba00 Add clarifying update 2019-12-12 15:28:47 -08:00
Michele 49d0ef63aa Add apple notarization note 2019-12-12 15:24:18 -08:00
Preetha 2692949556
Merge pull request #6849 from hashicorp/b-debug-preemption
Use debug logging for scheduler internals
2019-12-12 16:15:46 -06:00
Preetha Appan afff27b69b More error->debug for logging in the bin packing iterator 2019-12-12 15:50:16 -06:00
ebarriosjr b953239227 driver/pot: Added extra_hosts and args commands (#6577) 2019-12-12 16:29:45 -05:00
Buck Doyle 09067b4eb7
UI: Fix client sorting (#6817)
There are two changes here, and some caveats/commentary:

1. The “State” table column was actually sorting only by status. The state was not an actual property, just something calculated in each client row, as a product of status, isEligible, and isDraining. This PR adds isDraining as a component of compositeState so it can be used for sorting.

2. The Sortable mixin declares dependent keys that cause the sort to be live-updating, but only if the members of the array change, such as if a new client is added, but not if any of the sortable properties change. This PR adds a SortableFactory function that generates a mixin whose listSorted computed property includes dependent keys for the sortable properties, so the table will live-update if any of the sortable properties change, not just the array members. There’s a warning if you use SortableFactory without dependent keys and via the original Sortable interface, so we can eventually migrate away from it.
2019-12-12 13:06:54 -06:00
Michael Lange ed8fd28a10
Merge pull request #6808 from hashicorp/b-ui/unclosed-log-streams
UI: Unclosed log streams
2019-12-12 10:55:49 -08:00
Preetha Appan 3458b41290 Use debug logging for scheduler internals
We currently log an error if preemption is unable to find a suitable set of
allocations to preempt. This commit changes that to debug level since not finding
preemptable allocations is not an error condition.
2019-12-12 12:05:29 -06:00
Tim Gross e439e927ed
e2e: run client/allocs metrics tests nightly (#6842)
Refactor the metrics end-to-end tests so they can be run with our e2e
test framework. Runs fabio/prometheus and a collection of jobs that
will cause metrics to be measured. We then query Prometheus to ensure
we're publishing those allocation metrics and some metrics from the
clients as well.

Includes adding a placeholder for running the same tests on Windows.
2019-12-12 12:45:16 -05:00
Mahmood Ali d80ae6765b simplify cgroup path lookup 2019-12-11 12:43:25 -05:00
Seth Hoenig 2600c95af7
Merge pull request #6838 from hashicorp/f-parallelize-state-store-tests
tests: parallelize state store tests
2019-12-11 11:05:52 -06:00
Mahmood Ali 94ab62dfb4 executor: stop joining executor to container cgroup
Stop joining libcontainer executor process into the newly created task
container cgroup, to ensure that the cgroups are fully destroyed on
shutdown, and to make it consistent with other plugin processes.

Previously, the executor process was added to the container cgroup so
that its resources were aggregated along with user processes in our
metric aggregation.

However, adding the executor process to the container cgroup adds some
complications without much benefit:

First, it complicates cleanup. We must ensure that the executor is
removed from the container cgroup on shutdown. However, we had a bug
where we missed removing it from the systemd cgroup, because the
executor uses `containerState.CgroupPaths` on launch, which includes
systemd, but `cgroups.GetAllSubsystems`, which doesn't, on removal.

Second, it may have adverse side effects. When a user process is CPU
bound or uses too much memory, the executor should keep functioning
without risk of being killed (by the OOM killer) or throttled.

Third, it is inconsistent with other drivers and plugins. Logmon and
DockerLogger processes aren't in the task cgroups. Neither are
containerd processes, though containerd is equivalent to the executor
in responsibility.

Fourth, in my experience, when the executor process moves cgroups while
it's running, the cgroup aggregation is odd. The cgroup
`memory.usage_in_bytes` doesn't seem to capture the full memory usage of
the executor process and becomes a red herring when investigating memory
issues.

For all the reasons above, I opted to have the executor remain in the
Nomad agent cgroup, and we can revisit this when we have a better story
for plugin process cgroup management.
2019-12-11 11:28:09 -05:00
Mahmood Ali 739e5e8811 drivers/exec: test all cgroups are destroyed 2019-12-11 11:12:29 -05:00
Seth Hoenig d45dec1ca8 tests: parallelize state store tests
It has been decided we're going to live in a many core world.
Let's take advantage of that and parallelize these state store
tests which all run in memory and are largely CPU bound.

An unscientific benchmark demonstrating the improvement:

[mp state (master)] $ go test
PASS
ok  	github.com/hashicorp/nomad/nomad/state	5.162s

[mp state (f-parallelize-state-store-tests)] $ go test
PASS
ok  	github.com/hashicorp/nomad/nomad/state	1.527s
2019-12-11 09:36:37 -06:00
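The change described in the commit above comes down to calling `t.Parallel()` in tests that share no state. The sketch below shows that pattern with a hypothetical in-memory `memStore` standing in for the real state store; the type, its methods, and the test names are all assumptions made for illustration. Saved as a `_test.go` file, it runs with `go test`.

```go
package main

import (
	"sync"
	"testing"
)

// memStore is a hypothetical in-memory store used only for this sketch.
type memStore struct {
	mu    sync.Mutex
	items map[string]int
}

func newMemStore() *memStore {
	return &memStore{items: make(map[string]int)}
}

// upsert inserts or overwrites the value stored under key.
func (s *memStore) upsert(key string, v int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items[key] = v
}

// count reports how many distinct keys are stored.
func (s *memStore) count() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.items)
}

func TestUpsertInsertsNewKey(t *testing.T) {
	t.Parallel() // allow this test to run alongside other parallel tests

	s := newMemStore() // each test owns its store, so nothing is shared
	s.upsert("node-1", 1)
	if got := s.count(); got != 1 {
		t.Fatalf("count = %d, want 1", got)
	}
}

func TestUpsertOverwritesExistingKey(t *testing.T) {
	t.Parallel()

	s := newMemStore()
	s.upsert("job-1", 1)
	s.upsert("job-1", 2)
	if got := s.count(); got != 1 {
		t.Fatalf("count = %d, want 1", got)
	}
}
```

This only pays off when each test builds its own fixtures, which is why in-memory, CPU-bound tests like the ones the commit describes are good candidates.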
Tim Gross b25713a837
doc: spread is inherited from job to group (#6837) 2019-12-11 09:59:26 -05:00
Drew Bailey a8bb422500
Merge pull request #6834 from hashicorp/monitor-changelog
add 6828 to changelog
2019-12-11 08:17:12 -05:00
Michael Schurter cdfaa3ca8a
Merge pull request #6833 from hashicorp/sentinel-imports-note
Make note of Sentinel standard imports
2019-12-10 13:56:01 -08:00