Commit graph

13449 commits

Author SHA1 Message Date
Danielle Tomlinson c623609e89 guides: Update for globbed namespace rules 2018-12-19 12:48:56 +01:00
Danielle Tomlinson cbfa1388db fixup: Code Review 2018-12-12 12:43:16 +01:00
Danielle Tomlinson 1ee0777521 fixup: Correctly sort based on distance, use iradix for ordering 2018-12-11 17:35:51 +01:00
Danielle Tomlinson 8d76d9c24b acl: Add support for globbing namespaces
This commit adds basic support for globbing namespaces in acl
definitions.

For concrete definitions, we merge all of the defined policies at load time, and
perform a simple lookup later on. If an exact match of a concrete
definition is found, we do not attempt to resolve globs.

For glob definitions, we merge definitions of exact replicas of a glob.

When loading a policy for a glob definition, we choose the glob that has
the closest match to the namespace we are resolving for. We define the
closest match as the one with the _smallest character difference_
between the glob and the namespace we are matching.
2018-12-11 16:33:19 +01:00
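
The closest-match rule described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `closestGlob` is hypothetical, `path.Match` from the standard library stands in for whatever glob matcher the ACL code uses, and "character difference" is assumed to mean the absolute difference in length between the glob and the namespace.

```go
package main

import (
	"fmt"
	"path"
)

// closestGlob is a hypothetical helper: it returns the glob that matches ns
// with the smallest character difference, or false if no glob matches.
// Assumption: the difference is the absolute length difference between the
// namespace and the glob pattern.
func closestGlob(globs []string, ns string) (string, bool) {
	best, bestDiff, found := "", 0, false
	for _, g := range globs {
		ok, err := path.Match(g, ns)
		if err != nil || !ok {
			continue // malformed pattern or no match
		}
		diff := len(ns) - len(g)
		if diff < 0 {
			diff = -diff
		}
		// Keep the first glob seen on ties; a real implementation would
		// need a deterministic tie-break.
		if !found || diff < bestDiff {
			best, bestDiff, found = g, diff, true
		}
	}
	return best, found
}

func main() {
	globs := []string{"*", "prod-*", "prod-api-*"}
	g, ok := closestGlob(globs, "prod-api-web")
	fmt.Println(g, ok)
}
```

With these inputs, `prod-api-*` differs from `prod-api-web` by two characters, fewer than either of the other matching globs, so it is the one selected.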
Michael Schurter 8808ab9cea
Merge pull request #4953 from hashicorp/b-script-context-wrapper
consul: add ScriptExecutor context wrapper
2018-12-10 17:22:53 -08:00
Michael Schurter 4c5f3ae82c
Merge pull request #4952 from hashicorp/b-script-context
consul: fix script checks exiting after 1 run
2018-12-10 17:22:15 -08:00
Alex Dadgar 457c6eb398 typo 2018-12-10 15:35:26 -08:00
Alex Dadgar 508a3dfa49 merge 087 and 090 changelog 2018-12-10 15:34:21 -08:00
Mahmood Ali 97829a3f02 fix dtestutil.NewDriverHarness ref 2018-12-08 09:58:23 -05:00
Mahmood Ali 021d3720b5
Merge pull request #4950 from hashicorp/b-exc-libcontainer-kill
executor: kill all container processes
2018-12-08 09:52:42 -05:00
Nick Ethier 32057b6f7f
Merge pull request #4973 from emate/recover-filerotator-from-io-errors
Recover from any possible io error when invoking Write on FileRotator
2018-12-08 00:05:42 -05:00
Alex Dadgar 695fa416a6
Merge pull request #4965 from hashicorp/b-gc-running
Don't GC running but desired stop allocations
2018-12-07 13:36:33 -08:00
Marcin Matlaszek 39eec70f31
Recover from any possible io error when invoking Write on FileRotator
As of now, FileRotator uses bufio.Writer under the hood to write data to
the configured output file. Because bufio saves any io error it encounters
in its `err` variable and never resets it automatically, every later
operation such as `Write` or `Flush` becomes a no-op that returns the very
same saved error (e.g. out of disk space), even after the problem is fixed
(e.g. disk space is available again).

That automatically means that FileRotator will stop writing any logs,
reporting the same error over and over again, even if it's no longer
valid.

This PR fixes it by resetting the bufio Writer, which clears any saved
error, and then retrying the write of the requested data.
2018-12-07 18:22:29 +01:00
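
The sticky-error behaviour described above can be reproduced with the standard library alone. The sketch below is illustrative and is not the FileRotator code: `flakyWriter` and `demo` are hypothetical names, and the "disk full" condition is simulated with a boolean flag.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
)

// flakyWriter is a hypothetical destination that fails while full is true,
// standing in for a file on a disk that has run out of space.
type flakyWriter struct {
	full bool
	data []byte
}

func (w *flakyWriter) Write(p []byte) (int, error) {
	if w.full {
		return 0, errors.New("no space left on device")
	}
	w.data = append(w.data, p...)
	return len(p), nil
}

// demo shows that a bufio.Writer keeps returning a saved error until Reset
// is called, even after the underlying writer has recovered.
func demo() (sticky error, recovered error, written string) {
	dst := &flakyWriter{full: true}
	bw := bufio.NewWriter(dst)

	bw.WriteString("lost")
	_ = bw.Flush() // fails and stores the error inside bw

	dst.full = false // the underlying problem is fixed

	// The saved error is still returned although dst would now accept data.
	_, sticky = bw.WriteString("still failing")

	// Reset discards buffered data and clears the saved error.
	bw.Reset(dst)
	bw.WriteString("ok")
	recovered = bw.Flush()

	return sticky, recovered, string(dst.data)
}

func main() {
	sticky, recovered, written := demo()
	fmt.Println("before Reset:", sticky)
	fmt.Println("after Reset:", recovered)
	fmt.Println("written:", written)
}
```

Note that `Reset` throws away whatever was still buffered, so the data from the failed write has to be written again by the caller.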
Mahmood Ali 7d5b5bb5f9
Merge pull request #4933 from hashicorp/f-mount-device
Mount Devices in container based drivers
2018-12-07 10:32:03 -05:00
Mahmood Ali 91a67f347d Vendor libcontainer/devices 2018-12-07 09:13:27 -05:00
Danielle Tomlinson 8100252116
Merge pull request #4960 from hashicorp/dani/b-gc-tests
Re-enable Client GC tests
2018-12-06 23:18:36 +01:00
Mahmood Ali a7b205daf2
Merge pull request #4955 from hashicorp/fix-docker-tests-20181203
Fix docker driver tests
2018-12-06 16:41:33 -05:00
Danielle Tomlinson e3621c55fa gc: Fix maxallocs integration test 2018-12-06 21:50:50 +01:00
Mahmood Ali 9e825f880c Use absolute path in example device plugin
deviceDir is used for specifying mount/device host paths, and those
should be absolute paths.
2018-12-06 15:46:35 -05:00
Mahmood Ali bdc53b1d8e driver/rkt: mount plugin devices 2018-12-06 15:46:35 -05:00
Mahmood Ali 2c0fd2a902 driver/lxc: mount plugin devices
Also, LXC requires target paths to be relative.  Container paths in LXC
binds should never be absolute paths, so we strip any preceding `/`,
even if a user sets one.
2018-12-06 15:46:35 -05:00
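
A minimal sketch of the path normalization described above, assuming it amounts to trimming leading slashes from the container path. The helper name `lxcTargetPath` is hypothetical and is not taken from the driver.

```go
package main

import (
	"fmt"
	"strings"
)

// lxcTargetPath is a hypothetical helper that makes a container path
// relative by removing every leading "/", even if the user supplied one.
func lxcTargetPath(containerPath string) string {
	return strings.TrimLeft(containerPath, "/")
}

func main() {
	fmt.Println(lxcTargetPath("/dev/nvidia0"))
	fmt.Println(lxcTargetPath("dev/nvidia0"))
}
```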
Mahmood Ali 699875eb1c fixup: add missed docker utils test 2018-12-06 15:46:35 -05:00
Mahmood Ali e9557ae596 tests: ensure image is loaded as test setup 2018-12-06 15:36:43 -05:00
Michael Lange 81c2d8b4a2
Merge pull request #4967 from hashicorp/b-ui-stat-charts-can-escape-canvas
UI: Keep line charts in their canvases at all times
2018-12-06 10:56:37 -08:00
Danielle Tomlinson 62b98e64ca client/gc: Replace GC integration test with unit
The previous integration test was broken during the client refactor, and
it seems to be some sort of race with state updating.

I'm going to try and construct a replacement test as part of work on
performance, but for now, the underlying behaviour is still being
tested.
2018-12-06 12:28:23 +01:00
Danielle Tomlinson f6e474fd55 client: Re-enable GC tests 2018-12-06 12:28:23 +01:00
Danielle Tomlinson d043532cb0 allocrunner: Basic test alloc runner 2018-12-06 12:28:23 +01:00
Michael Lange 795ea7eade Grow the default 0 to 1 bounds to the domain of the data when necessary 2018-12-05 22:07:44 -08:00
Alex Dadgar b18a0f77a2
Merge pull request #4966 from hashicorp/b-failure-event
Fix various bugs with task events
2018-12-05 14:43:50 -08:00
Alex Dadgar b39c21d49c Fix various bugs with task events
Fixes the following:
* Emitting events when the task fails to start
* Don't double emit events on task shutdown (nomad stop)
* Don't emit an OOM kill metric unless actually OOM'd
2018-12-05 14:27:07 -08:00
Alex Dadgar 14a61ea3ea Don't GC running but desired stop allocations
This PR fixes an edge case where we could GC an allocation that was in a
desired stop state but had not terminated yet. This can be hit if the
client hasn't shut down the allocation yet or if the allocation is still
shutting down (long kill_timeout).

Fixes https://github.com/hashicorp/nomad/issues/4940
2018-12-05 13:01:12 -08:00
Mahmood Ali b55fb642f1 driver/docker: honor plugin devices 2018-12-04 21:31:28 -05:00
Mahmood Ali a580cef986 refactor device manipulation 2018-12-04 20:55:59 -05:00
Mahmood Ali 3a18105d06 drivers/exec: refactor stop/kill tests
Simplify the tests to do all assertions within the main goroutine and
account for status propagation delay.
2018-12-04 20:34:43 -05:00
Mahmood Ali adb4d69576
Merge pull request #4956 from hashicorp/b-vault-client-tweaks-followup
server/vault: Lock Vault expiration tracking
2018-12-04 19:46:59 -05:00
Mahmood Ali 366f478f8f
Merge pull request #4959 from hashicorp/fix-rkt-tests-20181204
tests: fix rkt tests
2018-12-04 19:46:41 -05:00
Mahmood Ali 428d35a5a9 executor: Keep 0.8.6 exit code for wait() failures
0.8.6 uses exit code 1 when `proc.Wait()` fails: https://github.com/hashicorp/nomad/blob/v0.8.6/client/driver/executor/executor.go#L442
2018-12-04 19:38:25 -05:00
Mahmood Ali 8df9de6fd5 driver/rkt: use rkt environment
The rkt command itself needs an environment with PATH set to find iptables.
2018-12-04 14:00:45 -05:00
Preetha 8068d9f64e
Merge pull request #4949 from hashicorp/b-neg-running-summary
Add guards around subtracting summary count
2018-12-04 12:52:58 -06:00
Mahmood Ali f8efc40b8b tests: stop integration tests tasks explicitly
Also update to the newly recommended `nomad job` subcommands
2018-12-04 11:50:59 -05:00
Dan Brown 8aebe8c47d Add Reference Architecture and Deployment Guide (#4768)
* Add Nomad RA

* Add deployment guide and nav

* Deployment Guide update

* Minor typo fixes

* Update diagrams

* Fixes for review

* Link fixes and typo fix

* Edits following review

- Update image text from "zone" to "datacenter" to match Nomad terminology
- Clean up text based on Preetha's feedback

* Text updates

Based on feedback from Rob

* Update diagrams

* fixing spelling

* Add suggestions from Preetha and Omar
2018-12-04 11:49:35 -05:00
Mahmood Ali 06a5cadf35 drivers/rkt: use image isolation for rkt 2018-12-04 11:40:10 -05:00
Mahmood Ali 178365848e tests: don't assert in WaitForResult
WaitForResult expects the body to fail and retries a few times before
giving up.  Assertions inside the testfn body cause it to terminate
abruptly without retrying.
2018-12-04 11:40:10 -05:00
Mahmood Ali 50e38104a5 server/nomad: Lock Vault expiration tracking
The `currentExpiration` field is accessed from multiple goroutines (Stats
and renewal), so it needs locking.

I don't anticipate high contention, so a simple mutex suffices.
2018-12-04 09:29:48 -05:00
Mahmood Ali f8ceeebf11
no t.Parallel() in executor table driven tests (#4948)
When `t.Parallel()` is used inside a `t.Run()` subtest, the closure
doesn't behave as expected, and some cases effectively get skipped.
More details can be found in
https://gist.github.com/posener/92a55c4cd441fc5e5e85f27bca008721
2018-12-04 09:04:04 -05:00
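
One common way a closure misbehaves in this situation, assumed here to be the cause since the commit only links to the gist, is loop-variable capture: before Go 1.22 the range variable is shared across iterations, and a parallel subtest body only runs after the loop has finished. The sketch below simulates that deferral with plain closures rather than the `testing` package, and shows the per-iteration copy that keeps each closure bound to its own case. `runDeferred` is a hypothetical name.

```go
package main

import "fmt"

// runDeferred builds one closure per name and runs them only after the loop
// has finished, which is roughly when parallel subtest bodies execute.
func runDeferred(names []string) []string {
	var pending []func() string
	for _, name := range names {
		name := name // per-iteration copy; required before Go 1.22
		pending = append(pending, func() string { return name })
	}

	var seen []string
	for _, fn := range pending {
		seen = append(seen, fn())
	}
	return seen
}

func main() {
	fmt.Println(runDeferred([]string{"a", "b", "c"}))
}
```

Without the copy, on older Go versions every closure would observe the last element, so the earlier cases would effectively never be exercised.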
Mahmood Ali 216a2566c7
Update LXC with drivers/testutils changes (#4951) 2018-12-04 08:57:54 -05:00
Michael Schurter 8fa5e90095 consul: add ScriptExecutor context wrapper
Since d335a82859ca2177bc6deda0c2c85b559daf2db3 ScriptExecutors now take
a timeout duration instead of a context. This broke the script check
removal code which used context cancelation propagation to remove
script checks while they were executing.

This commit adds a wrapper around ScriptExecutors that obeys context
cancelation again. The only downside is that it leaks a goroutine until
the underlying Exec call completes or times out.

Since check removal is relatively rare, check timeouts usually low, and
scripts usually fast, the risk of leaking a goroutine seems very small.
2018-12-03 20:26:31 -08:00
Mahmood Ali c88e3723eb Fix docker tests
Some tests have containers that die almost immediately, and may die
and be cleaned up before `driver.WaitUntilStarted` runs.

The cause of the container dying seems to be specific to each test:
* TestDockerDriver_Cleanup: `hello-world` image just emits a message and exits immediately
* TestDockerDriver_ForcePull_RepoDigest: the busybox image used in this test didn't support the `-p 0` argument
* TestDockerDriver_Entrypoint: with the entrypoint being `/bin/sh -c`, the command needs to be the entire string; otherwise, it ignores the comments
2018-12-03 23:08:52 -05:00
Michael Schurter 6459c19ffc consul: fix script checks exiting after 1 run
Fixes a regression caused in d335a82859ca2177bc6deda0c2c85b559daf2db3

The removal of the inner context made the remaining cancels cancel the
outer context and cause script checks to exit prematurely.
2018-12-03 18:50:02 -08:00
Mahmood Ali 2516cb16b9 Kill all container processes on shutdown
Currently, the libcontainer-based executor, upon shutdown, kills the
container's initial process.  The children of the killed process remain
running, and the executor is never marked as terminated until they exit.

Also, fix a case where we treat processes as successful when
`proc.Wait()` fails.  In some attempts, I was getting "waitid no child
processes" errors, and such an error shouldn't cause the process to be
considered successful.
2018-12-03 20:40:49 -05:00