Commit Graph

18112 Commits

Author SHA1 Message Date
Mahmood Ali faecf3a760
Add tests for running allocation client count (#7906)
* Add tests for running allocation client count

* remove redundant assertions
2020-05-11 12:28:31 -04:00
Tim Gross 6554e9ee37
csi: log fallthrough on invalid node IDs for client RPC (#7918)
When a CSI client RPC is given a specific node for a controller but
the lookup fails (because the node is gone or is an older version), we
fallthrough to select a node from all those available. This adds
logging to this case to aid in diagnostics.
2020-05-11 12:26:10 -04:00
Michael Lange a798c8c246
Merge pull request #7896 from hashicorp/f-ui/csi-namespace-issue
UI: CSI namespace bugs
2020-05-11 09:22:32 -07:00
Tim Gross 1ec41b6770
volumewatcher: stop watcher goroutines when there's no work (#7909)
The watcher goroutines will be automatically started if a volume has
updates, but when idle we shouldn't keep a goroutine running and
taking up memory.
2020-05-11 09:32:05 -04:00
Mahmood Ali 061a439f2c
Merge pull request #7912 from hashicorp/f-scheduler-algorithm-followup
Scheduler Algorithm Defaults handling and docs
2020-05-11 09:30:58 -04:00
Mahmood Ali 0384543d05
Merge pull request #7913 from hashicorp/deflake-TestTaskTemplateManager_BlockedEvents
Deflake TestTaskTemplateManager_BlockedEvents test
2020-05-11 09:30:44 -04:00
Mahmood Ali dff0fcf2f3
Merge pull request #7914 from hashicorp/b-csi-fix-slice-initialization
Fix slice initialization
2020-05-11 09:27:01 -04:00
Tim Gross 3aa761b151
Periodic GC for volume claims (#7881)
This changeset implements a periodic garbage collection of CSI volumes
with missing allocations. This can happen in a scenario where a node
update fails partially and the allocation updates are written to raft
but the evaluations to GC the volumes are dropped. This feature will
cover this edge case and ensure that upgrades from 0.11.0 and 0.11.1
get any stray claims cleaned up.
2020-05-11 08:20:50 -04:00
James Rasell d8ddb01a8e
Merge pull request #7916 from hashicorp/changelog-7903
changelog: add entry for GH-7902
2020-05-11 13:15:58 +02:00
James Rasell 8ca0b3081a
changelog: add entry for GH-7902 2020-05-11 09:26:43 +02:00
James Rasell aaf2fe033e
Merge pull request #7903 from hashicorp/b-gh-7902
api: validate scale count value is not negative.
2020-05-11 09:17:01 +02:00
Mahmood Ali 9fac6ea5d9 Fix slice initialization 2020-05-09 21:35:42 -04:00
Mahmood Ali 64de395df0 tests: ease debugging TestClientEndpoint_CreateNodeEvals
TestClientEndpoint_CreateNodeEvals flakes a bit but its output is very
confusing, as `structs.Evaluations` overrides GoString.

Here, we emit the entire struct of the evaluation, and hopefully we'll
figure out the problem the next time it happens
2020-05-09 16:04:32 -04:00
Mahmood Ali ff5c3e81b0 avoid logging after a test completes 2020-05-09 14:40:00 -04:00
Mahmood Ali 543f08c1ae Deflake TestTaskTemplateManager_BlockedEvents test
This change deflakes TestTaskTemplateManager_BlockedEvents test, because
it is expecting a number of events without accounting for transitional
state.

The test TestTaskTemplateManager_BlockedEvents attempts to ensure that a
template rendering emits blocked events for missing template ksys.

It works by setting a template that requires keys 0,1,2,3,4 and then
eventually sets keys 0,1,2,3 and ensures that we get a final event indicating
that keys 3 and 4 are still missing.

The test waits to get a blocked event for the final state, but it can
fail if receives a blocked event for a transitional state (e.g. one
reporting 2,3,4,5 are missing).

This fixes the test by ensuring that it waits until the final message
before assertion.

Also, it clarifies the intent of the test with stricter assertions and
additional comments.
2020-05-09 14:09:39 -04:00
Mahmood Ali 2c963885b0 handle upgrade path and defaults
Ensure that `""` Scheduler Algorithm gets explicitly set to binpack on
upgrades or on API handling when user misses the value.

The scheduler already treats `""` value as binpack.  This PR merely
ensures that the operator API returns the effective value.
2020-05-09 12:34:08 -04:00
Mahmood Ali cb39226332 Document scheduler algorithm 2020-05-09 12:12:38 -04:00
Michael Lange 5e6a89836b The active namespace is now shared between jobs and storage 2020-05-08 17:35:27 -07:00
Michael Lange b3fe5b4f75 Changing namespaces now situationally redirects to jobs or volumes
Changing namespaces can be done anywhere in the app even though many
Nomad resources aren't namespace-sensitive (e.g., clients, plugins).

A user changing namespaces is an intent to reset context, "now I want
to begin a task that relates to Namespace X". Where that task begins
used to always be the Jobs list, since it was the only namespace
sensitive resource. Now with CSI Volumes, "square 1" is Volumes if the
namespace is changed from a storage page.
2020-05-08 17:35:27 -07:00
Michael Lange 2d6cedc1b5
Merge pull request #7895 from hashicorp/f-ui/csi-search-and-filter
UI: CSI Searchable volumes and plugins lists
2020-05-08 17:34:24 -07:00
Tim Gross 8373e917fc
volumewatcher: set maximum batch size for raft update (#7907)
The `volumewatcher` has a 250ms batch window so claim updates will not
typically be large enough to risk exceeding the maximum raft message
size. But large jobs might have enough volume claims that this could
be a danger. Set a maximum batch size of 100 messages per
batch (roughly 33K), as a very conservative safety/robustness guard.

Co-authored-by: Chris Baker <1675087+cgbaker@users.noreply.github.com>
2020-05-08 16:53:57 -04:00
Seth Hoenig 0235b23709
Merge pull request #7891 from hashicorp/docs-improve-connect-examples
docs: improve some connect examples
2020-05-08 09:50:31 -06:00
Tim Gross 298f67dca3
volume_mount doc tweak (#7900) 2020-05-08 11:14:16 -04:00
Seth Hoenig b95253c289
Merge pull request #7892 from hashicorp/f-e2e-connect-use-expose-service-check
e2e: set an expose service check in connect e2e testcase
2020-05-08 09:00:35 -06:00
James Rasell 55a2ad3854
api: validate scale count value is not negative.
An operator could submit a scale request including a negative count
value. This negative value caused the Nomad server to panic. The
fix adds validation to the submitted count, returning an error to
the caller if it is negative.
2020-05-08 16:51:40 +02:00
Michael Lange ee2d1eeb39 Introduce shallow to plugin factories to speed up tests 2020-05-07 17:29:42 -07:00
Michael Lange 7610493445 Searchable plugins 2020-05-07 17:25:24 -07:00
Michael Lange be9d9176b1 Searchable volumes 2020-05-07 16:57:51 -07:00
Seth Hoenig aae8a8504e e2e: set an expose service check in connect e2e testcase
Make sure exposed checks work in e2e by setting an expose
check on the e2e connect test.
2020-05-07 14:40:03 -06:00
Seth Hoenig b28a26bfe3 docs: replace use of test/test with runnable examples
A few connect examples reference a fake 'test/test' image.
By replacing those with `hashicorpnomad/counter-api` we can
turn them into runnable examples.
2020-05-07 14:16:27 -06:00
Seth Hoenig b4c56293fe docs: fix misspelling of countdash example 2020-05-07 14:00:00 -06:00
Tim Gross 42f9d517d8
CSI volumewatcher testability improvments (#7889)
* volumewatcher: remove redundant log fields

The constructor for `volumeWatcher` already sets a `logger.With` that
includes the volume ID and namespace fields. Remove them from the
various trace logs.

* volumewatcher: advance state for controller already released

One way of bypassing client RPCs in testing is to set a claim status
to controller-detached, but this results in an incorrect claim state
when we checkpoint.
2020-05-07 15:57:24 -04:00
Seth Hoenig 20d7a76d2c docs: mention Connect ACLs guide in jobspec connect docs
Promote the Connect ACLs guide on the jobspec connect stanza docs
page. This was suggested in a ticket after someone got stuck not
realizing they needed to enable Consul Intentions for their connect
enabled services, which is covered in the guide.
2020-05-07 13:53:22 -06:00
Seth Hoenig 2de039d570 docs: update examples for connect jobspec docs
Replace the existing top example with something that is directly
runnable on a `-dev-connect` nomad setup.

Add the _complete_ `countdash` example at the bottom in the
examples section, so that people do not need to go guide-hunting
to find a complete example. The hope is people will see more
runnable examples and be less afraid of connect.
2020-05-07 13:45:13 -06:00
Michael Lange f1904cf22b
Merge pull request #7886 from hashicorp/f/csi-missing-prop-for-plugin-list
Add ControllersExpected to the PluginListStub
2020-05-07 10:39:58 -07:00
Seth Hoenig b82e75c347 docs: improve connect expose path examples
This change replaces the top example for expose path configuration with
two new runnable examples. Users should be able to copy and paste those
jobs into a job file and run them against a basic connect enabled nomad
setup.

The example presented first demonstrates use of the service check expose
parameter with no dynamic port explicitly defined (new to 0.11.2).
This is expected to be the "90%" use case of users, and so we should
try to emphasise this pattern as best practice.

The example presented second demonstrates achieving the same goal as the
first exmaple, but utilizing the full plumbing available through the
`connect.proxy.expose` stanza. This should help readers comprehend what
is happening "under the hood".
2020-05-07 11:03:37 -06:00
Michael Lange 1a9631dbfa Add ControllersExpected to the PluginListStub 2020-05-07 10:01:52 -07:00
Drew Bailey 479905a880
Merge pull request #7885 from hashicorp/update-license-output
update license output
2020-05-07 12:50:00 -04:00
Drew Bailey fde40046a1
update license output 2020-05-07 12:14:15 -04:00
Seth Hoenig a1c198b3e0
Merge pull request #7781 from hashicorp/demo-countdash-docker
demo: create build scripts for our countdash demo
2020-05-07 08:34:20 -06:00
Michael Lange 0e31854213
Merge pull request #7872 from hashicorp/f-ui/csi-phase-2
UI: CSI Plugins Pages
2020-05-06 18:02:41 -07:00
Michael Lange 11a42b6075 Acceptance test for PluginDetail 2020-05-06 17:36:45 -07:00
Michael Lange f3bae3b535 Don't wrap between icons and health text 2020-05-06 17:22:42 -07:00
Michael Lange 38c03f180b Use lazyClick to avoid multiple transitionToRoutes being in flight as once 2020-05-06 17:22:42 -07:00
Michael Lange 2369254a3e Refactor AllocationRow qualifyAllocation
There was a missing edge case where a job is pending. I took the moment
to also refactor the code to use async/await which cleaned up the
promise chaining.
2020-05-06 17:22:42 -07:00
Michael Lange f9882d42e6 Plugin detail page object 2020-05-06 17:21:49 -07:00
Michael Lange 517db723cd Sort allocations on the plugin detail page 2020-05-06 17:21:49 -07:00
Michael Lange 41c4746aaf Test coverage for the plugins list page 2020-05-06 17:21:48 -07:00
Michael Lange e98095bcbd Page object for Plugins List 2020-05-06 17:21:48 -07:00
Michael Lange 59bb9a516e Correct the table headers for dates on the volume page 2020-05-06 17:21:48 -07:00