18560 commits

Author SHA1 Message Date
Mahmood Ali 2108681c1d Endpoint for snapshotting server state 2020-05-21 20:04:38 -04:00
Mahmood Ali 5585cb347e Add snapshot helper
Effectively copied from https://github.com/hashicorp/consul/tree/v1.8.0-beta1/snapshot

With addition of overall snapshot checksum file
2020-05-21 20:04:38 -04:00
Mahmood Ali fbe140b26c vault: ensure ttl expired tokens are purged
If a token scheduled for revocation expires before we revoke it,
ensure that it is marked as purged in raft and is only removed from
local vault state if the purge operation succeeds.

Prior to this change, we may remove the accessor from local state but
not purge it from Raft.  This causes unnecessary churn in the next
leadership elections (and, until 0.11.2, resulted in indefinite retries).
2020-05-21 19:54:50 -04:00
Tim Gross ba11aef5d9
csi: skip unit tests on unsupported platforms (#8033)
Some of the unit tests for CSI require platform-specific APIs that aren't
available on macOS. We can safely skip these tests.
2020-05-21 13:56:50 -04:00
Tim Gross ec3531cdac
docs: add missing fields to driver Capabilities (#8037)
Adds `NetIsolationModes`, `MustInitiateNetwork`, and `MountConfigs` fields to
the driver `Capabilities` internals documentation.
2020-05-21 13:56:40 -04:00
Michael Lange 262c558e1c Replace nulls with empty strings to have an empty response body 2020-05-21 10:35:41 -07:00
Michael Lange 6274282948 Fix latent race condition in client detail test
Adding this `settled` call makes this test pass now that Ember Data is using
fetch instead of jquery. The test was presumably always incorrect but
never flaked.
2020-05-21 10:35:40 -07:00
Michael Lange 1f82e19e2d Type-check the ForbiddenError rather than ducktype it
Going off of the error message being "Forbidden" was brittle to begin
with and no longer works with Fetch due to the error message coming from
jquery underpinnings that were unobserved by Ember Data's attempted
recreation.
2020-05-21 10:35:40 -07:00
Michael Lange 388bb138c2 Always lower case headers now 2020-05-21 10:35:40 -07:00
Michael Lange e1f3086ba3 Don't double query params for findQuery 2020-05-21 10:35:40 -07:00
Michael Lange 07a604033f Adjust header case, which changed with the migration to fetch 2020-05-21 10:35:40 -07:00
Michael Lange de73b9539d Replace XHRToken with AbortController 2020-05-21 10:35:39 -07:00
Michael Lange 20f209c4fb Use the AbortController provided by the fetch polyfill 2020-05-21 10:35:39 -07:00
Michael Lange 899d8266f4 Instrument jquery ajax method to see if it is getting called anywhere 2020-05-21 10:35:39 -07:00
Michael Lange d5e8c1ca8c Force the use of fetch despite jquery still being integrated 2020-05-21 10:35:39 -07:00
Buck Doyle b3164e50f7
Add test-timing comparison for UI PRs (#7929)
This runs ember-test-audit for UI PRs to compare how long
the tests take for the base vs the PR. It posts or updates
a comparison of times and test count.

It’s somewhat slow to report back as it runs the test suite
three times to even out variability in a shared environment.
If we end up being unhappy with that slowness, we could try
running the repetitions in parallel as well, but that would
involve more changes to ember-test-audit.
2020-05-21 11:41:13 -05:00
Michael Lange 2edd482a7c
Merge pull request #8024 from hashicorp/f-ui/csi-plugin-allocations
UI: CSI Plugin Allocations Page
2020-05-21 09:16:45 -07:00
Tim Gross aa8927abb4
volumes: return better error messages for unsupported task drivers (#8030)
When an allocation runs for a task driver that can't support volume mounts,
the mounting will fail in a way that can be hard to understand. With host
volumes this usually means failing silently, whereas with CSI the operator
gets inscrutable internals exposed in the `nomad alloc status`.

This changeset adds a MountConfig field to the task driver Capabilities
response. We validate this when the `csi_hook` or `volume_hook` fires and
return a user-friendly error.

Note that we don't currently have a way to get driver capabilities up to the
server, except through attributes. Validating this when the user initially
submits the jobspec would be even better than what we're doing here (and could
be useful for all our other capabilities), but that's out of scope for this
changeset.

Also note that the MountConfig enum starts with "supports all" in order to
support community plugins in a backwards compatible way, rather than cutting
them off from volume mounting unexpectedly.
2020-05-21 09:18:02 -04:00
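The backwards-compatibility point in this commit message rests on the enum's zero value. A rough Go sketch of the idea follows; the type, constant, and function names are assumptions made for illustration and may not match the names used in Nomad's driver plugin package.

```go
package main

import "fmt"

// MountConfigSupport is a hypothetical stand-in for the enum described above.
type MountConfigSupport int

const (
	// MountConfigSupportAll is the zero value, so a plugin that never sets
	// the field is treated as supporting volume mounts.
	MountConfigSupportAll MountConfigSupport = iota
	// MountConfigSupportNone marks a driver that cannot mount volumes.
	MountConfigSupportNone
)

// Capabilities is a hypothetical, trimmed-down capabilities response.
type Capabilities struct {
	MountConfigs MountConfigSupport
}

// validateVolumeMount returns a user-friendly error when the driver has
// declared that it cannot mount volumes.
func validateVolumeMount(driver string, caps Capabilities) error {
	if caps.MountConfigs == MountConfigSupportNone {
		return fmt.Errorf("task driver %q does not support volume mounts", driver)
	}
	return nil
}

func main() {
	// An older community plugin leaves the field unset and keeps working.
	fmt.Println(validateVolumeMount("community-driver", Capabilities{}))

	// A driver that opts out gets a clear error instead of a silent failure.
	fmt.Println(validateVolumeMount("no-mounts-driver", Capabilities{MountConfigs: MountConfigSupportNone}))
}
```

Because the permissive value is the zero value, existing plugins need no code change to retain their current behavior; only drivers that explicitly opt out trigger the error.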
Mahmood Ali aa8e79e55b Reorder leadership handling
Start serving RPC immediately after leader components are enabled, and
move clean up to the bottom as they don't block leadership
responsibilities.
2020-05-21 08:30:31 -04:00
Mahmood Ali 1cf1114627 apply the same change to consul revocation 2020-05-21 08:30:31 -04:00
Mahmood Ali 1399d02f45 rate limit revokeDaemon 2020-05-21 08:30:31 -04:00
Mahmood Ali 6e749d12a0 on leadership establishment, revoke Vault tokens in background
Establishing leadership should be very fast and never make external API
calls.

This fixes a situation where there is a long backlog of Vault tokens to
be revoked when leadership is gained.  In such a case, revoking the
tokens will significantly slow down leadership establishment and slow
down processing.  Worse, the revocation call does not honor leadership
`stopCh` signals, so it will not stop when the leader loses leadership.
2020-05-21 07:38:27 -04:00
Michael Lange e263d2821e Prevent sticky query params for plugin alloc filtering 2020-05-20 14:34:16 -07:00
Michael Lange 4588f2b5a3 Show the number of allocations in the goto links on plugin detail 2020-05-20 14:22:47 -07:00
Tim Gross 72430a4e62
csi: don't pass volume claim releases thru GC eval (#8021)
Following the new volumewatcher in #7794 and performance improvements
to it that landed afterwards, there's no particular reason we should
be threading claim releases through the GC eval rather than writing an
empty `CSIVolumeClaimRequest` with the mode set to
`CSIVolumeClaimRelease`, just as the GC evaluation would do.

Also, by batching up these raft messages, we can reduce the number of
raft writes by 1 and cross-server RPCs by 1 per volume we release
claims on.
2020-05-20 15:22:51 -04:00
Tim Gross 065fa7af8b
stats_hook: log normal shutdown condition as debug, not error (#8028)
The `stats_hook` writes an Error log every time an allocation becomes
terminal. This is a normal condition, not an error. A real error
condition like a failure to collect the stats is logged later. The spurious
Error log just creates noise, which is a particularly bad operator experience
for heavy batch workloads.
2020-05-20 10:28:30 -04:00
Tim Gross 4b4bf84d79
docs: notice to operators about script check timeouts (#8015)
The tasklet passes the timeout for the script check into the task
driver's `Exec`, and it's up to the task driver to enforce that via a
golang `context.WithDeadline`. In practice, this deadline is started
before the task driver starts setting up the execution
environment (because we need it to do things like timeout Docker API
calls).

Under even moderate load, the time it takes to set up the execution
context for the script check regularly exceeds a full second or
two. This can cause script checks to unexpectedly time out or even never
execute if the context expires before the task driver ever gets a
chance to `execve`.

This changeset adds a notice to operators about setting script check
timeouts with plenty of padding and what to monitor for problems.
2020-05-20 10:28:14 -04:00
Tim Gross 3902709c0a
csi: check for empty arguments on CSI endpoint (#8027)
Some of the CSI RPC endpoints were missing validation that the ID or
the Volume definition was present. This could result in nonsense
`CSIVolume` structs being written to raft during registration. This
changeset corrects that bug and adds validation checks to present
nicer error messages to operators in some other cases.
2020-05-20 10:22:24 -04:00
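The kind of up-front validation this commit message describes could be sketched as below. The struct shapes are simplified and hypothetical (the real `CSIVolume` has many more fields), and `validateRegister` is an illustrative name rather than the actual endpoint code.

```go
package main

import (
	"errors"
	"fmt"
)

// CSIVolume is a simplified, hypothetical version of the struct named above.
type CSIVolume struct {
	ID       string
	PluginID string
}

// CSIVolumeRegisterRequest is a hypothetical request wrapper.
type CSIVolumeRegisterRequest struct {
	Volumes []*CSIVolume
}

// validateRegister rejects requests with a missing volume definition or ID
// before anything would be written to raft.
func validateRegister(req *CSIVolumeRegisterRequest) error {
	if req == nil || len(req.Volumes) == 0 {
		return errors.New("missing volume definition")
	}
	for i, vol := range req.Volumes {
		if vol == nil {
			return fmt.Errorf("volume %d: missing volume definition", i)
		}
		if vol.ID == "" {
			return fmt.Errorf("volume %d: missing volume ID", i)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateRegister(&CSIVolumeRegisterRequest{}))
	fmt.Println(validateRegister(&CSIVolumeRegisterRequest{Volumes: []*CSIVolume{{PluginID: "ebs"}}}))
	fmt.Println(validateRegister(&CSIVolumeRegisterRequest{Volumes: []*CSIVolume{{ID: "vol-1", PluginID: "ebs"}}}))
}
```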
James Rasell 6db1bc69d0
Merge pull request #8013 from hashicorp/gh-8008
api: return custom error if API attempts to decode empty body.
2020-05-20 09:55:05 +02:00
James Rasell 779d58397a
Merge pull request #8007 from hashicorp/volumes-docs-csi-put-blocking
docs: update CSI PUT operation blocking query support yes->no.
2020-05-20 09:38:38 +02:00
Michael Lange 360a0dfc08 Test coverage for the plugin allocations page 2020-05-19 19:22:12 -07:00
Michael Lange e57d33a6b1 Test coverage for the limited alloc tables and goto links 2020-05-19 18:01:42 -07:00
Michael Lange f683529b29 Limit the length of the alloc tables on the plugin overview page 2020-05-19 17:30:00 -07:00
Michael Lange 993cbe6bdb Add a title to the plugin/allocations page
Typically these filterable list views don't have titles beyond the
breadcrumbs, but since this page has no search bar, the title really
helps balance it out.
2020-05-19 17:29:05 -07:00
Michael Lange 98933e12a0 Expose existing qp serialization util as a helper 2020-05-19 17:28:23 -07:00
Mahmood Ali 1bd4496feb
Merge pull request #8020 from hashicorp/vendor-hcl2-20200519
Update hcl2 vendoring
2020-05-19 16:25:14 -04:00
Mahmood Ali 751f337f1c Update hcl2 vendoring
The hcl2 library has moved from http://github.com/hashicorp/hcl2 to https://github.com/hashicorp/hcl/tree/hcl2.

This updates Nomad's vendoring to start using the hcl2 library.  Also
updates some related libraries (e.g. `github.com/zclconf/go-cty/cty` and
`github.com/apparentlymart/go-textseg`).
2020-05-19 15:00:03 -04:00
Charlie Voiselle bde545b4db
Merge pull request #7994 from shantanugadgil/patch-3
update docs to mention spread algorithm
2020-05-19 14:49:19 -04:00
Shantanu Gadgil d5fc981636
Merge branch 'master' into patch-3 2020-05-20 00:16:43 +05:30
Charlie Voiselle 31c71767ec
Merge pull request #7870 from hashicorp/d-fix-version-typo
[docs] Fix version typo
2020-05-19 14:09:53 -04:00
Charlie Voiselle e63346560c
Interlinking API, FAQ, and Config
Creating a FAQ question to provide a home for additional context around
bootstrapping. Linking from API page to `default_server_config`
attribute. Added sample API response to discuss "Updated: false"
2020-05-19 13:51:47 -04:00
Seth Hoenig 7ab61709f3 docs: update changelog with sidecar task resources fix 2020-05-19 11:49:59 -06:00
Seth Hoenig f136afc04f api: canonicalize connect components
Add `Canonicalize` methods to the connect components of a service
definition in the `api` package. Without these, we have been relying
on good input for the connect stanza.

Fixes #7993
2020-05-19 11:47:22 -06:00
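A `Canonicalize` method in this style is usually nil-safe and normalizes optional fields so that callers need not rely on well-formed input. The sketch below shows the general pattern; the struct shapes are simplified assumptions and do not reproduce the actual `api` package types.

```go
package main

import "fmt"

// ConsulSidecarService is a simplified, hypothetical sidecar definition.
type ConsulSidecarService struct {
	Tags []string
	Port string
}

// Canonicalize normalizes empty collections to nil and tolerates a nil receiver.
func (s *ConsulSidecarService) Canonicalize() {
	if s == nil {
		return
	}
	if len(s.Tags) == 0 {
		s.Tags = nil
	}
}

// ConsulConnect is a simplified, hypothetical connect stanza.
type ConsulConnect struct {
	Native         bool
	SidecarService *ConsulSidecarService
}

// Canonicalize walks the nested components; each one handles its own nil case.
func (c *ConsulConnect) Canonicalize() {
	if c == nil {
		return
	}
	c.SidecarService.Canonicalize()
}

func main() {
	// A missing stanza is fine: the nil receiver is handled.
	var missing *ConsulConnect
	missing.Canonicalize()

	// An empty (non-nil) tag list is normalized to nil.
	connect := &ConsulConnect{SidecarService: &ConsulSidecarService{Tags: []string{}}}
	connect.Canonicalize()
	fmt.Println(connect.SidecarService.Tags == nil)
}
```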
Michael Lange af6846e10e
Merge pull request #7986 from hashicorp/missing-changelog-entry
Add missing bug fix from the changelog for 0.11.2
2020-05-19 09:20:23 -07:00
Michael Lange 7199570d2f
Merge pull request #8006 from hashicorp/f-ui/csi-node-only-support
UI: CSI node only support
2020-05-19 09:08:30 -07:00
Seth Hoenig a6436b5ac6
Merge pull request #8003 from hashicorp/c-use-taskkind
consul/connect: use task kind to get service name
2020-05-19 08:51:21 -06:00
Mahmood Ali 42d5a4136d docs: update snippet to include summary namespace 2020-05-19 10:06:53 -04:00
Mahmood Ali 63b87a9a89 update api/ JobListStub 2020-05-19 09:58:19 -04:00
Mahmood Ali 406fce90c3 list all jobs on namespaces the token can access 2020-05-19 09:51:41 -04:00
James Rasell ae0fb98c6b
api: return custom error if API attempts to decode empty body. 2020-05-19 15:46:31 +02:00