Commit graph

22940 commits

Author SHA1 Message Date
Tim Gross 99d03cdc6c
CSI: sort capabilities in plugin status (#12154)
Also fix `LIST_SNAPSHOTS` capability name
2022-03-01 07:59:31 -05:00
Tim Gross ca06f6153a
docs: clarify that plugin commands are for CSI only (#12151) 2022-03-01 07:57:41 -05:00
Tim Gross 02ae95ab22
csi: respect -verbose flag for allocs in volume status (#12153) 2022-03-01 07:57:29 -05:00
Kevin Wang 166011237b
fix(website): hide version select on /plugins & /tools (#12145)
* fix(website/plugins): display version select

* fix: hide version select on `/tools` + `/plugins`
2022-02-28 12:44:08 -05:00
Tim Gross 77fac26d5e
CI: increase test run timeout (#12143) 2022-02-28 11:30:59 -05:00
Seth Hoenig 5cf57e429a
Merge pull request #12137 from hashicorp/rpc-advertise-docs
docs: clairfy advertise.rpc effect
2022-02-28 08:15:28 -06:00
James Rasell 8a23afdb56
events: add state objects and logic for service registrations. 2022-02-28 10:44:58 +01:00
James Rasell 20249bb761
state: add service registration restore functionality. 2022-02-28 10:15:27 +01:00
James Rasell 74b367553e
state: add service registration state interaction functions. 2022-02-28 10:15:03 +01:00
James Rasell cfdb5a3c66
structs: add service registration struct and basic composed funcs. 2022-02-28 10:14:40 +01:00
James Rasell 1da859c60e
mock: add service registration mock generation for test use. 2022-02-28 10:14:25 +01:00
James Rasell cf0b63d561
state: add the table schema for the service_registrations table. 2022-02-28 10:14:10 +01:00
Jorge Marey a466f01120 Add metadata to namespaces 2022-02-27 09:09:10 +01:00
Michael Schurter cbf6ba843d
cli: fix op api typos
Co-authored-by: Seth Hoenig <seth.a.hoenig@gmail.com>
2022-02-25 16:31:56 -08:00
Michael Schurter 4550c5fb80 cli: only return 1 on errors from op api
We don't want people to expect stable error codes for errors, and I
don't think these were useful for scripts anyway.
2022-02-25 16:23:31 -08:00
Michael Schurter aeff156177 docs: fix nav for op api 2022-02-25 16:21:14 -08:00
Seth Hoenig 5269b2e02f docs: clairfy advertise.rpc effect
The advertise.rpc config option is not intuitive. At first glance you'd
assume it works like advertise.http or advertise.serf, but it does not.

The current behavior is working as intended, but the documentation is
very hard to parse and doesn't draw a clear picture of what the setting
actually does.

Closes https://github.com/hashicorp/nomad/issues/11075
2022-02-25 16:02:29 -06:00
Jai 817e66f930
Merge pull request #12134 from hashicorp/b-ui/target-link
ui:  external links open in new tabs
2022-02-25 10:29:04 -05:00
Seth Hoenig 34d46cd4c4
Merge pull request #12130 from hashicorp/flakey-serf-non-voter
tests: deflake test that joins a server with non-voting servers to form quorum
2022-02-25 09:12:42 -06:00
Jai Bhagat 8958d48ca9 ui: external links open in new tabs 2022-02-25 09:24:37 -05:00
Michael Schurter f6342d1d45 docs: add changelog for #10808 2022-02-24 17:13:42 -08:00
Michael Schurter a42d832f98 cli: add tests and minor fixes for op api
Trimmed spaces around header values.

Fixed method getting forced to GET.
2022-02-24 17:06:07 -08:00
Michael Schurter 238a732098 cli: add filter support 2022-02-24 15:52:54 -08:00
Michael Schurter bb3daac628 rename nomad curl to nomad operator api 2022-02-24 15:52:54 -08:00
Michael Schurter 141db0c562 cli: add curl command
Just a hackweek project at this point.
2022-02-24 15:52:54 -08:00
Seth Hoenig 1274aa690f tests: deflake test that joins a server with non-voting servers to form qourum
This PR
 - upgrades the serf library
 - has the test start the join process using the un-joined server first
 - disables schedulers on the servers
 - uses the WaitForLeader and wantPeers helpers

Not sure which, if any of these actually improves the flakiness of this test.
2022-02-24 17:02:58 -06:00
Zachary Shilton 81521ca248
chore: bump docs-page for code-block fix (#12117)
* chore: bump to latest docs-page

* fix: bump to react-consent-manager patch

* chore: bump to consent-manager with events dep

* chore: bump to stable consent-manager release
2022-02-24 15:34:54 -05:00
Tim Gross 31ee2a3c67
CSI: ensure all fields are mapped from structs to api response (#12124)
In PR #12108 we added missing fields to the plugin response, but we
didn't include the manual serialization steps that we need until
issue #10470 is resolved.
2022-02-24 14:17:15 -05:00
Tim Gross 13ea2c7fb3
CSI: display plugin capabilities in verbose status (#12116)
The behaviors of CSI plugins are governed by their capabilities as
defined by the CSI specification. When debugging plugin issues, it's
useful to know which behaviors are expected so they can be matched
against RPC calls made to the plugin allocations.

Expose the plugin capabilities as named in the CSI spec in the `nomad
plugin status -verbose` output.
2022-02-24 13:51:38 -05:00
Luiz Aoqui 61d79e75b0
docs: add docs for the autoscaler on_error and on_check_error configuration (#12083) 2022-02-24 12:12:29 -05:00
James Rasell bc6056cbbe
Merge pull request #12122 from hashicorp/b-api-remove-namespace-test-ent-tag
api: remove ent build tag on namespace test file.
2022-02-24 17:13:15 +01:00
James Rasell 8f175d44da
api: remove ent build tag on namespace test file. 2022-02-24 16:40:04 +01:00
Tim Gross 22cf24a6bd
CSI: retry claims from client when max claims are reached (#12113)
When the alloc runner claims a volume, an allocation for a previous
version of the job may still have the volume claimed because it's
still shutting down. In this case we'll receive an error from the
server. Retry this error until we succeed or until a very long timeout
expires, to give operators a chance to recover broken plugins.

Make the alloc runner hook tolerant of temporary RPC failures.
2022-02-24 10:39:07 -05:00
Tim Gross cfe3117af8
CSI: enforce usage at claim time (#12112)
* Remove redundant schedulable check in `FreeWriteClaims`. If a volume
  has been created but not yet claimed, its capabilities will be checked
  in `WriteSchedulable` at both scheduling time and claim time. We don't
  need to also check them in the `FreeWriteClaims` method.

* Enforce maximum volume claims for writers.

  When the scheduler checks feasibility for CSI volumes, the check is
  fairly loose: earlier versions of the same job are not counted as
  active claims. This allows the scheduler to place new allocations
  for the new version of a job, under the assumption that we'll replace
  the existing allocations and their volume claims.

  But when the alloc runner claims the volume, we need to enforce the
  active claims even if they're for allocations of an earlier version of
  the job. Otherwise we'll try to mount a volume that's currently being
  unmounted, and this will cause replacement allocations to frequently
  fail.

* Enforce single-node reader check for read-only volumes. When the
  alloc runner makes a claim for a read-only volume, we only check that
  the volume is potentially schedulable and not that it actually has
  free read claims.
2022-02-24 09:37:37 -05:00
Sander Mol 42b338308f
add go-sockaddr templating support to nomad consul address (#12084) 2022-02-24 09:34:54 -05:00
Florian Apolloner 3bced8f558
namespaces: allow enabling/disabling allowed drivers per namespace 2022-02-24 09:27:32 -05:00
Seth Hoenig 57b9c64b8f
Merge pull request #12107 from hashicorp/use-bbolt
core: swap bolt impl and enable configuring raft freelist sync behavior
2022-02-24 08:25:54 -06:00
Seth Hoenig 8e6d97744b docs: emphasize snapshot before upgrading 2022-02-24 08:22:41 -06:00
Tim Gross 5b7b9fdafb
csi: tolerate missing plugins on job delete (#12114)
If a plugin job fails before successfully fingerprinting the plugins,
the plugin will not exist when we try to delete the job. Tolerate
missing plugins.
2022-02-24 08:53:15 -05:00
Seth Hoenig a0350b0608 command: switch from raft-boltdb to raft-boltdb/v2 2022-02-23 14:43:59 -06:00
Seth Hoenig da9b978806 client: resolve rebase conflict 2022-02-23 14:32:32 -06:00
Seth Hoenig 0420724c14 build: disallow old boltdb during build 2022-02-23 14:28:31 -06:00
Seth Hoenig ca84ba12ac agent: switch to go.etc.io/bbolt for state store
This PR modifies the server and client agents to use `go.etc.io/bbolt` as the
implementation for their state stores.
2022-02-23 14:28:31 -06:00
Seth Hoenig de95998faa core: switch to go.etc.io/bbolt
This PR swaps the underlying BoltDB implementation from boltdb/bolt
to go.etc.io/bbolt.

In addition, the Server has a new configuration option for disabling
NoFreelistSync on the underlying database.

Freelist option: https://github.com/etcd-io/bbolt/blob/master/db.go#L81
Consul equivelent PR: https://github.com/hashicorp/consul/pull/11720
2022-02-23 14:26:41 -06:00
Tim Gross 246db87a74
CSI: allow for concurrent plugin allocations (#12078)
The dynamic plugin registry assumes that plugins are singletons, which
matches the behavior of other Nomad plugins. But because dynamic
plugins like CSI are implemented by allocations, we need to handle the
possibility of multiple allocations for a given plugin type + ID, as
well as behaviors around interleaved allocation starts and stops.

Update the data structure for the dynamic registry so that more recent
allocations take over as the instance manager singleton, but we still
preserve the previous running allocations so that restores work
without racing.

Multiple allocations can run on a client for the same plugin, even if
only during updates. Provide each plugin task a unique path for the
control socket so that the tasks don't interfere with each other.
2022-02-23 15:23:07 -05:00
Tim Gross e5a52b0b6f
CSI: add missing plugin capabilities to api response (#12108)
Detection of the full set of plugin capabilities was added in Nomad
1.1 for the volume creation workflow, but these were not added to the
API response for plugins.
2022-02-23 15:22:29 -05:00
Tim Gross 17dc0adee3
csi: fix broken test (#12110) 2022-02-23 13:48:39 -05:00
Charlie Voiselle 01f6e57602
Fixed scheduler config examples (#12049) 2022-02-23 12:58:29 -05:00
Tim Gross 57a546489f
CSI: minor refactoring (#12105)
* rename method checking that free write claims are available
* use package-level variables for claim errors
* semgrep fix for testify
2022-02-23 11:13:51 -05:00
Tim Gross de134d9783
csi: fix mocked modes in volumewatcher test (#12104)
The volumewatcher test incorrectly represents the change in attachment
and access modes introduced in Nomad 1.1.0 to support volume
creation. This leads to a test that happens to pass but only
accidentally.

Update the test to correctly represent the volume modes set by the
existing claims on the test volumes.
2022-02-23 09:51:20 -05:00