19925 Commits

Author SHA1 Message Date
Tim Gross b5abf4ec9d csi: fix incorrect comment on csi_hook context lifetime 2020-10-09 11:03:51 -04:00
Tim Gross ecec432653
csi: allow for volume detach to work with gc'd nodes (#9057)
When we try to prefix match the `nomad volume detach` node ID argument, the
node may already have been GC'd. The volume unpublish workflow gracefully
handles this case so that we can free the claim. So make a best effort to find
a node ID among the volume's claimed allocations, or otherwise just use the
node ID we've been given by the user as-is.
2020-10-09 09:45:03 -04:00
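The best-effort lookup described in the commit above can be sketched roughly as follows. This is an illustration only, not Nomad's implementation: `resolveNodeID` and its parameters are hypothetical names, and the sketch ignores ambiguous prefixes by returning the first match.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveNodeID is a hypothetical helper, not Nomad's actual code.
// It returns the first live node ID with the given prefix, then the first
// node ID among the volume's claims with that prefix, and finally falls
// back to the input exactly as the user supplied it.
func resolveNodeID(input string, liveNodes, claimNodes []string) string {
	for _, candidates := range [][]string{liveNodes, claimNodes} {
		for _, id := range candidates {
			if strings.HasPrefix(id, input) {
				return id
			}
		}
	}
	return input
}

func main() {
	live := []string{"aaaa1111"}
	claims := []string{"bbbb2222"} // node already GC'd, but a claim still names it

	fmt.Println(resolveNodeID("bbbb", live, claims)) // found via the claims
	fmt.Println(resolveNodeID("cccc", live, claims)) // unknown: used as-is
}
```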
Tim Gross 67817068e0
docs: show distinct_hosts constraint for CSI plugins (#9052)
CSI plugins with the same plugin ID and type (controller, node, monolith) will
collide on a host, both in the communication socket and in the dynamic plugin
registry. Until this can be fixed, leave notice to operators in the
documentation.
2020-10-08 16:55:55 -04:00
Seth Hoenig ccc52c9896 docs: fix linter typos in docs 2020-10-08 13:01:33 -05:00
Seth Hoenig 9b555fe6d5 env_aws: fixup test case node attr detection 2020-10-08 12:59:07 -05:00
Ryan Oaks 09048d834f
Merge pull request #9048 from hashicorp/ro.docs-html-redirect-catchall
docs: Update redirects to use a broader catch-all for routes ending in .html
2020-10-08 13:56:18 -04:00
Seth Hoenig e693d15a5b env_aws: get ec2 cpu perf data from AWS API
Previously, Nomad was using a hand-made lookup table for looking
up EC2 CPU performance characteristics (core count + speed = ticks).

This data was incomplete and incorrect depending on region. The AWS
API has the correct data but requires API keys to use (i.e. should not
be queried directly from Nomad).

This change introduces a lookup table generated by a small command line
tool in Nomad's tools module which uses the Amazon AWS API.

Running the tool requires the AWS_* environment variables to be set.
  $ # in nomad/tools/cpuinfo
  $ go run .

Going forward, Nomad can incorporate regeneration of the lookup table
somewhere in the CI pipeline so that we remain up-to-date on the latest
offerings from EC2.

Fixes #7830
2020-10-08 12:01:09 -05:00
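A generated lookup table of the kind the commit above describes might look roughly like this sketch. The instance type names and numbers are made up for illustration, `cpuSpec` and `ticks` are hypothetical names, and reading "core count + speed = ticks" as cores multiplied by MHz is an assumption.

```go
package main

import "fmt"

// cpuSpec holds the two values the commit message mentions.
type cpuSpec struct {
	Cores int
	MHz   int
}

// table uses invented instance type names and numbers; a real table
// would be generated from the AWS API by the tool in the commit.
var table = map[string]cpuSpec{
	"example.small": {Cores: 2, MHz: 2500},
	"example.large": {Cores: 8, MHz: 3000},
}

// ticks returns cores * MHz (an assumed formula), and false when the
// instance type is not present in the table.
func ticks(instanceType string) (int, bool) {
	spec, ok := table[instanceType]
	if !ok {
		return 0, false
	}
	return spec.Cores * spec.MHz, true
}

func main() {
	if t, ok := ticks("example.small"); ok {
		fmt.Println("ticks:", t)
	}
	if _, ok := ticks("example.unknown"); !ok {
		fmt.Println("no entry; a caller would need a fallback")
	}
}
```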
Tim Gross 29a5454894
csi: loosen ValidateVolumeCapability requirements (#9049)
The CSI specification for `ValidateVolumeCapability` says that we shall
"reconcile successful capability-validation responses by comparing the
validated capabilities with those that it had originally requested" but leaves
the details of that reconciliation unspecified. This API is not implemented in
Kubernetes, so controller plugins don't have a real-world implementation to
verify their behavior against.

We have found that CSI plugins in the wild may return "successful" but
incomplete `VolumeCapability` responses, so we can't require that all
capabilities we expect have been validated, only that the ones that have been
validated match. This appears to violate the CSI specification but until
that's been resolved in upstream we have to loosen our validation
requirements. The tradeoff is that we're more likely to have runtime errors
during `NodeStageVolume` instead of at the time of volume registration.
2020-10-08 12:53:24 -04:00
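The loosened check in the commit above could be sketched as below: a capability the plugin did not report is tolerated, while a reported capability that was never requested is an error. The `capability` type and `checkValidated` function are hypothetical simplifications, not Nomad's or the CSI library's types.

```go
package main

import "fmt"

// capability is a simplified, hypothetical stand-in for a CSI
// VolumeCapability.
type capability struct {
	AccessMode     string
	AttachmentMode string
}

// checkValidated accepts incomplete responses: it only requires that
// every capability the plugin reports as validated was requested.
func checkValidated(requested, validated []capability) error {
	want := make(map[capability]bool, len(requested))
	for _, c := range requested {
		want[c] = true
	}
	for _, c := range validated {
		if !want[c] {
			return fmt.Errorf("plugin validated unrequested capability %+v", c)
		}
	}
	return nil
}

func main() {
	requested := []capability{{"MULTI_NODE_MULTI_WRITER", "file-system"}}

	// An empty (incomplete) response passes under the loosened rule.
	fmt.Println(checkValidated(requested, nil))

	// A mismatched response is still rejected.
	fmt.Println(checkValidated(requested, []capability{{"SINGLE_NODE_WRITER", "file-system"}}))
}
```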
Ryan Oaks c03ec5f77b Update redirects to use a broader catch-all for routes ending in .html 2020-10-08 10:48:58 -04:00
Tim Gross e8c13a2307
csi: validate mount options during volume registration (#9044)
Volumes using attachment mode `file-system` use the CSI filesystem API when
they're mounted, and can be passed mount options. But `block-device` mode
volumes don't have this option. When RPCs are made to plugins, we are silently
dropping the mount options we don't expect to see, but this results in a poor
operator experience when the mount options aren't honored. This changeset
makes passing mount options to a `block-device` volume a validation error.
2020-10-08 09:23:21 -04:00
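A minimal sketch of the validation rule from the commit above, assuming a simplified request type. `volumeRequest`, its fields, and `validateMountOptions` are hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// volumeRequest is a hypothetical, trimmed-down volume registration.
type volumeRequest struct {
	AttachmentMode string
	MountFlags     []string
}

// validateMountOptions rejects mount options on block-device volumes
// instead of silently dropping them.
func validateMountOptions(v volumeRequest) error {
	if v.AttachmentMode == "block-device" && len(v.MountFlags) > 0 {
		return errors.New("mount options are not allowed for block-device volumes")
	}
	return nil
}

func main() {
	fs := volumeRequest{AttachmentMode: "file-system", MountFlags: []string{"noatime"}}
	blk := volumeRequest{AttachmentMode: "block-device", MountFlags: []string{"noatime"}}

	fmt.Println(validateMountOptions(fs))
	fmt.Println(validateMountOptions(blk))
}
```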
Tim Gross 16913bf3a1
docs: CSI mount_options are available only for filesystem vols (#9043)
The CSI specification allows only the `file-system` attachment mode to have
mount options. The `block-device` mode is left "intentionally empty, for now"
in the protocol. We should be validating against this problem, but our
documentation also had it backwards.

Also adds missing mount_options on group volume.
2020-10-08 08:49:43 -04:00
Ryan Oaks 51bccb9f16
Merge pull request #9036 from hashicorp/ro.docs-vercel-migration
Update docs site to deploy to Vercel instead of Netlify
2020-10-07 13:53:16 -04:00
Ryan Oaks 98bdb9ee2e Update docs site to deploy to Vercel instead of Netlify 2020-10-07 13:49:05 -04:00
Tim Gross 82749bd6a6
csi: allow volume detach to take a node ID prefix (#9041)
Fixes a bug where the `nomad volume detach` command would not accept a node ID
prefix instead of a full node ID. The volume ID is already prefix matched
server-side.
2020-10-07 11:14:57 -04:00
Tim Gross 3ceb5b36b1
csi: allow more than 1 writer claim for multi-writer mode (#9040)
Fixes a bug where CSI volumes with the `MULTI_NODE_MULTI_WRITER` access mode
were using the same logic as `MULTI_NODE_SINGLE_WRITER` to determine whether
the volume had writer claims available for scheduling.

Extends CSI claim endpoint test to exercise multi-reader and make sure `WriteFreeClaims`
is exercised for multi-writer in feasibility test.
2020-10-07 10:43:23 -04:00
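The distinction the commit above draws between the two multi-node access modes can be illustrated with this sketch. It is not the body of `WriteFreeClaims`; `writeClaimAvailable` is a hypothetical function that only models whether one more writer claim fits.

```go
package main

import "fmt"

// writeClaimAvailable reports whether another writer claim can be
// scheduled, given the access mode and the current number of writers.
// Hypothetical simplification for illustration.
func writeClaimAvailable(accessMode string, currentWriters int) bool {
	switch accessMode {
	case "MULTI_NODE_MULTI_WRITER":
		// Any number of writers is allowed.
		return true
	case "SINGLE_NODE_WRITER", "MULTI_NODE_SINGLE_WRITER":
		// Only one writer at a time.
		return currentWriters == 0
	default:
		// Reader-only or unknown modes never accept writers.
		return false
	}
}

func main() {
	fmt.Println(writeClaimAvailable("MULTI_NODE_SINGLE_WRITER", 1))
	fmt.Println(writeClaimAvailable("MULTI_NODE_MULTI_WRITER", 1))
}
```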
Seth Hoenig 55824909b6
Merge pull request #9039 from hashicorp/f-update-aws-sdk-go
deps: update aws-sdk-go for ec2metadata bugfix
2020-10-06 13:28:00 -05:00
Seth Hoenig 3f5ecd6f42 deps: update aws-sdk-go for ec2metadata bugfix
Newer versions of aws-sdk-go contain a bugfix for retrieving
region vs availability-zone information.
2020-10-06 13:05:29 -05:00
James Rasell fac7fb5d5d
Merge pull request #9037 from hashicorp/f-changelog-entries-9023-9025
changelog: add entries for #9023 and #9025
2020-10-06 18:29:33 +02:00
James Rasell d2fe895216
Merge pull request #9023 from hashicorp/f-gh-8648
cli: add scale and scaling-events commands to job cmd.
2020-10-06 18:03:41 +02:00
Dave May 561637c063
Merge pull request #9034 from hashicorp/dmay-debug-metrics
Add metrics command / output to debug bundle
2020-10-06 11:47:09 -04:00
davemay99 67b4161411 added comment to operator metrics function 2020-10-06 11:22:10 -04:00
James Rasell ed7eb77bc7
changelog: add entries for #9023 and #9025 2020-10-06 17:16:55 +02:00
Seth Hoenig 6cffbecb3a
Merge pull request #9033 from pierreca/verify-remove-checks
Do not double-remove checks removed by Consul
2020-10-06 10:16:13 -05:00
davemay99 18aa30c00f metrics return bytes instead of string for more flexibility 2020-10-06 10:49:15 -04:00
davemay99 19a075cf47 update deprecated syntax per GH-9027 2020-10-06 09:47:16 -04:00
davemay99 7160c26f04 sync vendored modules 2020-10-06 09:16:52 -04:00
Drew Bailey a1cc07d4a4
make sync to align vendor/ with go.mod (#9032) 2020-10-06 08:30:32 -04:00
James Rasell b7dac9020f
Merge pull request #9025 from hashicorp/f-gh-8649
cli: add policy list and info to new scaling cmd.
2020-10-06 12:40:43 +02:00
James Rasell 552d1b2ed4
cli: ensure scaling policy target doesn't have trailing comma 2020-10-06 12:18:17 +02:00
James Rasell 564adc1678
cli: add scale and scaling-events commands to job cmd.
This adds the ability to scale Nomad jobs and view scaling events
via the CLI.
2020-10-06 09:58:46 +02:00
James Rasell ffe6533ad1
Merge pull request #9027 from hashicorp/f-gh-9026
cli: move tests to use NewMockUi func.
2020-10-06 08:28:18 +02:00
davemay99 603cc1776c Add metrics command / output to debug bundle 2020-10-05 22:30:01 -04:00
Pierre Cauchois 1efe05f516 Do not double-remove checks removed by Consul
When deregistering a service, consul also deregisters the associated
checks. The current state keeps track of all services and all checks
separately and deregisters them in sequence, which leads, whether during
syncs or shutdowns, to check deregistrations happening twice and failing
the second time (generating errors in logs).

This fix includes:
- a fix to the sync logic that just pulls the checks *after* the
services have been synced
- a fix to the shutdown mechanism that gets an updated list of checks
after deregistering the services, so that we get a cleaner check
deregistration process.
2020-10-06 00:30:29 +00:00
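The shutdown ordering described in the commit above can be sketched against an in-memory stand-in for the Consul agent. `fakeAgent` and its methods are hypothetical and do not mirror the Consul API client; the sketch only shows why re-reading the check list after deregistering services avoids the second, failing deregistration.

```go
package main

import (
	"fmt"
	"sort"
)

// fakeAgent is a hypothetical in-memory stand-in for a Consul agent.
type fakeAgent struct {
	services map[string]bool
	checks   map[string]string // check ID -> owning service ID
}

// deregisterService removes the service and, like Consul, its checks.
func (a *fakeAgent) deregisterService(id string) {
	delete(a.services, id)
	for checkID, svc := range a.checks {
		if svc == id {
			delete(a.checks, checkID)
		}
	}
}

// deregisterCheck fails if the check is already gone.
func (a *fakeAgent) deregisterCheck(id string) error {
	if _, ok := a.checks[id]; !ok {
		return fmt.Errorf("unknown check %q", id)
	}
	delete(a.checks, id)
	return nil
}

// checkIDs returns the checks the agent currently knows about.
func (a *fakeAgent) checkIDs() []string {
	ids := make([]string, 0, len(a.checks))
	for id := range a.checks {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

// shutdown deregisters services first, then fetches a fresh check list
// so that only checks which still exist are deregistered.
func shutdown(a *fakeAgent, serviceIDs []string) []error {
	for _, id := range serviceIDs {
		a.deregisterService(id)
	}
	var errs []error
	for _, id := range a.checkIDs() {
		if err := a.deregisterCheck(id); err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}

func main() {
	agent := &fakeAgent{
		services: map[string]bool{"web": true},
		checks:   map[string]string{"web-http": "web", "standalone": ""},
	}
	errs := shutdown(agent, []string{"web"})
	fmt.Println("errors:", len(errs), "remaining checks:", len(agent.checks))
}
```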
Seth Hoenig 0c5ae5769f
Merge pull request #9029 from hashicorp/b-tgs-updates
consul/connect: trigger update as necessary on connect changes
2020-10-05 16:48:04 -05:00
Chris Baker 8a50ba9399
Merge pull request #9020 from hashicorp/b-8796-i-do-not-want-no-null-chars
job, task group, and task IDs should not allow null characters
2020-10-05 14:54:02 -05:00
Seth Hoenig f44a4f68ee consul/connect: trigger update as necessary on connect changes
This PR fixes a long-standing bug where submitting jobs with changes
to connect services would not trigger updates as expected. Previously,
service blocks were not considered as sources of destructive updates
since they could be synced with consul non-destructively. With Connect,
task group services that have changes to their connect block or to
the service port should be destructive, since the network plumbing of
the alloc is going to need updating.

Fixes #8596 #7991

Non-destructive half in #7192
2020-10-05 14:53:00 -05:00
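The rule in the commit above, that a change to a group service's connect block or port forces a destructive update while other service changes do not, might be sketched like this. The `service` and `connect` types and `needsDestructiveUpdate` are hypothetical simplifications, not Nomad's structs or scheduler code.

```go
package main

import (
	"fmt"
	"reflect"
)

// connect and service are hypothetical, trimmed-down job spec types.
type connect struct {
	Native      bool
	SidecarPort string
}

type service struct {
	Name      string
	PortLabel string
	Tags      []string
	Connect   *connect
}

// needsDestructiveUpdate is true when the port or the connect block
// differs, since the alloc's network plumbing would need rebuilding.
// Other differences (for example tags) are treated as in-place here.
func needsDestructiveUpdate(old, updated service) bool {
	if old.PortLabel != updated.PortLabel {
		return true
	}
	return !reflect.DeepEqual(old.Connect, updated.Connect)
}

func main() {
	base := service{Name: "api", PortLabel: "9001", Connect: &connect{SidecarPort: "20000"}}

	tagsOnly := base
	tagsOnly.Tags = []string{"v2"}

	newSidecar := base
	newSidecar.Connect = &connect{SidecarPort: "20001"}

	fmt.Println(needsDestructiveUpdate(base, tagsOnly))
	fmt.Println(needsDestructiveUpdate(base, newSidecar))
}
```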
Chris Baker 7f701fddd0 updated docs and validation to further prohibit null chars in region, datacenter, and job name 2020-10-05 18:01:50 +00:00
Chris Baker 5c68f53c24 updated api tests wrt backwards compat on null chars in IDs 2020-10-05 18:01:50 +00:00
Chris Baker 23ea7cd27c updated job validate to reject job/group/task IDs containing null characters
updated CHANGELOG and upgrade guide
2020-10-05 18:01:49 +00:00
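A check of the kind the commit above describes could look like the following sketch. `validateID` is a hypothetical helper, not the function in Nomad's job validation.

```go
package main

import (
	"fmt"
	"strings"
)

// validateID rejects identifiers that contain a null character.
// kind is only used to build the error message.
func validateID(kind, id string) error {
	if strings.Contains(id, "\000") {
		return fmt.Errorf("%s ID contains a null character", kind)
	}
	return nil
}

func main() {
	fmt.Println(validateID("job", "web-frontend"))
	fmt.Println(validateID("task", "bad\000name"))
}
```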
Chris Baker c8fd9428d4 documenting tests around null characters in job id, task group name, and task name 2020-10-05 18:01:49 +00:00
Seth Hoenig e48605809b
Merge pull request #9008 from luhhujbb/doc-portmap
[doc] fix deprecated port_map example
2020-10-05 12:25:53 -05:00
Tim Gross 12f3249390
changelog entry for Consul segment fingerprint (#9015) 2020-10-05 12:35:34 -04:00
Tim Gross 474c18102d
e2e: extend ConsulTemplate test and fix flakiness (#8997)
Add service discovery integration to the existing consul-template E2E test,
and verify both service and key updates force re-rendering. Fixes flakiness by
using the longer default wait config we use elsewhere.

Removes our last direct dependency on gomega.
2020-10-05 10:51:55 -04:00
Ryan Oaks da3e4a9e71
Merge pull request #8991 from hashicorp/ro.docs-formatting
docs: Format docs website code and mdx
2020-10-05 10:31:40 -04:00
James Rasell 2ed78b8a7e
cli: move tests to use NewMockUi func. 2020-10-05 16:07:41 +02:00
Tim Gross 727277793b
e2e: bootstrap vault and provision Nomad with vault tokens (#9010)
Provisions vault with the policies described in the Nomad Vault integration
guide, and drops a configuration file for Nomad vault server configuration
with its token. The vault root token is exposed to the E2E runner so that
tests can write additional policies to vault.
2020-10-05 09:28:37 -04:00
James Rasell b8727997cd
cli: add policy list and info to new scaling cmd.
This adds the ability to detail scaling policies via the CLI.
2020-10-05 15:18:30 +02:00
Kent 'picat' Gruber 5e1c716835
Merge pull request #8998 from hashicorp/keygen-32-bytes
Use 32-byte key for gossip encryption to enable AES-256
2020-10-02 17:17:55 -04:00
Landan Cheruka 023a2d36b7
fingerprint: changed unique.platform.azure.hostname to unique.platform.azure.name (#9016) 2020-10-02 16:50:12 -04:00
Javier Heredia 103ac0a37f
Add consul segment fingerprint (#7214) 2020-10-02 15:15:59 -04:00