Commit Graph

21752 Commits

Author SHA1 Message Date
Seth Hoenig e47ea462fb client: fix logline in group shutdown hook
Fixes #10844
2021-07-08 11:14:37 -05:00
Seth Hoenig 868b246128 consul/connect: Avoid assumption of parent service when filtering connect proxies
This PR uses regex-based matching for sidecar proxy services and checks when syncing
with Consul. Previously we would check if the parent of the sidecar was still being
tracked in Nomad. This is a false invariant - one which we must not depend when we
make #10845 work.

Fixes #10843
2021-07-08 09:43:41 -05:00
Mahmood Ali 1f34f2197b
Merge pull request #10806 from hashicorp/munda/idempotent-job-dispatch
Enforce idempotency of dispatched jobs using token on dispatch request
2021-07-08 10:23:31 -04:00
Tim Gross 8f25a9d7cd
cni: respect default `cni_config_dir` and `cni_path` (#10870)
The default agent configuration values were not set, which meant they were not
being set in the client configuration and this results in fingerprints failing
unless the values were set explicitly.
2021-07-08 09:56:57 -04:00
Tim Gross 9f128a28ae
service: remove duplicate name check during validation (#10868)
When a task group with `service` block(s) is validated, we validate that there
are no duplicates, but this validation doesn't have access to the task environment
because it hasn't been created yet. Services and checks with interpolation can
be flagged incorrectly as conflicting. Name conflicts in services are not
actually an error in Consul and users have reported wanting to use the same
service name for task groups differentiated by tags.
2021-07-08 09:43:38 -04:00
Tim Gross e88e1e5001
testing: prevent panic when `job status` output changes (#10869)
The `command/TestJobStatusCommand_Run` test assumes that it gets back running
allocations and will panic the test runner rather than failing.
2021-07-08 09:25:44 -04:00
Jai Bhagat 46f8b3dc66 refactor tests that depend on Agent fixutres in Mirage 2021-07-08 08:01:15 -04:00
Alex Munda d4d57325c9
Sync vendored api 2021-07-07 19:53:46 -05:00
Alex Munda b3a437f65d
Changelog entry for dispatch idempotency token 2021-07-07 19:48:59 -05:00
Alex Munda 02c1a4d912
Set/parse idempotency_token query param 2021-07-07 16:26:55 -05:00
Alex Munda 9e5061ef87
Update idempotency comment to reflect all jobs
Co-authored-by: Mahmood Ali <mahmood@hashicorp.com>
2021-07-07 15:54:56 -05:00
Seth Hoenig a57b066402
Merge pull request #10865 from hashicorp/b-deregister-noops
consul: avoid extra sync operations when no action required
2021-07-07 13:42:46 -05:00
Isabel Suchanek 13db600665
cli: add -task flag to alloc signal, restart (#10859)
Alloc exec only works when task is passed as a flag and not an arg.
Alloc logs currently accepts either, but alloc signal and restart only
accept task as an arg. This adds -task as a flag to the other alloc
commands to make the cli UX consistent. If task is passed as a flag and
an arg, it ignores the arg.
2021-07-07 09:58:16 -07:00
Seth Hoenig 56a6a1b1df consul: avoid extra sync operations when no action required
This PR makes it so the Consul sync logic will ignore operations that
do not specify an action to take (i.e. [de-]register [services|checks]).

Ideally such noops would be discarded at the callsites (i.e. users
of [Create|Update|Remove]Workload], but we can also be defensive
at the commit point.

Also adds 2 trace logging statements which are helpful for diagnosing
sync operations with Consul - when they happen and why.

Fixes #10797
2021-07-07 11:24:56 -05:00
Tim Gross 417ec91317 scheduler: datacenter updates should be destructive
Updates to the datacenter field should be destructive for any allocation that
is on a node no longer in the list of datacenters, but inplace for any
allocation on a node that is still in the list. Add a check for this change to
the system and generic schedulers after we've checked the task definition for
updates and obtained the node for each current allocation.
2021-07-07 11:18:30 -04:00
Seth Hoenig dc3b13548b
Merge pull request #10857 from hashicorp/b-rm-canarys
consul: avoid triggering unnecessary sync when removing workload
2021-07-07 09:47:15 -05:00
James Rasell 90eced0e53
Merge pull request #10861 from hashicorp/f-gh-10860
api: Added `NewSystemJob` job creation helper function.
2021-07-07 16:17:15 +02:00
Mahmood Ali 5177abf752
Merge pull request #10743 from holtwilkins/parse-term-gws
Parse term gws
2021-07-07 10:13:37 -04:00
Tim Gross 69a7c9db7e
csi: account for nil volume_mount in API-to-structs conversion (#10855)
Fix a nil pointer in the API struct to `nomad/structs` conversion when a
`volume_mount` block is empty.
2021-07-07 08:06:39 -04:00
James Rasell 5a928a25ae
changelog: add entry for #10861 2021-07-07 11:03:39 +02:00
James Rasell 381741baad
api: Added `NewSystemJob` job creation helper function. 2021-07-07 11:03:20 +02:00
Jai Bhagat 2efa6cc7e4 add agent serializer
This commit adds a serializer for the Agent. When the factory was originally
designed to have the Member properities directly on the Agent class which was
not set-up properly technically, but since we didn't really make use of the
Agent endpoint.
2021-07-06 16:42:07 -04:00
Seth Hoenig c8260c3940 consul: avoid triggering unnecessary sync when removing workload
There are bits of logic in callers of RemoveWorkload on group/task
cleanup hooks which call RemoveWorkload with the "Canary" version
of the workload, in case the alloc is marked as a Canary. This logic
triggers an extra sync with Consul, and also doesn't do the intended
behavior - for which no special casing is necessary anyway. When the
workload is marked for removal, all associated services and checks
will be removed regardless of the Canary status, because the service
and check IDs do not incorporate the canary-ness in the first place.

The only place where canary-ness matters is when updating a workload,
where we need to compute the hash of the services and checks to determine
whether they have been modified, the Canary flag of which is a part of
that.

Fixes #10842
2021-07-06 14:08:42 -05:00
Mahmood Ali 94913d2ad6
Adopt go-changelog in Nomad (#10825)
Adopts [`go-changelog`](https://github.com/hashicorp/go-changelog) for managing Nomad's changelog. `go-changelog` is becoming the HashiCorp defacto standard tool for managing changelog, e.g. [Consul](https://github.com/hashicorp/consul/pull/8387), [Vault](https://github.com/hashicorp/vault/pull/10363), [Waypoint](https://github.com/hashicorp/waypoint/pull/1179). [Consul](https://github.com/hashicorp/consul/pull/8387) seems to be the first product to adopt it, and its PR has the most context - though I've updated `.changelog/README.md` with the relevant info here.

## Changes to developers workflow

When opening PRs, developers should add a changelog entry in `.changelog/<PR#>.txt`. Check [`.changelog/README.md`](https://github.com/hashicorp/nomad/blob/docs-adopt-gochangelog/.changelog/README.md#developer-guide). 

For the WIP release, entries can be amended even after the PR merged, and new files may be added post-hoc (e.g. during transition period, missed accidentally, community PRs, etc).

### Transitioning

Pending PRs can start including the changelog entry files immediately.

For 1.1.3/1.0.9 cycle, the release coordinator should create the entries for any PR that gets merged without a changelog entry file. They should also move any 1.1.3 entry in CHANGELOG.md to a changelog entry file, as this PR done for GH-10818.

## Changes to release process

Before cutting a release, release coordinator should update the changelog by inserting the output of `make changelog` to CHANGELOG.md with appropriate headers. See [`.changelog/README.md`](https://github.com/hashicorp/nomad/blob/docs-adopt-gochangelog/.changelog/README.md#how-to-generate-changelog-entries-for-release) for more details.


## Details

go-changelog is a basic templating engine for maintaining changelog in HashiCorp environment.

It expects the changelog entries as files indexed by their PR number. The CLI generates the changelog section for a release by comparing two git references (e.g. `HEAD` and the latest release, e.g. `v1.1.2`), and still requires manual process for updating CHANGELOG.md and final formatting.

The approach has many nice advantages:
* Avoids changelog related merge conflicts: Each PR touches different file!
* Copes with amendments and post-PR updates: Just add or update a changelog entry file using the original PR numbers.
* Addresses the release backporting scenario: Cherry-picking PRs will cherry-pick the relevant changelog entry automatically!
* Only relies on data available through `git` - no reliance on GitHub metadata or require GitHub credentials

The approach has few downsides though:
* CHANGELOG.md going stale during development and must be updated manually before cutting the release
  * Repository watchers can no longer glance at the CHANGELOG.md to see upcoming changes
  * We can periodically update the file, but `go-changelog` tool does not aid with that
* `go-changelog` tool does not offer good error reporting. If an entry is has an invalid tag (e.g. uses `release-note:bugfix` instead of `release-note:bug`), the entry will be dropped silently
  * We should update go-changelog to warn against unexpected entry tags
  * TODO: Meanwhile, PR reviewers and release coordinators should watch out

## Potential follow ups

We should follow up with CI checks to ensure PR changes include a warning. I've opted not to include that now. We still make many non-changelog-worth PRs for website/docs, for large features that get merged in multiple small PRs. I did not want to include a check that fails often.

Also, we should follow up to have `go-changelog` emit better warnings on unexpected tag.
2021-07-06 10:46:53 -04:00
James Rasell d9cedab221
Merge pull request #10853 from shantanugadgil/patch-4
constraint on the Docker driver missing 'attr'
2021-07-06 12:19:08 +02:00
Shantanu Gadgil 39671aa4f4
constraint on the Docker driver missing 'attr'
Between this page and https://www.nomadproject.io/docs/runtime/interpolation
I realized that the syntax on the Docker page was missing the word `attr`.
2021-07-06 08:48:04 +05:30
Ben Buzbee e247f8806b Don't treat a failed recover + successful destroy as a successful
recover

This code just seems incorrect. As it stands today it reports a
successful restore if RecoverTask fails and then DestroyTask succeeds.

This can result in a really annoying bug where it then calls RecoverTask
again, whereby it will probably get ErrTaskNotFound and call DestroyTask
once more.

I think the only reason this has not been noticed so far is because most
drivers like Docker will return Success, then nomad will call
RecoverTask, get an error (not found) and call DestroyTask again, and
get a ErrTasksNotFound err.
2021-07-03 01:46:36 +00:00
Alex Munda 557a227de1
Match idempotency key on all child jobs and return existing job when idempotency keys match. 2021-07-02 14:08:46 -05:00
Alex Munda 34c63b086b
Move idempotency check closer to validate. Log error. 2021-07-02 10:58:42 -05:00
Grant Griffiths 7f8e285559
CSI: Snapshot volume create should use vol.Secrets (#10840)
Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>
2021-07-02 08:28:22 -04:00
Alex Munda baba8fe7df
Update tests after moving idempotency token to WriteOptions 2021-07-01 08:48:57 -05:00
Alex Munda 848918018c
Move idempotency token to write options. Remove DispatchIdempotent 2021-06-30 15:10:48 -05:00
Alex Munda baae6d5546
Update comment about idempotency check 2021-06-30 12:30:44 -05:00
Alex Munda 01bcd9c41c
Make idempotency error user friendly
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2021-06-30 12:26:33 -05:00
Tim Gross db96e40f3a
docker: move host path for hosts file mount to alloc dir (#10823)
In Nomad 1.1.1 we generate a hosts file based on the Nomad-owned network
namespace, rather than using the default hosts file from the pause
container. This hosts file should be shared between tasks in the same
allocation so that tasks can update the file and have the results propagated
between tasks.
2021-06-30 11:10:04 -04:00
James Rasell 762d68a51c
Merge pull request #10829 from hashicorp/gh-10820-docs
changelog: add entry for #10822
2021-06-30 08:37:02 +02:00
James Rasell 9d093e4d8c
Merge pull request #10831 from jcalonso/patch-1
[docs] Fix missing opening quote
2021-06-30 08:36:39 +02:00
Holt Wilkins c3b2a72ac4 Enable parsing of terminating gateways 2021-06-30 05:34:16 +00:00
Juan Carlos Alonso ea272d91ce
Fix missing opening quote 2021-06-30 12:33:34 +08:00
Alex Munda ca86c7ba0c
Add idempotency token to dispatch request instead of special meta key 2021-06-29 15:59:23 -05:00
Seth Hoenig d1fa7ed88c
Merge pull request #10805 from hashicorp/b-cn-tls-env
consul/connect: automatically set consul tls sni name for connect native tasks
2021-06-29 14:10:21 -05:00
Alex Munda 122136b657
Always allow idempotency key meta. Tests for idempotent dispatch 2021-06-29 10:30:04 -05:00
James Rasell 0e9955bb4a
Merge pull request #10822 from hashicorp/b-gh-10820
cli: fixed system commands so they correctly use passed flags.
2021-06-29 10:11:00 +02:00
James Rasell ad6580686b
changelog: add entry for #10822 2021-06-29 10:10:32 +02:00
Tim Gross 5e6aca18e4
docs: unset port `to` field maps to dynamic port (#10828) 2021-06-28 15:55:24 -04:00
Tim Gross 777db36185
docs: add missing backwards compat warning about port_map (#10827)
The `docker` driver's `port_map` field was deprecated in 0.12 and this is
documented in the task driver's docs, but we never explicitly flagged it for
backwards compatibility.
2021-06-28 15:49:41 -04:00
Kendall Strautman 02d63a2024
chore: upgrade react-head and deps (#10811) 2021-06-28 08:55:38 -07:00
Seth Hoenig 5aa657c6bd consul/connect: automatically set consul tls sni name for connect native tasks
This PR makes it so that Nomad will automatically set the CONSUL_TLS_SERVER_NAME
environment variable for Connect native tasks running in bridge networking mode
where Consul has TLS enabled. Because of the use of a unix domain socket for
communicating with Consul when in bridge networking mode, the server name is
a file name instead of something compatible with the mTLS certificate Consul
will authenticate against. "localhost" is by default a compatible name, so Nomad
will set the environment variable to that.

Fixes #10804
2021-06-28 08:36:53 -05:00
Boris Shomodjvarac 64b1cafa57
docs: update csi_plugin example (#10821)
Current efs driver does not support telling it if its a `node` or a `controller`, and it will not print any error it will just ignore all other parameters then:(
So this will result in endpoint being `/tmp/csi.sock` and not `/csi/csi.sock` which will in turn break nomad/csi integration.

Also I changed the latest image tag to v1.3.2 to make sure anybody copy pasting this example is sure that it will work.

Tested on nomad 1.1.2
2021-06-28 08:28:03 -04:00
James Rasell 7a89dfe0cb
cli: fixed system commands so they correctly use passed flags. 2021-06-28 10:57:50 +02:00