Commit graph

4182 commits

Author SHA1 Message Date
Luiz Aoqui 2876739a51
api: apply consistent behaviour of the reverse query parameter (#12244) 2022-03-11 19:44:52 -05:00
Tim Gross 066a820209
job summary query in Job.List RPC should use job's namespace (#12249)
The `Job.List` RPC attaches a `JobSummary` to each job stub. We're
using the request namespace and not the job namespace for that query,
which results in a nil `JobSummary` whenever we pass the wildcard
namespace. This is incorrect and causes panics in downstream consumers
like the CLI, which assume the `JobSummary` is non-nil as an unstate
invariant.
2022-03-09 10:47:19 -05:00
Luiz Aoqui 550c5ab6ec
fix TestCSIVolumeEndpoint_List_PaginationFiltering test (#12245) 2022-03-09 09:40:40 -05:00
Luiz Aoqui ab8ce87bba
Add pagination, filtering and sort to more API endpoints (#12186) 2022-03-08 20:54:17 -05:00
Michael Schurter 7bb8de68e5
Merge pull request #12138 from jorgemarey/f-ns-meta
Add metadata to namespaces
2022-03-07 10:19:33 -08:00
Tim Gross 2dafe46fe3
CSI: allow updates to volumes on re-registration (#12167)
CSI `CreateVolume` RPC is idempotent given that the topology,
capabilities, and parameters are unchanged. CSI volumes have many
user-defined fields that are immutable once set, and many fields that
are not user-settable.

Update the `Register` RPC so that updating a volume via the API merges
onto any existing volume without touching Nomad-controlled fields,
while validating it with the same strict requirements expected for
idempotent `CreateVolume` RPCs.

Also, clarify that this state store method is used for everything, not just
for the `Register` RPC.
2022-03-07 11:06:59 -05:00
Tim Gross 3a692a4360
csi: get plugin ID for creating snapshot from volume, not args (#12195)
The `CreateSnapshot` RPC expects a plugin ID to be set by the API, but
in the common case of the `nomad volume snapshot create` command, we
don't ask the user for the plugin ID because it's available from the
volume we're snapshotting.

Change the order of the RPC so that we get the volume first and then
use the volume's plugin ID for the plugin if the API didn't set the
value.
2022-03-07 09:06:50 -05:00
Jorge Marey 372ea7479b Add changelog file. Add meta to ns mock for testing 2022-03-07 10:56:56 +01:00
Tim Gross b776c1c196
csi: fix prefix queries for plugin list RPC (#12194)
The `CSIPlugin.List` RPC was intended to accept a prefix to filter the
list of plugins being listed. This was being accidentally being done
in the state store instead, which contributed to incorrect filtering
behavior for plugins in the `volume plugin status` command.

Move the prefix matching into the RPC so that it calls the
prefix-matching method in the state store if we're looking for a
prefix.

Update the `plugin status command` to accept a prefix for the plugin
ID argument so that it matches the expected behavior of other commands.
2022-03-04 16:44:09 -05:00
Luiz Aoqui b1809eb48c
Fix CSI volume list with prefix and * namespace (#12184)
When using a prefix value and the * wildcard for namespace, the endpoint
would not take the prefix value into consideration due to the order in
which the checks were executed but also the logic for retrieving volumes
from the state store.

This commit changes the order to check for a prefix first and wraps the
result iterator of the state store query in a filter to apply the
prefix.
2022-03-03 17:27:04 -05:00
James Rasell ca6ba2e047
rpc: add job service registration list RPC endpoint. 2022-03-03 11:26:14 +01:00
James Rasell b68d573aa5
rpc: add alloc service registration list RPC endpoint. 2022-03-03 11:25:55 +01:00
James Rasell 1ad8ea558a
rpc: add service registration RPC endpoints. 2022-03-03 11:25:29 +01:00
James Rasell 52283f057f
fsm: add FSM functionality for service registration endpoints. 2022-03-03 11:24:29 +01:00
Luiz Aoqui 01931587ba
api: paginated results with different ordering (#12128)
The paginator logic was built when go-memdb iterators would return items
ordered lexicographically by their ID prefixes, but #12054 added the
option for some tables to return results ordered by their `CreateIndex`
instead, which invalidated the previous paginator assumption.

The iterator used for pagination must still return results in some order
so that the paginator can properly handle requests where the next_token
value is not present in the results anymore (e.g., the eval was GC'ed).

In these situations, the paginator will start the returned page in the
first element right after where the requested token should've been.

This commit moves the logic to generate pagination tokens from the
elements being paginated to the iterator itself so that callers can have
more control over the token format to make sure they are properly
ordered and stable.

It also allows configuring the paginator as being ordered in ascending
or descending order, which is relevant when looking for a token that may
not be present anymore.
2022-03-01 15:36:49 -05:00
Tim Gross f2a4ad0949
CSI: implement support for topology (#12129) 2022-03-01 10:15:46 -05:00
James Rasell 8a23afdb56
events: add state objects and logic for service registrations. 2022-02-28 10:44:58 +01:00
James Rasell 20249bb761
state: add service registration restore functionality. 2022-02-28 10:15:27 +01:00
James Rasell 74b367553e
state: add service registration state interaction functions. 2022-02-28 10:15:03 +01:00
James Rasell cfdb5a3c66
structs: add service registration struct and basic composed funcs. 2022-02-28 10:14:40 +01:00
James Rasell 1da859c60e
mock: add service registration mock generation for test use. 2022-02-28 10:14:25 +01:00
James Rasell cf0b63d561
state: add the table schema for the service_registrations table. 2022-02-28 10:14:10 +01:00
Jorge Marey a466f01120 Add metadata to namespaces 2022-02-27 09:09:10 +01:00
Seth Hoenig 1274aa690f tests: deflake test that joins a server with non-voting servers to form qourum
This PR
 - upgrades the serf library
 - has the test start the join process using the un-joined server first
 - disables schedulers on the servers
 - uses the WaitForLeader and wantPeers helpers

Not sure which, if any of these actually improves the flakiness of this test.
2022-02-24 17:02:58 -06:00
Tim Gross cfe3117af8
CSI: enforce usage at claim time (#12112)
* Remove redundant schedulable check in `FreeWriteClaims`. If a volume
  has been created but not yet claimed, its capabilities will be checked
  in `WriteSchedulable` at both scheduling time and claim time. We don't
  need to also check them in the `FreeWriteClaims` method.

* Enforce maximum volume claims for writers.

  When the scheduler checks feasibility for CSI volumes, the check is
  fairly loose: earlier versions of the same job are not counted as
  active claims. This allows the scheduler to place new allocations
  for the new version of a job, under the assumption that we'll replace
  the existing allocations and their volume claims.

  But when the alloc runner claims the volume, we need to enforce the
  active claims even if they're for allocations of an earlier version of
  the job. Otherwise we'll try to mount a volume that's currently being
  unmounted, and this will cause replacement allocations to frequently
  fail.

* Enforce single-node reader check for read-only volumes. When the
  alloc runner makes a claim for a read-only volume, we only check that
  the volume is potentially schedulable and not that it actually has
  free read claims.
2022-02-24 09:37:37 -05:00
Sander Mol 42b338308f
add go-sockaddr templating support to nomad consul address (#12084) 2022-02-24 09:34:54 -05:00
Florian Apolloner 3bced8f558
namespaces: allow enabling/disabling allowed drivers per namespace 2022-02-24 09:27:32 -05:00
Seth Hoenig 57b9c64b8f
Merge pull request #12107 from hashicorp/use-bbolt
core: swap bolt impl and enable configuring raft freelist sync behavior
2022-02-24 08:25:54 -06:00
Tim Gross 5b7b9fdafb
csi: tolerate missing plugins on job delete (#12114)
If a plugin job fails before successfully fingerprinting the plugins,
the plugin will not exist when we try to delete the job. Tolerate
missing plugins.
2022-02-24 08:53:15 -05:00
Seth Hoenig de95998faa core: switch to go.etc.io/bbolt
This PR swaps the underlying BoltDB implementation from boltdb/bolt
to go.etc.io/bbolt.

In addition, the Server has a new configuration option for disabling
NoFreelistSync on the underlying database.

Freelist option: https://github.com/etcd-io/bbolt/blob/master/db.go#L81
Consul equivelent PR: https://github.com/hashicorp/consul/pull/11720
2022-02-23 14:26:41 -06:00
Tim Gross 17dc0adee3
csi: fix broken test (#12110) 2022-02-23 13:48:39 -05:00
Tim Gross 57a546489f
CSI: minor refactoring (#12105)
* rename method checking that free write claims are available
* use package-level variables for claim errors
* semgrep fix for testify
2022-02-23 11:13:51 -05:00
Tim Gross de134d9783
csi: fix mocked modes in volumewatcher test (#12104)
The volumewatcher test incorrectly represents the change in attachment
and access modes introduced in Nomad 1.1.0 to support volume
creation. This leads to a test that happens to pass but only
accidentally.

Update the test to correctly represent the volume modes set by the
existing claims on the test volumes.
2022-02-23 09:51:20 -05:00
Michael Schurter 7494a0c4fd core: remove all traces of unused protocol version
Nomad inherited protocol version numbering configuration from Consul and
Serf, but unlike those projects Nomad has never used it. Nomad's
`protocol_version` has always been `1`.

While the code is effectively unused and therefore poses no runtime
risks to leave, I felt like removing it was best because:

1. Nomad's RPC subsystem has been able to evolve extensively without
   needing to increment the version number.
2. Nomad's HTTP API has evolved extensively without increment
   `API{Major,Minor}Version`. If we want to version the HTTP API in the
   future, I doubt this is the mechanism we would choose.
3. The presence of the `server.protocol_version` configuration
   parameter is confusing since `server.raft_protocol` *is* an important
   parameter for operators to consider. Even more confusing is that
   there is a distinct Serf protocol version which is included in `nomad
   server members` output under the heading `Protocol`. `raft_protocol`
   is the *only* protocol version relevant to Nomad developers and
   operators. The other protocol versions are either deadcode or have
   never changed (Serf).
4. If we were to need to version the RPC, HTTP API, or Serf protocols, I
   don't think these configuration parameters and variables are the best
   choice. If we come to that point we should choose a versioning scheme
   based on the use case and modern best practices -- not this 6+ year
   old dead code.
2022-02-18 16:12:36 -08:00
Luiz Aoqui de91954582
initial base work for implementing sorting and filter across API endpoints (#12076) 2022-02-16 14:34:36 -05:00
Luiz Aoqui 110dbeeb9d
Add go-bexpr filters to evals and deployment list endpoints (#12034) 2022-02-16 11:40:30 -05:00
Seth Hoenig 40c714a681 api: return sorted results in certain list endpoints
These API endpoints now return results in chronological order. They
can return results in reverse chronological order by setting the
query parameter ascending=true.

- Eval.List
- Deployment.List
2022-02-15 13:48:28 -06:00
Luiz Aoqui 3bf6036487 Version 1.2.6
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJiBIXqAAoJELC0QQl2hbZ2M8cP/A7LENJbFSph25M1aGItra5j
 BphSX//Sq/v9ZzO44rOGNYQGfTpFT8STJgj2GC50qR/ilF4KX4D0oZlDyu/6D0NG
 ouN9RUjnFd6IEDQrjqqqhr3F69Z95SWVfi1rfgn/pIgOYkVEXfi6DXaulVVyd2ZT
 J0G5w5ryl5d8PhuL7TWw4zbhZRQn0hVspZv/1s3/I9aG6Sew8SMweeOxbN9lBr7E
 H19Amdjh6ugRuPgU7YMpKDVrZQRv9Wt7BUP/uc0u3LiW9z3Ko8ZKnCRKErtL5Kc3
 HDZsWe+t3va4Uekzd0HULNcYU4kwjogdRYRzX5kRsOyXelrZkQIqYFiKrk1wVbq/
 cYM5DUak6eUQBGhgi3UY0fklBFq4GDGpiwEzn7rvQb0PRSuVyykgbZ12fzyIu8dp
 tWbR/WOEg9F+jva6HkR2kDIcr5mDmny3Pxi5aUT6lMk1111nCzOjDzhLkQVtfsex
 FDMByXxM4oWAK3ouq2OIdxDL2c742A2933C4/30KWE7Xy7twsvkGw52irw66VO3V
 4PHP880cDvEDaEh15mY/8FlaAE7t/gsCUuYLxGwl33TaXSRBLc9vVNrrp89q53TD
 ZcvXTBpHUOWa6ZlHF/4f8LW44rowM6bU0Wili7NaWOKx86dnUJMG4sqJifNgcpS/
 7lXogv98CYLbMy4X4if0
 =NY1Z
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElFaq1Z5DKdB91i+lKfRZwNnLtXMFAmIFbbkACgkQKfRZwNnL
 tXOr/g/+N2ZBMK8ohEvtdXLl7WXrVhgJfUSVbdD5Kfshul9CPn3yWRxJzqtEN2Pf
 55ozeWLpoziP9y9LviJ7rDidXcTmDFutbFdGJ3L+ZLdLILsNOq1A+lbuwO3fJngZ
 5aiPoJLsw4sqj6uHaM6Cls2f145O92nT7GXEHCxuvGHeSf3NkcR+zRY5nPrLTIrA
 uxYefCOzP6C2I+W7dL4Oj5R5EZd4UDi1WiL8pGzwm24LcagZN2ctctolAeF9OlJX
 M58UUv9b4GObe617u8MeH0LIlyZiNwn9JqrV33dKVTyrkBIYfYxkzdzMKf1csVYk
 kQb13KPdPTASBAGTl+sxeXXnw/bg09JXGcvREX5lLyQqY8xGwTv2FpTmybKWLiss
 Bg6BbejrgtCPBik0EAHWV0+kVzhi9bPfUYwTXLDCzMtrbyCyPoWchruel2sm41U1
 ezRDzlSvf6nrXf7sAv6umJICck4Bc5Gol+8W7fxvWqnY9rQ3ds2v7E5lXZMBbOmE
 JSi+EDWBJjBAXehE6pLxeVsvlHMRWN007Z2UeD4neGIgG7xFJLq6nKeUKoiNIpgk
 hKBL8iwHyuJfrBB/dcPzI9NV+jL6OZ/oI1RWxSj0MX/B4VXZp8HrqZA5JxzQolUg
 KIxqe4iX3WIkQv+UU4WiELvs4O7fujB4KWz3iQokhwDxqGUpffk=
 =5EG2
 -----END PGP SIGNATURE-----

Merge tag 'v1.2.6' into merge-release-1.2.6-branch

Version 1.2.6
2022-02-10 14:55:34 -05:00
Seth Hoenig 437bb4b86d
client: check escaping of alloc dir using symlinks
This PR adds symlink resolution when doing validation of paths
to ensure they do not escape client allocation directories.
2022-02-09 19:50:13 -05:00
Tim Gross 6bd33d3fb9
CSI: use job status not alloc status for plugin updates from summary (#12027)
When an allocation is updated, the job summary for the associated job
is also updated. CSI uses the job summary to set the expected count
for controller and node plugins. We incorrectly used the allocation's
server status instead of the job status when deciding whether to
update or remove the job from the plugins. This caused a node drain or
other terminal state for an allocation to clear the expected count for
the entire plugin.

Use the job status to guide whether to update or remove the expected
count.

The existing CSI tests for the state store incorrectly modeled the
updates we received from servers vs those we received from clients,
leading to test assertions that passed when they should not.

Rework the tests to clarify each step in the lifecycle and rename CSI state
store functions for clarity
2022-02-09 11:51:49 -05:00
Luiz Aoqui 0e09b120e4
fix mTLS certificate check on agent to agent RPCs (#11998)
PR #11956 implemented a new mTLS RPC check to validate the role of the
certificate used in the request, but further testing revealed two flaws:

  1. client-only endpoints did not accept server certificates so the
     request would fail when forwarded from one server to another.
  2. the certificate was being checked after the request was forwarded,
     so the check would happen over the server certificate, not the
     actual source.

This commit checks for the desired mTLS level, where the client level
accepts both, a server or a client certificate. It also validates the
cercertificate before the request is forwarded.
2022-02-04 20:35:20 -05:00
Karthick Ramachandran 0600bc32e2
improve error message on service length (#12012) 2022-02-04 19:39:34 -05:00
Tim Gross 790e49b0dd
core: log CSI GC cutoff index only on non-forced GC (#11997)
Non-CSI garbage collection tasks on the server only log the cutoff
index in the case where it's not a forced GC from `nomad system gc`.
Do the same for CSI for consistency.
2022-02-03 15:03:39 -05:00
Seth Hoenig 5f48e18189
Merge pull request #11983 from hashicorp/b-select-after
cleanup: prevent leaks from time.After
2022-02-03 09:38:06 -06:00
Samantha 54f8c04c91
Fix health checking for ephemeral poststart tasks (#11945)
Update the logic in the Nomad client's alloc health tracker which
erroneously marks existing healthy allocations with dead poststart ephemeral
tasks as unhealthy even if they were already successful during a previous
deployment.
2022-02-02 16:29:49 -05:00
Seth Hoenig db2347a86c cleanup: prevent leaks from time.After
This PR replaces use of time.After with a safe helper function
that creates a time.Timer to use instead. The new function returns
both a time.Timer and a Stop function that the caller must handle.

Unlike time.NewTimer, the helper function does not panic if the duration
set is <= 0.
2022-02-02 14:32:26 -06:00
Luiz Aoqui c4cff5359f
Verify TLS certificate on endpoints that are used between agents only (#11956) 2022-02-02 15:03:18 -05:00
Michael Schurter fd242ab7f8
Merge pull request #11878 from kainoaseto/fix/multi-task-group-canary-deploys
Bugfix: auto-promote canary taskgroups when mixed with non-canary taskgroups
2022-01-31 16:22:51 -08:00
kainoaseto d575b3f4ae rename test variable names to something a easier to identify 2022-01-31 14:59:52 -08:00
Michael Schurter 8973cc39a3
Merge pull request #11944 from hashicorp/b-validate-plan
core: prevent malformed plans from crashing leader
2022-01-31 13:14:28 -08:00
Michael Schurter d87ed3fcd7 core: prevent malformed plans from crashing leader
The Plan.Submit endpoint assumed PlanRequest.Plan was never nil. While
there is no evidence it ever has been nil, we should not panic if a nil
plan is ever submitted because that would crash the leader.
2022-01-31 12:15:15 -08:00
Tim Gross 5773fc93a2 CSI: move terminal alloc handling into denormalization (#11931)
* The volume claim GC method and volumewatcher both have logic
collecting terminal allocations that duplicates most of the logic
that's now in the state store's `CSIVolumeDenormalize` method. Copy
this logic into the state store so that all code paths have the same
view of the past claims.
* Remove logic in the volume claim GC that now lives in the state
store's `CSIVolumeDenormalize` method.
* Remove logic in the volumewatcher that now lives in the state
store's `CSIVolumeDenormalize` method.
* Remove logic in the node unpublish RPC that now lives in the state
store's `CSIVolumeDenormalize` method.
2022-01-28 14:43:50 -05:00
Tim Gross c67c31e543 csi: ensure that PastClaims are populated with correct mode (#11932)
In the client's `(*csiHook) Postrun()` method, we make an unpublish
RPC that includes a claim in the `CSIVolumeClaimStateUnpublishing`
state and using the mode from the client. But then in the
`(*CSIVolume) Unpublish` RPC handler, we query the volume from the
state store (because we only get an ID from the client). And when we
make the client RPC for the node unpublish step, we use the _current
volume's_ view of the mode. If the volume's mode has been changed
before the old allocations can have their claims released, then we end
up making a CSI RPC that will never succeed.

Why does this code path get the mode from the volume and not the
claim? Because the claim written by the GC job in `(*CoreScheduler)
csiVolumeClaimGC` doesn't have a mode. Instead it just writes a claim
in the unpublishing state to ensure the volumewatcher detects a "past
claim" change and reaps all the claims on the volumes.

Fix this by ensuring that the `CSIVolumeDenormalize` creates past
claims for all nil allocations with a correct access mode set.
2022-01-28 14:43:43 -05:00
Tim Gross 951661db04 CSI: resolve invalid claim states (#11890)
* csi: resolve invalid claim states on read

It's currently possible for CSI volumes to be claimed by allocations
that no longer exist. This changeset asserts a reasonable state at
the state store level by registering these nil allocations as "past
claims" on any read. This will cause any pass through the periodic GC
or volumewatcher to trigger the unpublishing workflow for those claims.

* csi: make feasibility check errors more understandable

When the feasibility checker finds we have no free write claims, it
checks to see if any of those claims are for the job we're currently
scheduling (so that earlier versions of a job can't block claims for
new versions) and reports a conflict if the volume can't be scheduled
so that the user can fix their claims. But when the checker hits a
claim that has a GCd allocation, the state is recoverable by the
server once claim reaping completes and no user intervention is
required; the blocked eval should complete. Differentiate the
scheduler error produced by these two conditions.
2022-01-28 14:43:35 -05:00
Tim Gross 4e559c6255 csi: update leader's ACL in volumewatcher (#11891)
The volumewatcher that runs on the leader needs to make RPC calls
rather than writing to raft (as we do in the deploymentwatcher)
because the unpublish workflow needs to make RPC calls to the
clients. This requires that the volumewatcher has access to the
leader's ACL token.

But when leadership transitions, the new leader creates a new leader
ACL token. This ACL token needs to be passed into the volumewatcher
when we enable it, otherwise the volumewatcher can find itself with a
stale token.
2022-01-28 14:43:27 -05:00
Tim Gross b20a6c9ffb
CSI: move terminal alloc handling into denormalization (#11931)
* The volume claim GC method and volumewatcher both have logic
collecting terminal allocations that duplicates most of the logic
that's now in the state store's `CSIVolumeDenormalize` method. Copy
this logic into the state store so that all code paths have the same
view of the past claims.
* Remove logic in the volume claim GC that now lives in the state
store's `CSIVolumeDenormalize` method.
* Remove logic in the volumewatcher that now lives in the state
store's `CSIVolumeDenormalize` method.
* Remove logic in the node unpublish RPC that now lives in the state
store's `CSIVolumeDenormalize` method.
2022-01-27 10:39:08 -05:00
Tim Gross a40a20cff8
csi: ensure that PastClaims are populated with correct mode (#11932)
In the client's `(*csiHook) Postrun()` method, we make an unpublish
RPC that includes a claim in the `CSIVolumeClaimStateUnpublishing`
state and using the mode from the client. But then in the
`(*CSIVolume) Unpublish` RPC handler, we query the volume from the
state store (because we only get an ID from the client). And when we
make the client RPC for the node unpublish step, we use the _current
volume's_ view of the mode. If the volume's mode has been changed
before the old allocations can have their claims released, then we end
up making a CSI RPC that will never succeed.

Why does this code path get the mode from the volume and not the
claim? Because the claim written by the GC job in `(*CoreScheduler)
csiVolumeClaimGC` doesn't have a mode. Instead it just writes a claim
in the unpublishing state to ensure the volumewatcher detects a "past
claim" change and reaps all the claims on the volumes.

Fix this by ensuring that the `CSIVolumeDenormalize` creates past
claims for all nil allocations with a correct access mode set.
2022-01-27 10:05:41 -05:00
Tim Gross a2433e35fb
CSI: resolve invalid claim states (#11890)
* csi: resolve invalid claim states on read

It's currently possible for CSI volumes to be claimed by allocations
that no longer exist. This changeset asserts a reasonable state at
the state store level by registering these nil allocations as "past
claims" on any read. This will cause any pass through the periodic GC
or volumewatcher to trigger the unpublishing workflow for those claims.

* csi: make feasibility check errors more understandable

When the feasibility checker finds we have no free write claims, it
checks to see if any of those claims are for the job we're currently
scheduling (so that earlier versions of a job can't block claims for
new versions) and reports a conflict if the volume can't be scheduled
so that the user can fix their claims. But when the checker hits a
claim that has a GCd allocation, the state is recoverable by the
server once claim reaping completes and no user intervention is
required; the blocked eval should complete. Differentiate the
scheduler error produced by these two conditions.
2022-01-27 09:30:03 -05:00
James Rasell c93c292dca
Merge pull request #11907 from hashicorp/f-state-store-nomad-file
state: move restore functionality into its own file.
2022-01-25 08:55:49 +01:00
Tim Gross 04977525dd
csi: update leader's ACL in volumewatcher (#11891)
The volumewatcher that runs on the leader needs to make RPC calls
rather than writing to raft (as we do in the deploymentwatcher)
because the unpublish workflow needs to make RPC calls to the
clients. This requires that the volumewatcher has access to the
leader's ACL token.

But when leadership transitions, the new leader creates a new leader
ACL token. This ACL token needs to be passed into the volumewatcher
when we enable it, otherwise the volumewatcher can find itself with a
stale token.
2022-01-24 11:49:50 -05:00
James Rasell 03cdd1a624
state: move restore functionality into its own file. 2022-01-24 13:03:03 +01:00
kainoaseto 4bb8dee26a fix auto promote of canary task groups when deploying them alongside rolling deploy taskgroups that do not use the canary deployment system 2022-01-18 17:04:03 -08:00
Nomad Release bot de3070d49a Generate files for 1.2.4 release 2022-01-18 23:43:00 +00:00
Luiz Aoqui b1753d0568
scheduler: detect and log unexpected scheduling collisions (#11793) 2022-01-14 20:09:14 -05:00
Tim Gross d7756f8cdb
csi: volume deregistration should require exact ID (#11852)
The command line client sends a specific volume ID, but this isn't
enforced at the API level and we were incorrectly using a prefix match
for volume deregistration, resulting in cases where a volume with a
shorter ID that's a prefix of another volume would be deregistered
instead of the intended volume.
2022-01-14 12:26:03 -05:00
James Rasell 1d5c5062f2
chore: ensure consistent file naming for non-enterprise files. 2022-01-13 11:32:16 +01:00
Michael Schurter e6eff95769 agent: validate reserved_ports are valid
Goal is to fix at least one of the causes that can cause a node to be
ineligible to receive work:
https://github.com/hashicorp/nomad/issues/9506#issuecomment-1002880600
2022-01-12 14:21:47 -08:00
Conor Evans 8d622797af
replace 'a alloc' with 'an alloc' where appropriate (#11792) 2022-01-10 11:59:46 -05:00
Derek Strickland 0a8e03f0f7
Expose Consul template configuration parameters (#11606)
This PR exposes the following existing`consul-template` configuration options to Nomad jobspec authors in the `{job.group.task.template}` stanza.

- `wait`

It also exposes the following`consul-template` configuration to Nomad operators in the `{client.template}` stanza.

- `max_stale`
- `block_query_wait`
- `consul_retry`
- `vault_retry` 
- `wait` 

Finally, it adds the following new Nomad-specific configuration to the `{client.template}` stanza that allows Operators to set bounds on what `jobspec` authors configure.

- `wait_bounds`

Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2022-01-10 10:19:07 -05:00
Joel May a8fc048151
Emit metrics for eval waitUntil as nomad.nomad.broker.eval_waiting (#10236) 2022-01-06 15:57:40 -05:00
Joel May 4f78bcfb98
Emit metrics on reschedule later decisions as nomad.client.allocs.reschedule (#10237) 2022-01-06 15:56:43 -05:00
Charlie Voiselle 98a240cd99
Make number of scheduler workers reloadable (#11593)
## Development Environment Changes
* Added stringer to build deps

## New HTTP APIs
* Added scheduler worker config API
* Added scheduler worker info API

## New Internals
* (Scheduler)Worker API refactor—Start(), Stop(), Pause(), Resume()
* Update shutdown to use context
* Add mutex for contended server data
    - `workerLock` for the `workers` slice
    - `workerConfigLock` for the `Server.Config.NumSchedulers` and
      `Server.Config.EnabledSchedulers` values

## Other
* Adding docs for scheduler worker api
* Add changelog message

Co-authored-by: Derek Strickland <1111455+DerekStrickland@users.noreply.github.com>
2022-01-06 11:56:13 -05:00
Michael Schurter 1af8d47de2
Merge pull request #11744 from hashicorp/b-node-copy
Fix Node.Copy()
2022-01-05 17:01:53 -08:00
Tim Gross 51f512a3e6
csi: reap unused volume claims at leadership transitions (#11776)
When `volumewatcher.Watcher` starts on the leader, it starts a watch
on every volume and triggers a reap of unused claims on any change to
that volume. But if a reaping is in-flight during leadership
transitions, it will fail and the event that triggered the reap will
be dropped. Perform one reap of unused claims at the start of the
watcher so that leadership transitions don't drop this event.
2022-01-05 11:40:20 -05:00
Tim Gross 395628efe1
api: paginate deployment list and accept wildcard namespace (#11743)
Add `per_page` and `next_token` handling to `Deployment.List` RPC, and
allow the use of a wildcard namespace for namespace filtering.
2022-01-03 08:36:02 -05:00
Michael Schurter 20bd8acf43 do not initialize copy's slice if nil in original 2021-12-23 16:40:35 -08:00
Michael Schurter 88200f4eb9 core: fix DNS and CPU Core copying 2021-12-23 12:28:19 -08:00
Michael Schurter 7d741837b0 core: match struct field order in Copy() 2021-12-23 12:27:39 -08:00
Luiz Aoqui 40093f97cd
api: support namespace wildcard in CSI volume list (#11724) 2021-12-21 17:19:45 -05:00
Tim Gross e046bb31e9
api: respect wildcard in evaluations list API (#11710) 2021-12-20 12:23:50 -05:00
James Rasell 45f4689f9c
chore: fixup inconsistent method receiver names. (#11704) 2021-12-20 11:44:21 +01:00
Tim Gross a0cf5db797
provide -no-shutdown-delay flag for job/alloc stop (#11596)
Some operators use very long group/task `shutdown_delay` settings to
safely drain network connections to their workloads after service
deregistration. But during incident response, they may want to cause
that drain to be skipped so they can quickly shed load.

Provide a `-no-shutdown-delay` flag on the `nomad alloc stop` and
`nomad job stop` commands that bypasses the delay. This sets a new
desired transition state on the affected allocations that the
allocation/task runner will identify during pre-kill on the client.

Note (as documented here) that using this flag will almost always
result in failed inbound network connections for workloads as the
tasks will exit before clients receive updated service discovery
information and won't be gracefully drained.
2021-12-13 14:54:53 -05:00
Tim Gross 624ecab901
evaluations list pagination and filtering (#11648)
API queries can request pagination using the `NextToken` and `PerPage`
fields of `QueryOptions`, when supported by the underlying API.

Add a `NextToken` field to the `structs.QueryMeta` so that we have a
common field across RPCs to tell the caller where to resume paging
from on their next API call. Include this field on the `api.QueryMeta`
as well so that it's available for future versions of List HTTP APIs
that wrap the response with `QueryMeta` rather than returning a simple
list of structs. In the meantime callers can get the `X-Nomad-NextToken`.

Add pagination to the `Eval.List` RPC by checking for pagination token
and page size in `QueryOptions`. This will allow resuming from the
last ID seen so long as the query parameters and the state store
itself are unchanged between requests.

Add filtering by job ID or evaluation status over the results we get
out of the state store.

Parse the query parameters of the `Eval.List` API into the arguments
expected for filtering in the RPC call.
2021-12-10 13:43:03 -05:00
Tim Gross 03e697a69d
scheduler: config option to reject job registration (#11610)
During incident response, operators may find that automated processes
elsewhere in the organization can be generating new workloads on Nomad
clusters that are unable to handle the workload. This changeset adds a
field to the `SchedulerConfiguration` API that causes all job
registration calls to be rejected unless the request has a management
ACL token.
2021-12-06 15:20:34 -05:00
Michael Schurter 3d248153f4
Merge pull request #11579 from hashicorp/b-getscalingpolicy-rpc-index-response
rpc: fix scaling policy get index response when policy is found.
2021-11-30 10:43:20 -08:00
Tim Gross 39acac33a0
ui: change Consul/Vault base URL field name (#11589)
Give ourselves some room for extension in the UI configuration block
by naming the field `ui_url`, which will let us have an `api_url`.
Fix the template path to ensure we're getting the right value from the
API.
2021-11-30 13:20:29 -05:00
James Rasell 2412e9916d
rpc: fix scaling policy get index response when policy is found.
When GetPolicy is called within the scaling handler, the index
table was being used to populate the reply index irregardless of
whether the policy was found or not. This change fixes that
behaviour so that the policy modify index is used when the policy
lookup is successful.
2021-11-26 10:40:27 +01:00
Luiz Aoqui 0cf1964651
Merge remote-tracking branch 'origin/release-1.2.2' into merge-release-1.2.2-branch 2021-11-24 14:40:45 -05:00
Nomad Release Bot 2e4ef67c2d remove generated files 2021-11-24 18:54:50 +00:00
Tim Gross fcb96de9a7
config: UI configuration block with Vault/Consul links (#11555)
Add `ui` block to agent configuration to enable/disable the web UI and
provide the web UI with links to Vault/Consul.
2021-11-24 11:20:02 -05:00
Luiz Aoqui 9d6842dd4d
Don't emit scaling event error when a deployment is underway (#11556) 2021-11-23 10:20:18 -05:00
James Rasell 751c8217d1
core: allow setting and propagation of eval priority on job de/registration (#11532)
This change modifies the Nomad job register and deregister RPCs to
accept an updated option set which includes eval priority. This
param is optional and override the use of the job priority to set
the eval priority.

In order to ensure all evaluations as a result of the request use
the same eval priority, the priority is shared to the
allocReconciler and deploymentWatcher. This creates a new
distinction between eval priority and job priority.

The Nomad agent HTTP API has been modified to allow setting the
eval priority on job update and delete. To keep consistency with
the current v1 API, job update accepts this as a payload param;
job delete accepts this as a query param.

Any user supplied value is validated within the agent HTTP handler
removing the need to pass invalid requests to the server.

The register and deregister opts functions now all for setting
the eval priority on requests.

The change includes a small change to the DeregisterOpts function
which handles nil opts. This brings the function inline with the
RegisterOpts.
2021-11-23 09:23:31 +01:00
Luiz Aoqui d3c1a03edd Version 1.2.1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJhl94SAAoJELC0QQl2hbZ2pqoP/R7HyOxvealo5MBJcG4mGiWT
 Hsu9VXpYKDWn0GSXd3JmqYWH7tIwFMXispZ7pMlDLieypW3UpMYIbIquaePxOaRL
 yhlc0CLT7JDsFPx8Puv1fgKXaS3EfFyJlYx437bhCQ+K0k2+1n3EOhrzU/DQ4j8V
 D5qxlkZh6IK6brIJ54NivGzTxtzGGvIGXCrDPolX3cwoBtyO/pbecfEkRlN2xwxl
 P68l52+Jit3lK2Cljh4Kr1qFj8voHPjYUTXGas8ZkIVrx9l4fb6CHib2y3hy4bRR
 qwXT4keWc8bxtLQ7vtetGBAXp4UKJigziE4imhHAttBN9th2/Oy0qSQCNX3xELJC
 Jwgc+N+ON63QI2sP/8FWvmeUrJpASRITYl/Gr8uOR6n1PacrBhFT9OV4VMkte1ua
 jS/WF/7k21NZYqZca+thvN12wmw/gSEAEeCHH5kR3vPLeV6FdanhKLjufMNuMShc
 UKJCEZw1/Lyux1XkLqMPoZ4DCak8/HskupQoLNsekF1Uki8ObU4as7GERedxqkj6
 i2+1QIQMqvviskOwT0QOWm4RFXjRQsIK8uUfXzHHWDMzDhvnGjB0eWVMLAj4/rTe
 46yUP4kdarFkxwkDmLEyoogdD35wC4Xc8Y8IynzUTN77pOWID5QEyFZVaaBB4NR3
 wNowUJGrNkxEYXwGSkjh
 =Zuw2
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElFaq1Z5DKdB91i+lKfRZwNnLtXMFAmGbu3sACgkQKfRZwNnL
 tXMx4BAAksQ07tSoOku8zDwx2JpoiNApoYhMLlfJ4S3Mw+RYtbayAMRyA08GG56I
 U85XJB/Z2CzliYL/Nya1e3z6Gyn92V0iD9u7N1xEAPt8PdyiXqIBZn1rWoiCcnMO
 C3f2aRGhLZMVOZG0v7fgbh1PkhJt4MLcRQE9nn5ojPvFzW9bL0Iz7lc9IxHQtaU0
 rANDcXdj3IhiOdEgjtO++Qhdeu3t2SBhT2xFnlJ3gXC2q/aY1a2C7BYdlSxtw0JU
 nKpxvBTsB7rINGcYxhXZlckui5YLL4BX11XqsYhUTMC+33vxE5HNty1ANc1+SNyO
 0iHp0yc5J6MCLuiZ/2sBek2tC+KHCufb+qEIqPmBpcWPJRT8HjginLxj/HyL2TQc
 pLF9XxhYKvv0sm3Zr3Ima5kqWgayph3XhQ73hKs9f7SLfErr6qr4XaI8egZA4OTG
 0QGmY/61UlAdsz5tUvIGRWYD5rqXyXIYnUprldPSQdeZ0o2GjX7T0GZ934O5uHfE
 Ne73GafGn8JaGxH9+AEHMJAVpkrzWR1wrExL3kGJ8NF40HlsYofIuhTkZqMKX3EH
 7KfefSJW1NQAGeAEwjtvzhmUiM0cVoCWGd4COxX1G3oJ0o8gZ3RklDEA4Pa9C0rO
 pBW/KIckPpGieGvPaA3mqmXDjx6oOaxPi9wd5TniBHh43pgrASo=
 =KVce
 -----END PGP SIGNATURE-----

Merge tag 'v1.2.1' into merge-release-1.2.1-branch

Version 1.2.1
2021-11-22 10:47:04 -05:00
Tim Gross e729133134
api: return 404 for alloc FS list/stat endpoints (#11482)
* api: return 404 for alloc FS list/stat endpoints

If the alloc filesystem doesn't have a file requested by the List
Files or Stat File API, we currently return a HTTP 500 error with the
expected "file not found" error message. Return a HTTP 404 error
instead.

* update FS Handler

Previously the FS handler would interpret a 500 status as a 404
in the adapter layer by checking if the response body contained
the text  or is the response status
was 500 and then throw an error code for 404.

Co-authored-by: Jai Bhagat <jaybhagat841@gmail.com>
2021-11-17 11:15:07 -05:00
Charlie Voiselle 176de1bfe6
Refactor sendAck(3) into sendAck(2),sendNack(2),sendAcknowledgement(3) (#11506) 2021-11-17 10:49:55 -05:00
Danish Prakash 1e2c9b3aa0
client: emit max_memory metric (#11490) 2021-11-17 08:34:22 -05:00
Nomad Release bot c4463682e7 Generate files for 1.2.0 release 2021-11-15 23:00:30 +00:00
James Rasell 99955eb80f
Merge pull request #11426 from hashicorp/b-set-dereg-eval-priority-correctly
rpc: set the deregistration eval priority to the job priority.
2021-11-05 15:53:10 +01:00
James Rasell 2cc661c523
Merge pull request #11429 from hashicorp/b-set-scale-eval-priority-correctly
rpc: set the job scale eval priority to the job priority.
2021-11-05 15:52:31 +01:00
Alessandro De Blasis 07c670fdc0
cli: show host_network in nomad status (#11432)
Enhance the CLI in order to return the host network in two flavors 
(default, verbose) of the `node status` command.

Fixes: #11223.
Signed-off-by: Alessandro De Blasis <alex@deblasis.net>
2021-11-05 09:02:46 -04:00
Michael Schurter 3718557041
Merge pull request #11416 from hashicorp/f-rejected-info
core: bump rejected plans from debug -> info
2021-11-03 16:49:28 -07:00
Luiz Aoqui 5be6710216
add /s/port-plan-failure redirect and link to in in plan reject log message 2021-11-02 20:43:54 -04:00
James Rasell ac9268a429
rpc: set the job scale eval priority to the job priority. 2021-11-02 12:57:53 +01:00
James Rasell afb6913428
rpc: set the deregistration eval priority to the job priority.
Previously when creating an eval for job deregistration, the eval
priority was set to the default value irregardless of the job
priority. In situations where an operator would want to deregister
a high priority job so they could re-register; the evaluation may
get blocked for some time on a busy cluster because of the
deregsiter priority.

If a job had a lower than default priority and was deregistered,
the deregister eval would get a priority higher than that of the
job. If we attempted to register another job with a higher
priority than this, but still below the default, the deregister
would be actioned before the register.

Both situations described above seem incorrect and unexpected from
a user prespective.

This fix modifies to behaviour to set the deregister eval priority
to that of the job, if available. Otherwise the default value is
still used.
2021-11-02 09:11:44 +01:00
Luiz Aoqui 655ac2719f
Allow using specific object ID on diff (#11400) 2021-11-01 15:16:31 -04:00
Michael Schurter efe5714840 core: bump rejected plans from debug -> info
As we have continued to see reports of #9506 we need to elevate this log
line as it is the only way to detect when plans are being *erroneously*
rejected.

Users who see this log line repeatedly should drain and restart the node
in the log line. This seems to workaorund the issue.

Please post any details on #9506!
2021-10-31 12:51:42 -07:00
Mahmood Ali daf20f9788
vault: set JobID in Vault metadata (#11397)
Closes: #11395 .
2021-10-27 07:20:29 -07:00
Mahmood Ali 1de395b42c
Fix preemption panic (#11346)
Fix a bug where the scheduler may panic when preemption is enabled. The conditions are a bit complicated:
A job with higher priority that schedule multiple allocations that preempt other multiple allocations on the same node, due to port/network/device assignments.

The cause of the bug is incidental mutation of internal cached data. `RankedNode` computes and cache proposed allocations  in https://github.com/hashicorp/nomad/blob/v1.1.6/scheduler/rank.go#L42-L53 . But scheduler then mutates the list to remove pre-emptable allocs in https://github.com/hashicorp/nomad/blob/v1.1.6/scheduler/rank.go#L293-L294, and  `RemoveAllocs` mutates and sets the tail of cached slice with `nil`s triggering a nil-pointer derefencing case.

I fixed the issue by avoiding the mutation in `RemoveAllocs` - the micro-optimization there doesn't seem necessary.

Fixes https://github.com/hashicorp/nomad/issues/11342
2021-10-19 20:22:03 -04:00
Michael Schurter 59fda1894e
Merge pull request #11167 from a-zagaevskiy/master
Support configurable dynamic port range
2021-10-13 16:47:38 -07:00
Michael Schurter e14cd34392 client: improve errors & tests for dynamic ports 2021-10-13 16:25:25 -07:00
Dave May 2d14c54fa0
debug: Improve namespace and region support (#11269)
* Include region and namespace in CLI output
* Add region and prefix matching for server members
* Add namespace and region API outputs to cluster metadata folder
* Add region awareness to WaitForClient helper function
* Add helper functions for SliceStringHasPrefix and StringHasPrefixInSlice
* Refactor test client agent generation
* Add tests for region
* Add changelog
2021-10-12 16:58:41 -04:00
Florian Apolloner 511cae92b4
Fixed plan diffing to handle non-unique service names. (#10965) 2021-10-12 16:42:39 -04:00
Mahmood Ali 583b9f2506
Merge pull request #11089 from hashicorp/b-cve-2021-37218
Apply authZ for nomad Raft RPC layer
2021-10-05 08:49:21 -04:00
Michael Schurter 7071425af3 client: defensively log reserved ports
- Fix test broken due to being improperly setup.
- Include min/max ports in default client config.
2021-10-04 15:43:35 -07:00
Luiz Aoqui 0a62bdc3c5
fix panic when Connect mesh gateway doesn't have a proxy block (#11257)
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2021-10-04 15:52:07 -04:00
Michael Schurter 20cf993b29
Merge pull request #11235 from hashicorp/test-evalbroker
test: fix fake by increasing time window
2021-10-04 11:16:41 -07:00
Mahmood Ali 4d90afb425 gofmt all the files
mostly to handle build directives in 1.17.
2021-10-01 10:14:28 -04:00
Michael Schurter c6e72b6818 client: output reserved ports with min/max ports
Also add a little more min/max port testing and add the consts back that
had been removed: but unexported and as defaults.
2021-09-30 17:05:46 -07:00
Michael Schurter c1bd10456c test: fix flaky TestAutopilot_CleanupDeadServer
The fix seems to be related to the pointer comparison and swapping we
did around killing a non-leader. I actually can't quite explain it, but
when comparing against Consul's version of this test I noticed they used
the slice index to track the killed server instead of pointer swapping.

As soon as I switched to slice index tracking I could no longer
reproduce the failure.

In addition:
- Tested membership counts on all servers instead of just 1 for added
  correctness.
- Stopped testing raft v1 because it is unsupported.
2021-09-28 16:38:56 -07:00
Michael Schurter d7e123d7cd test: fix fake by increasing time window
Test originally only had a 10ms time window tolerance. Increased to
100ms and also improved assertions and docstrings.
2021-09-28 12:22:59 -07:00
Luiz Aoqui 1035805a42
connect: update allowed protocols in ingress gateway config (#11187) 2021-09-16 10:47:53 -04:00
James Rasell 0e926ef3fd
allow configuration of Docker hostnames in bridge mode (#11173)
Add a new hostname string parameter to the network block which
allows operators to specify the hostname of the network namespace.
Changing this causes a destructive update to the allocation and it
is omitted if empty from API responses. This parameter also supports
interpolation.

In order to have a hostname passed as a configuration param when
creating an allocation network, the CreateNetwork func of the
DriverNetworkManager interface needs to be updated. In order to
minimize the disruption of future changes, rather than add another
string func arg, the function now accepts a request struct along with
the allocID param. The struct has the hostname as a field.

The in-tree implementations of DriverNetworkManager.CreateNetwork
have been modified to account for the function signature change.
In updating for the change, the enhancement of adding hostnames to
network namespaces has also been added to the Docker driver, whilst
the default Linux manager does not current implement it.
2021-09-16 08:13:09 +02:00
Aleksandr Zagaevskiy ebb87e65fe Support configurable dynamic port range 2021-09-10 11:52:47 +03:00
Isabel Suchanek ab51050ce8
events: fix wildcard namespace handling (#10935)
This fixes a bug in the event stream API where it currently interprets
namespace=* as an actual namespace, not a wildcard. When Nomad parses
incoming requests, it sets namespace to default if not specified, which
means the request namespace will never be an empty string, which is what
the event subscription was checking for. This changes the conditional
logic to check for a wildcard namespace instead of an empty one.

It also updates some event tests to include the default namespace in the
subscription to match current behavior.

Fixes #10903
2021-09-02 09:36:55 -07:00
Andy Assareh ed580726d6
corrected peersInfoContent - was copied from Consul and not updated for Nomad (#11109)
updated with Nomad ports and web link (learn guide: https://learn.hashicorp.com/tutorials/nomad/outage-recovery)
2021-09-01 08:30:49 +02:00
James Rasell b6813f1221
chore: fix incorrect docstring formatting. 2021-08-30 11:08:12 +02:00
Mahmood Ali ce43a7a852 update tests to make an actual RaftRPC 2021-08-27 10:37:30 -04:00
Mahmood Ali ff7c1ca79b Apply authZ for nomad Raft RPC layer
When mTLS is enabled, only nomad servers of the region should access the
Raft RPC layer. Clients and servers in other regions should only use the
Nomad RPC endpoints.

Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
Co-authored-by: Seth Hoenig <shoenig@hashicorp.com>
2021-08-26 15:10:07 -04:00
Kush 1d6da9b55e
docs: fix typo in structs/event.go 2021-08-21 17:02:07 +05:30
Mahmood Ali b4ed8acbff tests: attempt deflaking TestAutopilot_CleanupDeadServer
Attempt to deflake the test by avoiding shutting down the leaders, as leadership
recovery takes more time, and consequently longer to process raft configuration
changes and potentially failing the test.
2021-08-18 15:37:25 -04:00
Mahmood Ali bcac5268df tests: deflake TestLeader_LeftLeader
Wait for leadership to be established before killing leader.
2021-08-18 14:19:00 -04:00
Mahmood Ali 84a3522133
Consider all system jobs for a new node (#11054)
When a node becomes ready, create an eval for all system jobs across
namespaces.

The previous code uses `job.ID` to deduplicate evals, but that ignores
the job namespace. Thus if there are multiple jobs in different
namespaces sharing the same ID/Name, only one will be considered for
running in the new node. Thus, Nomad may skip running some system jobs
in that node.
2021-08-18 09:50:37 -04:00
Mahmood Ali c37339a8c8
Merge pull request #9160 from hashicorp/f-sysbatch
core: implement system batch scheduler
2021-08-16 09:30:24 -04:00
Michael Schurter a7aae6fa0c
Merge pull request #10848 from ggriffiths/listsnapshot_secrets
CSI Listsnapshot secrets support
2021-08-10 15:59:33 -07:00
Mahmood Ali ea003188fa
system: re-evaluate node on feasibility changes (#11007)
Fix a bug where system jobs may fail to be placed on a node that
initially was not eligible for system job placement.

This changes causes the reschedule to re-evaluate the node if any
attribute used in feasibility checks changes.

Fixes https://github.com/hashicorp/nomad/issues/8448
2021-08-10 17:17:44 -04:00
Mahmood Ali bfc766357e
deployments: canary=0 is implicitly autopromote (#11013)
In a multi-task-group job, treat 0 canary groups as auto-promote.

This change fixes an edge case where Nomad requires a manual promotion,
if the job had any group with canary=0 and rest of groups having
auto_promote set.

Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2021-08-10 17:06:40 -04:00
Seth Hoenig 3371214431 core: implement system batch scheduler
This PR implements a new "System Batch" scheduler type. Jobs can
make use of this new scheduler by setting their type to 'sysbatch'.

Like the name implies, sysbatch can be thought of as a hybrid between
system and batch jobs - it is for running short lived jobs intended to
run on every compatible node in the cluster.

As with batch jobs, sysbatch jobs can also be periodic and/or parameterized
dispatch jobs. A sysbatch job is considered complete when it has been run
on all compatible nodes until reaching a terminal state (success or failed
on retries).

Feasibility and preemption are governed the same as with system jobs. In
this PR, the update stanza is not yet supported. The update stanza is sill
limited in functionality for the underlying system scheduler, and is
not useful yet for sysbatch jobs. Further work in #4740 will improve
support for the update stanza and deployments.

Closes #2527
2021-08-03 10:30:47 -04:00
Grant Griffiths fecbbaee22 CSI ListSnapshots secrets implementation
Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>
2021-07-28 11:30:29 -07:00
Michael Schurter ea996c321d
Merge pull request #10916 from hashicorp/f-audit-log-mode
Add audit log file mode config parameter
2021-07-27 12:16:37 -07:00
Mahmood Ali ac3cf10849
nomad: only activate one-time auth tokens with 1.1.0 (#10952)
Fix a panic in handling one-time auth tokens, used to support `nomad ui
--authenticate`.

If the nomad leader is a 1.1.x with some servers running as 1.0.x, the
pre-1.1.0 servers risk crashing and the cluster may lose quorum. That
can happen when `nomad authenticate -ui` command is issued, or when the
leader scans for expired tokens every 10 minutes.

Fixed #10943 .
2021-07-27 13:17:55 -04:00
Seth Hoenig 54d9bad657
Merge pull request #10904 from hashicorp/b-no-affinity-intern
core: remove internalization of affinity strings
2021-07-22 09:09:07 -05:00
Michael Schurter c06ea132d3 audit: add file mode configuration parameter
Rest of implementation is in nomad-enterprise
2021-07-20 10:54:53 -07:00
Alan Guo Xiang Tan e2d1372ac9
Fix typo. 2021-07-16 13:49:15 +08:00
Seth Hoenig ac5c83cafd core: remove internalization of affinity strings
Basically the same as #10896 but with the Affinity struct.
Since we use reflect.DeepEquals for job comparison, there is
risk of false positives for changes due to a job struct with
memoized vs non-memoized strings.

Closes #10897
2021-07-15 15:15:39 -05:00
Seth Hoenig bea8066187 core: add spec changed test with constriants 2021-07-14 10:44:09 -05:00
Seth Hoenig 52cf03df4a core: fix constraint tests 2021-07-14 10:39:38 -05:00
Seth Hoenig 1aec25f1df core: do not memoize constraint strings
This PR causes Nomad to no longer memoize the String value of
a Constraint. The private memoized variable may or may not be
initialized at any given time, which means a reflect.DeepEqual
comparison between two jobs (e.g. during Plan) may return incorrect
results.

Fixes #10836
2021-07-14 10:04:35 -05:00
Mahmood Ali 1f34f2197b
Merge pull request #10806 from hashicorp/munda/idempotent-job-dispatch
Enforce idempotency of dispatched jobs using token on dispatch request
2021-07-08 10:23:31 -04:00
Tim Gross 9f128a28ae
service: remove duplicate name check during validation (#10868)
When a task group with `service` block(s) is validated, we validate that there
are no duplicates, but this validation doesn't have access to the task environment
because it hasn't been created yet. Services and checks with interpolation can
be flagged incorrectly as conflicting. Name conflicts in services are not
actually an error in Consul and users have reported wanting to use the same
service name for task groups differentiated by tags.
2021-07-08 09:43:38 -04:00
Alex Munda 9e5061ef87
Update idempotency comment to reflect all jobs
Co-authored-by: Mahmood Ali <mahmood@hashicorp.com>
2021-07-07 15:54:56 -05:00
Alex Munda 557a227de1
Match idempotency key on all child jobs and return existing job when idempotency keys match. 2021-07-02 14:08:46 -05:00
Alex Munda 34c63b086b
Move idempotency check closer to validate. Log error. 2021-07-02 10:58:42 -05:00
Grant Griffiths 7f8e285559
CSI: Snapshot volume create should use vol.Secrets (#10840)
Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>
2021-07-02 08:28:22 -04:00
Alex Munda baba8fe7df
Update tests after moving idempotency token to WriteOptions 2021-07-01 08:48:57 -05:00
Alex Munda 848918018c
Move idempotency token to write options. Remove DispatchIdempotent 2021-06-30 15:10:48 -05:00
Alex Munda baae6d5546
Update comment about idempotency check 2021-06-30 12:30:44 -05:00
Alex Munda 01bcd9c41c
Make idempotency error user friendly
Co-authored-by: Tim Gross <tgross@hashicorp.com>
2021-06-30 12:26:33 -05:00
Alex Munda ca86c7ba0c
Add idempotency token to dispatch request instead of special meta key 2021-06-29 15:59:23 -05:00
Alex Munda 122136b657
Always allow idempotency key meta. Tests for idempotent dispatch 2021-06-29 10:30:04 -05:00
Alex Munda 561cd9fc7f
Enforce idempotency of dispatched jobs using special meta key 2021-06-23 17:10:31 -05:00
Seth Hoenig ebaaaae88e consul/connect: Validate uniqueness of Connect upstreams within task group
This PR adds validation during job submission that Connect proxy upstreams
within a task group are using different listener addresses. Otherwise, a
duplicate envoy listener will be created and not be able to bind.

Closes #7833
2021-06-18 16:50:53 -05:00
Mahmood Ali 33dfe98770
deployment watcher: Reuse allocsCh if allocIndex remains the same (#10756)
Fix deployment watchers to avoid creating unnecessary deployment watcher goroutines and blocking queries. `deploymentWatcher.getAllocsCh` creates a new goroutine that makes a blocking query to fetch updates of deployment allocs.

## Background

When operators submit a new or updated service job, Nomad create a new deployment by default. The deployment object controls how fast to place the allocations through [`max_parallel`](https://www.nomadproject.io/docs/job-specification/update#max_parallel) and health checks configurations.

The `scheduler` and `deploymentwatcher` package collaborate to achieve deployment logic: The scheduler only places the canaries and `max_parallel` allocations for a new deployment; the `deploymentwatcher` monitors for alloc progress and then enqueues a new evaluation whenever the scheduler should reprocess a job and places the next `max_parallel` round of allocations.

The `deploymentwatcher` package makes blocking queries against the state store, to fetch all deployments and the relevant allocs for each running deployments. If `deploymentwatcher` fails or is hindered from fetching the state, the deployments fail to make progress.

`Deploymentwatcher` logic only runs on the leader.

## Why unnecessary deployment watchers can halt cluster progress
Previously, `getAllocsCh` is called on every for loop iteration in `deploymentWatcher.watch()` function. However, the for-loop may iterate many times before the allocs get updated. In fact, whenever a new deployment is created/updated/deleted, *all* `deploymentWatcher`s get notified through `w.deploymentUpdateCh`. The `getAllocsCh` goroutines and blocking queries spike significantly and grow quadratically with respect to the number of running deployments. The growth leads to two adverse outcomes:

1. it spikes the CPU/Memory usage resulting potentially leading to OOM or very slow processing
2. it activates the [query rate limiter](abaa9c5c5b/nomad/deploymentwatcher/deployment_watcher.go (L896-L898)), so later the watcher fails to get updates and consequently fails to make progress towards placing new allocations for the deployment!

So the cluster fails to catch up and fails to make progress in almost all deployments. The cluster recovers after a leader transition: the deposed leader stops all watchers and free up goroutines and blocking queries; the new leader recreates the watchers without the quadratic growth and remaining under the rate limiter.  Well, until a spike of deployments are created triggering the condition again.

### Relevant Code References

Path for deployment monitoring:
* [`Watcher.watchDeployments`](abaa9c5c5b/nomad/deploymentwatcher/deployments_watcher.go (L164-L192)) loops waiting for deployment updates.
* On every deployment update, [`w.getDeploys`](abaa9c5c5b/nomad/deploymentwatcher/deployments_watcher.go (L194-L229)) returns all deployments in the system
* `watchDeployments` calls `w.add(d)` on every active deployment
* which in turns, [updates existing watcher if one is found](abaa9c5c5b/nomad/deploymentwatcher/deployments_watcher.go (L251-L255)).
* The deployment watcher [updates local local deployment field and trigger `deploymentUpdateCh` channel]( abaa9c5c5b/nomad/deploymentwatcher/deployment_watcher.go (L136-L147))
* The [deployment watcher `deploymentUpdateCh` selector is activated](abaa9c5c5b/nomad/deploymentwatcher/deployment_watcher.go (L455-L489)). Most of the time the selector clause is a no-op, because the flow was triggered due to another deployment update
* The `watch` for-loop iterates again and in the previous code we create yet another goroutine and blocking call that risks being rate limited.

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2021-06-14 16:01:01 -04:00
Seth Hoenig 6eeaefa59f
Merge pull request #10754 from hashicorp/b-client-connect-constraint
consul/connect: remove unnecessary connect constraint on clients
2021-06-14 09:41:25 -05:00
Seth Hoenig 7b8e15159b consul/connect: remove unnecessary connect constraint on clients
PR https://github.com/hashicorp/nomad/pull/10702 added 2 new constraints
for connect jobs - one for Consul gRPC listener, and one for Connect being
enabled on Clients. Connect does not need to be enabled on clients, only
on Consul servers. Remove the extra constraint.

Discuss:
https://discuss.hashicorp.com/t/nomad-1-1-1-and-consul-connect-enabled-on-consul-clients/25295
2021-06-14 08:01:45 -05:00
James Rasell 0cccf7c2b8
volumewatcher: fix test data race. 2021-06-14 12:11:35 +02:00
James Rasell a99fcfb4c8
Merge pull request #10745 from hashicorp/b-fix-test-datarace-deploymentwatcher
deploymentwatcher: fix test data race.
2021-06-11 17:23:03 +02:00
James Rasell 939b23936a
Merge pull request #10744 from hashicorp/b-remove-duplicate-imports
chore: remove duplicate import statements
2021-06-11 16:42:34 +02:00
Mahmood Ali 74efd3626e
Merge pull request #10742 from hashicorp/deflake-tests-20210608
Deflaking Test 2021 June edition
2021-06-11 09:14:40 -04:00
James Rasell c168108bb7
Merge pull request #10739 from hashicorp/f-remove-unused-types-pkg
core: remove unused types pkg and PeriodicCallback type.
2021-06-11 13:27:22 +02:00
James Rasell ff75b4da09
deploymentwatcher: fix test data race. 2021-06-11 11:55:21 +02:00
James Rasell 492e308846
tests: remove duplicate import statements. 2021-06-11 09:39:22 +02:00
Mahmood Ali 9b35bf1858 deflake TestNomad_BootstrapExpect and other leader tests
The test fails reliably locally on my machine. The test uses non-dev mode
where Raft actions get committed to disk, causing operations to exceed
the 50ms tight Raft deadlines.

So, here we ensure that non-dev servers use default Raft config
files with longer timeouts.

Also, noticed that the test queries a server, that may a follower with a
stale state.

I've updated the test to ensure we query the leader for its state. The
Barrier call ensures that the leader is a "stable" leader with committed
entries. Protects against a window where a new leader reports the
previous term before it commits a raft log entry.
2021-06-10 22:04:10 -04:00
Mahmood Ali ff73cc279e tests: deflake TestAgentProfile_RemoteClient
TestAgentProfile_RemoteClient test must wait for the client node to be
registered in raft state store, and not merely that the server has a
network connection from the client.

In https://app.circleci.com/pipelines/github/hashicorp/nomad/15539/workflows/8dcbc3f3-946b-4da0-b089-9093788bc0c9/jobs/147919, notice how `node registration complete` log line occured after the test already have failed.

This is another case of flakiness due to not waiting for client
registration.
2021-06-10 22:00:15 -04:00
Mahmood Ali 8009d9837c tests: deflake TestMonitor_Monitor_RemoteServer and cross-region tests
Ensure that all servers are joined to each other before test proceed,
instead of just joining them to the first server and relying on
background serf propagation.

Relying on backgorund serf propagation is a cause of flakiness,
specially for tests with multiple regions. The server receiving the RPC
may not be aware of the region and fail to forward RPC accordingly.

For example, consider `TestMonitor_Monitor_RemoteServer` failure in https://app.circleci.com/pipelines/github/hashicorp/nomad/16402/workflows/7f327235-7d0c-40ba-9757-600522afca51/jobs/158045  you can observe:
* `nomad-117` is joined to `nomad-118` and `nomad-119`
* `nomad-119` is the foreign region
* `nomad-117` gains leadership in the default region, `nomad-118` is the non-leader
*  search logs for `nomad: adding server` and notice that `nomad-118`
   only added `nomad-118` and `nomad-118`, but not `nomad-119`!
* so the query to the non-leader in the test fails to be forwarded to
  the appopriate region.
2021-06-10 21:27:55 -04:00
James Rasell 25883eca43
core: remove unused types pkg and PeriodicCallback type. 2021-06-10 15:57:13 +02:00
Nomad Release Bot 4fe52bc753 remove generated files 2021-06-10 08:04:25 -04:00
Nomad Release bot 7cc7389afd Generate files for 1.1.1 release 2021-06-10 08:04:25 -04:00
Mahmood Ali aa77c2731b tests: use standard library testing.TB
Glint pulled in an updated version of mitchellh/go-testing-interface
which broke some existing tests because the update added a Parallel()
method to testing.T. This switches to the standard library testing.TB
which doesn't have a Parallel() method.
2021-06-09 16:18:45 -07:00
Seth Hoenig dbdc479970 consul: move consul acl tests into ent files
(cherry-pick ent back to oss)

This PR moves a lot of Consul ACL token validation tests into ent files,
so that we can verify correct behavior difference between OSS and ENT
Nomad versions.
2021-06-09 08:38:42 -05:00
Seth Hoenig 87be8c4c4b consul: correctly check consul acl token namespace when using consul oss
This PR fixes the Nomad Object Namespace <-> Consul ACL Token relationship
check when using Consul OSS (or Consul ENT without namespace support).

Nomad v1.1.0 introduced a regression where Nomad would fail the validation
when submitting Connect jobs and allow_unauthenticated set to true, with
Consul OSS - because it would do the namespace check against the Consul ACL
token assuming the "default" namespace, which does not work because Consul OSS
does not have namespaces.

Instead of making the bad assumption, expand the namespace check to handle
each special case explicitly.

Fixes #10718
2021-06-08 13:55:57 -05:00
Jasmine Dahilig ca4be6857e
deployment query rate limit (#10706) 2021-06-04 12:38:46 -07:00
Seth Hoenig d026ff1f66 consul/connect: add support for connect mesh gateways
This PR implements first-class support for Nomad running Consul
Connect Mesh Gateways. Mesh gateways enable services in the Connect
mesh to make cross-DC connections via gateways, where each datacenter
may not have full node interconnectivity.

Consul docs with more information:
https://www.consul.io/docs/connect/gateways/mesh-gateway

The following group level service block can be used to establish
a Connect mesh gateway.

service {
  connect {
    gateway {
      mesh {
        // no configuration
      }
    }
  }
}

Services can make use of a mesh gateway by configuring so in their
upstream blocks, e.g.

service {
  connect {
    sidecar_service {
      proxy {
        upstreams {
          destination_name = "<service>"
          local_bind_port  = <port>
          datacenter       = "<datacenter>"
          mesh_gateway {
            mode = "<mode>"
          }
        }
      }
    }
  }
}

Typical use of a mesh gateway is to create a bridge between datacenters.
A mesh gateway should then be configured with a service port that is
mapped from a host_network configured on a WAN interface in Nomad agent
config, e.g.

client {
  host_network "public" {
    interface = "eth1"
  }
}

Create a port mapping in the group.network block for use by the mesh
gateway service from the public host_network, e.g.

network {
  mode = "bridge"
  port "mesh_wan" {
    host_network = "public"
  }
}

Use this port label for the service.port of the mesh gateway, e.g.

service {
  name = "mesh-gateway"
  port = "mesh_wan"
  connect {
    gateway {
      mesh {}
    }
  }
}

Currently Envoy is the only supported gateway implementation in Consul.
By default Nomad client will run the latest official Envoy docker image
supported by the local Consul agent. The Envoy task can be customized
by setting `meta.connect.gateway_image` in agent config or by setting
the `connect.sidecar_task` block.

Gateways require Consul 1.8.0+, enforced by the Nomad scheduler.

Closes #9446
2021-06-04 08:24:49 -05:00
Seth Hoenig 4c087efd59
Merge pull request #10702 from hashicorp/f-cc-constraints
consul/connect: use additional constraints in scheduling connect tasks
2021-06-04 08:11:21 -05:00
Tim Gross 8b2ecde5b4 csi: accept list of caps during validation in volume register
When `nomad volume create` was introduced in Nomad 1.1.0, we changed the
volume spec to take a list of capabilities rather than a single capability, to
meet the requirements of the CSI spec. When a volume is registered via `nomad
volume register`, we should be using the same fields to validate the volume
with the controller plugin.
2021-06-04 07:57:26 -04:00
Seth Hoenig d359eb6f3a consul/connect: use additional constraints in scheduling connect tasks
This PR adds two additional constraints on Connect sidecar and gateway tasks,
making sure Nomad schedules them only onto nodes where Connect is actually
enabled on the Consul agent.

Consul requires `connect.enabled = true` and `ports.grpc = <number>` to be
explicitly set on agent configuration before Connect APIs will work. Until
now, Nomad would only validate a minimum version of Consul, which would cause
confusion for users who try to run Connect tasks on nodes where Consul is not
yet sufficiently configured. These contstraints prevent job scheduling on nodes
where Connect is not actually use-able.

Closes #10700
2021-06-03 15:43:34 -05:00
Tim Gross c01d661c98 csi: validate volume block has attachment_mode and access_mode
The `attachment_mode` and `access_mode` fields are required for CSI
volumes. The `mount_options` block is only allowed for CSI volumes.
2021-06-03 16:07:19 -04:00
Tim Gross e9777a88ce plan applier: add trace-level log of plan
The plans generated by the scheduler produce high-level output of counts on each
evaluation, but when debugging scheduler issues it'd be nice to have a more
detailed view of the resulting plan. Emitting this log at trace minimizes the
overhead, and producing it in the plan applyer makes it easier to find as it
will always be on the leader.
2021-06-02 10:25:23 -04:00
Tim Gross 7a55a6af16 leader: call eval log formatting lazily
Arguments to our logger's various write methods are evaluated eagerly, so
method calls in log parameters will always be called, regardless of log
level. Move some logger messages to the logger's `Fmt` method so that
`GoString` is evaluated lazily instead.
2021-06-02 09:59:55 -04:00
Ryan Sundberg d43c5f98a5 CSI: Include MountOptions in capabilities sent to CSI for all RPCs
Include the VolumeCapability.MountVolume data in
ControllerPublishVolume, CreateVolume, and ValidateVolumeCapabilities
RPCs sent to the CSI controller. The previous behavior was to only
include the MountVolume capability in the NodeStageVolume request, which
on some CSI implementations would be rejected since the Volume was not
originally provisioned with the specific mount capabilities requested.
2021-05-24 10:59:54 -04:00
James Rasell db4e2541bd
events: fix event endpoint tests to ignore heartbeats.
seems when this PR was raised, the Nomad CI provider was having
availability issues meaning the test suite was not correctly run,
thus allowing broken tests into main. The PR itself exercised test
code which had not been hit before.

The particular problem is when identifying whether the event
received is a heartbeat; this was performed using standard Golang
conditionals. Unfortunately the operator == is not defined on byte
arrays, resulting in the check always returning false. To overcome
this issue the code now uses the bytes.Equal function to correctly
compare the data.
2021-05-24 10:28:19 +02:00
Szabolcs Gelencsér fc97bd6acf
events: fix slow client connection to empty event stream (#10637)
* events: fix slow client connection to empty event stream

* doc: fix changelog of event stream connection init
2021-05-21 13:17:07 -04:00
Chris Baker 263ddd567c
Node Drain Metadata (#10250) 2021-05-07 13:58:40 -04:00
Mahmood Ali 102763c979
Support disabling TCP checks for connect sidecar services 2021-05-07 12:10:26 -04:00
Seth Hoenig b024d85f48 connect: use deterministic injected dynamic exposed port
This PR uses the checksum of the check for which a dynamic exposed
port is being generated (instead of a UUID prefix) so that the
generated port label is deterministic.

This fixes 2 bugs:
 - 'job plan' output is now idempotent for jobs making use of injected ports
 - tasks will no longer be destructively updated when jobs making use of
   injected ports are re-run without changing any user specified part of
   job config.

Closes: https://github.com/hashicorp/nomad/issues/10099
2021-04-30 15:18:22 -06:00
Michael Schurter 547a718ef6
Merge pull request #10248 from hashicorp/f-remotetask-2021
core: propagate remote task handles
2021-04-30 08:57:26 -07:00
Mahmood Ali 52d881f567
Allow configuring memory oversubscription (#10466)
Cluster operators want to have better control over memory
oversubscription and may want to enable/disable it based on their
experience.

This PR adds a scheduler configuration field to control memory
oversubscription. It's additional field that can be set in the [API via Scheduler Config](https://www.nomadproject.io/api-docs/operator/scheduler), or [the agent server config](https://www.nomadproject.io/docs/configuration/server#configuring-scheduler-config).

I opted to have the memory oversubscription be an opt-in, but happy to change it.  To enable it, operators should call the API with:
```json
{
  "MemoryOversubscriptionEnabled": true
}
```

If memory oversubscription is disabled, submitting jobs specifying `memory_max` will get a "Memory oversubscription is not
enabled" warnings, but the jobs will be accepted without them accessing
the additional memory.

The warning message is like:
```
$ nomad job run /tmp/j
Job Warnings:
1 warning(s):

* Memory oversubscription is not enabled; Task cache.redis memory_max value will be ignored

==> Monitoring evaluation "7c444157"
    Evaluation triggered by job "example"
==> Monitoring evaluation "7c444157"
    Evaluation within deployment: "9d826f13"
    Allocation "aa5c3cad" created: node "9272088e", group "cache"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "7c444157" finished with status "complete"

# then you can examine the Alloc AllocatedResources to validate whether the task is allowed to exceed memory:
$ nomad alloc status -json aa5c3cad | jq '.AllocatedResources.Tasks["redis"].Memory'
{
  "MemoryMB": 256,
  "MemoryMaxMB": 0
}
```
2021-04-29 22:09:56 -04:00
Luiz Aoqui f1b9055d21
Add metrics for blocked eval resources (#10454)
* add metrics for blocked eval resources

* docs: add new blocked_evals metrics

* fix to call `pruneStats` instead of `stats.prune` directly
2021-04-29 15:03:45 -04:00
Seth Hoenig d54a606819
Merge pull request #10439 from hashicorp/pick-ent-acls-changes
e2e: add e2e tests for consul namespaces on ent with acls
2021-04-28 08:30:08 -06:00
Tim Gross 79f81d617e licensing: remove raft storage and sync
This changeset is the OSS portion of the work to remove the raft storage and
sync for Nomad Enterprise.
2021-04-28 10:28:23 -04:00
Michael Schurter e62795798d core: propagate remote task handles
Add a new driver capability: RemoteTasks.

When a task is run by a driver with RemoteTasks set, its TaskHandle will
be propagated to the server in its allocation's TaskState. If the task
is replaced due to a down node or draining, its TaskHandle will be
propagated to its replacement allocation.

This allows tasks to be scheduled in remote systems whose lifecycles are
disconnected from the Nomad node's lifecycle.

See https://github.com/hashicorp/nomad-driver-ecs for an example ECS
remote task driver.
2021-04-27 15:07:03 -07:00
Seth Hoenig 09cd01a5f3 e2e: add e2e tests for consul namespaces on ent with acls
This PR adds e2e tests for Consul Namespaces for Nomad Enterprise
with Consul ACLs enabled.

Needed to add support for Consul ACL tokens with `namespace` and
`namespace_prefix` blocks, which Nomad parses and validates before
tossing the token. These bits will need to be picked back to OSS.
2021-04-27 14:45:54 -06:00
Seth Hoenig d76bcf0e12
Merge pull request #10457 from hashicorp/b-igce-wildcard
consul/connect: fix bug where ingress gateways could not use wildcard services
2021-04-27 14:41:47 -06:00
Seth Hoenig 865c7a5841 consul/connect: fix bug where ingress gateways could not use wildcard services
This PR fixes a bug where Nomad was more restrictive on Ingress Gateway Configuration
Entry definitions than Consul. Before, Nomad would not allow for declaring IGCEs with
http listeners with service name "*", which is a special feature allowable by Consul.

Note: to make http protocol work, a service-default must be defined setting the
protocol to http for each service.

Fixes: #9729
2021-04-27 13:42:26 -06:00
Seth Hoenig f47c6d34f7 consul/connect: check connect group and service names for uppercase characters
This PR adds job-submission validation that checks for the use of uppercase characters
in group and service names for services that make use of Consul Connect. This prevents
attempting to launch services that Consul will not validate correctly, which in turn
causes tasks to fail to launch in Nomad.

Underlying Consul issue: https://github.com/hashicorp/consul/issues/6765

Closes #7581 #10450
2021-04-27 11:26:37 -06:00
Mahmood Ali cf24a9eaaf
api: /v1/jobs always include namespaces (#10434)
Add Namespace as a top-level field in `/v1/jobs` stub.

The `/v1/jobs` endpoint already includes the namespace under `JobSummary`, though the API is odd, as typically the job ID and Namespace are in the same level, and the oddity complicates the UI frontend development.

The downside of adding it is redundant field, that makes the response body a bit bigger, specially for clusters with large jobs. Though, it should compress nicely and I expect the overhead to be small to overall response size. The benefit of a cleaner and more consistent API seem worth it.

Fixes #10431
2021-04-23 16:36:54 -04:00
Mahmood Ali d2fcce21f8
Migrate all allocs when draining a node (#10411)
This fixes a bug affecting drain nodes, where allocs may fail to be
migrated if they belong to different namespaces but share the same job
name.

The reason is that the helper function that creates the migration evals
indexed the allocs by job ID without accounting for the namespaces.
When job ids clash, only an eval is created for one and the rest of the
allocs remain intact.

Fixes #10172
2021-04-21 12:11:14 -04:00
Seth Hoenig f71dd3857e api: include ent fuzzy struct types in oss
Small change to pull in ent struct types in a switch
statement used by ent. They are benign in oss, this
is just to make sure OSS->ENT merges don't create a
diff.
2021-04-20 11:19:38 -06:00
Seth Hoenig 4e6dbaaec1
Merge pull request #10184 from hashicorp/f-fuzzy-search
api: implement fuzzy search API
2021-04-20 09:06:40 -06:00
Seth Hoenig 509490e5d2 e2e: consul namespace tests from nomad ent
(cherry-picked from ent without _ent things)

This is part 2/4 of e2e tests for Consul Namespaces. Took a
first pass at what the parameterized tests can look like, but
only on the ENT side for this PR. Will continue to refactor
in the next PRs.

Also fixes 2 bugs:
 - Config Entries registered by Nomad Server on job registration
   were not getting Namespace set
 - Group level script checks were not getting Namespace set

Those changes will need to be copied back to Nomad OSS.

Nomad OSS + no ACLs (previously, needs refactor)
Nomad ENT + no ACLs (this)
Nomad OSS + ACLs (todo)
Nomad ENT + ALCs (todo)
2021-04-19 15:35:31 -06:00
Seth Hoenig c34ef9eb78 api: fuzzy search results include job name with id in scope 2021-04-16 17:03:36 -06:00
Seth Hoenig 0b2114a7a5 api: make fuzzy searching case-agnostic 2021-04-16 16:56:10 -06:00
Seth Hoenig 1ee8d5ffc5 api: implement fuzzy search API
This PR introduces the /v1/search/fuzzy API endpoint, used for fuzzy
searching objects in Nomad. The fuzzy search endpoint routes requests
to the Nomad Server leader, which implements the Search.FuzzySearch RPC
method.

Requests to the fuzzy search API are based on the api.FuzzySearchRequest
object, e.g.

{
  "Text": "ed",
  "Context": "all"
}

Responses from the fuzzy search API are based on the api.FuzzySearchResponse
object, e.g.

{
  "Index": 27,
  "KnownLeader": true,
  "LastContact": 0,
  "Matches": {
    "tasks": [
      {
        "ID": "redis",
        "Scope": [
          "default",
          "example",
          "cache"
        ]
      }
    ],
    "evals": [],
    "deployment": [],
    "volumes": [],
    "scaling_policy": [],
    "images": [
      {
        "ID": "redis:3.2",
        "Scope": [
          "default",
          "example",
          "cache",
          "redis"
        ]
      }
    ]
  },
  "Truncations": {
    "volumes": false,
    "scaling_policy": false,
    "evals": false,
    "deployment": false
  }
}

The API is tunable using the new server.search stanza, e.g.

server {
  search {
    fuzzy_enabled   = true
    limit_query     = 200
    limit_results   = 1000
    min_term_length = 5
  }
}

These values can be increased or decreased, so as to provide more
search results or to reduce load on the Nomad Server. The fuzzy search
API can be disabled entirely by setting `fuzzy_enabled` to `false`.
2021-04-16 16:36:07 -06:00
Nick Spain 085b54bd0b Update TaskGroup services edited diff test to actually check Body 2021-04-13 09:15:35 -04:00
Nick Spain bfd4980e3f Hash Body field as part of ServiceCheck 2021-04-13 09:15:35 -04:00
Nick Spain 653d84ef68 Add a 'body' field to the check stanza
Consul allows specifying the HTTP body to send in a health check. Nomad
uses Consul for health checking so this just plumbs the value through to
where the Consul API is called.

There is no validation that `body` is not used with an incompatible
check method like GET.
2021-04-13 09:15:35 -04:00
Charlie Voiselle 8afb9eb05d
Fix parameterized <-> non-parameterized job error (#10357)
The error messages are reversed from tests performed above them.  The test uses the `validateJobUpdate()` function, but ignores the text of the error message itself.
2021-04-12 09:27:04 -04:00
Tim Gross 3113cced7b CSI: ensure page slices are within bounds
Plugins could potentially ignore the `max_entries` field and return a list of
entries that is greater, so we slice the return value in the server RPC to
enforce these value. But page sizes less than the number of entries for the
external CSI ListVolumes and ListSnapshots RPCs could cause a panic, so fix
the boundary checking.
2021-04-09 14:12:38 -04:00
Lars Lehtonen 61d3c3b480 nomad/structs: fix diff 2021-04-09 08:21:46 -04:00
Tim Gross 0892d34ff9 CSI: capability block is required for volume registration 2021-04-08 13:02:24 -04:00
Tim Gross d2e479505c CSI: capability check ListVolumes at RPC for nicer error messages
The plugin stub object does not include fine-grained capability checks, which
means `nomad volume status -verbose` will return ugly and verbose error
"Unimplemented" messages from the plugin if it does not support the CSI
`ListVolumes` RPC. Return a nicer error message from our RPC handler instead.
2021-04-07 12:00:22 -04:00
Tim Gross 276633673d CSI: use AccessMode/AttachmentMode from CSIVolumeClaim
Registration of Nomad volumes previously allowed for a single volume
capability (access mode + attachment mode pair). The recent `volume create`
command requires that we pass a list of requested capabilities, but the
existing workflow for claiming volumes and attaching them on the client
assumed that the volume's single capability was correct and unchanging.

Add `AccessMode` and `AttachmentMode` to `CSIVolumeClaim`, use these fields to
set the initial claim value, and add backwards compatibility logic to handle
the existing volumes that already have claims without these fields.
2021-04-07 11:24:09 -04:00
Tim Gross dbcc2694b0 refactor: move VolumeRequest validation to Validate method 2021-04-07 11:24:09 -04:00
Tim Gross 72c07f15fb refactor: internal claim methods should be private 2021-04-07 11:24:09 -04:00
Seth Hoenig fe8fce00d9 consul: minor CR cleanup 2021-04-05 10:10:16 -06:00
Seth Hoenig f17ba33f61 consul: plubming for specifying consul namespace in job/group
This PR adds the common OSS changes for adding support for Consul Namespaces,
which is going to be a Nomad Enterprise feature. There is no new functionality
provided by this changeset and hopefully no new bugs.
2021-04-05 10:03:19 -06:00
Chris Baker 21bc48ca29 json handles were moved to a new package in #10202
this was unecessary after refactoring, so this moves them back to their
original location in package structs
2021-04-02 13:31:10 +00:00
Chris Baker 436d46bd19
Merge branch 'main' into f-node-drain-api 2021-04-01 15:22:57 -05:00
Tim Gross 0856483115 CSI: fingerprint detailed node capabilities
In order to support new node RPCs, we need to fingerprint plugin capabilities
in more detail. This changeset mirrors recent work to fingerprint controller
capabilities, but is not yet in use by any Nomad RPC.
2021-04-01 16:00:58 -04:00
Tim Gross 466b620fa4
CSI: volume snapshot 2021-04-01 11:16:52 -04:00
Tim Gross 71b9daffb9 CSI: fix misleading HTTP test
The HTTP test to create CSI volumes depends on having a controller plugin to
talk to, but the test was using a node-only plugin, which allows it to
silently ignore the missing controller.
2021-03-31 16:37:09 -04:00
Tim Gross 9fc4cf1419 CSI: fingerprint detailed controller capabilities
In order to support new controller RPCs, we need to fingerprint volume
capabilities in more detail and perform controller RPCs only when the specific
capability is present. This fixes a bug in Ceph support where the plugin can
only suport create/delete but we assume that it also supports attach/detach.
2021-03-31 16:37:09 -04:00
Tim Gross f149abfa41 CSI: volume creation/registration should not validate attachment
The CSI specification requires that we validate a list of `Capability` (access
mode + accessibility) when we create volume, but the existing volume
registration workflow incorrectly validates a single capability. The
specific capability required by a volume claim is checked at the time we make
the claim, so remove the check for `AttachmentMode`/`AcccessMode`.
2021-03-31 16:37:09 -04:00
Tim Gross aec5337862 CSI: HTTP handlers for create/delete/list 2021-03-31 16:37:09 -04:00
Tim Gross d38008176e CSI: create/delete/list volume RPCs
This commit implements the RPC handlers on the client that talk to the CSI
plugins on that client for the Create/Delete/List RPC.
2021-03-31 16:37:09 -04:00
Tim Gross 43622680fa test infrastructure for mock client RPCs (#10193)
This commit includes a new test client that allows overriding the RPC
protocols. Only the RPCs that are passed in are registered, which lets you
implement a mock RPC in the server tests. This commit includes an example of
this for the ClientCSI RPC server.
2021-03-31 16:37:09 -04:00
Mahmood Ali e24dac2f39 fixup! oversubscription: Add MemoryMaxMB to internal structs 2021-03-31 09:26:26 -04:00
Mahmood Ali 4ec2d8f5e4 fixup! oversubscription: Add MemoryMaxMB to internal structs 2021-03-31 08:52:20 -04:00
Mahmood Ali 0c2551270a oversubscription: Add MemoryMaxMB to internal structs
Start tracking a new MemoryMaxMB field that represents the maximum memory a task
may use in the client. This allows tasks to specify a memory reservation (to be
used by scheduler when placing the task) but use excess memory used on the
client if the client has any.

This commit adds the server tracking for the value, and ensures that allocations
AllocatedResource fields include the value.
2021-03-30 16:55:58 -04:00
Nick Ethier daecfa61e6
Merge pull request #10203 from hashicorp/f-cpu-cores
Reserved Cores [1/4]: Structs and scheduler implementation
2021-03-29 14:05:54 -04:00
Chris Baker 16e37b986a reworked Node.Canonicalize() to enforce invariants, fixed a broken test 2021-03-26 18:58:38 +00:00
Chris Baker 99ef00d5da reinserted/expanded fsm node.canonicalize test that was still needed 2021-03-26 17:10:39 +00:00
Chris Baker 770c9cecb5 restored Node.Sanitize() for RPC endpoints
multiple other updates from code review
2021-03-26 17:03:15 +00:00
Mahmood Ali dbc3850358
Merge pull request #10145 from hashicorp/b-periodic-init-status
periodic: always reset periodic children status
2021-03-26 09:19:08 -04:00
Mahmood Ali 5d75705edd dispatched parameterized job should clear status too 2021-03-25 15:14:21 -04:00
Mahmood Ali e643742a38 Add a test for parameterized summary counts 2021-03-25 11:27:09 -04:00
Mahmood Ali b0e048bfa4 periodic: always reset periodic children status
Fixes a bug where Nomad reports negative or incorrect running children
counts for periodic jobs.

The periodic dispatcher derives a child job without reseting the status.
If the periodic job has a `running` status, the derived job will start
as `running` status and transition to `pending`.  Since this is
unexpected transition, the counting in StateStore.setJobSummary gets out of sync and
result in negative/incorrect values.

Note that this only affects periodic jobs after a leader transition.
During the first job registration, the job is added with `pending` or
`""` status. However, after a leader transition, the new leader
repopulates the dispatcher heap with `"running"` status and triggers the
bug.
2021-03-25 11:27:09 -04:00
Chris Baker 07646316e5 some comments on the new json extensions/encoding 2021-03-23 18:18:51 +00:00
Chris Baker a43b3a8736 change to fail-safe in json encoding 2021-03-23 18:13:10 +00:00
Chris Baker cb540ed691 added tests that the API doesn't leak Node.SecretID
added more documentation on JSON encoding to the contributing guide
2021-03-23 18:09:20 +00:00
Drew Bailey 74836b95b2
configuration and oss components for licensing (#10216)
* configuration and oss components for licensing

* vendor sync
2021-03-23 09:08:14 -04:00
Mahmood Ali 73b7b98018
testing: default nomad test nodes to 1.0.0 (#10213) 2021-03-23 08:24:26 -04:00
Chris Baker a186badf35 moved JSON handlers and extension code around a bit for proper order of
initialization
2021-03-22 14:12:42 +00:00
Chris Baker 9f7bc5a575 refactor? 2021-03-22 01:49:21 +00:00
Chris Baker dd291e69f4 removed deprecated fields from Drain structs and API
node drain: use msgtype on txn so that events are emitted
wip: encoding extension to add Node.Drain field back to API responses

new approach for hiding Node.SecretID in the API, using `json` tag
documented this approach in the contributing guide
refactored the JSON handlers with extensions
modified event stream encoding to use the go-msgpack encoders with the extensions
2021-03-21 15:30:11 +00:00
Nick Ethier 648ade63ad scheduler: implement scheduling of reserved cores 2021-03-19 00:29:07 -04:00
Nick Ethier 26b200e8bd api: add new 'cores' field to task resources 2021-03-18 23:13:30 -04:00
Nick Ethier 4b2912d343 structs: add struct fields and funcs for reservable cpu cores 2021-03-18 22:49:06 -04:00
Tim Gross fa25e048b2
CSI: unique volume per allocation
Add a `PerAlloc` field to volume requests that directs the scheduler to test
feasibility for volumes with a source ID that includes the allocation index
suffix (ex. `[0]`), rather than the exact source ID.

Read the `PerAlloc` field when making the volume claim at the client to
determine if the allocation index suffix (ex. `[0]`) should be added to the
volume source ID.
2021-03-18 15:35:11 -04:00
Tim Gross 9b2b580d1a
CSI: remove prefix matching from CSIVolumeByID and fix CLI prefix matching (#10158)
Callers of `CSIVolumeByID` are generally assuming they should receive a single
volume. This potentially results in feasibility checking being performed
against the wrong volume if a volume's ID is a prefix substring of other
volume (for example: "test" and "testing").

Removing the incorrect prefix matching from `CSIVolumeByID` breaks prefix
matching in the command line client. Add the required elements for prefix
matching to the commands and API.
2021-03-18 14:32:40 -04:00
Charlie Voiselle 0473f35003
Fixup uses of sanity (#10187)
* Fixup uses of `sanity`
* Remove unnecessary comments.

These checks are better explained by earlier comments about
the context of the test. Per @tgross, moved the tests together
to better reinforce the overall shared context.

* Update nomad/fsm_test.go
2021-03-16 18:05:08 -04:00
Mahmood Ali 1d48433356
server: handle invalid jobs in expose handler hook (#10154)
The expose handler hook must handle if the submitted job is invalid. Without this validation, the rpc handler panics on invalid input.

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2021-03-10 09:12:46 -05:00
Tim Gross 7010a344d6 one-time token: never return expired tokens 2021-03-10 08:17:56 -05:00
Tim Gross 97b0e26d1f RPC endpoints to support 'nomad ui -login'
RPC endpoints for the user-driven APIs (`UpsertOneTimeToken` and
`ExchangeOneTimeToken`) and token expiration (`ExpireOneTimeTokens`).
Includes adding expiration to the periodic core GC job.
2021-03-10 08:17:56 -05:00
Tim Gross 6b2c4b56d0 state store updates for one-time tokens
The `OneTimeToken` struct is to support the `nomad ui -login` command. This
changeset adds the struct to the Nomad state store.
2021-03-10 08:17:56 -05:00
Andre Ilhicas f45fc6c899
consul/connect: enable setting local_bind_address in upstream 2021-02-26 11:37:31 +00:00
Drew Bailey 86d9e1ff90
Merge pull request #9955 from hashicorp/on-update-services
Service and Check on_update configuration option (readiness checks)
2021-02-24 10:11:05 -05:00
Tim Gross b764f52ab9
deploymentwatcher: reset progress deadline on promotion (#10042)
In a deployment with two groups (ex. A and B), if group A's canary becomes
healthy before group B's, the deadline for the overall deployment will be set
to that of group A. When the deployment is promoted, if group A is done it
will not contribute to the next deadline cutoff. Group B's old deadline will
be used instead, which will be in the past and immediately trigger a
deployment progress failure. Reset the progress deadline when the job is
promotion to avoid this bug, and to better conform with implicit user
expectations around how the progress deadline should interact with promotions.
2021-02-22 16:44:03 -05:00
James Rasell 6553cc3da7
drainer: fix error message when handling drain deadlined nodes. 2021-02-18 11:45:44 +01:00
AndrewChubatiuk cd152643fb fixed connect port label 2021-02-13 02:42:14 +02:00
AndrewChubatiuk 3d0aa2ef56 allocate sidecar task port on host_network interface 2021-02-13 02:42:13 +02:00
Nick Ethier fcc1f4c805
Merge pull request #9946 from hashicorp/b-9477
structs: namespace port validation by host_network
2021-02-11 12:53:28 -05:00
Seth Hoenig 45e0e70a50 consul/connect: enable custom sidecars to use expose checks
This PR enables jobs configured with a custom sidecar_task to make
use of the `service.expose` feature for creating checks on services
in the service mesh. Before we would check that sidecar_task had not
been set (indicating that something other than envoy may be in use,
which would not support envoy's expose feature). However Consul has
not added support for anything other than envoy and probably never
will, so having the restriction in place seems like an unnecessary
hindrance. If Consul ever does support something other than Envoy,
they will likely find a way to provide the expose feature anyway.

Fixes #9854
2021-02-09 10:49:37 -06:00
Drew Bailey 8507d54e3b
e2e test for on_update service checks
check_restart not compatible with on_update=ignore

reword caveat
2021-02-08 08:32:40 -05:00
Drew Bailey 82f971f289
OnUpdate configuration for services and checks
Allow for readiness type checks by configuring nomad to ignore warnings
or errors reported by a service check. This allows the deployment to
progress and while Consul handles introducing the sercive into a
resource pool once the check passes.
2021-02-08 08:32:40 -05:00
Nick Ethier eacc4da499
Merge branch 'master' into b-9477 2021-02-05 11:58:13 -05:00
Tim Gross eb3dd17fb2 volumes: implement plan diff for volume requests
The details of host volume and CSI volume requests do not show up in `nomad
plan` outputs, although the updates are detected by the scheduler and result
in an update as expected.
2021-02-04 16:55:17 -05:00
Chris Baker ebbb760ec4 support for scaling_policy in global prefix search 2021-02-03 19:26:57 +00:00
Nick Ethier 43a4d72fda structs: namespace port validation by host_network 2021-02-02 14:56:52 -05:00
Seth Hoenig 720780992c consul/connect: copy bind address map if empty
This parameter is now supposed to be non-nil even if
empty, and the Copy method should also maintain that
invariant.
2021-01-25 10:36:04 -06:00
Seth Hoenig 1ad219c441 consul/connect: remove debug line 2021-01-25 10:36:04 -06:00
Seth Hoenig 8b05efcf88 consul/connect: Add support for Connect terminating gateways
This PR implements Nomad built-in support for running Consul Connect
terminating gateways. Such a gateway can be used by services running
inside the service mesh to access "legacy" services running outside
the service mesh while still making use of Consul's service identity
based networking and ACL policies.

https://www.consul.io/docs/connect/gateways/terminating-gateway

These gateways are declared as part of a task group level service
definition within the connect stanza.

service {
  connect {
    gateway {
      proxy {
        // envoy proxy configuration
      }
      terminating {
        // terminating-gateway configuration entry
      }
    }
  }
}

Currently Envoy is the only supported gateway implementation in
Consul. The gateay task can be customized by configuring the
connect.sidecar_task block.

When the gateway.terminating field is set, Nomad will write/update
the Configuration Entry into Consul on job submission. Because CEs
are global in scope and there may be more than one Nomad cluster
communicating with Consul, there is an assumption that any terminating
gateway defined in Nomad for a particular service will be the same
among Nomad clusters.

Gateways require Consul 1.8.0+, checked by a node constraint.

Closes #9445
2021-01-25 10:36:04 -06:00
Drew Bailey 007158ee75
ignore setting job summary when oldstatus == newstatus (#9884) 2021-01-25 10:34:27 -05:00
Drew Bailey 630babb886
prevent double job status update (#9768)
* Prevent Job Statuses from being calculated twice

https://github.com/hashicorp/nomad/pull/8435 introduced atomic eval
insertion iwth job (de-)registration. This change removes a now obsolete
guard which checked if the index was equal to the job.CreateIndex, which
would empty the status. Now that the job regisration eval insetion is
atomic with the registration this check is no longer necessary to set
the job statuses correctly.

* test to ensure only single job event for job register

* periodic e2e

* separate job update summary step

* fix updatejobstability to use copy instead of modified reference of job

* update envoygatewaybindaddresses copy to prevent job diff on null vs empty

* set ConsulGatewayBindAddress to empty map instead of nil

fix nil assertions for empty map

rm unnecessary guard
2021-01-22 09:18:17 -05:00
Kris Hicks 8f9e47a8e7
Clean up Task Validation tests (#9833)
Co-authored-by: Mahmood Ali <mahmood@hashicorp.com>
2021-01-21 11:53:02 -08:00
Dennis Schön 3eaf1432aa
validate connect block allowed only within group.service 2021-01-20 14:34:23 -05:00
Seth Hoenig f213b8c51b consul/connect: always set gateway proxy default timeout
If the connect.proxy stanza is left unset, the connection timeout
value is not set but is assumed to be, and may cause a non-fatal NPE
on job submission.
2021-01-19 11:23:41 -06:00
Kris Hicks d71a90c8a4
Fix some errcheck errors (#9811)
* Throw away result of multierror.Append

When given a *multierror.Error, it is mutated, therefore the return
value is not needed.

* Simplify MergeMultierrorWarnings, use StringBuilder

* Hash.Write() never returns an error

* Remove error that was always nil

* Remove error from Resources.Add signature

When this was originally written it could return an error, but that was
refactored away, and callers of it as of today never handle the error.

* Throw away results of io.Copy during Bridge

* Handle errors when computing node class in test
2021-01-14 12:46:35 -08:00
Kris Hicks 39e369c3bb
csi: Return error when deleting node (#9803)
In this change we'll properly return the error in the
CSIPluginTypeMonolith case (which is the type given in DeleteNode()),
and also return the error when the given ID is not found.

This was found via errcheck.
2021-01-14 12:44:50 -08:00
Kris Hicks abb8f2ebc0
Refactor Job.Scale() (#9771) 2021-01-14 12:40:42 -08:00
Kris Hicks f77ffb3b5b
Add missing sink.Cancel() in fsm (#9818) 2021-01-14 12:39:20 -08:00
Drew Bailey 03a9541822
ignore poststop task in alloc health tracker (#9548), fixes #9361
* investigating where to ignore poststop task in alloc health tracker

* ignore poststop when setting latest start time for allocation

* clean up logic

* lifecycle: isolate mocks for poststop deployment test

* lifecycle: update comments in tracker

Co-authored-by: Jasmine Dahilig <jasmine@dahilig.com>
2021-01-12 10:03:48 -08:00
Chris Baker 3546469205 nicer error message 2021-01-08 21:13:29 +00:00
Chris Baker d43e0d10c0 appease the linter and fix an incorrect test 2021-01-08 19:38:25 +00:00
Chris Baker 49effd5840 in Job.Scale, ensure that new count is within [min,max] configured in scaling policy
resolves #9758
2021-01-08 19:24:36 +00:00
Seth Hoenig 6c9366986b consul/connect: avoid NPE from unset connect gateway proxy
Submitting a job with an ingress gateway in host networking mode
with an absent gateway.proxy block would cause the Nomad client
to panic on NPE.

The consul registration bits would assume the proxy stanza was
not nil, but it could be if the user does not supply any manually
configured envoy proxy settings.

Check the proxy field is not nil before using it.

Fixes #9669
2021-01-05 09:27:01 -06:00
Chris Baker fd6beefe11 simple test to ensure that scaling endpoint methods support IsRead for
stale read support
2021-01-05 13:42:18 +00:00
Mahmood Ali ae0be24abb tweak bootstrap testing 2021-01-04 09:00:40 -05:00
Mahmood Ali 2ea8ae7584 Only bootstrap when bootstrap_expect 2020-12-17 20:06:14 -05:00
Mahmood Ali 421380a300 add a failing test for unexpected bootstrapping 2020-12-17 20:06:14 -05:00
Seth Hoenig 3a3a175e1a consul/connect: enable configuring custom gateway task
Add the ability to configure the Task used for Connect gateways,
similar to how sidecar Task can be configured.

The implementation here simply re-uses the sidecar_task stanza,
and now gets applied whether connect.sidecar_service or
connect.gateway is the thing being defined. In retrospect,
connect.sidecar_task could have been more generically named
like connect.task to make it a little more re-usable.

Closes #9474
2020-12-17 08:51:52 -06:00
Seth Hoenig beaa6359d5 consul/connect: fix regression where client connect images ignored
Nomad v1.0.0 introduced a regression where the client configurations
for `connect.sidecar_image` and `connect.gateway_image` would be
ignored despite being set. This PR restores that functionality.

There was a missing layer of interpolation that needs to occur for
these parameters. Since Nomad 1.0 now supports dynamic envoy versioning
through the ${NOMAD_envoy_version} psuedo variable, we basically need
to first interpolate

  ${connect.sidecar_image} => envoyproxy/envoy:v${NOMAD_envoy_version}

then use Consul at runtime to resolve to a real image, e.g.

  envoyproxy/envoy:v${NOMAD_envoy_version} => envoyproxy/envoy:v1.16.0

Of course, if the version of Consul is too old to provide an envoy
version preference, we then need to know to fallback to the old
version of envoy that we used before.

  envoyproxy/envoy:v${NOMAD_envoy_version} => envoyproxy/envoy:v1.11.2@sha256:a7769160c9c1a55bb8d07a3b71ce5d64f72b1f665f10d81aa1581bc3cf850d09

Beyond that, we also need to continue to support jobs that set the
sidecar task themselves, e.g.

  sidecar_task { config { image: "custom/envoy" } }

which itself could include teh pseudo envoy version variable.
2020-12-14 09:47:55 -06:00
Drew Bailey 54becaab7d
Events/acl events (#9595)
* fix acl event creation

* allow way to access secretID without exposing it to stream

test that values are omitted

test event creation

test acl events

payloads are pointers

fix failing tests, do all security steps inside constructor

* increase time

* ignore empty tokens

* uncomment line

* changelog
2020-12-11 10:40:50 -05:00
Seth Hoenig 52c9dbbb91 consul/connect: set default Envoy worker threads for gateways
Applying the default --concurrency for gateways was missed before.
Set the default Envoy concurrency to 1 for connect gateways. The
same override value meta.connect.proxy_concurrency applies.
2020-12-10 10:36:29 -06:00
Kris Hicks 0cf9cae656
Apply some suggested fixes from staticcheck (#9598) 2020-12-10 07:29:18 -08:00
Seth Hoenig b3d744fea3
Merge pull request #9586 from hashicorp/f-connect-interp
consul/connect: interpolate connect block
2020-12-09 13:21:50 -06:00
Kris Hicks 0a3a748053
Add gosimple linter (#9590) 2020-12-09 11:05:18 -08:00
Seth Hoenig b51459a879 consul/connect: interpolate connect block
This PR enables job submitters to use interpolation in the connect
block of jobs making use of consul connect. Before, only the name of
the connect service would be interpolated, and only for a few select
identifiers related to the job itself (#6853). Now, all connect fields
can be interpolated using the full spectrum of runtime parameters.

Note that the service name is interpolated at job-submission time,
and cannot make use of values known only at runtime.

Fixes #7221
2020-12-09 09:10:00 -06:00
Kris Hicks 93155ba3da
Add gocritic to golangci-lint config (#9556) 2020-12-08 12:47:04 -08:00
Dennis Schön a9c97d9257
use os.ErrDeadlineExceeded in tests 2020-12-07 10:40:28 -05:00
Drew Bailey c0bd238eb2
fix allocation spelling error, update docs (#9527)
* fix allocation spelling error, update docs

* assign TopicACLPolicy and TopicACLToken properly
2020-12-04 12:04:58 -05:00
James Rasell fd53963afb
core: fix typo msg used when job ID/name contains a null char. 2020-12-04 09:49:31 +01:00
Drew Bailey ce85288f2f
ensure node secret ID is not included in event stream (#9510) 2020-12-03 12:27:14 -05:00
Drew Bailey 17de8ebcb1
API: Event stream use full name instead of Eval/Alloc (#9509)
* use full name for events

use evaluation and allocation instead of short name

* update api event stream package and shortnames

* update docs

* make sync; fix typo

* backwards compat not from 1.0.0-beta event stream api changes

* use api types instead of string

* rm backwards compat note that only changed between prereleases

* remove backwards incompat that only existed in prereleases
2020-12-03 11:48:18 -05:00
Seth Hoenig 3b2b083cbf
Merge pull request #9487 from hashicorp/f-connect-sidecar-concurrency
consul/connect: default envoy concurrency to 1
2020-12-01 15:51:41 -06:00
Drew Bailey f9f5fe8236
Events switch on memdb change table instead of type to prevent duplicates (#9486)
* prevent duplicate job events

when a job is updated, the job_version table is updated with a structs.Job, this caused there to be multiple job events since we are switching off the change type and not the table

* test length

* add table value to tests
2020-12-01 15:14:05 -05:00
Seth Hoenig bf857684d1 consul/connect: default envoy concurrency to 1
Previously, every Envoy Connect sidecar would spawn as many worker
threads as logical CPU cores. That is Envoy's default behavior when
`--concurrency` is not explicitly set. Nomad now sets the concurrency
flag to 1, which is sensible for the default cpu = 250 Mhz resources
allocated for sidecar proxies. The concurrency value can be configured
in Client configuration by setting `meta.connect.proxy_concurrency`.

Closes #9341
2020-12-01 13:12:45 -06:00
Drew Bailey 1f8e1aa631
pass in msgType for UpsertJob (#9475) 2020-12-01 14:00:52 -05:00
Michael Schurter ea0e1789f4
Merge pull request #9435 from hashicorp/f-allocupdate-timer
client: always wait 200ms before sending updates
2020-12-01 08:45:17 -08:00
Drew Bailey 9adca240f8
Event Stream: Track ACL changes, unsubscribe on invalidating changes (#9447)
* upsertaclpolicies

* delete acl policies msgtype

* upsert acl policies msgtype

* delete acl tokens msgtype

* acl bootstrap msgtype

wip unsubscribe on token delete

test that subscriptions are closed after an ACL token has been deleted

Start writing policyupdated test

* update test to use before/after policy

* add SubscribeWithACLCheck to run acl checks on subscribe

* update rpc endpoint to use broker acl check

* Add and use subscriptions.closeSubscriptionFunc

This fixes the issue of not being able to defer unlocking the mutex on
the event broker in the for loop.

handle acl policy updates

* rpc endpoint test for terminating acl change

* add comments

Co-authored-by: Kris Hicks <khicks@hashicorp.com>
2020-12-01 11:11:34 -05:00
Drew Bailey 70ae7ec621
return potential errors from txn.Commit (#9483) 2020-12-01 10:05:37 -05:00
Benjamin Buzbee e0acbbfcc6
Fix RPC retry logic in nomad client's rpc.go for blocking queries (#9266) 2020-11-30 15:11:10 -05:00
Drew Bailey a0b7f05a7b
Remove Managed Sinks from Nomad (#9470)
* Remove Managed Sinks from Nomad

Managed Sinks were a beta feature in Nomad 1.0-beta2. During the beta
period it was determined that this was not a scalable approach to
support community and third party sinks.

* update comment

* changelog
2020-11-30 14:00:31 -05:00
Seth Hoenig e81e9223ef consul/connect: enable setting datacenter in connect upstream
Before, upstreams could only be defined using the default datacenter.
Now, the `datacenter` field can be set in a connect upstream definition,
informing consul of the desire for an instance of the upstream service
in the specified datacenter. The field is optional and continues to
default to the local datacenter.

Closes #8964
2020-11-30 10:38:30 -06:00
Tim Gross b2cd0da0a2
CSI: fix transaction handling in state store (#9438)
When making updates to CSI plugins, the state store methods that have open
write transactions were querying the state store using the same methods used
by the CSI RPC endpoint, but these method creates their own top-level read
transactions. During concurrent plugin updates (as happens when a plugin job
is stopped), this can cause write skew in the plugin counts.

* Refactor the CSIPlugin query methods to have an implementation method that
accepts a transaction, which can be called with either a read txn or a write
txn.
* Refactor the CSIVolume query methods to have an implementation method that
accepts a transaction, which can be called with either a read txn or a write
txn.
* CSI volumes need to be "denormalized" with their plugins and (optionally)
allocations. Read-only RPC endpoints should take a snapshot so that we can
make multiple state store method calls with a consistent view.
2020-11-25 11:15:57 -05:00
Michael Schurter 9bd1f267d2 nomad: try to avoid slice resizing when batching 2020-11-24 09:14:00 -08:00
Seth Hoenig 74a34704c5
Merge pull request #8743 from hashicorp/f-task_network_warning
Validate and document 0.12 mbits/network deprecations
2020-11-23 15:36:18 -06:00
Drew Bailey c8b1a84d1e
Events/mv structs (#9430)
* move structs to structs/event.go to avoid import cycle
2020-11-23 14:01:10 -05:00
Seth Hoenig a35c0db6c7 nomad/structs: validate deprecated task.resource.network port labels
Enable users to submit jobs that still make use of the deprecated
task.resources.network stanza. Such jobs can be submitted, but
will emit a warning.
2020-11-23 12:40:40 -06:00
Nick Ethier f1ea79f5a8 remove references to default mbits 2020-11-23 10:32:13 -06:00
Nick Ethier d21cbeb30f command: remove task network usage from init examples 2020-11-23 10:25:11 -06:00
Nick Ethier 9471892df4 mock: add default host network 2020-11-23 10:11:00 -06:00
Nick Ethier 7266376ae6 nomad: update validate to check group networks for task port usage 2020-11-23 10:11:00 -06:00
Nick Ethier c4ddb0a43a website: add mbits and network deprecation notice 2020-11-23 10:09:36 -06:00
Tim Gross c320c1ba57
CSI: fix struct copying errors (#9239)
The CSIVolume struct "denormalizes" allocations when it's first queried from
the state store. The CSIVolumeByID method on the state store copies the volume
before denormalizing so that we don't end up with unexpected changes. The
copying has some subtle bugs that meant that Allocations (as well as
Topologies and MountOptions) were not getting copied when expected.

Also, ensure we never write allocations attached to volumes to the state store
during claims.
2020-11-18 10:59:25 -05:00
Seth Hoenig 4cc3c01d5b
Merge pull request #9352 from hashicorp/f-artifact-headers
jobspec: add support for headers in artifact stanza
2020-11-13 14:04:27 -06:00
Seth Hoenig bb8a5816a0 jobspec: add support for headers in artifact stanza
This PR adds the ability to set HTTP headers when downloading
an artifact from an `http` or `https` resource.

The implementation in `go-getter` is such that a new `HTTPGetter`
must be created for each artifact that sets headers (as opposed
to conveniently setting headers per-request). This PR maintains
the memoization of the default Getter objects, creating new ones
only for artifacts where headers are set.

Closes #9306
2020-11-13 12:03:54 -06:00
Lars Lehtonen 60936f554c
nomad/structs: fix noop breaks (#9348) 2020-11-13 08:28:11 -05:00
Jasmine Dahilig d6110cbed4
lifecycle: add poststop hook (#8194) 2020-11-12 08:01:42 -08:00
Nick Ethier 5e1634eda1
structs: canonicalize allocatedtaskresources to populate shared ports (#9309) 2020-11-11 16:21:47 -05:00
Tim Gross 60874ebe25
csi: Postrun hook should not change mode (#9323)
The unpublish workflow requires that we know the mode (RW vs RO) if we want to
unpublish the node. Update the hook and the Unpublish RPC so that we mark the
claim for release in a new state but leave the mode alone. This fixes a bug
where RO claims were failing node unpublish.

The core job GC doesn't know the mode, but we don't need it for that workflow,
so add a mode specifically for GC; the volumewatcher uses this as a sentinel
to check whether claims (with their specific RW vs RO modes) need to be claimed.
2020-11-11 13:06:30 -05:00
Chris Baker fbe3670b74
Merge pull request #9317 from hashicorp/f-recommendations-cli-autocomplete
recommendations CLI: autocomplete support
2020-11-11 12:04:23 -06:00
Chris Baker e3c0ea654d auto-complete for recommendations CLI, plus OSS components of recommendations prefix search 2020-11-11 11:13:43 +00:00
Chris Baker 53aa5e75c9 fix #9227: use both job and type query on scaling policy list endpoint 2020-11-10 23:26:35 +00:00
Luiz Aoqui ea81ac5d3d
Merge pull request #9296 from hashicorp/b-remove-namespace-from-scale-request
Remove Namespace field from JobScaleRequest
2020-11-09 15:13:33 -05:00
Luiz Aoqui c536286c7a
remove Namespace field from JobScaleRequest 2020-11-09 13:02:05 -05:00
Kris Hicks 0b590a5040
events: Use single eventsFromChanges func (#9281) 2020-11-05 13:06:52 -08:00
Chris Baker b2a4f64b65
Merge pull request #9278 from hashicorp/b-9268-all-namespace-allocs-acl
fix ACL bugs in listing allocs across all namespaces
2020-11-05 14:59:47 -06:00
Kris Hicks bcb460c36e
Fix handling of deleted change (#9280) 2020-11-05 11:06:41 -08:00
Chris Baker be32fb7d3c updated Allocation.List to properly handle ACL checking for namespace=* 2020-11-05 17:26:33 +00:00
Kris Hicks 20f5fa7f99
Refactor GenericEventsFromChanges (#9279) 2020-11-05 09:06:08 -08:00