open-nomad

Commit Graph

Author	SHA1	Message	Date
James Rasell	9264f07cc1	core: add expired token garbage collection periodic jobs. (#13805 ) Two new periodic core jobs have been added which handle removing expired local and global tokens from state. The local core job is run on every leader; the global core job is only run on the leader within the authoritative region.	2022-07-19 15:37:46 +02:00
Tim Gross	cfa2cb140e	fsm: one-time token expiration should be deterministic (#13737 ) When applying a raft log to expire ACL tokens, we need to use a timestamp provided by the leader so that the result is deterministic across servers. Use leader's timestamp from RPC call	2022-07-18 14:19:29 -04:00
James Rasell	0cde3182eb	core: add ACL token expiry state, struct, and RPC handling. (#13718 ) The ACL token state schema has been updated to utilise two new indexes which track expiration of tokens that are configured with an expiration TTL or time. A new state function allows listing ACL expired tokens which will be used by internal garbage collection. The ACL endpoint has been modified so that all validation happens within a single function call. This is easier to understand and see at a glance. The ACL token validation now also includes logic for expiry TTL and times. The ACL endpoint upsert tests have been condensed into a single, table driven test. There is a new token canonicalize which provides a single place for token canonicalization, rather than logic spread in the RPC handler.	2022-07-13 15:40:34 +02:00
Luiz Aoqui	b656981cf0	Track plan rejection history and automatically mark clients as ineligible (#13421) Plan rejections occur when the scheduler worker and the leader plan applier disagree on the feasibility of a plan. This may happen for valid reasons: since Nomad does parallel scheduling, it is expected that different workers will have a different state when computing placements. As the final plan reaches the leader plan applier, it may no longer be valid due to a concurrent scheduling taking up intended resources. In these situations the plan applier will notify the worker that the plan was rejected and that they should refresh their state before trying again. In some rare and unexpected circumstances it has been observed that workers will repeatedly submit the same plan, even if it is always rejected. While the root cause is still unknown this mitigation has been put in place. The plan applier will now track the history of plan rejections per client and include in the plan result a list of node IDs that should be set as ineligible if the number of rejections in a given time window crosses a certain threshold. The window size and threshold value can be adjusted in the server configuration. To avoid marking several nodes as ineligible at once, the operation is rate limited to 5 nodes every 30min, with an initial burst of 10 operations.	2022-07-12 18:40:20 -04:00
Seth Hoenig	297d386bdc	client: add support for checks in nomad services This PR adds support for specifying checks in services registered to the built-in nomad service provider. Currently only HTTP and TCP checks are supported, though more types could be added later.	2022-07-12 17:09:50 -05:00
Michael Schurter	3e50f72fad	core: merge reserved_ports into host_networks (#13651) Fixes #13505 This fixes #13505 by treating reserved_ports like we treat a lot of jobspec settings: merging settings from more global stanzas (client.reserved.reserved_ports) "down" into more specific stanzas (client.host_networks[].reserved_ports). As discussed in #13505 there are other options, and since it's totally broken right now we have some flexibility: Treat overlapping reserved_ports on addresses as invalid and refuse to start agents. However, I'm not sure there's a cohesive model we want to publish right now since so much 0.9-0.12 compat code still exists! We would have to explain to folks that if their -network-interface and host_network addresses overlapped, they could only specify reserved_ports in one place or the other?! It gets ugly. Use the global client.reserved.reserved_ports value as the default and treat host_network[].reserved_ports as overrides. My first suggestion in the issue, but @groggemans made me realize the addresses on the agent's interface (as configured by -network-interface) may overlap with host_networks, so you'd need to remove the global reserved_ports from addresses shared with a shared network?! This seemed really confusing and subtle for users to me. So I think "merging down" creates the most expressive yet understandable approach. I've played around with it a bit, and it doesn't seem too surprising. The only frustrating part is how difficult it is to observe the available addresses and ports on a node! However that's a job for another PR.	2022-07-12 14:40:25 -07:00
Phil Renaud	e9219a1ae0	Allow wildcard for Evaluations API (#13530) * Failing test and TODO for wildcard * Alias the namespace query parameter for Evals * eval: fix list when using ACLs and * namespace Apply the same verification process as in job, allocs and scaling policy list endpoints to handle the eval list when using an ACL token with limited namespace support but querying using the `*` wildcard namespace. changelog: add entry for #13530 * ui: set namespace when querying eval Evals have a unique UUID as ID, but when querying them the Nomad API still expects a namespace query param, otherwise it assumes `default`. Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2022-07-11 16:42:17 -04:00
Tim Gross	a5a9eedc81	core job for secure variables re-key (#13440 ) When the `Full` flag is passed for key rotation, we kick off a core job to decrypt and re-encrypt all the secure variables so that they use the new key.	2022-07-11 13:34:06 -04:00
Tim Gross	6300427228	core job for key rotation (#13309 ) Extend the GC job to support periodic key rotation. Update the GC process to safely support signed workload identity. We can't GC any key used to sign a workload identity. Finding which key was used to sign every allocation will be expensive, but there are not that many keys. This lets us take a conservative approach: find the oldest live allocation and ensure that we don't GC any key older than that key.	2022-07-11 13:34:06 -04:00
Tim Gross	bfcbc00f4e	workload identity (#13223) In order to support implicit ACL policies for tasks to get their own secrets, each task would need to have its own ACL token. This would add extra raft overhead as well as new garbage collection jobs for cleaning up task-specific ACL tokens. Instead, Nomad will create a workload Identity Claim for each task. An Identity Claim is a JSON Web Token (JWT) signed by the server’s private key and attached to an Allocation at the time a plan is applied. The encoded JWT can be submitted as the X-Nomad-Token header to replace ACL token secret IDs for the RPCs that support identity claims. Whenever a key is added to a server’s keyring, it will use the key as the seed for an Ed25519 public-private keypair. That keypair will be used for signing the JWT and for verifying the JWT. This implementation is a ruthlessly minimal approach to support the secure variables feature. When a JWT is verified, the allocation ID will be checked against the Nomad state store, and non-existent or terminal allocation IDs will cause the validation to be rejected. This is sufficient to support the secure variables feature at launch without requiring implementation of a background process to renew soon-to-expire tokens.	2022-07-11 13:34:05 -04:00
Tim Gross	d5a214484c	core job for root key GC (#13199 ) Inactive and unused keys older than a threshold will be periodically garbage collected.	2022-07-11 13:34:04 -04:00
Tim Gross	d29e85d150	secure variables: initial state store (#12932 ) Implement the core SecureVariable and RootKey structs in memdb, provide the minimal skeleton for FSM, and a dummy storage and keyring RPC endpoint.	2022-07-11 13:34:01 -04:00
Seth Hoenig	239eaf9a29	Merge pull request #13626 from hashicorp/b-client-max-kill-timeout client: enforce max_kill_timeout client configuration	2022-07-07 13:44:39 -05:00
Seth Hoenig	5dd8aa3e27	client: enforce max_kill_timeout client configuration This PR fixes a bug where client configuration max_kill_timeout was not being enforced. The feature was introduced in 9f44780 but seems to have been removed during the major drivers refactoring. We can make sure the value is enforced by plumbing it through the DriverHandler, which now uses the lesser of the task.killTimeout or client.maxKillTimeout. Also updates Event.SetKillTimeout to require both the task.killTimeout and client.maxKillTimeout so that we don't make the mistake of using the wrong value - as it was being given only the task.killTimeout before.	2022-07-06 15:29:38 -05:00
Luiz Aoqui	74c5578432	api: refactor ACL check for namespace wildcard (#13606) Improve how the all namespaces wildcard (`*`) is handled when checking ACL permissions. When using the wildcard namespace the `AllowNsOp` would return false since it looks for a namespace called `*` to match. This commit changes this behavior to return `true` when the queried namespace is `*` and the token allows the operation in _any_ namespace. Actual permission must be checked per object. The helper function `AllowNsOpFunc` returns a function that can be used to make this verification.	2022-07-06 15:22:30 -04:00
James Rasell	0c0b028a59	core: allow deleting of evaluations (#13492 ) * core: add eval delete RPC and core functionality. * agent: add eval delete HTTP endpoint. * api: add eval delete API functionality. * cli: add eval delete command. * docs: add eval delete website documentation.	2022-07-06 16:30:11 +02:00
Michael Schurter	1cc0ae8795	docs: fix Plan{,Result}.NodeUpdate comment (#13534) It appears way back when this was first implemented in 9a917281af9c0a97a6c59575eaa52c5c86ffc60d, it was renamed from NodeEvict (with a correct comment) to NodeUpdate. The comment was changed from referring to only evictions to referring to "all allocs" in the first sentence and "stop or evict" in the second. This confuses me every time I see it because I read the name (NodeUpdate) and first sentence ("all the allocs") and assume this represents all allocations... which isn't true. I'm going to assume I'm the only one who doesn't read the 2nd sentence and that's why this suboptimal wording has lasted 7 years, but can we change it for my sake?	2022-06-30 12:47:14 -07:00
Shishir Mahajan	6ba8245283	Fix typo: orthogonal. Signed-off-by: Shishir Mahajan <smahajan@roblox.com>	2022-06-27 12:12:51 -07:00
Lance Haig	4bf27d743d	Allow Operator Generated bootstrap token (#12520 )	2022-06-03 07:37:24 -04:00
Seth Hoenig	dca954faac	build: update golangci-lint to v1.46.2 This version of golangci-lint improves support for generics, but also is more strict in copy vs. loop for slice copying.	2022-05-31 23:32:01 +00:00
Seth Hoenig	a5943da0c7	core: add tests for blocked evals math	2022-05-24 09:05:18 -05:00
Luiz Aoqui	a8cc633156	vault: revert support for entity aliases (#12723) After a more detailed analysis of this feature, the approach taken in PR #12449 was found to be not ideal due to poor UX (users are responsible for setting the entity alias they would like to use) and issues around jobs potentially masquerading as another Vault entity.	2022-04-22 10:46:34 -04:00
James Rasell	716b8e658b	api: Add support for filtering and pagination to the node list endpoint (#12727 )	2022-04-21 17:04:33 +02:00
Derek Strickland	5e309f3f33	reconciler: Handle canaries when client disconnects (#12539 ) * plan_apply: Allow node updates in disconnected node plans * plan: Keep the job when persisting unknown allocs * reconciler: stop unknown allocs when stopping all * reconcile_util: reorder filtering to handle canaries; skip rescheduling unknown * heartbeat: Fix bug in node heartbeating	2022-04-21 10:05:58 -04:00
chavacava	eb1c42e643	QueryOptions.SetTimeToBlock should take pointer receiver Fixes a bug where blocking queries that are retried don't have their blocking timeout reset, resulting in them running longer than expected.	2022-04-18 10:41:27 -04:00
Shishir	f5121d261e	Add os to NodeListStub struct. (#12497 ) * Add os to NodeListStub struct. Signed-off-by: Shishir Mahajan <smahajan@roblox.com> * Add os as a query param to /v1/nodes. Signed-off-by: Shishir Mahajan <smahajan@roblox.com> * Add test: os as a query param to /v1/nodes. Signed-off-by: Shishir Mahajan <smahajan@roblox.com>	2022-04-15 17:22:45 -07:00
Derek Strickland	0891218ee9	system_scheduler: support disconnected clients (#12555 ) * structs: Add helper method for checking if alloc is configured to disconnect * system_scheduler: Add support for disconnected clients	2022-04-15 09:31:32 -04:00
Derek Strickland	0ab89b1728	Merge pull request #12476 from hashicorp/f-disconnected-client-allocation-handling disconnected clients: Feature branch merge	2022-04-06 10:11:57 -04:00
Seth Hoenig	2e2ff3f75e	Merge pull request #12419 from hashicorp/exec-cleanup raw_exec: make raw exec driver work with cgroups v2	2022-04-05 16:42:01 -05:00
Derek Strickland	43d20ebdbd	disconnected clients: `TaskGroup` validation (#12418 ) * TaskGroup: Validate that max_client_disconnect and stop_after_client_disconnect are mutually exclusive.	2022-04-05 17:14:50 -04:00
Derek Strickland	bd719bc7b8	reconciler: 2 phase reconnects and tests (#12333 ) * structs: Add alloc.Expired & alloc.Reconnected functions. Add Reconnect eval trigger by. * node_endpoint: Emit new eval for reconnecting unknown allocs. * filterByTainted: handle 2 phase commit filtering rules. * reconciler: Append AllocState on disconnect. Logic updates from testing and 2 phase reconnects. * allocs: Set reconnect timestamp. Destroy if not DesiredStatusRun. Watch for unknown status.	2022-04-05 17:13:10 -04:00
Derek Strickland	bb376320a2	comments: update some stale comments referencing deprecated config name (#12271 ) * comments: update some stale comments referencing deprecated config name	2022-04-05 17:12:23 -04:00
Derek Strickland	8e9f8be511	`MaxClientDisconnect` Jobspec checklist (#12177 ) * api: Add struct, conversion function, and tests * TaskGroup: Add field, validation, and tests * diff: Add diff handler and test * docs: Update docs	2022-04-05 17:12:23 -04:00
Derek Strickland	d7f44448e1	disconnected clients: Observability plumbing (#12141 ) * Add disconnects/reconnect to log output and emit reschedule metrics * TaskGroupSummary: Add Unknown, update StateStore logic, add to metrics	2022-04-05 17:12:23 -04:00
Derek Strickland	3cbd76ea9d	disconnected clients: Add reconnect task event (#12133 ) * Add TaskClientReconnectedEvent constant * Add allocRunner.Reconnect function to manage task state manually * Removes server-side push	2022-04-05 17:12:23 -04:00
Derek Strickland	b128769e19	reconciler: support disconnected clients (#12058 ) * Add merge helper for string maps * structs: add statuses, MaxClientDisconnect, and helper funcs * taintedNodes: Include disconnected nodes * upsertAllocsImpl: don't use existing ClientStatus when upserting unknown * allocSet: update filterByTainted and add delayByMaxClientDisconnect * allocReconciler: support disconnecting and reconnecting allocs * GenericScheduler: upsert unknown and queue reconnecting Co-authored-by: Tim Gross <tgross@hashicorp.com>	2022-04-05 17:10:37 -04:00
Luiz Aoqui	ab7eb5de6e	Support Vault entity aliases (#12449) Move some common Vault API data struct decoding out of the Vault client so it can be reused in other situations. Make Vault job validation its own function so it's easier to expand it. Rename the `Job.VaultPolicies` method to just `Job.Vault` since it returns the full Vault block, not just their policies. Set `ChangeMode` on `Vault.Canonicalize`. Add some missing tests. Allows specifying an entity alias that will be used by Nomad when deriving the task Vault token. An entity alias assigns an identity to a token, allowing better control and management of Vault clients since all tokens with the same identity alias will now be considered the same client. This helps track Nomad activity in Vault's audit logs and better control over Vault billing. Add support for a new Nomad server configuration to define a default entity alias to be used when deriving Vault tokens. This default value will be used if the task doesn't have an entity alias defined.	2022-04-05 14:18:10 -04:00
Seth Hoenig	52aaf86f52	raw_exec: make raw exec driver work with cgroups v2 This PR adds support for the raw_exec driver on systems with only cgroups v2. The raw exec driver is able to use cgroups to manage processes. This happens only on Linux, when exec_driver is enabled, and the no_cgroups option is not set. The driver uses the freezer controller to freeze processes of a task, issue a sigkill, then unfreeze. Previously the implementation assumed cgroups v1, and now it also supports cgroups v2. There is a bit of refactoring in this PR, but the fundamental design remains the same. Closes #12351 #12348	2022-04-04 16:11:38 -05:00
James Rasell	9449e1c3e2	Merge branch 'main' into f-1.3-boogie-nights	2022-03-25 16:40:32 +01:00
Tim Gross	b7075f04fd	CSI: enforce single access mode at validation time (#12337) A volume that has single-use access mode is feasibility checked during scheduling to ensure that only a single reader or writer claim exists. However, because feasibility checking is done one alloc at a time before the plan is written, a job that's misconfigured to have count > 1 that mounts one of these volumes will pass feasibility checking. Enforce the check at validation time instead to prevent us from even trying to evaluate a job that's misconfigured this way.	2022-03-23 09:21:26 -04:00
James Rasell	a646333263	Merge branch 'main' into f-1.3-boogie-nights	2022-03-23 09:41:25 +01:00
Tim Gross	60cfeacd76	drainer: defer CSI plugins until last (#12324 ) When a node is drained, system jobs are left until last so that operators can rely on things like log shippers running even as their applications are getting drained off. Include CSI plugins in this set so that Controller plugins deployed as services can be handled as gracefully as Node plugins that are running as system jobs.	2022-03-22 10:26:56 -04:00
Luiz Aoqui	15089f055f	api: add related evals to eval details (#12305 ) The `related` query param is used to indicate that the request should return a list of related (next, previous, and blocked) evaluations. Co-authored-by: Jasmine Dahilig <jasmine@hashicorp.com>	2022-03-17 13:56:14 -04:00
James Rasell	783d7fdc31	jobspec: add service block provider parameter and validation.	2022-03-14 09:21:20 +01:00
Luiz Aoqui	ab8ce87bba	Add pagination, filtering and sort to more API endpoints (#12186 )	2022-03-08 20:54:17 -05:00
Michael Schurter	7bb8de68e5	Merge pull request #12138 from jorgemarey/f-ns-meta Add metadata to namespaces	2022-03-07 10:19:33 -08:00
Luiz Aoqui	01931587ba	api: paginated results with different ordering (#12128 ) The paginator logic was built when go-memdb iterators would return items ordered lexicographically by their ID prefixes, but #12054 added the option for some tables to return results ordered by their `CreateIndex` instead, which invalidated the previous paginator assumption. The iterator used for pagination must still return results in some order so that the paginator can properly handle requests where the next_token value is not present in the results anymore (e.g., the eval was GC'ed). In these situations, the paginator will start the returned page in the first element right after where the requested token should've been. This commit moves the logic to generate pagination tokens from the elements being paginated to the iterator itself so that callers can have more control over the token format to make sure they are properly ordered and stable. It also allows configuring the paginator as being ordered in ascending or descending order, which is relevant when looking for a token that may not be present anymore.	2022-03-01 15:36:49 -05:00
James Rasell	8a23afdb56	events: add state objects and logic for service registrations.	2022-02-28 10:44:58 +01:00
Jorge Marey	a466f01120	Add metadata to namespaces	2022-02-27 09:09:10 +01:00
Florian Apolloner	3bced8f558	namespaces: allow enabling/disabling allowed drivers per namespace	2022-02-24 09:27:32 -05:00
Michael Schurter	7494a0c4fd	core: remove all traces of unused protocol version Nomad inherited protocol version numbering configuration from Consul and Serf, but unlike those projects Nomad has never used it. Nomad's `protocol_version` has always been `1`. While the code is effectively unused and therefore poses no runtime risks to leave, I felt like removing it was best because: 1. Nomad's RPC subsystem has been able to evolve extensively without needing to increment the version number. 2. Nomad's HTTP API has evolved extensively without incrementing `API{Major,Minor}Version`. If we want to version the HTTP API in the future, I doubt this is the mechanism we would choose. 3. The presence of the `server.protocol_version` configuration parameter is confusing since `server.raft_protocol` is an important parameter for operators to consider. Even more confusing is that there is a distinct Serf protocol version which is included in `nomad server members` output under the heading `Protocol`. `raft_protocol` is the only protocol version relevant to Nomad developers and operators. The other protocol versions are either dead code or have never changed (Serf). 4. If we were to need to version the RPC, HTTP API, or Serf protocols, I don't think these configuration parameters and variables are the best choice. If we come to that point we should choose a versioning scheme based on the use case and modern best practices -- not this 6+ year old dead code.	2022-02-18 16:12:36 -08:00
Luiz Aoqui	de91954582	initial base work for implementing sorting and filter across API endpoints (#12076 )	2022-02-16 14:34:36 -05:00
Luiz Aoqui	110dbeeb9d	Add `go-bexpr` filters to evals and deployment list endpoints (#12034 )	2022-02-16 11:40:30 -05:00
Seth Hoenig	40c714a681	api: return sorted results in certain list endpoints These API endpoints now return results in chronological order. They can return results in reverse chronological order by setting the query parameter ascending=true. - Eval.List - Deployment.List	2022-02-15 13:48:28 -06:00
Luiz Aoqui	3bf6036487	Merge tag 'v1.2.6' into merge-release-1.2.6-branch Version 1.2.6	2022-02-10 14:55:34 -05:00
Seth Hoenig	437bb4b86d	client: check escaping of alloc dir using symlinks This PR adds symlink resolution when doing validation of paths to ensure they do not escape client allocation directories.	2022-02-09 19:50:13 -05:00
Samantha	54f8c04c91	Fix health checking for ephemeral poststart tasks (#11945 ) Update the logic in the Nomad client's alloc health tracker which erroneously marks existing healthy allocations with dead poststart ephemeral tasks as unhealthy even if they were already successful during a previous deployment.	2022-02-02 16:29:49 -05:00
Michael Schurter	d87ed3fcd7	core: prevent malformed plans from crashing leader The Plan.Submit endpoint assumed PlanRequest.Plan was never nil. While there is no evidence it ever has been nil, we should not panic if a nil plan is ever submitted because that would crash the leader.	2022-01-31 12:15:15 -08:00
Luiz Aoqui	b1753d0568	scheduler: detect and log unexpected scheduling collisions (#11793 )	2022-01-14 20:09:14 -05:00
Conor Evans	8d622797af	replace 'a alloc' with 'an alloc' where appropriate (#11792 )	2022-01-10 11:59:46 -05:00
Derek Strickland	0a8e03f0f7	Expose Consul template configuration parameters (#11606) This PR exposes the following existing `consul-template` configuration options to Nomad jobspec authors in the `{job.group.task.template}` stanza. - `wait` It also exposes the following `consul-template` configuration to Nomad operators in the `{client.template}` stanza. - `max_stale` - `block_query_wait` - `consul_retry` - `vault_retry` - `wait` Finally, it adds the following new Nomad-specific configuration to the `{client.template}` stanza that allows Operators to set bounds on what `jobspec` authors configure. - `wait_bounds` Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2022-01-10 10:19:07 -05:00
Joel May	4f78bcfb98	Emit metrics on reschedule later decisions as nomad.client.allocs.reschedule (#10237 )	2022-01-06 15:56:43 -05:00
Michael Schurter	20bd8acf43	do not initialize copy's slice if nil in original	2021-12-23 16:40:35 -08:00
Michael Schurter	88200f4eb9	core: fix DNS and CPU Core copying	2021-12-23 12:28:19 -08:00
Michael Schurter	7d741837b0	core: match struct field order in Copy()	2021-12-23 12:27:39 -08:00
James Rasell	45f4689f9c	chore: fixup inconsistent method receiver names. (#11704 )	2021-12-20 11:44:21 +01:00
Tim Gross	a0cf5db797	provide `-no-shutdown-delay` flag for job/alloc stop (#11596 ) Some operators use very long group/task `shutdown_delay` settings to safely drain network connections to their workloads after service deregistration. But during incident response, they may want to cause that drain to be skipped so they can quickly shed load. Provide a `-no-shutdown-delay` flag on the `nomad alloc stop` and `nomad job stop` commands that bypasses the delay. This sets a new desired transition state on the affected allocations that the allocation/task runner will identify during pre-kill on the client. Note (as documented here) that using this flag will almost always result in failed inbound network connections for workloads as the tasks will exit before clients receive updated service discovery information and won't be gracefully drained.	2021-12-13 14:54:53 -05:00
Tim Gross	624ecab901	evaluations list pagination and filtering (#11648 ) API queries can request pagination using the `NextToken` and `PerPage` fields of `QueryOptions`, when supported by the underlying API. Add a `NextToken` field to the `structs.QueryMeta` so that we have a common field across RPCs to tell the caller where to resume paging from on their next API call. Include this field on the `api.QueryMeta` as well so that it's available for future versions of List HTTP APIs that wrap the response with `QueryMeta` rather than returning a simple list of structs. In the meantime callers can get the `X-Nomad-NextToken`. Add pagination to the `Eval.List` RPC by checking for pagination token and page size in `QueryOptions`. This will allow resuming from the last ID seen so long as the query parameters and the state store itself are unchanged between requests. Add filtering by job ID or evaluation status over the results we get out of the state store. Parse the query parameters of the `Eval.List` API into the arguments expected for filtering in the RPC call.	2021-12-10 13:43:03 -05:00
James Rasell	751c8217d1	core: allow setting and propagation of eval priority on job de/registration (#11532) This change modifies the Nomad job register and deregister RPCs to accept an updated option set which includes eval priority. This param is optional and overrides the use of the job priority to set the eval priority. In order to ensure all evaluations as a result of the request use the same eval priority, the priority is shared to the allocReconciler and deploymentWatcher. This creates a new distinction between eval priority and job priority. The Nomad agent HTTP API has been modified to allow setting the eval priority on job update and delete. To keep consistency with the current v1 API, job update accepts this as a payload param; job delete accepts this as a query param. Any user supplied value is validated within the agent HTTP handler removing the need to pass invalid requests to the server. The register and deregister opts functions now allow for setting the eval priority on requests. The change includes a small change to the DeregisterOpts function which handles nil opts. This brings the function in line with the RegisterOpts.	2021-11-23 09:23:31 +01:00
Danish Prakash	1e2c9b3aa0	client: emit max_memory metric (#11490 )	2021-11-17 08:34:22 -05:00
Alessandro De Blasis	07c670fdc0	cli: show `host_network` in `nomad status` (#11432 ) Enhance the CLI in order to return the host network in two flavors (default, verbose) of the `node status` command. Fixes: #11223. Signed-off-by: Alessandro De Blasis <alex@deblasis.net>	2021-11-05 09:02:46 -04:00
Luiz Aoqui	655ac2719f	Allow using specific object ID on diff (#11400 )	2021-11-01 15:16:31 -04:00
Michael Schurter	59fda1894e	Merge pull request #11167 from a-zagaevskiy/master Support configurable dynamic port range	2021-10-13 16:47:38 -07:00
James Rasell	0e926ef3fd	allow configuration of Docker hostnames in bridge mode (#11173) Add a new hostname string parameter to the network block which allows operators to specify the hostname of the network namespace. Changing this causes a destructive update to the allocation and it is omitted if empty from API responses. This parameter also supports interpolation. In order to have a hostname passed as a configuration param when creating an allocation network, the CreateNetwork func of the DriverNetworkManager interface needs to be updated. In order to minimize the disruption of future changes, rather than add another string func arg, the function now accepts a request struct along with the allocID param. The struct has the hostname as a field. The in-tree implementations of DriverNetworkManager.CreateNetwork have been modified to account for the function signature change. In updating for the change, the enhancement of adding hostnames to network namespaces has also been added to the Docker driver, whilst the default Linux manager does not currently implement it.	2021-09-16 08:13:09 +02:00
Aleksandr Zagaevskiy	ebb87e65fe	Support configurable dynamic port range	2021-09-10 11:52:47 +03:00
James Rasell	b6813f1221	chore: fix incorrect docstring formatting.	2021-08-30 11:08:12 +02:00
Mahmood Ali	84a3522133	Consider all system jobs for a new node (#11054 ) When a node becomes ready, create an eval for all system jobs across namespaces. The previous code uses `job.ID` to deduplicate evals, but that ignores the job namespace. Thus if there are multiple jobs in different namespaces sharing the same ID/Name, only one will be considered for running in the new node. Thus, Nomad may skip running some system jobs in that node.	2021-08-18 09:50:37 -04:00
Mahmood Ali	c37339a8c8	Merge pull request #9160 from hashicorp/f-sysbatch core: implement system batch scheduler	2021-08-16 09:30:24 -04:00
Mahmood Ali	bfc766357e	deployments: canary=0 is implicitly autopromote (#11013 ) In a multi-task-group job, treat 0 canary groups as auto-promote. This change fixes an edge case where Nomad required a manual promotion if the job had any group with canary=0 and the rest of the groups had auto_promote set. Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2021-08-10 17:06:40 -04:00
Seth Hoenig	3371214431	core: implement system batch scheduler This PR implements a new "System Batch" scheduler type. Jobs can make use of this new scheduler by setting their type to 'sysbatch'. As the name implies, sysbatch can be thought of as a hybrid between system and batch jobs - it is for running short-lived jobs intended to run on every compatible node in the cluster. As with batch jobs, sysbatch jobs can also be periodic and/or parameterized dispatch jobs. A sysbatch job is considered complete when it has been run on all compatible nodes until reaching a terminal state (success or failed on retries). Feasibility and preemption are governed the same as with system jobs. In this PR, the update stanza is not yet supported. The update stanza is still limited in functionality for the underlying system scheduler, and is not useful yet for sysbatch jobs. Further work in #4740 will improve support for the update stanza and deployments. Closes #2527	2021-08-03 10:30:47 -04:00
Seth Hoenig	ac5c83cafd	core: remove internalization of affinity strings Basically the same as #10896 but with the Affinity struct. Since we use reflect.DeepEquals for job comparison, there is risk of false positives for changes due to a job struct with memoized vs non-memoized strings. Closes #10897	2021-07-15 15:15:39 -05:00
Seth Hoenig	1aec25f1df	core: do not memoize constraint strings This PR causes Nomad to no longer memoize the String value of a Constraint. The private memoized variable may or may not be initialized at any given time, which means a reflect.DeepEqual comparison between two jobs (e.g. during Plan) may return incorrect results. Fixes #10836	2021-07-14 10:04:35 -05:00
Mahmood Ali	1f34f2197b	Merge pull request #10806 from hashicorp/munda/idempotent-job-dispatch Enforce idempotency of dispatched jobs using token on dispatch request	2021-07-08 10:23:31 -04:00
Tim Gross	9f128a28ae	service: remove duplicate name check during validation (#10868 ) When a task group with `service` block(s) is validated, we validate that there are no duplicates, but this validation doesn't have access to the task environment because it hasn't been created yet. Services and checks with interpolation can be flagged incorrectly as conflicting. Name conflicts in services are not actually an error in Consul and users have reported wanting to use the same service name for task groups differentiated by tags.	2021-07-08 09:43:38 -04:00
Alex Munda	848918018c	Move idempotency token to write options. Remove DispatchIdempotent	2021-06-30 15:10:48 -05:00
Alex Munda	ca86c7ba0c	Add idempotency token to dispatch request instead of special meta key	2021-06-29 15:59:23 -05:00
Seth Hoenig	d026ff1f66	consul/connect: add support for connect mesh gateways This PR implements first-class support for Nomad running Consul Connect Mesh Gateways. Mesh gateways enable services in the Connect mesh to make cross-DC connections via gateways, where each datacenter may not have full node interconnectivity. Consul docs with more information: https://www.consul.io/docs/connect/gateways/mesh-gateway The following group level service block can be used to establish a Connect mesh gateway. service { connect { gateway { mesh { // no configuration } } } } Services can make use of a mesh gateway by configuring so in their upstream blocks, e.g. service { connect { sidecar_service { proxy { upstreams { destination_name = "<service>" local_bind_port = <port> datacenter = "<datacenter>" mesh_gateway { mode = "<mode>" } } } } } } Typical use of a mesh gateway is to create a bridge between datacenters. A mesh gateway should then be configured with a service port that is mapped from a host_network configured on a WAN interface in Nomad agent config, e.g. client { host_network "public" { interface = "eth1" } } Create a port mapping in the group.network block for use by the mesh gateway service from the public host_network, e.g. network { mode = "bridge" port "mesh_wan" { host_network = "public" } } Use this port label for the service.port of the mesh gateway, e.g. service { name = "mesh-gateway" port = "mesh_wan" connect { gateway { mesh {} } } } Currently Envoy is the only supported gateway implementation in Consul. By default Nomad client will run the latest official Envoy docker image supported by the local Consul agent. The Envoy task can be customized by setting `meta.connect.gateway_image` in agent config or by setting the `connect.sidecar_task` block. Gateways require Consul 1.8.0+, enforced by the Nomad scheduler. Closes #9446	2021-06-04 08:24:49 -05:00
Seth Hoenig	d359eb6f3a	consul/connect: use additional constraints in scheduling connect tasks This PR adds two additional constraints on Connect sidecar and gateway tasks, making sure Nomad schedules them only onto nodes where Connect is actually enabled on the Consul agent. Consul requires `connect.enabled = true` and `ports.grpc = <number>` to be explicitly set on agent configuration before Connect APIs will work. Until now, Nomad would only validate a minimum version of Consul, which would cause confusion for users who try to run Connect tasks on nodes where Consul is not yet sufficiently configured. These constraints prevent job scheduling on nodes where Connect is not actually usable. Closes #10700	2021-06-03 15:43:34 -05:00
Tim Gross	e9777a88ce	plan applier: add trace-level log of plan The plans generated by the scheduler produce high-level output of counts on each evaluation, but when debugging scheduler issues it'd be nice to have a more detailed view of the resulting plan. Emitting this log at trace minimizes the overhead, and producing it in the plan applier makes it easier to find as it will always be on the leader.	2021-06-02 10:25:23 -04:00
Chris Baker	263ddd567c	Node Drain Metadata (#10250 )	2021-05-07 13:58:40 -04:00
Michael Schurter	547a718ef6	Merge pull request #10248 from hashicorp/f-remotetask-2021 core: propagate remote task handles	2021-04-30 08:57:26 -07:00
Luiz Aoqui	f1b9055d21	Add metrics for blocked eval resources (#10454 ) * add metrics for blocked eval resources * docs: add new blocked_evals metrics * fix to call `pruneStats` instead of `stats.prune` directly	2021-04-29 15:03:45 -04:00
Michael Schurter	e62795798d	core: propagate remote task handles Add a new driver capability: RemoteTasks. When a task is run by a driver with RemoteTasks set, its TaskHandle will be propagated to the server in its allocation's TaskState. If the task is replaced due to a down node or draining, its TaskHandle will be propagated to its replacement allocation. This allows tasks to be scheduled in remote systems whose lifecycles are disconnected from the Nomad node's lifecycle. See https://github.com/hashicorp/nomad-driver-ecs for an example ECS remote task driver.	2021-04-27 15:07:03 -07:00
Mahmood Ali	cf24a9eaaf	api: /v1/jobs always include namespaces (#10434 ) Add Namespace as a top-level field in `/v1/jobs` stub. The `/v1/jobs` endpoint already includes the namespace under `JobSummary`, though the API is odd, as typically the job ID and Namespace are at the same level, and the oddity complicates the UI frontend development. The downside of adding it is a redundant field that makes the response body a bit bigger, especially for clusters with large jobs. Though, it should compress nicely and I expect the overhead to be small relative to the overall response size. The benefit of a cleaner and more consistent API seems worth it. Fixes #10431	2021-04-23 16:36:54 -04:00
Mahmood Ali	d2fcce21f8	Migrate all allocs when draining a node (#10411 ) This fixes a bug affecting draining nodes, where allocs may fail to be migrated if they belong to different namespaces but share the same job name. The reason is that the helper function that creates the migration evals indexed the allocs by job ID without accounting for the namespaces. When job IDs clash, an eval is created for only one of them and the rest of the allocs remain intact. Fixes #10172	2021-04-21 12:11:14 -04:00
Seth Hoenig	1ee8d5ffc5	api: implement fuzzy search API This PR introduces the /v1/search/fuzzy API endpoint, used for fuzzy searching objects in Nomad. The fuzzy search endpoint routes requests to the Nomad Server leader, which implements the Search.FuzzySearch RPC method. Requests to the fuzzy search API are based on the api.FuzzySearchRequest object, e.g. { "Text": "ed", "Context": "all" } Responses from the fuzzy search API are based on the api.FuzzySearchResponse object, e.g. { "Index": 27, "KnownLeader": true, "LastContact": 0, "Matches": { "tasks": [ { "ID": "redis", "Scope": [ "default", "example", "cache" ] } ], "evals": [], "deployment": [], "volumes": [], "scaling_policy": [], "images": [ { "ID": "redis:3.2", "Scope": [ "default", "example", "cache", "redis" ] } ] }, "Truncations": { "volumes": false, "scaling_policy": false, "evals": false, "deployment": false } } The API is tunable using the new server.search stanza, e.g. server { search { fuzzy_enabled = true limit_query = 200 limit_results = 1000 min_term_length = 5 } } These values can be increased or decreased, so as to provide more search results or to reduce load on the Nomad Server. The fuzzy search API can be disabled entirely by setting `fuzzy_enabled` to `false`.	2021-04-16 16:36:07 -06:00
Tim Gross	dbcc2694b0	refactor: move VolumeRequest validation to Validate method	2021-04-07 11:24:09 -04:00
Seth Hoenig	f17ba33f61	consul: plumbing for specifying consul namespace in job/group This PR adds the common OSS changes for adding support for Consul Namespaces, which is going to be a Nomad Enterprise feature. There is no new functionality provided by this changeset and hopefully no new bugs.	2021-04-05 10:03:19 -06:00
Chris Baker	436d46bd19	Merge branch 'main' into f-node-drain-api	2021-04-01 15:22:57 -05:00
Tim Gross	aec5337862	CSI: HTTP handlers for create/delete/list	2021-03-31 16:37:09 -04:00
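
Commits 84a3522133 and d2fcce21f8 above both describe the same class of bug: items were deduplicated or indexed by job ID alone, so jobs sharing an ID across namespaces collapsed into one entry. The following is a minimal sketch of that failure mode and of a namespace-aware key. The `Job` and `NamespacedID` types and both helper functions are hypothetical, pared-down illustrations and are not taken from the Nomad code base.

```go
package main

import "fmt"

// Job is a hypothetical stand-in for a job; only the fields needed for the
// illustration are included.
type Job struct {
	Namespace string
	ID        string
}

// NamespacedID is an illustrative composite map key. Because Go structs with
// comparable fields are themselves comparable, it can be used as a map key
// directly.
type NamespacedID struct {
	Namespace string
	ID        string
}

// countByID keys jobs by ID alone, which reproduces the behaviour the commit
// messages describe: same-ID jobs in different namespaces collapse together.
func countByID(jobs []Job) int {
	seen := make(map[string]struct{})
	for _, j := range jobs {
		seen[j.ID] = struct{}{}
	}
	return len(seen)
}

// countByNamespacedID keys jobs by (namespace, ID), so jobs that share an ID
// but live in different namespaces remain distinct.
func countByNamespacedID(jobs []Job) int {
	seen := make(map[NamespacedID]struct{})
	for _, j := range jobs {
		seen[NamespacedID{Namespace: j.Namespace, ID: j.ID}] = struct{}{}
	}
	return len(seen)
}

func main() {
	jobs := []Job{
		{Namespace: "default", ID: "logger"},
		{Namespace: "prod", ID: "logger"},
	}
	fmt.Println(countByID(jobs))           // prints 1: one job is silently dropped
	fmt.Println(countByNamespacedID(jobs)) // prints 2: both jobs are considered
}
```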


1401 Commits