open-nomad

Commit Graph

Author	SHA1	Message	Date
Derek Strickland	6874997f91	scheduler: Fix bug where the would treat multiregion jobs as paused for job types that don't use deployments (#14659 ) * scheduler: Fix bug where the scheduler would treat multiregion jobs as paused for job types that don't use deployments Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Tim Gross <tgross@hashicorp.com>	2022-09-22 14:31:27 -04:00
Seth Hoenig	2088ca3345	cleanup more helper updates (#14638 ) * cleanup: refactor MapStringStringSliceValueSet to be cleaner * cleanup: replace SliceStringToSet with actual set * cleanup: replace SliceStringSubset with real set * cleanup: replace SliceStringContains with slices.Contains * cleanup: remove unused function SliceStringHasPrefix * cleanup: fixup StringHasPrefixInSlice doc string * cleanup: refactor SliceSetDisjoint to use real set * cleanup: replace CompareSliceSetString with SliceSetEq * cleanup: replace CompareMapStringString with maps.Equal * cleanup: replace CopyMapStringString with CopyMap * cleanup: replace CopyMapStringInterface with CopyMap * cleanup: fixup more CopyMapStringString and CopyMapStringInt * cleanup: replace CopySliceString with slices.Clone * cleanup: remove unused CopySliceInt * cleanup: refactor CopyMapStringSliceString to be generic as CopyMapOfSlice * cleanup: replace CopyMap with maps.Clone * cleanup: run go mod tidy	2022-09-21 14:53:25 -05:00
Charlie Voiselle	e58998e218	Add client scheduling eligibility to heartbeat (#14483 )	2022-09-08 14:31:36 -04:00
Yan	6e927fa125	warn destructive update only when count > 1 (#13103 )	2022-09-02 15:30:06 -04:00
James Rasell	755b4745ed	Merge branch 'main' into f-gh-13120-sso-umbrella-merged-main	2022-08-30 08:59:13 +01:00
Tim Gross	62a968f443	Merge pull request #14351 from hashicorp/variables-rename Variables rename	2022-08-29 11:36:50 -04:00
Piotr Kazmierczak	5f353503e5	bugfix: fixed template validation panic in case of incorrect ChangeScript configuration (#14374 ) Fixes #14367	2022-08-29 17:11:15 +02:00
Tim Gross	1dc053b917	rename SecureVariables to Variables throughout	2022-08-26 16:06:24 -04:00
James Rasell	601588df6b	Merge branch 'main' into f-gh-13120-sso-umbrella-merged-main	2022-08-25 12:14:29 +01:00
Luiz Aoqui	e012d9411e	Task lifecycle restart (#14127 ) * allocrunner: handle lifecycle when all tasks die When all tasks die the Coordinator must transition to its terminal state, coordinatorStatePoststop, to unblock poststop tasks. Since this could happen at any time (for example, a prestart task dies), all states must be able to transition to this terminal state. * allocrunner: implement different alloc restarts Add a new alloc restart mode where all tasks are restarted, even if they have already exited. Also unifies the alloc restart logic to use the implementation that restarts tasks concurrently and ignores ErrTaskNotRunning errors since those are expected when restarting the allocation. * allocrunner: allow tasks to run again Prevent the task runner Run() method from exiting to allow a dead task to run again. When the task runner is signaled to restart, the function will jump back to the MAIN loop and run it again. The task runner determines if a task needs to run again based on two new task events that were added to differentiate between a request to restart a specific task, the tasks that are currently running, or all tasks that have already run. * api/cli: add support for all tasks alloc restart Implement the new -all-tasks alloc restart CLI flag and its API counterpar, AllTasks. The client endpoint calls the appropriate restart method from the allocrunner depending on the restart parameters used. * test: fix tasklifecycle Coordinator test * allocrunner: kill taskrunners if all tasks are dead When all non-poststop tasks are dead we need to kill the taskrunners so we don't leak their goroutines, which are blocked in the alloc restart loop. This also ensures the allocrunner exits on its own. * taskrunner: fix tests that waited on WaitCh Now that "dead" tasks may run again, the taskrunner Run() method will not return when the task finishes running, so tests must wait for the task state to be "dead" instead of using the WaitCh, since it won't be closed until the taskrunner is killed. * tests: add tests for all tasks alloc restart * changelog: add entry for #14127 * taskrunner: fix restore logic. The first implementation of the task runner restore process relied on server data (`tr.Alloc().TerminalStatus()`) which may not be available to the client at the time of restore. It also had the incorrect code path. When restoring a dead task the driver handle always needs to be clear cleanly using `clearDriverHandle` otherwise, after exiting the MAIN loop, the task may be killed by `tr.handleKill`. The fix is to store the state of the Run() loop in the task runner local client state: if the task runner ever exits this loop cleanly (not with a shutdown) it will never be able to run again. So if the Run() loops starts with this local state flag set, it must exit early. This local state flag is also being checked on task restart requests. If the task is "dead" and its Run() loop is not active it will never be able to run again. * address code review requests * apply more code review changes * taskrunner: add different Restart modes Using the task event to differentiate between the allocrunner restart methods proved to be confusing for developers to understand how it all worked. So instead of relying on the event type, this commit separated the logic of restarting an taskRunner into two methods: - `Restart` will retain the current behaviour and only will only restart the task if it's currently running. - `ForceRestart` is the new method where a `dead` task is allowed to restart if its `Run()` method is still active. Callers will need to restart the allocRunner taskCoordinator to make sure it will allow the task to run again. * minor fixes	2022-08-24 17:43:07 -04:00
Piotr Kazmierczak	7077d1f9aa	template: custom change_mode scripts (#13972 ) This PR adds the functionality of allowing custom scripts to be executed on template change. Resolves #2707	2022-08-24 17:43:01 +02:00
Luiz Aoqui	7a8cacc9ec	allocrunner: refactor task coordinator (#14009 ) The current implementation for the task coordinator unblocks tasks by performing destructive operations over its internal state (like closing channels and deleting maps from keys). This presents a problem in situations where we would like to revert the state of a task, such as when restarting an allocation with tasks that have already exited. With this new implementation the task coordinator behaves more like a finite state machine where task may be blocked/unblocked multiple times by performing a state transition. This initial part of the work only refactors the task coordinator and is functionally equivalent to the previous implementation. Future work will build upon this to provide bug fixes and enhancements.	2022-08-22 18:38:49 -04:00
Tim Gross	bf57d76ec7	allow ACL policies to be associated with workload identity (#14140 ) The original design for workload identities and ACLs allows for operators to extend the automatic capabilities of a workload by using a specially-named policy. This has shown to be potentially unsafe because of naming collisions, so instead we'll allow operators to explicitly attach a policy to a workload identity. This changeset adds workload identity fields to ACL policy objects and threads that all the way down to the command line. It also a new secondary index to the ACL policy table on namespace and job so that claim resolution can efficiently query for related policies.	2022-08-22 16:41:21 -04:00
Luiz Aoqui	dbffdca92e	template: use pointer values for gid and uid (#14203 ) When a Nomad agent starts and loads jobs that already existed in the cluster, the default template uid and gid was being set to 0, since this is the zero value for int. This caused these jobs to fail in environments where it was not possible to use 0, such as in Windows clients. In order to differentiate between an explicit 0 and a template where these properties were not set we need to use a pointer.	2022-08-22 16:25:49 -04:00
Michael Schurter	3b57df33e3	client: fix data races in config handling (#14139 ) Before this change, Client had 2 copies of the config object: config and configCopy. There was no guidance around which to use where (other than configCopy's comment to pass it to alloc runners), both are shared among goroutines and mutated in data racy ways. At least at one point I think the idea was to have `config` be mutable and then grab a lock to overwrite `configCopy`'s pointer atomically. This would have allowed alloc runners to read their config copies in data race safe ways, but this isn't how the current implementation worked. This change takes the following approach to safely handling configs in the client: 1. `Client.config` is the only copy of the config and all access must go through the `Client.configLock` mutex 2. Since the mutex only protects the config pointer itself and not fields inside the Config struct: all config mutation must be done on a copy of the config, and then Client's config pointer is overwritten while the mutex is acquired. Alloc runners and other goroutines with the old config pointer will not see config updates. 3. Deep copying is implemented on the Config struct to satisfy the previous approach. The TLS Keyloader is an exception because it has its own internal locking to support mutating in place. An unfortunate complication but one I couldn't find a way to untangle in a timely fashion. 4. To facilitate deep copying I made an internally backward incompatible API change: our `helper/funcs` used to turn containers (slices and maps) with 0 elements into nils. This probably saves a few memory allocations but makes it very easy to cause panics. Since my new config handling approach uses more copying, it became very difficult to ensure all code that used containers on configs could handle nils properly. Since this code has caused panics in the past, I fixed it: nil containers are copied as nil, but 0-element containers properly return a new 0-element container. No more "downgrading to nil!"	2022-08-18 16:32:04 -07:00
Piotr Kazmierczak	b63944b5c1	cleanup: replace TypeToPtr helper methods with pointer.Of (#14151 ) Bumping compile time requirement to go 1.18 allows us to simplify our pointer helper methods.	2022-08-17 18:26:34 +02:00
James Rasell	9e3f1581fb	core: add ACL role functionality to ACL tokens. ACL tokens can now utilize ACL roles in order to provide API authorization. Each ACL token can be created and linked to an array of policies as well as an array of ACL role links. The link can be provided via the role name or ID, but internally, is always resolved to the ID as this is immutable whereas the name can be changed by operators. When resolving an ACL token, the policies linked from an ACL role are unpacked and combined with the policy array to form the complete auth set for the token. The ACL token creation endpoint handles deduplicating ACL role links as well as ensuring they exist within state. When reading a token, Nomad will also ensure the ACL role link is current. This handles ACL roles being deleted from under a token from a UX standpoint.	2022-08-17 14:45:01 +01:00
Tim Gross	4005759d28	move secure variable conflict resolution to state store (#13922 ) Move conflict resolution implementation into the state store with a new Apply RPC. This also makes the RPC for secure variables much more similar to Consul's KV, which will help us support soft deletes in a post-1.4.0 version of Nomad. Reimplement quotas in the state store functions. Co-authored-by: Charlie Voiselle <464492+angrycub@users.noreply.github.com>	2022-08-15 11:19:53 -04:00
James Rasell	e660c9a908	core: add ACL role state schema and functionality. (#13955 ) This commit includes the new state schema for ACL roles along with state interaction functions for CRUD actions. The change also includes snapshot persist and restore functionality and the addition of FSM messages for Raft updates which will come via RPC endpoints.	2022-08-09 09:33:41 +02:00
Seth Hoenig	2b6bda49b9	core: automatically plumb task name into task-level services and checks	2022-08-05 12:42:41 -05:00
Piotr Kazmierczak	530280505f	client: enable specifying user/group permissions in the template stanza (#13755 ) * Adds Uid/Gid parameters to template. * Updated diff_test * fixed order * update jobspec and api * removed obsolete code * helper functions for jobspec parse test * updated documentation * adjusted API jobs test. * propagate uid/gid setting to job_endpoint * adjusted job_endpoint tests * making uid/gid into pointers * refactor * updated documentation * updated documentation * Update client/allocrunner/taskrunner/template/template_test.go Co-authored-by: Luiz Aoqui <luiz@hashicorp.com> * Update website/content/api-docs/json-jobs.mdx Co-authored-by: Luiz Aoqui <luiz@hashicorp.com> * propagating documentation change from Luiz * formatting * changelog entry * changed changelog entry Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2022-08-02 22:15:38 +02:00
James Rasell	663aa92b7a	Merge branch 'main' into f-gh-13120-sso-umbrella	2022-08-02 08:30:03 +01:00
Seth Hoenig	c61e779b48	Merge pull request #13715 from hashicorp/dev-nsd-checks client: add support for checks in nomad services	2022-07-21 10:22:57 -05:00
Seth Hoenig	8e6eeaa37e	Merge pull request #13869 from hashicorp/b-uniq-services-2 servicedisco: ensure service uniqueness in job validation	2022-07-21 08:24:24 -05:00
Tim Gross	d11da1df5c	workload identity: use parent ID for dispatch/periodic jobs (#13748 ) Workload identities grant implicit access to policies, and operators will not want to craft separate policies for each invocation of a periodic or dispatch job. Use the parent job's ID as the JobID claim.	2022-07-21 09:05:54 -04:00
Seth Hoenig	e5978a9cbf	jobspec: ensure service uniqueness in job validation	2022-07-20 12:38:08 -05:00
Seth Hoenig	d83aae253f	cleanup: track task names and providers using set	2022-07-20 11:48:36 -05:00
James Rasell	9264f07cc1	core: add expired token garbage collection periodic jobs. (#13805 ) Two new periodic core jobs have been added which handle removing expired local and global tokens from state. The local core job is run on every leader; the global core job is only run on the leader within the authoritative region.	2022-07-19 15:37:46 +02:00
Tim Gross	cfa2cb140e	fsm: one-time token expiration should be deterministic (#13737 ) When applying a raft log to expire ACL tokens, we need to use a timestamp provided by the leader so that the result is deterministic across servers. Use leader's timestamp from RPC call	2022-07-18 14:19:29 -04:00
James Rasell	0cde3182eb	core: add ACL token expiry state, struct, and RPC handling. (#13718 ) The ACL token state schema has been updated to utilise two new indexes which track expiration of tokens that are configured with an expiration TTL or time. A new state function allows listing ACL expired tokens which will be used by internal garbage collection. The ACL endpoint has been modified so that all validation happens within a single function call. This is easier to understand and see at a glance. The ACL token validation now also includes logic for expiry TTL and times. The ACL endpoint upsert tests have been condensed into a single, table driven test. There is a new token canonicalize which provides a single place for token canonicalization, rather than logic spread in the RPC handler.	2022-07-13 15:40:34 +02:00
Luiz Aoqui	b656981cf0	Track plan rejection history and automatically mark clients as ineligible (#13421 ) Plan rejections occur when the scheduler work and the leader plan applier disagree on the feasibility of a plan. This may happen for valid reasons: since Nomad does parallel scheduling, it is expected that different workers will have a different state when computing placements. As the final plan reaches the leader plan applier, it may no longer be valid due to a concurrent scheduling taking up intended resources. In these situations the plan applier will notify the worker that the plan was rejected and that they should refresh their state before trying again. In some rare and unexpected circumstances it has been observed that workers will repeatedly submit the same plan, even if they are always rejected. While the root cause is still unknown this mitigation has been put in place. The plan applier will now track the history of plan rejections per client and include in the plan result a list of node IDs that should be set as ineligible if the number of rejections in a given time window crosses a certain threshold. The window size and threshold value can be adjusted in the server configuration. To avoid marking several nodes as ineligible at one, the operation is rate limited to 5 nodes every 30min, with an initial burst of 10 operations.	2022-07-12 18:40:20 -04:00
Seth Hoenig	297d386bdc	client: add support for checks in nomad services This PR adds support for specifying checks in services registered to the built-in nomad service provider. Currently only HTTP and TCP checks are supported, though more types could be added later.	2022-07-12 17:09:50 -05:00
Michael Schurter	3e50f72fad	core: merge reserved_ports into host_networks (#13651 ) Fixes #13505 This fixes #13505 by treating reserved_ports like we treat a lot of jobspec settings: merging settings from more global stanzas (client.reserved.reserved_ports) "down" into more specific stanzas (client.host_networks[].reserved_ports). As discussed in #13505 there are other options, and since it's totally broken right now we have some flexibility: Treat overlapping reserved_ports on addresses as invalid and refuse to start agents. However, I'm not sure there's a cohesive model we want to publish right now since so much 0.9-0.12 compat code still exists! We would have to explain to folks that if their -network-interface and host_network addresses overlapped, they could only specify reserved_ports in one place or the other?! It gets ugly. Use the global client.reserved.reserved_ports value as the default and treat host_network[].reserverd_ports as overrides. My first suggestion in the issue, but @groggemans made me realize the addresses on the agent's interface (as configured by -network-interface) may overlap with host_networks, so you'd need to remove the global reserved_ports from addresses shared with a shared network?! This seemed really confusing and subtle for users to me. So I think "merging down" creates the most expressive yet understandable approach. I've played around with it a bit, and it doesn't seem too surprising. The only frustrating part is how difficult it is to observe the available addresses and ports on a node! However that's a job for another PR.	2022-07-12 14:40:25 -07:00
Phil Renaud	e9219a1ae0	Allow wildcard for Evaluations API (#13530 ) * Failing test and TODO for wildcard * Alias the namespace query parameter for Evals * eval: fix list when using ACLs and * namespace Apply the same verification process as in job, allocs and scaling policy list endpoints to handle the eval list when using an ACL token with limited namespace support but querying using the `` wildcard namespace. changelog: add entry for #13530 * ui: set namespace when querying eval Evals have a unique UUID as ID, but when querying them the Nomad API still expects a namespace query param, otherwise it assumes `default`. Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2022-07-11 16:42:17 -04:00
Tim Gross	a5a9eedc81	core job for secure variables re-key (#13440 ) When the `Full` flag is passed for key rotation, we kick off a core job to decrypt and re-encrypt all the secure variables so that they use the new key.	2022-07-11 13:34:06 -04:00
Tim Gross	6300427228	core job for key rotation (#13309 ) Extend the GC job to support periodic key rotation. Update the GC process to safely support signed workload identity. We can't GC any key used to sign a workload identity. Finding which key was used to sign every allocation will be expensive, but there are not that many keys. This lets us take a conservative approach: find the oldest live allocation and ensure that we don't GC any key older than that key.	2022-07-11 13:34:06 -04:00
Tim Gross	bfcbc00f4e	workload identity (#13223 ) In order to support implicit ACL policies for tasks to get their own secrets, each task would need to have its own ACL token. This would add extra raft overhead as well as new garbage collection jobs for cleaning up task-specific ACL tokens. Instead, Nomad will create a workload Identity Claim for each task. An Identity Claim is a JSON Web Token (JWT) signed by the server’s private key and attached to an Allocation at the time a plan is applied. The encoded JWT can be submitted as the X-Nomad-Token header to replace ACL token secret IDs for the RPCs that support identity claims. Whenever a key is is added to a server’s keyring, it will use the key as the seed for a Ed25519 public-private private keypair. That keypair will be used for signing the JWT and for verifying the JWT. This implementation is a ruthlessly minimal approach to support the secure variables feature. When a JWT is verified, the allocation ID will be checked against the Nomad state store, and non-existent or terminal allocation IDs will cause the validation to be rejected. This is sufficient to support the secure variables feature at launch without requiring implementation of a background process to renew soon-to-expire tokens.	2022-07-11 13:34:05 -04:00
Tim Gross	d5a214484c	core job for root key GC (#13199 ) Inactive and unused keys older than a threshold will be periodically garbage collected.	2022-07-11 13:34:04 -04:00
Tim Gross	d29e85d150	secure variables: initial state store (#12932 ) Implement the core SecureVariable and RootKey structs in memdb, provide the minimal skeleton for FSM, and a dummy storage and keyring RPC endpoint.	2022-07-11 13:34:01 -04:00
Seth Hoenig	239eaf9a29	Merge pull request #13626 from hashicorp/b-client-max-kill-timeout client: enforce max_kill_timeout client configuration	2022-07-07 13:44:39 -05:00
Seth Hoenig	5dd8aa3e27	client: enforce max_kill_timeout client configuration This PR fixes a bug where client configuration max_kill_timeout was not being enforced. The feature was introduced in 9f44780 but seems to have been removed during the major drivers refactoring. We can make sure the value is enforced by pluming it through the DriverHandler, which now uses the lesser of the task.killTimeout or client.maxKillTimeout. Also updates Event.SetKillTimeout to require both the task.killTimeout and client.maxKillTimeout so that we don't make the mistake of using the wrong value - as it was being given only the task.killTimeout before.	2022-07-06 15:29:38 -05:00
Luiz Aoqui	74c5578432	api: refactor ACL check for namespace wildcard (#13606 ) Improve how the all namespaces wildcard (``) is handled when checking ACL permissions. When using the wildcard namespace the `AllowNsOp` would return false since it looks for a namespace called `` to match. This commit changes this behavior to return `true` when the queried namespace is `*` and the token allows the operation in _any_ namespace. Actual permission must be checked per object. The helper function `AllowNsOpFunc` returns a function that can be used to make this verification.	2022-07-06 15:22:30 -04:00
James Rasell	0c0b028a59	core: allow deleting of evaluations (#13492 ) * core: add eval delete RPC and core functionality. * agent: add eval delete HTTP endpoint. * api: add eval delete API functionality. * cli: add eval delete command. * docs: add eval delete website documentation.	2022-07-06 16:30:11 +02:00
Michael Schurter	1cc0ae8795	docs: fix Plan{,Result}.NodeUpdate comment (#13534 ) It appears way back when this was first implemented in 9a917281af9c0a97a6c59575eaa52c5c86ffc60d, it was renamed from NodeEvict (with a correct comment) to NodeUpdate. The comment was changed from referring to only evictions to referring to "all allocs" in the first sentence and "stop or evict" in the second. This confuses every time I see it because I read the name (NodeUpdate) and first sentence ("all the allocs") and assume this represents all allocations... which isn't true. I'm going to assume I'm the only one who doesn't read the 2nd sentence and that's why this suboptimal wording has lasted 7 years, but can we change it for my sake?	2022-06-30 12:47:14 -07:00
Shishir Mahajan	6ba8245283	Fix typo: orthogonal. Signed-off-by: Shishir Mahajan <smahajan@roblox.com>	2022-06-27 12:12:51 -07:00
Lance Haig	4bf27d743d	Allow Operator Generated bootstrap token (#12520 )	2022-06-03 07:37:24 -04:00
Seth Hoenig	dca954faac	build: update golangci-lint to v1.46.2 This version of golangci-lint improves support for generics, but also is more strict in copy vs. loop for slice copying.	2022-05-31 23:32:01 +00:00
Seth Hoenig	a5943da0c7	core: add tests for blocked evals math	2022-05-24 09:05:18 -05:00
Luiz Aoqui	a8cc633156	vault: revert support for entity aliases (#12723 ) After a more detailed analysis of this feature, the approach taken in PR #12449 was found to be not ideal due to poor UX (users are responsible for setting the entity alias they would like to use) and issues around jobs potentially masquerading itself as another Vault entity.	2022-04-22 10:46:34 -04:00
James Rasell	716b8e658b	api: Add support for filtering and pagination to the node list endpoint (#12727 )	2022-04-21 17:04:33 +02:00

1 2 3 4 5 ...

1378 Commits