open-nomad

Author	SHA1	Message	Date
grembo	7936c1e33f	Add `disable_file` parameter to job's `vault` stanza (#13343 ) This complements the `env` parameter, so that the operator can author tasks that don't share their Vault token with the workload when using `image` filesystem isolation. As a result, more powerful tokens can be used in a job definition, allowing it to use template stanzas to issue all kinds of secrets (database secrets, Vault tokens with very specific policies, etc.), without sharing that issuing power with the task itself. This is accomplished by creating a directory called `private` within the task's working directory, which shares many properties of the `secrets` directory (tmpfs where possible, not accessible by `nomad alloc fs` or Nomad's web UI), but isn't mounted into/bound to the container. If the `disable_file` parameter is set to `false` (its default), the Vault token is also written to the NOMAD_SECRETS_DIR, so the default behavior is backwards compatible. Even if the operator never changes the default, they will still benefit from the improved behavior of Nomad never reading the token back in from that - potentially altered - location.	2023-06-23 15:15:04 -04:00
James Rasell	78cdf0d0d8	server: remove unused endpoints struct. (#17665 )	2023-06-23 08:20:33 +01:00
Luiz Aoqui	8f05eaaa68	np: check for license on RPC endpoints (#17656 )	2023-06-22 12:52:20 -04:00
Tim Gross	11216d09af	client: send node secret with every client-to-server RPC (#16799 ) In Nomad 1.5.3 we fixed a security bug that allowed bypass of ACL checks if the request came thru a client node first. But this fix broke (knowingly) the identification of many client-to-server RPCs. These will be now measured as if they were anonymous. The reason for this is that many client-to-server RPCs do not send the node secret and instead rely on the protection of mTLS. This changeset ensures that the node secret is being sent with every client-to-server RPC request. In a future version of Nomad we can add enforcement on the server side, but this was left out of this changeset to reduce risks to the safe upgrade path. Sending the node secret as an auth token introduces a new problem during initial introduction of a client. Clients send many RPCs concurrently with `Node.Register`, but until the node is registered the node secret is unknown to the server and will be rejected as invalid. This causes permission denied errors. To fix that, this changeset introduces a gate on having successfully made a `Node.Register` RPC before any other RPCs can be sent (except for `Status.Ping`, which we need earlier but which also ignores the error because that handler doesn't do an authorization check). This ensures that we only send requests with a node secret already known to the server. This also makes client startup a little easier to reason about because we know `Node.Register` must succeed first, and it should make for a good place to hook in future plans for secure introduction of nodes. The tradeoff is that an existing client that has running allocs will take slightly longer (a second or two) to transition to ready after a restart, because the transition in `Node.UpdateStatus` is gated at the server by first submitting `Node.UpdateAlloc` with client alloc updates.	2023-06-22 11:06:49 -04:00
James Rasell	4e2d019639	variables: remove unused state store functions. (#17660 )	2023-06-22 13:54:58 +01:00
James Rasell	71fdd7e891	core: use faster concatenation for alloc name generation. (#17591 )	2023-06-22 07:46:28 +01:00
Luiz Aoqui	ac08fc751b	node pools: apply node pool scheduler configuration (#17598 )	2023-06-21 20:31:50 -04:00
Tim Gross	ff9ba8ff73	scheduler: tolerate having only one dynamic port available (#17619 ) If the dynamic port range for a node is set so that the min is equal to the max, there's only one port available and this passes config validation. But the scheduler panics when it tries to pick a random port. Only add the randomness when there's more than one to pick from. Adds a test for the behavior but also adjusts the commentary on a couple of the existing tests that made it seem like this case was already covered if you didn't look too closely. Fixes: #17585	2023-06-20 13:29:25 -04:00
Luiz Aoqui	2f5df1d8a4	test: add MultiregionMinJob mock (#17614 )	2023-06-20 10:57:02 -04:00
James Rasell	86e4c6cb9d	state: move variables tests to use must library. (#17609 )	2023-06-20 15:46:16 +01:00
James Rasell	68df578c73	state: remove vague scaling event schema todo item. (#17610 )	2023-06-20 15:22:11 +01:00
Luiz Aoqui	a56b10e857	chore: fix typo and copyright header (#17605 )	2023-06-20 10:09:47 -04:00
Luiz Aoqui	cfb3bb517f	np: scheduler configuration updates (#17575 ) * jobspec: rename node pool scheduler_configuration In HCL specifications we usually call configuration blocks `config` instead of `configuration`. * np: add memory oversubscription config * np: make scheduler config ENT	2023-06-19 11:41:46 -04:00
Luiz Aoqui	d5aa72190f	node pools: namespace integration (#17562 ) Add structs and fields to support the Nomad Pools Governance Enterprise feature of controlling node pool access via namespaces. Nomad Enterprise allows users to specify a default node pool to be used by jobs that don't specify one. In order to accomplish this, it's necessary to distinguish between a job that explicitly uses the `default` node pool and one that did not specify any. If the `default` node pool is set during job canonicalization it's impossible to do this, so this commit allows a job to have an empty node pool value during registration but sets to `default` at the admission controller mutator. In order to guarantee state consistency the state store validates that the job node pool is set and exists before inserting it.	2023-06-16 16:30:22 -04:00
Luiz Aoqui	bdc7f3305f	rpc: fix log message in Node.UpdateStatus (#17537 )	2023-06-14 16:51:46 -04:00
Luiz Aoqui	bc17cffaef	node pool: node pool upsert on multiregion node register (#17503 ) When registering a node with a new node pool in a non-authoritative region we can't create the node pool because this new pool will not be replicated to other regions. This commit modifies the node registration logic to only allow automatic node pool creation in the authoritative region. In non-authoritative regions, the client is registered, but the node pool is not created. The client is kept in the `initialing` status until its node pool is created in the authoritative region and replicated to the client's region.	2023-06-13 11:28:28 -04:00
Tim Gross	952eb2713e	node pools: protect against deleting occupied pools (#17457 ) We don't want to delete node pools that have nodes or non-terminal jobs. Add a check in the `DeleteNodePools` RPC to check locally and in federated regions, similar to how we check that it's safe to delete namespaces.	2023-06-13 09:57:42 -04:00
Tim Gross	e8a361310f	node pools: replicate from authoritative region (#17456 ) Upserts and deletes of node pools are forwarded to the authoritative region, just like we do for namespaces, quotas, ACL policies, etc. Replicate node pools from the authoritative region.	2023-06-12 13:24:24 -04:00
Tim Gross	bb7f0edd6a	node pools: prevent panic on upsert during upgrades (#17474 ) Whenever we write a Raft log entry for node pools, we need to first make sure that all servers can safely apply the log without panicking. Gate upsert and delete RPCs on all servers being upgraded to the minimum version.	2023-06-12 09:01:30 -04:00
Tim Gross	e3a37c0b97	replication: fix potential panic during upgrades (#17476 ) If the authoritative region has been upgraded to a version of Nomad that has new replicated objects (such as ACL Auth Methods, ACL Binding Rules, etc.), the non-authoritative regions will start replicating those objects as soon as their leader is upgraded. If a server in the non-authoritative region is upgraded and then becomes the leader before all the other servers in the region have been upgraded, then it will attempt to write a Raft log entry that the followers don't understand. The followers will then panic. Add same the minimum version checks that we do for RPC writes to the leader's replication loop.	2023-06-12 08:53:56 -04:00
Tim Gross	fbaf4c8b69	node pools: implement support in scheduler (#17443 ) Implement scheduler support for node pool: * When a scheduler is invoked, we get a set of the ready nodes in the DCs that are allowed for that job. Extend the filter to include the node pool. * Ensure that changes to a job's node pool are picked up as destructive allocation updates. * Add `NodesInPool` as a metric to all reporting done by the scheduler. * Add the node-in-pool the filter to the `Node.Register` RPC so that we don't generate spurious evals for nodes in the wrong pool.	2023-06-07 10:39:03 -04:00
Tim Gross	c0f2295510	node pools: implement HTTP API to list jobs in pool (#17431 ) Implements the HTTP API associated with the `NodePool.ListJobs` RPC, including the `api` package for the public API and documentation. Update the `NodePool.ListJobs` RPC to fix the missing handling of the special "all" pool.	2023-06-06 11:40:13 -04:00
Luiz Aoqui	2420c93179	node pools: list nodes in pool (#17413 )	2023-06-06 10:43:43 -04:00
Luiz Aoqui	aa1b33d157	node pools: add event stream support (#17412 )	2023-06-06 10:14:47 -04:00
Tim Gross	2d16ec6c6f	node pools: implement RPC to list jobs in a given node pool (#17396 ) Implements the `NodePool.ListJobs` RPC, with pagination and filtering based on the existing `Job.List` RPC.	2023-06-05 15:36:52 -04:00
Luiz Aoqui	700168e136	node pools: fix node upsert and state mutation tests (#17430 )	2023-06-05 14:58:32 -04:00
Luiz Aoqui	6039c18ab6	node pools: register a node in a node pool (#17405 )	2023-06-02 17:50:50 -04:00
Luiz Aoqui	3a962d07f8	np: fix node pool search permission check (#17400 ) When checking if a token is allowed to query the search endpoints we need to return an error if the search context includes `node_pool` and the token doesn't have access to _any_ pool. This prevents returning an empty list instead of a permission denied error.	2023-06-02 12:22:47 -04:00
Samantha	b92a782b6e	check: Add support for Consul field tls_server_name (#17334 )	2023-06-02 10:19:12 -04:00
Tim Gross	56e9b944e8	node pools: validate pool exists on job registration (#17386 ) Add a new job admission hook for node pools that enforces the pool exists on registration. Also provide the skeleton function we need for Enterprise enforcement functions we'll implement later.	2023-06-02 09:32:07 -04:00
Luiz Aoqui	f755b9469f	core: refactor task validation (#17344 ) Move all validations related to task fields to Task.Validate(). Prior to this, some task validations were being done inside TaskGroup.Validate() because they required access to some group values. But similarly to how TaskGroup.Validate() tasks the job as parameter, it's fair to expect the task to receive its group.	2023-06-01 19:26:42 -04:00
Luiz Aoqui	4be8d7c049	core: fix kill_timeout validation when progress_deadline is 0 (#17342 )	2023-06-01 19:01:32 -04:00
Luiz Aoqui	9bb57c08e3	node pool: add search support (#17385 )	2023-06-01 17:48:14 -04:00
Tim Gross	4f14fa0518	node pools: add `node_pool` field to job spec (#17379 ) This changeset only adds the `node_pool` field to the jobspec, and ensures that it gets picked up correctly as a change. Without the rest of the implementation landed yet, the field will be ignored.	2023-06-01 16:08:55 -04:00
Luiz Aoqui	c61e75f302	node pools: add CRUD API (#17384 )	2023-06-01 15:55:49 -04:00
Seth Hoenig	acfdf0f479	compliance: add headers with fixed copywrite tool (#17353 ) Closes #17117	2023-05-30 09:20:32 -05:00
Charlie Voiselle	86e04a4c6c	[core] nil check and error handling for client status in heartbeat responses (#17316 ) Add a nil check to constructNodeServerInfoResponse to manage an apparent race between deregister and client heartbeats. Fixes #17310	2023-05-25 16:04:54 -04:00
Lance Haig	568da5918b	cli: tls certs not created with correct SANs (#16959 ) The `nomad tls cert` command did not create certificates with the correct SANs for them to work with non default domain and region names. This changset updates the code to support non default domains and regions in the certificates.	2023-05-22 09:31:56 -04:00
Roberto Hidalgo	2f702a9f11	allow periodic jobs to use workload identity ACL policies (#17018 ) When resolving ACL policies, we were not using the parent ID for the policy lookup for dispatch/periodic jobs, even though the claims were signed for that parent ID. This prevents all calls to the Task API (and other WI-authenticated API calls) from a periodically-dispatched job failing with 403. Fix this by using the parent job ID whenever it's available.	2023-05-22 09:19:16 -04:00
Phil Renaud	7e56ca62d1	[ui] Adds a "Scheduling" filter to the job.allocations page (#17227 ) * Basic filter concept * Make sure NextAllocation gets sent up with allocation stub	2023-05-18 16:24:41 -04:00
James Rasell	96f7c84e4e	variable: fixup metadata copy comment and remove unrequired type. (#17234 )	2023-05-18 13:49:41 +01:00
Piotr Kazmierczak	fe272c3686	refactor acl.UpsertTokens to avoid unnecessary RPC calls. (#17194 ) New RPC endpoints introduced during OIDC and JWT auth perform unnecessary many RPC calls when they upsert generated ACL tokens, as pointed out by @tgross. This PR moves the common logic from acl.UpsertTokens method into a helper method that contains common logic, and sidesteps authentication, metrics, etc.	2023-05-16 09:31:51 +02:00
Luiz Aoqui	389212bfda	node pool: initial base work (#17163 ) Implementation of the base work for the new node pools feature. It includes a new `NodePool` struct and its corresponding state store table. Upon start the state store is populated with two built-in node pools that cannot be modified nor deleted: * `all` is a node pool that always includes all nodes in the cluster. * `default` is the node pool where nodes that don't specify a node pool in their configuration are placed.	2023-05-15 10:49:08 -04:00
Seth Hoenig	81e36b3650	core: eliminate second index on job_submissions table (#17146 ) * core: eliminate second index on job_submissions table This PR refactors the job_submissions state store code to eliminate the use of a second index formerly used for purging all versions of a given job. In practice we ended up with duplicate entries on the table. Instead, use index prefix scanning on the primary index and tidy up any potential for creating (or removing) duplicates. * core: pr comments followup	2023-05-11 09:51:08 -05:00
Tim Gross	9ed75e1f72	client: de-duplicate alloc updates and gate during restore (#17074 ) When client nodes are restarted, all allocations that have been scheduled on the node have their modify index updated, including terminal allocations. There are several contributing factors: * The `allocSync` method that updates the servers isn't gated on first contact with the servers. This means that if a server updates the desired state while the client is down, the `allocSync` races with the `Node.ClientGetAlloc` RPC. This will typically result in the client updating the server with "running" and then immediately thereafter "complete". * The `allocSync` method unconditionally sends the `Node.UpdateAlloc` RPC even if it's possible to assert that the server has definitely seen the client state. The allocrunner may queue-up updates even if we gate sending them. So then we end up with a race between the allocrunner updating its internal state to overwrite the previous update and `allocSync` sending the bogus or duplicate update. This changeset adds tracking of server-acknowledged state to the allocrunner. This state gets checked in the `allocSync` before adding the update to the batch, and updated when `Node.UpdateAlloc` returns successfully. To implement this we need to be able to equality-check the updates against the last acknowledged state. We also need to add the last acknowledged state to the client state DB, otherwise we'd drop unacknowledged updates across restarts. The client restart test has been expanded to cover a variety of allocation states, including allocs stopped before shutdown, allocs stopped by the server while the client is down, and allocs that have been completely GC'd on the server while the client is down. I've also bench tested scenarios where the task workload is killed while the client is down, resulting in a failed restore. Fixes #16381	2023-05-11 09:05:24 -04:00
Seth Hoenig	74714272cc	api: set the job submission during job reversion (#17097 ) * api: set the job submission during job reversion This PR fixes a bug where the job submission would always be nil when a job goes through a reversion to a previous version. Basically we need to detect when this happens, lookup the submission of the job version being reverted to, and set that as the submission of the new job being created. * e2e: add e2e test for job submissions during reversion This e2e test ensures a reverted job inherits the job submission associated with the version of the job being reverted to.	2023-05-08 14:18:34 -05:00
Daniel Bennett	a7ed6f5c53	full task cleanup when alloc prerun hook fails (#17104 ) to avoid leaking task resources (e.g. containers, iptables) if allocRunner prerun fails during restore on client restart. now if prerun fails, TaskRunner.MarkFailedKill() will only emit an event, mark the task as failed, and cancel the tr's killCtx, so then ar.runTasks() -> tr.Run() can take care of the actual cleanup. removed from (formerly) tr.MarkFailedDead(), now handled by tr.Run(): * set task state as dead * save task runner local state * task stop hooks also done in tr.Run() now that it's not skipped: * handleKill() to kill tasks while respecting their shutdown delay, and retrying as needed * also includes task preKill hooks * clearDriverHandle() to destroy the task and associated resources * task exited hooks	2023-05-08 13:17:10 -05:00
stswidwinski	9c1c2cb5d2	Correct the status description and modify time of canceled evals. (#17071 ) Fix for #17070. Corrected the status description and modify time of evals which are canceled due to another eval having completed in the meantime.	2023-05-08 08:50:36 -04:00
Seth Hoenig	fff2eec625	connect: use heuristic to detect sidecar task driver (#17065 ) * connect: use heuristic to detect sidecar task driver This PR adds a heuristic to detect whether to use the podman task driver for the connect sidecar proxy. The podman driver will be selected if there is at least one task in the task group configured to use podman, and there are zero tasks in the group configured to use docker. In all other cases the task driver defaults to docker. After this change, we should be able to run typical Connect jobspecs (e.g. nomad job init [-short] -connect) on Clusters configured with the podman task driver, without modification to the job files. Closes #17042 * golf: cleanup driver detection logic	2023-05-05 10:19:30 -05:00
James Rasell	6ec4a69f47	scale: fixed a bug where evals could be created with wrong type. (#17092 ) The job scale RPC endpoint hard-coded the eval creation to use the type of service. This meant scaling events triggered on jobs of type batch would create evaluations with the wrong type, which does not seem to cause any problems, just confusion when correlating the two.	2023-05-05 14:46:10 +01:00

1 2 3 4 5 ...

4374 commits