open-nomad

Commit Graph

Author	SHA1	Message	Date
jmwilkinson	37834dffda	Allow wildcard datacenters to be specified in job file (#11170 ) Also allows for default value of `datacenters = ["*"]`	2023-02-02 09:57:45 -05:00
Seth Hoenig	ca7ead191e	consul: restore consul token when reverting a job (#15996 ) * consul: reset consul token on job during registration of a reversion * e2e: add test for reverting a job with a consul service * cl: fixup cl entry	2023-02-01 14:02:45 -06:00
James Rasell	9e8325d63c	acl: fix a bug in token creation when parsing expiration TTLs. (#15999 ) The ACL token decoding was not correctly handling time duration syntax such as "1h" which forced people to use the nanosecond representation via the HTTP API. The change adds an unmarshal function which allows this syntax to be used, along with other styles correctly.	2023-02-01 17:43:41 +01:00
James Rasell	67acfd9f6b	acl: return 400 not 404 code when creating an invalid policy. (#16000 )	2023-02-01 17:40:15 +01:00
Mike Nomitch	80848b202e	Increases max variable size to 64KiB from 16KiB (#15983 )	2023-01-31 13:32:36 -05:00
stswidwinski	16eefbbf4d	GC: ensure no leakage of evaluations for batch jobs. (#15097 ) Prior to 2409f72 the code compared the modification index of a job to itself. Afterwards, the code compared the creation index of the job to itself. In either case there should never be a case of re-parenting of allocs causing the evaluation to trivially always result in false, which leads to unreclaimable memory. Prior to this change allocations and evaluations for batch jobs were never garbage collected until the batch job was explicitly stopped. The new `batch_eval_gc_threshold` server configuration controls how often they are collected. The default threshold is `24h`.	2023-01-31 13:32:14 -05:00
Jorge Marey	d1c9aad762	Rename fields on proxyConfig (#15541 ) * Change api Fields for expose and paths * Add changelog entry * changelog: add deprecation notes about connect fields * api: minor style tweaks --------- Co-authored-by: Seth Hoenig <shoenig@duck.com>	2023-01-30 09:31:16 -06:00
Piotr Kazmierczak	14b53df3b6	renamed stanza to block for consistency with other projects (#15941 )	2023-01-30 15:48:43 +01:00
Seth Hoenig	074b76e3bf	consul: check for acceptable service identity on consul tokens (#15928 ) When registering a job with a service and 'consul.allow_unauthenticated=false', we scan the given Consul token for an acceptable policy or role with an acceptable policy, but did not scan for an acceptable service identity (which is backed by an acceptable virtual policy). This PR updates our consul token validation to also accept a matching service identity when registering a service into Consul. Fixes #15902	2023-01-27 18:15:51 -06:00
Tim Gross	881a4cfaff	metrics: Add remaining server RPC rate metrics (#15901 )	2023-01-27 08:29:53 -05:00
Tim Gross	ce3eef8037	metrics: Add rate metrics to Client CSI endpoints (#15905 ) Also tightens up authentication for these endpoints by enforcing the server certificate name is valid. We protect these endpoints currently by mTLS and can't use an auth token because these endpoints are (uniquely) called by the leader and followers for a given node won't have the leader's ephemeral ACL token. Add a certificate name check that requests come from a server and not a client, because no client should ever send these RPCs directly.	2023-01-26 16:40:58 -05:00
Tim Gross	bed8716e44	metrics: Add metrics to unauthenticated endpoints (#15899 )	2023-01-26 15:05:51 -05:00
Tim Gross	5e75ea9fb3	metrics: Add RPC rate metrics to endpoints that validate TLS names (#15900 )	2023-01-26 15:04:25 -05:00
Yorick Gersie	2a5c423ae0	Allow per_alloc to be used with host volumes (#15780 ) Disallowing per_alloc for host volumes in some cases makes life of a nomad user much harder. When we rely on the NOMAD_ALLOC_INDEX for any configuration that needs to be re-used across restarts we need to make sure allocation placement is consistent. With CSI volumes we can use the `per_alloc` feature but for some reason this is explicitly disabled for host volumes. Ensure host volumes understand the concept of per_alloc	2023-01-26 09:14:47 -05:00
Piotr Kazmierczak	f4d6efe69f	acl: make auth method default across all types (#15869 )	2023-01-26 14:17:11 +01:00
James Rasell	5d33891910	sso: allow binding rules to create management ACL tokens. (#15860 ) * sso: allow binding rules to create management ACL tokens. * docs: update binding rule docs to detail management type addition.	2023-01-26 09:57:44 +01:00
Tim Gross	6677a103c2	metrics: measure rate of RPC requests that serve API (#15876 ) This changeset configures the RPC rate metrics that were added in #15515 to all the RPCs that support authenticated HTTP API requests. These endpoints were already configured with pre-forwarding authentication in #15870, and a handful of others were done already as part of the proof-of-concept work. So this changeset is entirely copy-and-pasting one method call into a whole mess of handlers. Upcoming PRs will wire up pre-forwarding auth and rate metrics for the remaining set of RPCs that have no API consumers or aren't authenticated, in smaller chunks that can be more thoughtfully reviewed.	2023-01-25 16:37:24 -05:00
Luiz Aoqui	3479e2231f	core: enforce strict steps for clients reconnect (#15808 ) When a Nomad client that is running an allocation with `max_client_disconnect` set misses a heartbeat the Nomad server will update its status to `disconnected`. Upon reconnecting, the client will make three main RPC calls: - `Node.UpdateStatus` is used to set the client status to `ready`. - `Node.UpdateAlloc` is used to update the client-side information about allocations, such as their `ClientStatus`, task states etc. - `Node.Register` is used to upsert the entire node information, including its status. These calls are made concurrently and are also running in parallel with the scheduler. Depending on the order they run the scheduler may end up with incomplete data when reconciling allocations. For example, a client disconnects and its replacement allocation cannot be placed anywhere else, so there's a pending eval waiting for resources. When this client comes back the order of events may be: 1. Client calls `Node.UpdateStatus` and is now `ready`. 2. Scheduler reconciles allocations and places the replacement alloc to the client. The client is now assigned two allocations: the original alloc that is still `unknown` and the replacement that is `pending`. 3. Client calls `Node.UpdateAlloc` and updates the original alloc to `running`. 4. Scheduler notices too many allocs and stops the replacement. This creates unnecessary placements or, in a different order of events, may leave the job without any allocations running until the whole state is updated and reconciled. To avoid problems like this clients must update _all_ of their relevant information before they can be considered `ready` and available for scheduling. To achieve this goal the RPC endpoints mentioned above have been modified to enforce strict steps for nodes reconnecting: - `Node.Register` does not set the client status anymore. - `Node.UpdateStatus` sets the reconnecting client to the `initializing` status until it successfully calls `Node.UpdateAlloc`. These changes are done server-side to avoid the need of additional coordination between clients and servers. Clients are kept oblivious of these changes and will keep making these calls as they normally would. The verification of whether allocations have been updated is done by storing and comparing the Raft index of the last time the client missed a heartbeat and the last time it updated its allocations.	2023-01-25 15:53:59 -05:00
Tim Gross	f3f64af821	WI: allow workloads to use RPCs associated with HTTP API (#15870 ) This changeset allows Workload Identities to authenticate to all the RPCs that support HTTP API endpoints, for use with PR #15864. * Extends the work done for pre-forwarding authentication to all RPCs that support a HTTP API endpoint. * Consolidates the auth helpers used by the CSI, Service Registration, and Node endpoints that are currently used to support both tokens and client secrets. Intentionally excluded from this changeset: * The Variables endpoint still has custom handling because of the implicit policies. Ideally we'll figure out an efficient way to resolve those into real policies and then we can get rid of that custom handling. * The RPCs that don't currently support auth tokens (i.e. those that don't support HTTP endpoints) have not been updated with the new pre-forwarding auth. We'll be doing this under a separate PR to support RPC rate metrics.	2023-01-25 14:33:06 -05:00
Tim Gross	cf9e5f3327	acl: Fix panic when bogus token is passed (#15863 ) If a consumer of the new `Authenticate` method gets passed a bogus token that's a correctly-shaped UUID, it will correctly get an identity without an ACL token. But most consumers will then panic when they consume this nil `ACLToken` for authorization. Because no API client should ever send a bogus auth token, update the `Authenticate` method to create the identity with remote IP (for metrics tracking) but also return an `ErrPermissionDenied`.	2023-01-25 10:03:17 -05:00
Tim Gross	055434cca9	add metric for count of RPC requests (#15515 ) Implement a metric for RPC requests with labels on the identity, so that administrators can monitor the source of requests within the cluster. This changeset demonstrates the change with the new `ACL.WhoAmI` RPC, and we'll wire up the remaining RPCs once we've threaded the new pre-forwarding authentication through them all. Note that metrics are measured after we forward but before we return any authentication error. This ensures that we only emit metrics on the server that actually serves the request. We'll perform rate limiting at the same place. Includes telemetry configuration to omit identity labels.	2023-01-24 11:54:20 -05:00
Tim Gross	2030d62920	implement pre-forwarding auth on select RPCs (#15513 ) In #15417 we added a new `Authenticate` method to the server that returns an `AuthenticatedIdentity` struct. This changeset implements this method for a small number of RPC endpoints that together represent all the various ways in which RPCs are sent, so that we can validate that we're happy with this approach.	2023-01-24 10:52:07 -05:00
Michael Schurter	ace5faf948	core: backoff considerably when worker is behind raft (#15523 ) Upon dequeuing an evaluation workers snapshot their state store at the eval's wait index or later. This ensures we process an eval at a point in time after it was created or updated. Processing an eval on an old snapshot could cause any number of problems such as: 1. Since job registration atomically updates an eval and job in a single raft entry, scheduling against indexes before that may not have the eval's job or may have an older version. 2. The older the scheduler's snapshot, the higher the likelihood something has changed in the cluster state which will cause the plan applier to reject the scheduler's plan. This could waste work or even cause evals to be failed needlessly. However, the workers run in parallel with a new server pulling the cluster state from a peer. During this time, which may be many minutes long, the state store is likely far behind the minimum index required to process evaluations. This PR addresses this by adding an additional long backoff period after an eval is nacked. If the scheduler's indexes catch up within the additional backoff, it will unblock early to dequeue the next eval. When the server shuts down we'll get a `context.Canceled` error from the state store method. We need to bubble this error up so that other callers can detect it. Handle this case separately when waiting after dequeue so that we can warn on shutdown instead of throwing an ambiguous error message with just the text "canceled." While there may be more precise ways to block scheduling until the server catches up, this approach adds little risk and covers additional cases where a server may be temporarily behind due to a spike in load or a saturated network. For testing, we make the `raftSyncLimit` into a parameter on the worker's `run` method so that we can run backoff tests without waiting 30+ seconds. We haven't followed thru and made all the worker globals into worker parameters, because there isn't much use outside of testing, but we can consider that in the future. Co-authored-by: Tim Gross <tgross@hashicorp.com>	2023-01-24 08:56:35 -05:00
Tim Gross	a51149736d	Rename `nomad.broker.total_blocked` metric (#15835 ) This changeset fixes a long-standing point of confusion in metrics emitted by the eval broker. The eval broker has a queue of "blocked" evals that are waiting for an in-flight ("unacked") eval of the same job to be completed. But this "blocked" state is not the same as the `blocked` status that we write to raft and expose in the Nomad API to end users. There's a second metric `nomad.blocked_eval.total_blocked` that refers to evaluations in that state. This has caused ongoing confusion in major customer incidents and even in our own documentation! (Fixed in this PR.) There's little functional change in this PR aside from the name of the metric emitted, but there's a bit of refactoring to clean up the names in `eval_broker.go` so that there aren't name collisions and multiple names for the same state. Changes included are: * Everything that was previously called "pending" referred to entities that were associated with the "ready" metric. These are all now called "ready" to match the metric. * Everything named "blocked" in `eval_broker.go` is now named "pending", except for a couple of comments that actually refer to blocked RPCs. * Added a note to the upgrade guide docs for 1.5.0. * Fixed the scheduling performance metrics docs because the description for `nomad.broker.total_blocked` was actually the description for `nomad.blocked_eval.total_blocked`.	2023-01-20 14:23:56 -05:00
Charlie Voiselle	5ea1d8a970	Add raft snapshot configuration options (#15522 ) * Add config elements * Wire in snapshot configuration to raft * Add hot reload of raft config * Add documentation for new raft settings * Add changelog	2023-01-20 14:21:51 -05:00
Seth Hoenig	d2d8ebbeba	consul: correctly interpret missing consul checks as unhealthy (#15822 ) * consul: correctly understand missing consul checks as unhealthy This PR fixes a bug where Nomad assumed any registered Checks would exist in the service registration coming back from Consul. In some cases, Consul may be slow in processing the check registration, and the response object would not contain checks. Nomad would then scan the empty response looking for Checks with failing health status, finding none, and then marking a task/alloc as healthy. In reality, we must always use Nomad's view of what checks should exist as the source of truth, and compare that with the response Consul gives us, making sure they match, before scanning the Consul response for failing check statuses. Fixes #15536 * consul: minor CR refactor using maps not sets * consul: observe transition from healthy to unhealthy checks * consul: spell healthy correctly	2023-01-19 14:01:12 -06:00
James Rasell	fad9b40e53	Merge branch 'main' into sso/gh-13120-oidc-login	2023-01-18 10:05:31 +00:00
Phil Renaud	98c5259f3e	[sso] OIDC Updates for the UI (#15804 ) * Updated UI to handle OIDC method changes * Remove redundant store unload call	2023-01-17 17:01:47 -05:00
James Rasell	abe8e1cf29	updates based on code review from @tgross.	2023-01-17 08:45:17 +00:00
James Rasell	d29d3412d8	rpc: add OIDC login related endpoints. This adds new OIDC endpoints on the RPC endpoint. These two RPCs handle generating the OIDC provider URL and then completing the login by exchanging the provider token with an internal Nomad token. The RPC endpoints both do double forwarding. The initial forward is to ensure we are talking to the regional leader; the second then takes into account whether the auth method generates local or global tokens. If it creates global tokens, we must then forward onto the federated regional leader.	2023-01-13 13:14:29 +00:00
Seth Hoenig	fe7795ce16	consul/connect: support for proxy upstreams opaque config (#15761 ) This PR adds support for configuring `proxy.upstreams[].config` for Consul Connect upstreams. This is an opaque config value to Nomad - the data is passed directly to Consul and is unknown to Nomad.	2023-01-12 08:20:54 -06:00
Anthony Davis	1c32471805	Fix rejoin_after_leave behavior (#15552 )	2023-01-11 16:39:24 -05:00
Daniel Bennett	7d1059b5ae	connect: ingress gateway validation for http hosts and wildcards (#15749 ) * connect: fix non-"tcp" ingress gateway validation changes apply to http, http2, and grpc: * if "hosts" is excluded, consul will use its default domain e.g. <service-name>.ingress.dc1.consul * can't set hosts with "" service name test http2 and grpc too	2023-01-11 11:52:32 -06:00
Seth Hoenig	719eee8112	consul: add client configuration for grpc_ca_file (#15701 ) * [no ci] first pass at plumbing grpc_ca_file * consul: add support for grpc_ca_file for tls grpc connections in consul 1.14+ This PR adds client config to Nomad for specifying consul.grpc_ca_file These changes combined with https://github.com/hashicorp/consul/pull/15913 should finally enable Nomad users to upgrade to Consul 1.14+ and use tls grpc connections. * consul: add cl entry for grpc_ca_file * docs: mention grpc_tls changes due to Consul 1.14	2023-01-11 09:34:28 -06:00
Seth Hoenig	83450c8762	vault: configure user agent on Nomad vault clients (#15745 ) * vault: configure user agent on Nomad vault clients This PR attempts to set the User-Agent header on each Vault API client created by Nomad. Still need to figure a way to set User-Agent on the Vault client created internally by consul-template. * vault: fixup find-and-replace gone awry	2023-01-10 10:39:45 -06:00
Piotr Kazmierczak	be36a1924f	acl: binding rules evaluation (#15697 ) Binder provides an interface for binding claims and ACL roles/policies of Nomad.	2023-01-10 16:08:08 +01:00
Tim Gross	32f6ce1c54	Authenticate method improvements (#15734 ) This changeset covers a sidebar discussion that @schmichael and I had around the design for pre-forwarding auth. This includes some changes extracted out of #15513 to make it easier to review both and leave a clean history. * Remove fast path for NodeID. Previously-connected clients will have a NodeID set on the context, and because this is a large portion of the RPCs sent we fast-pathed it at the top of the `Authenticate` method. But the context is shared for all yamux streams over the same yamux session (and TCP connection). This lets an authenticated HTTP request to a client use the NodeID for authentication, which is a privilege escalation. Remove the fast path and annotate it so that we don't break it again. * Add context to decisions around AuthenticatedIdentity. The `Authenticate` method taken on its own looks like it wants to return an `acl.ACL` that folds over all the various identity types (creating an ephemeral ACL on the fly if necessary). But keeping these fields independent allows RPC handlers to differentiate between internal and external origins so we most likely want to avoid this. Leave some docstrings as a warning as to why this is built the way it is. * Mutate the request rather than returning. When reviewing #15513 we decided that forcing the request handler to call `SetIdentity` was repetitive and error prone. Instead, the `Authenticate` method mutates the request by setting its `AuthenticatedIdentity`.	2023-01-10 09:46:38 -05:00
Seth Hoenig	84cb5fb03d	deps: update shoenig/test to v0.6.0 (#15715 ) Adds support for custom cmp.Options; need to fix one minor thing causing api breakage.	2023-01-09 09:37:08 -06:00
Seth Hoenig	7214e21402	ci: swap freeport for portal in packages (#15661 )	2023-01-03 11:25:20 -06:00
Seth Hoenig	266ca25a81	cleanup: remove usage of consul/sdk/testutil/retry (#15609 ) This PR removes usages of `consul/sdk/testutil/retry`, as part of the ongoing effort to remove use of any non-API module from Consul. There is one remaining usage in the helper/freeport package, but that will get removed as part of #15589	2023-01-02 08:06:20 -06:00
Piotr Kazmierczak	3fd9fbeece	fix failing UpsertBindingRules unit test (#15604 ) UpsertBindingRules RPC changed in eacecb8, validation happens after the ID check now, because we don't want validation to fail for update payloads which may contain incomplete objects.	2022-12-21 15:19:09 +01:00
Piotr Kazmierczak	b500dfa969	bugfix: unit test for GetACLBindingRules (#15583 ) Unit test for GetACLBindingRules state store method would fail because we'd expect order of returned items.	2022-12-20 15:06:09 +01:00
Piotr Kazmierczak	20a01a0bba	acl: modify update endpoints behavior (#15580 ) API and RPC endpoints for ACLAuthMethods and ACLBindingRules should allow users to send incomplete objects in order to, e.g., update single fields. This PR provides "merging" functionality for these endpoints.	2022-12-20 11:22:19 +01:00
James Rasell	b8aa53d09f	core: add ACL binding rule to replication system. (#15555 ) ACL binding rule create and deletes are always forwarded to the authoritative region. In order to make these available in federated regions, the leaders in these regions need to replicate from the authoritative.	2022-12-16 09:08:00 +01:00
James Rasell	95c9ffa505	ACL: add ACL binding rule RPC and HTTP API handlers. (#15529 ) This change adds the RPC ACL binding rule handlers. These handlers are responsible for the creation, updating, reading, and deletion of binding rules. The write handlers are feature gated so that they can only be used when all federated servers are running the required version. The HTTP API handlers and API SDK have also been added where required. This allows the endpoints to be called from the API by users and clients.	2022-12-15 09:18:55 +01:00
James Rasell	13f207ea78	events: add ACL binding rules to core events stream topics. (#15544 )	2022-12-14 14:49:49 +01:00
James Rasell	3c941c6bc3	acl: add binding rule object state schema and functionality. (#15511 ) This change adds a new table that will store ACL binding rule objects. The two indexes allow fast lookups by their ID, or by which auth method they are linked to. Snapshot persist and restore functionality ensures this table can be saved and restored from snapshots. In order to write and delete the object to state, new Raft messages have been added. All RPC request and response structs, along with object functions such as diff and canonicalize have been included within this work as it is nicely separated from the other areas of work.	2022-12-14 08:48:18 +01:00
Seth Hoenig	be3f89b5f9	artifact: enable inheriting environment variables from client (#15514 ) * artifact: enable inheriting environment variables from client This PR adds client configuration for specifying environment variables that should be inherited by the artifact sandbox process from the Nomad Client agent. Most users should not need to set these values but the configuration is provided to ensure backwards compatibility. Configuration of go-getter should ideally be done through the artifact block in a jobspec task. e.g. ```hcl client { artifact { set_environment_variables = "TMPDIR,GIT_SSH_OPTS" } } ``` Closes #15498 * website: update set_environment_variables text to mention PATH	2022-12-09 15:46:07 -06:00
Piotr Kazmierczak	db98e26375	bugfix: acl sso auth methods test failures (#15512 ) This PR fixes unit test failures introduced in f4e89e2	2022-12-09 18:47:32 +01:00
Piotr Kazmierczak	08f50f7dbf	acl: make sure there is only one default Auth Method per type (#15504 ) This PR adds a check that makes sure we don't insert a duplicate default ACL auth method for a given type.	2022-12-09 14:46:54 +01:00

4246 Commits
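
Note on d2d8ebbeba (consul: correctly interpret missing consul checks as unhealthy): the message explains that the set of checks Nomad expects must be the source of truth, so a check absent from Consul's response cannot count as healthy. The Go sketch below shows that comparison in its simplest form. The function name, the check names, and the map-based response shape are illustrative assumptions, not the code from the PR.

```go
package main

import "fmt"

// allChecksHealthy (hypothetical) walks the EXPECTED check names rather than
// the reported ones. A check that is missing from the reported statuses is
// treated as unhealthy, so an empty or partial response never looks healthy.
func allChecksHealthy(expected []string, reported map[string]string) bool {
	for _, name := range expected {
		status, ok := reported[name]
		if !ok || status != "passing" {
			return false
		}
	}
	return true
}

func main() {
	// Hypothetical check names used only for this illustration.
	expected := []string{"http-alive", "db-ping"}

	// Empty response: the checks have not been registered yet.
	fmt.Println(allChecksHealthy(expected, map[string]string{}))

	// Partial response: one expected check is still missing.
	fmt.Println(allChecksHealthy(expected, map[string]string{
		"http-alive": "passing",
	}))

	// Complete response with every expected check passing.
	fmt.Println(allChecksHealthy(expected, map[string]string{
		"http-alive": "passing",
		"db-ping":    "passing",
	}))
}
```

Iterating over the expected names, instead of scanning the response for failures, is what prevents an empty response from being read as healthy.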