open-nomad

Author	SHA1	Message	Date
Seth Hoenig	56b0b456dc	Merge pull request #14102 from hashicorp/cleanup-mesh-gateway-value cleanup: consul mesh gateway type need not be pointer	2022-08-16 10:07:16 -05:00
Tim Gross	4005759d28	move secure variable conflict resolution to state store (#13922 ) Move conflict resolution implementation into the state store with a new Apply RPC. This also makes the RPC for secure variables much more similar to Consul's KV, which will help us support soft deletes in a post-1.4.0 version of Nomad. Reimplement quotas in the state store functions. Co-authored-by: Charlie Voiselle <464492+angrycub@users.noreply.github.com>	2022-08-15 11:19:53 -04:00
Seth Hoenig	f9355c29fb	cleanup: consul mesh gateway type need not be pointer This PR changes the use of structs.ConsulMeshGateway to value types instead of via pointers. This will help in a follow up PR where we cleanup a lot of custom comparison code with helper functions instead.	2022-08-13 11:26:58 -05:00
James Rasell	581a5bb6ad	rpc: add ACL Role RPC endpoint for CRUD actions. New ACL Role RPC endpoints have been created to allow the creation, update, read, and deletion of ACL roles. All endpoints require a management token; in the future readers will also be allowed to view roles associated to their ACL token. The create endpoint in particular is responsible for deduplicating ACL policy links and ensuring named policies are found within state. This is done within the RPC handler so we perform a single loop through the links for slight efficiency.	2022-08-11 08:43:50 +01:00
James Rasell	e660c9a908	core: add ACL role state schema and functionality. (#13955 ) This commit includes the new state schema for ACL roles along with state interaction functions for CRUD actions. The change also includes snapshot persist and restore functionality and the addition of FSM messages for Raft updates which will come via RPC endpoints.	2022-08-09 09:33:41 +02:00
Seth Hoenig	0b52c27a15	Merge pull request #14045 from Abirdcfly/main fix minor unreachable code caused by t.Fatal	2022-08-08 11:47:02 -05:00
Abirdcfly	d66943d4f7	fix minor unreachable code caused by t.Fatal Signed-off-by: Abirdcfly <fp544037857@gmail.com>	2022-08-08 23:50:11 +08:00
Seth Hoenig	2b6bda49b9	core: automatically plumb task name into task-level services and checks	2022-08-05 12:42:41 -05:00
Seth Hoenig	f6f26fb72c	nsd: add support for setting request body in http checks This PR adds support for setting check.body in checks of services making use of Nomad's service provider.	2022-08-04 14:40:23 -05:00
Seth Hoenig	dcda57e729	nsd: add support for setting headers on nomad service http checks This PR enables setting of the headers block on services registered into Nomad's service provider. Works just like the existing support in Consul checks.	2022-08-03 10:06:44 -05:00
Seth Hoenig	067aa00a6a	Merge pull request #13953 from hashicorp/f-nsd-check-methods nsd: add support for specifying check.method in nomad service checks	2022-08-03 08:28:38 -05:00
Piotr Kazmierczak	530280505f	client: enable specifying user/group permissions in the template stanza (#13755 ) * Adds Uid/Gid parameters to template. * Updated diff_test * fixed order * update jobspec and api * removed obsolete code * helper functions for jobspec parse test * updated documentation * adjusted API jobs test. * propagate uid/gid setting to job_endpoint * adjusted job_endpoint tests * making uid/gid into pointers * refactor * updated documentation * updated documentation * Update client/allocrunner/taskrunner/template/template_test.go Co-authored-by: Luiz Aoqui <luiz@hashicorp.com> * Update website/content/api-docs/json-jobs.mdx Co-authored-by: Luiz Aoqui <luiz@hashicorp.com> * propagating documentation change from Luiz * formatting * changelog entry * changed changelog entry Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2022-08-02 22:15:38 +02:00
Eric Weber	cbce13c1ac	Add stage_publish_base_dir field to csi_plugin stanza of a job (#13919 ) * Allow specification of CSI staging and publishing directory path * Add website documentation for stage_publish_dir * Replace erroneous reference to csi_plugin.mount_config with csi_plugin.mount_dir * Avoid requiring CSI plugins to be redeployed after introducing StagePublishDir	2022-08-02 09:42:44 -04:00
Tim Gross	e5ac6464f6	secure vars: enforce ENT quotas (OSS work) (#13951 ) Move the secure variables quota enforcement calls into the state store to ensure quota checks are atomic with quota updates (in the same transaction). Switch to a machine-size int instead of a uint64 for quota tracking. The ENT-side quota spec is described as int, and negative values have a meaning as "not permitted at all". Using the same type for tracking will make it easier to do the math around checks, and uint64 is infeasibly large anyways. Add secure vars to quota HTTP API and CLI outputs and API docs.	2022-08-02 09:32:09 -04:00
James Rasell	663aa92b7a	Merge branch 'main' into f-gh-13120-sso-umbrella	2022-08-02 08:30:03 +01:00
Seth Hoenig	a4d4a76994	nsd: add support for specifying check.method in nomad service checks Unblock 'check.method' in service validation. Add tests around making sure this value gets plumbed through.	2022-08-01 16:13:48 -05:00
Seth Hoenig	d8fe1d10ba	cleanup: use constants for on_update values	2022-07-21 13:09:47 -05:00
Seth Hoenig	c61e779b48	Merge pull request #13715 from hashicorp/dev-nsd-checks client: add support for checks in nomad services	2022-07-21 10:22:57 -05:00
Seth Hoenig	606e3ebdd4	client: updates from pr feedback	2022-07-21 09:54:27 -05:00
Seth Hoenig	8e6eeaa37e	Merge pull request #13869 from hashicorp/b-uniq-services-2 servicedisco: ensure service uniqueness in job validation	2022-07-21 08:24:24 -05:00
Tim Gross	d11da1df5c	workload identity: use parent ID for dispatch/periodic jobs (#13748 ) Workload identities grant implicit access to policies, and operators will not want to craft separate policies for each invocation of a periodic or dispatch job. Use the parent job's ID as the JobID claim.	2022-07-21 09:05:54 -04:00
Tim Gross	97a6346da0	keyring: use nanos for `CreateTime` in key metadata (#13849 ) Most of our objects use int64 timestamps derived from `UnixNano()` instead of `time.Time` objects. Switch the keyring metadata to use `UnixNano()` for consistency across the API.	2022-07-20 14:46:57 -04:00
Tim Gross	428e23043c	secure vars: limit maximum size of variable data (#13743 ) To discourage accidentally DoS'ing the cluster with secure variables data, we're providing a very low limit to the maximum size of a given secure variable. This currently matches the limit for dispatch payloads. In future versions, we may increase this limit or make it configurable, once we have better metrics from real-world operators.	2022-07-20 14:46:43 -04:00
Seth Hoenig	e5978a9cbf	jobspec: ensure service uniqueness in job validation	2022-07-20 12:38:08 -05:00
Seth Hoenig	d83aae253f	cleanup: track task names and providers using set	2022-07-20 11:48:36 -05:00
Tim Gross	ea38582b40	secure vars: rename automatically accessible vars path for jobs (#13848 ) Tasks are automatically granted access to variables on a path that matches their workload identity, with a well-known prefix. Change the prefix to `nomad/jobs` to allow for future prefixes like `nomad/volumes` or `nomad/plugins`. Reserve the prefix by emitting errors during validation.	2022-07-19 16:17:34 -04:00
James Rasell	9264f07cc1	core: add expired token garbage collection periodic jobs. (#13805 ) Two new periodic core jobs have been added which handle removing expired local and global tokens from state. The local core job is run on every leader; the global core job is only run on the leader within the authoritative region.	2022-07-19 15:37:46 +02:00
Tim Gross	cfa2cb140e	fsm: one-time token expiration should be deterministic (#13737 ) When applying a raft log to expire ACL tokens, we need to use a timestamp provided by the leader so that the result is deterministic across servers. Use leader's timestamp from RPC call	2022-07-18 14:19:29 -04:00
James Rasell	a8a8b1f84f	acl: add token expiry checking to ACL token resolution. (#13756 ) This commit adds basic expiry checking when performing ACL token resolution. This expiry checking is local to each server and does not at this time take into account potential time skew on server hosts. A new error message has been created so clients whose token has expired get a clear message, rather than a generic token not found. The ACL resolution tests have been refactored into table driven tests, so additions are easier in the future.	2022-07-15 15:20:50 +02:00
James Rasell	0cde3182eb	core: add ACL token expiry state, struct, and RPC handling. (#13718 ) The ACL token state schema has been updated to utilise two new indexes which track expiration of tokens that are configured with an expiration TTL or time. A new state function allows listing ACL expired tokens which will be used by internal garbage collection. The ACL endpoint has been modified so that all validation happens within a single function call. This is easier to understand and see at a glance. The ACL token validation now also includes logic for expiry TTL and times. The ACL endpoint upsert tests have been condensed into a single, table driven test. There is a new token canonicalize which provides a single place for token canonicalization, rather than logic spread in the RPC handler.	2022-07-13 15:40:34 +02:00
Luiz Aoqui	b656981cf0	Track plan rejection history and automatically mark clients as ineligible (#13421 ) Plan rejections occur when the scheduler worker and the leader plan applier disagree on the feasibility of a plan. This may happen for valid reasons: since Nomad does parallel scheduling, it is expected that different workers will have a different state when computing placements. As the final plan reaches the leader plan applier, it may no longer be valid due to a concurrent scheduling taking up intended resources. In these situations the plan applier will notify the worker that the plan was rejected and that they should refresh their state before trying again. In some rare and unexpected circumstances it has been observed that workers will repeatedly submit the same plan, even if they are always rejected. While the root cause is still unknown this mitigation has been put in place. The plan applier will now track the history of plan rejections per client and include in the plan result a list of node IDs that should be set as ineligible if the number of rejections in a given time window crosses a certain threshold. The window size and threshold value can be adjusted in the server configuration. To avoid marking several nodes as ineligible at once, the operation is rate limited to 5 nodes every 30min, with an initial burst of 10 operations.	2022-07-12 18:40:20 -04:00
Seth Hoenig	297d386bdc	client: add support for checks in nomad services This PR adds support for specifying checks in services registered to the built-in nomad service provider. Currently only HTTP and TCP checks are supported, though more types could be added later.	2022-07-12 17:09:50 -05:00
Michael Schurter	3e50f72fad	core: merge reserved_ports into host_networks (#13651 ) Fixes #13505 This fixes #13505 by treating reserved_ports like we treat a lot of jobspec settings: merging settings from more global stanzas (client.reserved.reserved_ports) "down" into more specific stanzas (client.host_networks[].reserved_ports). As discussed in #13505 there are other options, and since it's totally broken right now we have some flexibility: Treat overlapping reserved_ports on addresses as invalid and refuse to start agents. However, I'm not sure there's a cohesive model we want to publish right now since so much 0.9-0.12 compat code still exists! We would have to explain to folks that if their -network-interface and host_network addresses overlapped, they could only specify reserved_ports in one place or the other?! It gets ugly. Use the global client.reserved.reserved_ports value as the default and treat host_network[].reserved_ports as overrides. My first suggestion in the issue, but @groggemans made me realize the addresses on the agent's interface (as configured by -network-interface) may overlap with host_networks, so you'd need to remove the global reserved_ports from addresses shared with a shared network?! This seemed really confusing and subtle for users to me. So I think "merging down" creates the most expressive yet understandable approach. I've played around with it a bit, and it doesn't seem too surprising. The only frustrating part is how difficult it is to observe the available addresses and ports on a node! However that's a job for another PR.	2022-07-12 14:40:25 -07:00
Charlie Voiselle	f4784e8d69	SV: fixes for namespace handling (#13705 ) * ACL check namespace value in SecureVariable * Error on wildcard namespace	2022-07-12 11:15:57 -04:00
Phil Renaud	e9219a1ae0	Allow wildcard for Evaluations API (#13530 ) * Failing test and TODO for wildcard * Alias the namespace query parameter for Evals * eval: fix list when using ACLs and * namespace Apply the same verification process as in job, allocs and scaling policy list endpoints to handle the eval list when using an ACL token with limited namespace support but querying using the `*` wildcard namespace. changelog: add entry for #13530 * ui: set namespace when querying eval Evals have a unique UUID as ID, but when querying them the Nomad API still expects a namespace query param, otherwise it assumes `default`. Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2022-07-11 16:42:17 -04:00
Tim Gross	a5a9eedc81	core job for secure variables re-key (#13440 ) When the `Full` flag is passed for key rotation, we kick off a core job to decrypt and re-encrypt all the secure variables so that they use the new key.	2022-07-11 13:34:06 -04:00
Charlie Voiselle	555ac432cd	SV: CAS: Implement Check and Set for Delete and Upsert (#13429 ) * SV: CAS * Implement Check and Set for Delete and Upsert * Reading the conflict from the state store * Update endpoint for new error text * Updated HTTP api tests * Conflicts to the HTTP api * SV: structs: Update SV time to UnixNanos * update mock to UnixNano; refactor * SV: encrypter: quote KeyID in error * SV: mock: add mock for namespace w/ SV	2022-07-11 13:34:06 -04:00
Tim Gross	8a50d2c3e8	implement quota tracking for secure variables (#13453 ) We need to track per-namespace storage usage for secure variables even in Nomad OSS so that a cluster can be seamlessly upgraded from OSS to ENT without having to re-calculate quota usage. Provide a hook in the upsert RPC for enforcement of quotas in ENT. This will be a no-op in Nomad OSS.	2022-07-11 13:34:06 -04:00
Tim Gross	6300427228	core job for key rotation (#13309 ) Extend the GC job to support periodic key rotation. Update the GC process to safely support signed workload identity. We can't GC any key used to sign a workload identity. Finding which key was used to sign every allocation will be expensive, but there are not that many keys. This lets us take a conservative approach: find the oldest live allocation and ensure that we don't GC any key older than that key.	2022-07-11 13:34:06 -04:00
Tim Gross	160ad9b64f	implement Encrypt/Decrypt methods of encrypter (#13375 ) The `Encrypt` method generates an appropriately-sized nonce and uses that buffer as the prefix for the ciphertext. This keeps the ciphertext and nonce together for decryption, and reuses the buffer as much as possible without presenting the temptation to reuse the cleartext buffer owned by the caller. We include the key ID as the "additional data" field that's used as an extra input to the authentication signature, to provide additional protection that a ciphertext originated with that key. Refactors the locking for the keyring so that the public methods are generally (with one commented exception) responsible for taking the lock and then inner methods are assumed locked.	2022-07-11 13:34:05 -04:00
Charlie Voiselle	1fe080c6de	Implement HTTP search API for Variables (#13257 ) * Add Path only index for SecureVariables * Add GetSecureVariablesByPrefix; refactor tests * Add search for SecureVariables * Add prefix search for secure variables	2022-07-11 13:34:05 -04:00
Charlie Voiselle	06c6a950c4	Secure Variables: Separate Encrypted and Decrypted structs (#13355 ) This PR splits SecureVariable into SecureVariableDecrypted and SecureVariableEncrypted in order to use the type system to help verify that cleartext secret material is not committed to file. * Make Encrypt function return KeyID * Split SecureVariable Co-authored-by: Tim Gross <tgross@hashicorp.com>	2022-07-11 13:34:05 -04:00
Tim Gross	bfcbc00f4e	workload identity (#13223 ) In order to support implicit ACL policies for tasks to get their own secrets, each task would need to have its own ACL token. This would add extra raft overhead as well as new garbage collection jobs for cleaning up task-specific ACL tokens. Instead, Nomad will create a workload Identity Claim for each task. An Identity Claim is a JSON Web Token (JWT) signed by the server’s private key and attached to an Allocation at the time a plan is applied. The encoded JWT can be submitted as the X-Nomad-Token header to replace ACL token secret IDs for the RPCs that support identity claims. Whenever a key is added to a server’s keyring, it will use the key as the seed for an Ed25519 public-private keypair. That keypair will be used for signing the JWT and for verifying the JWT. This implementation is a ruthlessly minimal approach to support the secure variables feature. When a JWT is verified, the allocation ID will be checked against the Nomad state store, and non-existent or terminal allocation IDs will cause the validation to be rejected. This is sufficient to support the secure variables feature at launch without requiring implementation of a background process to renew soon-to-expire tokens.	2022-07-11 13:34:05 -04:00
Tim Gross	d5a214484c	core job for root key GC (#13199 ) Inactive and unused keys older than a threshold will be periodically garbage collected.	2022-07-11 13:34:04 -04:00
Tim Gross	5a85d96322	remove end-user algorithm selection (#13190 ) After internal design review, we decided to remove exposing algorithm choice to the end-user for the initial release. We'll solve nonce rotation by forcing rotations automatically on key GC (in a core job, not included in this changeset). Default to AES-256 GCM for the following criteria: * faster implementation when hardware acceleration is available * FIPS compliant * implementation in pure go * post-quantum resistance Also fixed a bug in the decoding from keystore and switched to a harder-to-misuse encoding method.	2022-07-11 13:34:04 -04:00
Charlie Voiselle	3717688f3e	Secure Variables: Variables - State store, FSM, RPC (#13098 ) * Secure Variables: State Store * Secure Variables: FSM * Secure Variables: RPC * Secure Variables: HTTP API Co-authored-by: Tim Gross <tgross@hashicorp.com>	2022-07-11 13:34:04 -04:00
Tim Gross	05eef2b95c	keystore serialization (#13106 ) This changeset implements the keystore serialization/deserialization: * Adds a JSON serialization extension for the `RootKey` struct, along with a metadata stub. When we serialize RootKey to the on-disk keystore, we want to base64 encode the key material but also exclude any frequently-changing fields which are stored in raft. * Implements methods for loading/saving keys to the keystore. * Implements methods for restoring the whole keystore from disk. * Wires it all up with the `Keyring` RPC handlers and fixes up any fallout on tests.	2022-07-11 13:34:04 -04:00
Tim Gross	2f0fd556ad	keyring RPC handlers (#13075 ) Implement the upsert, list, delete, and rotate RPC handlers for the secure variables keyring. Operations on the keyring itself are still stubbed out.	2022-07-11 13:34:04 -04:00
Tim Gross	b1dc6dcef0	keyring state store operations (#13016 ) Implement the basic upsert, list, and delete operations for `RootKeyMeta` needed by the Keyring RPCs. This changeset also implements two convenience methods `RootKeyMetaByID` and `GetActiveRootKeyMeta` which are useful for testing but also will be needed to implement the rest of the RPCs.	2022-07-11 13:34:04 -04:00
Charlie Voiselle	2019eab2c8	Provide mock secure variables implementation (#12980 ) * Add SecureVariable mock * Add SecureVariableStub * Add SecureVariable Copy and Stub funcs	2022-07-11 13:34:03 -04:00
Tim Gross	d29e85d150	secure variables: initial state store (#12932 ) Implement the core SecureVariable and RootKey structs in memdb, provide the minimal skeleton for FSM, and a dummy storage and keyring RPC endpoint.	2022-07-11 13:34:01 -04:00
Seth Hoenig	239eaf9a29	Merge pull request #13626 from hashicorp/b-client-max-kill-timeout client: enforce max_kill_timeout client configuration	2022-07-07 13:44:39 -05:00
Seth Hoenig	5dd8aa3e27	client: enforce max_kill_timeout client configuration This PR fixes a bug where client configuration max_kill_timeout was not being enforced. The feature was introduced in 9f44780 but seems to have been removed during the major drivers refactoring. We can make sure the value is enforced by plumbing it through the DriverHandler, which now uses the lesser of the task.killTimeout or client.maxKillTimeout. Also updates Event.SetKillTimeout to require both the task.killTimeout and client.maxKillTimeout so that we don't make the mistake of using the wrong value - as it was being given only the task.killTimeout before.	2022-07-06 15:29:38 -05:00
Luiz Aoqui	74c5578432	api: refactor ACL check for namespace wildcard (#13606 ) Improve how the all namespaces wildcard (`*`) is handled when checking ACL permissions. When using the wildcard namespace the `AllowNsOp` would return false since it looks for a namespace called `*` to match. This commit changes this behavior to return `true` when the queried namespace is `*` and the token allows the operation in _any_ namespace. Actual permission must be checked per object. The helper function `AllowNsOpFunc` returns a function that can be used to make this verification.	2022-07-06 15:22:30 -04:00
James Rasell	0c0b028a59	core: allow deleting of evaluations (#13492 ) * core: add eval delete RPC and core functionality. * agent: add eval delete HTTP endpoint. * api: add eval delete API functionality. * cli: add eval delete command. * docs: add eval delete website documentation.	2022-07-06 16:30:11 +02:00
James Rasell	181b247384	core: allow pausing and un-pausing of leader broker routine (#13045 ) * core: allow pause/un-pause of eval broker on region leader. * agent: add ability to pause eval broker via scheduler config. * cli: add operator scheduler commands to interact with config. * api: add ability to pause eval broker via scheduler config * e2e: add operator scheduler test for eval broker pause. * docs: include new operator scheduler CLI and pause eval API info.	2022-07-06 16:13:48 +02:00
Seth Hoenig	97726c2fd8	Merge pull request #12862 from hashicorp/f-choose-services api: enable selecting subset of services using rendezvous hashing	2022-06-30 15:17:40 -05:00
Michael Schurter	1cc0ae8795	docs: fix Plan{,Result}.NodeUpdate comment (#13534 ) It appears way back when this was first implemented in 9a917281af9c0a97a6c59575eaa52c5c86ffc60d, it was renamed from NodeEvict (with a correct comment) to NodeUpdate. The comment was changed from referring to only evictions to referring to "all allocs" in the first sentence and "stop or evict" in the second. This confuses me every time I see it because I read the name (NodeUpdate) and first sentence ("all the allocs") and assume this represents all allocations... which isn't true. I'm going to assume I'm the only one who doesn't read the 2nd sentence and that's why this suboptimal wording has lasted 7 years, but can we change it for my sake?	2022-06-30 12:47:14 -07:00
James Rasell	d080eed9ae	client: fixed a problem calculating a service namespace. (#13493 ) When calculating a service's namespace for registration, the code assumed the first task within the task array would include a service block. This is incorrect as it is possible only a later task within the array contains a service definition. This change fixes the logic, so we correctly search for a service definition before identifying the namespace.	2022-06-28 09:47:28 +02:00
Shishir Mahajan	6ba8245283	Fix typo: orthogonal. Signed-off-by: Shishir Mahajan <smahajan@roblox.com>	2022-06-27 12:12:51 -07:00
Seth Hoenig	9467bc9eb3	api: enable selecting subset of services using rendezvous hashing This PR adds the 'choose' query parameter to the '/v1/service/<service>' endpoint. The value of 'choose' is in the form '<number>|<key>', number is the number of desired services and key is a value unique but consistent to the requester (e.g. allocID). Folks aren't really expected to use this API directly, but rather through consul-template which will soon be getting a new helper function making use of this query parameter. Example, curl 'localhost:4646/v1/service/redis?choose=2|abc123' Note: consul-template v0.29.1 includes the necessary nomadServices functionality.	2022-06-25 10:37:37 -05:00
Grant Griffiths	99896da443	CSI: make plugin health_timeout configurable in csi_plugin stanza (#13340 ) Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>	2022-06-14 10:04:16 -04:00
Lance Haig	4bf27d743d	Allow Operator Generated bootstrap token (#12520 )	2022-06-03 07:37:24 -04:00
Huan Wang	7d15157635	adding support for customized ingress tls (#13184 )	2022-06-02 18:43:58 -04:00
Seth Hoenig	0399b7e4c5	Merge pull request #12951 from jorgemarey/f-srv-tagged-addresses Allow setting tagged addresses on services	2022-06-01 10:51:49 -05:00
Seth Hoenig	189176f052	consul: avoid reflection in comparing service map types	2022-06-01 10:22:00 -05:00
Seth Hoenig	dca954faac	build: update golangci-lint to v1.46.2 This version of golangci-lint improves support for generics, but also is more strict in copy vs. loop for slice copying.	2022-05-31 23:32:01 +00:00
Seth Hoenig	54efec5dfe	docs: add docs and tests for tagged_addresses	2022-05-31 13:02:48 -05:00
Jorge Marey	f966614602	Allow setting tagged addresses on services	2022-05-31 10:06:55 -05:00
Seth Hoenig	4631045d83	connect: enable setting connect upstream destination namespace	2022-05-26 09:39:36 -05:00
Luiz Aoqui	769ff1dcc3	Merge pull request #13109 from hashicorp/merge-release-1.3.1-branch Merge release 1.3.1 branch	2022-05-25 10:45:09 -04:00
Seth Hoenig	626a345fb2	ci: switch to 22.04 LTS for GHA Core CI tests	2022-05-25 08:19:40 -05:00
Michael Schurter	2965dc6a1a	artifact: fix numerous go-getter security issues Fix numerous go-getter security issues: - Add timeouts to http, git, and hg operations to prevent DoS - Add size limit to http to prevent resource exhaustion - Disable following symlinks in both artifacts and `job run` - Stop performing initial HEAD request to avoid file corruption on retries and DoS opportunities. Approach Since Nomad has no ability to differentiate a DoS-via-large-artifact vs a legitimate workload, all of the new limits are configurable at the client agent level. The max size of HTTP downloads is also exposed as a node attribute so that if some workloads have large artifacts they can specify a high limit in their jobspecs. In the future all of this plumbing could be extended to enable/disable specific getters or artifact downloading entirely on a per-node basis.	2022-05-24 16:29:39 -04:00
Seth Hoenig	a5943da0c7	core: add tests for blocked evals math	2022-05-24 09:05:18 -05:00
Luiz Aoqui	a8cc633156	vault: revert support for entity aliases (#12723 ) After a more detailed analysis of this feature, the approach taken in PR #12449 was found to be not ideal due to poor UX (users are responsible for setting the entity alias they would like to use) and issues around jobs potentially masquerading as another Vault entity.	2022-04-22 10:46:34 -04:00
Seth Hoenig	c4aab10e53	services: cr followup	2022-04-22 09:14:29 -05:00
Seth Hoenig	3fcac242c6	services: enable setting arbitrary address value in service registrations This PR introduces the `address` field in the `service` block so that Nomad or Consul services can be registered with a custom `.Address.` to advertise. The address can be an IP address or domain name. If the `address` field is set, the `service.address_mode` must be set in `auto` mode.	2022-04-22 09:14:29 -05:00
James Rasell	716b8e658b	api: Add support for filtering and pagination to the node list endpoint (#12727 )	2022-04-21 17:04:33 +02:00
Derek Strickland	5e309f3f33	reconciler: Handle canaries when client disconnects (#12539 ) * plan_apply: Allow node updates in disconnected node plans * plan: Keep the job when persisting unknown allocs * reconciler: stop unknown allocs when stopping all * reconcile_util: reorder filtering to handle canaries; skip rescheduling unknown * heartbeat: Fix bug in node heartbeating	2022-04-21 10:05:58 -04:00
James Rasell	010acce59f	job_hooks: add implicit constraint when using Consul for services. (#12602 )	2022-04-20 14:09:13 +02:00
chavacava	eb1c42e643	QueryOptions.SetTimeToBlock should take pointer receiver Fixes a bug where blocking queries that are retried don't have their blocking timeout reset, resulting in them running longer than expected.	2022-04-18 10:41:27 -04:00
Shishir	f5121d261e	Add os to NodeListStub struct. (#12497 ) * Add os to NodeListStub struct. Signed-off-by: Shishir Mahajan <smahajan@roblox.com> * Add os as a query param to /v1/nodes. Signed-off-by: Shishir Mahajan <smahajan@roblox.com> * Add test: os as a query param to /v1/nodes. Signed-off-by: Shishir Mahajan <smahajan@roblox.com>	2022-04-15 17:22:45 -07:00
Tim Gross	826d9d47f9	CSI: replace structs->api with serialization extension (#12583 ) The CSI HTTP API has to transform the CSI volume to redact secrets, remove the claims fields, and to consolidate the allocation stubs into a single slice of alloc stubs. This was done manually in #8590 but this is a large amount of code and has proven both very bug prone (see #8659, #8666, #8699, #8735, and #12150) and requires updating lots of code every time we add a field to volumes or plugins. In #10202 we introduce encoding improvements for the `Node` struct that allow a more minimal transformation. Apply this same approach to serializing `structs.CSIVolume` to API responses. Also, the original reasoning behind #8590 for plugins no longer holds because the counts are now denormalized within the state store, so we can simply remove this transformation entirely.	2022-04-15 14:29:34 -04:00
Derek Strickland	0891218ee9	system_scheduler: support disconnected clients (#12555 ) * structs: Add helper method for checking if alloc is configured to disconnect * system_scheduler: Add support for disconnected clients	2022-04-15 09:31:32 -04:00
James Rasell	4cdc46ae75	service discovery: add pagination and filtering support to info requests (#12552 ) * services: add pagination and filter support to info RPC. * cli: add filter flag to service info command. * docs: add pagination and filter details to services info API. * paginator: minor updates to comment and func signature.	2022-04-13 07:41:44 +02:00
Tim Gross	4078e6ea0e	scripts: fix interpreter for bash (#12549 ) Many of our scripts have a non-portable interpreter line for bash and use bash-specific variables like `BASH_SOURCE`. Update the interpreter line to be portable between various Linuxes and macOS without complaint from posix shell users.	2022-04-12 10:08:21 -04:00
Jorge Marey	96dd3f53c6	Fix in-place updates over ineligible nodes (#12264 )	2022-04-06 11:30:40 -04:00
Derek Strickland	0ab89b1728	Merge pull request #12476 from hashicorp/f-disconnected-client-allocation-handling disconnected clients: Feature branch merge	2022-04-06 10:11:57 -04:00
Seth Hoenig	2e2ff3f75e	Merge pull request #12419 from hashicorp/exec-cleanup raw_exec: make raw exec driver work with cgroups v2	2022-04-05 16:42:01 -05:00
Derek Strickland	43d20ebdbd	disconnected clients: `TaskGroup` validation (#12418 ) * TaskGroup: Validate that max_client_disconnect and stop_after_client_disconnect are mutually exclusive.	2022-04-05 17:14:50 -04:00
Derek Strickland	bd719bc7b8	reconciler: 2 phase reconnects and tests (#12333 ) * structs: Add alloc.Expired & alloc.Reconnected functions. Add Reconnect eval trigger by. * node_endpoint: Emit new eval for reconnecting unknown allocs. * filterByTainted: handle 2 phase commit filtering rules. * reconciler: Append AllocState on disconnect. Logic updates from testing and 2 phase reconnects. * allocs: Set reconnect timestamp. Destroy if not DesiredStatusRun. Watch for unknown status.	2022-04-05 17:13:10 -04:00
Derek Strickland	bb376320a2	comments: update some stale comments referencing deprecated config name (#12271 ) * comments: update some stale comments referencing deprecated config name	2022-04-05 17:12:23 -04:00
Derek Strickland	8e9f8be511	`MaxClientDisconnect` Jobspec checklist (#12177 ) * api: Add struct, conversion function, and tests * TaskGroup: Add field, validation, and tests * diff: Add diff handler and test * docs: Update docs	2022-04-05 17:12:23 -04:00
Derek Strickland	d7f44448e1	disconnected clients: Observability plumbing (#12141 ) * Add disconnects/reconnect to log output and emit reschedule metrics * TaskGroupSummary: Add Unknown, update StateStore logic, add to metrics	2022-04-05 17:12:23 -04:00
Derek Strickland	3cbd76ea9d	disconnected clients: Add reconnect task event (#12133 ) * Add TaskClientReconnectedEvent constant * Add allocRunner.Reconnect function to manage task state manually * Removes server-side push	2022-04-05 17:12:23 -04:00
Derek Strickland	b128769e19	reconciler: support disconnected clients (#12058 ) * Add merge helper for string maps * structs: add statuses, MaxClientDisconnect, and helper funcs * taintedNodes: Include disconnected nodes * upsertAllocsImpl: don't use existing ClientStatus when upserting unknown * allocSet: update filterByTainted and add delayByMaxClientDisconnect * allocReconciler: support disconnecting and reconnecting allocs * GenericScheduler: upsert unknown and queue reconnecting Co-authored-by: Tim Gross <tgross@hashicorp.com>	2022-04-05 17:10:37 -04:00
Luiz Aoqui	ab7eb5de6e	Support Vault entity aliases (#12449 ) Move some common Vault API data struct decoding out of the Vault client so it can be reused in other situations. Make Vault job validation its own function so it's easier to expand it. Rename the `Job.VaultPolicies` method to just `Job.Vault` since it returns the full Vault block, not just their policies. Set `ChangeMode` on `Vault.Canonicalize`. Add some missing tests. Allows specifying an entity alias that will be used by Nomad when deriving the task Vault token. An entity alias assigns an identity to a token, allowing better control and management of Vault clients since all tokens with the same identity alias will now be considered the same client. This helps track Nomad activity in Vault's audit logs and better control over Vault billing. Add support for a new Nomad server configuration to define a default entity alias to be used when deriving Vault tokens. This default value will be used if the task doesn't have an entity alias defined.	2022-04-05 14:18:10 -04:00
James Rasell	e2b730d7c9	Merge pull request #12454 from hashicorp/f-rename-service-event-stream events: add service API logic and rename topic to service from serviceregistration	2022-04-05 16:19:14 +02:00
Grant Griffiths	18a0a2c9a4	CSI: Add secrets flag support for delete volume (#11245 )	2022-04-05 08:59:11 -04:00
James Rasell	cc7b448d63	events: fixup service events and rename topic to service.	2022-04-05 08:25:22 +01:00
Seth Hoenig	52aaf86f52	raw_exec: make raw exec driver work with cgroups v2 This PR adds support for the raw_exec driver on systems with only cgroups v2. The raw exec driver is able to use cgroups to manage processes. This happens only on Linux, when exec_driver is enabled, and the no_cgroups option is not set. The driver uses the freezer controller to freeze processes of a task, issue a sigkill, then unfreeze. Previously the implementation assumed cgroups v1, and now it also supports cgroups v2. There is a bit of refactoring in this PR, but the fundamental design remains the same. Closes #12351 #12348	2022-04-04 16:11:38 -05:00
James Rasell	9449e1c3e2	Merge branch 'main' into f-1.3-boogie-nights	2022-03-25 16:40:32 +01:00
James Rasell	96d8512c85	test: move remaining tests to use ci.Parallel.	2022-03-24 08:45:13 +01:00
Tim Gross	b7075f04fd	CSI: enforce single access mode at validation time (#12337 ) A volume that has single-use access mode is feasibility checked during scheduling to ensure that only a single reader or writer claim exists. However, because feasibility checking is done one alloc at a time before the plan is written, a job that's misconfigured to have count > 1 that mounts one of these volumes will pass feasibility checking. Enforce the check at validation time instead to prevent us from even trying to evaluate a job that's misconfigured this way.	2022-03-23 09:21:26 -04:00
James Rasell	a646333263	Merge branch 'main' into f-1.3-boogie-nights	2022-03-23 09:41:25 +01:00
Tim Gross	60cfeacd76	drainer: defer CSI plugins until last (#12324 ) When a node is drained, system jobs are left until last so that operators can rely on things like log shippers running even as their applications are getting drained off. Include CSI plugins in this set so that Controller plugins deployed as services can be handled as gracefully as Node plugins that are running as system jobs.	2022-03-22 10:26:56 -04:00
James Rasell	042bf0fa57	client: hookup service wrapper for use within client hooks.	2022-03-21 10:29:57 +01:00
Luiz Aoqui	15089f055f	api: add related evals to eval details (#12305 ) The `related` query param is used to indicate that the request should return a list of related (next, previous, and blocked) evaluations. Co-authored-by: Jasmine Dahilig <jasmine@hashicorp.com>	2022-03-17 13:56:14 -04:00
Seth Hoenig	2631659551	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
James Rasell	dc1378d6eb	job: add native service discovery job constraint mutator.	2022-03-14 12:42:12 +01:00
James Rasell	783d7fdc31	jobspec: add service block provider parameter and validation.	2022-03-14 09:21:20 +01:00
Luiz Aoqui	ab8ce87bba	Add pagination, filtering and sort to more API endpoints (#12186 )	2022-03-08 20:54:17 -05:00
Michael Schurter	7bb8de68e5	Merge pull request #12138 from jorgemarey/f-ns-meta Add metadata to namespaces	2022-03-07 10:19:33 -08:00
Tim Gross	2dafe46fe3	CSI: allow updates to volumes on re-registration (#12167 ) CSI `CreateVolume` RPC is idempotent given that the topology, capabilities, and parameters are unchanged. CSI volumes have many user-defined fields that are immutable once set, and many fields that are not user-settable. Update the `Register` RPC so that updating a volume via the API merges onto any existing volume without touching Nomad-controlled fields, while validating it with the same strict requirements expected for idempotent `CreateVolume` RPCs. Also, clarify that this state store method is used for everything, not just for the `Register` RPC.	2022-03-07 11:06:59 -05:00
James Rasell	ca6ba2e047	rpc: add job service registration list RPC endpoint.	2022-03-03 11:26:14 +01:00
James Rasell	b68d573aa5	rpc: add alloc service registration list RPC endpoint.	2022-03-03 11:25:55 +01:00
James Rasell	1ad8ea558a	rpc: add service registration RPC endpoints.	2022-03-03 11:25:29 +01:00
Luiz Aoqui	01931587ba	api: paginated results with different ordering (#12128 ) The paginator logic was built when go-memdb iterators would return items ordered lexicographically by their ID prefixes, but #12054 added the option for some tables to return results ordered by their `CreateIndex` instead, which invalidated the previous paginator assumption. The iterator used for pagination must still return results in some order so that the paginator can properly handle requests where the next_token value is not present in the results anymore (e.g., the eval was GC'ed). In these situations, the paginator will start the returned page in the first element right after where the requested token should've been. This commit moves the logic to generate pagination tokens from the elements being paginated to the iterator itself so that callers can have more control over the token format to make sure they are properly ordered and stable. It also allows configuring the paginator as being ordered in ascending or descending order, which is relevant when looking for a token that may not be present anymore.	2022-03-01 15:36:49 -05:00
Tim Gross	f2a4ad0949	CSI: implement support for topology (#12129 )	2022-03-01 10:15:46 -05:00
James Rasell	8a23afdb56	events: add state objects and logic for service registrations.	2022-02-28 10:44:58 +01:00
James Rasell	cfdb5a3c66	structs: add service registration struct and basic composed funcs.	2022-02-28 10:14:40 +01:00
Jorge Marey	a466f01120	Add metadata to namespaces	2022-02-27 09:09:10 +01:00
Tim Gross	cfe3117af8	CSI: enforce usage at claim time (#12112 ) * Remove redundant schedulable check in `FreeWriteClaims`. If a volume has been created but not yet claimed, its capabilities will be checked in `WriteSchedulable` at both scheduling time and claim time. We don't need to also check them in the `FreeWriteClaims` method. * Enforce maximum volume claims for writers. When the scheduler checks feasibility for CSI volumes, the check is fairly loose: earlier versions of the same job are not counted as active claims. This allows the scheduler to place new allocations for the new version of a job, under the assumption that we'll replace the existing allocations and their volume claims. But when the alloc runner claims the volume, we need to enforce the active claims even if they're for allocations of an earlier version of the job. Otherwise we'll try to mount a volume that's currently being unmounted, and this will cause replacement allocations to frequently fail. * Enforce single-node reader check for read-only volumes. When the alloc runner makes a claim for a read-only volume, we only check that the volume is potentially schedulable and not that it actually has free read claims.	2022-02-24 09:37:37 -05:00
Sander Mol	42b338308f	add go-sockaddr templating support to nomad consul address (#12084 )	2022-02-24 09:34:54 -05:00
Florian Apolloner	3bced8f558	namespaces: allow enabling/disabling allowed drivers per namespace	2022-02-24 09:27:32 -05:00
Tim Gross	57a546489f	CSI: minor refactoring (#12105 ) * rename method checking that free write claims are available * use package-level variables for claim errors * semgrep fix for testify	2022-02-23 11:13:51 -05:00
Michael Schurter	7494a0c4fd	core: remove all traces of unused protocol version Nomad inherited protocol version numbering configuration from Consul and Serf, but unlike those projects Nomad has never used it. Nomad's `protocol_version` has always been `1`. While the code is effectively unused and therefore poses no runtime risks to leave, I felt like removing it was best because: 1. Nomad's RPC subsystem has been able to evolve extensively without needing to increment the version number. 2. Nomad's HTTP API has evolved extensively without incrementing `API{Major,Minor}Version`. If we want to version the HTTP API in the future, I doubt this is the mechanism we would choose. 3. The presence of the `server.protocol_version` configuration parameter is confusing since `server.raft_protocol` is an important parameter for operators to consider. Even more confusing is that there is a distinct Serf protocol version which is included in `nomad server members` output under the heading `Protocol`. `raft_protocol` is the only protocol version relevant to Nomad developers and operators. The other protocol versions are either dead code or have never changed (Serf). 4. If we were to need to version the RPC, HTTP API, or Serf protocols, I don't think these configuration parameters and variables are the best choice. If we come to that point we should choose a versioning scheme based on the use case and modern best practices -- not this 6+ year old dead code.	2022-02-18 16:12:36 -08:00
Luiz Aoqui	de91954582	initial base work for implementing sorting and filter across API endpoints (#12076 )	2022-02-16 14:34:36 -05:00
Luiz Aoqui	110dbeeb9d	Add `go-bexpr` filters to evals and deployment list endpoints (#12034 )	2022-02-16 11:40:30 -05:00
Seth Hoenig	40c714a681	api: return sorted results in certain list endpoints These API endpoints now return results in chronological order. They can return results in reverse chronological order by setting the query parameter ascending=true. - Eval.List - Deployment.List	2022-02-15 13:48:28 -06:00
Luiz Aoqui	3bf6036487	Merge tag 'v1.2.6' into merge-release-1.2.6-branch Version 1.2.6	2022-02-10 14:55:34 -05:00
Seth Hoenig	437bb4b86d	client: check escaping of alloc dir using symlinks This PR adds symlink resolution when doing validation of paths to ensure they do not escape client allocation directories.	2022-02-09 19:50:13 -05:00
Karthick Ramachandran	0600bc32e2	improve error message on service length (#12012 )	2022-02-04 19:39:34 -05:00
Samantha	54f8c04c91	Fix health checking for ephemeral poststart tasks (#11945 ) Update the logic in the Nomad client's alloc health tracker which erroneously marks existing healthy allocations with dead poststart ephemeral tasks as unhealthy even if they were already successful during a previous deployment.	2022-02-02 16:29:49 -05:00
Michael Schurter	d87ed3fcd7	core: prevent malformed plans from crashing leader The Plan.Submit endpoint assumed PlanRequest.Plan was never nil. While there is no evidence it ever has been nil, we should not panic if a nil plan is ever submitted because that would crash the leader.	2022-01-31 12:15:15 -08:00
Nomad Release bot	de3070d49a	Generate files for 1.2.4 release	2022-01-18 23:43:00 +00:00
Luiz Aoqui	b1753d0568	scheduler: detect and log unexpected scheduling collisions (#11793 )	2022-01-14 20:09:14 -05:00
Michael Schurter	e6eff95769	agent: validate reserved_ports are valid Goal is to fix at least one of the causes that can cause a node to be ineligible to receive work: https://github.com/hashicorp/nomad/issues/9506#issuecomment-1002880600	2022-01-12 14:21:47 -08:00
Conor Evans	8d622797af	replace 'a alloc' with 'an alloc' where appropriate (#11792 )	2022-01-10 11:59:46 -05:00
Derek Strickland	0a8e03f0f7	Expose Consul template configuration parameters (#11606 ) This PR exposes the following existing `consul-template` configuration options to Nomad jobspec authors in the `{job.group.task.template}` stanza. - `wait` It also exposes the following `consul-template` configuration to Nomad operators in the `{client.template}` stanza. - `max_stale` - `block_query_wait` - `consul_retry` - `vault_retry` - `wait` Finally, it adds the following new Nomad-specific configuration to the `{client.template}` stanza that allows Operators to set bounds on what `jobspec` authors configure. - `wait_bounds` Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2022-01-10 10:19:07 -05:00
Joel May	4f78bcfb98	Emit metrics on reschedule later decisions as nomad.client.allocs.reschedule (#10237 )	2022-01-06 15:56:43 -05:00
Michael Schurter	20bd8acf43	do not initialize copy's slice if nil in original	2021-12-23 16:40:35 -08:00
Michael Schurter	88200f4eb9	core: fix DNS and CPU Core copying	2021-12-23 12:28:19 -08:00
Michael Schurter	7d741837b0	core: match struct field order in Copy()	2021-12-23 12:27:39 -08:00
James Rasell	45f4689f9c	chore: fixup inconsistent method receiver names. (#11704 )	2021-12-20 11:44:21 +01:00
Tim Gross	a0cf5db797	provide `-no-shutdown-delay` flag for job/alloc stop (#11596 ) Some operators use very long group/task `shutdown_delay` settings to safely drain network connections to their workloads after service deregistration. But during incident response, they may want to cause that drain to be skipped so they can quickly shed load. Provide a `-no-shutdown-delay` flag on the `nomad alloc stop` and `nomad job stop` commands that bypasses the delay. This sets a new desired transition state on the affected allocations that the allocation/task runner will identify during pre-kill on the client. Note (as documented here) that using this flag will almost always result in failed inbound network connections for workloads as the tasks will exit before clients receive updated service discovery information and won't be gracefully drained.	2021-12-13 14:54:53 -05:00
Tim Gross	624ecab901	evaluations list pagination and filtering (#11648 ) API queries can request pagination using the `NextToken` and `PerPage` fields of `QueryOptions`, when supported by the underlying API. Add a `NextToken` field to the `structs.QueryMeta` so that we have a common field across RPCs to tell the caller where to resume paging from on their next API call. Include this field on the `api.QueryMeta` as well so that it's available for future versions of List HTTP APIs that wrap the response with `QueryMeta` rather than returning a simple list of structs. In the meantime callers can get the `X-Nomad-NextToken`. Add pagination to the `Eval.List` RPC by checking for pagination token and page size in `QueryOptions`. This will allow resuming from the last ID seen so long as the query parameters and the state store itself are unchanged between requests. Add filtering by job ID or evaluation status over the results we get out of the state store. Parse the query parameters of the `Eval.List` API into the arguments expected for filtering in the RPC call.	2021-12-10 13:43:03 -05:00
Tim Gross	03e697a69d	scheduler: config option to reject job registration (#11610 ) During incident response, operators may find that automated processes elsewhere in the organization can be generating new workloads on Nomad clusters that are unable to handle the workload. This changeset adds a field to the `SchedulerConfiguration` API that causes all job registration calls to be rejected unless the request has a management ACL token.	2021-12-06 15:20:34 -05:00
Tim Gross	39acac33a0	ui: change Consul/Vault base URL field name (#11589 ) Give ourselves some room for extension in the UI configuration block by naming the field `ui_url`, which will let us have an `api_url`. Fix the template path to ensure we're getting the right value from the API.	2021-11-30 13:20:29 -05:00
Luiz Aoqui	0cf1964651	Merge remote-tracking branch 'origin/release-1.2.2' into merge-release-1.2.2-branch	2021-11-24 14:40:45 -05:00

2066 commits