open-nomad

Author	SHA1	Message	Date
James Rasell	9e8325d63c	acl: fix a bug in token creation when parsing expiration TTLs. (#15999 ) The ACL token decoding was not correctly handling time duration syntax such as "1h" which forced people to use the nanosecond representation via the HTTP API. The change adds an unmarshal function which allows this syntax to be used, along with other styles correctly.	2023-02-01 17:43:41 +01:00
James Rasell	67acfd9f6b	acl: return 400 not 404 code when creating an invalid policy. (#16000 )	2023-02-01 17:40:15 +01:00
Seth Hoenig	d0bd8172d3	e2e: remove unused consulacls directory (#15995 ) This pile was deprecated when we started using HCP Consul for e2e instead of standing up our own cluster and managing Consuls at test runtime.	2023-01-31 16:03:47 -06:00
Luiz Aoqui	d22984f659	docs: add upgrade notice for batch GC changes (#15985 )	2023-01-31 15:04:12 -05:00
Jai	18fa6a8c64	15154/alloc redirect (#15969 ) * refact: add conditional error handling * test: test conditional logic	2023-01-31 14:59:05 -05:00
Glen Yu	8be2c3a5f1	docs: removed extra 'end' in one of the code blocks in template stanza documentation (#15963 )	2023-01-31 13:55:10 -05:00
Mike Nomitch	80848b202e	Increases max variable size to 64KiB from 16KiB (#15983 )	2023-01-31 13:32:36 -05:00
stswidwinski	16eefbbf4d	GC: ensure no leakage of evaluations for batch jobs. (#15097 ) Prior to 2409f72 the code compared the modification index of a job to itself. Afterwards, the code compared the creation index of the job to itself. In either case there should never be a case of re-parenting of allocs causing the evaluation to trivially always result in false, which leads to unreclaimable memory. Prior to this change allocations and evaluations for batch jobs were never garbage collected until the batch job was explicitly stopped. The new `batch_eval_gc_threshold` server configuration controls how often they are collected. The default threshold is `24h`.	2023-01-31 13:32:14 -05:00
Seth Hoenig	7838f16e2b	tests: bump consul and vault versions in test-core (#15979 )	2023-01-31 10:00:27 -06:00
Seth Hoenig	139f2c0b0f	docker: set force=true on remove image to handle images referenced by multiple tags (#15962 ) * docker: set force=true on remove image to handle images referenced by multiple tags This PR changes our call of docker client RemoveImage() to RemoveImageExtended with the Force=true option set. This fixes a bug where an image referenced by more than one tag could never be garbage collected by Nomad. The Force option only applies to stopped containers; it does not affect running workloads. * docker: add note about image_delay and multiple tags	2023-01-31 07:53:18 -06:00
Renan Cakirerk	31eb30f5f5	Fix typo in documentation (#15970 )	2023-01-31 09:05:37 +01:00
Charlie Voiselle	e05bfb0c6b	Fix broken link, typo, style edits. (#15968 )	2023-01-30 15:52:43 -05:00
Sudharshan K S	df91fbcc4a	Corrected a typo (#15942 )	2023-01-30 15:18:18 -05:00
Charlie Voiselle	cd4043abb2	docs: Add info about variable item key name restrictions (#15966 ) While you can use any string value for a variable Item's key name, using characters that are outside of the set [unicode.Letter, unicode.Number, `_`] will require the `index` function for direct access.	2023-01-30 15:10:15 -05:00
Yorick Gersie	d94f22bee2	Ensure infra_image gets proper label used for reconciliation (#15898 ) * Ensure infra_image gets proper label used for reconciliation Currently infra containers are not cleaned up as part of the dangling container cleanup routine. The reason is that Nomad checks if a container is a Nomad owned container by verifying the existence of the: `com.hashicorp.nomad.alloc_id` label. Ensure we set this label on the infra container as well. * fix unit test * changelog: add entry --------- Co-authored-by: Seth Hoenig <shoenig@duck.com>	2023-01-30 09:46:45 -06:00
Jorge Marey	d1c9aad762	Rename fields on proxyConfig (#15541 ) * Change api Fields for expose and paths * Add changelog entry * changelog: add deprecation notes about connect fields * api: minor style tweaks --------- Co-authored-by: Seth Hoenig <shoenig@duck.com>	2023-01-30 09:31:16 -06:00
dependabot[bot]	bb79824a20	build(deps): bump github.com/docker/docker from 20.10.21+incompatible to 20.10.23+incompatible (#15848 ) * build(deps): bump github.com/docker/docker Bumps [github.com/docker/docker](https://github.com/docker/docker) from 20.10.21+incompatible to 20.10.23+incompatible. - [Release notes](https://github.com/docker/docker/releases) - [Commits](https://github.com/docker/docker/compare/v20.10.21...v20.10.23) --- updated-dependencies: - dependency-name: github.com/docker/docker dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * changelog: add entry for docker/docker --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Seth Hoenig <shoenig@duck.com>	2023-01-30 09:10:06 -06:00
dependabot[bot]	565078ba28	build(deps): bump github.com/brianvoe/gofakeit/v6 from 6.19.0 to 6.20.1 (#15846 ) Bumps [github.com/brianvoe/gofakeit/v6](https://github.com/brianvoe/gofakeit) from 6.19.0 to 6.20.1. - [Release notes](https://github.com/brianvoe/gofakeit/releases) - [Commits](https://github.com/brianvoe/gofakeit/compare/v6.19.0...v6.20.1) --- updated-dependencies: - dependency-name: github.com/brianvoe/gofakeit/v6 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-01-30 09:06:08 -06:00
dependabot[bot]	0fb2650baf	build(deps): bump github.com/hashicorp/vault/api from 1.8.2 to 1.8.3 (#15847 ) Bumps [github.com/hashicorp/vault/api](https://github.com/hashicorp/vault) from 1.8.2 to 1.8.3. - [Release notes](https://github.com/hashicorp/vault/releases) - [Changelog](https://github.com/hashicorp/vault/blob/main/CHANGELOG.md) - [Commits](https://github.com/hashicorp/vault/compare/v1.8.2...v1.8.3) --- updated-dependencies: - dependency-name: github.com/hashicorp/vault/api dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-01-30 09:04:20 -06:00
Piotr Kazmierczak	14b53df3b6	renamed stanza to block for consistency with other projects (#15941 )	2023-01-30 15:48:43 +01:00
dependabot[bot]	24b85bf02b	build(deps): bump github.com/docker/cli (#15938 ) Bumps [github.com/docker/cli](https://github.com/docker/cli) from 20.10.22+incompatible to 20.10.23+incompatible. - [Release notes](https://github.com/docker/cli/releases) - [Commits](https://github.com/docker/cli/compare/v20.10.22...v20.10.23) --- updated-dependencies: - dependency-name: github.com/docker/cli dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-01-30 08:47:12 -06:00
舍我其谁	3abb453bd0	volume: Add the missing option propagation_mode (#15626 )	2023-01-30 09:32:07 -05:00
Dao Thanh Tung	2fd908f63f	Fix documentation for `meta` block: string replacement in key from `-` to `_` (#15940 ) Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>	2023-01-30 14:51:04 +01:00
James Rasell	6accfb1f43	cli: separate auth method config output for easier reading. (#15892 )	2023-01-30 11:44:26 +01:00
James Rasell	06664baeb1	docs: add ACL concepts page to introduce objects. (#15895 )	2023-01-30 11:00:29 +01:00
dependabot[bot]	52a86b9d32	build(deps): bump github.com/shoenig/test from 0.6.0 to 0.6.1 in /api (#15939 ) * build(deps): bump github.com/shoenig/test from 0.6.0 to 0.6.1 in /api Bumps [github.com/shoenig/test](https://github.com/shoenig/test) from 0.6.0 to 0.6.1. - [Release notes](https://github.com/shoenig/test/releases) - [Commits](https://github.com/shoenig/test/compare/v0.6.0...v0.6.1) --- updated-dependencies: - dependency-name: github.com/shoenig/test dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * deps: update test --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Seth Hoenig <shoenig@duck.com>	2023-01-29 14:03:56 -06:00
Seth Hoenig	074b76e3bf	consul: check for acceptable service identity on consul tokens (#15928 ) When registering a job with a service and 'consul.allow_unauthenticated=false', we scan the given Consul token for an acceptable policy or role with an acceptable policy, but did not scan for an acceptable service identity (which is backed by an acceptable virtual policy). This PR updates our consul token validation to also accept a matching service identity when registering a service into Consul. Fixes #15902	2023-01-27 18:15:51 -06:00
Tim Gross	d2fc65764e	docs: add more warnings about running agent as root on Linux (#15926 )	2023-01-27 15:22:18 -05:00
Tim Gross	40a47f63f2	docs: add post-install steps for CNI to main install docs page (#15919 ) The getting started Tutorial has a post-installation steps section that includes installing CNI plugins. Many users will want to use `bridge` networking right out of the gate, so adding these same post-install instructions to the main docs will be a better Day 0 experience for them.	2023-01-27 13:16:14 -05:00
Seth Hoenig	0fac4e19b3	client: always run alloc cleanup hooks on final update (#15855 ) * client: run alloc pre-kill hooks on last pass despite no live tasks This PR fixes a bug where alloc pre-kill hooks were not run in the edge case where there are no live tasks remaining, but it is also the final update to process for the (terminal) allocation. We need to run cleanup hooks here, otherwise they will not run until the allocation gets garbage collected (i.e. via Destroy()), possibly at a distant time in the future. Fixes #15477 * client: do not run ar cleanup hooks if client is shutting down	2023-01-27 09:59:31 -06:00
Luiz Aoqui	de87cdc816	template: restore driver handle on update (#15915 ) When the template hook Update() method is called it may recreate the template manager if the Nomad or Vault token has been updated. This caused the new template manager to not have a driver handle, because the handle was only being set in the Poststart hook, which is not called for in-place updates.	2023-01-27 10:55:59 -05:00
Tim Gross	881a4cfaff	metrics: Add remaining server RPC rate metrics (#15901 )	2023-01-27 08:29:53 -05:00
Luiz Aoqui	09fc054c82	ui: fix alloc memory stats to match CLI output (#15909 )	2023-01-26 17:08:13 -05:00
Tim Gross	ce3eef8037	metrics: Add rate metrics to Client CSI endpoints (#15905 ) Also tightens up authentication for these endpoints by enforcing the server certificate name is valid. We protect these endpoints currently by mTLS and can't use an auth token because these endpoints are (uniquely) called by the leader and followers for a given node won't have the leader's ephemeral ACL token. Add a certificate name check that requests come from a server and not a client, because no client should ever send these RPCs directly.	2023-01-26 16:40:58 -05:00
Luiz Aoqui	f133866fd3	ui: scope variables form footer style (#15907 ) Scope the `footer` tag SCSS rule for the New Variable form to prevent it from affecting other `<footer>` elements, such as the gutter menu Nomad version section.	2023-01-26 16:18:08 -05:00
Luiz Aoqui	bb323ef3de	ui: fix navigation for namespaced jobs in search and job version (#15906 )	2023-01-26 16:03:07 -05:00
Tim Gross	bed8716e44	metrics: Add metrics to unauthenticated endpoints (#15899 )	2023-01-26 15:05:51 -05:00
Tim Gross	5e75ea9fb3	metrics: Add RPC rate metrics to endpoints that validate TLS names (#15900 )	2023-01-26 15:04:25 -05:00
Seth Hoenig	caba88128e	artifact: enable reading system git/mercurial configuration (#15903 ) This PR adjusts the artifact sandbox on Linux to enable reading from known system-wide git or mercurial configuration, if they exist. Folks doing something odd like specifying custom paths for global config will need to use the standard locations, or disable artifact filesystem isolation.	2023-01-26 13:07:40 -06:00
Jai	033bb7346a	ui: add client links to topo viz (#15867 ) * ui: add client links to topo viz * test: add node name for a11y violation	2023-01-26 10:14:13 -05:00
Seth Hoenig	7375fd40fc	nsd: block on removal of services (#15862 ) * nsd: block on removal of services This PR uses a WaitGroup to ensure workload removals are complete before returning from ServiceRegistrationHandler.RemoveWorkload of the nomad service provider. The de-registration of individual services still occurs asynchronously, but we must block on the parent removal call so that we do not race with further operations on the same set of services - e.g. in the case of a task restart where we de-register and then re-register the services in quick succession. Fixes #15032 * nsd: add e2e test for initial failing check and restart	2023-01-26 08:17:57 -06:00
Yorick Gersie	2a5c423ae0	Allow per_alloc to be used with host volumes (#15780 ) Disallowing per_alloc for host volumes in some cases makes life of a nomad user much harder. When we rely on the NOMAD_ALLOC_INDEX for any configuration that needs to be re-used across restarts we need to make sure allocation placement is consistent. With CSI volumes we can use the `per_alloc` feature but for some reason this is explicitly disabled for host volumes. Ensure host volumes understand the concept of per_alloc	2023-01-26 09:14:47 -05:00
Piotr Kazmierczak	f4d6efe69f	acl: make auth method default across all types (#15869 )	2023-01-26 14:17:11 +01:00
James Rasell	5d33891910	sso: allow binding rules to create management ACL tokens. (#15860 ) * sso: allow binding rules to create management ACL tokens. * docs: update binding rule docs to detail management type addition.	2023-01-26 09:57:44 +01:00
dependabot[bot]	0bb408bc10	build(deps): bump github.com/aws/aws-sdk-go from 1.44.175 to 1.44.184 (#15849 ) Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.44.175 to 1.44.184. - [Release notes](https://github.com/aws/aws-sdk-go/releases) - [Commits](https://github.com/aws/aws-sdk-go/compare/v1.44.175...v1.44.184) --- updated-dependencies: - dependency-name: github.com/aws/aws-sdk-go dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-01-25 19:04:56 -05:00
scottduszy	851a3a8e6c	docs: correct "User" attribute in Podman Task Driver Docs (#15421 )	2023-01-25 18:52:16 -05:00
Tim Gross	6677a103c2	metrics: measure rate of RPC requests that serve API (#15876 ) This changeset configures the RPC rate metrics that were added in #15515 to all the RPCs that support authenticated HTTP API requests. These endpoints were already configured with pre-forwarding authentication in #15870, and a handful of others were done already as part of the proof-of-concept work. So this changeset is entirely copy-and-pasting one method call into a whole mess of handlers. Upcoming PRs will wire up pre-forwarding auth and rate metrics for the remaining set of RPCs that have no API consumers or aren't authenticated, in smaller chunks that can be more thoughtfully reviewed.	2023-01-25 16:37:24 -05:00
Luiz Aoqui	f2dd46d1db	docs: add caveat on dynamic blocks (#15857 )	2023-01-25 15:54:45 -05:00
Luiz Aoqui	3479e2231f	core: enforce strict steps for clients reconnect (#15808 ) When a Nomad client that is running an allocation with `max_client_disconnect` set misses a heartbeat, the Nomad server will update its status to `disconnected`. Upon reconnecting, the client will make three main RPC calls: - `Node.UpdateStatus` is used to set the client status to `ready`. - `Node.UpdateAlloc` is used to update the client-side information about allocations, such as their `ClientStatus`, task states etc. - `Node.Register` is used to upsert the entire node information, including its status. These calls are made concurrently and are also running in parallel with the scheduler. Depending on the order they run, the scheduler may end up with incomplete data when reconciling allocations. For example, a client disconnects and its replacement allocation cannot be placed anywhere else, so there's a pending eval waiting for resources. When this client comes back the order of events may be: 1. Client calls `Node.UpdateStatus` and is now `ready`. 2. Scheduler reconciles allocations and places the replacement alloc on the client. The client is now assigned two allocations: the original alloc that is still `unknown` and the replacement that is `pending`. 3. Client calls `Node.UpdateAlloc` and updates the original alloc to `running`. 4. Scheduler notices too many allocs and stops the replacement. This creates unnecessary placements or, in a different order of events, may leave the job without any allocations running until the whole state is updated and reconciled. To avoid problems like this, clients must update _all_ of their relevant information before they can be considered `ready` and available for scheduling. To achieve this goal the RPC endpoints mentioned above have been modified to enforce strict steps for nodes reconnecting: - `Node.Register` does not set the client status anymore. - `Node.UpdateStatus` sets the reconnecting client to the `initializing` status until it successfully calls `Node.UpdateAlloc`. These changes are done server-side to avoid the need of additional coordination between clients and servers. Clients are kept oblivious of these changes and will keep making these calls as they normally would. The verification of whether allocations have been updated is done by storing and comparing the Raft index of the last time the client missed a heartbeat and the last time it updated its allocations.	2023-01-25 15:53:59 -05:00
Tim Gross	f3f64af821	WI: allow workloads to use RPCs associated with HTTP API (#15870 ) This changeset allows Workload Identities to authenticate to all the RPCs that support HTTP API endpoints, for use with PR #15864. * Extends the work done for pre-forwarding authentication to all RPCs that support an HTTP API endpoint. * Consolidates the auth helpers used by the CSI, Service Registration, and Node endpoints that are currently used to support both tokens and client secrets. Intentionally excluded from this changeset: * The Variables endpoint still has custom handling because of the implicit policies. Ideally we'll figure out an efficient way to resolve those into real policies and then we can get rid of that custom handling. * The RPCs that don't currently support auth tokens (i.e. those that don't support HTTP endpoints) have not been updated with the new pre-forwarding auth. We'll be doing this under a separate PR to support RPC rate metrics.	2023-01-25 14:33:06 -05:00
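
The reconnect gating described in commit 3479e2231f can be sketched roughly as follows. This is a minimal illustration, not Nomad's actual implementation: the `node` type, its field names, and the three methods are hypothetical. Only the status names and the idea of comparing the Raft index of the last missed heartbeat with the index of the last allocation update come from the commit message.

```go
package main

import "fmt"

// Status names taken from the commit message.
const (
	StatusDisconnected = "disconnected"
	StatusInitializing = "initializing"
	StatusReady        = "ready"
)

// node is a hypothetical, simplified stand-in for the server-side node record.
type node struct {
	Status                   string
	LastMissedHeartbeatIndex uint64 // Raft index when a heartbeat was last missed
	LastAllocUpdateIndex     uint64 // Raft index of the last allocation update
}

// missHeartbeat marks the node as disconnected at the given Raft index.
func (n *node) missHeartbeat(index uint64) {
	n.Status = StatusDisconnected
	n.LastMissedHeartbeatIndex = index
}

// updateAlloc records that the client reported its allocations at the given index.
func (n *node) updateAlloc(index uint64) {
	n.LastAllocUpdateIndex = index
}

// updateStatus applies a requested status. A reconnecting node that asks to
// become ready is held in "initializing" until its allocation update is newer
// than its last missed heartbeat.
func (n *node) updateStatus(requested string) {
	reconnecting := n.Status == StatusDisconnected || n.Status == StatusInitializing
	stale := n.LastAllocUpdateIndex <= n.LastMissedHeartbeatIndex
	if requested == StatusReady && reconnecting && stale {
		n.Status = StatusInitializing
		return
	}
	n.Status = requested
}

func main() {
	n := &node{Status: StatusReady}
	n.missHeartbeat(10)

	n.updateStatus(StatusReady) // allocations not yet reported
	fmt.Println(n.Status)

	n.updateAlloc(12)
	n.updateStatus(StatusReady) // allocations reported after the missed heartbeat
	fmt.Println(n.Status)
}
```

In this sketch the scheduler would only treat the node as available once `Status` is `ready`, which is the ordering the commit message sets out to enforce.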

24775 commits