PR #11956 implemented a new mTLS RPC check to validate the role of the
certificate used in the request, but further testing revealed two flaws:
1. client-only endpoints did not accept server certificates, so the
request would fail when forwarded from one server to another.
2. the certificate was being checked after the request was forwarded,
so the check ran against the forwarding server's certificate, not the
actual source's.
This commit checks for the desired mTLS level, where the client level
accepts either a server or a client certificate. It also validates the
certificate before the request is forwarded.
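A minimal sketch of the level check, assuming the certificate role can be read from the common name (e.g. `server.<region>.nomad`); names are illustrative rather than Nomad's actual API:

```go
package rpc

import (
	"crypto/x509"
	"fmt"
	"strings"
)

type mTLSLevel int

const (
	levelClient mTLSLevel = iota // a client or a server certificate is acceptable
	levelServer                  // only a server certificate is acceptable
)

// checkCertRole validates the presented certificate against the desired
// mTLS level. It must run before the request is forwarded, otherwise the
// forwarding server's certificate would be checked instead of the source's.
func checkCertRole(cert *x509.Certificate, level mTLSLevel) error {
	if cert == nil {
		return fmt.Errorf("missing certificate")
	}
	isServer := strings.HasPrefix(cert.Subject.CommonName, "server.")
	isClient := strings.HasPrefix(cert.Subject.CommonName, "client.")

	switch level {
	case levelServer:
		if !isServer {
			return fmt.Errorf("certificate %q is not a server certificate", cert.Subject.CommonName)
		}
	case levelClient:
		// Client-level endpoints accept either role so that
		// server-to-server forwarded requests still pass.
		if !isServer && !isClient {
			return fmt.Errorf("certificate %q has no recognized role", cert.Subject.CommonName)
		}
	}
	return nil
}
```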
Non-CSI garbage collection tasks on the server only log the cutoff
index in the case where it's not a forced GC from `nomad system gc`.
Do the same for CSI for consistency.
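A small sketch of that rule, with illustrative names:

```go
package gc

import "log"

// logCutoff sketches the consistent behavior described above: a forced
// GC from `nomad system gc` collects everything, so the cutoff index is
// only meaningful (and only logged) for periodic collections.
func logCutoff(logger *log.Logger, forced bool, cutoffIndex uint64) {
	if forced {
		return
	}
	logger.Printf("CSI volume GC scanning before cutoff index %d", cutoffIndex)
}
```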
Update the logic in the Nomad client's alloc health tracker, which
erroneously marks existing healthy allocations with dead poststart ephemeral
tasks as unhealthy even if they were already successful during a previous
deployment.
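A minimal sketch of the corrected check, with trimmed, illustrative types:

```go
package allochealth

// TaskState is a trimmed stand-in for Nomad's task state; field names
// here are illustrative only.
type TaskState struct {
	State  string // "pending", "running", or "dead"
	Failed bool
}

// taskIsHealthy sketches the corrected rule: a dead poststart ephemeral
// (non-sidecar lifecycle) task that exited successfully must not mark an
// otherwise healthy allocation as unhealthy.
func taskIsHealthy(ts *TaskState, isPoststartEphemeral bool) bool {
	switch ts.State {
	case "running":
		return true
	case "dead":
		// A short-lived poststart task is expected to finish; only a
		// failed exit counts against allocation health.
		return isPoststartEphemeral && !ts.Failed
	default:
		return false
	}
}
```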
This PR replaces use of time.After with a safe helper function
that creates a time.Timer to use instead. The new function returns
both a time.Timer and a Stop function that the caller must handle.
Unlike time.NewTimer, the helper function does not panic if the duration
set is <= 0.
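A minimal sketch of such a helper and the intended call pattern, with illustrative names:

```go
package helper

import "time"

// StopFunc stops a timer; callers should defer it so the timer's
// resources are always released.
type StopFunc func()

// NewSafeTimer returns a *time.Timer plus a Stop function, clamping
// non-positive durations to the smallest positive value so callers do
// not need to special-case them.
func NewSafeTimer(d time.Duration) (*time.Timer, StopFunc) {
	if d <= 0 {
		d = 1 // smallest positive duration
	}
	t := time.NewTimer(d)
	return t, func() { t.Stop() }
}
```

A caller would then write `timer, stop := NewSafeTimer(wait)`, `defer stop()`, and select on `timer.C` wherever it previously selected on `time.After(wait)`.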
PR #11550 changed the job stop exit behaviour when monitoring the
deployment. When stopping a job, the deployment becomes cancelled
and therefore the CLI now exits with status code 1 as it sees this
as an error.
This change adds a new utility e2e function that accounts for this
behaviour.
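A sketch of what such a helper might look like, assuming the CLI reports the cancelled deployment in its output (names and matched strings are illustrative):

```go
package e2eutil

import (
	"fmt"
	"os/exec"
	"strings"
)

// StopJob is a sketch of the utility described above (the real helper
// and its checks may differ). Stopping a job cancels its deployment, so
// the monitoring CLI exits 1 even though the stop itself succeeded;
// treat that case as success.
func StopJob(jobID string, args ...string) error {
	cmdArgs := append([]string{"job", "stop"}, args...)
	cmdArgs = append(cmdArgs, jobID)

	out, err := exec.Command("nomad", cmdArgs...).CombinedOutput()
	if err != nil {
		if exitErr, ok := err.(*exec.ExitError); ok &&
			exitErr.ExitCode() == 1 &&
			strings.Contains(string(out), "cancelled") {
			// Deployment cancelled because the job was stopped: not an error.
			return nil
		}
		return fmt.Errorf("job stop failed: %w\n%s", err, out)
	}
	return nil
}
```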
Previously we copied this library by hand to avoid vendoring a bunch of
files related to minimock. Now that we no longer vendor, just import the
library normally.
Also we might use more of the library for handling `time.After` uses,
for which this library provides a Context-based solution.
The Plan.Submit endpoint assumed PlanRequest.Plan was never nil. While
there is no evidence it ever has been nil, we should not panic if a nil
plan is ever submitted because that would crash the leader.
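A minimal sketch of the guard, using trimmed stand-ins for the RPC structs:

```go
package planner

import "errors"

// Trimmed stand-ins for the RPC structs; only the nil guard matters here.
type Plan struct{ /* ... */ }

type PlanRequest struct {
	Plan *Plan
}

type PlanResponse struct{}

type Planner struct{}

// Submit sketches the guard described above: reject a nil plan with an
// error instead of letting a nil dereference panic crash the leader.
func (p *Planner) Submit(args *PlanRequest, reply *PlanResponse) error {
	if args == nil || args.Plan == nil {
		return errors.New("cannot submit nil plan")
	}
	// ... evaluate and apply the plan as before ...
	return nil
}
```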
When an allocation stops, the `csi_hook` makes an unpublish RPC to the
servers to unpublish via the CSI RPCs: first to the node plugins and
then the controller plugins. The controller RPCs must happen after the
node RPCs so that the node has had a chance to unmount the volume
before the controller tries to detach the associated device.
But the client has local access to the node plugins and can
independently determine if it's safe to send unpublish RPCs to those
plugins. This will allow the server to treat the node plugin as
abandoned if a client is disconnected and `stop_on_client_disconnect`
is set. This will let the server try to send unpublish RPCs to the
controller plugins, under the assumption that the client will be
trying to unmount the volume on its end first.
Note that the CSI `NodeUnpublishVolume`/`NodeUnstageVolume` RPCs can
return ignorable errors in the case where the volume has already been
unmounted from the node. Handle all other errors by retrying until we
get success so as to give operators the opportunity to reschedule a
failed node plugin (ex. in the case where they accidentally drained a
node without `-ignore-system`). Fan-out the work for each volume into
its own goroutine so that we can release a subset of volumes if only
one is stuck.
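A sketch of the fan-out and retry described above, under the assumption that ignorable errors can be detected from the RPC error (names, intervals, and matched strings are illustrative):

```go
package csihook

import (
	"context"
	"strings"
	"sync"
	"time"
)

// unpublishVolumes sketches the fan-out and retry behavior described
// above; unpublishFn stands in for the node-plugin unpublish call and
// the retry interval is illustrative.
func unpublishVolumes(ctx context.Context, volumeIDs []string,
	unpublishFn func(context.Context, string) error) {

	var wg sync.WaitGroup
	for _, volID := range volumeIDs {
		wg.Add(1)
		// One goroutine per volume, so a single stuck volume does not
		// block releasing the others.
		go func(volID string) {
			defer wg.Done()
			for {
				err := unpublishFn(ctx, volID)
				if err == nil || isIgnorable(err) {
					return
				}
				// Retry until success so a failed node plugin can be
				// rescheduled by the operator in the meantime.
				t := time.NewTimer(5 * time.Second)
				select {
				case <-ctx.Done():
					t.Stop()
					return
				case <-t.C:
				}
			}
		}(volID)
	}
	wg.Wait()
}

// isIgnorable sketches the check for errors meaning the volume is
// already unmounted from the node; the matched strings are illustrative.
func isIgnorable(err error) bool {
	msg := err.Error()
	return strings.Contains(msg, "volume not found") ||
		strings.Contains(msg, "not mounted")
}
```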
* The volume claim GC method and volumewatcher both have logic
collecting terminal allocations that duplicates most of the logic
that's now in the state store's `CSIVolumeDenormalize` method. Copy
this logic into the state store so that all code paths have the same
view of the past claims.
* Remove logic in the volume claim GC that now lives in the state
store's `CSIVolumeDenormalize` method.
* Remove logic in the volumewatcher that now lives in the state
store's `CSIVolumeDenormalize` method.
* Remove logic in the node unpublish RPC that now lives in the state
store's `CSIVolumeDenormalize` method.
In the client's `(*csiHook) Postrun()` method, we make an unpublish
RPC that includes a claim in the `CSIVolumeClaimStateUnpublishing`
state, using the mode from the client. But then in the
`(*CSIVolume) Unpublish` RPC handler, we query the volume from the
state store (because we only get an ID from the client). And when we
make the client RPC for the node unpublish step, we use the _current
volume's_ view of the mode. If the volume's mode has been changed
before the old allocations can have their claims released, then we end
up making a CSI RPC that will never succeed.
Why does this code path get the mode from the volume and not the
claim? Because the claim written by the GC job in `(*CoreScheduler)
csiVolumeClaimGC` doesn't have a mode. Instead it just writes a claim
in the unpublishing state to ensure the volumewatcher detects a "past
claim" change and reaps all the claims on the volumes.
Fix this by ensuring that `CSIVolumeDenormalize` creates past
claims for all nil allocations with the correct access mode set.
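A minimal sketch of that behavior, with trimmed stand-ins for the state store types:

```go
package state

// Trimmed stand-ins for Nomad's volume and claim structs; names and
// states here are illustrative.
type CSIVolumeClaim struct {
	AllocationID string
	NodeID       string
	Mode         string // access mode, e.g. read or write
	State        string
}

type CSIVolume struct {
	ReadClaims  map[string]*CSIVolumeClaim
	WriteClaims map[string]*CSIVolumeClaim
	PastClaims  map[string]*CSIVolumeClaim
}

// denormalizePastClaims sketches the fix: any claim whose allocation no
// longer exists (allocExists returns false) becomes a "past claim" in
// the unpublishing state, carrying the access mode from the original
// claim rather than the volume's current mode.
func denormalizePastClaims(vol *CSIVolume, allocExists func(allocID string) bool) {
	if vol.PastClaims == nil {
		vol.PastClaims = map[string]*CSIVolumeClaim{}
	}
	collect := func(claims map[string]*CSIVolumeClaim) {
		for allocID, claim := range claims {
			if !allocExists(allocID) {
				past := *claim // copy; keep the claim's own access mode
				past.State = "unpublishing"
				vol.PastClaims[allocID] = &past
			}
		}
	}
	collect(vol.ReadClaims)
	collect(vol.WriteClaims)
}
```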
* csi: resolve invalid claim states on read
It's currently possible for CSI volumes to be claimed by allocations
that no longer exist. This changeset asserts a reasonable state at
the state store level by registering these nil allocations as "past
claims" on any read. This will cause any pass through the periodic GC
or volumewatcher to trigger the unpublishing workflow for those claims.
* csi: make feasibility check errors more understandable
When the feasibility checker finds we have no free write claims, it
checks to see if any of those claims are for the job we're currently
scheduling (so that earlier versions of a job can't block claims for
new versions) and reports a conflict if the volume can't be scheduled
so that the user can fix their claims. But when the checker hits a
claim that has a GCd allocation, the state is recoverable by the
server once claim reaping completes and no user intervention is
required; the blocked eval should complete. Differentiate the
scheduler error produced by these two conditions.
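One way to sketch the differentiation (error messages are illustrative, not the scheduler's actual wording):

```go
package scheduler

import "errors"

// Two distinct errors for the conditions described above.
var (
	// Needs the user to fix their claims.
	errVolumeMaxClaims = errors.New("volume has exhausted its write claims and is claimed by another job")
	// Resolves itself once the server finishes reaping the GC'd claims.
	errVolumeGCClaims = errors.New("volume write claims are held by garbage-collected allocations and will be released")
)

// checkWriteFeasibility sketches how the checker differentiates the two.
func checkWriteFeasibility(freeWriteClaims int, claimedByCurrentJob, claimHeldByGCdAlloc bool) error {
	if freeWriteClaims > 0 || claimedByCurrentJob {
		return nil
	}
	if claimHeldByGCdAlloc {
		return errVolumeGCClaims
	}
	return errVolumeMaxClaims
}
```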
The volumewatcher that runs on the leader needs to make RPC calls
rather than writing to raft (as we do in the deploymentwatcher)
because the unpublish workflow needs to make RPC calls to the
clients. This requires that the volumewatcher has access to the
leader's ACL token.
But when leadership transitions, the new leader creates a new leader
ACL token. This ACL token needs to be passed into the volumewatcher
when we enable it; otherwise the volumewatcher can find itself with a
stale token.
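A minimal sketch of the enable path with the token passed in (names are illustrative):

```go
package volumewatcher

import "sync"

// Watcher is a trimmed stand-in for the leader's volumewatcher; field
// and method names are illustrative.
type Watcher struct {
	mu        sync.Mutex
	enabled   bool
	leaderACL string // leader ACL token used for client unpublish RPCs
}

// SetEnabled sketches the fix: the current leader ACL token is passed in
// on every enable (i.e. every leadership transition), so the watcher
// never keeps a token from a previous leader.
func (w *Watcher) SetEnabled(enabled bool, leaderACL string) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.enabled = enabled
	w.leaderACL = leaderACL
}
```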