open-nomad

Author	SHA1	Message	Date
Tim Gross	70c262eb95	E2E: terraform provisioner upgrades (#12652) While working on infrastructure for testing the UI in E2E, we needed to upgrade the certificate provider. Performing a provider upgrade via the TF `init -upgrade` brought in updates for the file and AWS providers as well. These updates include deprecating the use of `sensitive_content` fields, removing CA algorithm parameters that can be inferred from keys, and removing the requirement to manually specify AWS assume role parameters in the provider config if they're available in the calling environment's AWS config file (as they are via doormat or our E2E environment).	2022-04-19 14:27:14 -04:00
Seth Hoenig	8084dd29a1	Merge pull request #12604 from hashicorp/b-fixup-chroot-test ci: fixup task runner chroot test	2022-04-19 12:58:03 -05:00
Seth Hoenig	46066fb7fb	Merge pull request #12622 from hashicorp/b-fix-docker-logger-test ci: fix docker logger not supported test	2022-04-19 12:57:47 -05:00
Seth Hoenig	d1bda4a954	ci: fixup task runner chroot test This PR contains 2 fixes for the flaky TestTaskRunner_TaskEnv_Chroot test and the TestTaskRunner_Download_ChrootExec test. - Use TinyChroot to stop copying gigabytes of junk, which causes GHA to fail to create the environment in time. - Pre-create cgroups on V2 systems. Normally the cgroup directory is managed by the cpuset manager, but that is not active in taskrunner tests, so create it by hand in the test framework.	2022-04-19 10:37:46 -05:00
Seth Hoenig	16cab10346	ci: fix docker logger not supported test This test checks for behavior when asking for logs of a docker task configured with a log driver that does not support streaming logs. Previously this was using the 'gelf' log driver, but it seems that no longer returns an error as expected. Instead we can just use the 'none' log driver, which has the desired effect: 2022-04-19T10:23:19.129-0500 [ERROR] docklog/docker_logger.go:133: log streaming ended with terminal error: error="API error (501): configured logging driver does not support reading"	2022-04-19 10:27:01 -05:00
Luiz Aoqui	8dccc48f17	changelog: fix entry for #11927 (#12577)	2022-04-19 10:46:25 -04:00
Luiz Aoqui	950a2109aa	changelog: add entry for #11944 (#12578)	2022-04-19 10:46:11 -04:00
Seth Hoenig	411158acff	Merge pull request #12586 from hashicorp/f-local-si-token connect: create SI tokens in local scope	2022-04-19 07:53:01 -05:00
Seth Hoenig	a7950e5624	cl: add missing prefix	2022-04-19 07:48:56 -05:00
Derek Strickland	7c6eb47b78	`consul-template`: revert `function_denylist` logic (#12071) * consul-template: replace config rather than append Co-authored-by: Seth Hoenig <seth.a.hoenig@gmail.com>	2022-04-18 13:57:56 -04:00
chavacava	eb1c42e643	QueryOptions.SetTimeToBlock should take pointer receiver Fixes a bug where blocking queries that are retried don't have their blocking timeout reset, resulting in them running longer than expected.	2022-04-18 10:41:27 -04:00
Tim Gross	0cf14a49cc	CI: build binaries for UI branches (#12594) Build binaries for every code change, not just backend code changes. This means that we'll have up-to-date compiled assets for every commit available in CircleCI artifacts.	2022-04-18 10:29:20 -04:00
Seth Hoenig	df587d8263	docs: update documentation with connect acls changes This PR updates the changelog, adds notes to the 1.3 upgrade guide, and updates the connect integration docs with documentation about the new requirement on Consul ACL policies of Consul agent default anonymous ACL tokens.	2022-04-18 08:22:33 -05:00
Jorge Marey	707c7f3a11	Change consul SI tokens to be local	2022-04-18 08:22:33 -05:00
Shishir	f5121d261e	Add os to NodeListStub struct. (#12497) * Add os to NodeListStub struct. Signed-off-by: Shishir Mahajan <smahajan@roblox.com> * Add os as a query param to /v1/nodes. Signed-off-by: Shishir Mahajan <smahajan@roblox.com> * Add test: os as a query param to /v1/nodes. Signed-off-by: Shishir Mahajan <smahajan@roblox.com>	2022-04-15 17:22:45 -07:00
Tim Gross	826d9d47f9	CSI: replace structs->api with serialization extension (#12583) The CSI HTTP API has to transform the CSI volume to redact secrets, remove the claims fields, and to consolidate the allocation stubs into a single slice of alloc stubs. This was done manually in #8590 but this is a large amount of code and has proven both very bug prone (see #8659, #8666, #8699, #8735, and #12150) and requires updating lots of code every time we add a field to volumes or plugins. In #10202 we introduce encoding improvements for the `Node` struct that allow a more minimal transformation. Apply this same approach to serializing `structs.CSIVolume` to API responses. Also, the original reasoning behind #8590 for plugins no longer holds because the counts are now denormalized within the state store, so we can simply remove this transformation entirely.	2022-04-15 14:29:34 -04:00
Tim Gross	b14e53e446	CSI: fix volume status prefix matching in CLI (#12584) The API for `CSIVolume.List` sorts by created index and not by ID, which breaks the logic for prefix matching in the `volume status` output when the prefix is also an exact match. Ensure that we're handling this case correctly.	2022-04-15 14:16:30 -04:00
Kevin Wang	c74c06746b	chore: redirects (#12560) Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2022-04-15 13:13:40 -04:00
Derek Strickland	4d3a0aae6d	heartbeat: Handle transitioning from disconnected to down (#12559)	2022-04-15 09:47:45 -04:00
Derek Strickland	0891218ee9	system_scheduler: support disconnected clients (#12555) * structs: Add helper method for checking if alloc is configured to disconnect * system_scheduler: Add support for disconnected clients	2022-04-15 09:31:32 -04:00
Tim Gross	f5d8c636c7	CSI: handle per-alloc volumes in `alloc status -verbose` CLI (#12573) The Nomad client's `csi_hook` interpolates the alloc suffix with the volume request's name for CSI volumes with `per_alloc = true`, turning `example` into `example[1]`. We need to do this same behavior in the `alloc status` output so that we show the correct volume.	2022-04-15 09:26:19 -04:00
Seth Hoenig	e8b0b91418	Merge pull request #12579 from hashicorp/ci-missing-packages-oss ci: ensure package coverage of test-core	2022-04-15 08:11:41 -05:00
Lars Lehtonen	81bb1ef030	command/agent: check err before close (#12574)	2022-04-15 08:54:03 -04:00
Seth Hoenig	47040391bb	ci: ensure package coverage of test-core	2022-04-14 19:04:06 -05:00
Michael Schurter	70a04dd106	docs: add plan for node rejected details and more (#12564) - Moved federation docs to the bottom since everyone is potentially affected by the other sections on the page, but only users of federation are affected by it. - Added section on the plan for node rejected bug since it is fairly easy to diagnose and removing affected nodes is a fairly reliable workaround. - Mention 5s cliff for wait_for_index. - Remove the lie that we do not have job status metrics! How old was that?! - Reinforce the importance of monitoring basic system resources	2022-04-14 16:09:33 -07:00
Tim Gross	d62dd5b3fe	E2E: add debugging outputs for disconnected clients test (#12572) This test has a failure that's happening only occasionally and not very reproducibly. Print out the allocation status on test failure so that we can do some post-mortem debugging of the test on nightly.	2022-04-14 17:03:57 -04:00
Tim Gross	267c056e0e	ui: remove beta tag from gutter menu for CSI (#12570)	2022-04-14 14:56:04 -04:00
Tim Gross	82b65899a1	fix data race in dynamic plugin registry tests (#12554) These tests have a data race where the test assertion is reading a value that's being set in the `listenFunc` goroutines that are subscribing to registry update events. Move the assertion into the subscribing goroutine to remove the race. This bug was discovered in #12098 but does not impact production Nomad code.	2022-04-14 14:55:56 -04:00
Seth Hoenig	6d042340b4	Merge pull request #12543 from idrennanvmware/add-allocid-to-sidecar Add alloc_id to sidecar bootstrap	2022-04-14 13:27:09 -05:00
Luiz Aoqui	8b2ea6b61b	ci: fix backport target branch pattern (#12571)	2022-04-14 14:12:41 -04:00
Seth Hoenig	a1c4f16cf1	connect: prefix tag with nomad.; merge into envoy_stats_tags; update docs This PR expands on the work done in #12543 to - prefix the tag, so it is now "nomad.alloc_id" to be more consistent with Consul tags - merge into pre-existing envoy_stats_tags fields - update the upgrade guide docs - update changelog	2022-04-14 12:52:52 -05:00
Ian Drennan	70bd32df83	Add alloc_id to sidecar bootstrap	2022-04-14 11:46:06 -05:00
Michael Schurter	7351f45672	test: test the buffered pipe used by nsd (#12563) Nomad Service Discovery uses an in-memory buffered pipe implementation to connect consul-template to the Nomad API. This adds a basic test for that helper functionality.	2022-04-14 08:38:25 -07:00
James Rasell	5a67866ae1	jobspec: add max_client_disconnect to hcl1 group parsing. (#12568)	2022-04-14 14:56:58 +02:00
Derek Strickland	3f871973f9	Update E2E terraform output command (#12561)	2022-04-13 16:46:09 -04:00
James Rasell	4cdc46ae75	service discovery: add pagination and filtering support to info requests (#12552) * services: add pagination and filter support to info RPC. * cli: add filter flag to service info command. * docs: add pagination and filter details to services info API. * paginator: minor updates to comment and func signature.	2022-04-13 07:41:44 +02:00
claire labry	d2a3fa1921	updates for backport assistant (#12311)	2022-04-12 14:01:19 -04:00
Tim Gross	a135d9b260	CSI: fix data race in plugin manager (#12553) The plugin manager for CSI hands out instances of a plugin for callers that need to mount a volume. The `MounterForPlugin` method accesses the internal instances map without a lock, and can be called concurrently from outside the plugin manager's main run-loop. The original commit for the instances map included a warning that it needed to be accessed only from the main loop but that comment was unfortunately ignored shortly thereafter, so this bug has existed in the code for a couple years without being detected until we ran tests with `-race` in #12098. Lesson learned here: comments make for lousy enforcement of invariants!	2022-04-12 12:18:04 -04:00
Luiz Aoqui	82027edb2f	add some godocs for the API pagination tokenizer options (#12547)	2022-04-12 10:27:22 -04:00
Tim Gross	4078e6ea0e	scripts: fix interpreter for bash (#12549) Many of our scripts have a non-portable interpreter line for bash and use bash-specific variables like `BASH_SOURCE`. Update the interpreter line to be portable between various Linuxes and macOS without complaint from posix shell users.	2022-04-12 10:08:21 -04:00
Tim Gross	31e72e93ff	E2E: fix flaky event stream test (#12548) This changeset fixes two sources of flakiness in the event stream test. First, the stream request gets the event closest to the index, not the exact match. Although events are written before raft entries they're written asynchronously, so it's possible to race and get a raft index from this query higher than the current head of the event buffer. Ensure the job is running before we try to get the index, so that we've given the event enough time to land in the buffer. Second, the assertion that the found index is greater than the start index is only true if the `PlanResult` event manages to land before we do the second registration. Although it should now with the first fix above, it's not a correct assertion for what we're testing.	2022-04-12 08:35:39 -04:00
Luiz Aoqui	bc78c8617f	ci: change notification channel to feed-nomad-releases (#12550)	2022-04-11 19:12:58 -04:00
claire labry	76fc79ce46	move nomad.service out of etc (#12541)	2022-04-11 18:26:10 -04:00
Seth Hoenig	f59488bda6	Merge pull request #12532 from greut/feat/remove-consul-lib feat: remove dependency to consul/lib	2022-04-11 13:52:05 -05:00
Karan Sharma	37c907a8d2	feat: add nomctx and nomad-events-sink (#12542)	2022-04-11 14:47:03 -04:00
Yoan Blanc	3e79d58e4a	fix: use NewSafeTimer Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2022-04-11 19:37:14 +02:00
Tim Gross	77ab8d92f1	E2E: oversubscription assertion needs to wait for stats (#12540) The oversubscription test expects an output that requires the client has polled the task for stats at least once. Wait long enough to ensure that we've polled the stats before failing the test.	2022-04-11 11:40:51 -04:00
Tim Gross	c9c3cbd878	E2E: test for nodes disconnected by netsplit (#12407)	2022-04-11 11:34:27 -04:00
Tim Gross	57b3a0028f	allocs without max_client_disconnect should be lost on disconnect (#12529) In the reconciler's filtering for tainted nodes, we use whether the server supports disconnected clients as a gate to a bunch of our logic, but this doesn't account for cases where the job doesn't have `max_client_disconnect`. The only real consequence of this appears to be that allocs on disconnected nodes are marked "complete" instead of "lost".	2022-04-11 11:24:49 -04:00
Seth Hoenig	fecf4b46eb	Merge pull request #12527 from fynxiu/plugins/drivers/ctxdone fix(plugins): should return when ctx.Done	2022-04-11 07:46:39 -05:00

23319 commits