open-nomad

Commit Graph

Author	SHA1	Message	Date
James Rasell	4b9bcf94da	chore: remove use of "err" as a log line context key for errors. (#14433) Log lines which include an error should use the full term "error" as the context key. This provides consistency across the codebase and avoids a Go style which operators might not be aware of.	2022-09-01 15:06:10 +02:00
Seth Hoenig	0d97a94814	drivers/java: add parsing test case for corretto 17	2022-08-24 09:16:38 -05:00
Tim Gross	c763c4cb96	remove pre-0.9 driver code and related E2E test (#12791) This test exercises upgrades between 0.8 and Nomad versions greater than 0.9. We have not supported 0.8.x in a very long time and in any case the test has been marked to skip because the downloader doesn't work.	2022-04-27 09:53:37 -04:00
Seth Hoenig	2e5c6de820	client: enable support for cgroups v2 This PR introduces support for using Nomad on systems with cgroups v2 [1] enabled as the cgroups controller mounted on /sys/fs/cgroups. Newer Linux distros like Ubuntu 21.10 are shipping with cgroups v2 only, causing problems for Nomad users. Nomad mostly "just works" with cgroups v2 due to the indirection via libcontainer, but not so for managing cpuset cgroups. Previously, Nomad made use of a feature in v1 where a PID could be a member of more than one cgroup. In v2 this is no longer possible, and so the logic around computing cpuset values must be modified. When Nomad detects v2, it manages cpuset values in-process, rather than making use of cgroup hierarchy inheritance via shared/reserved parents. Nomad will only activate the v2 logic when it detects cgroups2 is mounted at /sys/fs/cgroups. This means on systems running in hybrid mode with cgroups2 mounted at /sys/fs/cgroups/unified (as is typical) Nomad will continue to use the v1 logic, and should operate as before. Systems that do not support cgroups v2 are also not affected. When v2 is activated, Nomad will create a parent called nomad.slice (unless otherwise configured in Client config), and create cgroups for tasks using naming convention <allocID>-<task>.scope. These follow the naming convention set by systemd and also used by Docker when cgroups v2 is detected. Client nodes now export a new fingerprint attribute, unique.cgroups.version which will be set to 'v1' or 'v2' to indicate the cgroups regime in use by Nomad. The new cpuset management strategy fixes #11705, where docker tasks that spawned processes on startup would "leak". In cgroups v2, the PIDs are started in the cgroup they will always live in, and thus the cause of the leak is eliminated. [1] https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html Closes #11289 Fixes #11705 #11773 #11933	2022-03-23 11:35:27 -05:00
Seth Hoenig	2631659551	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
Seth Hoenig	8f493cfa89	client/fingerprint/java: improve java version string regex matching This PR improves the regular expression used for matching the java version string, which varies a lot depending on the java vendor and version. These are the example strings we now test for: java version "1.7.0_80" openjdk version "11.0.1" 2018-10-16 openjdk version "11.0.1" 2018-10-16 java version "1.6.0_36" openjdk version "1.8.0_192" openjdk 11.0.11 2021-04-20 LTS The last one is a new test added on behalf of #6081, which is still broken on today's CentOS 7 default JDK package. openjdk 11.0.11 2021-04-20 LTS OpenJDK Runtime Environment 18.9 (build 11.0.11+9-LTS) OpenJDK 64-Bit Server VM 18.9 (build 11.0.11+9-LTS, mixed mode, sharing) ==> Evaluation "21c6caf7" finished with status "complete" but failed to place all allocations: Task Group "example" (failed to place 1 allocation): * Constraint "${driver.java.version} >= 11.0.0": 1 nodes excluded by filter Evaluation "2b737d48" waiting for additional capacity to place remainder Fixes #6081	2021-06-15 14:15:01 -05:00
Seth Hoenig	fe9258b754	drivers/exec: pass capabilities through executor RPC Add capabilities to the LaunchRequest proto so that the capabilities set actually gets plumbed all the way through to task launch.	2021-05-17 12:37:40 -06:00
Seth Hoenig	87c96eed11	drivers/docker: reuse capabilities plumbing in docker driver This changeset does not introduce any functional change for the docker driver, but rather cleans up the implementation around computing configured capabilities by re-using code written for the exec/java task drivers.	2021-05-17 12:37:40 -06:00
Seth Hoenig	2361a91938	drivers/java: enable setting allow_caps on java driver Enable setting allow_caps on the java task driver plugin, along with the associated cap_add and cap_drop options in java task configuration.	2021-05-17 12:37:40 -06:00
Nick Ethier	b6b74a98a9	client/fingerprint: move existing cgroup concerns to cgutil	2021-04-13 13:28:36 -04:00
Yoan Blanc	ac0d5d8bd3	chore: bump golangci-lint from v1.24 to v1.39 Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2021-04-03 09:50:23 +02:00
Seth Hoenig	8ee9835923	drivers/exec+java: Add task configuration to restore previous PID/IPC isolation behavior This PR adds pid_mode and ipc_mode options to the exec and java task driver config options. By default these will defer to the default_pid_mode and default_ipc_mode agent plugin options created in #9969. Setting these values to "host" mode disables isolation for the task. Doing so is not recommended, but may be necessary to support legacy job configurations. Closes #9970	2021-02-08 14:26:35 -06:00
Seth Hoenig	152534fe21	docs: fixup comments, var names	2021-02-08 10:58:44 -06:00
Seth Hoenig	4bc6e5a215	drivers/exec+java: Add configuration to restore previous PID/IPC namespace behavior. This PR adds default_pid_mode and default_ipc_mode options to the exec and java task drivers. By default these will default to "private" mode, enabling PID and IPC isolation for tasks. Setting them to "host" mode disables isolation. Doing so is not recommended, but may be necessary to support legacy job configurations. Closes #9969	2021-02-05 15:52:11 -06:00
Charlie Voiselle	16b6098df8	Fix for Java fingerprinter on macOS (#9225) Use alternative test for macOS JVM with /usr/libexec/java_home	2020-11-01 13:20:31 -05:00
Mahmood Ali	91376cccf2	tests: fall back to copying when symlinking fails Symlinking busybox may fail when the test code and the test temporary directory live on different volumes/partitions, so we should copy instead. This situation arises in the Vagrant setup, where the code repository lives on a special file-sharing volume. Somewhat unrelated, remove the `f.Sync()` invocation from a test copyFile helper function. Sync is useful only for crash recovery, and isn't necessary in our test setup. The sync invocation adds significant overhead, as it requires the OS to flush any cached writes to disk.	2020-09-30 09:58:22 -04:00
Nick Ethier	1e4ea699ad	fix test failures from rebase	2020-06-18 11:05:32 -07:00
Nick Ethier	0bc0403cc3	Task DNS Options (#7661) Co-Authored-By: Tim Gross <tgross@hashicorp.com> Co-Authored-By: Seth Hoenig <shoenig@hashicorp.com>	2020-06-18 11:01:31 -07:00
Mahmood Ali	2588b3bc98	cleanup driver eventor goroutines This fixes a few cases where driver eventor goroutines are leaked during normal operations, but especially so in tests. This change makes a few modifications: First, it switches drivers to use `Context`s to manage shutdown events. Previously, it relied on callers invoking a `.Shutdown()` function that is specific to internal drivers only and requires casting. Using `Context`s provides a consistent, idiomatic way to manage lifecycle for both internal and external drivers. Also, I discovered a few places where we don't clean up a temporary driver instance in the plugin catalog code, where we dispense a driver to inspect and validate the schema config without properly cleaning it up.	2020-05-26 11:04:04 -04:00
Tim Gross	aa8927abb4	volumes: return better error messages for unsupported task drivers (#8030) When an allocation runs for a task driver that can't support volume mounts, the mounting will fail in a way that can be hard to understand. With host volumes this usually means failing silently, whereas with CSI the operator gets inscrutable internals exposed in the `nomad alloc status`. This changeset adds a MountConfig field to the task driver Capabilities response. We validate this when the `csi_hook` or `volume_hook` fires and return a user-friendly error. Note that we don't currently have a way to get driver capabilities up to the server, except through attributes. Validating this when the user initially submits the jobspec would be even better than what we're doing here (and could be useful for all our other capabilities), but that's out of scope for this changeset. Also note that the MountConfig enum starts with "supports all" in order to support community plugins in a backwards compatible way, rather than cutting them off from volume mounting unexpectedly.	2020-05-21 09:18:02 -04:00
Mahmood Ali	aa1c83871b	drivers: always initialize taskHandle.logger It looks like RecoverTask doesn't set the taskHandle.logger field, causing a panic when the handle attempts to log (e.g. when Shutdown or Signaling fails).	2019-11-22 10:44:59 -05:00
Mahmood Ali	ea221cfe87	always destroy	2019-11-18 21:31:29 -05:00
Tim Gross	9efca131be	driver/java: pass task network isolation to executor Without passing the network isolation configuration to the executor, java tasks are not placed in the same network namespace as the other processes in their task group, which breaks Consul Connect.	2019-09-27 08:26:54 -04:00
Nick Ethier	53d3ea8ebd	driver: set correct network isolation caps for exec and java dr… (#6368)	2019-09-25 11:48:14 -04:00
Mahmood Ali	dd8762e348	typo: "atleast" -> "at least"	2019-05-13 10:01:19 -04:00
Mahmood Ali	a4640db7a6	drivers: implement streaming exec for executor based drivers These simply delegate the call to the backend executor.	2019-05-10 19:17:14 -04:00
Mahmood Ali	b2b7618a1c	clarify unknown signal log line (#5466)	2019-03-25 17:19:43 -04:00
Mahmood Ali	8c82c19831	tests: IsTravis() -> IsCI() Replace IsTravis() references that are intended for CI environments in general rather than for the Travis environment specifically.	2019-02-20 08:21:03 -05:00
Mahmood Ali	f7102cd01d	tests: add hcl task driver config parsing tests (#5314) * drivers: add config parsing tests Add basic tests for parsing and encoding task config. * drivers/docker: fix some config declarations * refactor and document config parse helpers	2019-02-12 14:46:37 -05:00
Mahmood Ali	aec9120994	drivers/java: restore 0.8.7 java version detection (#5317) Restore 0.8.x behavior where the java driver is marked as detected when `java -version` exits with 0 but returns unexpected output. Furthermore, we restore the behavior where we parse the first three lines of `java -version` output but ignore the rest. If `java -version` returns fewer than 3 lines, Nomad 0.8.7 would panic. In this implementation, we still mark java as detected but return an empty version. The 0.8.7 logic for detecting java version is found in https://github.com/hashicorp/nomad/blob/v0.8.7/client/driver/java.go#L132-L172 . I punt on revamping how we can be more resilient to java -version syntax, and aimed for preserving existing behavior instead.	2019-02-12 13:41:26 -05:00
Nick Ethier	e7ea26449e	client: fix bug during 0.8 state upgrade that causes external drivers to fail	2019-01-30 14:22:29 -05:00
Nick Ethier	be976d9c9a	Merge branch 'master' into f-driver-upgradepath-test * master: (23 commits) tests: avoid assertion in goroutine spell check ci: run checkscripts tests: deflake TestRktDriver_StartWaitRecoverWaitStop drivers/rkt: Remove unused github.com/rkt/rkt drivers/rkt: allow development on non-linux cli: Hide `nomad docker_logger` from help output api: test api and structs are in sync goimports until make check is happy nil check node resources to prevent panic tr: use context in as select statement move pluginutils -> helper/pluginutils vet goimports gofmt Split hclspec move hclutils Driver tests do not use hcl2/hcl, hclspec, or hclutils move reattach config loader and singleton ...	2019-01-23 21:01:24 -05:00
Nick Ethier	5b9013528e	drivers: add docker upgrade path and e2e test	2019-01-23 14:44:42 -05:00
Michael Schurter	32daa7b47b	goimports until make check is happy	2019-01-23 06:27:14 -08:00
Michael Schurter	be0bab7c3f	move pluginutils -> helper/pluginutils I wanted a different color bikeshed, so I get to paint it	2019-01-22 15:50:08 -08:00
Alex Dadgar	4bdccab550	goimports	2019-01-22 15:44:31 -08:00
Alex Dadgar	72a5691897	Driver tests do not use hcl2/hcl, hclspec, or hclutils	2019-01-22 15:43:34 -08:00
Alex Dadgar	b2c7268843	move reattach config	2019-01-22 15:11:58 -08:00
Alex Dadgar	cdcd3c929c	loader and singleton	2019-01-22 15:11:57 -08:00
Nick Ethier	b840a2eb7b	drivers: fix func naming	2019-01-18 18:31:02 -05:00
Nick Ethier	e3c6f89b9a	drivers: use consts for task handle version	2019-01-18 18:31:01 -05:00
Nick Ethier	9dd4eb3581	drivers: add upgrade path for rawexec, java, rkt and qemu	2019-01-18 18:31:01 -05:00
Nick Ethier	05bd369d1f	driver: add pre09 migration logic	2019-01-18 18:31:01 -05:00
Nick Ethier	9eabf0f2c7	drivers: fix func naming	2019-01-16 22:01:18 -05:00
Nick Ethier	80c90ef745	drivers: use consts for task handle version	2019-01-16 21:52:31 -05:00
Nick Ethier	ae77fbbe28	drivers: add upgrade path for rawexec, java, rkt and qemu	2019-01-16 11:19:25 -05:00
Nick Ethier	f626b6bf3d	driver: add pre09 migration logic	2019-01-15 16:57:09 -05:00
Alex Dadgar	471fdb3ccf	Merge pull request #5173 from hashicorp/b-log-levels Plugins use parent loggers	2019-01-14 16:14:30 -08:00
Nick Ethier	c619e70d39	Merge pull request #5018 from hashicorp/f-executor-stats executor: streaming stats api	2019-01-14 15:02:35 -05:00
Nick Ethier	9fea54e0dc	executor: implement streaming stats API plugins/driver: update driver interface to support streaming stats client/tr: use streaming stats api TODO: * how to handle errors and closed channel during stats streaming * prevent tight loop if Stats(ctx) returns an error drivers: update drivers TaskStats RPC to handle streaming results executor: better error handling in stats rpc docker: better control and error handling of stats rpc driver: allow stats to return a recoverable error	2019-01-12 12:18:22 -05:00
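
The Java version matching described in commit 8f493cfa89 above can be illustrated with a small Go sketch. This is not the regular expression Nomad uses; the pattern, the `javaVersionRe` variable, and the `parseJavaVersion` helper are hypothetical names written for this example. It only assumes the sample strings listed in that commit message, and it returns an empty version for unexpected output, in the spirit of commit aec9120994.

```go
package main

import (
	"fmt"
	"regexp"
)

// javaVersionRe is a hypothetical pattern (not Nomad's actual regex).
// It accepts both `java version "1.7.0_80"` and the unquoted
// `openjdk 11.0.11 2021-04-20 LTS` form, capturing only the version.
var javaVersionRe = regexp.MustCompile(
	`(?m)^(?:java|openjdk)\s+(?:version\s+)?"?([0-9][0-9._]*)"?`)

// parseJavaVersion returns the first version string found in the
// output of `java -version`, or "" when no line matches.
func parseJavaVersion(output string) string {
	m := javaVersionRe.FindStringSubmatch(output)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	// Sample strings taken from the commit message.
	samples := []string{
		`java version "1.7.0_80"`,
		`openjdk version "11.0.1" 2018-10-16`,
		`java version "1.6.0_36"`,
		`openjdk version "1.8.0_192"`,
		`openjdk 11.0.11 2021-04-20 LTS`,
	}
	for _, s := range samples {
		fmt.Printf("%q -> %q\n", s, parseJavaVersion(s))
	}
}
```

Making both the `version` keyword and the surrounding quotes optional is what lets a single pattern cover the vendor variations the commit lists.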


85 Commits