open-nomad

Author	SHA1	Message	Date
Seth Hoenig	d1bda4a954	ci: fixup task runner chroot test This PR is 2 fixes for the flaky TestTaskRunner_TaskEnv_Chroot test. And also the TestTaskRunner_Download_ChrootExec test. - Use TinyChroot to stop copying gigabytes of junk, which causes GHA to fail to create the environment in time. - Pre-create cgroups on V2 systems. Normally the cgroup directory is managed by the cpuset manager, but that is not active in taskrunner tests, so create it by hand in the test framework.	2022-04-19 10:37:46 -05:00
fyn	1174bc2052	fix(plugins): should return when ctx.Done	2022-04-09 01:04:29 +08:00
Seth Hoenig	52aaf86f52	raw_exec: make raw exec driver work with cgroups v2 This PR adds support for the raw_exec driver on systems with only cgroups v2. The raw exec driver is able to use cgroups to manage processes. This happens only on Linux, when exec_driver is enabled, and the no_cgroups option is not set. The driver uses the freezer controller to freeze processes of a task, issue a sigkill, then unfreeze. Previously the implementation assumed cgroups v1, and now it also supports cgroups v2. There is a bit of refactoring in this PR, but the fundamental design remains the same. Closes #12351 #12348	2022-04-04 16:11:38 -05:00
James Rasell	19281bb2fe	Merge pull request #12304 from th0m/tlefebvre/fix-wrong-drivernetworkmanager-interface fix: update incorrect DriverNetworkManager interface implementation	2022-04-04 11:29:22 +02:00
Seth Hoenig	2e5c6de820	client: enable support for cgroups v2 This PR introduces support for using Nomad on systems with cgroups v2 [1] enabled as the cgroups controller mounted on /sys/fs/cgroups. Newer Linux distros like Ubuntu 21.10 are shipping with cgroups v2 only, causing problems for Nomad users. Nomad mostly "just works" with cgroups v2 due to the indirection via libcontainer, but not so for managing cpuset cgroups. Before, Nomad has been making use of a feature in v1 where a PID could be a member of more than one cgroup. In v2 this is no longer possible, and so the logic around computing cpuset values must be modified. When Nomad detects v2, it manages cpuset values in-process, rather than making use of cgroup heirarchy inheritence via shared/reserved parents. Nomad will only activate the v2 logic when it detects cgroups2 is mounted at /sys/fs/cgroups. This means on systems running in hybrid mode with cgroups2 mounted at /sys/fs/cgroups/unified (as is typical) Nomad will continue to use the v1 logic, and should operate as before. Systems that do not support cgroups v2 are also not affected. When v2 is activated, Nomad will create a parent called nomad.slice (unless otherwise configured in Client conifg), and create cgroups for tasks using naming convention <allocID>-<task>.scope. These follow the naming convention set by systemd and also used by Docker when cgroups v2 is detected. Client nodes now export a new fingerprint attribute, unique.cgroups.version which will be set to 'v1' or 'v2' to indicate the cgroups regime in use by Nomad. The new cpuset management strategy fixes #11705, where docker tasks that spawned processes on startup would "leak". In cgroups v2, the PIDs are started in the cgroup they will always live in, and thus the cause of the leak is eliminated. [1] https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html Closes #11289 Fixes #11705 #11773 #11933	2022-03-23 11:35:27 -05:00
James Rasell	1a4db3523d	Merge branch 'main' into tlefebvre/fix-wrong-drivernetworkmanager-interface	2022-03-17 09:38:13 +01:00
Thomas Lefebvre	c7fbf1089c	fix: update incorrect DriverNetworkManager interface implementation in plugins/drivers/client.go and drivers/mock/driver.go And add assertions to catch drifts at compilation time.	2022-03-15 11:51:01 -07:00
Seth Hoenig	2631659551	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
Seth Hoenig	6f0998fcad	testing: use a smaller chroot when running exec driver tests The default chroot copies all of /bin, /usr, etc. which can ammount to gigabytes of stuff not actually needed for running our tests. Use a smaller chroot in test cases so that CI infra with poor disk IO has a chance.	2022-03-09 16:24:07 -06:00
Tim Gross	1dad0e597e	fix integer bounds checks (#11815 ) * driver: fix integer conversion error The shared executor incorrectly parsed the user's group into int32 and then cast to uint32 without bounds checking. This is harmless because an out-of-bounds gid will throw an error later, but it triggers security and code quality scans. Parse directly to uint32 so that we get correct error handling. * helper: fix integer conversion error The autopilot flags helper incorrectly parses a uint64 to a uint which is machine specific size. Although we don't have 32-bit builds, this sets off security and code quality scaans. Parse to the machine sized uint. * driver: restrict bounds of port map The plugin server doesn't constrain the maximum integer for port maps. This could result in a user-visible misconfiguration, but it also triggers security and code quality scans. Restrict the bounds before casting to int32 and return an error. * cpuset: restrict upper bounds of cpuset values Our cpuset configuration expects values in the range of uint16 to match the expectations set by the kernel, but we don't constrain the values before downcasting. An underflow could lead to allocations failing on the client rather than being caught earlier. This also make security and code quality scanners happy. * http: fix integer downcast for per_page parameter The parser for the `per_page` query parameter downcasts to int32 without bounds checking. This could result in underflow and nonsensical paging, but there's no server-side consequences for this. Fixing this will silence some security and code quality scanners though.	2022-01-25 11:16:48 -05:00
James Rasell	45f4689f9c	chore: fixup inconsistent method receiver names. (#11704 )	2021-12-20 11:44:21 +01:00
Michael Schurter	fd68bbc342	test: update tests to properly use AllocDir Also use t.TempDir when possible.	2021-10-19 10:49:07 -07:00
Michael Schurter	10c3bad652	client: never embed alloc_dir in chroot Fixes #2522 Skip embedding client.alloc_dir when building chroot. If a user configures a Nomad client agent so that the chroot_env will embed the client.alloc_dir, Nomad will happily infinitely recurse while building the chroot until something horrible happens. The best case scenario is the filesystem's path length limit is hit. The worst case scenario is disk space is exhausted. A bad agent configuration will look something like this: ```hcl data_dir = "/tmp/nomad-badagent" client { enabled = true chroot_env { # Note that the source matches the data_dir "/tmp/nomad-badagent" = "/ohno" # ... } } ``` Note that `/ohno/client` (the state_dir) will still be created but not `/ohno/alloc` (the alloc_dir). While I cannot think of a good reason why someone would want to embed Nomad's client (and possibly server) directories in chroots, there should be no cause for harm. chroots are only built when Nomad runs as root, and Nomad disables running exec jobs as root by default. Therefore even if client state is copied into chroots, it will be inaccessible to tasks. Skipping the `data_dir` and `{client,server}.state_dir` is possible, but this PR attempts to implement the minimum viable solution to reduce risk of unintended side effects or bugs. When running tests as root in a vm without the fix, the following error occurs: ``` === RUN TestAllocDir_SkipAllocDir alloc_dir_test.go:520: Error Trace: alloc_dir_test.go:520 Error: Received unexpected error: Couldn't create destination file /tmp/TestAllocDir_SkipAllocDir1457747331/001/nomad/test/testtask/nomad/test/testtask/.../nomad/test/testtask/secrets/.nomad-mount: open /tmp/TestAllocDir_SkipAllocDir1457747331/001/nomad/test/.../testtask/secrets/.nomad-mount: file name too long Test: TestAllocDir_SkipAllocDir --- FAIL: TestAllocDir_SkipAllocDir (22.76s) ``` Also removed unused Copy methods on AllocDir and TaskDir structs. Thanks to @eveld for not letting me forget about this!	2021-10-18 09:22:01 -07:00
Mahmood Ali	4d90afb425	gofmt all the files mostly to handle build directives in 1.17.	2021-10-01 10:14:28 -04:00
James Rasell	0e926ef3fd	allow configuration of Docker hostnames in bridge mode (#11173 ) Add a new hostname string parameter to the network block which allows operators to specify the hostname of the network namespace. Changing this causes a destructive update to the allocation and it is omitted if empty from API responses. This parameter also supports interpolation. In order to have a hostname passed as a configuration param when creating an allocation network, the CreateNetwork func of the DriverNetworkManager interface needs to be updated. In order to minimize the disruption of future changes, rather than add another string func arg, the function now accepts a request struct along with the allocID param. The struct has the hostname as a field. The in-tree implementations of DriverNetworkManager.CreateNetwork have been modified to account for the function signature change. In updating for the change, the enhancement of adding hostnames to network namespaces has also been added to the Docker driver, whilst the default Linux manager does not current implement it.	2021-09-16 08:13:09 +02:00
Tim Gross	7bd61bbf43	docker: generate /etc/hosts file for bridge network mode (#10766 ) When `network.mode = "bridge"`, we create a pause container in Docker with no networking so that we have a process to hold the network namespace we create in Nomad. The default `/etc/hosts` file of that pause container is then used for all the Docker tasks that share that network namespace. Some applications rely on this file being populated. This changeset generates a `/etc/hosts` file and bind-mounts it to the container when Nomad owns the network, so that the container's hostname has an IP in the file as expected. The hosts file will include the entries added by the Docker driver's `extra_hosts` field. In this changeset, only the Docker task driver will take advantage of this option, as the `exec`/`java` drivers currently copy the host's `/etc/hosts` file and this can't be changed without breaking backwards compatibility. But the fields are available in the task driver protobuf for community task drivers to use if they'd like.	2021-06-16 14:55:22 -04:00
James Rasell	87413ff0cd	plugins: fix test data race.	2021-06-15 09:31:08 +02:00
Michael Schurter	e62795798d	core: propagate remote task handles Add a new driver capability: RemoteTasks. When a task is run by a driver with RemoteTasks set, its TaskHandle will be propagated to the server in its allocation's TaskState. If the task is replaced due to a down node or draining, its TaskHandle will be propagated to its replacement allocation. This allows tasks to be scheduled in remote systems whose lifecycles are disconnected from the Nomad node's lifecycle. See https://github.com/hashicorp/nomad-driver-ecs for an example ECS remote task driver.	2021-04-27 15:07:03 -07:00
Nick Ethier	110f982eb3	plugins/drivers: fix deprecated fields	2021-04-16 14:13:29 -04:00
Nick Ethier	db1f697fc0	plugins/driver: add cpuset_cpus back and mark cpuset_mems as reserved	2021-04-15 13:31:18 -04:00
Nick Ethier	fe283c5a8f	executor: add support for cpuset cgroup	2021-04-15 10:24:31 -04:00
Nick Ethier	155a2ca5fb	client/ar: thread through cpuset manager	2021-04-13 13:28:36 -04:00
Drew Bailey	82dbf08e9f	remove flakey exec test that tests non-deterministic docker behavior (#10291 )	2021-04-02 11:15:22 -04:00
Mahmood Ali	f44a04454d	oversubscription: driver/exec to honor MemoryMaxMB	2021-03-30 16:55:58 -04:00
Mahmood Ali	275feb5bec	oversubscription: docker to honor MemoryMaxMB values	2021-03-30 16:55:58 -04:00
Adrian Todorov	47e1cb11df	driver/docker: add extra labels ( job name, task and task group name)	2021-03-08 08:59:52 -05:00
Chris Baker	9b125b8837	update template and artifact interpolation to use client-relative paths resolves #9839 resolves #6929 resolves #6910 e2e: template env interpolation path testing	2021-01-04 22:25:34 +00:00
Kris Hicks	0cf9cae656	Apply some suggested fixes from staticcheck (#9598 )	2020-12-10 07:29:18 -08:00
Kris Hicks	0a3a748053	Add gosimple linter (#9590 )	2020-12-09 11:05:18 -08:00
Mahmood Ali	98c02851c8	use comment ignores (#9448 ) Use targetted ignore comments for the cases where we are bound by backward compatibility. I've left some file based linters, especially when the file is riddled with linter voilations (e.g. enum names), or if it's a property of the file (e.g. package and file names). I encountered an odd behavior related to RPC_REQUEST_RESPONSE_UNIQUE and RPC_REQUEST_STANDARD_NAME. Apparently, if they target a `stream` type, we must separate them into separate lines so that the ignore comment targets the type specifically.	2020-11-25 16:03:01 -05:00
Mahmood Ali	b2a8752c5f	honor task user when execing into raw_exec task (#9439 ) Fix #9210 . This update the executor so it honors the User when using nomad alloc exec. The bug was that the exec task didn't honor the init command when execing.	2020-11-25 09:34:10 -05:00
Nick Ethier	c9bd7e89ca	command: use correct port mapping syntax in examples	2020-11-23 10:25:30 -06:00
Kris Hicks	9d03cf4c5f	protos: Update .proto files not to use Go package name (#9301 ) Previously, it was required that you `go get github.com/hashicorp/nomad` to be able to build protos, as the protoc invocation added an include directive that pointed to `$GOPATH/src`, which is how dependent protos were discovered. As Nomad now uses Go modules, it won't necessarily be cloned to `$GOPATH`. (Additionally, if you _had_ go-gotten Nomad at some point, protoc compilation would have possibly used the _wrong_ protos, as those wouldn't necessarily be the most up-to-date ones.) This change modifies the proto files and the `protoc` invocation to handle discovering dependent protos via protoc plugin modifier statements that are specific to the protoc plugin being used. In this change, `make proto` was run to recompile the protos, which results in changes only to the gzipped `FileDescriptorProto`.	2020-11-10 08:42:35 -08:00
Michael Schurter	9c3972937b	s/0.13/1.0/g 1.0 here we come!	2020-10-14 15:17:47 -07:00
Yoan Blanc	891accb89a	use allow/deny instead of the colored alternatives (#9019 ) Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-10-12 08:47:05 -04:00
Mahmood Ali	567597e108	Compare to the correct host network setting In systemd-resolved hosts with no DNS customizations, the docker driver DNS setting should be compared to /run/systemd/resolve/resolv.conf while exec/java drivers should be compared to /etc/resolv.conf. When system-resolved is enabled, /etc/resolv.conf is a stub that points to 127.0.0.53. Docker avoids this stub because this address isn't accessible from the container. The exec/java drivers that don't create network isolations use the stub though in the default configuration.	2020-10-01 10:23:14 -04:00
Nick Ethier	e39574be59	docker: support group allocated ports and host_networks (#8623 ) * docker: support group allocated ports * docker: add new ports driver config to specify which group ports are mapped * docker: update port mapping docs	2020-08-11 18:30:22 -04:00
Nick Ethier	0bc0403cc3	Task DNS Options (#7661 ) Co-Authored-By: Tim Gross <tgross@hashicorp.com> Co-Authored-By: Seth Hoenig <shoenig@hashicorp.com>	2020-06-18 11:01:31 -07:00
Mahmood Ali	2588b3bc98	cleanup driver eventor goroutines This fixes few cases where driver eventor goroutines are leaked during normal operations, but especially so in tests. This change makes few modifications: First, it switches drivers to use `Context`s to manage shutdown events. Previously, it relied on callers invoking `.Shutdown()` function that is specific to internal drivers only and require casting. Using `Contexts` provide a consistent idiomatic way to manage lifecycle for both internal and external drivers. Also, I discovered few places where we don't clean up a temporary driver instance in the plugin catalog code, where we dispense a driver to inspect and validate the schema config without properly cleaning it up.	2020-05-26 11:04:04 -04:00
Tim Gross	aa8927abb4	volumes: return better error messages for unsupported task drivers (#8030 ) When an allocation runs for a task driver that can't support volume mounts, the mounting will fail in a way that can be hard to understand. With host volumes this usually means failing silently, whereas with CSI the operator gets inscrutable internals exposed in the `nomad alloc status`. This changeset adds a MountConfig field to the task driver Capabilities response. We validate this when the `csi_hook` or `volume_hook` fires and return a user-friendly error. Note that we don't currently have a way to get driver capabilities up to the server, except through attributes. Validating this when the user initially submits the jobspec would be even better than what we're doing here (and could be useful for all our other capabilities), but that's out of scope for this changeset. Also note that the MountConfig enum starts with "supports all" in order to support community plugins in a backwards compatible way, rather than cutting them off from volume mounting unexpectedly.	2020-05-21 09:18:02 -04:00
Anthony Scalisi	9664c6b270	fix spelling errors (#6985 )	2020-04-20 09:28:19 -04:00
Yoan Blanc	225c9c1215	fixup! vendor: explicit use of hashicorp/go-msgpack Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-03-31 09:48:07 -04:00
Yoan Blanc	761d014071	vendor: explicit use of hashicorp/go-msgpack Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-03-31 09:45:21 -04:00
Chris Baker	d6364e13bc	fix typo in comment	2020-03-13 09:09:46 -05:00
Mahmood Ali	88cfe504a0	update grpc Upgrade grpc to v1.27.1 and protobuf plugins to v1.3.4.	2020-03-03 08:39:54 -05:00
Mahmood Ali	0b7085ba3a	driver: allow disabling log collection Operators commonly have docker logs aggregated using various tools and don't need nomad to manage their docker logs. Worse, Nomad uses a somewhat heavy docker api call to collect them and it seems to cause problems when a client runs hundreds of log collections. Here we add a knob to disable log aggregation completely for nomad. When log collection is disabled, we avoid running logmon and docker_logger for the docker tasks in this implementation. The downside here is once disabled, `nomad logs ...` commands and API no longer return logs and operators must corrolate alloc-ids with their aggregated log info. This is meant as a stop gap measure. Ideally, we'd follow up with at least two changes: First, we should optimize behavior when we can such that operators don't need to disable docker log collection. Potentially by reverting to using pre-0.9 syslog aggregation in linux environments, though with different trade-offs. Second, when/if logs are disabled, nomad logs endpoints should lookup docker logs api on demand. This ensures that the cost of log collection is paid sparingly.	2019-12-08 14:15:03 -05:00
Danielle Lancashire	4fbcc668d0	volumes: Add support for mount propagation This commit introduces support for configuring mount propagation when mounting volumes with the `volume_mount` stanza on Linux targets. Similar to Kubernetes, we expose 3 options for configuring mount propagation: - private, which is equivalent to `rprivate` on Linux, which does not allow the container to see any new nested mounts after the chroot was created. - host-to-task, which is equivalent to `rslave` on Linux, which allows new mounts that have been created _outside of the container_ to be visible inside the container after the chroot is created. - bidirectional, which is equivalent to `rshared` on Linux, which allows both the container to see new mounts created on the host, but importantly _allows the container to create mounts that are visible in other containers an don the host_ private and host-to-task are safe, but bidirectional mounts can be dangerous, as if the code inside a container creates a mount, and does not clean it up before tearing down the container, it can cause bad things to happen inside the kernel. To add a layer of safety here, we require that the user has ReadWrite permissions on the volume before allowing bidirectional mounts, as a defense in depth / validation case, although creating mounts should also require a priviliged execution environment inside the container.	2019-10-14 14:09:58 +02:00
Tim Gross	d965a15490	driver/networking: don't recreate existing network namespaces	2019-09-25 14:58:17 -04:00
Grégoire Delattre	c6ac788258	Fix the ExecTask function in DriverExecTaskNotSupported (#6145 ) This fixes the ExecTask definition to match with the DriverPlugin interface.	2019-08-29 11:36:29 -04:00
Lucas BEE	35a1b72962	Add NetworkIsolation in TaskConfig (#6135 ) NetworkIsolation was left out of the task config when using an external task driver plugin	2019-08-15 13:05:55 -04:00

1 2 3 4

190 commits