Commit graph

198 commits

Author SHA1 Message Date
Mahmood Ali a89da9982d
raw_exec: don't use cgroups when no_cgroup is set (#9328)
When raw_exec is configured with [`no_cgroups`](https://www.nomadproject.io/docs/drivers/raw_exec#no_cgroups), raw_exec shouldn't attempt to create a cgroup.

Prior to this change, we accidentally always required freezer cgroup to do stats PID tracking. We already have the proper fallback in place for metrics, so only need to ensure that we don't create a cgroup for the task.

Fixes https://github.com/hashicorp/nomad/issues/8565
2020-11-11 16:20:34 -05:00
Mahmood Ali 2d4634bcc3
Merge pull request #9304 from hashicorp/b-legacy-executors-are-executors
Legacy executors are executors after all
2020-11-10 12:54:03 -05:00
Kris Hicks 9d03cf4c5f
protos: Update .proto files not to use Go package name (#9301)
Previously, it was required that you `go get github.com/hashicorp/nomad` to be
able to build protos, as the protoc invocation added an include directive that
pointed to `$GOPATH/src`, which is how dependent protos were discovered. As
Nomad now uses Go modules, it won't necessarily be cloned to `$GOPATH`.
(Additionally, if you _had_ go-gotten Nomad at some point, protoc compilation
would have possibly used the _wrong_ protos, as those wouldn't necessarily be
the most up-to-date ones.)

This change modifies the proto files and the `protoc` invocation to handle
discovering dependent protos via protoc plugin modifier statements that are
specific to the protoc plugin being used.

In this change, `make proto` was run to recompile the protos, which results in
changes only to the gzipped `FileDescriptorProto`.
2020-11-10 08:42:35 -08:00
Mahmood Ali ac185b41e2 Legacy executors are executors after all
This fixes a bug where pre-0.9 executors fail to recover after an
upgrade.

The bug is that legacyExecutorWrappers didn't get updated with
ExecStreaming function, and thus failed to implement the Executor
function. Sadly, this meant that all recovery attempts fail, as the
runtime check in
b312aacbc9/drivers/shared/executor/utils.go (L103-L110)
.
2020-11-10 10:20:07 -05:00
Mahmood Ali f4b0aa0c1c tests: copy permissions when copying files
On the failover path, copy the permission bits (a.k.a. file mode),
specially the execution bit.
2020-10-01 10:23:14 -04:00
Mahmood Ali cd060db42a tests: ignore empty cgroup
My latest Vagrant box contains an empty cgroup name that isn't used for
isolation:

```
$ cat /proc/self/cgroup  | grep ::
0::/user.slice/user-1000.slice/session-17.scope
```
2020-10-01 10:23:13 -04:00
Mahmood Ali 91376cccf2 tests: failover to copying when symlinking fails
Symlinking busybox may fail when the test code and the test temporary
directory live on different volumes/partitions; so we should copy
instead.  This situation arises in the Vagrant setup, where the code
repository live on special file sharing volume.

Somewhat unrelated, remove `f.Sync()` invocation from a test copyFile
helper function.  Sync is useful only for crash recovery, and isn't
necessary in our test setup.  The sync invocation is a significant
overhead as it requires the OS to flush any cached writes to disk.
2020-09-30 09:58:22 -04:00
Mahmood Ali d4f385d6e1
Upgrade to golang 1.15 (#8858)
Upgrade to golang 1.15

Starting with golang 1.5, setting Ctty value result in `Setctty set but Ctty not valid in child` error, as part of https://github.com/golang/go/issues/29458 .
This commit lifts the fix in https://github.com/creack/pty/pull/97 .
2020-09-09 15:59:29 -04:00
Shengjing Zhu 7a4f48795d Adjust cgroup change in libcontainer 2020-08-20 00:31:07 +08:00
Mahmood Ali 1fcc7970e4 tests: ensure that test is long enough to configure cgroups 2020-05-31 10:42:06 -04:00
Mahmood Ali 88cfe504a0 update grpc
Upgrade grpc to v1.27.1 and protobuf plugins to v1.3.4.
2020-03-03 08:39:54 -05:00
Thomas Lefebvre 84baa950ce client: support no_pivot_root in exec driver configuration 2020-02-18 09:27:16 -08:00
Mahmood Ali ac80d62c84 Pass stats interval colleciton to executor
This fixes a bug where executor based drivers emit stats every second,
regardless of user configuration.

When serializing the Stats request across grpc, the nomad agent dropped
the Interval value, and then executor uses 1s as a default value.
2020-01-31 14:17:15 -05:00
Mahmood Ali d80ae6765b simplify cgroup path lookup 2019-12-11 12:43:25 -05:00
Mahmood Ali 94ab62dfb4 executor: stop joining executor to container cgroup
Stop joining libcontainer executor process into the newly created task
container cgroup, to ensure that the cgroups are fully destroyed on
shutdown, and to make it consistent with other plugin processes.

Previously, executor process is added to the container cgroup so the
executor process resources get aggregated along with user processes in
our metric aggregation.

However, adding executor process to container cgroup adds some
complications with much benefits:

First, it complicates cleanup.  We must ensure that the executor is
removed from container cgroup on shutdown.  Though, we had a bug where
we missed removing it from the systemd cgroup.  Because executor uses
`containerState.CgroupPaths` on launch, which includes systemd, but
`cgroups.GetAllSubsystems` which doesn't.

Second, it may have advese side-effects.  When a user process is cpu
bound or uses too much memory, executor should remain functioning
without risk of being killed (by OOM killer) or throttled.

Third, it is inconsistent with other drivers and plugins.  Logmon and
DockerLogger processes aren't in the task cgroups.  Neither are
containerd processes, though it is equivalent to executor in
responsibility.

Fourth, in my experience when executor process moves cgroup while it's
running, the cgroup aggregation is odd.  The cgroup
`memory.usage_in_bytes` doesn't seem to capture the full memory usage of
the executor process and becomes a red-harring when investigating memory
issues.

For all the reasons above, I opted to have executor remain in nomad
agent cgroup and we can revisit this when we have a better story for
plugin process cgroup management.
2019-12-11 11:28:09 -05:00
Mahmood Ali 739e5e8811 drivers/exec: test all cgroups are destroyed 2019-12-11 11:12:29 -05:00
Danielle Lancashire 4fbcc668d0
volumes: Add support for mount propagation
This commit introduces support for configuring mount propagation when
mounting volumes with the `volume_mount` stanza on Linux targets.

Similar to Kubernetes, we expose 3 options for configuring mount
propagation:

- private, which is equivalent to `rprivate` on Linux, which does not allow the
           container to see any new nested mounts after the chroot was created.

- host-to-task, which is equivalent to `rslave` on Linux, which allows new mounts
                that have been created _outside of the container_ to be visible
                inside the container after the chroot is created.

- bidirectional, which is equivalent to `rshared` on Linux, which allows both
                 the container to see new mounts created on the host, but
                 importantly _allows the container to create mounts that are
                 visible in other containers an don the host_

private and host-to-task are safe, but bidirectional mounts can be
dangerous, as if the code inside a container creates a mount, and does
not clean it up before tearing down the container, it can cause bad
things to happen inside the kernel.

To add a layer of safety here, we require that the user has ReadWrite
permissions on the volume before allowing bidirectional mounts, as a
defense in depth / validation case, although creating mounts should also require
a priviliged execution environment inside the container.
2019-10-14 14:09:58 +02:00
Nick Ethier 8b881d83d5
executor: rename wrapNetns to withNetworkIsolation 2019-09-30 21:38:31 -04:00
Nick Ethier 5127caef11
comment wrapNetns 2019-09-30 12:06:52 -04:00
Nick Ethier 67ac161565
executor: removed unused field from exec_utils.go 2019-09-30 11:57:34 -04:00
Nick Ethier 6fd773eb88
executor: run exec commands in netns if set 2019-09-30 11:50:22 -04:00
Nick Ethier 533b2850fc
executor: cleanup netns handling in executor 2019-07-31 01:04:05 -04:00
Nick Ethier b8a1ebb3b7
executor: support network namespacing on universal executor 2019-07-31 01:03:58 -04:00
Nick Ethier 971c8c9c2b
Driver networking support
Adds support for passing network isolation config into drivers and
implements support in the rawexec driver as a proof of concept
2019-07-31 01:03:20 -04:00
Lang Martin 1e33da5fd1 executor_universal_linux log a link to the docs on cgroup error 2019-07-24 12:37:33 -04:00
Lang Martin a1d496c05c executor_universal_linux raw_exec cgroup failure is not fatal 2019-07-22 15:16:36 -04:00
Lang Martin a0fe1ffdd5 default e.getAllPids in executor_basic 2019-07-18 10:57:27 -04:00
Lang Martin 9d0c0c459d executor_unix and _windows stub getAllPids ByScanning 2019-07-17 17:34:06 -04:00
Lang Martin e071f6b022 executor_universal_linux getAllPids chooses cgroup when available 2019-07-17 17:33:55 -04:00
Lang Martin e1bab541ad executor use e.getAllPids() 2019-07-17 17:33:11 -04:00
Lang Martin 18597c4917 resource_container_linux new getAllPidsByCgroup 2019-07-17 17:31:36 -04:00
Lang Martin 2e981a812e pid_collector getAllPids -> getAllPidsByScanning 2019-07-17 17:31:20 -04:00
Mahmood Ali ac64509c59 comment on use of init() for plugin handlers 2019-06-18 20:54:55 -04:00
Mahmood Ali 962921f86c Use init to handle plugin invocation
Currently, nomad "plugin" processes (e.g. executor, logmon, docker_logger) are started as CLI
commands to be handled by command CLI framework.  Plugin launchers use
`discover.NomadBinary()` to identify the binary and start it.

This has few downsides: The trivial one is that when running tests, one
must re-compile the nomad binary as the tests need to invoke the nomad
executable to start plugin.  This is frequently overlooked, resulting in
puzzlement.

The more significant issue with `executor` in particular is in relation
to external driver:

* Plugin must identify the path of invoking nomad binary, which is not
trivial; `discvoer.NomadBinary()` now returns the path to the plugin
rather than to nomad, preventing external drivers from launching
executors.

* The external driver may get a different version of executor than it
expects (specially if we make a binary incompatible change in future).

This commit addresses both downside by having the plugin invocation
handling through an `init()` call, similar to how libcontainer init
handler is done in [1] and recommened by libcontainer [2].  `init()`
will be invoked and handled properly in tests and external drivers.

For external drivers, this change will cause external drivers to launch
the executor that's compiled against.

There a are a couple of downsides to this approach:
* These specific packages (i.e executor, logmon, and dockerlog) need to
be careful in use of `init()`, package initializers.  Must avoid having
command execution rely on any other init in the package.  I prefixed
files with `z_` (golang processes files in lexical order), but ensured
we don't depend on order.
* The command handling is spread in multiple packages making it a bit
less obvious how plugin starts are handled.

[1] drivers/shared/executor/libcontainer_nsenter_linux.go
[2] eb4aeed24f/libcontainer (using-libcontainer)
2019-06-13 16:48:01 -04:00
Mahmood Ali 5734c8a648 update comment 2019-06-11 13:00:26 -04:00
Mahmood Ali f7608c4cef exec: use an independent name=systemd cgroup path
We aim for containers to be part of a new cgroups hierarchy independent
from nomad agent.  However, we've been setting a relative path as
libcontainer `cfg.Cgroups.Path`, which makes libcontainer concatinate
the executor process cgroup with passed cgroup, as set in [1].

By setting an absolute path, we ensure that all cgroups subsystem
(including `name=systemd` get a dedicated one).  This matches behavior
in Nomad 0.8, and behavior of how Docker and OCI sets CgroupsPath[2]

Fixes #5736

[1] d7edf9b2e4/vendor/github.com/opencontainers/runc/libcontainer/cgroups/fs/apply_raw.go (L326-L340)
[2] 238f8eaa31/vendor/github.com/containerd/containerd/oci/spec.go (L229)
2019-06-10 22:00:12 -04:00
Mahmood Ali cb554a015f Fix test comparisons 2019-05-24 21:38:22 -05:00
Mahmood Ali 99637c8bbc Test for expected capabilities specifically 2019-05-24 16:07:05 -05:00
Mahmood Ali 7455c746aa use /bin/bash 2019-05-24 14:50:23 -04:00
Mahmood Ali 68813def56 special case root capabilities 2019-05-24 14:10:10 -04:00
Mahmood Ali 01d5c90cbb tests: Fix binary dir permissions 2019-05-24 11:31:12 -04:00
Mahmood Ali 00081b15d6 fix 2019-05-20 15:30:07 -04:00
Mahmood Ali 807e7b90e0 drivers/exec: Restore 0.8 capabilities
Nomad 0.9 incidentally set effective capabilities that is higher than
what's expected of a `nobody` process, and what's set in 0.8.

This change restores the capabilities to ones used in Nomad 0.9.
2019-05-20 13:11:29 -04:00
Lang Martin 0256cf700d
Merge pull request #5649 from hashicorp/b-lookup-exe-chroot
lookup executables inside chroot
2019-05-17 15:07:41 -04:00
Mahmood Ali b4df061fef use pty/tty terminology similar to github.com/kr/pty 2019-05-10 19:17:14 -04:00
Mahmood Ali 3055fd53df executors: implement streaming exec
Implements streamign exec handling in both executors (i.e. universal and
libcontainer).

For creation of TTY, some incidental complexity leaked in.  The universal
executor uses github.com/kr/pty for creation of TTYs.

On the other hand, libcontainer expects a console socket and for libcontainer to
create the underlying console object on process start.  The caller can then use
`libcontainer.utils.RecvFd()` to get tty master end.

I chose github.com/kr/pty for managing TTYs here.  I tried
`github.com/containerd/console` package (which is already imported), but the
package did not work as expected on macOS.
2019-05-10 19:17:14 -04:00
Mahmood Ali 085d2ef759 executor: scaffolding for executor grpc handling
Prepare executor to handle streaming exec API calls that reuse drivers protobuf
structs.
2019-05-10 19:17:14 -04:00
Lang Martin 99359d7fbe executor_linux only do path resolution in the taskDir, not local
split out lookPathIn to show it's similarity to exec.LookPath
2019-05-10 11:33:35 -04:00
Lang Martin 3ae276cfd2 executor_linux_test call lookupTaskBin with an ExecCommand 2019-05-08 10:01:51 -04:00
Lang Martin 743a2a2875 executor_linux pass the command to lookupTaskBin to get path 2019-05-08 10:01:20 -04:00
Lang Martin 8db3fe047c executor/* Launch log at top of Launch is more explicit, trace 2019-05-07 17:01:05 -04:00
Lang Martin 87585e950d move lookupTaskBin to executor_linux, for os dependency clarity 2019-05-07 16:58:27 -04:00
Lang Martin de807a410a driver_test leave cat in the test, but add cat to the chroot 2019-05-07 16:14:01 -04:00
Lang Martin 1e5d851d23 executor_test cleanup old lookupBin tests 2019-05-04 10:21:59 -04:00
Lang Martin c0741e392d executor lookupTaskBin also does PATH expansion, anchored in taskDIR 2019-05-03 16:22:09 -04:00
Lang Martin 1619d3e3cb executor_linux_test test PATH lookup inside the container 2019-05-03 16:21:58 -04:00
Lang Martin 22e99e41c1 executor and executor_linux debug launch prep and process start 2019-05-03 14:42:57 -04:00
Lang Martin 47b9fc3d26 executor_linux_test new TestExecutor_EscapeContainer 2019-05-03 14:38:42 -04:00
Lang Martin 1cf936e90f executor_test test for more edges of lookupBin behavior 2019-05-03 11:55:19 -04:00
Lang Martin 88ce590dac executor_linux call new lookupTaskBin 2019-05-03 11:55:19 -04:00
Lang Martin ed63d6743b executor split up lookupBin 2019-05-03 11:55:19 -04:00
Mahmood Ali 6014a884be comment on using init() for libcontainer handling 2019-04-19 09:49:04 -04:00
Mahmood Ali 4322055301 comment what refer to 2019-04-19 09:49:04 -04:00
Mahmood Ali 18993421f2 Move libcontainer helper to executor package 2019-04-19 09:49:04 -04:00
Mahmood Ali 77a5edd3ae an alternative order 2019-04-02 20:00:54 -04:00
Mahmood Ali 334c6e9f5f try not without checking stat first 2019-04-02 19:55:44 -04:00
Mahmood Ali 17df86acda basic test for #4809 2019-04-02 19:50:35 -04:00
Michael Schurter 923cd91850
Merge pull request #5504 from hashicorp/b-exec-path
executor/linux: make chroot binary paths absolute
2019-04-02 14:09:50 -07:00
Michael Schurter 47bed4316f executor/linux: comment this bizarre code 2019-04-02 11:25:45 -07:00
Michael Schurter 1d569a27dc Revert "executor/linux: add defensive checks to binary path"
This reverts commit cb36f4537e63d53b198c2a87d1e03880895631bd.
2019-04-02 11:17:12 -07:00
Michael Schurter fc5487dbbc executor/linux: add defensive checks to binary path 2019-04-02 09:40:53 -07:00
Michael Schurter 7d49bc4c71 executor/linux: make chroot binary paths absolute
Avoid libcontainer.Process trying to lookup the binary via $PATH as the
executor has already found where the binary is located.
2019-04-01 15:45:31 -07:00
Mahmood Ali cb16ad7e3f comment configureTLogging 2019-04-01 16:52:58 -04:00
Mahmood Ali 81f4f07ed7 rename fifo methods for clarity 2019-04-01 16:52:58 -04:00
Mahmood Ali 88dc4a255a avoid opening files just to close them 2019-04-01 13:24:18 -04:00
Mahmood Ali dac2cd3df3 Add test cases for waiting on children
Also, make the test use files just like in the non-test case.
2019-04-01 13:24:18 -04:00
Michael Schurter b8d1dd95a0
Update drivers/shared/executor/executor_test.go
Co-Authored-By: notnoop <mahmood@notnoop.com>
2019-03-31 20:34:24 -04:00
Mahmood Ali df5d7ba50d fix test setup 2019-03-26 09:15:22 -04:00
Mahmood Ali d737a9836a test kill wait 2019-03-26 09:15:22 -04:00
Mahmood Ali 2a7b18aec4
Revert "executor: synchronize exitState accesses" (#5449)
Reverts hashicorp/nomad#5433

Apparently, channel communications can constitute Happens-Before even for proximate variables, so this syncing isn't necessary.

> _The closing of a channel happens before a receive that returns a zero value because the channel is closed._
https://golang.org/ref/mem#tmp_7
2019-03-20 07:33:05 -04:00
Nick Ethier 505e36ff7a
Merge pull request #5429 from hashicorp/b-blocking-executor-shutdown
executor: block shutdown on process exiting
2019-03-19 15:18:01 -04:00
Mahmood Ali a1776dba34 executor: synchronize exitState accesses
exitState is set in `wait()` goroutine but accessed in a different
`Wait()` goroutine, so accesses must be synchronized by a lock.
2019-03-17 11:56:58 -04:00
Nick Ethier 7418d09cf0
executor: block shutdown on process exiting 2019-03-15 23:50:17 -04:00
Mahmood Ali fb55717b0c
Regenerate Proto files (#5421)
Noticed that the protobuf files are out of sync with ones generated by 1.2.0 protoc go plugin.

The cause for these files seem to be related to release processes, e.g. [0.9.0-beta1 preperation](ecec3d38de (diff-da4da188ee496377d456025c2eab4e87)), and [0.9.0-beta3 preperation](b849d84f2f).

This restores the changes to that of the pinned protoc version and fails build if protobuf files are out of sync.  Sample failing Travis job is that of the first commit change: https://travis-ci.org/hashicorp/nomad/jobs/506285085
2019-03-14 10:56:27 -04:00
Iskander (Alex) Sharipov e69909fbd3
drivers/shared/executor: fix strings.Replace call
strings.Replace call with n=0 argument makes no sense
as it will do nothing. Probably -1 is intended.

Signed-off-by: Iskander Sharipov <quasilyte@gmail.com>
2019-03-02 00:33:17 +03:00
Michael Schurter 38821954b7 plugins: squelch context Canceled error logs
As far as I can tell this is the most straightforward and resilient way
to skip error logging on context cancellation with grpc streams. You
cannot compare the error against context.Canceled directly as it is of
type `*status.statusError`. The next best solution I found was:

```go
resp, err := stream.Recv()
if code, ok := err.(interface{ Code() code.Code }); ok {
	if code.Code == code.Canceled {
		return
	}
}
```

However I think checking ctx.Err() directly makes the code much easier
to read and is resilient against grpc API changes.
2019-02-21 15:32:18 -08:00
Mahmood Ali a394cd63f4
CVE-2019-5736: Update libcontainer depedencies (#5334)
* CVE-2019-5736: Update libcontainer depedencies

Libcontainer is vulnerable to a runc container breakout, that was
reported as CVE-2019-5736[1].  Upgrading vendored libcontainer with the fix.

The runc changes are captured in 369b920277 .

[1] https://seclists.org/oss-sec/2019/q1/119
2019-02-19 20:21:18 -05:00
Alex Dadgar bc804dda2e Nomad 0.9.0-beta1 generated code 2019-01-30 10:49:44 -08:00
Nick Ethier bb9a8afe9b
executor: fix bug and add tests for incorrect stats timestamp reporting 2019-01-28 21:57:45 -05:00
Nick Ethier c7cc81924d
drivers/docker: handle shutdown of upgraded tasks correctly 2019-01-24 14:21:59 -05:00
Alex Dadgar 4bdccab550 goimports 2019-01-22 15:44:31 -08:00
Nick Ethier e3c6f89b9a
drivers: use consts for task handle version 2019-01-18 18:31:01 -05:00
Nick Ethier 05bd369d1f
driver: add pre09 migration logic 2019-01-18 18:31:01 -05:00
Nick Ethier e5a6fc9271
executor: add pre 0.9 client and wrapper 2019-01-18 18:30:58 -05:00
Mahmood Ali 5df63fda7c
Merge pull request #5190 from hashicorp/f-memory-usage
Track Basic Memory Usage as reported by cgroups
2019-01-18 16:46:02 -05:00
Danielle Tomlinson b918d25e62
Merge pull request #5192 from hashicorp/dani/executor-close
executor: Always close stdout/stderr fifos
2019-01-15 17:49:04 +01:00
Danielle Tomlinson 7f1ff3fab6 executor: Always close stdout/stderr fifos 2019-01-15 16:47:27 +01:00
Mahmood Ali 5649f72d27 propogate logs to executor plugin 2019-01-15 08:25:03 -05:00
Alex Dadgar 471fdb3ccf
Merge pull request #5173 from hashicorp/b-log-levels
Plugins use parent loggers
2019-01-14 16:14:30 -08:00
Mahmood Ali 9909d98bee Track Basic Memory Usage as reported by cgroups
Track current memory usage, `memory.usage_in_bytes`, in addition to
`memory.max_memory_usage_in_bytes` and friends.  This number is closer
what Docker reports.

Related to https://github.com/hashicorp/nomad/issues/5165 .
2019-01-14 18:47:52 -05:00