open-nomad

Commit Graph

Author	SHA1	Message	Date
Mahmood Ali	6631d42bfa	tests: try deflake TestDockerDriver_OOMKilled Noticed an issue in Docker daemon failing to handle the OOM test case failure in build https://travis-ci.org/hashicorp/nomad/jobs/468027848 , and I suspect it's related to the process dying so quickly, and potentially the way we are starting the task, so added a start up delay and made it more consistent with other tests that don't seem as flaky. The following is the log line showing Docker returning 500 error condition; while we can probably handle it gracefully without retrying, the retry is very cheap in this case and it's more of an optimization that we can handle in follow up PR. ``` testlog.go:32: 2018-12-14T14:57:52.626Z [DEBUG] docker/driver.go:852: docker: setting container startup command: task_name=nc-demo command="/bin/nc -l 127.0.0.1 -p 0" testlog.go:32: 2018-12-14T14:57:52.626Z [DEBUG] docker/driver.go:866: docker: setting container name: task_name=nc-demo container_name=724a3e77-8b15-e657-f6aa-84c2d3243b18 testlog.go:32: 2018-12-14T14:57:52.694Z [INFO ] docker/driver.go:196: docker: created container: container_id=362b6ea183f3c4ce472d7d7571ca47023cea1df0f5eb920827921716f17718be testlog.go:32: 2018-12-14T14:57:53.523Z [DEBUG] docker/driver.go:416: docker: failed to start container: container_id=362b6ea183f3c4ce472d7d7571ca47023cea1df0f5eb920827921716f17718be attempt=1 error="API error (500): {"message":"cannot start a stopped process: unknown"} " testlog.go:32: 2018-12-14T14:57:55.394Z [DEBUG] docker/driver.go:416: docker: failed to start container: container_id=362b6ea183f3c4ce472d7d7571ca47023cea1df0f5eb920827921716f17718be attempt=2 error="API error (500): {"message":"cannot start a stopped process: unknown"} " testlog.go:32: 2018-12-14T14:57:57.243Z [DEBUG] docker/driver.go:416: docker: failed to start container: container_id=362b6ea183f3c4ce472d7d7571ca47023cea1df0f5eb920827921716f17718be attempt=3 error="API error (500): {"message":"cannot start a stopped process: unknown"} " 
```	2018-12-15 15:03:56 -05:00
Mahmood Ali	6b216a6015	tests: pin busybox image to a specific point tag Using the `:latest` tag is typically a cause of pain, as the underlying image changes behavior. Here, I'm switching to using a point release, and re-updating the stored tarballs with it. Sadly, when saving/loading images, the repo digest is not supported: https://github.com/moby/moby/issues/22011 ; but using point releases should mitigate the problem. The motivation here is that docker tests have some flakiness due to accidental importing of `busybox:latest` which has `/bin/nc` that no longer supports `-p 0`: ``` $ docker run -it --rm busybox /bin/nc -l 127.0.0.1 -p 0 Unable to find image 'busybox:latest' locally latest: Pulling from library/busybox Digest: sha256:2a03a6059f21e150ae84b0973863609494aad70f0a80eaeb64bddd8d92465812 Status: Downloaded newer image for busybox:latest nc: bad local port '0' ``` Looks like older busybox versions (e.g. `busybox:1.24`) do honor `-p 0` as the tests expect, but I would rather update busybox to fix.	2018-12-15 15:03:56 -05:00
Mahmood Ali	f0ec27da3c	tests: ensure exec tests pass valid task resources (#4992) Prior to 97f33bb1537d04905cb84199672bcdf46ebb4e65, executor cgroup validation errors were silently ignored. Enforcing them reveals test cases that missed them. This doesn't change the customer-facing contract, as the resource struct is either configured or we default to 100 (much higher than 2).	2018-12-12 20:40:38 -05:00
Mahmood Ali	74bd0be6ea	drivers/exec: support device binds and mounts	2018-12-11 18:35:21 -05:00
Mahmood Ali	8726ab3b9e	Merge pull request #4985 from hashicorp/test-with-xenial ci: Test with Ubuntu 16.04 in TravisCI	2018-12-11 18:00:39 -05:00
Mahmood Ali	69b2355274	Merge pull request #4975 from hashicorp/fix-master-20181209 Some test fixes and remedies	2018-12-11 18:00:21 -05:00
Mahmood Ali	979a65486d	tests: tag image explicitly	2018-12-11 17:59:45 -05:00
Alex Dadgar	1531b6d534	Merge pull request #4970 from hashicorp/f-no-iops Deprecate IOPS	2018-12-11 12:51:22 -08:00
Mahmood Ali	e6e71fb47a	tests: skip checking rdma cgroup rdma was added in most recent kernels and libcontainer/docker don't isolate them by default.	2018-12-11 15:49:11 -05:00
Mahmood Ali	84ded28c6d	drivers/docker: enforce volumes.enabled (#4983) When the volumes.enabled flag is off in the Docker driver, disable all mounts of paths outside the alloc dir.	2018-12-11 14:22:50 -05:00
Mahmood Ali	f6f39f1314	add a note about busybox license	2018-12-11 09:35:26 -05:00
Mahmood Ali	5a487ac884	tests: prevent indefinite blocking in some tests Noticed a few places where tests seem to block indefinitely and panic after the test run reaches the test package timeout. I intend to follow up with the proper fix later, but timing out is much better than indefinitely blocking.	2018-12-11 09:35:26 -05:00
Mahmood Ali	23c07b9afe	tests: update stop/kill tests with new pattern Update rawexec and rkt stop/kill tests with the patterns introduced in 7a49e9b68e519050a0c2ef0b67c33503bfbc51be. This implementation should be more resilient to discrepancy between task stopping and task being marked as exited.	2018-12-11 09:35:26 -05:00
Mahmood Ali	8453ce7d56	tests: setup libcontainer rootfs Using statically linked busybox binary to setup a basic rootfs for testing, by symlinking it to provide the basic commands used in tests. I considered using a proper rootfs tarball, but the overhead of managing tarfile and expanding it seems significant enough that I went with this implementation.	2018-12-11 09:35:26 -05:00
Mahmood Ali	97829a3f02	fix dtestutil.NewDriverHarness ref	2018-12-08 09:58:23 -05:00
Mahmood Ali	021d3720b5	Merge pull request #4950 from hashicorp/b-exc-libcontainer-kill executor: kill all container processes	2018-12-08 09:52:42 -05:00
Mahmood Ali	7d5b5bb5f9	Merge pull request #4933 from hashicorp/f-mount-device Mount Devices in container based drivers	2018-12-07 10:32:03 -05:00
Alex Dadgar	1e3c3cb287	Deprecate IOPS IOPS has been modelled as a resource since Nomad 0.1 but has never actually been detected, and there is no plan in the short term to add detection. This is because IOPS is too simplistic a unit to define the performance requirements of the underlying storage system. In its current state it adds unnecessary confusion and can be removed without impacting any users. This PR leaves IOPS defined at the jobspec parsing level and in the api/ resources since these are the two public uses of the field. These should be considered deprecated and only exist to allow users to stop using them during the Nomad 0.9.x release. In the future, there should be no expectation that the field will exist.	2018-12-06 15:09:26 -08:00
Mahmood Ali	a7b205daf2	Merge pull request #4955 from hashicorp/fix-docker-tests-20181203 Fix docker driver tests	2018-12-06 16:41:33 -05:00
Mahmood Ali	bdc53b1d8e	driver/rkt: mount plugin devices	2018-12-06 15:46:35 -05:00
Mahmood Ali	2c0fd2a902	driver/lxc: mount plugin devices Also, LXC requires target paths to be relative. Container paths in LXC binds should never be absolute paths, so we strip any leading `/`, even if a user sets one.	2018-12-06 15:46:35 -05:00
Mahmood Ali	699875eb1c	fixup: add missed docker utils test	2018-12-06 15:46:35 -05:00
Mahmood Ali	e9557ae596	tests: ensure image is loaded as test setup	2018-12-06 15:36:43 -05:00
Mahmood Ali	b55fb642f1	driver/docker: honor plugin devices	2018-12-04 21:31:28 -05:00
Mahmood Ali	a580cef986	refactor device manipulation	2018-12-04 20:55:59 -05:00
Mahmood Ali	3a18105d06	drivers/exec: refactor stop/kill tests Simplify the tests to do all assertions within the main goroutine and account for status propagation delay.	2018-12-04 20:34:43 -05:00
Mahmood Ali	428d35a5a9	executor: Keep 0.8.6 exit code for wait() failures 0.8.6 uses exit code 1 when `proc.Wait()` fails: https://github.com/hashicorp/nomad/blob/v0.8.6/client/driver/executor/executor.go#L442	2018-12-04 19:38:25 -05:00
Mahmood Ali	8df9de6fd5	driver/rkt: use rkt environment The rkt command itself needs an environment with PATH set to find iptables.	2018-12-04 14:00:45 -05:00
Mahmood Ali	06a5cadf35	drivers/rkt: use image isolation for rkt	2018-12-04 11:40:10 -05:00
Mahmood Ali	178365848e	tests: don't assert in WaitForResult WaitForResult expects the body to fail and retries a few times before giving up. Assertions inside the testfn body cause it to terminate abruptly without retrying.	2018-12-04 11:40:10 -05:00
Mahmood Ali	f8ceeebf11	no t.Parallel() in executor table driven tests (#4948) When `t.Parallel()` is used inside a `t.Run()` subtest, the closure doesn't behave as expected, and some cases effectively get skipped. More details can be found in https://gist.github.com/posener/92a55c4cd441fc5e5e85f27bca008721	2018-12-04 09:04:04 -05:00
Mahmood Ali	216a2566c7	Update LXC with drivers/testutils changes (#4951)	2018-12-04 08:57:54 -05:00
Mahmood Ali	c88e3723eb	Fix docker tests Some tests have containers that die almost immediately, and may die and be cleaned up before `driver.WaitUntilStarted` runs. The cause of the container dying seems specific to each test: * TestDockerDriver_Cleanup: `hello-world` image just emits a message and exits immediately * TestDockerDriver_ForcePull_RepoDigest: the busybox image in `TestDockerDriver_ForcePull_RepoDigest` test didn't support `-p 0` argument * TestDockerDriver_Entrypoint: with the entrypoint being `/bin/sh -c`, the command needs to be the entire string; otherwise, it ignores the comments	2018-12-03 23:08:52 -05:00
Mahmood Ali	2516cb16b9	Kill all container processes on shutdown Currently, the libcontainer-based executor, upon shutdown, kills the container's initial process. The children of the killed process remain running, and the executor is never marked as terminated until they exit. Also, fix a case where we treat processes as successful when `proc.Wait()` fails. In some attempts, I was getting "waitid no child processes" errors, and such an error shouldn't cause the process to be considered successful.	2018-12-03 20:40:49 -05:00
Mahmood Ali	bd8e4f1c15	Test Stopping a multi-process exec Ensure that exec children processes get killed as well.	2018-12-03 20:40:19 -05:00
Danielle Tomlinson	10b3e68a6d	Merge pull request #4925 from hashicorp/f-driver-plugins-dani Third Party Driver Plugins Support	2018-12-03 20:48:19 +01:00
Mahmood Ali	88622b97bd	libcontainer to manage /dev and /proc (#4945) libcontainer already manages `/dev`, overriding task_dir, so let's use it for `/proc` as well and remove dead code.	2018-12-03 10:41:01 -05:00
Danielle Tomlinson	393b76ed7f	plugins: Move driver testing support to subpackage This allows us to drop a cyclical import, but is suboptimal as it requires BaseDriver tests to move. This falls firmly into the realm of being a hack. Alternatives welcome.	2018-12-01 17:29:39 +01:00
Danielle Tomlinson	66c521ca17	client: Move fingerprint structs to pkg This removes a cyclical dependency when importing client/structs from dependencies of the plugin_loader, specifically drivers, which arises because client/config also depends on the plugin_loader. It also better reflects the ownership of fingerprint structs, as they are fairly internal to the fingerprint manager.	2018-12-01 17:10:39 +01:00
Danielle Tomlinson	51a9f7369e	Merge pull request #4936 from hashicorp/f-legacy-refactor Refactor and repackage client/driver	2018-11-30 13:38:06 +01:00
Mahmood Ali	84e04cfa40	Merge pull request #4926 from hashicorp/f-docker-image-ref Use user provided image name to launch container	2018-11-30 07:27:39 -05:00
Mahmood Ali	94d43b8003	Merge pull request #4924 from hashicorp/f-docker-mounts Support bind and tmpfs docker mounts	2018-11-30 07:27:17 -05:00
Danielle Tomlinson	2db5ae38d8	client: Rename drivers/shared/env => client/taskenv	2018-11-30 12:18:39 +01:00
Danielle Tomlinson	f3a77b8084	client: Merge driver/shared/structs and client/structs	2018-11-30 10:56:45 +01:00
Danielle Tomlinson	fdfe93aa25	fixup: executorplugin: fix rkt build	2018-11-30 10:47:08 +01:00
Danielle Tomlinson	04c8851b4c	client: Migrate DriverStats optout to drivers/shared/structs	2018-11-30 10:46:13 +01:00
Danielle Tomlinson	d582ea1d8b	drivers: Create drivers/shared/structs This creates a drivers/shared/structs package and moves the buffer size checks into it.	2018-11-30 10:46:13 +01:00
Danielle Tomlinson	0544a57abe	drivers: Move client/drivers/executor to drivers/shared/executor	2018-11-30 10:46:13 +01:00
Danielle Tomlinson	1a29811169	drivers: Move client/drivers/env to drivers/shared/env As part of deprecating legacy drivers, we're moving the env package to a new drivers/shared tree, as it is used by the modern docker and rkt driver packages, and is useful for 3rd party plugins.	2018-11-30 10:46:13 +01:00
Preetha Appan	0d90ba392e	Fix lxc test panic	2018-11-28 13:56:17 -06:00


174 Commits