open-nomad

Author	SHA1	Message	Date
Nick Ethier	7e306afde3	executor: fix failing stats related test	2019-01-12 12:18:23 -05:00
Nick Ethier	b0d9440474	docker: add test for stats collection	2019-01-12 12:18:22 -05:00
Nick Ethier	9fea54e0dc	executor: implement streaming stats API plugins/driver: update driver interface to support streaming stats client/tr: use streaming stats api TODO: * how to handle errors and closed channel during stats streaming * prevent tight loop if Stats(ctx) returns an error drivers: update drivers TaskStats RPC to handle streaming results executor: better error handling in stats rpc docker: better control and error handling of stats rpc driver: allow stats to return a recoverable error	2019-01-12 12:18:22 -05:00
Alex Dadgar	14ed757a56	Plugins use parent loggers This PR fixes various instances of plugins being launched without using the parent loggers. This meant that logs would not all go to the same output, break formatting etc.	2019-01-11 11:36:37 -08:00
Mahmood Ali	5389ebae41	Merge pull request #5166 from hashicorp/b-docker-tests-mac tests: run docker tests in macOS out of box	2019-01-09 13:07:37 -05:00
Mahmood Ali	90f3cea187	Merge pull request #5157 from hashicorp/r-drivers-no-cstructs drivers: avoid referencing client/structs package	2019-01-09 13:06:46 -05:00
Mahmood Ali	4952f2a182	Merge pull request #5159 from hashicorp/r-macos-tests Fix Travis MacOS job	2019-01-09 08:22:30 -05:00
Mahmood Ali	c78ed7246f	tests: run docker tests in macOS out of box Use `/tmp` as temporary directory for docker driver tests, so tests can run out of the box without any intervention. macOS sets tempdir as `/var`, which Docker does not whitelist as a path that can be bind-mounted.	2019-01-08 14:35:40 -05:00
Mahmood Ali	64f80343fc	drivers: re-export ResourceUsage structs Re-export the ResourceUsage structs in drivers package to avoid drivers depending directly on the internal client/structs package. I attempted moving the structs to drivers, but that caused some import cycles that were a bit hard to disentangle. Alternatively, I added an alias here that's sufficient for our purposes of avoiding external drivers depending on internal packages, while allowing us to restructure packages in future without breaking source compatibility.	2019-01-08 09:11:47 -05:00
Mahmood Ali	916a40bb9e	move cstructs.DeviceNetwork to drivers pkg	2019-01-08 09:11:47 -05:00
Mahmood Ali	9369b123de	use drivers.FSIsolation	2019-01-08 09:11:47 -05:00
Danielle Tomlinson	a9b9ad34dc	drivers: Implement InternalPluginDriver interface This implements the InternalPluginDriver interface in each driver, and calls the cancellation fn for their respective eventers. This fixes a per task goroutine leak during test suite execution.	2019-01-08 13:49:31 +01:00
Alex Dadgar	0106f23aaa	Review comments	2019-01-07 14:50:28 -08:00
Alex Dadgar	6c6e035dba	add docker logger to separate main	2019-01-07 14:49:40 -08:00
Alex Dadgar	a6b36df4de	remove nil logger	2019-01-07 14:48:01 -08:00
Mahmood Ali	0ba7b0c132	tests: helper function for checking docker presence	2019-01-07 08:27:06 -05:00
Mahmood Ali	796d625ab6	Skip tests requiring Docker daemon if not found.	2019-01-07 07:59:13 -05:00
Preetha Appan	2fb2de3cef	Standardize driver health description messages for all drivers	2019-01-06 22:06:38 -06:00
Nick Ethier	ce1a5cba0e	drivermanager: use allocID and task name to route task events	2018-12-18 23:01:51 -05:00
Alex Dadgar	bc55ec81b5	fix docker launching plugins	2018-12-18 16:48:01 -08:00
Alex Dadgar	4c57d2ec4d	Add plugin API versioning to plugin loader and plugins	2018-12-18 16:48:00 -08:00
Alex Dadgar	327b551b39	Drivers	2018-12-18 15:50:11 -08:00
Danielle Tomlinson	b61da13c20	docker: Delete Task on Destroy Currently the docker driver does not remove tasks from its state map when destroying the task, which leads to issues when restarting tasks in place, and leaks expired handles over time.	2018-12-18 15:53:31 +01:00
Mahmood Ali	168749ffd1	Merge pull request #5008 from hashicorp/b-docker-test-20181214 Fix flakiness in docker tests	2018-12-15 16:03:00 -05:00
Mahmood Ali	e4f44b9be5	tests: remove TestDockerDriver_Kill We already have two other Kill tests (e.g. TestDockerDriver_Start_Kill_Wait and TestDockerDriver_Start_KillTimeout), so don't need yet another flaky test.	2018-12-15 15:03:56 -05:00
Mahmood Ali	990a7d6776	driver/docker: stopping a dead container not error	2018-12-15 15:03:56 -05:00
Mahmood Ali	eaaaaf5c69	tests: assert docker containers start	2018-12-15 15:03:56 -05:00
Mahmood Ali	6631d42bfa	tests: try deflake TestDockerDriver_OOMKilled Noticed an issue in Docker daemon failing to handle the OOM test case failure in build https://travis-ci.org/hashicorp/nomad/jobs/468027848 , and I suspect it's related to the process dying so quickly, and potentially the way we are starting the task, so added a start up delay and made it more consistent with other tests that don't seem as flaky. The following is the log line showing Docker returning 500 error condition; while we can probably handle it gracefully without retrying, the retry is very cheap in this case and it's more of an optimization that we can handle in follow up PR. ``` testlog.go:32: 2018-12-14T14:57:52.626Z [DEBUG] docker/driver.go:852: docker: setting container startup command: task_name=nc-demo command="/bin/nc -l 127.0.0.1 -p 0" testlog.go:32: 2018-12-14T14:57:52.626Z [DEBUG] docker/driver.go:866: docker: setting container name: task_name=nc-demo container_name=724a3e77-8b15-e657-f6aa-84c2d3243b18 testlog.go:32: 2018-12-14T14:57:52.694Z [INFO ] docker/driver.go:196: docker: created container: container_id=362b6ea183f3c4ce472d7d7571ca47023cea1df0f5eb920827921716f17718be testlog.go:32: 2018-12-14T14:57:53.523Z [DEBUG] docker/driver.go:416: docker: failed to start container: container_id=362b6ea183f3c4ce472d7d7571ca47023cea1df0f5eb920827921716f17718be attempt=1 error="API error (500): {"message":"cannot start a stopped process: unknown"} " testlog.go:32: 2018-12-14T14:57:55.394Z [DEBUG] docker/driver.go:416: docker: failed to start container: container_id=362b6ea183f3c4ce472d7d7571ca47023cea1df0f5eb920827921716f17718be attempt=2 error="API error (500): {"message":"cannot start a stopped process: unknown"} " testlog.go:32: 2018-12-14T14:57:57.243Z [DEBUG] docker/driver.go:416: docker: failed to start container: container_id=362b6ea183f3c4ce472d7d7571ca47023cea1df0f5eb920827921716f17718be attempt=3 error="API error (500): {"message":"cannot start a stopped process: unknown"} " 
```	2018-12-15 15:03:56 -05:00
Mahmood Ali	6b216a6015	tests: pin busybox image to a specific point tag Using `:latest` tag is typically a cause of pain, as underlying image changes behavior. Here, I'm switching to using a point release, and re-updating the stored tarballs with it. Sadly, when saving/loading images, the repo digest is not supported: https://github.com/moby/moby/issues/22011 ; but using point releases should mitigate the problem. The motivation here is that docker tests have some flakiness due to accidental importing of `busybox:latest` which has `/bin/nc` that no longer supports `-p 0`: ``` $ docker run -it --rm busybox /bin/nc -l 127.0.0.1 -p 0 Unable to find image 'busybox:latest' locally latest: Pulling from library/busybox Digest: sha256:2a03a6059f21e150ae84b0973863609494aad70f0a80eaeb64bddd8d92465812 Status: Downloaded newer image for busybox:latest nc: bad local port '0' ``` Looks like older busybox versions (e.g. `busybox:1.24`) do honor `-p 0` as the test expects, but I would rather update busybox to fix.	2018-12-15 15:03:56 -05:00
Nick Ethier	0c50a51c19	executor: encode mounts and devices correctly when using grpc	2018-12-15 00:08:23 -05:00
Mahmood Ali	69b2355274	Merge pull request #4975 from hashicorp/fix-master-20181209 Some test fixes and remedies	2018-12-11 18:00:21 -05:00
Mahmood Ali	979a65486d	tests: tag image explicitly	2018-12-11 17:59:45 -05:00
Mahmood Ali	84ded28c6d	drivers/docker: enforce volumes.enabled (#4983) When volumes.enabled flag is off in Docker driver, disable all mounts of paths outside alloc dir.	2018-12-11 14:22:50 -05:00
Mahmood Ali	7d5b5bb5f9	Merge pull request #4933 from hashicorp/f-mount-device Mount Devices in container based drivers	2018-12-07 10:32:03 -05:00
Mahmood Ali	699875eb1c	fixup: add missed docker utils test	2018-12-06 15:46:35 -05:00
Mahmood Ali	e9557ae596	tests: ensure image is loaded as test setup	2018-12-06 15:36:43 -05:00
Mahmood Ali	b55fb642f1	driver/docker: honor plugin devices	2018-12-04 21:31:28 -05:00
Mahmood Ali	a580cef986	refactor device manipulation	2018-12-04 20:55:59 -05:00
Mahmood Ali	c88e3723eb	Fix docker tests Some tests have containers that die almost immediately, and may die and be cleaned up before `driver.WaitUntilStarted` runs. The causes for the container dying seem specific to each test: * TestDockerDriver_Cleanup: `hello-world` image just emits a message and exits immediately * TestDockerDriver_ForcePull_RepoDigest: the busybox image in `TestDockerDriver_ForcePull_RepoDigest` test didn't support `-p 0` argument * TestDockerDriver_Entrypoint: with the entrypoint being `/bin/sh -c`, the command needs to be the entire string; otherwise, it ignores the comments	2018-12-03 23:08:52 -05:00
Danielle Tomlinson	393b76ed7f	plugins: Move driver testing support to subpackage this allows us to drop a cyclical import, but is suboptimal as it requires BaseDriver tests to move. This falls firmly into the realm of being a hack. Alternatives welcome.	2018-12-01 17:29:39 +01:00
Danielle Tomlinson	66c521ca17	client: Move fingerprint structs to pkg This removes a cyclical dependency when importing client/structs from dependencies of the plugin_loader, specifically drivers, which arises because client/config also depends on the plugin_loader. It also better reflects the ownership of fingerprint structs, as they are fairly internal to the fingerprint manager.	2018-12-01 17:10:39 +01:00
Danielle Tomlinson	51a9f7369e	Merge pull request #4936 from hashicorp/f-legacy-refactor Refactor and repackage client/driver	2018-11-30 13:38:06 +01:00
Mahmood Ali	84e04cfa40	Merge pull request #4926 from hashicorp/f-docker-image-ref Use user provided image name to launch container	2018-11-30 07:27:39 -05:00
Mahmood Ali	94d43b8003	Merge pull request #4924 from hashicorp/f-docker-mounts Support bind and tmpfs docker mounts	2018-11-30 07:27:17 -05:00
Danielle Tomlinson	2db5ae38d8	client: Rename drivers/shared/env => client/taskenv	2018-11-30 12:18:39 +01:00
Danielle Tomlinson	f3a77b8084	client: Merge driver/shared/structs and client/structs	2018-11-30 10:56:45 +01:00
Danielle Tomlinson	d582ea1d8b	drivers: Create drivers/shared/structs This creates a drivers/shared/structs package and moves the buffer size checks into it.	2018-11-30 10:46:13 +01:00
Danielle Tomlinson	1a29811169	drivers: Move client/drivers/env to drivers/shared/env As part of deprecating legacy drivers, we're moving the env package to a new drivers/shared tree, as it is used by the modern docker and rkt driver packages, and is useful for 3rd party plugins.	2018-11-30 10:46:13 +01:00
Preetha Appan	9f4439243b	Fix docker driver to use new fingerprint typed attributes	2018-11-28 10:01:03 -06:00
Preetha Appan	f89dbcd9cc	modify fingerprint interface to use typed attribute struct	2018-11-28 10:01:03 -06:00
Mahmood Ali	844fd47acc	Use user provided image name to launch container This allows the container to be tagged with a user friendly image name (e.g. `redis:3.2`) rather than the image ID (e.g. `sha256:87856cc39862cec77541d68382e4867d7ccb29a85a17221446c857ddaebca916`). Useful for human debugging, as well as some debugging and image scanning tools. This risks two bad changes: 1. Discrepancy in image resolution between docker and Nomad's image loader. * I checked the image creation paths in Nomad, and noticed that we either pulled the image or inspect the image with the user provided name. 2. A race in image tagging where the tag is modified between image loading and container creation. * I, personally, don't think this case is cause for concern, as it is analogous to the task running a bit later. As long as the image is still present, creating the container should be good.	2018-11-27 16:12:15 -05:00
Mahmood Ali	f6d6a50c39	add support for tmpfs	2018-11-27 07:20:17 -05:00
Mahmood Ali	0a09f5521d	Support docker bind mounts	2018-11-27 07:20:17 -05:00
Mahmood Ali	141092e46d	Formatting and typo fixes	2018-11-25 11:53:21 -05:00
Mahmood Ali	c61d99b525	Merge pull request #4908 from hashicorp/f-docker-opts-storageopt Add support for docker storage options	2018-11-20 21:08:27 -05:00
Nick Ethier	1f3fe02e62	docker: sync access to exit result within a handle	2018-11-20 20:41:32 -05:00
Michael Schurter	2275153875	Apply suggestions from code review Co-Authored-By: nickethier <ncethier@gmail.com>	2018-11-20 20:33:31 -05:00
Mahmood Ali	e9e415f186	Add support for storage opt	2018-11-20 16:11:02 -05:00
Nick Ethier	3ccd359735	docker: unexport new coordinator func	2018-11-19 23:07:07 -05:00
Nick Ethier	8b9b2b476e	docker: add default blocks for driver plugin config schema	2018-11-19 22:59:18 -05:00
Nick Ethier	2667f48a5d	docker: move config RPCs to config.go	2018-11-19 22:59:18 -05:00
Nick Ethier	aa9f45ae47	docker: fix tests	2018-11-19 22:59:18 -05:00
Nick Ethier	0f03e8f520	docker: remove container pointer from task handle	2018-11-19 22:59:18 -05:00
Nick Ethier	ce4b867d21	docker: move volume driver options to separate block	2018-11-19 22:59:18 -05:00
Nick Ethier	fca2df3c79	docker: group common config into blocks	2018-11-19 22:59:17 -05:00
Michael Schurter	813341dd59	Apply suggestions from code review Co-Authored-By: nickethier <ncethier@gmail.com>	2018-11-19 22:59:17 -05:00
Nick Ethier	b7bd36db30	docker: remove global pull coordinator	2018-11-19 22:59:17 -05:00
Nick Ethier	f0a86859a0	docker: remove call to global metrics instance	2018-11-19 22:59:17 -05:00
Nick Ethier	8ef73e63ce	docker: moved fingerprint code to its own file	2018-11-19 22:59:17 -05:00
Nick Ethier	4be8a86ef9	plugins/driver: remove NodeResources from task Resources and use PercentTicks field for docker driver	2018-11-19 22:59:17 -05:00
Nick Ethier	ced5d5c445	docker: move recoverable error proto to shared structs	2018-11-19 22:59:16 -05:00
Nick Ethier	585e468085	docker: implement recover task logic	2018-11-19 22:59:16 -05:00
Nick Ethier	ee51cb6a93	docker: finished porting tests	2018-11-19 22:59:16 -05:00
Nick Ethier	3d7cdea19e	drivers/docker: more work porting tests from old driver plugin	2018-11-19 22:59:16 -05:00
Nick Ethier	8f8698b3e1	docker: started work on porting docker driver to new plugin framework	2018-11-19 22:59:15 -05:00
Nick Ethier	9e64ce7d73	docker: properly launch docker logger process	2018-11-19 22:59:12 -05:00
Mahmood Ali	1f4db08f42	Regenerate proto files with protoc-gen-go@v1.2.0	2018-11-14 09:01:26 -05:00
Mahmood Ali	470d20cdf3	Avoid downloading image if present locally	2018-11-13 10:21:40 -05:00
Mahmood Ali	e9067e52b4	pull alpine image needed for test The test requires the image to be present locally, so importing it as part of setup.	2018-11-13 10:21:40 -05:00
Alex Dadgar	204ca8230c	Device manager Introduce a device manager that manages the lifecycle of device plugins on the client. It fingerprints, collects stats, and forwards Reserve requests to the correct plugin. The manager also handles device plugins failing and validates their output.	2018-11-07 10:43:15 -08:00
Nick Ethier	448a709603	driver/docker: rename container logger	2018-10-16 16:56:56 -07:00
Nick Ethier	64e9fc3798	docklog: add go-plugin for forwarding of docker logs	2018-10-16 16:56:56 -07:00

282 commits