open-nomad

Commit Graph

Author	SHA1	Message	Date
Mahmood Ali	e2723399f3	Update drivers/docker/stats.go comment Co-Authored-By: schmichael <michael.schurter@gmail.com>	2019-04-02 09:09:17 -07:00
Michael Schurter	df3467a6ce	docker: fix send after close panic in stats destCh was being written to by one goroutine and closed by another goroutine. This panic occurred in Travis: ``` === FAIL: drivers/docker TestDockerCoordinator_ConcurrentPulls (117.66s) === PAUSE TestDockerCoordinator_ConcurrentPulls === CONT TestDockerCoordinator_ConcurrentPulls panic: send on closed channel goroutine 5358 [running]: github.com/hashicorp/nomad/drivers/docker.dockerStatsCollector(0xc0003a4a20, 0xc0003a49c0, 0x3b9aca00) /home/travis/gopath/src/github.com/hashicorp/nomad/drivers/docker/stats.go:108 +0x167 created by github.com/hashicorp/nomad/drivers/docker.TestDriver_DockerStatsCollector /home/travis/gopath/src/github.com/hashicorp/nomad/drivers/docker/stats_test.go:33 +0x1ab ``` The 2 ways to fix this kind of error are to either (1) add extra coordination around multiple goroutines writing to a chan or (2) make it so only one goroutine writes to a chan. I implemented (2) first as it's simpler, but @notnoop pointed out since the same destCh is reused in the stats loop there's now a double close panic possible! So this implements (1) by adding a *usageSender struct for handling concurrent senders and closing.	2019-04-02 08:28:08 -07:00
Michael Schurter	7d49bc4c71	executor/linux: make chroot binary paths absolute Avoid libcontainer.Process trying to lookup the binary via $PATH as the executor has already found where the binary is located.	2019-04-01 15:45:31 -07:00
Mahmood Ali	cb16ad7e3f	comment configureTLogging	2019-04-01 16:52:58 -04:00
Mahmood Ali	81f4f07ed7	rename fifo methods for clarity	2019-04-01 16:52:58 -04:00
Mahmood Ali	88dc4a255a	avoid opening files just to close them	2019-04-01 13:24:18 -04:00
Mahmood Ali	dac2cd3df3	Add test cases for waiting on children Also, make the test use files just like in the non-test case.	2019-04-01 13:24:18 -04:00
Mahmood Ali	2648d4176c	Merge pull request #5468 from hashicorp/b-test-exec-kill-wait Adding tests for killing task without grace	2019-03-31 20:37:15 -04:00
Michael Schurter	b8d1dd95a0	Update drivers/shared/executor/executor_test.go Co-Authored-By: notnoop <mahmood@notnoop.com>	2019-03-31 20:34:24 -04:00
Michael Schurter	294d405397	docker: restore pre-0.9 container names As far as I can tell Nomad itself does not use the container name after container creation, so this should be safe. OP: https://groups.google.com/d/topic/nomad-tool/kYkyERfVRXE/discussion v0.8.7 code: https://github.com/hashicorp/nomad/blob/v0.8.7/client/driver/docker.go#L1530-L1531	2019-03-29 13:55:43 -07:00
Mahmood Ali	df5d7ba50d	fix test setup	2019-03-26 09:15:22 -04:00
Mahmood Ali	d737a9836a	test kill wait	2019-03-26 09:15:22 -04:00
Mahmood Ali	b2b7618a1c	clarify unknown signal log line (#5466 )	2019-03-25 17:19:43 -04:00
Mahmood Ali	2a7b18aec4	Revert "executor: synchronize exitState accesses" (#5449 ) Reverts hashicorp/nomad#5433 Apparently, channel communications can constitute Happens-Before even for proximate variables, so this syncing isn't necessary. > _The closing of a channel happens before a receive that returns a zero value because the channel is closed._ https://golang.org/ref/mem#tmp_7	2019-03-20 07:33:05 -04:00
Nick Ethier	505e36ff7a	Merge pull request #5429 from hashicorp/b-blocking-executor-shutdown executor: block shutdown on process exiting	2019-03-19 15:18:01 -04:00
Mahmood Ali	a1776dba34	executor: synchronize exitState accesses exitState is set in `wait()` goroutine but accessed in a different `Wait()` goroutine, so accesses must be synchronized by a lock.	2019-03-17 11:56:58 -04:00
Nick Ethier	7418d09cf0	executor: block shutdown on process exiting	2019-03-15 23:50:17 -04:00
Mahmood Ali	8ec49fc133	Handle when cannot fetch docker logs (#5420 ) Fix #5418 When using a docker logger that doesn't support log streaming through API, currently docker logger runs a tight loop of Docker API calls unexpectedly. This change ensures we stop fetching logs early. Also, this adds some basic backoff strategy when Docker API logging fails unexpectedly, to avoid accidentally DoSing the docker daemon.	2019-03-14 16:23:11 -04:00
Mahmood Ali	fb55717b0c	Regenerate Proto files (#5421 ) Noticed that the protobuf files are out of sync with ones generated by 1.2.0 protoc go plugin. The cause for these files seems to be related to release processes, e.g. [0.9.0-beta1 preparation](`ecec3d38de (diff-da4da188ee496377d456025c2eab4e87)`), and [0.9.0-beta3 preparation](`b849d84f2f`). This restores the changes to that of the pinned protoc version and fails build if protobuf files are out of sync. Sample failing Travis job is that of the first commit change: https://travis-ci.org/hashicorp/nomad/jobs/506285085	2019-03-14 10:56:27 -04:00
Preetha Appan	7f0d9e0c8e	minor review feedback	2019-03-13 13:27:28 -05:00
Preetha Appan	273f1e993d	Validate all auth fields being empty rather than just email This fixes a regression in 0.9 beta3 compared to 0.8.7 in validating docker auth config	2019-03-13 11:47:37 -05:00
Preetha Appan	549ae657f0	Dont require email address for docker auth	2019-03-13 11:08:56 -05:00
Preetha	7759166b0d	Merge pull request #5380 from quasilyte/patch-1 drivers/shared/executor: fix strings.Replace call	2019-03-06 11:47:01 -06:00
Mahmood Ali	bb32ba8784	Support driver config fields being set to nil (#5391 ) To pick up https://github.com/hashicorp/hcl2/pull/90	2019-03-05 21:47:06 -05:00
Iskander (Alex) Sharipov	e69909fbd3	drivers/shared/executor: fix strings.Replace call strings.Replace call with n=0 argument makes no sense as it will do nothing. Probably -1 is intended. Signed-off-by: Iskander Sharipov <quasilyte@gmail.com>	2019-03-02 00:33:17 +03:00
Mahmood Ali	4726cb2207	logging.Type over logging.Driver	2019-02-28 16:40:18 -05:00
Mahmood Ali	104869c0e1	drivers/docker: rename logging `type` to `driver` Docker uses the term logging `driver` in its public documentation: in `docker` daemon config[1], `docker run` arguments [2] and in docker compose file[3]. Interestingly, docker used `type` in its API [4] instead of everywhere else. It's unfortunate that Nomad used `type` modeling after the Docker API rather than the user facing documents. Nomad using `type` feels very non-user friendly as it's disconnected from how Docker markets the flag and shows internal representation instead. Here, we rectify the situation by introducing `driver` field and preferring it over `type` in logging. [1] https://docs.docker.com/config/containers/logging/configure/ [2] https://docs.docker.com/engine/reference/run/#logging-drivers---log-driver [3] https://docs.docker.com/compose/compose-file/#logging [4] https://docs.docker.com/engine/api/v1.39/#operation/ContainerCreate	2019-02-28 16:04:03 -05:00
Mahmood Ali	67e2a0ac05	docker: report unhealthy in unsupported Windows (#5356 ) On Windows, Nomad only supports Windows containers, so report as unhealthy otherwise.	2019-02-27 08:10:23 -05:00
Michael Schurter	812f1679e2	Merge pull request #5352 from hashicorp/b-leaked-logmon logmon fixes	2019-02-26 08:35:46 -08:00
Danielle Tomlinson	e250aad31b	Merge pull request #5355 from hashicorp/dani/windows-dockerstats docker: Support Stats on Windows	2019-02-26 16:39:48 +01:00
Danielle Tomlinson	e3dc80bea3	docker: Return undetected before first detection This commit causes the docker driver to return undetected before it first establishes a connection to the docker daemon. This fixes a bug where hosts without docker installed would return as unhealthy, rather than undetected.	2019-02-25 11:02:42 +01:00
Danielle Tomlinson	8aff115fca	docker: Support stats on Windows	2019-02-22 14:19:58 +01:00
Michael Schurter	38821954b7	plugins: squelch context Canceled error logs As far as I can tell this is the most straightforward and resilient way to skip error logging on context cancellation with grpc streams. You cannot compare the error against context.Canceled directly as it is of type `*status.statusError`. The next best solution I found was: ```go resp, err := stream.Recv() if code, ok := err.(interface{ Code() code.Code }); ok { if code.Code == code.Canceled { return } } ``` However I think checking ctx.Err() directly makes the code much easier to read and is resilient against grpc API changes.	2019-02-21 15:32:18 -08:00
Mahmood Ali	6d30284ec9	Merge pull request #5341 from hashicorp/ci-windows-docker Run Docker tests in Windows AppVeyor CI	2019-02-21 13:17:33 -05:00
Danielle Tomlinson	2610e2d9ef	docker: Avoid leaking containers during Reattach Currently if a docker_logger cannot be reattached to, we will leak the container that was being used. This is problematic if e.g using static ports as it means you can never recover your task, or if a service is expensive to run and will then be running without supervision.	2019-02-20 17:47:06 +01:00
Danielle Tomlinson	953755ce24	Merge pull request #5335 from hashicorp/dani/docker-logger-spawn Increase resiliency of docker driver logging	2019-02-20 17:16:05 +01:00
Michael Schurter	a1645edb0b	Update drivers/docker/docklog/docker_logger.go Co-Authored-By: dantoml <dani@tomlinson.io>	2019-02-20 17:12:56 +01:00
Danielle Tomlinson	2f18441a47	docker: Respawn docker logger during recovery Sometimes the nomad docker_logger may be killed by a service manager when restarting the client for upgrades or reliability reasons. Currently if this happens, we leak the user's container and try to reschedule over it. This commit adds a new step to the recovery process that will spawn a new docker logger process that will fetch logs from _the current timestamp_. This is to avoid restarting users' tasks because our logging sidecar has failed.	2019-02-20 17:12:56 +01:00
Mahmood Ali	8c82c19831	tests: IsTravis() -> IsCI() Replace IsTravis() references that are intended for more CI environments rather than for Travis environment specifically.	2019-02-20 08:21:03 -05:00
Mahmood Ali	fedab3d7b0	driver/docker: Skip failing Windows tests Skip Docker tests that currently fail on Windows for further investigation.	2019-02-20 07:48:02 -05:00
Mahmood Ali	dd8a5c862a	Merge pull request #5321 from hashicorp/b-portmap-regression drivers: restore port_map old json support	2019-02-19 20:58:37 -05:00
Mahmood Ali	4def8529db	driver/docker: use BlockAttrs for storage_opts storage_opts is a new field in 0.9 cycle and doesn't have backward compatibility constraints.	2019-02-19 20:35:28 -05:00
Mahmood Ali	a394cd63f4	CVE-2019-5736: Update libcontainer dependencies (#5334 ) * CVE-2019-5736: Update libcontainer dependencies Libcontainer is vulnerable to a runc container breakout, that was reported as CVE-2019-5736[1]. Upgrading vendored libcontainer with the fix. The runc changes are captured in `369b920277` . [1] https://seclists.org/oss-sec/2019/q1/119	2019-02-19 20:21:18 -05:00
Danielle Tomlinson	3cf3ac7eac	dlogger: Increase resilience to docker api failure This commit adds some extra resiliency to the docker logger in the case of API failure from the docker daemon, by restarting the stream from the current point in time if the stream returns and the container is still running.	2019-02-19 15:17:54 +01:00
Mahmood Ali	46cd3c3f55	drivers: restore port_map old json support This ensures that `port_map` along with other block like attribute declarations (e.g. ulimit, labels, etc) can handle various hcl and json syntax that was supported in 0.8. In 0.8.7, the following declarations are effectively equivalent: ``` // hcl block port_map { http = 80 https = 443 } // hcl assignment port_map = { http = 80 https = 443 } // json single element array of map (default in API response) {"port_map": [{"http": 80, "https": 443}]} // json array of individual maps (supported accidentally iiuc) {"port_map": [{"http": 80}, {"https": 443}]} ``` We achieve compatibility by using `NewAttr("...", "list(map(string))", false)` to be serialized to a `map[string]string` wrapper, instead of using `BlockAttrs` declaration. The wrapper merges the list of maps automatically, to ease driver development. This approach is closer to how v0.8.7 implemented the fields [1][2], and despite its verbosity, seems to preserve 0.8.7 behavior in hcl2. This is only required for built-in types that have backward compatibility constraints. External drivers should use `BlockAttrs` instead, as they see fit. [1] https://github.com/hashicorp/nomad/blob/v0.8.7/client/driver/docker.go#L216 [2] https://github.com/hashicorp/nomad/blob/v0.8.7/client/driver/docker.go#L698-L700	2019-02-16 11:37:33 -05:00
Danielle Tomlinson	be431cb83d	Merge pull request #5326 from hashicorp/dani/json-submission api: Fix compatibility with pre 0.9 API jobs	2019-02-14 18:56:13 +01:00
Mahmood Ali	1430f94b2a	Update drivers/docker/config_test.go Co-Authored-By: dantoml <dani@tomlinson.io>	2019-02-14 18:55:10 +01:00
Danielle Tomlinson	3f696be06b	Add regression test for parsing null mounts	2019-02-14 18:03:35 +01:00
Danielle Tomlinson	a3a1491958	drivers/docker: SIGTERM to stop containers Windows Docker daemon does not support SIGINT, SIGTERM is the semantic equivalent that allows for graceful shutdown before being followed up by a SIGKILL.	2019-02-14 15:38:54 +00:00
Mahmood Ali	f7102cd01d	tests: add hcl task driver config parsing tests (#5314 ) * drivers: add config parsing tests Add basic tests for parsing and encoding task config. * drivers/docker: fix some config declarations * refactor and document config parse helpers	2019-02-12 14:46:37 -05:00
Mahmood Ali	aec9120994	drivers/java: restore 0.8.7 java version detection (#5317 ) Restore 0.8.x behavior where java driver is marked as detected when `java -version` exits with 0 but returns unexpected output. Furthermore, we restore the behavior where we parse the first three lines of `java -version` output but ignore the rest. If `java -version` returns fewer than 3 lines, Nomad 0.8.7 would panic. In this implementation, we'd still mark java as detected but return an empty version. The 0.8.7 logic for detecting java version is found in https://github.com/hashicorp/nomad/blob/v0.8.7/client/driver/java.go#L132-L172 . I punt on revamping how we can be more resilient to java -version syntax, and aimed for preserving existing behavior instead.	2019-02-12 13:41:26 -05:00
Michael Schurter	3b84e08fa4	Merge pull request #5297 from hashicorp/b-docker-logging Docker: Fix logging config parsing	2019-02-11 06:57:52 -08:00
Gertjan Roggemans	94ca78354b	docker: Fix volume driver_config options spec (#5309 ) Fixes #5308	2019-02-11 09:18:44 -05:00
Michael Schurter	e1e4b10884	docker: fix logging config parsing Fixes https://groups.google.com/d/topic/nomad-tool/B3Uo6Kns2BI/discussion	2019-02-04 11:07:57 -08:00
Nick Ethier	e7ea26449e	client: fix bug during 0.8 state upgrade that causes external drivers to fail	2019-01-30 14:22:29 -05:00
Alex Dadgar	bc804dda2e	Nomad 0.9.0-beta1 generated code	2019-01-30 10:49:44 -08:00
Nick Ethier	bb9a8afe9b	executor: fix bug and add tests for incorrect stats timestamp reporting	2019-01-28 21:57:45 -05:00
Nick Ethier	bcbed3c532	Merge pull request #5248 from hashicorp/b-rawexec-leak Fix leaked executor in raw_exec	2019-01-28 21:18:31 -05:00
Alex Dadgar	991bcc3ef1	Don't fall through	2019-01-28 09:53:19 -08:00
Alex Dadgar	403faa0d7c	comment	2019-01-28 09:47:53 -08:00
Nick Ethier	1f4c26e19e	raw_exec: ensure executor is killed after task is stopped	2019-01-25 23:06:31 -05:00
Alex Dadgar	68ced492fb	Fix killing non-existent container with a kill timeout	2019-01-25 16:21:51 -08:00
Nick Ethier	448f93f1e9	Merge pull request #5237 from hashicorp/b-docker-09upgrade-shutdown drivers/docker: handle shutdown of upgraded tasks correctly	2019-01-25 12:13:03 -05:00
Nick Ethier	ac693fe6ac	docker: only log missing bridge_ip on initial fingerprint	2019-01-24 22:34:57 -05:00
Nick Ethier	c7cc81924d	drivers/docker: handle shutdown of upgraded tasks correctly	2019-01-24 14:21:59 -05:00
Nick Ethier	a5823a270c	Merge pull request #5209 from hashicorp/f-driver-upgradepath-test e2e tests for upgrade path to driver plugins	2019-01-23 21:03:29 -05:00
Nick Ethier	be976d9c9a	Merge branch 'master' into f-driver-upgradepath-test * master: (23 commits) tests: avoid assertion in goroutine spell check ci: run checkscripts tests: deflake TestRktDriver_StartWaitRecoverWaitStop drivers/rkt: Remove unused github.com/rkt/rkt drivers/rkt: allow development on non-linux cli: Hide `nomad docker_logger` from help output api: test api and structs are in sync goimports until make check is happy nil check node resources to prevent panic tr: use context in as select statement move pluginutils -> helper/pluginutils vet goimports gofmt Split hclspec move hclutils Driver tests do not use hcl2/hcl, hclspec, or hclutils move reattach config loader and singleton ...	2019-01-23 21:01:24 -05:00
Nick Ethier	5b9013528e	drivers: add docker upgrade path and e2e test	2019-01-23 14:44:42 -05:00
Alex Dadgar	48dfbbcad9	fix docker runtime attribute	2019-01-23 10:58:46 -08:00
Mahmood Ali	698f0a1575	Merge pull request #5229 from hashicorp/r-grabbag-201901019 Grab bag of small changes	2019-01-23 13:06:51 -05:00
Mahmood Ali	8d76bb0e82	tests: avoid assertion in goroutine	2019-01-23 12:34:16 -05:00
Mahmood Ali	464a70fda6	tests: deflake TestRktDriver_StartWaitRecoverWaitStop	2019-01-23 10:54:49 -05:00
Mahmood Ali	a64ad99b94	drivers/rkt: allow development on non-linux Avoid depending on OS-specific library for some simple structs.	2019-01-23 10:53:49 -05:00
Michael Schurter	32daa7b47b	goimports until make check is happy	2019-01-23 06:27:14 -08:00
Michael Schurter	be0bab7c3f	move pluginutils -> helper/pluginutils I wanted a different color bikeshed, so I get to paint it	2019-01-22 15:50:08 -08:00
Alex Dadgar	4bdccab550	goimports	2019-01-22 15:44:31 -08:00
Alex Dadgar	b7a65676fe	gofmt	2019-01-22 15:43:34 -08:00
Alex Dadgar	72a5691897	Driver tests do not use hcl2/hcl, hclspec, or hclutils	2019-01-22 15:43:34 -08:00
Alex Dadgar	b2c7268843	move reattach config	2019-01-22 15:11:58 -08:00
Alex Dadgar	cdcd3c929c	loader and singleton	2019-01-22 15:11:57 -08:00
Michael Schurter	9edff19625	test: port SignalFailure test from 0.8 Also fix signal error handling in mock_driver.	2019-01-22 08:08:08 -08:00
Nick Ethier	b840a2eb7b	drivers: fix func naming	2019-01-18 18:31:02 -05:00
Nick Ethier	e3c6f89b9a	drivers: use consts for task handle version	2019-01-18 18:31:01 -05:00
Nick Ethier	9dd4eb3581	drivers: add upgrade path for rawexec, java, rkt and qemu	2019-01-18 18:31:01 -05:00
Nick Ethier	6804450c69	cleanup code comments and small fixes from refactor	2019-01-18 18:31:01 -05:00
Nick Ethier	05bd369d1f	driver: add pre09 migration logic	2019-01-18 18:31:01 -05:00
Nick Ethier	e5a6fc9271	executor: add pre 0.9 client and wrapper	2019-01-18 18:30:58 -05:00
Mahmood Ali	5df63fda7c	Merge pull request #5190 from hashicorp/f-memory-usage Track Basic Memory Usage as reported by cgroups	2019-01-18 16:46:02 -05:00
Danielle Tomlinson	b65bf78513	docker: Fix missing import	2019-01-17 18:44:27 +01:00
Danielle Tomlinson	7fca934509	chore: General Cleanup	2019-01-17 18:43:14 +01:00
Danielle Tomlinson	e73962d8d6	docker: Only run Cleanup test on unix os'	2019-01-17 18:43:14 +01:00
Danielle Tomlinson	af202f347f	chore: goimports exec driver	2019-01-17 18:43:14 +01:00
Danielle Tomlinson	30a5e25d94	fixup: Typo in docker test	2019-01-17 18:43:14 +01:00
Danielle Tomlinson	82018cd030	chore: Fix docklog linting	2019-01-17 18:43:14 +01:00
Danielle Tomlinson	3b2ff2005b	chore: Fix docker test linting Due to https://github.com/tsenart/deadcode/issues/3 we can't specify these consts on their own. This moves them into the _platform_test.go files to avoid creating a package that only exposes a couple of values.	2019-01-17 18:43:14 +01:00
Danielle Tomlinson	15b1571882	drivers/exec: SIGINT unavailable on windows	2019-01-17 18:43:14 +01:00
Danielle Tomlinson	d78120f890	rawexec: Fix Exec test on windows	2019-01-17 18:43:14 +01:00
Danielle Tomlinson	65457dd2f2	rawexec: SIGINT is not available on Windows	2019-01-17 18:43:14 +01:00
Danielle Tomlinson	de86435cf8	docker: Test cleanup for windows * Docker for Windows does not support ulimits * Use filepath.ToSlash to test workdir * Convert expected mount paths to system style * Skip security-opt test on windows - Windows does not support seccomp, and it's unclear which options are available. * Skip StartN due to lack of sigint * docker: Use api to get image info on windows * No bridge on windows * Stop hardcoding /bin/	2019-01-17 18:43:14 +01:00
Danielle Tomlinson	3d2f0bb7da	docker: ExpandPath tests validate slashpath	2019-01-17 18:43:13 +01:00
Danielle Tomlinson	37190a5595	dockerlogger: Fix tests on windows Uses the home directory and windows path expansion, as c:\tmp doesn't necessarily exist, and mktemp would involve unnecessarily complicating the commands.	2019-01-17 18:43:13 +01:00
Danielle Tomlinson	e6c0738b65	Expand unix build definition	2019-01-17 18:43:13 +01:00
Mahmood Ali	677e11da86	fix imports	2019-01-16 19:53:48 -05:00
Michael Schurter	bea4d297b9	Merge pull request #5197 from hashicorp/b-rkt-cpu rkt: revert to pre-0.9 --cpu flag	2019-01-16 15:05:21 -08:00
Preetha Appan	55319b05d1	clean up read access	2019-01-16 11:04:11 -06:00
Preetha Appan	469a286b1b	Refactor logging in drivers to use a tri-state boolean Changes logging warnings/errors only if the state changes from healthy to unhealthy	2019-01-16 10:19:31 -06:00
Preetha Appan	0c2c0a2d43	Make docker driver logging less redundant	2019-01-16 10:16:57 -06:00
Michael Schurter	1879a7f788	rkt: revert to pre-0.9 --cpu flag See https://github.com/hashicorp/nomad/issues/3394#issuecomment-453296121 for details. During 0.9 development we switched to shares, but we'd prefer to maintain backward compat.	2019-01-15 13:15:28 -08:00
oleksii.shyman	e41fbf7577	Add support for docker runtimes - docker fingerprint issues a docker api system info call to get the list of supported OCI runtimes. - OCI runtimes are reported as comma separated list of names - docker driver is aware of GPU runtime presence - docker driver throws an error when user tries to run container with GPU, when GPU runtime is not present - docker GPU runtime name is configurable	2019-01-15 11:34:47 -08:00
Danielle Tomlinson	b918d25e62	Merge pull request #5192 from hashicorp/dani/executor-close executor: Always close stdout/stderr fifos	2019-01-15 17:49:04 +01:00
Danielle Tomlinson	f120c8f8f6	Merge pull request #5184 from hashicorp/dani/b-logmon-reattach docker: Terminate dockerlogger	2019-01-15 16:48:40 +01:00
Danielle Tomlinson	7f1ff3fab6	executor: Always close stdout/stderr fifos	2019-01-15 16:47:27 +01:00
Danielle Tomlinson	272a8726d7	docker: Terminate dockerlogger Previously, we did not attempt to stop Docker Logger processes until DestroyTask, which means that under many circumstances, we will never successfully close the plugin client. This commit terminates the plugin process when `run` terminates, or when `DestroyTask` is called. Steps to repro: ``` $ nomad agent -dev $ nomad init $ nomad run example.nomad $ nomad stop example $ ps aux | grep nomad # See docker logger process running $ signal the dev agent $ ps aux | grep nomad # See docker logger process running ```	2019-01-15 14:58:05 +01:00
Mahmood Ali	5649f72d27	propagate logs to executor plugin	2019-01-15 08:25:03 -05:00
Alex Dadgar	471fdb3ccf	Merge pull request #5173 from hashicorp/b-log-levels Plugins use parent loggers	2019-01-14 16:14:30 -08:00
Mahmood Ali	9909d98bee	Track Basic Memory Usage as reported by cgroups Track current memory usage, `memory.usage_in_bytes`, in addition to `memory.max_memory_usage_in_bytes` and friends. This number is closer to what Docker reports. Related to https://github.com/hashicorp/nomad/issues/5165 .	2019-01-14 18:47:52 -05:00
Nick Ethier	c619e70d39	Merge pull request #5018 from hashicorp/f-executor-stats executor: streaming stats api	2019-01-14 15:02:35 -05:00
Nick Ethier	a4534779d3	qemu: missing gofmt	2019-01-13 16:06:56 -05:00
Michael Schurter	ff034ffbc9	Update drivers/qemu/driver_test.go use t.Logf instead of fmt.Printf Co-Authored-By: nickethier <ncethier@gmail.com>	2019-01-12 21:33:55 -05:00
Nick Ethier	3b395d7100	drivers: plumb grpc client logger	2019-01-12 12:18:23 -05:00
Nick Ethier	7e306afde3	executor: fix failing stats related test	2019-01-12 12:18:23 -05:00
Nick Ethier	b0d9440474	docker: add test for stats collection	2019-01-12 12:18:22 -05:00
Nick Ethier	9fea54e0dc	executor: implement streaming stats API plugins/driver: update driver interface to support streaming stats client/tr: use streaming stats api TODO: * how to handle errors and closed channel during stats streaming * prevent tight loop if Stats(ctx) returns an error drivers: update drivers TaskStats RPC to handle streaming results executor: better error handling in stats rpc docker: better control and error handling of stats rpc driver: allow stats to return a recoverable error	2019-01-12 12:18:22 -05:00
Alex Dadgar	bb6ea30f58	fix rkt use of executor	2019-01-11 11:36:37 -08:00
Alex Dadgar	14ed757a56	Plugins use parent loggers This PR fixes various instances of plugins being launched without using the parent loggers. This meant that logs would not all go to the same output, break formatting etc.	2019-01-11 11:36:37 -08:00
Mahmood Ali	614f63f40d	drivers/java: use libcontainer executor on java linux	2019-01-10 10:10:40 -05:00
Mahmood Ali	9740443703	tests: ignore _JAVA_OPTIONS line ignore _JAVA_OPTIONS line in `java -version`, as it's irrelevant.	2019-01-10 10:10:40 -05:00
Mahmood Ali	5389ebae41	Merge pull request #5166 from hashicorp/b-docker-tests-mac tests: run docker tests in macOS out of box	2019-01-09 13:07:37 -05:00
Mahmood Ali	b08f59cdda	Merge pull request #5162 from hashicorp/f-extract-lxc Extract LXC from nomad	2019-01-09 13:07:05 -05:00
Mahmood Ali	90f3cea187	Merge pull request #5157 from hashicorp/r-drivers-no-cstructs drivers: avoid referencing client/structs package	2019-01-09 13:06:46 -05:00
Mahmood Ali	ff48dbb8a9	Merge pull request #5163 from hashicorp/r-minor-changes-20180108 Fix a panic on node.Deregister fail	2019-01-09 09:56:00 -05:00
Mahmood Ali	1f2473263e	fix more cases of logging arity errors	2019-01-09 09:22:47 -05:00
Mahmood Ali	4952f2a182	Merge pull request #5159 from hashicorp/r-macos-tests Fix Travis MacOS job	2019-01-09 08:22:30 -05:00
Mahmood Ali	c78ed7246f	tests: run docker tests in macOS out of box Use `/tmp` as temporary directory for docker driver tests, so tests can run out of the box without any intervention. macOS sets tempdir as `/var`, which Docker does not whitelist as a path that can be bind-mounted.	2019-01-08 14:35:40 -05:00
Mahmood Ali	8f20bc8ce2	Merge pull request #5154 from hashicorp/f-revert-exec-devs drivers/exec: restrict devices exposed to tasks	2019-01-08 12:43:06 -05:00
Mahmood Ali	d19b92edec	executor: add a comment detailing isolation	2019-01-08 12:10:26 -05:00
Mahmood Ali	62a7f951c0	remove lxc references	2019-01-08 09:28:20 -05:00
Mahmood Ali	426c981c34	Remove some dead code	2019-01-08 09:11:48 -05:00
Mahmood Ali	64f80343fc	drivers: re-export ResourceUsage structs Re-export the ResourceUsage structs in drivers package to avoid drivers depending directly on the internal client/structs package. I attempted moving the structs to drivers, but that caused some import cycles that were a bit hard to disentangle. Alternatively, I added an alias here that's sufficient for our purposes of avoiding external drivers depending on internal packages, while allowing us to restructure packages in future without breaking source compatibility.	2019-01-08 09:11:47 -05:00
Mahmood Ali	916a40bb9e	move cstructs.DeviceNetwork to drivers pkg	2019-01-08 09:11:47 -05:00
Mahmood Ali	9369b123de	use drivers.FSIsolation	2019-01-08 09:11:47 -05:00
Danielle Tomlinson	a9b9ad34dc	drivers: Implement InternalPluginDriver interface This implements the InternalPluginDriver interface in each driver, and calls the cancellation fn for their respective eventers. This fixes a per task goroutine leak during test suite execution.	2019-01-08 13:49:31 +01:00
Alex Dadgar	0106f23aaa	Review comments	2019-01-07 14:50:28 -08:00
Alex Dadgar	8a35d7b1dd	Test recovery	2019-01-07 14:49:41 -08:00
Alex Dadgar	f40f8ce02e	Mock driver has recovery, stats	2019-01-07 14:49:40 -08:00
Alex Dadgar	c9825a9c36	recover	2019-01-07 14:49:40 -08:00
Alex Dadgar	6c6e035dba	add docker logger to separate main	2019-01-07 14:49:40 -08:00
Alex Dadgar	39542b4cf0	rkt fingerprint logs once	2019-01-07 14:49:40 -08:00
Alex Dadgar	a6b36df4de	remove nil logger	2019-01-07 14:48:01 -08:00
Mahmood Ali	58fb6812db	tests: busybox only depends on arch Busybox is compiled for linux only. Making the file used in executor tests even for non-linux targets, as having the file present has no side-effects.	2019-01-07 08:36:32 -05:00
Mahmood Ali	0ba7b0c132	tests: helper function for checking docker presence	2019-01-07 08:27:06 -05:00
Mahmood Ali	796d625ab6	Skip tests requiring Docker daemon if not found.	2019-01-07 07:59:13 -05:00
Preetha Appan	2fb2de3cef	Standardize driver health description messages for all drivers	2019-01-06 22:06:38 -06:00
Preetha Appan	76c09c7cbf	remove unnecessary logging in rkt driver fingerprint method	2019-01-06 20:59:20 -06:00
Mahmood Ali	8797a4f0ea	drivers/exec: restrict devices exposed to tasks We ultimately decided to provide a limited set of devices in exec/java drivers instead of all of host ones. Pre-0.9, we made all host devices available to exec tasks accidentally, yet most applications only use a small subset, and this choice limits our ability to restrict/isolate GPU and other devices. Starting with 0.9, by default, we only provide the same subset of devices Docker provides, and allow users to provide more devices as needed on case-by-case basis. This reverts commit 5805c64a9f1c3b409693493dfa30e7136b9f547b. This reverts commit ff9a4a17e59388dcab067949e0664f645b2f5bcf.	2019-01-06 17:03:19 -05:00
Mahmood Ali	56e3171310	driver/exec: use dedicated /dev mount (#5147 ) Use a dedicated /dev mount so we can inject more devices if necessary, and avoid allowing a container to contaminate host /dev. Follow up to https://github.com/hashicorp/nomad/pull/5143 - and fixes master.	2019-01-04 10:36:05 -05:00
Mahmood Ali	5b0702c9eb	drivers/exec: bind mount /dev into rootfs Restores pre-0.9 behavior, where Nomad makes /dev available to exec task. Switching to libcontainer, we accidentally made only a small subset available. Here, we err on the side of preserving behavior of 0.8, instead of going for the sensible route, where only a reasonable subset of devices is mounted by default and user can opt to request more.	2019-01-03 14:29:18 -05:00
Mahmood Ali	d23f47736c	drivers/exec: run as `nobody` by default libcontainer based drivers (e.g. exec, java) should default to running processes as `nobody` [1]; but libcontainer treats empty user as `root` in our case (either because of default or due to `root` being current user). [1] `94c28a4c6c/website/source/docs/job-specification/task.html.md (task-parameters)`	2019-01-03 14:29:18 -05:00
Danielle Tomlinson	6c9b9dc9f1	rkt: Return consistent error when not root	2018-12-20 13:02:46 +01:00
Danielle Tomlinson	6709de199b	java: Return undetected when not running as root This is an unrecoverable error, so we should only do this check once, rather than returning unhealthy constantly.	2018-12-20 12:55:07 +01:00
Danielle Tomlinson	7b31027ea3	exec: Return undetected when not running as root This is an unrecoverable error, so we should only do this check once, rather than returning unhealthy constantly.	2018-12-20 12:54:19 +01:00
Nick Ethier	ce1a5cba0e	drivermanager: use allocID and task name to route task events	2018-12-18 23:01:51 -05:00
Alex Dadgar	bc55ec81b5	fix docker launching plugins	2018-12-18 16:48:01 -08:00
Alex Dadgar	730a6f5b9a	lint	2018-12-18 16:48:00 -08:00
Alex Dadgar	4c57d2ec4d	Add plugin API versioning to plugin loader and plugins	2018-12-18 16:48:00 -08:00
Alex Dadgar	b8268d9a46	Lint	2018-12-18 15:50:44 -08:00
Alex Dadgar	327b551b39	Drivers	2018-12-18 15:50:11 -08:00
Alex Dadgar	b9ee03b2c1	protos	2018-12-18 15:48:52 -08:00
Danielle Tomlinson	b61da13c20	docker: Delete Task on Destroy Currently the docker driver does not remove tasks from its state map when destroying the task, which leads to issues when restarting tasks in place, and leaks expired handles over time.	2018-12-18 15:53:31 +01:00
Mahmood Ali	56dfdd0874	tests: fix rkt command environment (#5011) The environment variables needed for envoking `rkt` command line should include host PATH (to access `iptables`). Given that the command runs outside the VM, untrusted task environment variables should NOT be honored here. We do this already with `rkt`, but the change is quite subtle to miss when refactoring.	2018-12-15 20:25:36 -05:00
Mahmood Ali	168749ffd1	Merge pull request #5008 from hashicorp/b-docker-test-20181214 Fix flakiness in docker tests	2018-12-15 16:03:00 -05:00
Mahmood Ali	e4f44b9be5	testes: remove TestDockerDriver_Kill We already have two other Kill tests (e.g. TestDockerDriver_Start_Kill_Wait and TestDockerDriver_Start_KillTimeout), so don't need yet another flaky test.	2018-12-15 15:03:56 -05:00
Mahmood Ali	990a7d6776	driver/docker: stopping a dead container not error	2018-12-15 15:03:56 -05:00
Mahmood Ali	eaaaaf5c69	tests: assert docker containers start	2018-12-15 15:03:56 -05:00
Mahmood Ali	6631d42bfa	tests: try deflake TestDockerDriver_OOMKilled Noticed an issue in Docker daemon failing to handle the OOM test case failure in build https://travis-ci.org/hashicorp/nomad/jobs/468027848 , and I suspect it's related to the process dying so quickly, and potentially the way we are starting the task, so added a start up delay and made it more consistent with other tests that don't seem as flaky. The following is the log line showing Docker returning 500 error condition; while we can probably handle it gracefully without retrying, the retry is very cheap in this case and it's more of an optimization that we can handle in follow up PR. ``` testlog.go:32: 2018-12-14T14:57:52.626Z [DEBUG] docker/driver.go:852: docker: setting container startup command: task_name=nc-demo command="/bin/nc -l 127.0.0.1 -p 0" testlog.go:32: 2018-12-14T14:57:52.626Z [DEBUG] docker/driver.go:866: docker: setting container name: task_name=nc-demo container_name=724a3e77-8b15-e657-f6aa-84c2d3243b18 testlog.go:32: 2018-12-14T14:57:52.694Z [INFO ] docker/driver.go:196: docker: created container: container_id=362b6ea183f3c4ce472d7d7571ca47023cea1df0f5eb920827921716f17718be testlog.go:32: 2018-12-14T14:57:53.523Z [DEBUG] docker/driver.go:416: docker: failed to start container: container_id=362b6ea183f3c4ce472d7d7571ca47023cea1df0f5eb920827921716f17718be attempt=1 error="API error (500): {"message":"cannot start a stopped process: unknown"} " testlog.go:32: 2018-12-14T14:57:55.394Z [DEBUG] docker/driver.go:416: docker: failed to start container: container_id=362b6ea183f3c4ce472d7d7571ca47023cea1df0f5eb920827921716f17718be attempt=2 error="API error (500): {"message":"cannot start a stopped process: unknown"} " testlog.go:32: 2018-12-14T14:57:57.243Z [DEBUG] docker/driver.go:416: docker: failed to start container: container_id=362b6ea183f3c4ce472d7d7571ca47023cea1df0f5eb920827921716f17718be attempt=3 error="API error (500): {"message":"cannot start a stopped process: unknown"} " 
```	2018-12-15 15:03:56 -05:00
Mahmood Ali	6b216a6015	tests: pin busybox image to a specific point tag Using `:latest` tag is typically a cause of pain, as underlying image changes behavior. Here, I'm switching to using a point release, and re-updating the stored tarballs with it. Sadly, when saving/loading images, the repo digeset is not supported: https://github.com/moby/moby/issues/22011 ; but using point releases should mitigate the problem. The motivation here is that docker tests have some flakiness due to accidental importing of `busybox:latest` which has `/bin/nc` that no longer supports `-p 0`: ``` $ docker run -it --rm busybox /bin/nc -l 127.0.0.1 -p 0 Unable to find image 'busybox:latest' locally latest: Pulling from library/busybox Digest: sha256:2a03a6059f21e150ae84b0973863609494aad70f0a80eaeb64bddd8d92465812 Status: Downloaded newer image for busybox:latest nc: bad local port '0' ``` Looks like older busybox versions (e.g. `busybox:1.24` do honor `-p 0` as the test expect, but I would rather update busybox to fix.	2018-12-15 15:03:56 -05:00
Nick Ethier	0c50a51c19	executor: encode mounts and devices correctly when using grpc	2018-12-15 00:08:23 -05:00
Nick Ethier	a771ee59aa	rawexec: fix misleading log	2018-12-14 23:40:37 -05:00
Nick Ethier	49e03542cc	executor: use int when encoding signal in RPC	2018-12-14 22:20:01 -05:00
Nick Ethier	09dadf0a23	Merge branch 'master' into f-grpc-executor * master: (71 commits) Fix output of 'nomad deployment fail' with no arg Always create a running allocation when testing task state tests: ensure exec tests pass valid task resources (#4992) some changes for more idiomatic code fix iops related tests fixed bug in loop delay gofmt improved code for readability client: updateAlloc release lock after read fixup! device attributes in `nomad node status -verbose` drivers/exec: support device binds and mounts fix iops bug and increase test matrix coverage tests: tag image explicitly changelog ci: install lxc-templates explicitly tests: skip checking rdma cgroup ci: use Ubuntu 16.04 (Xenial) in TravisCI client: update driver info on new fingerprint drivers/docker: enforce volumes.enabled (#4983) client: Style: use fluent style for building loggers ...	2018-12-13 14:41:09 -05:00
Mahmood Ali	f0ec27da3c	tests: ensure exec tests pass valid task resources (#4992) Prior to 97f33bb1537d04905cb84199672bcdf46ebb4e65, executor cgroup validation errors were silently ignored. Enforcing them reveals test cases that missed them. This doesn't change customer facing contract, as resource struct is is either configured or we default to 100 (much higher than 2).	2018-12-12 20:40:38 -05:00
Mahmood Ali	74bd0be6ea	drivers/exec: support device binds and mounts	2018-12-11 18:35:21 -05:00
Mahmood Ali	8726ab3b9e	Merge pull request #4985 from hashicorp/test-with-xenial ci: Test with Ubuntu 16.04 in TravisCI	2018-12-11 18:00:39 -05:00
Mahmood Ali	69b2355274	Merge pull request #4975 from hashicorp/fix-master-20181209 Some test fixes and remedies	2018-12-11 18:00:21 -05:00
Mahmood Ali	979a65486d	tests: tag image explicitly	2018-12-11 17:59:45 -05:00
Alex Dadgar	1531b6d534	Merge pull request #4970 from hashicorp/f-no-iops Deprecate IOPS	2018-12-11 12:51:22 -08:00
Mahmood Ali	e6e71fb47a	tests: skip checking rdma cgroup rdma was added in most recent kernels and libcontainer/docker don't isolate them by default.	2018-12-11 15:49:11 -05:00
Mahmood Ali	84ded28c6d	drivers/docker: enforce volumes.enabled (#4983) When volumes.enable flag is off in Docker driver, disable all mounts of paths outside alloc dir.	2018-12-11 14:22:50 -05:00
Mahmood Ali	f6f39f1314	add a note about busybox license	2018-12-11 09:35:26 -05:00
Mahmood Ali	5a487ac884	tests: prevent indefinite blocking in some tests Noticed few places where tests seem to block indefinitely and panic after the test run reaches the test package timeout. I intend to follow up with the proper fix later, but timing out is much better than indefinitely blocking.	2018-12-11 09:35:26 -05:00
Mahmood Ali	23c07b9afe	tests: update stop/kill tests with new pattern Update rawexec and rkt stop/kill tests with the patterns introduced in 7a49e9b68e519050a0c2ef0b67c33503bfbc51be. This implementation should be more resilient to discrepancy between task stopping and task being marked as exited.	2018-12-11 09:35:26 -05:00
Mahmood Ali	8453ce7d56	tests: setup libcontainer rootfs Using statically linked busybox binary to setup a basic rootfs for testing, by symlinking it to provide the basic commands used in tests. I considered using a proper rootfs tarball, but the overhead of managing tarfile and expanding it seems significant enough that I went with this implementation.	2018-12-11 09:35:26 -05:00
Mahmood Ali	97829a3f02	fix dtestutil.NewDriverHarness ref	2018-12-08 09:58:23 -05:00
Mahmood Ali	021d3720b5	Merge pull request #4950 from hashicorp/b-exc-libcontainer-kill executor: kill all container processes	2018-12-08 09:52:42 -05:00
Nick Ethier	35268fdb54	executor: misspell	2018-12-08 01:52:06 -05:00
Nick Ethier	86e9c11ec2	executor: don't drop errors when configuring libcontainer cfg, add nil check on resources	2018-12-07 14:03:42 -05:00
Mahmood Ali	7d5b5bb5f9	Merge pull request #4933 from hashicorp/f-mount-device Mount Devices in container based drivers	2018-12-07 10:32:03 -05:00
Nick Ethier	47df1dde10	Merge branch 'master' into f-grpc-executor	2018-12-06 21:42:38 -05:00
Nick Ethier	19a695308f	executor: fix tests	2018-12-06 21:39:53 -05:00
Nick Ethier	913efed9f5	executor: fix broken non-linux build	2018-12-06 21:33:20 -05:00
Nick Ethier	2283cb2c39	executor: use drivers.Resources as resource model	2018-12-06 21:22:02 -05:00
Nick Ethier	29ef54c0ee	executor: merge plugin shim with executor package	2018-12-06 21:13:45 -05:00
Nick Ethier	71353a88d4	executor: remove structs package	2018-12-06 20:54:14 -05:00
Alex Dadgar	1e3c3cb287	Deprecate IOPS IOPS have been modelled as a resource since Nomad 0.1 but has never actually been detected and there is no plan in the short term to add detection. This is because IOPS is a bit simplistic of a unit to define the performance requirements from the underlying storage system. In its current state it adds unnecessary confusion and can be removed without impacting any users. This PR leaves IOPS defined at the jobspec parsing level and in the api/ resources since these are the two public uses of the field. These should be considered deprecated and only exist to allow users to stop using them during the Nomad 0.9.x release. In the future, there should be no expectation that the field will exist.	2018-12-06 15:09:26 -08:00
Mahmood Ali	a7b205daf2	Merge pull request #4955 from hashicorp/fix-docker-tests-20181203 Fix docker driver tests	2018-12-06 16:41:33 -05:00
Mahmood Ali	bdc53b1d8e	driver/rkt: mount plugin devices	2018-12-06 15:46:35 -05:00
Mahmood Ali	2c0fd2a902	driver/lxc: mount plugin devices Also, LXC requires target paths to be relative. Container paths in LXC binds should never be absolute paths, so we strip any preceeding `/`, even if a user sets one.	2018-12-06 15:46:35 -05:00
Mahmood Ali	699875eb1c	fixup: add missed docker utils test	2018-12-06 15:46:35 -05:00
Mahmood Ali	e9557ae596	tests: ensure image is loaded as test setup	2018-12-06 15:36:43 -05:00
Nick Ethier	57ffece7f8	executor: update test references	2018-12-05 11:07:48 -05:00
Nick Ethier	02f4b0fac5	executor: update driver references	2018-12-05 11:04:18 -05:00
Nick Ethier	8b20de4801	executor: use grpc instead of netrpc as plugin protocol * Added protobuf spec for executor * Seperated executor structs into their own package	2018-12-05 11:03:56 -05:00
Mahmood Ali	b55fb642f1	driver/docker: honor plugin devices	2018-12-04 21:31:28 -05:00
Mahmood Ali	a580cef986	refactor device manipulation	2018-12-04 20:55:59 -05:00
Mahmood Ali	3a18105d06	drivers/exec: refactor stop/kill tests Simplify the tests to do all assertions within the main goroutine and account for status propagation delay.	2018-12-04 20:34:43 -05:00
Mahmood Ali	428d35a5a9	executor: Keep 0.8.6 exit code for wait() failures 0.8.6 uses exit code 1 when `proc.Wait()` fails: https://github.com/hashicorp/nomad/blob/v0.8.6/client/driver/executor/executor.go#L442	2018-12-04 19:38:25 -05:00
Mahmood Ali	8df9de6fd5	driver/rkt: use rkt environment The rkt command itself needs an environment with PATH set to find iptables.	2018-12-04 14:00:45 -05:00
Mahmood Ali	06a5cadf35	drivers/rkt: use image isolation for rkt	2018-12-04 11:40:10 -05:00
Mahmood Ali	178365848e	tests: don't assert in WaitForResult WaitForResult expects body to fail and retries few times before giving up. Assertions inside the testfn body causes it to terminate abruptly without retrying.	2018-12-04 11:40:10 -05:00
Mahmood Ali	f8ceeebf11	no t.Parallel() in excutor table driven tests (#4948) When `t.Parallel()` is used inside a `t.Run()` sub-set, the closure doesn't behave as expected, and some cases effectively get skipped. More details can be found in https://gist.github.com/posener/92a55c4cd441fc5e5e85f27bca008721	2018-12-04 09:04:04 -05:00
Mahmood Ali	216a2566c7	Update LXC with drivers/testutils changes (#4951)	2018-12-04 08:57:54 -05:00
Mahmood Ali	c88e3723eb	Fix docker tests Some tests have containers that die almost immediately, and may die and cleaned up before `driver.WaitUntilStarted` runs. The causes for container dying seems special for each test: * TestDockerDriver_Cleanup: `hello-world` image just emits a message and exits immediately * TestDockerDriver_ForcePull_RepoDigest: the busybox image in `TestDockerDriver_ForcePull_RepoDigest` test didn't support `-p 0` argument * TestDockerDriver_Entrypoint: with the entrypoint being `/bin/sh -c`, the command needs to be the entire string; otherwise, it ignores the comments	2018-12-03 23:08:52 -05:00
Mahmood Ali	2516cb16b9	Kill all container processes on shutdown Currently, libcontainer-based executor, upon shutdown, kills the container initial process. The children of the killed process remain running, and the executor is never marked as terminated until they do. Also, fix a case where we treat processes as successful, when `proc.Wait()` fails. In some attempts, I was getting "waitid no child processes" errors and such error shouldn't get process to be considered successful.	2018-12-03 20:40:49 -05:00
Mahmood Ali	bd8e4f1c15	Test Stopping a multi-process exec Ensure that exec children processes get killed as well.	2018-12-03 20:40:19 -05:00
Danielle Tomlinson	10b3e68a6d	Merge pull request #4925 from hashicorp/f-driver-plugins-dani Third Party Driver Plugins Support	2018-12-03 20:48:19 +01:00
Mahmood Ali	88622b97bd	libcontainer to manage /dev and /proc (#4945) libcontainer already manages `/dev`, overriding task_dir - so let's use it for `/proc` as well and remove deadcode.	2018-12-03 10:41:01 -05:00
Danielle Tomlinson	393b76ed7f	plugins: Move driver testing support to subpackage this allows us to drop a cyclical import, but is subobptimal as it requires BaseDriver tests to move. This falls firmly into the realm of being a hack. Alternatives welcome.	2018-12-01 17:29:39 +01:00
Danielle Tomlinson	66c521ca17	client: Move fingerprint structs to pkg This removes a cyclical dependency when importing client/structs from dependencies of the plugin_loader, specifically, drivers. Due to client/config also depending on the plugin_loader. It also better reflects the ownership of fingerprint structs, as they are fairly internal to the fingerprint manager.	2018-12-01 17:10:39 +01:00
Danielle Tomlinson	51a9f7369e	Merge pull request #4936 from hashicorp/f-legacy-refactor Refactor and repackage client/driver	2018-11-30 13:38:06 +01:00
Mahmood Ali	84e04cfa40	Merge pull request #4926 from hashicorp/f-docker-image-ref Use user provided image name to launch container	2018-11-30 07:27:39 -05:00
Mahmood Ali	94d43b8003	Merge pull request #4924 from hashicorp/f-docker-mounts Support bind and tmpfs docker mounts	2018-11-30 07:27:17 -05:00
Danielle Tomlinson	2db5ae38d8	client: Rename drivers/shared/env => client/taskenv	2018-11-30 12:18:39 +01:00
Danielle Tomlinson	f3a77b8084	client: Merge driver/shared/structs and client/structs	2018-11-30 10:56:45 +01:00
Danielle Tomlinson	fdfe93aa25	fixup: executorplugin: fix rkt build	2018-11-30 10:47:08 +01:00
Danielle Tomlinson	04c8851b4c	client: Migrate DriverStats optout to drivers/shared/structs	2018-11-30 10:46:13 +01:00
Danielle Tomlinson	d582ea1d8b	drivers: Create drivers/shared/structs This creates a drivers/shared/structs package and moves the buffer size checks into it.	2018-11-30 10:46:13 +01:00
Danielle Tomlinson	0544a57abe	drivers: Move client/drivers/executor to drivers/shared/executor	2018-11-30 10:46:13 +01:00
Danielle Tomlinson	1a29811169	drivers: Move client/drivers/env to drivers/shared/env As part of deprecating legacy drivers, we're moving the env package to a new drivers/shared tree, as it is used by the modern docker and rkt driver packages, and is useful for 3rd party plugins.	2018-11-30 10:46:13 +01:00
Preetha Appan	0d90ba392e	Fix lxc test panic	2018-11-28 13:56:17 -06:00
Preetha Appan	924f1b69e9	Fix failing lxc test	2018-11-28 11:05:35 -06:00
Preetha Appan	bf58c65ef7	Fix LXC driver fingerprint to use typedattributes	2018-11-28 10:09:10 -06:00
Preetha Appan	9f4439243b	Fix docker driver to use new fingerprint typed attributes	2018-11-28 10:01:03 -06:00
Preetha Appan	f89dbcd9cc	modify fingerprint interface to use typed attribute struct	2018-11-28 10:01:03 -06:00
Mahmood Ali	9af8deabbf	address review comments	2018-11-27 21:40:43 -05:00
Mahmood Ali	6d34d2fade	Add Driver Plugin for LXC	2018-11-27 21:40:43 -05:00
Michael Schurter	e565c63eed	gofmt -s -w drivers/rkt/driver_test.go	2018-11-27 17:24:23 -08:00
Mahmood Ali	844fd47acc	Use user provided image name to launch container This allows the container to be tagged with a user friendly image name (e.g. `redis:3.2`) rather than the image ID (e.g. `sha256:87856cc39862cec77541d68382e4867d7ccb29a85a17221446c857ddaebca916`). Useful for human debugging, as well as some debugging and image scanning tools. This risks two bad changes: 1. Discrepancy in image resolution between docker and Nomad's image loader. * I checked the image creation paths in Nomad, and noticed that we either pulled the image or inspect the image with the user provided name. 2. A race in image tagging where the tag is modified between image loading and container creation. * I, personally, don't think this case is cause for concern, as it is analogous to the task running a bit later. As long as the image is still present, creating the container should be good.	2018-11-27 16:12:15 -05:00
Mahmood Ali	f6d6a50c39	add support for tmpfs	2018-11-27 07:20:17 -05:00
Mahmood Ali	0a09f5521d	Support docker bind mounts	2018-11-27 07:20:17 -05:00
Chris Baker	3dd6ba514a	drivers/rkt: updated test to include new AllocID field in TaskConfig	2018-11-26 21:37:58 +00:00
Chris Baker	9bd4317139	modified TaskConfig to include AllocID use this for volume names in drivers/rkt to address #1150	2018-11-26 18:54:26 +00:00
Mahmood Ali	141092e46d	Formatting and typo fixes	2018-11-25 11:53:21 -05:00
Mahmood Ali	c61d99b525	Merge pull request #4908 from hashicorp/f-docker-opts-storageopt Add support for docker storage options	2018-11-20 21:08:27 -05:00
Nick Ethier	1f3fe02e62	docker: sync access to exit result within a handle	2018-11-20 20:41:32 -05:00
Michael Schurter	2275153875	Apply suggestions from code review Co-Authored-By: nickethier <ncethier@gmail.com>	2018-11-20 20:33:31 -05:00
Mahmood Ali	e9e415f186	Add support for storage opt	2018-11-20 16:11:02 -05:00
Nick Ethier	3ccd359735	docker: unexport new coordinator func	2018-11-19 23:07:07 -05:00
Nick Ethier	8b9b2b476e	docker: add default blocks for driver plugin config schema	2018-11-19 22:59:18 -05:00
Nick Ethier	2667f48a5d	docker: move config RPCs to config.go	2018-11-19 22:59:18 -05:00
Nick Ethier	aa9f45ae47	docker: fix tests	2018-11-19 22:59:18 -05:00
Nick Ethier	0f03e8f520	docker: remove container pointer from task handle	2018-11-19 22:59:18 -05:00
Nick Ethier	ce4b867d21	docker: move volume driver options to seperate block	2018-11-19 22:59:18 -05:00
Nick Ethier	fca2df3c79	docker: group common config into blocks	2018-11-19 22:59:17 -05:00
Michael Schurter	813341dd59	Apply suggestions from code review Co-Authored-By: nickethier <ncethier@gmail.com>	2018-11-19 22:59:17 -05:00
Nick Ethier	b7bd36db30	docker: remove global pull coordinator	2018-11-19 22:59:17 -05:00
Nick Ethier	f0a86859a0	docker: remove call to global metrics instance	2018-11-19 22:59:17 -05:00
Nick Ethier	8ef73e63ce	docker: moved fingerprint code to it's own file	2018-11-19 22:59:17 -05:00
Nick Ethier	4be8a86ef9	plugins/driver: remove NodeResources from task Resources and use PercentTicks field for docker driver	2018-11-19 22:59:17 -05:00
Nick Ethier	ced5d5c445	docker: move recoverable error proto to shared structs	2018-11-19 22:59:16 -05:00
Nick Ethier	585e468085	docker: implement recover task logic	2018-11-19 22:59:16 -05:00
Nick Ethier	ee51cb6a93	docker: finished porting tests	2018-11-19 22:59:16 -05:00
Nick Ethier	3d7cdea19e	drivers/docker: more work porting tests from old driver plugin	2018-11-19 22:59:16 -05:00
Nick Ethier	8f8698b3e1	docker: started work on porting docker driver to new plugin framework	2018-11-19 22:59:15 -05:00
Nick Ethier	9e64ce7d73	docker: properly launch docker logger process	2018-11-19 22:59:12 -05:00
Michael Schurter	f8cdd561f0	client: interpolate driver configurations Also add missing SetDriverNetwork calls.	2018-11-15 16:25:57 -08:00
Mahmood Ali	1f4db08f42	Regenerate proto files with protoc-gen-go@v1.2.0	2018-11-14 09:01:26 -05:00
Mahmood Ali	1e92161f14	Merge pull request #4858 from hashicorp/b-fix-master-20181109 Fix some tests in master	2018-11-13 16:08:26 -05:00
Alex Dadgar	17e8446484	Merge pull request #4868 from hashicorp/b-plugin-ctx Plugin client's handle plugin dying	2018-11-13 10:26:53 -08:00
Mahmood Ali	356c194acc	Use materialized duration fields for driver config	2018-11-13 10:21:40 -05:00
Mahmood Ali	470d20cdf3	Avoid downloading image if present locally	2018-11-13 10:21:40 -05:00
Mahmood Ali	7690f389a0	Prioritize checking consumer context cancellation Tests expect that as soon as eventer shuts down immediately on context cancellations; but golang does not guarantee priority when multiple pending channels are ready in a select statement.	2018-11-13 10:21:40 -05:00
Mahmood Ali	e9067e52b4	pull alpine image needed for test The test requires the image to be present locally, so importing it as part of setup.	2018-11-13 10:21:40 -05:00
Mahmood Ali	8923ea4663	Handle time.Duration in mock Mock driver config uses `time.Duration` fields but we initialize them inconsistently, as time.Duration sometimes and as duration strings other times. Previously, `mapstructure` handles it and does the right thing. This is no longer the case with MsgPack. I could not find a good way to bring back old behavior without too much complexity. `MsgPack` extended types weren't ideal here as we lose type information (e.g. int64 vs string), and the input is a generic map and not a MsgPack serialization of duration. As such, I went with the simple solution of declaring the config field as duration string, and panicing if the test doesn't pass a valid string. I found this to cause the smallest change in tests, but we can alternatively force all to be int64 instead.	2018-11-13 10:21:40 -05:00
Mahmood Ali	fb56dd699d	distinguish java driver tests from others	2018-11-13 10:21:40 -05:00
Mahmood Ali	343df28165	Fix java driver tests	2018-11-13 10:21:40 -05:00
Mahmood Ali	e506ebbc24	Merge pull request #4845 from hashicorp/r-exec-refactor Update exec driver to match rawexec	2018-11-12 20:59:32 -05:00
Alex Dadgar	693f244cce	Plugin client's handle plugin dying This PR plumbs the plugins done ctx through the base and driver plugin clients (device already had it). Further, it adds generic handling of gRPC stream errors.	2018-11-12 17:09:27 -08:00
Danielle Tomlinson	880e5015f2	rawexec: Only use cgroups when running as root. If Nomad is not running as root, we should not try to use cgroups for pid freezing. This originally was implemented pre-driver-support in https://github.com/hashicorp/nomad/blob/v0.8.6/client/driver/raw_exec.go#L120-L130	2018-11-10 06:45:11 -08:00
Mahmood Ali	9c76a7d9b8	handle nil resources limits Config may not express resource constraints on the job!	2018-11-09 15:52:44 -05:00
Mahmood Ali	1ed53ca478	shorter health descriptions	2018-11-08 10:00:49 -08:00
Mahmood Ali	032f86bc78	Add a helper functions for checking unix root	2018-11-08 10:00:49 -08:00
Mahmood Ali	21d126cd68	Update exec driver to match rawexec Update exec driver with latest patterns used by rawexec, and eliminate accidental diffs.	2018-11-08 10:00:49 -08:00
Alex Dadgar	a7ca737fb6	review comments	2018-11-07 11:31:52 -08:00
Alex Dadgar	204ca8230c	Device manager Introduce a device manager that manages the lifecycle of device plugins on the client. It fingerprints, collects stats, and forwards Reserve requests to the correct plugin. The manager, also handles device plugins failing and validates their output.	2018-11-07 10:43:15 -08:00
Mahmood Ali	32283b2a66	reconcile java with latest patterns	2018-11-06 12:49:29 -08:00
Mahmood Ali	467360954a	driver/java: support non-linux targets Java driver supports non-linux targets, so LinuxResources fields isn't guaranteed to be not-nil.	2018-11-06 12:41:39 -08:00
Mahmood Ali	e64ef839b8	Reconcile with latest master patterns	2018-11-06 12:41:39 -08:00
Mahmood Ali	c3881bd8d1	add java driver tests	2018-11-06 12:41:39 -08:00
Mahmood Ali	4f525d0a17	Implement java driver	2018-11-06 12:41:39 -08:00
Michael Schurter	392d548b85	Merge pull request #4828 from hashicorp/b-restore Implement client agent restarting	2018-11-05 18:50:15 -06:00
Michael Schurter	2bbd88888c	client: first pass at implementing task restoring Task restoring works but dead tasks may be restarted	2018-11-05 12:32:05 -08:00
Mahmood Ali	a17521475d	Merge pull request #4826 from hashicorp/b-driver-exec-tweaks-20181031 Register exec driver plugin among some fixes	2018-11-02 10:11:05 -04:00
Mahmood Ali	dc0fc442ab	add plugin config	2018-11-02 08:51:23 -04:00
Michael Schurter	83db3e6a50	rkt: standardize names/code	2018-11-01 15:53:07 -07:00
Michael Schurter	1a7fb51285	rawexec: standardize names/code	2018-11-01 15:53:07 -07:00
Michael Schurter	5a5d52023d	qemu: standardize names/code	2018-11-01 15:53:07 -07:00
Michael Schurter	3c5e606ef6	drivers/mock: standardize names/code	2018-10-31 11:52:51 -07:00
Mahmood Ali	9a45837c4e	Register exec driver plugin among some fixes Namely, remove the `enabled` configuration flag, as it's specific to `raw_exec` driver. Also, pass resource limits to underlying call.	2018-10-31 13:55:59 -04:00
Michael Schurter	d3dafc9761	mock_driver: match other's fingerprint message	2018-10-30 17:38:23 -07:00
Michael Schurter	2a38f575c0	rkt: fix race, register by default, fix fingerprint	2018-10-30 17:37:00 -07:00
Michael Schurter	5b0b75e1ee	qemu: fix fingerprint description and handle race	2018-10-30 17:34:08 -07:00
Michael Schurter	6f4a28a7c2	rawexec: fix fingerprint results and tests In tests: * always cleanup (Kill) the harness * don't call SetConfig more than once (a race) * avoid cgroups to allow running without being root	2018-10-30 17:16:22 -07:00
Michael Schurter	616801417c	rawexec: fix race around h.procState	2018-10-30 17:15:47 -07:00
Michael Schurter	9fa53565ad	drivers: remove stutter from exported driver names Also fix a comment on the logger that got copy/pasted around.	2018-10-30 14:05:31 -07:00
Michael Schurter	fc320906aa	qemu: fix build and register by default API changes had broken the build.	2018-10-29 22:10:37 -07:00
Nick Ethier	3fcf8ba7e6	Merge pull request #4795 from hashicorp/f-plugin-config Pass client configuration to plugins through loader	2018-10-29 18:42:27 -07:00
Nick Ethier	bda3b1d3b3	rename NomadConfig to ClientAgentConfig	2018-10-29 21:34:34 -04:00
Preetha Appan	68991d8acb	Merge branch 'f-driver-plugin-qemu' of github.com:hashicorp/nomad into f-driver-plugin-qemu	2018-10-25 13:51:58 -05:00
Preetha Appan	bfff771416	remove driver config struct from qemu driver since its not used	2018-10-25 13:51:06 -05:00
Michael Schurter	48d25dbef1	Fix hclog log line formatting Co-Authored-By: preetapan <preetha@hashicorp.com>	2018-10-25 13:47:17 -05:00
Preetha Appan	95a1637741	Fix imports	2018-10-25 13:45:44 -05:00
Michael Schurter	ca151a7828	Fix copy pasted comment Co-Authored-By: preetapan <preetha@hashicorp.com>	2018-10-25 13:43:29 -05:00
Michael Schurter	c85b90f1e4	remove newline Co-Authored-By: preetapan <preetha@hashicorp.com>	2018-10-25 13:42:20 -05:00
Michael Schurter	66c15e3a67	update comment Co-Authored-By: preetapan <preetha@hashicorp.com>	2018-10-25 13:41:24 -05:00
Michael Schurter	9ca6c998c7	Remove unnecessary return Co-Authored-By: preetapan <preetha@hashicorp.com>	2018-10-25 13:40:28 -05:00
Michael Schurter	fc7ba5e79d	remove unnecessary return Co-Authored-By: preetapan <preetha@hashicorp.com>	2018-10-25 13:40:15 -05:00
Nick Ethier	9916b9e0ce	drivers: update SetConfig to handle driver client config	2018-10-18 23:32:17 -04:00
Michael Schurter	af3e9adf2f	Update drivers/qemu/driver.go	2018-10-17 17:42:01 -05:00
Preetha Appan	93eb048b87	QEMU driver plugin	2018-10-17 11:43:15 -05:00
Alex Dadgar	4b694b2dec	Merge pull request #4793 from hashicorp/b-no-cty Do not use cty in drivers	2018-10-16 20:04:41 -07:00
Nick Ethier	823a6c8d6c	update rkt and exec setconfig funcs	2018-10-16 23:00:26 -04:00
Nick Ethier	65adb80ebf	plumb NomadConfig into plugins	2018-10-16 22:47:22 -04:00
Nick Ethier	b657c3909a	drivers/exec: better fingerprinting and detection on non linux	2018-10-16 22:45:33 -04:00
Nick Ethier	d94b631b6b	drivers/exec: add exec implementation	2018-10-16 22:45:28 -04:00
Alex Dadgar	88c1885044	Do not use cty in drivers	2018-10-16 17:17:07 -07:00
Michael Schurter	838ddf4d4a	fix linter errors	2018-10-16 16:56:57 -07:00
Nick Ethier	448a709603	driver/docker: rename container logger	2018-10-16 16:56:56 -07:00
Michael Schurter	64605a3957	drivers/mock: fix plugin name Was mock_driver before plugins, so keep the name.	2018-10-16 16:56:56 -07:00
Preetha Appan	d634157158	Address review comments around logging task properly	2018-10-16 16:56:56 -07:00
Preetha Appan	3280727c4e	Address review comments	2018-10-16 16:56:56 -07:00
Preetha Appan	02b1f26d34	Switch back to using map[string]string for port map	2018-10-16 16:56:56 -07:00
Michael Schurter	ed5641055f	fixup comments, logging, and missing method impls from #4777 comments	2018-10-16 16:56:56 -07:00
Michael Schurter	1a29337e48	register drivers by default Do not register mock_driver on release builds.	2018-10-16 16:56:56 -07:00
Michael Schurter	118ce7c782	drivers/mock: complete plugin impl	2018-10-16 16:56:56 -07:00
Nick Ethier	9f564781ce	drivers/mock: start mock driver implementation	2018-10-16 16:56:56 -07:00
Preetha Appan	9b3e82a4e7	make port map a slice of maps to match existing rkt driver	2018-10-16 16:56:56 -07:00
Preetha Appan	4f4777d6a6	Review comments	2018-10-16 16:56:56 -07:00
Preetha Appan	edad5e9e6d	Stats collection test	2018-10-16 16:56:56 -07:00
Preetha Appan	678072ecd1	RKT driver plugin and unit tests	2018-10-16 16:56:56 -07:00
Nick Ethier	c7ac1186c9	client: add test for driverfailure during fingerprinting	2018-10-16 16:56:56 -07:00

663 Commits