open-nomad

Commit Graph

Author	SHA1	Message	Date
Michael Schurter	7d49bc4c71	executor/linux: make chroot binary paths absolute Avoid libcontainer.Process trying to lookup the binary via $PATH as the executor has already found where the binary is located.	2019-04-01 15:45:31 -07:00
Mahmood Ali	cb16ad7e3f	comment configureTLogging	2019-04-01 16:52:58 -04:00
Mahmood Ali	81f4f07ed7	rename fifo methods for clarity	2019-04-01 16:52:58 -04:00
Mahmood Ali	88dc4a255a	avoid opening files just to close them	2019-04-01 13:24:18 -04:00
Mahmood Ali	dac2cd3df3	Add test cases for waiting on children Also, make the test use files just like in the non-test case.	2019-04-01 13:24:18 -04:00
Michael Schurter	b8d1dd95a0	Update drivers/shared/executor/executor_test.go Co-Authored-By: notnoop <mahmood@notnoop.com>	2019-03-31 20:34:24 -04:00
Mahmood Ali	df5d7ba50d	fix test setup	2019-03-26 09:15:22 -04:00
Mahmood Ali	d737a9836a	test kill wait	2019-03-26 09:15:22 -04:00
Mahmood Ali	2a7b18aec4	Revert "executor: synchronize exitState accesses" (#5449) Reverts hashicorp/nomad#5433 Apparently, channel communications can constitute Happens-Before even for proximate variables, so this syncing isn't necessary. > _The closing of a channel happens before a receive that returns a zero value because the channel is closed._ https://golang.org/ref/mem#tmp_7	2019-03-20 07:33:05 -04:00
Nick Ethier	505e36ff7a	Merge pull request #5429 from hashicorp/b-blocking-executor-shutdown executor: block shutdown on process exiting	2019-03-19 15:18:01 -04:00
Mahmood Ali	a1776dba34	executor: synchronize exitState accesses exitState is set in `wait()` goroutine but accessed in a different `Wait()` goroutine, so accesses must be synchronized by a lock.	2019-03-17 11:56:58 -04:00
Nick Ethier	7418d09cf0	executor: block shutdown on process exiting	2019-03-15 23:50:17 -04:00
Mahmood Ali	fb55717b0c	Regenerate Proto files (#5421 ) Noticed that the protobuf files are out of sync with ones generated by 1.2.0 protoc go plugin. The cause for these files seem to be related to release processes, e.g. [0.9.0-beta1 preperation](`ecec3d38de (diff-da4da188ee496377d456025c2eab4e87)`), and [0.9.0-beta3 preperation](`b849d84f2f`). This restores the changes to that of the pinned protoc version and fails build if protobuf files are out of sync. Sample failing Travis job is that of the first commit change: https://travis-ci.org/hashicorp/nomad/jobs/506285085	2019-03-14 10:56:27 -04:00
Iskander (Alex) Sharipov	e69909fbd3	drivers/shared/executor: fix strings.Replace call strings.Replace call with n=0 argument makes no sense as it will do nothing. Probably -1 is intended. Signed-off-by: Iskander Sharipov <quasilyte@gmail.com>	2019-03-02 00:33:17 +03:00
Michael Schurter	38821954b7	plugins: squelch context Canceled error logs As far as I can tell this is the most straightforward and resilient way to skip error logging on context cancellation with grpc streams. You cannot compare the error against context.Canceled directly as it is of type `*status.statusError`. The next best solution I found was: ```go resp, err := stream.Recv() if code, ok := err.(interface{ Code() code.Code }); ok { if code.Code == code.Canceled { return } } ``` However I think checking ctx.Err() directly makes the code much easier to read and is resilient against grpc API changes.	2019-02-21 15:32:18 -08:00
Mahmood Ali	a394cd63f4	CVE-2019-5736: Update libcontainer depedencies (#5334) * CVE-2019-5736: Update libcontainer depedencies Libcontainer is vulnerable to a runc container breakout, that was reported as CVE-2019-5736[1]. Upgrading vendored libcontainer with the fix. The runc changes are captured in `369b920277`. [1] https://seclists.org/oss-sec/2019/q1/119	2019-02-19 20:21:18 -05:00
Alex Dadgar	bc804dda2e	Nomad 0.9.0-beta1 generated code	2019-01-30 10:49:44 -08:00
Nick Ethier	bb9a8afe9b	executor: fix bug and add tests for incorrect stats timestamp reporting	2019-01-28 21:57:45 -05:00
Nick Ethier	c7cc81924d	drivers/docker: handle shutdown of upgraded tasks correctly	2019-01-24 14:21:59 -05:00
Alex Dadgar	4bdccab550	goimports	2019-01-22 15:44:31 -08:00
Nick Ethier	e3c6f89b9a	drivers: use consts for task handle version	2019-01-18 18:31:01 -05:00
Nick Ethier	05bd369d1f	driver: add pre09 migration logic	2019-01-18 18:31:01 -05:00
Nick Ethier	e5a6fc9271	executor: add pre 0.9 client and wrapper	2019-01-18 18:30:58 -05:00
Mahmood Ali	5df63fda7c	Merge pull request #5190 from hashicorp/f-memory-usage Track Basic Memory Usage as reported by cgroups	2019-01-18 16:46:02 -05:00
Danielle Tomlinson	b918d25e62	Merge pull request #5192 from hashicorp/dani/executor-close executor: Always close stdout/stderr fifos	2019-01-15 17:49:04 +01:00
Danielle Tomlinson	7f1ff3fab6	executor: Always close stdout/stderr fifos	2019-01-15 16:47:27 +01:00
Mahmood Ali	5649f72d27	propogate logs to executor plugin	2019-01-15 08:25:03 -05:00
Alex Dadgar	471fdb3ccf	Merge pull request #5173 from hashicorp/b-log-levels Plugins use parent loggers	2019-01-14 16:14:30 -08:00
Mahmood Ali	9909d98bee	Track Basic Memory Usage as reported by cgroups Track current memory usage, `memory.usage_in_bytes`, in addition to `memory.max_memory_usage_in_bytes` and friends. This number is closer what Docker reports. Related to https://github.com/hashicorp/nomad/issues/5165.	2019-01-14 18:47:52 -05:00
Nick Ethier	3b395d7100	drivers: plumb grpc client logger	2019-01-12 12:18:23 -05:00
Nick Ethier	7e306afde3	executor: fix failing stats related test	2019-01-12 12:18:23 -05:00
Nick Ethier	9fea54e0dc	executor: implement streaming stats API plugins/driver: update driver interface to support streaming stats client/tr: use streaming stats api TODO: * how to handle errors and closed channel during stats streaming * prevent tight loop if Stats(ctx) returns an error drivers: update drivers TaskStats RPC to handle streaming results executor: better error handling in stats rpc docker: better control and error handling of stats rpc driver: allow stats to return a recoverable error	2019-01-12 12:18:22 -05:00
Alex Dadgar	14ed757a56	Plugins use parent loggers This PR fixes various instances of plugins being launched without using the parent loggers. This meant that logs would not all go to the same output, break formatting etc.	2019-01-11 11:36:37 -08:00
Mahmood Ali	90f3cea187	Merge pull request #5157 from hashicorp/r-drivers-no-cstructs drivers: avoid referencing client/structs package	2019-01-09 13:06:46 -05:00
Mahmood Ali	4952f2a182	Merge pull request #5159 from hashicorp/r-macos-tests Fix Travis MacOS job	2019-01-09 08:22:30 -05:00
Mahmood Ali	8f20bc8ce2	Merge pull request #5154 from hashicorp/f-revert-exec-devs drivers/exec: restrict devices exposed to tasks	2019-01-08 12:43:06 -05:00
Mahmood Ali	d19b92edec	executor: add a comment detailing isolation	2019-01-08 12:10:26 -05:00
Mahmood Ali	426c981c34	Remove some dead code	2019-01-08 09:11:48 -05:00
Mahmood Ali	64f80343fc	drivers: re-export ResourceUsage structs Re-export the ResourceUsage structs in drivers package to avoid drivers directly depending on the internal client/structs package directly. I attempted moving the structs to drivers, but that caused some import cycles that was a bit hard to disentagle. Alternatively, I added an alias here that's sufficient for our purposes of avoiding external drivers depend on internal packages, while allowing us to restructure packages in future without breaking source compatibility.	2019-01-08 09:11:47 -05:00
Mahmood Ali	9369b123de	use drivers.FSIsolation	2019-01-08 09:11:47 -05:00
Alex Dadgar	8a35d7b1dd	Test recovery	2019-01-07 14:49:41 -08:00
Mahmood Ali	58fb6812db	tests: busybox only depends on arch Busybox is compiled for linux only. Making the file used in executor tests even for non-linux targets, as having the file present has no side-effects.	2019-01-07 08:36:32 -05:00
Mahmood Ali	8797a4f0ea	drivers/exec: restrict devices exposed to tasks We ultimately decided to provide a limited set of devices in exec/java drivers instead of all of host ones. Pre-0.9, we made all host devices available to exec tasks accidentally, yet most applications only use a small subset, and this choice limits our ability to restrict/isolate GPU and other devices. Starting with 0.9, by default, we only provide the same subset of devices Docker provides, and allow users to provide more devices as needed on case-by-case basis. This reverts commit 5805c64a9f1c3b409693493dfa30e7136b9f547b. This reverts commit ff9a4a17e59388dcab067949e0664f645b2f5bcf.	2019-01-06 17:03:19 -05:00
Mahmood Ali	56e3171310	driver/exec: use dedicated /dev mount (#5147) Use a dedicated /dev mount so we can inject more devices if necessary, and avoid allowing a container to contaminate host /dev. Follow up to https://github.com/hashicorp/nomad/pull/5143 - and fixes master.	2019-01-04 10:36:05 -05:00
Mahmood Ali	5b0702c9eb	drivers/exec: bind mount /dev into rootfs Restores pre-0.9 behavior, where Nomad makes /dev available to exec task. Switching to libcontainer, we accidentally made only a small subset available. Here, we err on the side of preserving behavior of 0.8, instead of going for the sensible route, where only a reasonable subset of devices is mounted by default and user can opt to request more.	2019-01-03 14:29:18 -05:00
Alex Dadgar	b8268d9a46	Lint	2018-12-18 15:50:44 -08:00
Alex Dadgar	327b551b39	Drivers	2018-12-18 15:50:11 -08:00
Alex Dadgar	b9ee03b2c1	protos	2018-12-18 15:48:52 -08:00
Nick Ethier	0c50a51c19	executor: encode mounts and devices correctly when using grpc	2018-12-15 00:08:23 -05:00
Nick Ethier	49e03542cc	executor: use int when encoding signal in RPC	2018-12-14 22:20:01 -05:00
Nick Ethier	09dadf0a23	Merge branch 'master' into f-grpc-executor * master: (71 commits) Fix output of 'nomad deployment fail' with no arg Always create a running allocation when testing task state tests: ensure exec tests pass valid task resources (#4992) some changes for more idiomatic code fix iops related tests fixed bug in loop delay gofmt improved code for readability client: updateAlloc release lock after read fixup! device attributes in `nomad node status -verbose` drivers/exec: support device binds and mounts fix iops bug and increase test matrix coverage tests: tag image explicitly changelog ci: install lxc-templates explicitly tests: skip checking rdma cgroup ci: use Ubuntu 16.04 (Xenial) in TravisCI client: update driver info on new fingerprint drivers/docker: enforce volumes.enabled (#4983) client: Style: use fluent style for building loggers ...	2018-12-13 14:41:09 -05:00
Mahmood Ali	74bd0be6ea	drivers/exec: support device binds and mounts	2018-12-11 18:35:21 -05:00
Mahmood Ali	69b2355274	Merge pull request #4975 from hashicorp/fix-master-20181209 Some test fixes and remedies	2018-12-11 18:00:21 -05:00
Alex Dadgar	1531b6d534	Merge pull request #4970 from hashicorp/f-no-iops Deprecate IOPS	2018-12-11 12:51:22 -08:00
Mahmood Ali	f6f39f1314	add a note about busybox license	2018-12-11 09:35:26 -05:00
Mahmood Ali	5a487ac884	tests: prevent indefinite blocking in some tests Noticed few places where tests seem to block indefinitely and panic after the test run reaches the test package timeout. I intend to follow up with the proper fix later, but timing out is much better than indefinitely blocking.	2018-12-11 09:35:26 -05:00
Mahmood Ali	8453ce7d56	tests: setup libcontainer rootfs Using statically linked busybox binary to setup a basic rootfs for testing, by symlinking it to provide the basic commands used in tests. I considered using a proper rootfs tarball, but the overhead of managing tarfile and expanding it seems significant enough that I went with this implementation.	2018-12-11 09:35:26 -05:00
Mahmood Ali	021d3720b5	Merge pull request #4950 from hashicorp/b-exc-libcontainer-kill executor: kill all container processes	2018-12-08 09:52:42 -05:00
Nick Ethier	35268fdb54	executor: misspell	2018-12-08 01:52:06 -05:00
Nick Ethier	86e9c11ec2	executor: don't drop errors when configuring libcontainer cfg, add nil check on resources	2018-12-07 14:03:42 -05:00
Nick Ethier	47df1dde10	Merge branch 'master' into f-grpc-executor	2018-12-06 21:42:38 -05:00
Nick Ethier	19a695308f	executor: fix tests	2018-12-06 21:39:53 -05:00
Nick Ethier	913efed9f5	executor: fix broken non-linux build	2018-12-06 21:33:20 -05:00
Nick Ethier	2283cb2c39	executor: use drivers.Resources as resource model	2018-12-06 21:22:02 -05:00
Nick Ethier	29ef54c0ee	executor: merge plugin shim with executor package	2018-12-06 21:13:45 -05:00
Nick Ethier	71353a88d4	executor: remove structs package	2018-12-06 20:54:14 -05:00
Alex Dadgar	1e3c3cb287	Deprecate IOPS IOPS have been modelled as a resource since Nomad 0.1 but has never actually been detected and there is no plan in the short term to add detection. This is because IOPS is a bit simplistic of a unit to define the performance requirements from the underlying storage system. In its current state it adds unnecessary confusion and can be removed without impacting any users. This PR leaves IOPS defined at the jobspec parsing level and in the api/ resources since these are the two public uses of the field. These should be considered deprecated and only exist to allow users to stop using them during the Nomad 0.9.x release. In the future, there should be no expectation that the field will exist.	2018-12-06 15:09:26 -08:00
Nick Ethier	57ffece7f8	executor: update test references	2018-12-05 11:07:48 -05:00
Nick Ethier	8b20de4801	executor: use grpc instead of netrpc as plugin protocol * Added protobuf spec for executor * Seperated executor structs into their own package	2018-12-05 11:03:56 -05:00
Mahmood Ali	428d35a5a9	executor: Keep 0.8.6 exit code for wait() failures 0.8.6 uses exit code 1 when `proc.Wait()` fails: https://github.com/hashicorp/nomad/blob/v0.8.6/client/driver/executor/executor.go#L442	2018-12-04 19:38:25 -05:00
Mahmood Ali	f8ceeebf11	no t.Parallel() in excutor table driven tests (#4948) When `t.Parallel()` is used inside a `t.Run()` sub-set, the closure doesn't behave as expected, and some cases effectively get skipped. More details can be found in https://gist.github.com/posener/92a55c4cd441fc5e5e85f27bca008721	2018-12-04 09:04:04 -05:00
Mahmood Ali	2516cb16b9	Kill all container processes on shutdown Currently, libcontainer-based executor, upon shutdown, kills the container initial process. The children of the killed process remain running, and the executor is never marked as terminated until they do. Also, fix a case where we treat processes as successful, when `proc.Wait()` fails. In some attempts, I was getting "waitid no child processes" errors and such error shouldn't get process to be considered successful.	2018-12-03 20:40:49 -05:00
Mahmood Ali	88622b97bd	libcontainer to manage /dev and /proc (#4945) libcontainer already manages `/dev`, overriding task_dir - so let's use it for `/proc` as well and remove deadcode.	2018-12-03 10:41:01 -05:00
Danielle Tomlinson	2db5ae38d8	client: Rename drivers/shared/env => client/taskenv	2018-11-30 12:18:39 +01:00
Danielle Tomlinson	f3a77b8084	client: Merge driver/shared/structs and client/structs	2018-11-30 10:56:45 +01:00
Danielle Tomlinson	04c8851b4c	client: Migrate DriverStats optout to drivers/shared/structs	2018-11-30 10:46:13 +01:00
Danielle Tomlinson	d582ea1d8b	drivers: Create drivers/shared/structs This creates a drivers/shared/structs package and moves the buffer size checks into it.	2018-11-30 10:46:13 +01:00
Danielle Tomlinson	0544a57abe	drivers: Move client/drivers/executor to drivers/shared/executor	2018-11-30 10:46:13 +01:00
Danielle Tomlinson	1a29811169	drivers: Move client/drivers/env to drivers/shared/env As part of deprecating legacy drivers, we're moving the env package to a new drivers/shared tree, as it is used by the modern docker and rkt driver packages, and is useful for 3rd party plugins.	2018-11-30 10:46:13 +01:00
Nick Ethier	8b9b2b476e	docker: add default blocks for driver plugin config schema	2018-11-19 22:59:18 -05:00
Mahmood Ali	1e92161f14	Merge pull request #4858 from hashicorp/b-fix-master-20181109 Fix some tests in master	2018-11-13 16:08:26 -05:00
Mahmood Ali	7690f389a0	Prioritize checking consumer context cancellation Tests expect that as soon as eventer shuts down immediately on context cancellations; but golang does not guarantee priority when multiple pending channels are ready in a select statement.	2018-11-13 10:21:40 -05:00
Alex Dadgar	693f244cce	Plugin client's handle plugin dying This PR plumbs the plugins done ctx through the base and driver plugin clients (device already had it). Further, it adds generic handling of gRPC stream errors.	2018-11-12 17:09:27 -08:00
Nick Ethier	5fb95c1a00	drivers/shared: added func comment to eventer	2018-10-16 16:56:55 -07:00
Nick Ethier	951a084198	drivers/shared: move eventer to subpackage under drivers shared package	2018-10-16 16:56:55 -07:00
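
Note on commit 2a7b18aec4: the revert relies on the Go memory model rule quoted in its message, namely that closing a channel happens before a receive that returns because the channel is closed. The sketch below is only an illustration of that rule, not the executor's actual code; the `proc` type, its fields, and `start` are hypothetical names invented for this example.

```go
package main

import "fmt"

// proc is a hypothetical stand-in for an executor's process handle.
type proc struct {
	exitCode int           // written by the waiter goroutine before close(done)
	done     chan struct{} // closed once exitCode is final
}

// start launches a goroutine that records the exit code and then closes done.
func start(code int) *proc {
	p := &proc{done: make(chan struct{})}
	go func() {
		p.exitCode = code // this write is sequenced before the close below
		close(p.done)
	}()
	return p
}

// Wait blocks until done is closed, then reads exitCode without a lock.
// The close of done happens before the receive returns, so the write to
// exitCode is visible here.
func (p *proc) Wait() int {
	<-p.done
	return p.exitCode
}

func main() {
	fmt.Println(start(3).Wait()) // prints 3
}
```

The lock-free read is only safe because every reader receives from `done` first; a reader that skipped the receive would still need separate synchronization.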


135 Commits