open-nomad

Author	SHA1	Message	Date
Danielle Tomlinson	04c8851b4c	client: Migrate DriverStats optout to drivers/shared/structs	2018-11-30 10:46:13 +01:00
Danielle Tomlinson	dbd82e1af4	client: Remove test dependency on client/driver	2018-11-30 10:46:13 +01:00
Danielle Tomlinson	0544a57abe	drivers: Move client/drivers/executor to drivers/shared/executor	2018-11-30 10:46:13 +01:00
Danielle Tomlinson	1a29811169	drivers: Move client/drivers/env to drivers/shared/env As part of deprecating legacy drivers, we're moving the env package to a new drivers/shared tree, as it is used by the modern docker and rkt driver packages, and is useful for 3rd party plugins.	2018-11-30 10:46:13 +01:00
Nick Ethier	bbe420718a	Merge pull request #4922 from hashicorp/f-drivermananger add generic plugin manager interface and orchestration	2018-11-28 22:17:04 -05:00
Preetha	1f526db414	Merge pull request #4919 from hashicorp/f-fingerprint-attribute-type Modify fingerprint interface to use typed attribute struct	2018-11-28 14:18:28 -06:00
Michael Schurter	1bd9a9f9dd	Merge pull request #4894 from hashicorp/f-device-hook Device hook and devices affect computed node class	2018-11-28 12:10:43 -06:00
Preetha Appan	f89dbcd9cc	modify fingerprint interface to use typed attribute struct	2018-11-28 10:01:03 -06:00
Nick Ethier	60c6907ea5	client/plugin: remove println from plugin group func	2018-11-27 22:45:09 -05:00
Nick Ethier	600738e991	client/plugin: lint/spelling errors	2018-11-27 22:45:09 -05:00
Nick Ethier	45a6bf7acd	client/plugin: add generic plugin mananger interface and orchestration	2018-11-27 22:45:03 -05:00
Mahmood Ali	ad1f8d8c20	Fixes in old lxc driver	2018-11-27 21:40:43 -05:00
Michael Schurter	3e56ee005a	add nil check around task resources in device hook Looking at NewTaskRunner I'm unsure whether TaskRunner.TaskResources (from which req.TaskResources is set) is intended to be nil at times or if the TODO in NewTaskRunner is intended to ensure it is always non-nil.	2018-11-27 17:25:33 -08:00
Michael Schurter	b75e9fce37	assume that slices contain only non-nil items	2018-11-27 17:25:33 -08:00
Michael Schurter	85073f9d29	client: properly support hook env vars The old approach was incomplete. Hook env vars are now: * persisted and restored between agent restarts * deterministic (LWW if 2 hooks set the same key)	2018-11-27 17:25:33 -08:00
Alex Dadgar	4ee603c382	Device hook and devices affect computed node class This PR introduces a device hook that retrieves the device mount information for an allocation. It also updates the computed node class computation to take into account devices. TODO Fix the task runner unit test. The environment variable is being lost even though it is being properly set in the prestart hook.	2018-11-27 17:25:33 -08:00
Michael Schurter	27e07f657e	Merge pull request #4896 from hashicorp/b-prevalloc-deadlock Fix deadlock in previous alloc watcher by emitting last alloc update	2018-11-27 19:07:16 -06:00
Michael Schurter	b75f79a793	fix test breakage caused by rebase	2018-11-27 16:34:01 -08:00
Michael Schurter	91da566935	fix mispelings	2018-11-27 16:33:55 -08:00
Chris Baker	a1fb1f3830	Merge pull request #4891 from hashicorp/b-1150-rkt-volume-names drivers/rkt: fix invalid volumes	2018-11-27 18:55:00 -05:00
Danielle Tomlinson	3651dbdc25	Merge pull request #4909 from hashicorp/b-restart-delay taskrunner: Return the restart delay correctly	2018-11-27 23:55:54 +01:00
Michael Schurter	22149a661e	client: comment on importance of chan ops ordering	2018-11-27 14:11:32 -08:00
Mahmood Ali	05a958dc21	Update client/structs/broadcaster.go Co-Authored-By: schmichael <michael.schurter@gmail.com>	2018-11-27 14:06:08 -08:00
Michael Schurter	81b6a24a84	client: fix send-after-close in broadcaster	2018-11-27 14:06:08 -08:00
Michael Schurter	c429e6b0ab	client: check if prev alloc is already terminated This is a defensive fast-path as 7c6aa0be already fixed the deadlock.	2018-11-27 14:06:08 -08:00
Michael Schurter	944ea6d38b	client: emit last sent alloc to new listeners Fixes a deadlock where the allocwatcher would block forever waiting for an update from a terminal alloc. Made the broadcaster easier to debug as well.	2018-11-27 14:06:08 -08:00
Michael Schurter	1e4ef139dd	Merge pull request #4883 from hashicorp/f-graceful-shutdown Support graceful shutdowns in agent	2018-11-27 15:55:15 -06:00
Michael Schurter	4f7e6f9464	client: fix races in use of goroutine group The group utility struct does not support asynchronously launched goroutines (goroutines-inside-of-goroutines), so switch those uses to a normal go call. This means watchNodeUpdates and watchNodeEvents may not be shutdown when Shutdown() exits. During nomad agent shutdown this does not matter. During tests this means a test may leak those goroutines or be unable to know when those goroutines have exited. Since there's no runtime impact and these goroutines do not affect alloc state syncing it seems ok to risk leaking them.	2018-11-26 12:52:55 -08:00
Michael Schurter	9f43fb6d29	client: reuse group instead of diy'ing it	2018-11-26 12:52:31 -08:00
Michael Schurter	22771aa19e	client/ar: remove useless wait ch from runTasks Arguably this makes task.WaitCh() useless, but I think exposing a wait chan from TaskRunners is a generically useful API.	2018-11-26 12:51:18 -08:00
Michael Schurter	2fdd013956	client: document how AR/TR Run methods behave	2018-11-26 12:50:35 -08:00
Chris Baker	9bd4317139	modified TaskConfig to include AllocID use this for volume names in drivers/rkt to address #1150	2018-11-26 18:54:26 +00:00
Nick Ethier	95362eaa02	Merge pull request #4844 from hashicorp/f-docker-plugin Docker driver plugin	2018-11-20 20:43:03 -05:00
Mahmood Ali	e1994e59bd	address review comments	2018-11-20 17:10:54 -05:00
Mahmood Ali	171b73fde7	Emit metric counters for Vault token and renewal failures	2018-11-20 17:10:54 -05:00
Mahmood Ali	5b10da5de6	Set User-Agent header when hitting Vault API	2018-11-20 17:10:54 -05:00
Danielle Tomlinson	093f029d5b	taskrunner: Return the restart delay correctly We were incorrectly returning a 0 duration to the taskrunner when determining when a task should restart. This would cause tasks to be restarted immediately, ignoring the restart {} stanza in a users configuration. This commit causes us to return the restart duration to the task runner so it may correctly delay further execution.	2018-11-20 21:52:23 +01:00
Nick Ethier	3e42d6914e	task_runner: use NodeResources instead of deprecated struct	2018-11-20 13:46:39 -05:00
Nick Ethier	93c0200566	task_runner: use task and alloc copies instead of referencing the original pointer	2018-11-20 13:34:46 -05:00
Nick Ethier	29591a7c2e	task_runner: emit event on task exit with exit result details	2018-11-19 22:59:17 -05:00
Nick Ethier	4be8a86ef9	plugins/driver: remove NodeResources from task Resources and use PercentTicks field for docker driver	2018-11-19 22:59:17 -05:00
Nick Ethier	69049d37f5	drivers: added NodeResources to drivers.TaskConfig	2018-11-19 22:59:16 -05:00
Nick Ethier	8f8698b3e1	docker: started work on porting docker driver to new plugin framework	2018-11-19 22:59:15 -05:00
Michael Schurter	88577fe083	client.rpc: don't log errors on shutdown	2018-11-19 16:39:30 -08:00
Michael Schurter	5bd744ac3d	client: support graceful shutdowns Client.Shutdown now blocks until all AllocRunners and TaskRunners have exited their Run loops. Tasks are left running.	2018-11-19 16:39:30 -08:00
Mahmood Ali	9479015f51	Merge pull request #4884 from hashicorp/f-alloc-devices-cli Report alloc device statistics in API and CLI	2018-11-16 18:04:54 -05:00
Mahmood Ali	f139234372	address review comments	2018-11-16 17:13:01 -05:00
Mahmood Ali	f72e599ee7	Populate alloc stats API with device stats This change makes few compromises: * Looks up the devices associated with tasks at look up time. Given that `nomad alloc status` is called rarely generally (compared to stats telemetry and general job reporting), it seems fine. However, the lookup overhead grows bounded by number of `tasks x total-host-devices`, which can be significant. * `client.Client` performs the task devices->statistics lookup. It passes self to alloc/task runners so they can look up the device statistics allocated to them. * Currently alloc/task runners are responsible for constructing the entire RPC response for stats * The alternatives for making task runners device statistics aware don't seem appealing (e.g. having task runners contain reference to hostStats) * On the alloc aggregation resource usage, I did a naive merging of task device statistics. * Personally, I question the value of such aggregation, compared to costs of struct duplication and bloating the response - but opted to be consistent in the API. * With naive concatination, device instances from a single device group used by separate tasks in the alloc, would be aggregated in two separate device group statistics.	2018-11-16 10:26:32 -05:00
Michael Schurter	0cdb188ae4	tests: fix tests post-rebase	2018-11-15 17:40:56 -08:00
Michael Schurter	59f106ecee	client/tr: add a bit of context to envbuilder errors	2018-11-15 16:26:25 -08:00

3370 commits