open-nomad

Commit Graph

Author	SHA1	Message	Date
Nick Ethier	8f8698b3e1	docker: started work on porting docker driver to new plugin framework	2018-11-19 22:59:15 -05:00
Mahmood Ali	9479015f51	Merge pull request #4884 from hashicorp/f-alloc-devices-cli Report alloc device statistics in API and CLI	2018-11-16 18:04:54 -05:00
Mahmood Ali	f139234372	address review comments	2018-11-16 17:13:01 -05:00
Mahmood Ali	f72e599ee7	Populate alloc stats API with device stats This change makes few compromises: * Looks up the devices associated with tasks at look up time. Given that `nomad alloc status` is called rarely generally (compared to stats telemetry and general job reporting), it seems fine. However, the lookup overhead grows bounded by number of `tasks x total-host-devices`, which can be significant. * `client.Client` performs the task devices->statistics lookup. It passes self to alloc/task runners so they can look up the device statistics allocated to them. * Currently alloc/task runners are responsible for constructing the entire RPC response for stats * The alternatives for making task runners device statistics aware don't seem appealing (e.g. having task runners contain reference to hostStats) * On the alloc aggregation resource usage, I did a naive merging of task device statistics. * Personally, I question the value of such aggregation, compared to costs of struct duplication and bloating the response - but opted to be consistent in the API. * With naive concatenation, device instances from a single device group used by separate tasks in the alloc, would be aggregated in two separate device group statistics.	2018-11-16 10:26:32 -05:00
Michael Schurter	0cdb188ae4	tests: fix tests post-rebase	2018-11-15 17:40:56 -08:00
Michael Schurter	59f106ecee	client/tr: add a bit of context to envbuilder errors	2018-11-15 16:26:25 -08:00
Michael Schurter	742f8775ba	client: remove old proxy references from comments	2018-11-15 16:26:25 -08:00
Michael Schurter	2d0b44c3b4	client: test more env key variations	2018-11-15 16:26:25 -08:00
Michael Schurter	8bcd90d78d	client: add new nested variables to task's hcl ctx The error messages are really bad, but it's extremely difficult to produce good error messages without the original HCL.	2018-11-15 16:26:25 -08:00
Michael Schurter	5e51e2c2d5	client: turn env into nested objects for task configs	2018-11-15 16:25:57 -08:00
Michael Schurter	f8cdd561f0	client: interpolate driver configurations Also add missing SetDriverNetwork calls.	2018-11-15 16:25:57 -08:00
Mahmood Ali	046f098bac	Track Node Device attributes and serve them in API	2018-11-14 14:42:29 -05:00
Mahmood Ali	63acda956c	Add Client Device Stats structs in `api` package	2018-11-14 14:41:19 -05:00
Mahmood Ali	b74ccc742c	Expose Device Stats in /client/stats API endpoint	2018-11-14 14:41:19 -05:00
Mahmood Ali	c5de71a424	Allow nullable fields in StatValues In state values, we need to be able to distinguish between zero values (e.g. `false`) and unset values (e.g. `nil`). We can alternatively use protobuf `oneOf` and nested map to ensure consistency of fields that are set together, but the golang representation does not represent that well and introducing a mismatch between representations. Thus, I opted not to use it.	2018-11-14 14:41:19 -05:00
Mahmood Ali	713c9fe683	Move Stat{Object|Value} to plugins/shared/structs Moving them as they may be useful for other packages/plugins besides devices.	2018-11-14 09:01:26 -05:00
Mahmood Ali	1f4db08f42	Regenerate proto files with protoc-gen-go@v1.2.0	2018-11-14 09:01:26 -05:00
Danielle Tomlinson	0917e93537	Merge pull request #4869 from hashicorp/b-executor-stdout executor: Fix stdout stderr copy/paste	2018-11-13 19:22:37 -08:00
Mahmood Ali	865419e756	convert all config durations to strings in tests	2018-11-13 10:21:40 -05:00
Mahmood Ali	ac3b4571eb	Address review comments	2018-11-13 10:21:40 -05:00
Mahmood Ali	69f26783e4	avoid setting resource limit on rkt command Was accidentally modified in 5b14d24bf4626bab420d00783d92bcf25e0b641e .	2018-11-13 10:21:40 -05:00
Mahmood Ali	8fa26f5521	Fix docker log fetching in tests We no longer use syslog for tracking logs so tracking them explicitly here	2018-11-13 10:21:40 -05:00
Mahmood Ali	88fa968623	killing should be done with wait client Incidentally changed in 5b14d24bf4626bab420d00783d92bcf25e0b641e	2018-11-13 10:21:40 -05:00
Mahmood Ali	7690f389a0	Prioritize checking consumer context cancellation Tests expect that as soon as eventer shuts down immediately on context cancellations; but golang does not guarantee priority when multiple pending channels are ready in a select statement.	2018-11-13 10:21:40 -05:00
Mahmood Ali	c62ec124c0	Set clean config for mock driver The default job here contains some exec task config (for setting command and args) that aren't used for mock driver. Now, the alloc runner seems stricter about validating fields and errors on unexpected fields. Updating configs in tests so we can have an explicit task config whenever driver is set explicitly.	2018-11-13 10:21:40 -05:00
Mahmood Ali	e5e6f9a785	Update Docker name parsing lookup `ParseNamed` function changed in e9f3f2cfee9d729a8642344c4fa4ea70b2d49468 where became `ParsedNormalizedName` with extra checks.	2018-11-13 10:21:40 -05:00
Danielle Tomlinson	bfeded1f30	executor: Fix stdout stderr copy/paste	2018-11-12 22:08:04 -08:00
Alex Dadgar	c4f9e22aeb	fix race	2018-11-07 12:22:07 -08:00
Alex Dadgar	a7ca737fb6	review comments	2018-11-07 11:31:52 -08:00
Alex Dadgar	f0c7a8159b	tests	2018-11-07 10:43:15 -08:00
Alex Dadgar	204ca8230c	Device manager Introduce a device manager that manages the lifecycle of device plugins on the client. It fingerprints, collects stats, and forwards Reserve requests to the correct plugin. The manager, also handles device plugins failing and validates their output.	2018-11-07 10:43:15 -08:00
Michael Schurter	a4e6a92d18	client: update alloc status when terminating Defensively update alloc status whenever killing all tasks.	2018-11-05 15:11:10 -08:00
Michael Schurter	66bf3db455	client: block on context as well as waitCh For lifecycle operations such as Restart and Kill, the client should not expect driver plugins to be well behaved and close their waitCh on context cancelation. Always wait on the passed in context as well as the waitCh.	2018-11-05 12:32:05 -08:00
Michael Schurter	b994f51990	client: fix tr lifecycle logic and shutdown delay ShutdownDelay must be honored whenever the task is killed or restarted. Services were not being deregistered prior to restarting.	2018-11-05 12:32:05 -08:00
Michael Schurter	2d3479147a	client: fix ar and tr tests	2018-11-05 12:32:05 -08:00
Michael Schurter	d29d09023e	client: do not run terminal allocs	2018-11-05 12:32:05 -08:00
Michael Schurter	2bbd88888c	client: first pass at implementing task restoring Task restoring works but dead tasks may be restarted	2018-11-05 12:32:05 -08:00
Nick Ethier	b0ddc03409	Merge pull request #4765 from jippi/increase-line-scan-limit fix: increase log rotator line scan limit	2018-10-29 18:46:30 -07:00
Nick Ethier	3fcf8ba7e6	Merge pull request #4795 from hashicorp/f-plugin-config Pass client configuration to plugins through loader	2018-10-29 18:42:27 -07:00
Nick Ethier	bda3b1d3b3	rename NomadConfig to ClientAgentConfig	2018-10-29 21:34:34 -04:00
Michael Schurter	6f2cffb196	Merge pull request #4803 from hashicorp/b-leader-fixes AR Fixes: task leader handling, restoring, state updating, AR.Destroy deadlocks	2018-10-29 17:38:59 -05:00
Michael Schurter	d71a1b4547	tests: more fixes due to api changes	2018-10-29 15:25:22 -07:00
Preetha Appan	b85cc38f3d	Stat path to binary to handle raw exec driver interpolated binary path	2018-10-26 17:24:05 -05:00
Preetha Appan	55ac8d3d12	Fix test linting	2018-10-26 10:30:12 -05:00
Michael Schurter	b7a9d61a38	ar: initialize allocwatcher on restore Fixes a panic. Left a comment on how the behavior could be improved, but this is what releases <0.9.0 did.	2018-10-19 09:45:45 -07:00
Michael Schurter	e060174130	ar: fix leader handling, state restoring, and destroying unrun ARs * Migrated all of the old leader task tests and got them passing * Refactor and consolidate task killing code in AR to always kill leader tasks first * Fixed lots of issues with state restoring * Fixed deadlock in AR.Destroy if AR.Run had never been called * Added a new in memory statedb for testing	2018-10-19 09:45:45 -07:00
Nick Ethier	58b430edae	added driver specific client config struct to plugin configuration	2018-10-18 23:31:01 -04:00
Michael Schurter	cefbf00bf0	ar: refactor task killing into 1 method Update comments and address some PR comments from #4775	2018-10-17 10:06:59 -07:00
Michael Schurter	21d78be961	tests: explicitly cleanup after clients	2018-10-17 10:06:59 -07:00
Michael Schurter	222f6b5741	ar: fix task leader, update, and stop handling	2018-10-17 10:06:59 -07:00

3326 Commits