open-nomad

Commit Graph

Author	SHA1	Message	Date
Michael Schurter	22771aa19e	client/ar: remove useless wait ch from runTasks Arguably this makes task.WaitCh() useless, but I think exposing a wait chan from TaskRunners is a generically useful API.	2018-11-26 12:51:18 -08:00
Michael Schurter	2fdd013956	client: document how AR/TR Run methods behave	2018-11-26 12:50:35 -08:00
Chris Baker	9bd4317139	modified TaskConfig to include AllocID use this for volume names in drivers/rkt to address #1150	2018-11-26 18:54:26 +00:00
Danielle Tomlinson	093f029d5b	taskrunner: Return the restart delay correctly We were incorrectly returning a 0 duration to the taskrunner when determining when a task should restart. This would cause tasks to be restarted immediately, ignoring the restart {} stanza in a user's configuration. This commit causes us to return the restart duration to the task runner so it may correctly delay further execution.	2018-11-20 21:52:23 +01:00
Nick Ethier	3e42d6914e	task_runner: use NodeResources instead of deprecated struct	2018-11-20 13:46:39 -05:00
Nick Ethier	93c0200566	task_runner: use task and alloc copies instead of referencing the original pointer	2018-11-20 13:34:46 -05:00
Nick Ethier	29591a7c2e	task_runner: emit event on task exit with exit result details	2018-11-19 22:59:17 -05:00
Nick Ethier	4be8a86ef9	plugins/driver: remove NodeResources from task Resources and use PercentTicks field for docker driver	2018-11-19 22:59:17 -05:00
Nick Ethier	69049d37f5	drivers: added NodeResources to drivers.TaskConfig	2018-11-19 22:59:16 -05:00
Nick Ethier	8f8698b3e1	docker: started work on porting docker driver to new plugin framework	2018-11-19 22:59:15 -05:00
Michael Schurter	5bd744ac3d	client: support graceful shutdowns Client.Shutdown now blocks until all AllocRunners and TaskRunners have exited their Run loops. Tasks are left running.	2018-11-19 16:39:30 -08:00
Mahmood Ali	9479015f51	Merge pull request #4884 from hashicorp/f-alloc-devices-cli Report alloc device statistics in API and CLI	2018-11-16 18:04:54 -05:00
Mahmood Ali	f139234372	address review comments	2018-11-16 17:13:01 -05:00
Mahmood Ali	f72e599ee7	Populate alloc stats API with device stats This change makes a few compromises: * Looks up the devices associated with tasks at look up time. Given that `nomad alloc status` is generally called rarely (compared to stats telemetry and general job reporting), it seems fine. However, the lookup overhead grows bounded by number of `tasks x total-host-devices`, which can be significant. * `client.Client` performs the task devices->statistics lookup. It passes self to alloc/task runners so they can look up the device statistics allocated to them. * Currently alloc/task runners are responsible for constructing the entire RPC response for stats * The alternatives for making task runners device statistics aware don't seem appealing (e.g. having task runners contain reference to hostStats) * On the alloc aggregation resource usage, I did a naive merging of task device statistics. * Personally, I question the value of such aggregation, compared to costs of struct duplication and bloating the response - but opted to be consistent in the API. * With naive concatenation, device instances from a single device group used by separate tasks in the alloc would be aggregated in two separate device group statistics.	2018-11-16 10:26:32 -05:00
Michael Schurter	0cdb188ae4	tests: fix tests post-rebase	2018-11-15 17:40:56 -08:00
Michael Schurter	59f106ecee	client/tr: add a bit of context to envbuilder errors	2018-11-15 16:26:25 -08:00
Michael Schurter	742f8775ba	client: remove old proxy references from comments	2018-11-15 16:26:25 -08:00
Michael Schurter	8bcd90d78d	client: add new nested variables to task's hcl ctx The error messages are really bad, but it's extremely difficult to produce good error messages without the original HCL.	2018-11-15 16:26:25 -08:00
Michael Schurter	f8cdd561f0	client: interpolate driver configurations Also add missing SetDriverNetwork calls.	2018-11-15 16:25:57 -08:00
Mahmood Ali	865419e756	convert all config durations to strings in tests	2018-11-13 10:21:40 -05:00
Michael Schurter	a4e6a92d18	client: update alloc status when terminating Defensively update alloc status whenever killing all tasks.	2018-11-05 15:11:10 -08:00
Michael Schurter	66bf3db455	client: block on context as well as waitCh For lifecycle operations such as Restart and Kill, the client should not expect driver plugins to be well behaved and close their waitCh on context cancelation. Always wait on the passed in context as well as the waitCh.	2018-11-05 12:32:05 -08:00
Michael Schurter	b994f51990	client: fix tr lifecycle logic and shutdown delay ShutdownDelay must be honored whenever the task is killed or restarted. Services were not being deregistered prior to restarting.	2018-11-05 12:32:05 -08:00
Michael Schurter	2d3479147a	client: fix ar and tr tests	2018-11-05 12:32:05 -08:00
Michael Schurter	d29d09023e	client: do not run terminal allocs	2018-11-05 12:32:05 -08:00
Michael Schurter	2bbd88888c	client: first pass at implementing task restoring Task restoring works but dead tasks may be restarted	2018-11-05 12:32:05 -08:00
Nick Ethier	3fcf8ba7e6	Merge pull request #4795 from hashicorp/f-plugin-config Pass client configuration to plugins through loader	2018-10-29 18:42:27 -07:00
Michael Schurter	e060174130	ar: fix leader handling, state restoring, and destroying unrun ARs * Migrated all of the old leader task tests and got them passing * Refactor and consolidate task killing code in AR to always kill leader tasks first * Fixed lots of issues with state restoring * Fixed deadlock in AR.Destroy if AR.Run had never been called * Added a new in memory statedb for testing	2018-10-19 09:45:45 -07:00
Michael Schurter	cefbf00bf0	ar: refactor task killing into 1 method Update comments and address some PR comments from #4775	2018-10-17 10:06:59 -07:00
Michael Schurter	21d78be961	tests: explicitly cleanup after clients	2018-10-17 10:06:59 -07:00
Michael Schurter	222f6b5741	ar: fix task leader, update, and stop handling	2018-10-17 10:06:59 -07:00
Michael Schurter	1badbb2fc4	tr: cleanup hook logs	2018-10-17 09:42:32 -07:00
Nick Ethier	65adb80ebf	plumb NomadConfig into plugins	2018-10-16 22:47:22 -04:00
Michael Schurter	0baaba8b09	templates: fix tests	2018-10-16 16:56:57 -07:00
Michael Schurter	838ddf4d4a	fix linter errors	2018-10-16 16:56:57 -07:00
Michael Schurter	e27c82ea4d	client: remove unused handleproxy	2018-10-16 16:56:56 -07:00
Michael Schurter	4ea5217d72	tr: remove unused DriverHandle interface was causing typed nil interface panics and served no purpose	2018-10-16 16:56:56 -07:00
Michael Schurter	528c426c53	Port client portion of #4392 to new taskrunner PR #4392 was merged to master after allocrunnerv2 was branched, so the client-specific portions must be ported from master to arv2.	2018-10-16 16:56:56 -07:00
Michael Schurter	f12501d4c3	tr: implement dispatch payload hook Now passing the TaskDir struct to prestart hooks instead of just the root task dir itself as dispatch needs local/.	2018-10-16 16:56:56 -07:00
Nick Ethier	8cf669b5aa	taskrunner: return error on waitCh	2018-10-16 16:56:56 -07:00
Nick Ethier	047fad2953	client: simplify driver plugin logic from review comments	2018-10-16 16:56:56 -07:00
Nick Ethier	9686e1b258	client: fix broken tests from refactoring	2018-10-16 16:56:56 -07:00
Nick Ethier	3183b33d24	client: review comments and fixup/skip tests	2018-10-16 16:56:56 -07:00
Nick Ethier	f192c3752a	client: refactor post allocrunnerv2 finalization	2018-10-16 16:56:56 -07:00
Nick Ethier	4a4c7dbbfc	client: begin driver plugin integration client: fingerprint driver plugins	2018-10-16 16:56:56 -07:00
Alex Dadgar	7946a14aa8	Fix lints	2018-10-16 16:56:56 -07:00
Alex Dadgar	45e41cca03	allocrunnerv2 -> allocrunner	2018-10-16 16:56:56 -07:00
Alex Dadgar	6c9d9d5173	move files around	2018-10-16 16:56:55 -07:00
Michael Schurter	9d1ea3b228	client: hclog-ify most of the client Leaving fingerprinters in case that interface changes with plugins.	2018-10-16 16:53:30 -07:00
Michael Schurter	e42154fc46	implement stopping, destroying, and disk migration * Stopping an alloc is implemented via Updates but update hooks are not run. * Destroying an alloc is a best effort cleanup. * AllocRunner destroy hooks implemented. * Disk migration and blocking on a previous allocation exiting moved to its own package to avoid cycles. Now only depends on alloc broadcaster instead of also using a waitch. * AllocBroadcaster now only drops stale allocations and always keeps the latest version. * Made AllocDir safe for concurrent use Lots of internal contexts that are currently unused. Unsure if they should be used or removed.	2018-10-16 16:53:30 -07:00
Michael Schurter	820af27171	wrap boltdb in a write deduplicator Saves a tiny bit of cpu and some IO. Sadly doesn't prevent all IO on duplicate writes as the transactions are still created and committed. $ go test -bench=. -benchmem goos: linux goarch: amd64 pkg: github.com/hashicorp/nomad/helper/boltdd BenchmarkWriteDeduplication_On-4 500 4059591 ns/op 23736 B/op 56 allocs/op BenchmarkWriteDeduplication_Off-4 300 4115319 ns/op 25942 B/op 55 allocs/op	2018-10-16 16:53:30 -07:00
Michael Schurter	5383d20505	removing old restoration path before api change	2018-10-16 16:53:30 -07:00
Michael Schurter	39b3f3a85b	call handle.Network() instead of storing it	2018-10-16 16:53:30 -07:00
Michael Schurter	a4b4d7b266	consul service hook Deregistration works but difficult to test due to terminal updates not being fully implemented in the new client/ar/tr.	2018-10-16 16:53:29 -07:00
Michael Schurter	9a63d6103d	tr: add validate task hook	2018-10-16 16:53:29 -07:00
Alex Dadgar	e401c660e7	Implement lifecycle hooks on the task runner	2018-10-16 16:53:29 -07:00
Michael Schurter	eae54e2954	artifact task hook	2018-10-16 16:53:29 -07:00
Alex Dadgar	52f9cd7637	fixing tests	2018-10-04 14:26:19 -07:00
Alex Dadgar	ca28afa3b2	small fixes	2018-09-15 16:42:38 -07:00
Alex Dadgar	7739ef51ce	agent + consul	2018-09-13 10:43:40 -07:00
Michael Schurter	6def5bc4f9	client: set host name when migrating over tls Not setting the host name led the Go HTTP client to expect a certificate with a DNS-resolvable name. Since Nomad uses `${role}.${region}.nomad` names ephemeral dir migrations were broken when TLS was enabled. Added an e2e test to ensure this doesn't break again as it's very difficult to test and the TLS configuration is very easy to get wrong.	2018-09-05 17:24:17 -07:00
Andrei Burd	444ee45aff	Parametrized/periodic jobs per child tagged metric emission	2018-06-21 10:40:56 +03:00
Alex Dadgar	300b1a7a15	Tests only use testlog package logger	2018-06-13 15:40:56 -07:00
Alex Dadgar	9bab9edf27	test fixes	2018-06-12 17:45:39 -07:00
Alex Dadgar	90c2108bfb	Fix gc tests + parallel destroy + small test fixes	2018-06-12 10:23:45 -07:00
Alex Dadgar	f5ff509fa5	Refactor - wip	2018-06-12 10:23:45 -07:00


566 Commits