open-nomad

Author	SHA1	Message	Date
Samantha	54f8c04c91	Fix health checking for ephemeral poststart tasks (#11945) Update the logic in the Nomad client's alloc health tracker which erroneously marks existing healthy allocations with dead poststart ephemeral tasks as unhealthy even if they were already successful during a previous deployment.	2022-02-02 16:29:49 -05:00
Drew Bailey	8507d54e3b	e2e test for on_update service checks; check_restart not compatible with on_update=ignore; reword caveat	2021-02-08 08:32:40 -05:00
Drew Bailey	82f971f289	OnUpdate configuration for services and checks Allow for readiness-type checks by configuring Nomad to ignore warnings or errors reported by a service check. This allows the deployment to progress while Consul handles introducing the service into a resource pool once the check passes.	2021-02-08 08:32:40 -05:00
Tim Gross	d55e3e2018	lifecycle: successful prestart tasks should not fail deployments In 492d62d we prevented poststop tasks from contributing to allocation health status, which fixed a bug where poststop tasks would prevent a deployment from ever being marked successful. The patch introduced a regression where prestart tasks that complete are causing the allocation to be marked unhealthy. This changeset restores the previous behavior for prestart tasks.	2021-01-13 11:40:21 -05:00
Drew Bailey	03a9541822	ignore poststop task in alloc health tracker (#9548), fixes #9361 * investigating where to ignore poststop task in alloc health tracker * ignore poststop when setting latest start time for allocation * clean up logic * lifecycle: isolate mocks for poststop deployment test * lifecycle: update comments in tracker Co-authored-by: Jasmine Dahilig <jasmine@dahilig.com>	2021-01-12 10:03:48 -08:00
Mahmood Ali	0ece631e60	allochealth: Fix when check health precedes task health Fix a bug where, if the alloc's checks become healthy before its tasks do, the alloc may never be considered healthy.	2020-05-13 07:44:39 -04:00
Mahmood Ali	934c5e8ff0	tests: tests for health check sequencing Add a failing test to show that if an alloc's checks are marked healthy before the alloc's tasks start up, the alloc may be considered unhealthy forever.	2020-05-13 07:43:00 -04:00
Mahmood Ali	fa1244f8c5	health tracker: account for group service checks	2020-03-22 12:38:37 -04:00
Mahmood Ali	d61140dcac	health check account for task lifecycle In service jobs, non-sidecar lifecycle tasks tweak the health logic a bit: they may terminate successfully without impacting alloc health, but they fail the alloc if they fail. Sidecars should be treated just like a normal task.	2020-03-22 12:37:40 -04:00
Mahmood Ali	07a30580ac	health: fail health if any task is pending Fixes a bug where an allocation is considered healthy if some of the tasks are being restarted and, as such, their checks aren't tracked by the consul agent client. Here, we fix the immediate case by ensuring that an alloc is healthy only if tasks are running and the registered checks at the time are healthy. Previously, the health tracker tracked task "health" independently from checks, which led to problems when a task restarts. Consider the following series of events: 1. all tasks start running -> `tracker.tasksHealthy` is true 2. one task has unhealthy checks and gets restarted 3. remaining checks are healthy -> `tracker.checksHealthy` is true 4. propagate health status now that `tracker.tasksHealthy` and `tracker.checksHealthy` are both true. This change ensures that we accurately use the latest status of tasks and checks regardless of their status changes. It also ensures that we only consider check health after tasks are considered healthy; otherwise we risk trusting incomplete checks. This approach accommodates task dependencies well. Service jobs can have short-lived prestart tasks that will terminate before the main process runs. These dead tasks that complete successfully will not negate health status.	2020-03-22 11:13:41 -04:00
Nick Ethier	bd454a4c6f	client: improve group service stanza interpolation and check_re… (#6586) * client: improve group service stanza interpolation and check_restart support Interpolation can now be done on group service stanzas. Note that some task runtime specific information that was previously available when the service was registered poststart of a task is no longer available. The check_restart stanza for checks defined on group services will now properly restart the allocation upon check failures if configured.	2019-11-18 13:04:01 -05:00
Michael Schurter	fb487358fb	connect: add group.service stanza support	2019-07-31 01:04:05 -04:00
Michael Schurter	159042a1a3	client: fix setting alloc unhealthy at deadline During the 0.9 client refactor the code to fail a deployment when the deadline was reached was broken. This restores and tests that behavior.	2019-02-19 07:44:14 -08:00
Michael Schurter	4e7ea460e8	test: port some pre-0.9 DeploymentHealth tests Skipping a failing one as I need to move to some other work and don't want to leave this work orphaned on my machine.	2019-01-14 09:56:53 -08:00
Michael Schurter	944ea6d38b	client: emit last sent alloc to new listeners Fixes a deadlock where the allocwatcher would block forever waiting for an update from a terminal alloc. Made the broadcaster easier to debug as well.	2018-11-27 14:06:08 -08:00
Michael Schurter	4136e59f79	arv2: implement alloc health watching Also remove initial alloc from broadcaster as it just caused useless extra processing.	2018-10-16 16:53:30 -07:00

16 commits
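
The health rules described in commits d61140dcac and 07a30580ac can be illustrated with a minimal Go sketch. This is not the Nomad client's actual tracker code: the `TaskState` type, its `Ephemeral` field, and the `tasksHealthy` and `allocHealthy` functions are hypothetical names chosen for illustration. The sketch assumes an alloc is healthy only when every task is running (or is a non-sidecar lifecycle task that exited successfully), and that check results are consulted only after the tasks are considered healthy.

```go
package main

import "fmt"

// TaskState is a hypothetical, simplified stand-in for a task's runtime
// state; it is not Nomad's real structs.TaskState.
type TaskState struct {
	State     string // "pending", "running", or "dead"
	Failed    bool   // true if the task exited unsuccessfully
	Ephemeral bool   // hypothetical flag: non-sidecar lifecycle task (e.g. prestart)
}

// tasksHealthy reports whether every task is either running or is an
// ephemeral lifecycle task that finished without failing. A pending
// (e.g. restarting) task makes the whole set unhealthy.
func tasksHealthy(tasks map[string]TaskState) bool {
	for _, t := range tasks {
		switch {
		case t.Failed:
			return false
		case t.State == "running":
			continue
		case t.State == "dead" && t.Ephemeral:
			continue
		default:
			return false
		}
	}
	return true
}

// allocHealthy looks at check results only once the tasks are healthy,
// so stale or incomplete check status is never trusted on its own.
func allocHealthy(tasks map[string]TaskState, checks map[string]bool) bool {
	if !tasksHealthy(tasks) {
		return false
	}
	for _, passing := range checks {
		if !passing {
			return false
		}
	}
	return true
}

func main() {
	tasks := map[string]TaskState{
		"init": {State: "dead", Ephemeral: true}, // finished prestart task
		"web":  {State: "running"},
	}
	checks := map[string]bool{"http": true}
	fmt.Println(allocHealthy(tasks, checks)) // true

	// A restarting task is pending, so passing checks are not enough.
	tasks["web"] = TaskState{State: "pending"}
	fmt.Println(allocHealthy(tasks, checks)) // false
}
```

Evaluating both conditions from the latest state on every call, rather than caching two independent booleans, is what avoids the stale `tracker.tasksHealthy` scenario listed in 07a30580ac.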