open-nomad

Commit Graph

Author	SHA1	Message	Date
Tim Gross	a3a86a849a	test: deflake node drain integration test (#18171) The `TestDrainer_AllTypes_NoDeadline` test has been flaky. It looks like this might be because the final update of batch allocations to complete is improperly updating the state store directly rather than by RPC. If the service jobs have restarted in the meantime, the `allocClientStateSimulator` will have updated the index on the allocations table and that will prevent the drainer from unblocking (and being marked complete) when the batch jobs are written with an earlier index. This changeset attempts to fix that by making the update via RPC (as it normally would be in real code).	2023-08-14 16:19:00 -04:00
Luiz Aoqui	7b5a8f1fb0	Revert "hashicorp/go-msgpack v2 (#16810)" (#17047) This reverts commit 8a98520d56eed3848096734487d8bd3eb9162a65.	2023-05-01 17:18:34 -04:00
Ian Fijolek	619f49afcf	hashicorp/go-msgpack v2 (#16810) * Upgrade from hashicorp/go-msgpack v1.1.5 to v2.1.0 Fixes #16808 * Update hashicorp/net-rpc-msgpackrpc to v2 to match go-msgpack * deps: use go-msgpack v2.0.0 go-msgpack v2.1.0 includes some code changes that we will need to investigate further to assess its impact on Nomad, so keeping this dependency on v2.0.0 for now since it's no-op. --------- Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2023-04-17 17:02:05 -04:00
Tim Gross	5a9abdc469	drain: use client status to determine drain is complete (#14348) If an allocation is slow to stop because of `kill_timeout` or `shutdown_delay`, the node drain is marked as complete prematurely, even though drain monitoring will continue to report allocation migrations. This impacts the UI or API clients that monitor node draining to shut down nodes. This changeset updates the behavior to wait until the client status of all drained allocs are terminal before marking the node as done draining.	2023-04-13 08:55:28 -04:00
Tim Gross	a9a350cfdb	drainer: fix codec race condition in integration test (#16845) msgpackrpc codec handles are specific to a connection and cannot be shared between goroutines; this can cause corrupted decoding. Fix the drainer integration test so that we create separate codecs for the goroutines that the test helper spins up to simulate client updates. This changeset also refactors the drainer integration test to bring it up to current idioms and library usages, make assertions more clear, and reduce duplication.	2023-04-11 14:31:13 -04:00
hashicorp-copywrite[bot]	005636afa0	[COMPLIANCE] Add Copyright and License Headers	2023-04-10 15:36:59 +00:00
Piotr Kazmierczak	b63944b5c1	cleanup: replace TypeToPtr helper methods with pointer.Of (#14151) Bumping compile time requirement to go 1.18 allows us to simplify our pointer helper methods.	2022-08-17 18:26:34 +02:00
Seth Hoenig	2631659551	ci: swap ci parallelization for unconstrained gomaxprocs	2022-03-15 12:58:52 -05:00
Luiz Aoqui	ab8ce87bba	Add pagination, filtering and sort to more API endpoints (#12186)	2022-03-08 20:54:17 -05:00
Mahmood Ali	d2fcce21f8	Migrate all allocs when draining a node (#10411) This fixes a bug affecting drain nodes, where allocs may fail to be migrated if they belong to different namespaces but share the same job name. The reason is that the helper function that creates the migration evals indexed the allocs by job ID without accounting for the namespaces. When job ids clash, only an eval is created for one and the rest of the allocs remain intact. Fixes #10172	2021-04-21 12:11:14 -04:00
Drew Bailey	6c788fdccd	Events/msgtype cleanup (#9117) * use msgtype in upsert node adds message type to signature for upsert node, update tests, remove placeholder method * UpsertAllocs msg type test setup * use upsertallocs with msg type in signature update test usage of delete node delete placeholder msgtype method * add msgtype to upsert evals signature, update test call sites with test setup msg type handle snapshot upsert eval outside of FSM and ignore eval event remove placeholder upsertevalsmsgtype handle job plan rpc and prevent event creation for plan msgtype cleanup upsertnodeevents updatenodedrain msgtype msg type 0 is a node registration event, so set the default to the ignore type * fix named import * fix signature ordering on upsertnode to match	2020-10-19 09:30:15 -04:00
Drew Bailey	9d48818eb8	writetxn can return error, add alloc and job generic events. Add events table for durability	2020-10-14 12:44:39 -04:00
Drew Bailey	400455d302	Events/eval alloc events (#9012) * generic eval update event first pass at alloc client update events * api/event client	2020-10-14 12:44:37 -04:00
Mahmood Ali	61e4f5aaf9	tests: use GreaterOrEqual and apply change to other tests	2020-05-27 11:22:48 -04:00
Mahmood Ali	ec1fcedb93	tests: node drain events may be duplicated	2020-05-27 08:59:06 -04:00
Seth Hoenig	f0c3dca49c	tests: swap lib/freeport for tweaked helper/freeport Copy the updated version of freeport (sdk/freeport), and tweak it for use in Nomad tests. This means staying below port 10000 to avoid conflicts with the lib/freeport that is still transitively used by the old version of consul that we vendor. Also provide implementations to find ephemeral ports of macOS and Windows environments. Ports acquired through freeport are supposed to be returned to freeport, which this change now also introduces. Many tests are modified to include calls to a cleanup function for Server objects. This should help quite a bit with some flakey tests, but not all of them. Our port problems will not go away completely until we upgrade our vendor version of consul. With Go modules, we'll probably do a 'replace' to swap out other copies of freeport with the one now in 'nomad/helper/freeport'.	2019-12-09 08:37:32 -06:00
Michael Schurter	5d49832de4	tests: fix usages of TestClient cleanup and mock driver	2018-10-29 14:21:05 -07:00
Alex Dadgar	3c19d01d7a	server	2018-09-15 16:23:13 -07:00
Alex Dadgar	9a2237bdab	Drain complete	2018-05-10 17:22:06 -07:00
Alex Dadgar	c91ce5cc38	Fix not enqueuing eval	2018-05-07 14:50:01 -05:00
Alex Dadgar	8d50955054	Fix typos	2018-05-07 14:50:01 -05:00
Alex Dadgar	d45f39f24e	Fix detecting drain strategy on GC'd node	2018-04-25 16:00:56 -07:00
Michael Schurter	edc4891283	drain: improve tests and fix spelling * transistion -> transition * don't t.Fatal in goroutines * don't mutate global state	2018-04-02 16:40:47 -07:00
Alex Dadgar	e8809f40dc	Test transition from both infinite and a future deadline to force	2018-03-30 11:24:39 -07:00
Alex Dadgar	32a673a7e1	Fix force deadline notification	2018-03-30 09:58:29 -07:00
Alex Dadgar	1aa415b0d8	Integration test	2018-03-30 09:33:23 -07:00
Alex Dadgar	e458ab9031	Merge branch 'master' into b-drain-batch	2018-03-29 17:10:34 -07:00
Alex Dadgar	049a9213d2	Watch batch jobs	2018-03-29 16:07:51 -07:00
Alex Dadgar	f12194328c	Integration test for batch complete case	2018-03-29 13:51:04 -07:00
Michael Schurter	91b5bb58d9	add HasHealth helper for nil checks We performed the DeploymentStatus nil checks a couple different ways, so hopefully this helper will consolidate them and make it more clear what the code is doing.	2018-03-29 09:29:19 -07:00
Alex Dadgar	b18f789020	Unmark drain when nodes hit their deadline and only batch/system left and add all job type integration test	2018-03-28 17:25:58 -07:00
Alex Dadgar	640ebdaef6	fix race in drain integration tests	2018-03-21 16:51:45 -07:00
Alex Dadgar	92b636dd32	Fix deadline handling	2018-03-21 16:51:44 -07:00
Alex Dadgar	e87c677a42	handle empty node case	2018-03-21 16:51:44 -07:00
Alex Dadgar	405dab2253	integration test and basic fixes	2018-03-21 16:51:44 -07:00

35 Commits