open-nomad

Author	SHA1	Message	Date
James Rasell	8a23afdb56	events: add state objects and logic for service registrations.	2022-02-28 10:44:58 +01:00
James Rasell	20249bb761	state: add service registration restore functionality.	2022-02-28 10:15:27 +01:00
James Rasell	74b367553e	state: add service registration state interaction functions.	2022-02-28 10:15:03 +01:00
James Rasell	cf0b63d561	state: add the table schema for the service_registrations table.	2022-02-28 10:14:10 +01:00
Tim Gross	5b7b9fdafb	csi: tolerate missing plugins on job delete (#12114 ) If a plugin job fails before successfully fingerprinting the plugins, the plugin will not exist when we try to delete the job. Tolerate missing plugins.	2022-02-24 08:53:15 -05:00
Tim Gross	17dc0adee3	csi: fix broken test (#12110 )	2022-02-23 13:48:39 -05:00
Luiz Aoqui	de91954582	initial base work for implementing sorting and filter across API endpoints (#12076 )	2022-02-16 14:34:36 -05:00
Luiz Aoqui	110dbeeb9d	Add `go-bexpr` filters to evals and deployment list endpoints (#12034 )	2022-02-16 11:40:30 -05:00
Seth Hoenig	40c714a681	api: return sorted results in certain list endpoints These API endpoints now return results in chronological order. They can return results in reverse chronological order by setting the query parameter ascending=true. - Eval.List - Deployment.List	2022-02-15 13:48:28 -06:00
Tim Gross	6bd33d3fb9	CSI: use job status not alloc status for plugin updates from summary (#12027 ) When an allocation is updated, the job summary for the associated job is also updated. CSI uses the job summary to set the expected count for controller and node plugins. We incorrectly used the allocation's server status instead of the job status when deciding whether to update or remove the job from the plugins. This caused a node drain or other terminal state for an allocation to clear the expected count for the entire plugin. Use the job status to guide whether to update or remove the expected count. The existing CSI tests for the state store incorrectly modeled the updates we received from servers vs those we received from clients, leading to test assertions that passed when they should not. Rework the tests to clarify each step in the lifecycle and rename CSI state store functions for clarity	2022-02-09 11:51:49 -05:00
Tim Gross	b20a6c9ffb	CSI: move terminal alloc handling into denormalization (#11931 ) * The volume claim GC method and volumewatcher both have logic collecting terminal allocations that duplicates most of the logic that's now in the state store's `CSIVolumeDenormalize` method. Copy this logic into the state store so that all code paths have the same view of the past claims. * Remove logic in the volume claim GC that now lives in the state store's `CSIVolumeDenormalize` method. * Remove logic in the volumewatcher that now lives in the state store's `CSIVolumeDenormalize` method. * Remove logic in the node unpublish RPC that now lives in the state store's `CSIVolumeDenormalize` method.	2022-01-27 10:39:08 -05:00
Tim Gross	a40a20cff8	csi: ensure that PastClaims are populated with correct mode (#11932 ) In the client's `(csiHook) Postrun()` method, we make an unpublish RPC that includes a claim in the `CSIVolumeClaimStateUnpublishing` state and using the mode from the client. But then in the `(CSIVolume) Unpublish` RPC handler, we query the volume from the state store (because we only get an ID from the client). And when we make the client RPC for the node unpublish step, we use the _current volume's_ view of the mode. If the volume's mode has been changed before the old allocations can have their claims released, then we end up making a CSI RPC that will never succeed. Why does this code path get the mode from the volume and not the claim? Because the claim written by the GC job in `(*CoreScheduler) csiVolumeClaimGC` doesn't have a mode. Instead it just writes a claim in the unpublishing state to ensure the volumewatcher detects a "past claim" change and reaps all the claims on the volumes. Fix this by ensuring that the `CSIVolumeDenormalize` creates past claims for all nil allocations with a correct access mode set.	2022-01-27 10:05:41 -05:00
Tim Gross	a2433e35fb	CSI: resolve invalid claim states (#11890 ) * csi: resolve invalid claim states on read It's currently possible for CSI volumes to be claimed by allocations that no longer exist. This changeset asserts a reasonable state at the state store level by registering these nil allocations as "past claims" on any read. This will cause any pass through the periodic GC or volumewatcher to trigger the unpublishing workflow for those claims. * csi: make feasibility check errors more understandable When the feasibility checker finds we have no free write claims, it checks to see if any of those claims are for the job we're currently scheduling (so that earlier versions of a job can't block claims for new versions) and reports a conflict if the volume can't be scheduled so that the user can fix their claims. But when the checker hits a claim that has a GCd allocation, the state is recoverable by the server once claim reaping completes and no user intervention is required; the blocked eval should complete. Differentiate the scheduler error produced by these two conditions.	2022-01-27 09:30:03 -05:00
James Rasell	03cdd1a624	state: move restore functionality into its own file.	2022-01-24 13:03:03 +01:00
Tim Gross	d7756f8cdb	csi: volume deregistration should require exact ID (#11852 ) The command line client sends a specific volume ID, but this isn't enforced at the API level and we were incorrectly using a prefix match for volume deregistration, resulting in cases where a volume with a shorter ID that's a prefix of another volume would be deregistered instead of the intended volume.	2022-01-14 12:26:03 -05:00
James Rasell	45f4689f9c	chore: fixup inconsistent method receiver names. (#11704 )	2021-12-20 11:44:21 +01:00
Tim Gross	a0cf5db797	provide `-no-shutdown-delay` flag for job/alloc stop (#11596 ) Some operators use very long group/task `shutdown_delay` settings to safely drain network connections to their workloads after service deregistration. But during incident response, they may want to cause that drain to be skipped so they can quickly shed load. Provide a `-no-shutdown-delay` flag on the `nomad alloc stop` and `nomad job stop` commands that bypasses the delay. This sets a new desired transition state on the affected allocations that the allocation/task runner will identify during pre-kill on the client. Note (as documented here) that using this flag will almost always result in failed inbound network connections for workloads as the tasks will exit before clients receive updated service discovery information and won't be gracefully drained.	2021-12-13 14:54:53 -05:00
Tim Gross	624ecab901	evaluations list pagination and filtering (#11648 ) API queries can request pagination using the `NextToken` and `PerPage` fields of `QueryOptions`, when supported by the underlying API. Add a `NextToken` field to the `structs.QueryMeta` so that we have a common field across RPCs to tell the caller where to resume paging from on their next API call. Include this field on the `api.QueryMeta` as well so that it's available for future versions of List HTTP APIs that wrap the response with `QueryMeta` rather than returning a simple list of structs. In the meantime callers can get the `X-Nomad-NextToken`. Add pagination to the `Eval.List` RPC by checking for pagination token and page size in `QueryOptions`. This will allow resuming from the last ID seen so long as the query parameters and the state store itself are unchanged between requests. Add filtering by job ID or evaluation status over the results we get out of the state store. Parse the query parameters of the `Eval.List` API into the arguments expected for filtering in the RPC call.	2021-12-10 13:43:03 -05:00
James Rasell	751c8217d1	core: allow setting and propagation of eval priority on job de/registration (#11532 ) This change modifies the Nomad job register and deregister RPCs to accept an updated option set which includes eval priority. This param is optional and override the use of the job priority to set the eval priority. In order to ensure all evaluations as a result of the request use the same eval priority, the priority is shared to the allocReconciler and deploymentWatcher. This creates a new distinction between eval priority and job priority. The Nomad agent HTTP API has been modified to allow setting the eval priority on job update and delete. To keep consistency with the current v1 API, job update accepts this as a payload param; job delete accepts this as a query param. Any user supplied value is validated within the agent HTTP handler removing the need to pass invalid requests to the server. The register and deregister opts functions now all for setting the eval priority on requests. The change includes a small change to the DeregisterOpts function which handles nil opts. This brings the function inline with the RegisterOpts.	2021-11-23 09:23:31 +01:00
Mahmood Ali	4d90afb425	gofmt all the files mostly to handle build directives in 1.17.	2021-10-01 10:14:28 -04:00
Isabel Suchanek	ab51050ce8	events: fix wildcard namespace handling (#10935 ) This fixes a bug in the event stream API where it currently interprets namespace=* as an actual namespace, not a wildcard. When Nomad parses incoming requests, it sets namespace to default if not specified, which means the request namespace will never be an empty string, which is what the event subscription was checking for. This changes the conditional logic to check for a wildcard namespace instead of an empty one. It also updates some event tests to include the default namespace in the subscription to match current behavior. Fixes #10903	2021-09-02 09:36:55 -07:00
James Rasell	b6813f1221	chore: fix incorrect docstring formatting.	2021-08-30 11:08:12 +02:00
Seth Hoenig	3371214431	core: implement system batch scheduler This PR implements a new "System Batch" scheduler type. Jobs can make use of this new scheduler by setting their type to 'sysbatch'. Like the name implies, sysbatch can be thought of as a hybrid between system and batch jobs - it is for running short lived jobs intended to run on every compatible node in the cluster. As with batch jobs, sysbatch jobs can also be periodic and/or parameterized dispatch jobs. A sysbatch job is considered complete when it has been run on all compatible nodes until reaching a terminal state (success or failed on retries). Feasibility and preemption are governed the same as with system jobs. In this PR, the update stanza is not yet supported. The update stanza is sill limited in functionality for the underlying system scheduler, and is not useful yet for sysbatch jobs. Further work in #4740 will improve support for the update stanza and deployments. Closes #2527	2021-08-03 10:30:47 -04:00
Mahmood Ali	aa77c2731b	tests: use standard library testing.TB Glint pulled in an updated version of mitchellh/go-testing-interface which broke some existing tests because the update added a Parallel() method to testing.T. This switches to the standard library testing.TB which doesn't have a Parallel() method.	2021-06-09 16:18:45 -07:00
Chris Baker	263ddd567c	Node Drain Metadata (#10250 )	2021-05-07 13:58:40 -04:00
Tim Gross	276633673d	CSI: use AccessMode/AttachmentMode from CSIVolumeClaim Registration of Nomad volumes previously allowed for a single volume capability (access mode + attachment mode pair). The recent `volume create` command requires that we pass a list of requested capabilities, but the existing workflow for claiming volumes and attaching them on the client assumed that the volume's single capability was correct and unchanging. Add `AccessMode` and `AttachmentMode` to `CSIVolumeClaim`, use these fields to set the initial claim value, and add backwards compatibility logic to handle the existing volumes that already have claims without these fields.	2021-04-07 11:24:09 -04:00
Chris Baker	436d46bd19	Merge branch 'main' into f-node-drain-api	2021-04-01 15:22:57 -05:00
Tim Gross	71b9daffb9	CSI: fix misleading HTTP test The HTTP test to create CSI volumes depends on having a controller plugin to talk to, but the test was using a node-only plugin, which allows it to silently ignore the missing controller.	2021-03-31 16:37:09 -04:00
Chris Baker	770c9cecb5	restored Node.Sanitize() for RPC endpoints multiple other updates from code review	2021-03-26 17:03:15 +00:00
Chris Baker	cb540ed691	added tests that the API doesn't leak Node.SecretID added more documentation on JSON encoding to the contributing guide	2021-03-23 18:09:20 +00:00
Chris Baker	dd291e69f4	removed deprecated fields from Drain structs and API node drain: use msgtype on txn so that events are emitted wip: encoding extension to add Node.Drain field back to API responses new approach for hiding Node.SecretID in the API, using `json` tag documented this approach in the contributing guide refactored the JSON handlers with extensions modified event stream encoding to use the go-msgpack encoders with the extensions	2021-03-21 15:30:11 +00:00
Tim Gross	fa25e048b2	CSI: unique volume per allocation Add a `PerAlloc` field to volume requests that directs the scheduler to test feasibility for volumes with a source ID that includes the allocation index suffix (ex. `[0]`), rather than the exact source ID. Read the `PerAlloc` field when making the volume claim at the client to determine if the allocation index suffix (ex. `[0]`) should be added to the volume source ID.	2021-03-18 15:35:11 -04:00
Tim Gross	9b2b580d1a	CSI: remove prefix matching from CSIVolumeByID and fix CLI prefix matching (#10158 ) Callers of `CSIVolumeByID` are generally assuming they should receive a single volume. This potentially results in feasibility checking being performed against the wrong volume if a volume's ID is a prefix substring of other volume (for example: "test" and "testing"). Removing the incorrect prefix matching from `CSIVolumeByID` breaks prefix matching in the command line client. Add the required elements for prefix matching to the commands and API.	2021-03-18 14:32:40 -04:00
Charlie Voiselle	0473f35003	Fixup uses of `sanity` (#10187 ) * Fixup uses of `sanity` * Remove unnecessary comments. These checks are better explained by earlier comments about the context of the test. Per @tgross, moved the tests together to better reinforce the overall shared context. * Update nomad/fsm_test.go	2021-03-16 18:05:08 -04:00
Tim Gross	97b0e26d1f	RPC endpoints to support 'nomad ui -login' RPC endpoints for the user-driven APIs (`UpsertOneTimeToken` and `ExchangeOneTimeToken`) and token expiration (`ExpireOneTimeTokens`). Includes adding expiration to the periodic core GC job.	2021-03-10 08:17:56 -05:00
Tim Gross	6b2c4b56d0	state store updates for one-time tokens The `OneTimeToken` struct is to support the `nomad ui -login` command. This changeset adds the struct to the Nomad state store.	2021-03-10 08:17:56 -05:00
Tim Gross	b764f52ab9	deploymentwatcher: reset progress deadline on promotion (#10042 ) In a deployment with two groups (ex. A and B), if group A's canary becomes healthy before group B's, the deadline for the overall deployment will be set to that of group A. When the deployment is promoted, if group A is done it will not contribute to the next deadline cutoff. Group B's old deadline will be used instead, which will be in the past and immediately trigger a deployment progress failure. Reset the progress deadline when the job is promotion to avoid this bug, and to better conform with implicit user expectations around how the progress deadline should interact with promotions.	2021-02-22 16:44:03 -05:00
Drew Bailey	007158ee75	ignore setting job summary when oldstatus == newstatus (#9884 )	2021-01-25 10:34:27 -05:00
Drew Bailey	630babb886	prevent double job status update (#9768 ) * Prevent Job Statuses from being calculated twice https://github.com/hashicorp/nomad/pull/8435 introduced atomic eval insertion iwth job (de-)registration. This change removes a now obsolete guard which checked if the index was equal to the job.CreateIndex, which would empty the status. Now that the job regisration eval insetion is atomic with the registration this check is no longer necessary to set the job statuses correctly. * test to ensure only single job event for job register * periodic e2e * separate job update summary step * fix updatejobstability to use copy instead of modified reference of job * update envoygatewaybindaddresses copy to prevent job diff on null vs empty * set ConsulGatewayBindAddress to empty map instead of nil fix nil assertions for empty map rm unnecessary guard	2021-01-22 09:18:17 -05:00
Drew Bailey	54becaab7d	Events/acl events (#9595 ) * fix acl event creation * allow way to access secretID without exposing it to stream test that values are omitted test event creation test acl events payloads are pointers fix failing tests, do all security steps inside constructor * increase time * ignore empty tokens * uncomment line * changelog	2020-12-11 10:40:50 -05:00
Kris Hicks	0a3a748053	Add gosimple linter (#9590 )	2020-12-09 11:05:18 -08:00
Kris Hicks	93155ba3da	Add gocritic to golangci-lint config (#9556 )	2020-12-08 12:47:04 -08:00
Drew Bailey	ce85288f2f	ensure node secret ID is not included in event stream (#9510 )	2020-12-03 12:27:14 -05:00
Drew Bailey	17de8ebcb1	API: Event stream use full name instead of Eval/Alloc (#9509 ) * use full name for events use evaluation and allocation instead of short name * update api event stream package and shortnames * update docs * make sync; fix typo * backwards compat not from 1.0.0-beta event stream api changes * use api types instead of string * rm backwards compat note that only changed between prereleases * remove backwards incompat that only existed in prereleases	2020-12-03 11:48:18 -05:00
Drew Bailey	f9f5fe8236	Events switch on memdb change table instead of type to prevent duplicates (#9486 ) * prevent duplicate job events when a job is updated, the job_version table is updated with a structs.Job, this caused there to be multiple job events since we are switching off the change type and not the table * test length * add table value to tests	2020-12-01 15:14:05 -05:00
Drew Bailey	1f8e1aa631	pass in msgType for UpsertJob (#9475 )	2020-12-01 14:00:52 -05:00
Drew Bailey	9adca240f8	Event Stream: Track ACL changes, unsubscribe on invalidating changes (#9447 ) * upsertaclpolicies * delete acl policies msgtype * upsert acl policies msgtype * delete acl tokens msgtype * acl bootstrap msgtype wip unsubscribe on token delete test that subscriptions are closed after an ACL token has been deleted Start writing policyupdated test * update test to use before/after policy * add SubscribeWithACLCheck to run acl checks on subscribe * update rpc endpoint to use broker acl check * Add and use subscriptions.closeSubscriptionFunc This fixes the issue of not being able to defer unlocking the mutex on the event broker in the for loop. handle acl policy updates * rpc endpoint test for terminating acl change * add comments Co-authored-by: Kris Hicks <khicks@hashicorp.com>	2020-12-01 11:11:34 -05:00
Drew Bailey	70ae7ec621	return potential errors from txn.Commit (#9483 )	2020-12-01 10:05:37 -05:00
Drew Bailey	a0b7f05a7b	Remove Managed Sinks from Nomad (#9470 ) * Remove Managed Sinks from Nomad Managed Sinks were a beta feature in Nomad 1.0-beta2. During the beta period it was determined that this was not a scalable approach to support community and third party sinks. * update comment * changelog	2020-11-30 14:00:31 -05:00
Tim Gross	b2cd0da0a2	CSI: fix transaction handling in state store (#9438 ) When making updates to CSI plugins, the state store methods that have open write transactions were querying the state store using the same methods used by the CSI RPC endpoint, but these method creates their own top-level read transactions. During concurrent plugin updates (as happens when a plugin job is stopped), this can cause write skew in the plugin counts. * Refactor the CSIPlugin query methods to have an implementation method that accepts a transaction, which can be called with either a read txn or a write txn. * Refactor the CSIVolume query methods to have an implementation method that accepts a transaction, which can be called with either a read txn or a write txn. * CSI volumes need to be "denormalized" with their plugins and (optionally) allocations. Read-only RPC endpoints should take a snapshot so that we can make multiple state store method calls with a consistent view.	2020-11-25 11:15:57 -05:00

1 2 3 4 5 ...

518 commits