open-nomad

Commit Graph

Author	SHA1	Message	Date
Drew Bailey	207068ca28	Events/event source node (#8918) * Node Register/Deregister event sourcing example upsert node with context fill in writetxnwithctx ctx passing to handle event type creation, wip test node deregistration event drop Node from registration event * node batch deregistration	2020-10-14 12:44:35 -04:00
Drew Bailey	4753904b90	Events/cfg enable publisher (#8916) * only enable publisher based on config * add default prune tick * back out state abandon changes on fsm close	2020-10-14 12:44:35 -04:00
Drew Bailey	f820744746	abandon current state on server shutdown	2020-10-14 12:44:34 -04:00
Drew Bailey	fddac3af00	Event Buffer Implementation adds an event buffer to hold events from raft changes. update events to use event buffer fix append call provide way to prune buffer items after TTL event publisher tests basic publish test wire up max item ttl rename package to stream, cleanup exploratory work subscription filtering subscription plumbing allow subscribers to consume events, handle closing subscriptions back out old exploratory ctx work fix lint remove unused ctx bits add a few comments fix test stop publisher on abandon	2020-10-14 12:44:34 -04:00
Chris Baker	1d35578bed	removed backwards-compatible/untagged metrics deprecated in 0.7	2020-10-13 20:18:39 +00:00
Seth Hoenig	ed13e5723f	consul/connect: dynamically select envoy sidecar at runtime As newer versions of Consul are released, the minimum version of Envoy it supports as a sidecar proxy also gets bumped. Starting with the upcoming Consul v1.9.X series, Envoy v1.11.X will no longer be supported. Current versions of Nomad hardcode a version of Envoy v1.11.2 to be used as the default implementation of Connect sidecar proxy. This PR introduces a change such that each Nomad Client will query its local Consul for a list of Envoy proxies that it supports (https://github.com/hashicorp/consul/pull/8545) and then launch the Connect sidecar proxy task using the latest supported version of Envoy. If the `SupportedProxies` API component is not available from Consul, Nomad will fallback to the old version of Envoy supported by old versions of Consul. Setting the meta configuration option `meta.connect.sidecar_image` or setting the `connect.sidecar_task` stanza will take precedence as is the current behavior for sidecar proxies. Setting the meta configuration option `meta.connect.gateway_image` will take precedence as is the current behavior for connect gateways. `meta.connect.sidecar_image` and `meta.connect.gateway_image` may make use of the special `${NOMAD_envoy_version}` variable interpolation, which resolves to the newest version of Envoy supported by the Consul agent. Addresses #8585 #7665	2020-10-13 09:14:12 -05:00
Tim Gross	4335d847a4	Allow job Version to start at non-zero value (#9071) Stop coercing version of new job to 0 in the state_store, so that we can add regions to a multi-region deployment. Send new version, rather than existing version, to MRD to accommodate version-choosing logic changes in ENT. Co-authored-by: Chris Baker <1675087+cgbaker@users.noreply.github.com>	2020-10-12 13:59:48 -04:00
Nick Ethier	d45be0b5a6	client: add NetworkStatus to Allocation (#8657)	2020-10-12 13:43:04 -04:00
Yoan Blanc	891accb89a	use allow/deny instead of the colored alternatives (#9019) Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-10-12 08:47:05 -04:00
Tim Gross	9b4917ae5f	csi: volumewatcher only needs one pass to collect past claims If a volume GC and a `nomad volume detach` command land concurrently, we can end up with multiple claims without an allocation, which results in extra no-op work when finding claims to collect as past claims.	2020-10-09 11:03:51 -04:00
Tim Gross	ec1e75d9f4	csi: remove stray TODO comment This item was completed in #8626	2020-10-09 11:03:51 -04:00
Tim Gross	e8c13a2307	csi: validate mount options during volume registration (#9044) Volumes using attachment mode `file-system` use the CSI filesystem API when they're mounted, and can be passed mount options. But `block-device` mode volumes don't have this option. When RPCs are made to plugins, we are silently dropping the mount options we don't expect to see, but this results in a poor operator experience when the mount options aren't honored. This changeset makes passing mount options to a `block-device` volume a validation error.	2020-10-08 09:23:21 -04:00
Tim Gross	3ceb5b36b1	csi: allow more than 1 writer claim for multi-writer mode (#9040) Fixes a bug where CSI volumes with the `MULTI_NODE_MULTI_WRITER` access mode were using the same logic as `MULTI_NODE_SINGLE_WRITER` to determine whether the volume had writer claims available for scheduling. Extends CSI claim endpoint test to exercise multi-reader and make sure `WriteFreeClaims` is exercised for multi-writer in feasibility test.	2020-10-07 10:43:23 -04:00
Seth Hoenig	0c5ae5769f	Merge pull request #9029 from hashicorp/b-tgs-updates consul/connect: trigger update as necessary on connect changes	2020-10-05 16:48:04 -05:00
Seth Hoenig	f44a4f68ee	consul/connect: trigger update as necessary on connect changes This PR fixes a long standing bug where submitting jobs with changes to connect services would not trigger updates as expected. Previously, service blocks were not considered as sources of destructive updates since they could be synced with consul non-destructively. With Connect, task group services that have changes to their connect block or to the service port should be destructive, since the network plumbing of the alloc is going to need updating. Fixes #8596 #7991 Non-destructive half in #7192	2020-10-05 14:53:00 -05:00
Chris Baker	7f701fddd0	updated docs and validation to further prohibit null chars in region, datacenter, and job name	2020-10-05 18:01:50 +00:00
Chris Baker	23ea7cd27c	updated job validate to refute job/group/task IDs containing null characters updated CHANGELOG and upgrade guide	2020-10-05 18:01:49 +00:00
Chris Baker	c8fd9428d4	documenting tests around null characters in job id, task group name, and task name	2020-10-05 18:01:49 +00:00
Fredrik Hoem Grelland	a015c52846	configure nomad cluster to use a Consul Namespace [Consul Enterprise] (#8849)	2020-10-02 14:46:36 -04:00
Michael Schurter	765473e8b0	jobspec: lower min cpu resources from 10->1 Since CPU resources are usually a soft limit it is desirable to allow setting it as low as possible to allow tasks to run only in "idle" time. Setting it to 0 is still not allowed to avoid potential unintentional side effects with allowing a zero value. While there may not be any side effects this commit attempts to minimize risk by avoiding the issue. This does not change the defaults.	2020-09-30 12:15:13 -07:00
Luiz Aoqui	88d4eecfd0	add scaling policy type	2020-09-29 17:57:46 -04:00
Seth Hoenig	af9543c997	consul: fix validation of task in group-level script-checks When defining a script-check in a group-level service, Nomad needs to know which task is associated with the check so that it can use the correct task driver to execute the check. This PR fixes two bugs: 1) validate service.task or service.check.task is configured 2) make service.check.task inherit service.task if it is itself unset Fixes #8952	2020-09-28 15:02:59 -05:00
Michael Schurter	9dd59ceaa7	core: improve job deregister error logging Noticed this error in some production logs, and they were far from helpful. Changes: 1. Include job ID in logs 2. Wrap errors and log once instead of double log lines 3. Test fsm error handling behavior	2020-09-21 08:59:03 -07:00
Pierre Cauchois	e4b739cafd	RPC Timeout/Retries account for blocking requests (#8921) The current implementation measures RPC request timeout only against config.RPCHoldTimeout, which is fine for non-blocking requests but will almost surely be exceeded by long-poll requests that block for minutes at a time. This adds a HasTimedOut method on the RPCInfo interface that takes into account whether the request is blocking, its maximum wait time, and the RPCHoldTimeout.	2020-09-18 08:58:41 -04:00
Seth Hoenig	57fc593363	consul/connect: validate group network on expose port injection In #7800, Nomad would automatically generate a port label for service checks making use of the expose feature, if the port was not already set. This change assumed the group network would be correctly defined (as is checked in a validation hook later). If the group network was not defined, a panic would occur on job submission. This change re-uses the group network validation helper to make sure the network is correctly defined before adding ports to it. Fixes #8875	2020-09-14 10:25:03 -05:00
Chris Baker	d0cc0a768b	Update nomad/job_endpoint.go	2020-09-10 17:18:23 -05:00
Chris Baker	eff726609d	move variable out of oss-only build into shared file, fixes ent compile error introduced by #8834	2020-09-10 22:08:25 +00:00
Yoan Blanc	48d07c4d12	fix: panic in test introduced by #8453 (#8834) Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-09-09 09:38:15 -04:00
Chris Baker	bfa366ea72	Update nomad/deployment_endpoint.go	2020-09-08 16:39:51 -05:00
Chris Baker	0e509bc11e	check ACLs against deployment namespace on Deployment.GetDeployment, filtering the deployment if the ACL isn't appropriate	2020-09-08 19:57:28 +00:00
Drew Bailey	28aa0387e9	remove node events for state track changing pr remove Txn and update calls with ReadTxn() constructor for changetrackerdb	2020-09-04 10:23:35 -04:00
Drew Bailey	d5f6d3b3c5	fix a few missed txn changes	2020-09-01 10:27:21 -04:00
Drew Bailey	9253146bf4	fix bad merge from scalingpoliciesbynamespace	2020-09-01 10:27:20 -04:00
Drew Bailey	45762d8df8	noop changetracker for snapshots	2020-09-01 10:27:20 -04:00
Drew Bailey	0af749c92e	Transaction change tracking This commit wraps memdb.DB with a changeTrackerDB, which is a thin wrapper around memdb.DB which enables go-memdb's TrackChanges on all write transactions. When the transaction is committed the changes are sent to an eventPublisher which will be used to create and emit change events. debugging TestFSM_ReconcileSummaries wip revert back rebase revert back rebase fix snapshot to actually use a snapshot	2020-09-01 10:27:20 -04:00
Jasmine Dahilig	71a694f39c	Merge pull request #8390 from hashicorp/lifecycle-poststart-hook task lifecycle poststart hook	2020-08-31 13:53:24 -07:00
Jasmine Dahilig	fbe0c89ab1	task lifecycle poststart: code review fixes	2020-08-31 13:22:41 -07:00
Mahmood Ali	117aec0036	Fix accidental broken clones Fix CSIMountOptions.Copy() and VolumeRequest.Copy() where they accidentally returned a reference to self rather than a deep copy. `&(*ref)` in Golang is apparently equivalent to plain `ref`.	2020-08-28 15:29:22 -04:00
Tim Gross	b77fe023b5	MRD: move 'job stop -global' handling into RPC (#8776) The initial implementation of global job stop for MRD looped over all the regions in the CLI for expedience. This changeset includes the OSS parts of moving this into the RPC layer so that API consumers don't have to implement this logic themselves.	2020-08-28 14:28:13 -04:00
Tim Gross	35b1b3bed7	structs: filter NomadTokenID from job diff (#8773) Multiregion deployments use the `NomadTokenID` to allow the deploymentwatcher to send RPCs between regions with the original submitter's ACL token. This ID should be filtered from diffs so that it doesn't cause a difference for purposes of job plans.	2020-08-28 13:40:51 -04:00
Lang Martin	7d483f93c0	csi: plugins track jobs in addition to allocations, and use job information to set expected counts (#8699) * nomad/structs/csi: add explicit job support * nomad/state/state_store: capture job updates directly * api/nodes: CSIInfo needs the AllocID * command/agent/csi_endpoint: AllocID was missing Co-authored-by: Tim Gross <tgross@hashicorp.com>	2020-08-27 17:20:00 -04:00
Seth Hoenig	c4fd1c97aa	Merge pull request #8761 from hashicorp/b-consul-op-token-check consul/connect: make use of task kind to determine service name in consul token checks	2020-08-27 14:08:33 -05:00
Tim Gross	606df14e78	MRD: deregister regions that are dropped on update (#8763) This changeset is the OSS hooks for what will be implemented in ENT.	2020-08-27 14:54:45 -04:00
Seth Hoenig	84176c9a41	consul/connect: make use of task kind to determine service name in consul token checks When consul.allow_unauthenticated is set to false, the job_endpoint hook validates that a `-consul-token` is provided and validates the token against the privileges inherent to a Consul Service Identity policy for all the Connect enabled services defined in the job. Before, the check was assuming the service was of type sidecar-proxy. This fixes the check to use the type of the task so we can distinguish between the different connect types.	2020-08-27 12:14:40 -05:00
Chris Baker	8b9145fabd	state_store/fix the prefix bugs for scaling policies documented in 1a9318	2020-08-27 04:25:37 +00:00
Chris Baker	655cbb4d3c	documenting tests for prefix bugs around job scaling policies	2020-08-27 03:22:13 +00:00
Seth Hoenig	9f1f2a5673	Merge branch 'master' into f-cc-ingress	2020-08-26 15:31:05 -05:00
Seth Hoenig	5d670c6d01	consul/connect: use context cancel more safely	2020-08-26 14:23:31 -05:00
Seth Hoenig	dfe179abc5	consul/connect: fixup some comments and context timeout	2020-08-26 13:17:16 -05:00
Mahmood Ali	45f549e29e	Merge pull request #8691 from hashicorp/b-reschedule-job-versions Respect alloc job version for lost/failed allocs	2020-08-25 18:02:45 -04:00
Mahmood Ali	def768728e	Have Plan.AppendAlloc accept the job	2020-08-25 17:22:09 -04:00
Mahmood Ali	18632955f2	clarify PathEscapesAllocDir specification Clarify how to handle prefix value and path traversal within the alloc dir but outside the prefix directory.	2020-08-24 20:44:26 -04:00
Mahmood Ali	9794760933	validate parameterized job request meta Fixes a bug where `keys` metadata wasn't populated, as we iterated over the empty newly-created `keys` map rather than the request Meta field.	2020-08-24 20:39:01 -04:00
Seth Hoenig	26e77623e5	consul/connect: fixup tests to use new consul sdk	2020-08-24 12:02:41 -05:00
Seth Hoenig	c4fa644315	consul/connect: remove envoy dns option from gateway proxy config	2020-08-24 09:11:55 -05:00
Yoan Blanc	327d17e0dc	fixup! vendor: consul/api, consul/sdk v1.6.0 Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-08-24 08:59:03 +02:00
Seth Hoenig	5b072029f2	consul/connect: add initial support for ingress gateways This PR adds initial support for running Consul Connect Ingress Gateways (CIGs) in Nomad. These gateways are declared as part of a task group level service definition within the connect stanza. ```hcl service { connect { gateway { proxy { // envoy proxy configuration } ingress { // ingress-gateway configuration entry } } } } ``` A gateway can be run in `bridge` or `host` networking mode, with the caveat that host networking necessitates manually specifying the Envoy admin listener (which cannot be disabled) via the service port value. Currently Envoy is the only supported gateway implementation in Consul, and Nomad only supports running Envoy as a gateway using the docker driver. Aims to address #8294 and tangentially #8647	2020-08-21 16:21:54 -05:00
Nick Ethier	3cd5f46613	Update UI to use new allocated ports fields (#8631) * nomad: canonicalize alloc shared resources to populate ports * ui: network ports * ui: remove unused task network references and update tests with new shared ports model * ui: lint * ui: revert auto formatting * ui: remove unused page objects * structs: remove unrelated test from bad conflict resolution * ui: formatting	2020-08-20 11:07:13 -04:00
Mahmood Ali	8a342926b7	Respect alloc job version for lost/failed allocs This change fixes a bug where lost/failed allocations are replaced by allocations with the latest versions, even if the version hasn't been promoted yet. Now, when generating a plan for lost/failed allocations, the scheduler first checks if the current deployment is in Canary stage, and if so, it ensures that any lost/failed allocation is replaced by one with the latest promoted version instead.	2020-08-19 09:52:48 -04:00
Tim Gross	1aa242c15a	failed core jobs should not have follow-ups (#8682) If a core job fails more than the delivery limit, the leader will create a new eval with the TriggeredBy field set to `failed-follow-up`. Evaluations for core jobs have the leader's ACL, which is not valid on another leader after an election. The `failed-follow-up` evals do not have ACLs, so core job evals that fail more than the delivery limit or core job evals that span leader elections will never succeed and will be re-enqueued forever. So we should not retry with a `failed-follow-up`.	2020-08-18 16:48:43 -04:00
Tim Gross	38ec70eb8d	multiregion: validation should always return error for OSS (#8687)	2020-08-18 15:35:38 -04:00
Lang Martin	e8a5565c1a	nomad/state/state_store: handle type conversion failure explicitly (#8660)	2020-08-12 17:53:12 -04:00
Michael Schurter	de08ae8083	test: add allocrunner test for poststart hooks	2020-08-12 09:54:14 -07:00
Mahmood Ali	c462f8d0d5	Merge pull request #8524 from hashicorp/b-vault-health-checks Skip checking Vault health	2020-08-11 16:01:07 -04:00
Lang Martin	a27913e699	CSI RPC Token (#8626) * client/allocrunner/csi_hook: use the Node SecretID * client/allocrunner/csi_hook: include the namespace for Claim	2020-08-11 13:08:39 -04:00
Lang Martin	c82b2a2454	CSI: volume and plugin allocations in the API (#8590) * command/agent/csi_endpoint: explicitly convert to API structs, and convert allocs for single object get endpoints	2020-08-11 12:24:41 -04:00
Tim Gross	def7084be7	msgpack-rpc errors cannot be wrapped (#8633) Our RPC calls mangle the errors we get, which prevents us from using wrapped errors and `errors.Is`. Also fixes log message fields.	2020-08-11 10:25:43 -04:00
Tim Gross	443fdaa86b	csi: nomad volume detach command (#8584) The soundness guarantees of the CSI specification leave a little to be desired in our ability to provide a 100% reliable automated solution for managing volumes. This changeset provides a new command to bridge this gap by providing the operator the ability to intervene. The command doesn't take an allocation ID so that the operator doesn't have to keep track of alloc IDs that may have been GC'd. Handle this case in the unpublish RPC by sending the client RPC for all the terminal/nil allocs on the selected node.	2020-08-11 10:18:54 -04:00
Tim Gross	fb27082e5c	RPC errors must be wrapped in order to wrap internal errors (#8632) The CSI client RPC uses error wrapping to detect the type of error bubbling up from plugins, but if the errors we get aren't wrapped at each layer, we can't unwrap the inner error. Also eliminates some unused args.	2020-08-11 09:13:52 -04:00
Lang Martin	f245ba91c4	nomad/state/state_store: two cases of incorrect CSIPlugin in-place (#8630)	2020-08-10 18:15:29 -04:00
Mahmood Ali	dce1dc44eb	distinguish between transient and persistent errors	2020-08-10 16:46:06 -04:00
Seth Hoenig	6ab3d21d2c	consul: validate script type when using check thresholds	2020-08-10 14:08:09 -05:00
Seth Hoenig	fd4804bf26	consul: able to set pass/fail thresholds on consul service checks This change adds the ability to set the fields `success_before_passing` and `failures_before_critical` on Consul service check definitions. This is a feature added to Consul v1.7.0 and later. https://www.consul.io/docs/agent/checks#success-failures-before-passing-critical Nomad doesn't do much besides pass the fields through to Consul. Fixes #6913	2020-08-10 14:08:09 -05:00
Tim Gross	e5496c7994	csi: missing plugins during node delete are not an error (#8619) When deregistering a client, CSI plugins running on that client may not get a chance to fingerprint before being stopped. Account for the case where a plugin allocation is the last instance of the plugin and has been deleted from the state store to avoid errors during node deregistration.	2020-08-10 11:02:01 -04:00
Mahmood Ali	628985c51e	Merge pull request #8613 from alrs/state-test-errs nomad/state: fix dropped scaling_policy test errors	2020-08-10 08:14:19 -04:00
Lars Lehtonen	f8a42f587f	nomad/state: fix dropped scaling_policy test errors	2020-08-07 23:05:33 -07:00
Tim Gross	69f4f171e5	CSI: fix missing ACL tokens for leader-driven RPCs (#8607) The volumewatcher and GC job in the leader can't make CSI RPCs when ACLs are enabled without the leader ACL token being passed thru.	2020-08-07 15:37:27 -04:00
Tim Gross	7d53ed88d6	csi: client RPCs should return wrapped errors for checking (#8605) When the client-side actions of a CSI client RPC succeed but we get disconnected during the RPC or we fail to checkpoint the claim state, we want to be able to retry the client RPC without getting blocked by the client-side state (ex. mount points) already having been cleaned up in previous calls.	2020-08-07 11:01:36 -04:00
Tim Gross	81b604fa13	csi: controller unpublish should check current alloc count (#8604) Using the count of node claims from earlier in the `CSIVolume.Unpublish` RPC doesn't correctly account for cases where the RPC was interrupted but checkpointed. Instead, we'll check the current allocation count and status to determine whether we need to send a controller unpublish.	2020-08-07 10:43:45 -04:00
Tim Gross	2854298089	csi: release claims via csi_hook postrun unpublish RPC (#8580) Add a Postrun hook to send the `CSIVolume.Unpublish` RPC to the server. This may forward client RPCs to the node plugins or to the controller plugins, depending on whether other allocations on this node have claims on this volume. By making clients responsible for running the `CSIVolume.Unpublish` RPC (and making the RPC available to a `nomad volume detach` command), the volumewatcher becomes only used by the core GC job and we no longer need async volume GC from job deregister and node update.	2020-08-06 14:51:46 -04:00
Michael Schurter	057e1c021f	Merge pull request #8597 from hashicorp/b-vault-revoke-log-line vault: log once per interval if batching revocation	2020-08-06 11:32:47 -07:00
Tim Gross	314458ebdb	csi: update volumewatcher to use unpublish RPC (#8579) This changeset updates `nomad/volumewatcher` to take advantage of the `CSIVolume.Unpublish` RPC. This lets us eliminate a bunch of code and associated tests. The raft batching code can be safely dropped, as the characteristic times of the CSI RPCs are on the order of seconds or even minutes, so batching up raft RPCs added complexity without any real world performance wins. Includes refactor w/ test cleanup and dead code elimination in volumewatcher	2020-08-06 14:31:18 -04:00
Tim Gross	eaa14ab64c	csi: add unpublish RPC (#8572) This changeset is plumbing for a `nomad volume detach` command that will be reused by the volumewatcher claim GC as well.	2020-08-06 13:51:29 -04:00
Tim Gross	4bbf18703f	csi: retry controller client RPCs on next controller (#8561) The documentation encourages operators to run multiple controller plugin instances for HA, but the client RPCs don't take advantage of this by retrying when the RPC fails in cases when the plugin is unavailable (because the node has drained or the alloc has failed but we haven't received an updated fingerprint yet). This changeset tries all known controllers on ready nodes before giving up, and adds tests that exercise the client RPC routing and retries.	2020-08-06 13:24:24 -04:00
Michael Schurter	2385fee0d2	vault: log once per interval if batching revocation This log line should be rare since: 1. Most tokens should be logged synchronously, not via this async batched method. Async revocation only takes place when Vault connectivity is lost and after leader election so no revocations are missed. 2. There should rarely be >1 batch (1,000) tokens to revoke since the above conditions should be brief and infrequent. 3. Interval is 5 minutes, so this log line will be emitted at most once every 5 minutes. What makes this log line rare is also what makes it interesting: due to a bug prior to Nomad 0.11.2 some tokens may never get revoked. Therefore Nomad tries to re-revoke them on every leader election. This caused a massive buildup of old tokens that would never be properly revoked and purged. Nomad 0.11.3 mostly fixed this but still had a bug in purging revoked tokens via Raft (fixed in #8553). The nomad.vault.distributed_tokens_revoked metric is only ticked upon successful revocation and purging, making any bugs or slowness in the process difficult to detect. Logging before a potentially slow revocation+purge operation is performed will give users much better indications of what activity is going on should the process fail to make it to the metric.	2020-08-05 15:39:21 -07:00
Mahmood Ali	490b9ce3a0	Handle Scaling Policies in Job Plan endpoint (#8567) Fixes https://github.com/hashicorp/nomad/issues/8544 This PR fixes a bug where using `nomad job plan ...` always reports no change if the submitted job contains scaling. The issue has three contributing factors: 1. The plan endpoint doesn't populate the required scaling policy ID; unlike the job register endpoint 2. The plan endpoint suppresses errors on job insertion - the job insertion fails here, because the scaling policy is missing the required ID 3. The scheduler reports no update necessary when the relevant job isn't in store (because the insertion failed) This PR fixes the first two factors. Changing the scheduler to be more strict might make sense, but may violate some idempotency invariant or make the scheduler more brittle.	2020-07-30 12:27:36 -04:00
Seth Hoenig	2511f48351	consul/connect: add support for bridge networks with connect native tasks Before, Connect Native Tasks needed one of these to work: - To be run in host networking mode - To have the Consul agent configured to listen to a unix socket - To have the Consul agent configured to listen to a public interface None of these are a great experience, though running in host networking is still the best solution for non-Linux hosts. This PR establishes a connection proxy between the Consul HTTP listener and a unix socket inside the alloc fs, bypassing the network namespace for any Connect Native task. Similar to and re-uses a bunch of code from the gRPC listener version for envoy sidecar proxies. Proxy is established only if the alloc is configured for bridge networking and there is at least one Connect Native task in the Task Group. Fixes #8290	2020-07-29 09:26:01 -05:00
Michael Schurter	80f521cce5	vault: expired tokens count toward batch limit As of 0.11.3 Vault token revocation and purging was done in batches. However the batch size was only limited by the number of non-expired tokens being revoked. Due to bugs prior to 0.11.3, expired tokens were not properly purged. Long-lived clusters could have thousands to millions of very old expired tokens that never got purged from the state store. Since these expired tokens did not count against the batch limit, very large batches could be created and overwhelm servers. This commit ensures expired tokens count toward the batch limit with this one line change: ``` - if len(revoking) >= toRevoke { + if len(revoking)+len(ttlExpired) >= toRevoke { ``` However, this code was difficult to test due to being in a periodically executing loop. Most of the changes are to make this one line change testable and test it.	2020-07-28 15:42:47 -07:00
Drew Bailey	bd421b6197	Merge pull request #8453 from hashicorp/oss-multi-vault-ns oss components for multi-vault namespaces	2020-07-27 08:45:22 -04:00
Drew Bailey	b296558b8e	oss components for multi-vault namespaces adds in oss components to support enterprise multi-vault namespace feature upgrade specific doc on vault multi-namespaces vault docs update test to reflect new error	2020-07-24 10:14:59 -04:00
James Rasell	da91e1d0fc	api: add namespace to scaling status GET response object.	2020-07-24 11:19:25 +02:00
Mahmood Ali	5d86f84c5a	test tweaks	2020-07-23 13:25:25 -04:00
Mahmood Ali	5f6162ba46	run revoke daemon if connection is successful	2020-07-23 13:08:16 -04:00
Mahmood Ali	48ebedb738	vault: simply make the API call Avoid checking if API is accessible, just make the API call and handle when it fails.	2020-07-23 11:33:08 -04:00
Tim Gross	d3341a2019	refactor: make it clear where we're accessing dstate The field name `Deployment.TaskGroups` contains a map of `DeploymentState`, which makes it a little harder to follow state updates when combined with inconsistent naming conventions, particularly when we also have the state store or actual `TaskGroup`s in scope. This changeset changes all uses to `dstate` so as not to be confused with actual TaskGroups.	2020-07-20 11:25:53 -04:00
Lang Martin	a3bfd8c209	structs: Job.Validate only allows stop_after_client_disconnected on batch and service jobs (#8444) * nomad/structs/structs: add to Job.Validate * Update nomad/structs/structs.go Co-authored-by: Mahmood Ali <mahmood@hashicorp.com> * nomad/structs/structs: match error strings to the config file * nomad/structs/structs_test: clarify the test a bit * nomad/structs/structs_test: typo in the test error comparison Co-authored-by: Mahmood Ali <mahmood@hashicorp.com>	2020-07-20 10:27:25 -04:00
Mahmood Ali	78568b8e63	Remove unused state.TestInitState	2020-07-20 09:55:55 -04:00
Mahmood Ali	a483dde8b9	minor tweaks from Ent	2020-07-20 09:25:09 -04:00
Mahmood Ali	5adbd9f666	enterprise specific state store objects	2020-07-20 09:22:26 -04:00
Mahmood Ali	ad2d484974	Set AgentShutdown	2020-07-17 11:04:57 -04:00
Mahmood Ali	647c5e4c03	Merge pull request #8435 from hashicorp/b-atomic-job-register Atomic eval insertion with job (de-)registration	2020-07-15 13:48:07 -04:00
Mahmood Ali	aa500f7ba3	comment compat concern in fsm.go	2020-07-15 11:23:49 -04:00
Mahmood Ali	f4a921f2be	no need to handle duplicate evals anymore	2020-07-15 11:14:49 -04:00
Mahmood Ali	a314744210	only set args.Eval after all servers upgrade We set the Eval field on job (de-)registration only after all servers get upgraded, to avoid dealing with duplicate evals.	2020-07-15 11:10:57 -04:00
Mahmood Ali	910776caf0	time.Now().UTC().UnixNano() -> time.Now().UnixNano()	2020-07-15 08:49:17 -04:00
Kurt Neufeld	62851f6ccb	fixed typo in output (#1)	2020-07-14 10:33:17 -06:00
Mahmood Ali	fbfe4ab1bd	Atomic eval insertion with job (de-)registration This fixes a bug where jobs may get "stuck" unprocessed that disproportionately affects periodic jobs around leadership transitions. When registering a job, the job registration and the eval to process it get applied to raft as two separate transactions; if the job registration succeeds but eval application fails, the job may remain unprocessed. Operators may detect such a failure when submitting a job update and getting a 500 error code, and they could retry; periodic job failures are more likely to go unnoticed, and no further periodic invocations will be processed until an operator forces an evaluation. This fixes the issue by ensuring that the job registration and eval application get persisted and processed atomically in the same raft log entry. Also, applies the same change to ensure atomicity in job deregistration. Backward Compatibility We must maintain compatibility in two scenarios: mixed clusters where a leader can handle atomic updates but followers cannot, and a recent cluster that processes old log entries from legacy or mixed cluster mode. To handle these constraints: ensure that the leader continues to emit the Evaluation log entry until all servers have upgraded; also, when processing raft logs, the servers honor evaluations found in both spots, the Eval in job (de-)registration and the eval update entries. When an updated server sees mixed-mode behavior where an eval is inserted into the raft log twice, it ignores the second instance. I made one compromise in consistency in the mixed-mode scenario: servers may disagree on the eval.CreateIndex value: the leader and updated servers will report the job registration index while old servers will report the index of the eval update log entry. This discrepancy doesn't seem to be material - it's the eval.JobModifyIndex that matters.	2020-07-14 11:59:29 -04:00
Tim Gross	bd457343de	MRD: all regions should start pending (#8433 ) Deployments should wait until kicked off by `Job.Register` so that we can assert that all regions have a scheduled deployment before starting any region. This changeset includes the OSS fixes to support the ENT work. `IsMultiregionStarter` has no more callers in OSS, so remove it here.	2020-07-14 10:57:37 -04:00
Tim Gross	0ce3c1e942	multiregion: allow empty region DCs (#8426 ) It's supposed to be possible for a region not to have `datacenters` set so that it can use the job's `datacenters` field. This requires that operators use the same DC name across multiple regions, but that's the default client configuration.	2020-07-13 13:34:19 -04:00
Nick Ethier	d171189afc	nomad: recanonicalize network after connect hook (#8407 ) * nomad: recanonicalize network after connect hook	2020-07-10 10:59:51 -04:00
Seth Hoenig	6fc63ede76	Merge pull request #7733 from jorgemarey/b-vault-policies Fix get all vault token policies	2020-07-09 10:05:59 -05:00
Seth Hoenig	f023df7b68	Merge pull request #8392 from hashicorp/f-infer-cn-taskname consul/connect: infer task name for native service if possible	2020-07-08 14:17:25 -05:00
Seth Hoenig	5be1679b86	Merge pull request #8338 from jorgemarey/b-fix-sidecar-task Change connectDriverConfig to be a func	2020-07-08 14:00:27 -05:00
Seth Hoenig	1a75da0ce0	consul/connect: infer task name in service if possible Before, the service definition for a Connect Native service would always require setting the `service.task` parameter. Now, that parameter is automatically inferred when there is only one task in the task group. Fixes #8274	2020-07-08 13:31:44 -05:00
Jasmine Dahilig	9e27231953	add poststart hook to task hook coordinator & structs	2020-07-08 11:01:35 -07:00
Tim Gross	ec96ddf648	fix swapped old/new multiregion plan diffs (#8378 ) The multiregion plan diffs swap the old and new versions for each region when they're edited (rather than added/removed). The `multiregionRegionDiff` function call incorrectly reversed its arguments for existing regions.	2020-07-08 10:10:50 -04:00
Jorge Marey	a3740cba9b	Change connectDriverConfig to be a func	2020-07-07 08:59:59 +02:00
Nick Ethier	e0fb634309	ar: support opting into binding host ports to default network IP (#8321 ) * ar: support opting into binding host ports to default network IP * fix config plumbing * plumb node address into network resource * struct: only handle network resource upgrade path once	2020-07-06 18:51:46 -04:00
Chris Baker	5b96c3d50e	Merge pull request #8360 from hashicorp/b-8355-better-scaling-validation better error handling around Scaling->Max	2020-07-06 11:32:02 -05:00
Chris Baker	5aa46e9a8f	modified state store to allow version skipping, to support multiregion version syncing also, passing existing version into multiregionRegister to support this	2020-07-06 14:16:55 +00:00
Lars Lehtonen	f32e80175d	nomad: fix dropped test error (#8356 )	2020-07-06 08:46:54 -04:00
Chris Baker	a77e012220	better testing of scaling parsing, fixed some broken tests by api changes	2020-07-04 19:32:37 +00:00
Chris Baker	9100b6b7c0	changes to make sure that Max is present and valid, to improve error messages * made api.Scaling.Max a pointer, so we can detect (and complain) when it is neglected * added checks to HCL parsing that it is present * when Scaling.Max is absent/invalid, don't return extraneous error messages during validation * tweak to multiregion handling to ensure that the count is valid on the interpolated regional jobs resolves #8355	2020-07-04 19:05:50 +00:00
Lang Martin	6c22cd587d	api: `nomad debug` new /agent/host (#8325 ) * command/agent/host: collect host data, multi platform * nomad/structs/structs: new HostDataRequest/Response * client/agent_endpoint: add RPC endpoint * command/agent/agent_endpoint: add Host * api/agent: add the Host endpoint * nomad/client_agent_endpoint: add Agent Host with forwarding * nomad/client_agent_endpoint: use findClientConn This changes forwardMonitorClient and forwardProfileClient to use findClientConn, which was cribbed from the common parts of those funcs. * command/debug: call agent hosts * command/agent/host: eliminate calling external programs	2020-07-02 09:51:25 -04:00
Tim Gross	23be116da0	csi: add -force flag to volume deregister (#8295 ) The `nomad volume deregister` command currently returns an error if the volume has any claims, but in cases where the claims can't be dropped because of plugin errors, providing a `-force` flag gives the operator an escape hatch. If the volume has no allocations or if they are all terminal, this flag deletes the volume from the state store, immediately and implicitly dropping all claims without further CSI RPCs. Note that this will not also unmount/detach the volume, which we'll make the responsibility of a separate `nomad volume detach` command.	2020-07-01 12:17:51 -04:00
Mahmood Ali	7f460d2706	allocrunner: terminate sidecars in the end This fixes a bug where a batch allocation fails to complete if it has sidecars. If the only remaining running tasks in an allocation are sidecars - we must kill them and mark the allocation as complete.	2020-06-29 15:12:15 -04:00
Drew Bailey	01e2cc5054	allow ClusterMetadata to accept a watchset (#8299 ) * allow ClusterMetadata to accept a watchset * use nil instead of empty watchset	2020-06-26 13:23:32 -04:00
Mahmood Ali	49a177ce28	Merge pull request #8017 from hashicorp/f-change-sched-updated Set Updated to true for all non-CAS requests on v1/operator/scheduler/configuration	2020-06-26 08:39:37 -04:00
Mahmood Ali	6605ebd314	Merge pull request #8223 from hashicorp/f-multi-network-validate-ports core: validate port numbers are < 65535	2020-06-26 08:31:01 -04:00
Nick Ethier	89118016fc	command: correctly show host IP in ports output /w multi-host networks (#8289 )	2020-06-25 15:16:01 -04:00
Tim Gross	67ffcb35e9	multiregion: add support for 'job plan' (#8266 ) Add a scatter-gather for multiregion job plans. Each region's servers interpolate the plan locally in `Job.Plan` but don't distribute the plan as done in `Job.Run`. Note that it's not possible to return a usable modify index from a multiregion plan for use with `-check-index`. Even if we were to force the modify index to be the same at the start of `Job.Run` the index immediately drifts during each region's deployments, depending on events local to each region. So we omit this section of a multiregion plan.	2020-06-24 13:24:55 -04:00
Tim Gross	a449009e9f	multiregion validation fixes (#8265 ) Multi-region jobs need to bypass validating counts otherwise we get spurious warnings in Job.Plan.	2020-06-24 12:18:51 -04:00
Seth Hoenig	3872b493e5	Merge pull request #8011 from hashicorp/f-cnative-host consul/connect: implement initial support for connect native	2020-06-24 10:33:12 -05:00
Seth Hoenig	011c6b027f	connect/native: doc and comment tweaks from PR	2020-06-24 10:13:22 -05:00
Michael Schurter	7869ebc587	docs: add comments to structs.Port struct	2020-06-23 11:38:01 -07:00
Michael Schurter	13ed710a04	core: validate port numbers are <= 65535 The scheduler returns a very strange error if it detects a port number out of range. If these would somehow make it to the client they would overflow when converted to an int32 and could cause conflicts.	2020-06-23 11:31:49 -07:00
Seth Hoenig	6c5ab7f45e	consul/connect: split connect native flag and task in service	2020-06-23 10:22:22 -05:00
Seth Hoenig	4d71f22a11	consul/connect: add support for running connect native tasks This PR adds the capability of running Connect Native Tasks on Nomad, particularly when TLS and ACLs are enabled on Consul. The `connect` stanza now includes a `native` parameter, which can be set to the name of the task that backs the Connect Native Consul service. There is a new Client configuration parameter for the `consul` stanza called `share_ssl`. Like `allow_unauthenticated` the default value is true, but recommended to be disabled in production environments. When enabled, the Nomad Client's Consul TLS information is shared with Connect Native tasks through the normal Consul environment variables. This does NOT include auth or token information. If Consul ACLs are enabled, Service Identity Tokens are automatically injected into the Connect Native task through the CONSUL_HTTP_TOKEN environment variable. Any of the automatically set environment variables can be overridden by the Connect Native task using the `env` stanza. Fixes #6083	2020-06-22 14:07:44 -05:00
Mahmood Ali	862834a792	tests: add all namespaces test for allocations	2020-06-22 10:26:08 -04:00
Michael Schurter	562704124d	Merge pull request #8208 from hashicorp/f-multi-network multi-interface network support	2020-06-19 15:46:48 -07:00
Nick Ethier	fb9c458df1	nomad/mock: add NodeNetworkResources to mock Node	2020-06-19 14:22:24 -04:00
Nick Ethier	a87e91e971	test: fix up testing around host networks	2020-06-19 13:53:31 -04:00
Nick Ethier	f0ac1f027a	lint: spelling	2020-06-19 11:29:41 -04:00
Tim Gross	b654e1b8a4	multiregion: all regions start in running if no max_parallel (#8209 ) If `max_parallel` is not set, all regions should begin in a `running` state rather than a `pending` state. Otherwise the first region is set to `running` and then all the remaining regions once it enters `blocked`. That behavior is technically correct in that we have at most `max_parallel` regions running, but definitely not what a user expects.	2020-06-19 11:17:09 -04:00
Nick Ethier	f0559a8162	multi-interface network support	2020-06-19 09:42:10 -04:00
Tim Gross	8a354f828f	store ACL Accessor ID from Job.Register with Job (#8204 ) In multiregion deployments when ACLs are enabled, the deploymentwatcher needs an appropriately scoped ACL token with the same `submit-job` rights as the user who submitted it. The token will already be replicated, so store the accessor ID so that it can be retrieved by the leader.	2020-06-19 07:53:29 -04:00
Mahmood Ali	38a01c050e	Merge pull request #8192 from hashicorp/f-status-allnamespaces-2 CLI Allow querying all namespaces for jobs and allocations - Try 2	2020-06-18 20:16:52 -04:00
Nick Ethier	4a44deaa5c	CNI Implementation (#7518 )	2020-06-18 11:05:29 -07:00
Nick Ethier	0bc0403cc3	Task DNS Options (#7661 ) Co-Authored-By: Tim Gross <tgross@hashicorp.com> Co-Authored-By: Seth Hoenig <shoenig@hashicorp.com>	2020-06-18 11:01:31 -07:00
Mahmood Ali	c0aa06d9c7	rpc: allow querying allocs across namespaces This implements the backend handling for querying across namespaces for allocation list endpoints.	2020-06-17 16:31:06 -04:00
Mahmood Ali	e784fe331a	use '*' to indicate all namespaces This reverts the introduction of AllNamespaces parameter that was merged earlier but never got released.	2020-06-17 16:27:43 -04:00
Tim Gross	81ae581da6	test: remove flaky test from volumewatcher (#8189 ) The volumewatcher restores itself on notification, but detecting this is racy because it may reap any claim (or find there are no claims to reap) and shutdown before we can test whether it's running. This appears to have become flaky with a new version of golang. The other cases in this test case sufficiently exercise the start/stop behavior of the volumewatcher, so remove the flaky section.	2020-06-17 15:41:51 -04:00
Chris Baker	fe9d654640	Merge pull request #8187 from hashicorp/f-8143-block-scaling-during-deployment modify Job.Scale RPC to return an error if there is an active deployment	2020-06-17 14:38:55 -05:00
Chris Baker	cd903218f7	added changelog entry and satisfied `make check`	2020-06-17 17:43:45 +00:00
Chris Baker	ab2b15d8cb	modify Job.Scale RPC to return an error if there is an active deployment resolves #8143	2020-06-17 17:03:35 +00:00
Tim Gross	6b1cb61888	remove test for ent-only behavior	2020-06-17 11:27:29 -04:00
Tim Gross	c14a75bfab	multiregion: use pending instead of paused The `paused` state is used as an operator safety mechanism, so that they can debug a deployment or halt one that's causing a wider failure. By using the `paused` state as the first state of a multiregion deployment, we risked resuming an intentionally operator-paused deployment because of activity in a peer region. This changeset replaces the use of the `paused` state with a `pending` state, and provides a `Deployment.Run` internal RPC to replace the use of the `Deployment.Pause` (resume) RPC we were using in `deploymentwatcher`.	2020-06-17 11:06:14 -04:00
Tim Gross	fd50b12ee2	multiregion: integrate with deploymentwatcher * `nextRegion` should take status parameter * thread Deployment/Job RPCs thru `nextRegion` * add `nextRegion` calls to `deploymentwatcher` * use a better description for paused for peer	2020-06-17 11:06:00 -04:00
Tim Gross	7b12445f29	multiregion: change AutoRevert to OnFailure	2020-06-17 11:05:45 -04:00
Tim Gross	5c4d0a73f4	start all but first region deployment in paused state	2020-06-17 11:05:34 -04:00
Tim Gross	48e9f75c1e	multiregion: deploymentwatcher hooks This changeset establishes hooks in deploymentwatcher for multiregion deployments (for the enterprise version of Nomad).	2020-06-17 11:05:18 -04:00
Tim Gross	b09b7a2475	Multiregion job registration Integration points for multiregion jobs to be registered in the enterprise version of Nomad: * hook in `Job.Register` for enterprise to send job to peer regions * remove monitoring from `nomad job run` and `nomad job stop` for multiregion jobs	2020-06-17 11:04:58 -04:00
Drew Bailey	9263fcb0d3	Multiregion deploy status and job status CLI	2020-06-17 11:03:34 -04:00
Tim Gross	473a0f1d44	multiregion: unblock and cancel RPCs	2020-06-17 11:02:26 -04:00
Tim Gross	ede3a4f1c4	multiregion: request structs	2020-06-17 11:00:34 -04:00
Tim Gross	6851024925	Multiregion structs Initial struct definitions, jobspec parsing, validation, and conversion between Nomad structs and API structs for multi-region deployments.	2020-06-17 11:00:14 -04:00
Chris Baker	9fc66bc1aa	support in API client and Job.Register RPC for PreserveCounts	2020-06-16 18:45:28 +00:00
Chris Baker	1e3563e08c	wip: added PreserveCounts to struct.JobRegisterRequest, development test for Job.Register	2020-06-16 18:45:17 +00:00
Chris Baker	7ed06cced0	core: update Job.Scale to save the previous job count in the ScalingEvent	2020-06-15 19:49:22 +00:00
Chris Baker	aeb3ed449e	wip: added .PreviousCount to api.ScalingEvent and structs.ScalingEvent, with developmental tests	2020-06-15 19:40:21 +00:00
Mahmood Ali	c17ffb2d35	Merge pull request #8131 from hashicorp/f-snapshot-restore Implement snapshot restore	2020-06-15 08:32:34 -04:00
Mahmood Ali	9bfc3e28d9	Apply suggestions from code review Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2020-06-15 08:32:16 -04:00
Lang Martin	069840bef8	scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect (#8105 ) (#8138 ) * scheduler/reconcile: set FollowupEvalID on lost stop_after_client_disconnect * scheduler/reconcile: thread follupEvalIDs through to results.stop * scheduler/reconcile: comment typo * nomad/_test: correct arguments for plan.AppendStoppedAlloc * scheduler/reconcile: avoid nil, cleanup handleDelayed(Lost\|Reschedules)	2020-06-09 17:13:53 -04:00
Mahmood Ali	63e048e972	clarify comments, esp related to leadership code	2020-06-09 12:01:31 -04:00
Mahmood Ali	b543460e0a	loosen raft timeout	2020-06-07 16:38:11 -04:00
Mahmood Ali	69bb42acf8	tests: prefix agent logs to identify agent sources	2020-06-07 16:38:11 -04:00
Mahmood Ali	47a163b63f	reassert leadership	2020-06-07 15:47:06 -04:00
Mahmood Ali	9eb13ae144	basic snapshot restore	2020-06-07 15:46:23 -04:00
Mahmood Ali	bf7a3583e5	Merge pull request #8089 from hashicorp/b-leader-worker-count leadership: pause and unpause workers consistently	2020-06-04 12:01:01 -04:00
Mahmood Ali	cd8e1b4d62	stop periodic dispatch at end of tests (#8111 )	2020-06-04 09:15:00 -04:00
Lang Martin	ac7c39d3d3	Delayed evaluations for `stop_after_client_disconnect` can cause unwanted extra followup evaluations around job garbage collection (#8099 ) * client/heartbeatstop: reversed time condition for startup grace * scheduler/generic_sched: use `delayInstead` to avoid a loop Without protecting the loop that creates followUpEvals, a delayed eval is allowed to create an immediate subsequent delayed eval. For both `stop_after_client_disconnect` and the `reschedule` block, a delayed eval should always produce some immediate result (running or blocked) and then only after the outcome of that eval produce a second delayed eval. * scheduler/reconcile: lostLater are different than delayedReschedules Just slightly. `lostLater` allocs should be used to create batched evaluations, but `handleDelayedReschedules` assumes that the allocations are in the untainted set. When it creates the in-place updates to those allocations at the end, it causes the allocation to be treated as running over in the planner, which causes the initial `stop_after_client_disconnect` evaluation to be retried by the worker.	2020-06-03 09:48:38 -04:00
Mahmood Ali	70fbcb99c2	leadership: pause and unpause workers consistently This fixes a bug where leadership establishment pauses 3/4 of workers but stepping down unpauses only 1/2!	2020-06-01 10:57:53 -04:00
Mahmood Ali	891fb3f8a9	test for paused workers upon leadership revocation	2020-06-01 10:48:42 -04:00
Mahmood Ali	de44d9641b	Merge pull request #8047 from hashicorp/f-snapshot-save API for atomic snapshot backups	2020-06-01 07:55:16 -04:00
Mahmood Ali	e37a3312d5	If leadership fails, consider it handled The callers for `forward` and old implementation expect failures to be accompanied with a true value! This fixes the issue and have tests passing!	2020-05-31 22:06:17 -04:00
Mahmood Ali	30ab9c84e5	more review feedback	2020-05-31 21:39:09 -04:00
Mahmood Ali	a73cd01a00	Merge pull request #8001 from hashicorp/f-jobs-list-across-nses endpoint to expose all jobs across all namespaces	2020-05-31 21:28:03 -04:00
Mahmood Ali	082c085068	Merge pull request #8036 from hashicorp/f-background-vault-revoke-on-restore Speed up leadership establishment	2020-05-31 21:27:16 -04:00
Mahmood Ali	1af32e65bc	clarify rpc consistency readiness comment	2020-05-31 21:26:41 -04:00
Mahmood Ali	0819ea60ea	Apply suggestions from code review Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2020-05-31 21:04:39 -04:00
Mahmood Ali	37c6160b96	Handle nil/empty cluster metadata Handle case where a snapshot is made before cluster metadata is created. This fixes a bug where a server may have empty cluster metadata if it created and installed a Raft snapshot before a new cluster metadata ID is generated. This case is very unlikely to arise. Most likely reason is when upgrading from an old version slowly where servers may use snapshots before all servers upgrade. This happened for a user with a log line like: ``` 2020-05-21T15:21:56.996Z [ERROR] nomad.fsm: ClusterSetMetadata failed: error=""set cluster metadata failed: refusing to set new cluster id, previous: , new: <<redacted> ```	2020-05-29 13:34:21 -04:00
Drew Bailey	23d24c7a7f	removes pro tags (#8014 )	2020-05-28 15:40:17 -04:00
Mahmood Ali	475b3b77ad	Merge pull request #8060 from hashicorp/tests-deflake-20200526 Deflake some tests - 2020-05-27 edition	2020-05-27 15:24:31 -04:00
Drew Bailey	34871f89be	Oss license support for ent builds (#8054 ) * changes necessary to support oss licensing shims revert nomad fmt changes update test to work with enterprise changes update tests to work with new ent enforcements make check update cas test to use scheduler algorithm back out preemption changes add comments * remove unused method	2020-05-27 13:46:52 -04:00
Mahmood Ali	61e4f5aaf9	tests: use GreaterOrEqual and apply change to other tests	2020-05-27 11:22:48 -04:00
Mahmood Ali	6dfe0f5d3b	tests: use t.Fatalf when it's clearer	2020-05-27 10:09:56 -04:00
Mahmood Ali	ec1fcedb93	tests: node drain events may be duplicated	2020-05-27 08:59:06 -04:00
Mahmood Ali	c3c2a85314	tests: wait until clients are in the state store	2020-05-26 18:53:24 -04:00
Mahmood Ali	5d80d2a511	tests: eval may be processed quickly	2020-05-26 18:53:24 -04:00
Mahmood Ali	19141f8103	{volume\|deployment}watcher: check for nil batcher	2020-05-26 14:54:27 -04:00
Mahmood Ali	81ac098a22	deploymentwatcher: no batcher when disabling When disabling deploymentwatcher (at the end of a test), avoid starting a new update batcher with its new goroutine.	2020-05-26 14:44:47 -04:00
Mahmood Ali	ccc89f940a	terminate leader goroutines on shutdown Ensure that nomad steps down (and terminate leader goroutines) on shutdown, when the server is the leader. Without this change, `monitorLeadership` may handle `shutdownCh` event and exit early before handling the raft `leaderCh` event and end up leaking leadership goroutines.	2020-05-26 10:18:10 -04:00
Mahmood Ali	e671913e56	fix a trace logline	2020-05-26 10:18:09 -04:00
Mahmood Ali	1c79c3b93d	refactor: context is first parameter By convention, go functions take `context.Context` as the first argument.	2020-05-26 10:18:09 -04:00
Mahmood Ali	1eff8b0ed8	volumewatcher: no batcher when disabling When disabling volumewatcher (at the end of a test), avoid starting a new update batcher with its new goroutine.	2020-05-26 10:18:09 -04:00
Mahmood Ali	b895cef622	always set purgeFunc purgeFunc cannot be nil, so ensure it's set to a no-op function in tests.	2020-05-21 21:05:53 -04:00
Mahmood Ali	2108681c1d	Endpoint for snapshotting server state	2020-05-21 20:04:38 -04:00
Mahmood Ali	fbe140b26c	vault: ensure ttl expired tokens are purged If a token scheduled for revocation expires before we revoke it, ensure that it is marked as purged in raft and is only removed from local vault state if the purge operation succeeds. Prior to this change, we may remove the accessor from local state but not purge it from Raft. This causes unnecessary churn in the next leadership elections (and until 0.11.2 resulted in indefinite retries).	2020-05-21 19:54:50 -04:00
Mahmood Ali	aa8e79e55b	Reorder leadership handling Start serving RPC immediately after leader components are enabled, and move clean up to the bottom as they don't block leadership responsibilities.	2020-05-21 08:30:31 -04:00
Mahmood Ali	1cf1114627	apply the same change to consul revocation	2020-05-21 08:30:31 -04:00
Mahmood Ali	1399d02f45	rate limit revokeDaemon	2020-05-21 08:30:31 -04:00
Mahmood Ali	6e749d12a0	on leadership establishment, revoke Vault tokens in background Establishing leadership should be very fast and never make external API calls. This fixes a situation where there is a long backlog of Vault tokens to be revoked when leadership is gained. In such a case, revoking the tokens will significantly slow down leadership establishment and slow down processing. Worse, the revocation call does not honor leadership `stopCh` signals, so it will not stop when the leader loses leadership.	2020-05-21 07:38:27 -04:00
Tim Gross	72430a4e62	csi: don't pass volume claim releases thru GC eval (#8021 ) Following the new volumewatcher in #7794 and performance improvements to it that landed afterwards, there's no particular reason we should be threading claim releases through the GC eval rather than writing an empty `CSIVolumeClaimRequest` with the mode set to `CSIVolumeClaimRelease`, just as the GC evaluation would do. Also, by batching up these raft messages, we can reduce the amount of raft writes by 1 and cross-server RPCs by 1 per volume we release claims on.	2020-05-20 15:22:51 -04:00
Tim Gross	3902709c0a	csi: check for empty arguments on CSI endpoint (#8027 ) Some of the CSI RPC endpoints were missing validation that the ID or the Volume definition was present. This could result in nonsense `CSIVolume` structs being written to raft during registration. This changeset corrects that bug and adds validation checks to present nicer error messages to operators in some other cases.	2020-05-20 10:22:24 -04:00
Charlie Voiselle	70303c906c	Simplify comments Co-authored-by: Mahmood Ali <mahmood@hashicorp.com>	2020-05-19 15:05:24 -04:00
Charlie Voiselle	6976a7699e	Set Updated to true for all non-CAS requests	2020-05-19 12:59:39 -04:00
Mahmood Ali	406fce90c3	list all jobs on namespaces the token can access	2020-05-19 09:51:41 -04:00
Seth Hoenig	f6c8db8a8a	consul/connect: use task kind to get service name Fixes #8000 When requesting a Service Identity token from Consul, use the TaskKind of the Task to get at the service name associated with the task. In the past using the TaskName worked because it was generated as a sidecar task with a name that included the service. In the Native context, we need to get at the service name in a more correct way, i.e. using the TaskKind which is defined to include the service name.	2020-05-18 13:46:00 -06:00
Mahmood Ali	5ab2d52e27	endpoint to expose all jobs across all namespaces Allow a `/v1/jobs?all_namespaces=true` to list all jobs across all namespaces. The returned list is to contain a `Namespace` field indicating the job namespace. If ACL is enabled, the request token needs to be a management token or have `namespace:list-jobs` capability on all existing namespaces.	2020-05-18 13:50:46 -04:00
Tim Gross	2082cf738a	csi: support for VolumeContext and VolumeParameters (#7957 ) The MVP for CSI in the 0.11.0 release of Nomad did not include support for opaque volume parameters or volume context. This changeset adds support for both. This also moves args for ControllerValidateCapabilities into a struct. The CSI plugin `ControllerValidateCapabilities` struct that we turn into a CSI RPC is accumulating arguments, so moving it into a request struct will reduce the churn of this internal API, make the plugin code more readable, and make this method consistent with the other plugin methods in that package.	2020-05-15 08:16:01 -04:00
Mahmood Ali	b385a5d063	Merge pull request #7959 from hashicorp/b-deleted-vault-accessors vault: ensure that token revocation is idempotent	2020-05-14 12:39:06 -04:00
Mahmood Ali	077342c528	vault: ensure that token revocation is idempotent This ensures that token revocation is idempotent and can handle when tokens are revoked out of band. Idempotency is important to handle some transient failures and retries. Consider when a single token of a batch fails to be revoked, nomad would retry revoking the entire batch; tokens already revoked should be gracefully handled, otherwise, nomad may retry revoking the same tokens forever.	2020-05-14 11:30:32 -04:00
Mahmood Ali	6ac166e1aa	vault: failing test for repeated revocation	2020-05-14 11:30:29 -04:00
Lang Martin	d3c4700cd3	server: stop after client disconnect (#7939 ) * jobspec, api: add stop_after_client_disconnect * nomad/state/state_store: error message typo * structs: alloc methods to support stop_after_client_disconnect 1. a global AllocStates to track status changes with timestamps. We need this to track the time at which the alloc became lost originally. 2. ShouldClientStop() and WaitClientStop() to actually do the math * scheduler/reconcile_util: delayByStopAfterClientDisconnect * scheduler/reconcile: use delayByStopAfterClientDisconnect * scheduler/util: updateNonTerminalAllocsToLost comments This was setup to only update allocs to lost if the DesiredStatus had already been set by the scheduler. It seems like the intention was to update the status from any non-terminal state, and not all lost allocs have been marked stop or evict by now * scheduler/testing: AssertEvalStatus just use require * scheduler/generic_sched: don't create a blocked eval if delayed * scheduler/generic_sched_test: several scheduling cases	2020-05-13 16:39:04 -04:00
Mahmood Ali	3b4116e0db	Merge pull request #7894 from hashicorp/b-cronexpr-dst-fix Fix Daylight saving transition handling	2020-05-12 16:36:11 -04:00
Tim Gross	4374c1a837	csi: support Secrets parameter in CSI RPCs (#7923 ) CSI plugins can require credentials for some publishing and unpublishing workflow RPCs. Secrets are configured at the time of volume registration, stored in the volume struct, and then passed around as an opaque map by Nomad to the plugins.	2020-05-11 17:12:51 -04:00
Mahmood Ali	938e916d9c	When serializing msgpack, only consider codec tag When serializing structs with msgpack, only consider type tags of `codec`. Hashicorp/go-msgpack (based on ugorji/go) defaults to interpreting `codec` tag if it's available, but falls to using `json` if `codec` isn't present. This behavior is surprising in cases where we want to serialize json differently from msgpack, e.g. serializing `ConsulExposeConfig`.	2020-05-11 14:14:10 -04:00
Mahmood Ali	b4fa8e9588	codec: we use hashicorp/go-msgpack exclusively No need to maintain two msgpack handles!	2020-05-11 14:05:29 -04:00
Tim Gross	6554e9ee37	csi: log fallthrough on invalid node IDs for client RPC (#7918 ) When a CSI client RPC is given a specific node for a controller but the lookup fails (because the node is gone or is an older version), we fallthrough to select a node from all those available. This adds logging to this case to aid in diagnostics.	2020-05-11 12:26:10 -04:00
Tim Gross	1ec41b6770	volumewatcher: stop watcher goroutines when there's no work (#7909 ) The watcher goroutines will be automatically started if a volume has updates, but when idle we shouldn't keep a goroutine running and taking up memory.	2020-05-11 09:32:05 -04:00
Mahmood Ali	061a439f2c	Merge pull request #7912 from hashicorp/f-scheduler-algorithm-followup Scheduler Algorithm Defaults handling and docs	2020-05-11 09:30:58 -04:00
Mahmood Ali	0384543d05	Merge pull request #7913 from hashicorp/deflake-TestTaskTemplateManager_BlockedEvents Deflake TestTaskTemplateManager_BlockedEvents test	2020-05-11 09:30:44 -04:00
Mahmood Ali	dff0fcf2f3	Merge pull request #7914 from hashicorp/b-csi-fix-slice-initialization Fix slice initialization	2020-05-11 09:27:01 -04:00
Tim Gross	3aa761b151	Periodic GC for volume claims (#7881 ) This changeset implements a periodic garbage collection of CSI volumes with missing allocations. This can happen in a scenario where a node update fails partially and the allocation updates are written to raft but the evaluations to GC the volumes are dropped. This feature will cover this edge case and ensure that upgrades from 0.11.0 and 0.11.1 get any stray claims cleaned up.	2020-05-11 08:20:50 -04:00
James Rasell	aaf2fe033e	Merge pull request #7903 from hashicorp/b-gh-7902 api: validate scale count value is not negative.	2020-05-11 09:17:01 +02:00
Mahmood Ali	9fac6ea5d9	Fix slice initialization	2020-05-09 21:35:42 -04:00
Mahmood Ali	64de395df0	tests: ease debugging TestClientEndpoint_CreateNodeEvals TestClientEndpoint_CreateNodeEvals flakes a bit but its output is very confusing, as `structs.Evaluations` overrides GoString. Here, we emit the entire struct of the evaluation, and hopefully we'll figure out the problem the next time it happens	2020-05-09 16:04:32 -04:00
Mahmood Ali	ff5c3e81b0	avoid logging after a test completes	2020-05-09 14:40:00 -04:00
Mahmood Ali	2c963885b0	handle upgrade path and defaults Ensure that `""` Scheduler Algorithm gets explicitly set to binpack on upgrades or on API handling when user misses the value. The scheduler already treats `""` value as binpack. This PR merely ensures that the operator API returns the effective value.	2020-05-09 12:34:08 -04:00
Tim Gross	8373e917fc	volumewatcher: set maximum batch size for raft update (#7907 ) The `volumewatcher` has a 250ms batch window so claim updates will not typically be large enough to risk exceeding the maximum raft message size. But large jobs might have enough volume claims that this could be a danger. Set a maximum batch size of 100 messages per batch (roughly 33K), as a very conservative safety/robustness guard. Co-authored-by: Chris Baker <1675087+cgbaker@users.noreply.github.com>	2020-05-08 16:53:57 -04:00
James Rasell	55a2ad3854	api: validate scale count value is not negative. An operator could submit a scale request including a negative count value. This negative value caused the Nomad server to panic. The fix adds validation to the submitted count, returning an error to the caller if it is negative.	2020-05-08 16:51:40 +02:00
Mahmood Ali	57435950d7	Update current DST and some code style issues	2020-05-07 19:27:05 -04:00
Mahmood Ali	c8fb132956	Update cronexpr to point to hashicorp/cronexpr	2020-05-07 17:50:45 -04:00
Mahmood Ali	507c0b8f64	tests for periodic job scheduling and DST	2020-05-07 17:36:59 -04:00
Tim Gross	42f9d517d8	CSI volumewatcher testability improvements (#7889 ) * volumewatcher: remove redundant log fields The constructor for `volumeWatcher` already sets a `logger.With` that includes the volume ID and namespace fields. Remove them from the various trace logs. * volumewatcher: advance state for controller already released One way of bypassing client RPCs in testing is to set a claim status to controller-detached, but this results in an incorrect claim state when we checkpoint.	2020-05-07 15:57:24 -04:00
Tim Gross	801ebcfe8d	periodic GC for CSI plugins (#7878 ) This changeset implements a periodic garbage collection of unused CSI plugins. Plugins are self-cleaning when the last allocation for a plugin is stopped, but this feature will cover any missing edge cases and ensure that upgrades from 0.11.0 and 0.11.1 get any stray plugins cleaned up.	2020-05-06 16:49:12 -04:00
Tim Gross	00c9bd7ff0	reorder volume claim batch request raft message (#7871 ) For backwards compatibility during upgrades, new raft message types need to come at the end of the enum.	2020-05-06 08:57:51 -04:00
Tim Gross	ce86a594a6	csi: fix plugin counts on node update (#7844 ) In this changeset: * If a Nomad client node is running both a controller and a node plugin (which is a common case), then if only the controller or the node is removed, the plugin was not being updated with the correct counts. * The existing test for plugin cleanup didn't go back to the state store, which normally is ok but is complicated in this case by denormalization which changes the behavior. This commit makes the test more comprehensive. * Set "controller required" when plugin has `PUBLISH_READONLY`. All known controllers that support `PUBLISH_READONLY` also support `PUBLISH_UNPUBLISH_VOLUME` but we shouldn't assume this. * Only create plugins when the allocs for those plugins are healthy. If we allow a plugin to be created for the first time when the alloc is not healthy, then we'll recreate deleted plugins when the job's allocs all get marked terminal. * Terminal plugin alloc updates should cleanup the plugin. The client fingerprint can't tell if the plugin is unhealthy intentionally (for the case of updates or job stop). Allocations that are server-terminal should delete themselves from the plugin and trigger a plugin self-GC, the same as an unused node.	2020-05-05 15:39:57 -04:00
Tim Gross	22e3815e8c	docstring improvements and typo fixes (#7862 )	2020-05-05 10:30:50 -04:00
Tim Gross	1c6dcab56b	volumewatcher: remove spurious nil-check (#7858 ) The nil-check here is left-over from an earlier approach that didn't get merged. It doesn't do anything for us now as we can't ever pass it `nil` and if we leave it in the `getVolume` call it guards will panic anyways.	2020-05-04 12:28:32 -04:00

3702 Commits