When the alloc runner claims a volume, an allocation for a previous
version of the job may still have the volume claimed because it's
still shutting down. In this case we'll receive an error from the
server. Retry this error until we succeed or until a very long timeout
expires, to give operators a chance to recover broken plugins.
Make the alloc runner hook tolerant of temporary RPC failures.
The dynamic plugin registry assumes that plugins are singletons, which
matches the behavior of other Nomad plugins. But because dynamic
plugins like CSI are implemented by allocations, we need to handle the
possibility of multiple allocations for a given plugin type + ID, as
well as behaviors around interleaved allocation starts and stops.
Update the data structure for the dynamic registry so that more recent
allocations take over as the instance manager singleton, but we still
preserve the previous running allocations so that restores work
without racing.
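A minimal sketch of that shape, using hypothetical names (`pluginKey`, `registry`) rather than the real dynamic registry types: the newest allocation registered for a plugin acts as the instance-manager singleton, while earlier allocations are preserved until they deregister.

```go
package dynamicplugins

import "sync"

// pluginKey identifies a plugin by type and ID.
type pluginKey struct{ ptype, id string }

type registry struct {
	mu sync.Mutex
	// instances preserves every running allocation for a plugin so that
	// client restores don't race; the last entry is the active manager.
	instances map[pluginKey][]string // allocation IDs, newest last
}

// register records a new allocation for the plugin; the newest allocation
// takes over as the instance-manager singleton.
func (r *registry) register(ptype, id, allocID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	k := pluginKey{ptype, id}
	r.instances[k] = append(r.instances[k], allocID)
}

// deregister removes one allocation; the plugin stays registered as long as
// any other allocation for it is still running.
func (r *registry) deregister(ptype, id, allocID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	k := pluginKey{ptype, id}
	kept := r.instances[k][:0]
	for _, a := range r.instances[k] {
		if a != allocID {
			kept = append(kept, a)
		}
	}
	if len(kept) == 0 {
		delete(r.instances, k)
		return
	}
	r.instances[k] = kept
}
```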
Multiple allocations can run on a client for the same plugin, even if
only during updates. Provide each plugin task a unique path for the
control socket so that the tasks don't interfere with each other.
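For illustration only, one way to derive such a per-task path; the helper name and directory layout below are assumptions, not Nomad's actual layout:

```go
package taskrunner

import "path/filepath"

// pluginSocketPath includes the allocation ID in the socket path so that
// concurrent allocations of the same plugin don't clobber each other's
// control socket.
func pluginSocketPath(pluginsDir, pluginID, allocID string) string {
	return filepath.Join(pluginsDir, pluginID, allocID, "csi.sock")
}
```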
In PR #11892 we updated the `csi_hook` to unmount the volume locally
via the CSI node RPCs before releasing the claim from the server. The
timer for this hook was initialized with the retry time, forcing us to
wait 1s before making the first unmount RPC calls.
Use the new helper for timers to ensure we clean up the timer nicely.
This PR modifies the Consul CLI arguments used to bootstrap envoy for
Connect sidecars to make use of '-proxy-id' instead of '-sidecar-for'.
Nomad registers the sidecar service, so we know what ID it has. The
'-sidecar-for' was intended for use when you only know the name of the
service for which the sidecar is being created.
The improvement here is that using '-proxy-id' does not require an underlying
request for listing Consul services. This makes the interaction
between Nomad and Consul more efficient.
Closes #10452
When Consul Connect just works, it's wonderful. When it doesn't work it
can be exceedingly difficult to debug: operators have to check task
events, Nomad logs, Consul logs, Consul APIs, and even then critical
information is missing.
Using Consul to generate a bootstrap config for Envoy is notoriously
difficult. Nomad doesn't even log stderr, so operators are left trying
to piece together what went wrong.
This patch attempts to provide *maximal* context which unfortunately
includes secrets. **Secrets are always restricted to the secrets/
directory.** This makes debugging a little harder, but allows operators
to know exactly what operation Nomad was trying to perform.
What's added:
- stderr is sent to alloc/logs/envoy_bootstrap.stderr.0
- the CLI is written to secrets/.envoy_bootstrap.cmd
- the environment is written to secrets/.envoy_bootstrap.env as JSON
Accessing this information is unfortunately awkward:
```
nomad alloc exec -task connect-proxy-count-countdash b36a cat secrets/.envoy_bootstrap.env
nomad alloc exec -task connect-proxy-count-countdash b36a cat secrets/.envoy_bootstrap.cmd
nomad alloc fs b36a alloc/logs/envoy_bootstrap.stderr.0
```
The above assumes an alloc id that starts with `b36a` and a Connect
sidecar proxy for a service named `count-countdash`.
If the alloc is unable to start successfully, the debugging files are
only accessible from the host filesystem.
Nomad communicates with CSI plugin tasks via gRPC. The plugin
supervisor hook uses this to ping the plugin for health checks which
it emits as task events. After the first successful health check the
plugin supervisor registers the plugin in the client's dynamic plugin
registry, which in turn creates a CSI plugin manager instance that has
its own gRPC client for fingerprinting the plugin and sending mount
requests.
If the plugin manager instance fails to connect to the plugin on its
first attempt, it exits. The plugin supervisor hook is unaware that the
connection failed so long as its own pings continue to work. A
transient failure during plugin startup may mislead the plugin
supervisor hook into thinking the plugin is up (so there's no need to
restart the allocation) but no fingerprinter is started.
* Refactor the gRPC client to connect on first use (see the sketch below).
This gives the plugin manager instance the ability to retry the gRPC
client connection until it succeeds.
* Add a 30s timeout to the plugin supervisor so that we don't poll
forever waiting for a plugin that will never come back up.
Minor improvements:
* The plugin supervisor hook creates a new gRPC client for every probe
and then throws it away. Instead, reuse the client as we do for the
plugin manager.
* The gRPC client constructor has a 1 second timeout. Clarify that this
timeout applies to the connection and not the rest of the client
lifetime.
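As a sketch of the two changes, assuming gRPC's default non-blocking dial and hypothetical helper names (`newLazyConn`, `probeWithTimeout`) rather than the actual plugin manager API:

```go
package csi

import (
	"context"
	"time"

	"google.golang.org/grpc"
)

// newLazyConn returns a gRPC connection that is established on first use
// rather than at construction time, so callers can simply retry RPCs while
// the plugin is still starting up.
func newLazyConn(socketPath string) (*grpc.ClientConn, error) {
	// Without grpc.WithBlock, Dial returns immediately and the underlying
	// connection is (re)established by gRPC's backoff whenever an RPC is made.
	return grpc.Dial("unix://"+socketPath, grpc.WithInsecure())
}

// probeWithTimeout bounds a single supervisor health probe so we don't poll
// forever against a plugin that will never come back up.
func probeWithTimeout(ctx context.Context, probe func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(ctx, 30*time.Second)
	defer cancel()
	return probe(ctx)
}
```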
The CSI specification says:
> The CO SHALL provide the listen-address for the Plugin by way of the
> `CSI_ENDPOINT` environment variable.
Note that plugins without filesystem isolation won't have the plugin
dir bind-mounted to their alloc dir, but we can provide a path to the
socket anyways.
Refactor to use opts struct for plugin supervisor hook config.
The parameter list for configuring the plugin supervisor hook has
grown enough that it makes sense to use an options struct, similar to
many of the other task runner hooks (ex. template).
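A rough sketch of what such an options struct might look like; the type and field names here are illustrative, not the hook's actual API:

```go
package taskrunner

import "github.com/hashicorp/go-hclog"

// csiPluginSupervisorHookConfig gathers the hook's inputs in one place,
// mirroring the pattern used by other task runner hooks (e.g. template).
type csiPluginSupervisorHookConfig struct {
	clientStateDirPath string
	socketMountPoint   string
	logger             hclog.Logger
}

type csiPluginSupervisorHook struct {
	config *csiPluginSupervisorHookConfig
	logger hclog.Logger
}

// newCSIPluginSupervisorHook takes a single config struct instead of a long
// positional parameter list, so adding a field later doesn't ripple through
// every call site.
func newCSIPluginSupervisorHook(c *csiPluginSupervisorHookConfig) *csiPluginSupervisorHook {
	return &csiPluginSupervisorHook{
		config: c,
		logger: c.logger.Named("csi_plugin_supervisor"),
	}
}
```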
In certain task lifecycles the taskrunner service deregister call
could be called three times for a task that is exiting. Whilst
each hook caller of deregister has its own purpose, we should try
and ensure it is only called once during the shutdown lifecycle of
a task.
This change therefore tracks when deregister has been called, so
that subsequent calls are noop. In the event the task is
restarting, the deregister value is reset to ensure proper
operation.
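A minimal sketch of that tracking, with a hypothetical type standing in for the task runner's actual fields:

```go
package taskrunner

import "sync"

// deregisterTracker remembers whether service deregistration has already
// run for a task, so the several hooks that can trigger it during shutdown
// result in a single call.
type deregisterTracker struct {
	mu   sync.Mutex
	done bool
}

// deregister runs fn only on the first call; subsequent calls are no-ops.
func (d *deregisterTracker) deregister(fn func()) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.done {
		return
	}
	d.done = true
	fn()
}

// reset clears the flag when the task restarts, so the next shutdown
// deregisters again.
func (d *deregisterTracker) reset() {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.done = false
}
```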
go-getter creates a circular dependency between a Client and Getter,
which means each is inherently thread-unsafe if you try to re-use
one or the other.
This PR fixes Nomad to no longer make use of the default Getter objects
provided by the go-getter package. Nomad must create a new Client object
on every artifact download, as the Client object controls the Src and Dst
among other things. When calling Client.Get, the Getter modifies its own
Client reference, creating the circular reference and race condition.
We can still achieve most of the desired connection caching behavior by
re-using a shared HTTP client with transport pooling enabled.
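A sketch of that approach against go-getter's v1 API: the HTTP client (and its transport pool) is shared, while the Client and its Getters are rebuilt for every download. The helper name and transport settings are illustrative.

```go
package getter

import (
	"net/http"
	"time"

	gg "github.com/hashicorp/go-getter"
)

// A single pooled HTTP client is shared across downloads so we keep
// connection reuse, even though the go-getter Client is not shared.
var httpClient = &http.Client{
	Transport: &http.Transport{
		MaxIdleConnsPerHost: 10,
		IdleConnTimeout:     90 * time.Second,
	},
}

// newGetterClient builds a fresh go-getter Client (and Getters) per
// artifact download instead of using the package-level defaults, avoiding
// the Client<->Getter circular reference race.
func newGetterClient(src, dst string) *gg.Client {
	httpGetter := &gg.HttpGetter{Client: httpClient}
	return &gg.Client{
		Src:  src,
		Dst:  dst,
		Mode: gg.ClientModeAny,
		Getters: map[string]gg.Getter{
			"http":  httpGetter,
			"https": httpGetter,
		},
	}
}
```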
This PR replaces use of time.After with a safe helper function
that creates a time.Timer to use instead. The new function returns
both a time.Timer and a Stop function that the caller must handle.
Unlike time.NewTimer, the helper function does not panic if the duration
set is <= 0.
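A sketch of such a helper, following the behavior described above (the actual Nomad helper may differ in details):

```go
package helper

import "time"

// StopFunc stops a timer; callers should defer it.
type StopFunc func()

// NewSafeTimer returns a timer and a stop function the caller must handle,
// clamping non-positive durations to a minimal positive value rather than
// passing them straight to time.NewTimer.
func NewSafeTimer(d time.Duration) (*time.Timer, StopFunc) {
	if d <= 0 {
		// fire (nearly) immediately, matching the intent of time.After(0)
		d = time.Nanosecond
	}
	t := time.NewTimer(d)
	return t, func() { t.Stop() }
}
```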
Previously we copied this library by hand to avoid vendoring a bunch of
files related to minimock. Now that we no longer vendor, just import the
library normally.
Also we might use more of the library for handling `time.After` uses,
for which this library provides a Context-based solution.
When an allocation stops, the `csi_hook` makes an unpublish RPC to the
servers to unpublish via the CSI RPCs: first to the node plugins and
then the controller plugins. The controller RPCs must happen after the
node RPCs so that the node has had a chance to unmount the volume
before the controller tries to detach the associated device.
But the client has local access to the node plugins and can
independently determine if it's safe to send unpublish RPC to those
plugins. This will allow the server to treat the node plugin as
abandoned if a client is disconnected and `stop_on_client_disconnect`
is set. This will let the server try to send unpublish RPCs to the
controller plugins, under the assumption that the client will be
trying to unmount the volume on its end first.
Note that the CSI `NodeUnpublishVolume`/`NodeUnstageVolume` RPCs can
return ignorable errors in the case where the volume has already been
unmounted from the node. Handle all other errors by retrying until we
get success so as to give operators the opportunity to reschedule a
failed node plugin (ex. in the case where they accidentally drained a
node without `-ignore-system`). Fan out the work for each volume into
its own goroutine so that we can release a subset of volumes if only
one is stuck.
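A sketch of the per-volume fan-out with retry; `unpublishAll` and its `unpublish` callback are stand-ins for the real hook and its node-unmount plus server-unpublish sequence, which is assumed to treat "already unmounted" errors as success.

```go
package allocrunner

import (
	"context"
	"sync"
	"time"
)

// unpublishAll releases each volume in its own goroutine so a single stuck
// volume doesn't block releasing the others, retrying each one until it
// succeeds or the context is cancelled.
func unpublishAll(ctx context.Context, volumeIDs []string,
	unpublish func(context.Context, string) error) {

	var wg sync.WaitGroup
	for _, id := range volumeIDs {
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			timer := time.NewTimer(0)
			defer timer.Stop()
			for {
				select {
				case <-ctx.Done():
					return
				case <-timer.C:
				}
				// retry until success so operators get the chance to
				// reschedule a failed node plugin instead of leaking a claim
				if err := unpublish(ctx, id); err == nil {
					return
				}
				timer.Reset(time.Second)
			}
		}(id)
	}
	wg.Wait()
}
```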
Templates in nomad jobs make use of the vault token defined in
the vault stanza when issuing credentials like client certificates.
When using change_mode "noop" in the vault stanza, consul-template
is not informed when a Vault token is re-issued (which can
happen from time to time for various reasons, as described
in https://www.nomadproject.io/docs/job-specification/vault).
As a result, consul-template will keep using the old vault token
to renew credentials and - once the token expired - stop renewing
credentials. The symptom of this problem is a vault_token
file that is newer than the issued credential (e.g., TLS certificate)
in a job's /secrets directory.
This change corrects this, so that h.updater.updatedVaultToken(token)
is called, which informs stakeholders about the new
token and makes sure the new token is used by consul-template.
Example job template fragment:
vault {
  policies    = ["nomad-job-policy"]
  change_mode = "noop"
}

template {
  data = <<-EOH
  {{ with secret "pki_int/issue/nomad-job"
     "common_name=myjob.service.consul" "ttl=90m"
     "alt_names=localhost" "ip_sans=127.0.0.1"}}
  {{ .Data.certificate }}
  {{ .Data.private_key }}
  {{ .Data.issuing_ca }}
  {{ end }}
  EOH

  destination = "${NOMAD_SECRETS_DIR}/myjob.crt"
  change_mode = "noop"
}
This fix does not alter the meaning of the three change modes of vault:
- "noop" - take no action
- "restart" - restart the task
- "signal" - send a signal to the task
The switch statement following line 232 contains the necessary logic.
It is assumed that "take no action" was never meant to mean "don't tell
consul-template about the new vault token".
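A simplified sketch of the corrected ordering, with hypothetical names (`tokenUpdater`, `handleNewToken`) standing in for the vault hook's internals:

```go
package taskrunner

// tokenUpdater is a stand-in for the interface the vault hook uses to tell
// consul-template (via the template hook) about a newly derived token.
type tokenUpdater interface {
	updatedVaultToken(token string)
}

// handleNewToken propagates the re-derived token *before* the change_mode
// switch runs, so "noop" only skips the restart/signal action, not the
// token update that consul-template needs to keep renewing credentials.
func handleNewToken(u tokenUpdater, token, changeMode string, restart, signal func()) {
	u.updatedVaultToken(token)

	switch changeMode {
	case "noop":
		// take no action
	case "restart":
		restart()
	case "signal":
		signal()
	}
}
```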
Successfully tested in a staging cluster consisting of multiple
Nomad client nodes.
This PR exposes the following existing `consul-template` configuration options to Nomad jobspec authors in the `{job.group.task.template}` stanza.
- `wait`
It also exposes the following `consul-template` configuration to Nomad operators in the `{client.template}` stanza.
- `max_stale`
- `block_query_wait`
- `consul_retry`
- `vault_retry`
- `wait`
Finally, it adds the following new Nomad-specific configuration to the `{client.template}` stanza that allows operators to set bounds on what `jobspec` authors configure.
- `wait_bounds`
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
The task runner prestart hooks take a `joincontext` so they have the
option to exit early if either of two contexts are canceled: from
killing the task or client shutdown. Some tasks exit without being
shutdown from the server, so neither of the joined contexts ever gets
canceled and we leak the `joincontext` (48 bytes) and its internal
goroutine. This primarily impacts batch jobs and any task that fails
or completes early such as non-sidecar prestart lifecycle tasks.
Cancel the `joincontext` after the prestart call exits to fix the
leak.
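A sketch of the fix, assuming the joincontext package's Join helper; `runPrestart` is a stand-in for the task runner's prestart loop:

```go
package taskrunner

import (
	"context"

	"github.com/LK4D4/joincontext"
)

// runPrestart cancels the joined context as soon as the prestart call
// returns, so its internal goroutine doesn't outlive tasks that exit
// without either parent context ever being cancelled.
func runPrestart(killCtx, shutdownCtx context.Context, prestart func(context.Context) error) error {
	joinedCtx, cancel := joincontext.Join(killCtx, shutdownCtx)
	defer cancel() // release the joincontext and its goroutine on return
	return prestart(joinedCtx)
}
```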
Some operators use very long group/task `shutdown_delay` settings to
safely drain network connections to their workloads after service
deregistration. But during incident response, they may want to cause
that drain to be skipped so they can quickly shed load.
Provide a `-no-shutdown-delay` flag on the `nomad alloc stop` and
`nomad job stop` commands that bypasses the delay. This sets a new
desired transition state on the affected allocations that the
allocation/task runner will identify during pre-kill on the client.
Note (as documented here) that using this flag will almost always
result in failed inbound network connections for workloads as the
tasks will exit before clients receive updated service discovery
information and won't be gracefully drained.
Fixes #2522
Skip embedding client.alloc_dir when building chroot. If a user
configures a Nomad client agent so that the chroot_env will embed the
client.alloc_dir, Nomad will happily infinitely recurse while building
the chroot until something horrible happens. The best case scenario is
the filesystem's path length limit is hit. The worst case scenario is
disk space is exhausted.
A bad agent configuration will look something like this:
```hcl
data_dir = "/tmp/nomad-badagent"
client {
  enabled = true

  chroot_env {
    # Note that the source matches the data_dir
    "/tmp/nomad-badagent" = "/ohno"
    # ...
  }
}
```
Note that `/ohno/client` (the state_dir) will still be created but not
`/ohno/alloc` (the alloc_dir).
While I cannot think of a good reason why someone would want to embed
Nomad's client (and possibly server) directories in chroots, there
should be no cause for harm. chroots are only built when Nomad runs as
root, and Nomad disables running exec jobs as root by default. Therefore
even if client state is copied into chroots, it will be inaccessible to
tasks.
Skipping the `data_dir` and `{client,server}.state_dir` is possible, but
this PR attempts to implement the minimum viable solution to reduce risk
of unintended side effects or bugs.
When running tests as root in a vm without the fix, the following error
occurs:
```
=== RUN TestAllocDir_SkipAllocDir
alloc_dir_test.go:520:
Error Trace: alloc_dir_test.go:520
Error: Received unexpected error:
Couldn't create destination file /tmp/TestAllocDir_SkipAllocDir1457747331/001/nomad/test/testtask/nomad/test/testtask/.../nomad/test/testtask/secrets/.nomad-mount: open /tmp/TestAllocDir_SkipAllocDir1457747331/001/nomad/test/.../testtask/secrets/.nomad-mount: file name too long
Test: TestAllocDir_SkipAllocDir
--- FAIL: TestAllocDir_SkipAllocDir (22.76s)
```
Also removed unused Copy methods on AllocDir and TaskDir structs.
Thanks to @eveld for not letting me forget about this!
Add a new hostname string parameter to the network block which
allows operators to specify the hostname of the network namespace.
Changing this causes a destructive update to the allocation and it
is omitted if empty from API responses. This parameter also supports
interpolation.
In order to have a hostname passed as a configuration param when
creating an allocation network, the CreateNetwork func of the
DriverNetworkManager interface needs to be updated. In order to
minimize the disruption of future changes, rather than add another
string func arg, the function now accepts a request struct along with
the allocID param. The struct has the hostname as a field.
The in-tree implementations of DriverNetworkManager.CreateNetwork
have been modified to account for the function signature change.
In updating for the change, the enhancement of adding hostnames to
network namespaces has also been added to the Docker driver, whilst
the default Linux manager does not currently implement it.
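A trimmed, illustrative sketch of the request-struct shape; the types and fields below are simplified, not the exact Nomad definitions:

```go
package drivers

// NetworkIsolationSpec stands in for the driver plugin's isolation spec.
type NetworkIsolationSpec struct {
	Path string
}

// NetworkCreateRequest bundles the parameters for creating an allocation
// network, so future options (like hostname here) don't force another
// signature change on every implementation.
type NetworkCreateRequest struct {
	// Hostname is the hostname the network namespace should be created
	// with; it is optional and supports interpolation.
	Hostname string
}

// DriverNetworkManager shows the updated method shape: the alloc ID plus a
// request struct instead of a growing list of positional arguments.
type DriverNetworkManager interface {
	CreateNetwork(allocID string, req *NetworkCreateRequest) (*NetworkIsolationSpec, bool, error)
}
```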
When creating a TCP proxy bridge for Connect tasks, we are at the
mercy of either end for managing the connection state. For long
lived gRPC connections the proxy could reasonably expect to stay
open until the context was cancelled. For the HTTP connections used
by connect native tasks, we experience connection disconnects.
The proxy gets recreated as needed on follow up requests, however
we also emit a WARN log when the connection is broken. This PR
lowers the WARN to a TRACE, because these disconnects are to be
expected.
Ideally we would be able to proxy at the HTTP layer, however Consul
or the connect native task could be configured to expect mTLS, preventing
Nomad from acting as a man-in-the-middle for the requests.
We also can't manage the proxy lifecycle more intelligently, because
we have no control over the HTTP client or server and how they wish
to manage connection state.
What we have now works, it's just noisy.
Fixes #10933
This PR implements a new "System Batch" scheduler type. Jobs can
make use of this new scheduler by setting their type to 'sysbatch'.
Like the name implies, sysbatch can be thought of as a hybrid between
system and batch jobs - it is for running short lived jobs intended to
run on every compatible node in the cluster.
As with batch jobs, sysbatch jobs can also be periodic and/or parameterized
dispatch jobs. A sysbatch job is considered complete when it has been run
on all compatible nodes until reaching a terminal state (success or failed
on retries).
Feasibility and preemption are governed the same as with system jobs. In
this PR, the update stanza is not yet supported. The update stanza is still
limited in functionality for the underlying system scheduler, and is
not useful yet for sysbatch jobs. Further work in #4740 will improve
support for the update stanza and deployments.
Closes #2527
This PR fixes a bug where the underlying Envoy process of a Connect gateway
would consume a full core of CPU if there is more than one sidecar or gateway
in a group. The utilization was being caused by Consul injecting an envoy_ready_listener
on 127.0.0.1:8443, which only one of the Envoys would be able to bind to.
The others would spin in a hot loop trying to bind the listener.
As a workaround, we now specify -address during the Envoy bootstrap config
step, which is how Consul maps this ready listener. Because there is already
the envoy_admin_listener, and we need to continue supporting running gateways
in host networking mode, and in that case we want to use the same port
value coming from the service.port field, we now bind the admin listener to
the 127.0.0.2 loop-back interface, and the ready listener takes 127.0.0.1.
This shouldn't make a difference in the 99.999% use case where envoy is
being run in its official docker container. Advanced users can reference
${NOMAD_ENVOY_ADMIN_ADDR_<service>} (as they ought to) if needed,
as well as the new variable ${NOMAD_ENVOY_READY_ADDR_<service>} for the
envoy_ready_listener.
There are bits of logic in callers of RemoveWorkload on group/task
cleanup hooks which call RemoveWorkload with the "Canary" version
of the workload, in case the alloc is marked as a Canary. This logic
triggers an extra sync with Consul, and also doesn't achieve the intended
behavior - for which no special casing is necessary anyway. When the
workload is marked for removal, all associated services and checks
will be removed regardless of the Canary status, because the service
and check IDs do not incorporate the canary-ness in the first place.
The only place where canary-ness matters is when updating a workload,
where we need to compute the hash of the services and checks to determine
whether they have been modified; the Canary flag is part of that hash.
Fixes #10842
This code just seems incorrect. As it stands today it reports a
successful restore if RecoverTask fails and then DestroyTask succeeds.
This can result in a really annoying bug where it then calls RecoverTask
again, whereby it will probably get ErrTaskNotFound and call DestroyTask
once more.
I think the only reason this has not been noticed so far is because most
drivers like Docker will return Success, then Nomad will call
RecoverTask, get an error (not found) and call DestroyTask again, and
get an ErrTaskNotFound error.
This PR makes it so that Nomad will automatically set the CONSUL_TLS_SERVER_NAME
environment variable for Connect native tasks running in bridge networking mode
where Consul has TLS enabled. Because of the use of a unix domain socket for
communicating with Consul when in bridge networking mode, the server name is
a file name instead of something compatible with the mTLS certificate Consul
will authenticate against. "localhost" is by default a compatible name, so Nomad
will set the environment variable to that.
Fixes #10804
Running the `client/allocrunner` tests fail to compile on macOS because the
CNI test file depends on the CNI network configurator, which is in a
Linux-only file.
When `network.mode = "bridge"`, we create a pause container in Docker with no
networking so that we have a process to hold the network namespace we create
in Nomad. The default `/etc/hosts` file of that pause container is then used
for all the Docker tasks that share that network namespace. Some applications
rely on this file being populated.
This changeset generates a `/etc/hosts` file and bind-mounts it to the
container when Nomad owns the network, so that the container's hostname has an
IP in the file as expected. The hosts file will include the entries added by
the Docker driver's `extra_hosts` field.
In this changeset, only the Docker task driver will take advantage of this
option, as the `exec`/`java` drivers currently copy the host's `/etc/hosts`
file and this can't be changed without breaking backwards compatibility. But
the fields are available in the task driver protobuf for community task
drivers to use if they'd like.
This PR changes Nomad's wrapper around the Consul NamespaceAPI so that
it will detect if the Consul Namespaces feature is enabled before making
a request to the Namespaces API. Namespaces are not enabled in Consul OSS,
and require a suitable license to be used with Consul ENT.
Previously Nomad would check for a 404 status code when making a request
to the Namespaces API to "detect" if Consul OSS was being used. This does
not work for Consul ENT with Namespaces disabled, which returns a 500.
Now we avoid requesting the namespace API altogether if Consul is detected
to be the OSS SKU, or if the Namespaces feature is not licensed. Since
Consul can be upgraded from OSS to ENT, or a new license applied, we cache
the value for 1 minute, refreshing on demand if expired.
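A sketch of the cached feature check; `featureChecker` and its probe callback are illustrative, not the actual Consul client wrapper:

```go
package consul

import (
	"sync"
	"time"
)

// featureChecker caches whether Consul namespaces are usable (ENT with a
// suitable license) so we don't probe Consul on every request.
type featureChecker struct {
	mu        sync.Mutex
	enabled   bool
	expiresAt time.Time
	probe     func() bool // asks Consul whether namespaces are licensed
}

// namespacesEnabled refreshes the cached answer on demand once the one
// minute TTL has expired.
func (f *featureChecker) namespacesEnabled() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if time.Now().After(f.expiresAt) {
		f.enabled = f.probe()
		f.expiresAt = time.Now().Add(time.Minute)
	}
	return f.enabled
}
```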
Fixes https://github.com/hashicorp/nomad-enterprise/issues/575
Note that the ticket originally describes using attributes from https://github.com/hashicorp/nomad/issues/10688.
This turns out not to be possible due to a chicken-egg situation between
bootstrapping the agent and setting up the Consul client. Also fun: the
Consul fingerprinter creates its own Consul client, because there is
currently no way to pass the agent's client through the fingerprint factory.
This PR implements first-class support for Nomad running Consul
Connect Mesh Gateways. Mesh gateways enable services in the Connect
mesh to make cross-DC connections via gateways, where each datacenter
may not have full node interconnectivity.
Consul docs with more information:
https://www.consul.io/docs/connect/gateways/mesh-gateway
The following group level service block can be used to establish
a Connect mesh gateway.
service {
  connect {
    gateway {
      mesh {
        // no configuration
      }
    }
  }
}
Services can make use of a mesh gateway by configuring it in their
upstream blocks, e.g.
service {
  connect {
    sidecar_service {
      proxy {
        upstreams {
          destination_name = "<service>"
          local_bind_port  = <port>
          datacenter       = "<datacenter>"

          mesh_gateway {
            mode = "<mode>"
          }
        }
      }
    }
  }
}
Typical use of a mesh gateway is to create a bridge between datacenters.
A mesh gateway should then be configured with a service port that is
mapped from a host_network configured on a WAN interface in Nomad agent
config, e.g.
client {
  host_network "public" {
    interface = "eth1"
  }
}
Create a port mapping in the group.network block for use by the mesh
gateway service from the public host_network, e.g.
network {
  mode = "bridge"

  port "mesh_wan" {
    host_network = "public"
  }
}
Use this port label for the service.port of the mesh gateway, e.g.
service {
  name = "mesh-gateway"
  port = "mesh_wan"

  connect {
    gateway {
      mesh {}
    }
  }
}
Currently Envoy is the only supported gateway implementation in Consul.
By default Nomad client will run the latest official Envoy docker image
supported by the local Consul agent. The Envoy task can be customized
by setting `meta.connect.gateway_image` in agent config or by setting
the `connect.sidecar_task` block.
Gateways require Consul 1.8.0+, enforced by the Nomad scheduler.
Closes #9446
This commit ensures Nomad captures the task exit code more reliably, even when the task is killed. This issue affects the `raw_exec` driver, as noted in https://github.com/hashicorp/nomad/issues/10430 .
We fix this issue by ensuring that the TaskRunner only calls `driver.WaitTask` once. The TaskRunner monitors the completion of the task by calling `driver.WaitTask` which should return the task exit code on completion. However, it also could return a "context canceled" error if the agent/executor is shutdown.
Previously, when a task is to be stopped, the killTask path makes two WaitTask calls, and the second occasionally returns "context canceled" because of a race between the task shutting down and, depending on the driver, how quickly it shuts down after the task completes.
By having a single WaitTask call and consistently waiting for the task, we ensure we capture the exit code reliably before the executor is shut down or the contexts expire.
I opted to change the TaskRunner implementation to avoid changing the driver interface or requiring 3rd party drivers to update.
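A simplified sketch of the single-wait idea; `waitOnce` and `exitResult` are stand-ins for the TaskRunner's actual fields and the driver's result type:

```go
package taskrunner

import "sync"

// exitResult stands in for the driver's task exit result.
type exitResult struct {
	ExitCode int
	Err      error
}

// waitOnce ensures driver.WaitTask is invoked exactly once; every caller
// (the run loop, the kill path) reads the same cached result, so a second
// WaitTask can never race executor shutdown and return "context canceled"
// instead of the real exit code.
type waitOnce struct {
	once   sync.Once
	result *exitResult
	wait   func() *exitResult // wraps the single driver.WaitTask call
}

// Result blocks until the single underlying wait completes, then returns
// the cached exit result to every caller.
func (w *waitOnce) Result() *exitResult {
	w.once.Do(func() { w.result = w.wait() })
	return w.result
}
```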
Additionally, the PR ensures that attempts to kill the task terminate when the task "naturally" dies. Without this change, if the task dies at the right moment, the `killTask` call may keep trying to kill an already-dead task for up to 5 minutes before giving up.
Add a new driver capability: RemoteTasks.
When a task is run by a driver with RemoteTasks set, its TaskHandle will
be propagated to the server in its allocation's TaskState. If the task
is replaced due to a down node or draining, its TaskHandle will be
propagated to its replacement allocation.
This allows tasks to be scheduled in remote systems whose lifecycles are
disconnected from the Nomad node's lifecycle.
See https://github.com/hashicorp/nomad-driver-ecs for an example ECS
remote task driver.