open-nomad

Author	SHA1	Message	Date
Daniel Bennett	a7ed6f5c53	full task cleanup when alloc prerun hook fails (#17104 ) to avoid leaking task resources (e.g. containers, iptables) if allocRunner prerun fails during restore on client restart. now if prerun fails, TaskRunner.MarkFailedKill() will only emit an event, mark the task as failed, and cancel the tr's killCtx, so then ar.runTasks() -> tr.Run() can take care of the actual cleanup. removed from (formerly) tr.MarkFailedDead(), now handled by tr.Run(): * set task state as dead * save task runner local state * task stop hooks also done in tr.Run() now that it's not skipped: * handleKill() to kill tasks while respecting their shutdown delay, and retrying as needed * also includes task preKill hooks * clearDriverHandle() to destroy the task and associated resources * task exited hooks	2023-05-08 13:17:10 -05:00
Tim Gross	17bd930ca9	logs: fix missing allocation logs after update to Nomad 1.5.4 (#17087 ) When the server restarts for the upgrade, it loads the `structs.Job` from the Raft snapshot/logs. The jobspec has long since been parsed, so none of the guards around the default value are in play. The empty field value for `Enabled` is the zero value, which is false. This doesn't impact any running allocation because we don't replace running allocations when either the client or server restart. But as soon as any allocation gets rescheduled (ex. you drain all your clients during upgrades), it'll be using the `structs.Job` that the server has, which has `Enabled = false`, and logs will not be collected. This changeset fixes the bug by adding a new field `Disabled` which defaults to false (so that the zero value works), and deprecates the old field. Fixes #17076	2023-05-04 16:01:18 -04:00
Seth Hoenig	e9fec4ebc8	connect: remove unusable path for fallback envoy image names (#17044 ) This PR does some cleanup of an old code path for versions of Consul that did not support reporting the supported versions of Envoy in its API. Those versions have not been supported for years at this point, and the fallback version of envoy hasn't been supported by any version of Consul for almost as long. Remove this code path that is no longer useful.	2023-05-02 09:48:44 -05:00
Seth Hoenig	e8d53ea30b	connect: use explicit docker.io prefix in default envoy image names (#17045 ) This PR modifies references to the envoyproxy/envoy docker image to explicitly include the docker.io prefix. This does not affect existing users, but makes things easier for Podman users, who otherwise need to specify the full name because Podman does not default to docker.io	2023-05-02 09:27:48 -05:00
Luiz Aoqui	7b5a8f1fb0	Revert "hashicorp/go-msgpack v2 (#16810 )" (#17047 ) This reverts commit 8a98520d56eed3848096734487d8bd3eb9162a65.	2023-05-01 17:18:34 -04:00
Seth Hoenig	86f6a38867	connect: do not restrict auto envoy version to docker task driver (#17041 ) This PR updates the envoy_bootstrap_hook to no longer disable itself if the task driver in use is not docker. In other words, make it work for podman and other image based task drivers. The hook now only checks that 1. the task is a connect sidecar 2. the task.config block contains an "image" field	2023-05-01 15:07:35 -05:00
Seth Hoenig	753c17c9de	services: un-mark group services as deregistered if restart hook runs (#16905 ) * services: un-mark group services as deregistered if restart hook runs This PR may fix a bug where group services will never be deregistered if the group undergoes a task restart. * e2e: add test case for restart and deregister group service * cl: add cl * e2e: add wait for service list call	2023-04-24 14:24:51 -05:00
Tim Gross	72cbe53f19	logs: allow disabling log collection in jobspec (#16962 ) Some Nomad users ship application logs out-of-band via syslog. For these users having `logmon` (and `docker_logger`) running is unnecessary overhead. Allow disabling the logmon and pointing the task's stdout/stderr to /dev/null. This changeset is the first of several incremental improvements to log collection short of full-on logging plugins. The next step will likely be to extend the internal-only task driver configuration so that cluster administrators can turn off log collection for the entire driver. --- Fixes: #11175 Co-authored-by: Thomas Weber <towe75@googlemail.com>	2023-04-24 10:00:27 -04:00
valodzka	379497a484	fix host port handling for ipv6 (#16723 )	2023-04-20 19:53:20 -07:00
Etienne Bruines	1e3531b978	cni: fix plugin fingerprinting versions (#16776 ) CNI plugins v1.2.0 and above output a second line, containing supported protocol versions.	2023-04-20 18:44:39 -07:00
astudentofblake	42c4c8d5ea	fix: added landlock access to /usr/libexec for getter (#16900 )	2023-04-20 11:16:04 -05:00
Luiz Aoqui	fb588fcbb8	allocrunner: prevent panic on network manager (#16921 )	2023-04-18 13:39:13 -07:00
Ian Fijolek	619f49afcf	hashicorp/go-msgpack v2 (#16810 ) * Upgrade from hashicorp/go-msgpack v1.1.5 to v2.1.0 Fixes #16808 * Update hashicorp/net-rpc-msgpackrpc to v2 to match go-msgpack * deps: use go-msgpack v2.0.0 go-msgpack v2.1.0 includes some code changes that we will need to investigate further to assess its impact on Nomad, so keeping this dependency on v2.0.0 for now since it's a no-op. --------- Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2023-04-17 17:02:05 -04:00
Seth Hoenig	92d4a05534	users: eliminate nobody user memoization (#16904 ) This PR eliminates code specific to looking up and caching the uid/gid/user.User object associated with the nobody user in an init block. This code existed before adding the generic users cache and was meant to optimize the one search path we knew would happen often. Now that we have the cache, seems reasonable to eliminate this init block and use the cache instead like for any other user. Also fixes a constraint on the podman (and other) drivers, where building without CGO became problematic on some OS like Fedora IoT where the nobody user cannot be found with the pure-Go standard library. Fixes github.com/hashicorp/nomad-driver-podman/issues/228	2023-04-17 12:30:30 -05:00
Tim Gross	62548616d4	client: allow `drain_on_shutdown` configuration (#16827 ) Adds a new configuration to clients to optionally allow them to drain their workloads on shutdown. The client sends the `Node.UpdateDrain` RPC targeting itself and then monitors the drain state as seen by the server until the drain is complete or the deadline expires. If it loses connection with the server, it will monitor local client status instead to ensure allocations are stopped before exiting.	2023-04-14 15:35:32 -04:00
James Rasell	bc01d47071	consul/connect: fixed a bug where restarting proxy tasks failed. (#16815 ) The first start of a Consul Connect proxy sidecar triggers a run of the envoy_version hook which modifies the task config image entry. The modification takes into account a number of factors to correctly populate this. Importantly, once the hook has run, it marks itself as done so the taskrunner will not execute it again. When the client receives a non-destructive update for the allocation which the proxy sidecar is a member of, it will update and overwrite the task definition within the taskrunner. In doing so it overwrites the modification performed by the hook. If the allocation is restarted, the envoy_version hook will be skipped as it previously marked itself as done, and therefore the sidecar config image is incorrect and causes a driver error. The fix stops the hook from marking itself as done in the view of the taskrunner.	2023-04-11 15:56:03 +01:00
Seth Hoenig	ba728f8f97	api: enable support for setting original job source (#16763 ) * api: enable support for setting original source alongside job This PR adds support for setting job source material along with the registration of a job. This includes a new HTTP endpoint and a new RPC endpoint for making queries for the original source of a job. The HTTP endpoint is /v1/job/<id>/submission?version=<version> and the RPC method is Job.GetJobSubmission. The job source (if submitted, and doing so is always optional), is stored in the job_submission memdb table, separately from the actual job. This way we do not incur overhead of reading the large string field throughout normal job operations. The server config now includes job_max_source_size for configuring the maximum size the job source may be, before the server simply drops the source material. This should help prevent Bad Things from happening when huge jobs are submitted. If the value is set to 0, all job source material will be dropped. * api: avoid writing var content to disk for parsing * api: move submission validation into RPC layer * api: return an error if updating a job submission without namespace or job id * api: be exact about the job index we associate a submission with (modify) * api: reword api docs scheduling * api: prune all but the last 6 job submissions * api: protect against nil job submission in job validation * api: set max job source size in test server * api: fixups from pr	2023-04-11 08:45:08 -05:00
Michael Schurter	a407cbb626	Merge pull request #16836 from hashicorp/compliance/add-headers [COMPLIANCE] Add Copyright and License Headers	2023-04-10 16:32:03 -07:00
Daniel Bennett	fa33ee567a	gracefully recover tasks that use csi node plugins (#16809 ) new WaitForPlugin() called during csiHook.Prerun, so that on startup, clients can recover running tasks that use CSI volumes, instead of them being terminated and rescheduled because they need a node plugin that is "not found" yet, only because the plugin task has not yet been recovered.	2023-04-10 17:15:33 -05:00
hashicorp-copywrite[bot]	005636afa0	[COMPLIANCE] Add Copyright and License Headers	2023-04-10 15:36:59 +00:00
Tim Gross	1335543731	ephemeral disk: `migrate` should imply `sticky` (#16826 ) The `ephemeral_disk` block's `migrate` field allows for best-effort migration of the ephemeral disk data to new nodes. The documentation says the `migrate` field is only respected if `sticky=true`, but in fact if client ACLs are not set the data is migrated even if `sticky=false`. The existing behavior when client ACLs are disabled has existed since the early implementation, so "fixing" that case now would silently break backwards compatibility. Additionally, having `migrate` not imply `sticky` seems nonsensical: it suggests that if we place on a new node we migrate the data but if we place on the same node, we throw the data away! Update so that `migrate=true` implies `sticky=true` as follows: * The failure mode when client ACLs are enabled comes from the server not passing along a migration token. Update the server so that the server provides a migration token whenever `migrate=true` and not just when `sticky=true` too. * Update the scheduler so that `migrate` implies `sticky`. * Update the client so that we check for `migrate \|\| sticky` where appropriate. * Refactor the E2E tests to move them off the old framework and make the intention of the test more clear.	2023-04-07 16:33:45 -04:00
James Rasell	15eee99db4	client: ensure envoy version hook uses all pointer receiver funcs. (#16813 )	2023-04-06 14:47:00 +01:00
the-nando	f541f2e59b	Do not set attributes when spawning the getter child (#16791 ) * Do not set attributes when spawning the getter child * Cleanup * Cleanup --------- Co-authored-by: the-nando <the-nando@invalid.local>	2023-04-05 11:47:51 -05:00
Tim Gross	118b703164	CSI: set mounts in alloc hook resources atomically (#16722 ) The allocrunner has a facility for passing data written by allocrunner hooks to taskrunner hooks. Currently the only consumers of this facility are the allocrunner CSI hook (which writes data) and the taskrunner volume hook (which reads that same data). The allocrunner hook for CSI volumes doesn't set the alloc hook resources atomically. Instead, it gets the current resources and then writes a new version back. Because the CSI hook is currently the only writer and all readers happen long afterwards, this should be safe but #16623 shows there's some sequence of events during restore where this breaks down. Refactor hook resources so that hook data is accessed via setters and getters that hold the mutex.	2023-04-03 11:03:36 -04:00
Seth Hoenig	87f4b71df0	client/fingerprint: correctly fingerprint E/P cores of Apple Silicon chips (#16672 ) * client/fingerprint: correctly fingerprint E/P cores of Apple Silicon chips This PR adds detection of asymmetric core types (Power & Efficiency) (P/E) when running on M1/M2 Apple Silicon CPUs. This functionality is provided by shoenig/go-m1cpu which makes use of the Apple IOKit framework to read undocumented registers containing CPU performance data. Currently working on getting that functionality merged upstream into gopsutil, but gopsutil would still not support detecting P vs E cores like this PR does. Also refactors the CPUFingerprinter code to handle the mixed core types, now setting power vs efficiency cpu attributes. For now the scheduler is still unaware of mixed core types - on Apple platforms tasks cannot reserve cores anyway so it doesn't matter, but at least now the total CPU shares available will be correct. Future work should include adding support for detecting P/E cores on the latest and upcoming Intel chips, where computation of total cpu shares is currently incorrect. For that, we should also include updating the scheduler to be core-type aware, so that tasks of resources.cores on Linux platforms can be assigned the correct number of CPU shares for the core type(s) they have been assigned. node attributes before cpu.arch = arm64 cpu.modelname = Apple M2 Pro cpu.numcores = 12 cpu.reservablecores = 0 cpu.totalcompute = 1000 node attributes after cpu.arch = arm64 cpu.frequency.efficiency = 2424 cpu.frequency.power = 3504 cpu.modelname = Apple M2 Pro cpu.numcores.efficiency = 4 cpu.numcores.power = 8 cpu.reservablecores = 0 cpu.totalcompute = 37728 * fingerprint/cpu: follow up cr items	2023-03-28 08:27:58 -05:00
Seth Hoenig	07543f8bdf	nsd: always set deregister flag after deregistration of group (#16289 ) * services: always set deregister flag after deregistration of group This PR fixes a bug where the group service hook's deregister flag was not set in some cases, causing the hook to attempt deregistrations twice during job updates (alloc replacement). In the tests ... we used to assert on the wrong behavior (remove twice) which has now been corrected to assert we remove only once. This bug was "silent" in the Consul provider world because the error logs for double deregistration only show up in Consul logs; with the Nomad provider the error logs are in the Nomad agent logs. * services: cleanup group service hook tests	2023-03-17 09:44:21 -05:00
Tim Gross	ec47b245d0	client: don't use `Status` RPC for Consul discovery (#16490 ) In #16217 we switched clients using Consul discovery to the `Status.Members` endpoint for getting the list of servers so that we're using the correct address. This endpoint has an authorization gate, so this fails if the anonymous policy doesn't have `node:read`. We also can't check the `AuthToken` for the request for the client secret, because the client hasn't yet registered so the server doesn't have anything to compare against. Instead of hitting the `Status.Peers` or `Status.Members` RPC endpoint, use the Consul response directly. Update the `registerNode` method to handle the list of servers we get back in the response; if we get a "no servers" or "no path to region" response we'll kick off discovery again and retry immediately rather than waiting 15s.	2023-03-16 15:38:33 -04:00
Seth Hoenig	5b1970468e	artifact: git needs more files for private repositories (#16508 ) * landlock: git needs more files for private repositories This PR fixes artifact downloading so that git may work when cloning from private repositories. It needs - file read on /etc/passwd - dir read on /root/.ssh - file write on /root/.ssh/known_hosts Add these rules to the landlock rules for the artifact sandbox. * cr: use nonexistent instead of devnull Co-authored-by: Michael Schurter <mschurter@hashicorp.com> * cr: use go-homedir for looking up home directory * pr: pull go-homedir into explicit require * cr: fixup homedir tests in homeless root cases * cl: fix root test for real --------- Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2023-03-16 12:22:25 -05:00
Seth Hoenig	d2e8fb626a	artifact: do not set process attributes on darwin (#16511 ) This PR fixes the non-root macOS use case where artifact downloads stopped working. It seems setting a Credential on a SysProcAttr used by the exec package will always cause fork/exec to fail - even if the credential contains our own UID/GID or nil UID/GID. Technically we do not need to set this as the child process will inherit the parent UID/GID anyway... and not setting it makes things work again ... /shrug	2023-03-16 11:31:18 -05:00
Seth Hoenig	a25d3ea792	cgv1: do not disable cpuset manager if reserved interface already exists (#16467 ) * cgv1: do not disable cpuset manager if reserved interface already exists This PR fixes a bug where restarting a Nomad Client on a machine using cgroups v1 (e.g. Ubuntu 20.04) would cause the cpuset cgroups manager to disable itself. This is being caused by incorrectly interpreting a "file exists" error as problematic when ensuring the reserved cpuset exists. If we get a "file exists" error, that just means the Client was likely restarted. Note that a machine reboot would fix the issue - the groups interfaces are ephemeral. * cl: add cl	2023-03-13 17:00:17 -05:00
Luiz Aoqui	7305a374e3	allocrunner: fix health check monitoring for Consul services (#16402 ) Services must be interpolated to replace runtime variables before they can be compared against the values returned by Consul.	2023-03-10 14:43:31 -05:00
Michael Schurter	0021b282ef	env/aws: update ec2 cpu info data (#16417 ) Update AWS EC2 CPU tables using `make ec2info`	2023-03-09 14:33:21 -08:00
Seth Hoenig	ff4503aac6	client: disable running artifact downloader as nobody (#16375 ) * client: disable running artifact downloader as nobody This PR reverts a change from Nomad 1.5 where artifact downloads were executed as the nobody user on Linux systems. This was done as an attempt to improve the security model of artifact downloading where third party tools such as git or mercurial would be run as the root user with all the security implications thereof. However, doing so conflicts with Nomad's own advice for securing the Client data directory - which when setup with the recommended directory permissions structure prevents artifact downloads from working as intended. Artifact downloads are at least still now executed as a child process of the Nomad agent, and on modern Linux systems make use of the kernel Landlock feature for limiting filesystem access of the child process. * docs: update upgrade guide for 1.5.1 sandboxing * docs: add cl * docs: add title to upgrade guide fix	2023-03-08 15:58:43 -06:00
Lance Haig	35c17b2e56	deps: Update ioutil deprecated library references to os and io respectively in the client package (#16318 ) * Update ioutil deprecated library references to os and io respectively * Deal with the errors produced. Add error handling to filEntry info Add error handling to info	2023-03-08 13:25:10 -06:00
Luiz Aoqui	2a1a790820	client: don't emit task shutdown delay event if not waiting (#16281 )	2023-03-03 18:22:06 -05:00
Farbod Ahmadian	629ac58763	tests: add functionality to skip a test if it's not running in CI and not with root user (#16222 )	2023-03-02 13:38:27 -05:00
Tim Gross	bb4880ec13	client: use RPC address and not serf after initial Consul discovery (#16217 ) Nomad servers can advertise independent IP addresses for `serf` and `rpc`. Somewhat unexpectedly, the `serf` address is also used for both Serf and server-to-server RPC communication (including Raft RPC). The address advertised for `rpc` is only used for client-to-server RPC. This split was introduced intentionally in Nomad 0.8. When clients are using Consul discovery for connecting to servers, they get an initial discovery set from Consul and use the correct `rpc` tag in Consul to get a list of addresses for servers. The client then makes a `Status.Peers` RPC to get the list of those servers that are raft peers. But this endpoint is shared between servers and clients, and provides the address used for Raft. Most of the time this is harmless because servers will bind on 0.0.0.0 anyway. But in topologies where servers are on a private network and clients are on separate subnets (or even public subnets), clients will make initial contact with the server to get the list of peers but then populate their local server set with unreachable addresses. Cluster administrators can work around this problem by using `server_join` with specific IP addresses (or DNS names), because the `Node.UpdateStatus` endpoint returns the correct set of RPC addresses when updating the node. So once a client has registered, it will get the correct set of RPC addresses. This changeset updates the client logic to query `Status.Members` instead of `Status.Peers`, and then extract the correctly advertised address and port from the response body.	2023-03-02 13:36:45 -05:00
Michael Schurter	bd7b60712e	Accept Workload Identities for Client RPCs (#16254 ) This change resolves policies for workload identities when calling Client RPCs. Previously only ACL tokens could be used for Client RPCs. Since the same cache is used for both bearer tokens (ACL and Workload ID), the token cache size was doubled. --------- Co-authored-by: James Rasell <jrasell@users.noreply.github.com>	2023-02-27 10:17:47 -08:00
Tim Gross	79844048e6	populate Nomad token for task runner update hooks (#16266 ) The `TaskUpdateRequest` struct we send to task runner update hooks was not populating the Nomad token that we get from the task runner (which we do for the Vault token). This results in task runner hooks like the template hook overwriting the Nomad token with the zero value for the token. This causes in-place updates of a task to break templates (but not other uses that rely on identity but don't currently bother to update it, like the identity hook).	2023-02-27 10:48:13 -05:00
Seth Hoenig	61404b2551	services: Set Nomad's User-Agent by default on HTTP checks for nomad services (#16248 )	2023-02-23 08:10:42 -06:00
Seth Hoenig	804f9fdb93	services: ensure task group is set on service hook (#16240 ) This PR fixes a bug where the task group information was not being set on the serviceHook.AllocInfo struct, which is needed later on for calculating the CheckID of a nomad service check. The CheckID is calculated independently from multiple callsites, and the information being passed in must be consistent, including the group name. The workload.AllocInfo.Group was not set at this callsite, due to the bug fixed in this PR. https://github.com/hashicorp/nomad/blob/main/client/serviceregistration/nsd/nsd.go#L114	2023-02-22 10:22:48 -06:00
Seth Hoenig	5d325decca	cgutil: handle panic from runc helper method (#16180 ) This PR wraps the cgroups.IsCgroup2UnifiedMode() helper method from runc in a defer/recover block because it might panic in some cases. Upstream fix in: https://github.com/opencontainers/runc/pull/3745 Closes #16179	2023-02-14 15:09:43 -06:00
Seth Hoenig	165791dd89	artifact: protect against unbounded artifact decompression (1.5.0) (#16151 ) * artifact: protect against unbounded artifact decompression Starting with 1.5.0, set default values for artifact decompression limits. artifact.decompression_size_limit (default "100GB") - the maximum amount of data that will be decompressed before triggering an error and cancelling the operation artifact.decompression_file_count_limit (default 4096) - the maximum number of files that will be decompressed before triggering an error and cancelling the operation. * artifact: assert limits cannot be nil in validation	2023-02-14 09:28:39 -06:00
Seth Hoenig	490c902c62	cni: handle multi-path cni_path when fingerprinting plugins (#16163 ) This PR fixes the CNI plugin fingerprinter to take into account the fact that the cni_path config can be a multi-path (e.g. `/foo:/bar:/baz`). Accumulate plugins from each of the possible path elements. If scanning any of the named directories fails, the fingerprinter fails. Fixes #16083 No CL/BP - has not shipped yet.	2023-02-13 14:55:56 -06:00
Charlie Voiselle	7cfb938723	[chore] Move TestUtil_loadVersionControlGlobalConfigs into build flagged file (#16114 )	2023-02-09 14:25:26 -05:00
Seth Hoenig	0e7bf87ee1	deps: upgrade to hashicorp/golang-lru/v2 (#16085 )	2023-02-08 15:20:33 -06:00
Seth Hoenig	a4519c739d	users: eliminate LookupGroupId and its one use case (#16093 ) This PR deletes the user.LookupGroupId function as it was only being used in a single test case, and its value was not important to the test.	2023-02-08 14:57:09 -06:00
Michael Schurter	35d65c7c7e	Dynamic Node Metadata (#15844 ) Fixes #14617 Dynamic Node Metadata allows Nomad users, and their jobs, to update Node metadata through an API. Currently Node metadata is only reloaded when a Client agent is restarted. Includes new UI for editing metadata as well. --------- Co-authored-by: Phil Renaud <phil.renaud@hashicorp.com>	2023-02-07 14:42:25 -08:00
Seth Hoenig	590ae08752	main: remove deprecated uses of rand.Seed (#16074 ) * main: remove deprecated uses of rand.Seed go1.20 deprecates rand.Seed, and seeds the rand package automatically. Remove cases where we seed the random package, and cleanup the one case where we intentionally create a known random source. * cl: update cl * mod: update go mod	2023-02-07 09:19:38 -06:00
Luiz Aoqui	d6bb417795	docs: update default Nomad bridge config (#16072 )	2023-02-07 09:47:41 -05:00

4728 commits