open-nomad

Author	SHA1	Message	Date
Seth Hoenig	ef9b84ad82	deps: update api go version and dependencies This PR sets the minimum Go version for the `api` submodule to Go 1.17. It also upgrades - gorilla/websocket 1.4.1 -> 1.4.2 - mitchellh/mapstructure 1.4.2 -> 1.4.3 - stretchr/testify 1.5.1 -> 1.7.0 Closes #11518 #11602 #11528	2022-01-24 12:23:26 -06:00
Tim Gross	04977525dd	csi: update leader's ACL in volumewatcher (#11891 ) The volumewatcher that runs on the leader needs to make RPC calls rather than writing to raft (as we do in the deploymentwatcher) because the unpublish workflow needs to make RPC calls to the clients. This requires that the volumewatcher has access to the leader's ACL token. But when leadership transitions, the new leader creates a new leader ACL token. This ACL token needs to be passed into the volumewatcher when we enable it, otherwise the volumewatcher can find itself with a stale token.	2022-01-24 11:49:50 -05:00
Seth Hoenig	0030424384	Merge pull request #11889 from hashicorp/build-update-circle build: upgrade circleci configuration	2022-01-24 10:18:21 -06:00
Seth Hoenig	c220d3009b	Merge pull request #11910 from hashicorp/deps-update-containernetworking deps: upgrade containernetworking/plugins	2022-01-24 10:14:50 -06:00
Seth Hoenig	8a96e5d567	deps: add missing cl note	2022-01-24 10:13:13 -06:00
Tim Gross	b9456f2f72	changelog: fix entry markdown (#11911 )	2022-01-24 11:04:14 -05:00
Seth Hoenig	2f0cfb5740	build: upgrade and speed up circleci configuration This PR upgrades our CI images and fixes some affected tests. - upgrade go-machine-image to premade latest ubuntu LTS (ubuntu-2004:202111-02) - eliminate go-machine-recent-image (no longer necessary) - manage GOPATH in GNUMakefile (see https://discuss.circleci.com/t/gopath-is-set-to-multiple-directories/7174) - fix tcp dial error check (message seems to be OS specific) - spot check values measured instead of specifically 'RSS' (rss no longer reported in cgroups v2) - use safe MkdirTemp for generating tmpfiles NOT applied: (too flaky) - eliminate setting GOMAXPROCS=1 (build tools were also affected by this setting) - upgrade resource type for all images to large (2C -> 4C)	2022-01-24 08:28:14 -06:00
Seth Hoenig	f2a71fd0d9	deps: pty has new home github.com/kr/pty was moved to github.com/creack/pty Swap this dependency so we can upgrade to the latest version and no longer need a replace directive.	2022-01-19 12:33:05 -06:00
Seth Hoenig	9a6988f55b	deps: adjust to gzip handler zero length response body After swapping gzip handler to use the gorilla library, we must account for a quirk in how zero/minimal length response bodies are delivered. The previous gzip handler was configured to compress all responses regardless of size - even if the data was zero length or below the network MTU. This behavior changed in [v1.1.0](`c551b6c3b4 (diff-de723e6602cc2f16f7a9d85fd89d69954edc12a49134dab8901b10ee06d1879d)`) which is why we could not upgrade. The Nomad HTTP Client mutates the http.Response.Body object, making a strong assumption that if the Content-Encoding header is set to "gzip", the response will be readable via gzip decoder. This is no longer true for the nytimes gzip handler, and is also not true for the gorilla gzip handler. It seems in practice this only makes a difference on the /v1/operator/license endpoint which returns an empty response in OSS Nomad. The fix here is to simply not wrap the response body reader if we encounter an io.EOF while creating the gzip reader - indicating there is no data to decode.	2022-01-19 11:52:19 -06:00
Seth Hoenig	4650e97d29	deps: upgrade docker and runc This PR upgrades - docker dependency to the latest tagged release (v20.10.12) - runc dependency to the latest tagged release (v1.0.3) Docker does not abide by [semver](https://github.com/moby/moby/issues/39302), so it is marked +incompatible, and transitive dependencies are upgraded manually. Runc made three relevant breaking changes * cgroup manager .Set changed to accept Resources instead of Cgroup `3f65946756` * config.Device moved to devices.Device https://github.com/opencontainers/runc/pull/2679 * mountinfo.Mounted now returns an error if the specified path does not exist https://github.com/moby/sys/blob/mountinfo/v0.5.0/mountinfo/mountinfo.go#L16	2022-01-18 08:35:26 -06:00
Dave May	330d24a873	cli: Add event stream capture to nomad operator debug (#11865 )	2022-01-17 21:35:51 -05:00
Michael Schurter	99c863f909	cli: improve debug error messages (#11507 ) Improves `nomad debug` error messages when contacting agents that do not have /v1/agent/host endpoints (the endpoint was added in v0.12.0) Part of #9568 and manually tested against Nomad v0.8.7. Hopefully isRedirectError can be reused for more cases listed in #9568	2022-01-17 11:15:17 -05:00
Luiz Aoqui	57a61b50f4	changelog: add entry for #11793 (#11862 )	2022-01-17 11:08:29 -05:00
James Rasell	29fb0ddb4e	Merge pull request #11849 from hashicorp/b-changelog-11848 changelog: add entry for #11848	2022-01-17 09:35:10 +01:00
Jai	7ad9ec1143	Merge pull request #11820 from hashicorp/f-ui/alloc-legend feat: add links to legend items in `allocation-summary`	2022-01-14 14:02:54 -05:00
Tim Gross	d7756f8cdb	csi: volume deregistration should require exact ID (#11852 ) The command line client sends a specific volume ID, but this isn't enforced at the API level and we were incorrectly using a prefix match for volume deregistration, resulting in cases where a volume with a shorter ID that's a prefix of another volume would be deregistered instead of the intended volume.	2022-01-14 12:26:03 -05:00
Tim Gross	33f7c6cba4	csi: when warning for multiple prefix matches, use full ID (#11853 ) When the `volume deregister` or `volume detach` commands get an ID prefix that matches multiple volumes, show the full length of the volume IDs in the list of volumes shown so that the user can select the correct one.	2022-01-14 12:25:48 -05:00
Tim Gross	9c4864badd	freebsd: build fix for ARM7 32-bit (#11854 ) The size of `stat_t` fields is architecture dependent, which was reportedly causing a build failure on FreeBSD ARM7 32-bit systems. This changeset matches the behavior we have on Linux.	2022-01-14 12:25:32 -05:00
Tim Gross	73d0779858	drivers: set world-readable permissions on copied resolv.conf (#11856 ) When we copy the system DNS to a task's `resolv.conf`, we should set the permissions as world-readable so that unprivileged users within the task can read it.	2022-01-14 12:25:23 -05:00
Jai Bhagat	764936a768	chore: add changelog	2022-01-14 10:23:09 -05:00
James Rasell	10405e9c6b	changelog: add entry for #11848	2022-01-14 13:40:50 +01:00
Luiz Aoqui	d48e50da9a	Fix log level parsing from lines that include a timestamp (#11838 )	2022-01-13 09:56:35 -05:00
Luiz Aoqui	c7ae13a1f3	Fix ACL requirements for job details UI (#11672 )	2022-01-12 21:26:02 -05:00
Michael Schurter	ebadaabc71	doc: add changelog for #11830	2022-01-12 14:21:47 -08:00
Tim Gross	b62da8fc9a	docs: improve changelog for PR #11783 (#11818 )	2022-01-11 11:54:12 -05:00
Tim Gross	1a5973184e	docs: changelog for PR #11783 (#11812 )	2022-01-10 16:39:21 -05:00
Alessandro De Blasis	e647549ecf	metrics: added `mapped_file` metric (#11500 ) Signed-off-by: Alessandro De Blasis <alex@deblasis.net> Co-authored-by: Nate <37554478+servusdei2018@users.noreply.github.com>	2022-01-10 15:35:19 -05:00
Derek Strickland	0a8e03f0f7	Expose Consul template configuration parameters (#11606 ) This PR exposes the following existing `consul-template` configuration options to Nomad jobspec authors in the `{job.group.task.template}` stanza. - `wait` It also exposes the following `consul-template` configuration to Nomad operators in the `{client.template}` stanza. - `max_stale` - `block_query_wait` - `consul_retry` - `vault_retry` - `wait` Finally, it adds the following new Nomad-specific configuration to the `{client.template}` stanza that allows Operators to set bounds on what `jobspec` authors configure. - `wait_bounds` Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2022-01-10 10:19:07 -05:00
Tim Gross	32f150d469	docs: new scheduler metrics (#11790 ) * Fixed name of `nomad.scheduler.allocs.reschedule` metric * Added new metrics to metrics reference documentation * Expanded definitions of "waiting" metrics * Changelog entry for #10236 and #10237	2022-01-07 09:51:15 -05:00
Charlie Voiselle	98a240cd99	Make number of scheduler workers reloadable (#11593 ) ## Development Environment Changes * Added stringer to build deps ## New HTTP APIs * Added scheduler worker config API * Added scheduler worker info API ## New Internals * (Scheduler)Worker API refactor—Start(), Stop(), Pause(), Resume() * Update shutdown to use context * Add mutex for contended server data - `workerLock` for the `workers` slice - `workerConfigLock` for the `Server.Config.NumSchedulers` and `Server.Config.EnabledSchedulers` values ## Other * Adding docs for scheduler worker api * Add changelog message Co-authored-by: Derek Strickland <1111455+DerekStrickland@users.noreply.github.com>	2022-01-06 11:56:13 -05:00
Michael Schurter	1af8d47de2	Merge pull request #11744 from hashicorp/b-node-copy Fix Node.Copy()	2022-01-05 17:01:53 -08:00
Jai	c7e581d879	Merge pull request #11590 from hashicorp/e-ui/breadcrumbs-service Refactor: Breadcrumbs Service	2022-01-05 17:46:48 -05:00
Tim Gross	51f512a3e6	csi: reap unused volume claims at leadership transitions (#11776 ) When `volumewatcher.Watcher` starts on the leader, it starts a watch on every volume and triggers a reap of unused claims on any change to that volume. But if a reaping is in-flight during leadership transitions, it will fail and the event that triggered the reap will be dropped. Perform one reap of unused claims at the start of the watcher so that leadership transitions don't drop this event.	2022-01-05 11:40:20 -05:00
Arkadiusz	ffb174b596	Fix log streaming missing frames (#11721 ) Perform one more read after receiving cancel when streaming file from the allocation API	2022-01-04 14:07:16 -05:00
Tim Gross	2806dc2bd7	docs/tests for multiple HTTP address config (#11760 )	2022-01-03 10:17:13 -05:00
Tim Gross	395628efe1	api: paginate deployment list and accept wildcard namespace (#11743 ) Add `per_page` and `next_token` handling to `Deployment.List` RPC, and allow the use of a wildcard namespace for namespace filtering.	2022-01-03 08:36:02 -05:00
Michael Schurter	c4d03815e1	add changelog for Node.Copy fix	2021-12-23 12:34:05 -08:00
Tim Gross	265e488ab4	task runner: fix goroutine leak in prestart hook (#11741 ) The task runner prestart hooks take a `joincontext` so they have the option to exit early if either of two contexts are canceled: from killing the task or client shutdown. Some tasks exit without being shut down from the server, so neither of the joined contexts ever gets canceled and we leak the `joincontext` (48 bytes) and its internal goroutine. This primarily impacts batch jobs and any task that fails or completes early such as non-sidecar prestart lifecycle tasks. Cancel the `joincontext` after the prestart call exits to fix the leak.	2021-12-23 11:50:51 -05:00
Luiz Aoqui	4bdd2c84e3	fix host network reserved port fingerprint (#11728 )	2021-12-22 15:29:54 -05:00
Luiz Aoqui	40093f97cd	api: support namespace wildcard in CSI volume list (#11724 )	2021-12-21 17:19:45 -05:00
Shishir	65eab35412	Add support for setting pids_limit in docker plugin config. (#11526 )	2021-12-21 13:31:34 -05:00
Tim Gross	b0c3b99b03	scheduler: fix quadratic performance with spread blocks (#11712 ) When the scheduler picks a node for each evaluation, the `LimitIterator` provides at most 2 eligible nodes for the `MaxScoreIterator` to choose from. This keeps scheduling fast while producing acceptable results because the results are binpacked. Jobs with a `spread` block (or node affinity) remove this limit in order to produce correct spread scoring. This means that every allocation within a job with a `spread` block is evaluated against _all_ eligible nodes. Operators of large clusters have reported that jobs with `spread` blocks that are eligible on a large number of nodes can take longer than the nack timeout to evaluate (60s). Typical evaluations are processed in milliseconds. In practice, it's not necessary to evaluate every eligible node for every allocation on large clusters, because the `RandomIterator` at the base of the scheduler stack produces enough variation in each pass that the likelihood of an uneven spread is negligible. Note that feasibility is checked before the limit, so this only impacts the number of _eligible_ nodes available for scoring, not the total number of nodes. This changeset sets the iterator limit for "large" `spread` block and node affinity jobs to be equal to the number of desired allocations. This brings an example problematic job evaluation down from ~3min to ~10s. The included tests ensure that we have acceptable spread results across a variety of large cluster topologies.	2021-12-21 10:10:01 -05:00
Jai Bhagat	1887e34b75	chore: edit mirage scenario to populate csi	2021-12-21 07:42:23 -05:00
Luiz Aoqui	3d3b5a2c8e	changelog: add entries for #11555 , #11557 , and #11687 (#11706 )	2021-12-20 13:45:20 -05:00
Tim Gross	e046bb31e9	api: respect wildcard in evaluations list API (#11710 )	2021-12-20 12:23:50 -05:00
Jai	93d5ef596f	Merge pull request #11545 from hashicorp/f-ui/add-alloc-filters-on-table Add Allocation Filters in Client View	2021-12-18 09:39:53 -05:00
Jai	2b7fb2c5bd	Merge pull request #11544 from hashicorp/f-ui/add-filters-to-allocs Add filters to Allocations	2021-12-18 09:38:28 -05:00
Luiz Aoqui	e067b3d75f	changelog: fix entry for #11544	2021-12-17 18:57:54 -05:00
Luiz Aoqui	a1c4536523	changelog: add entry for #11545	2021-12-17 18:49:56 -05:00
Tim Gross	f2615992a4	cli: unhide advanced operator raft debugging commands (#11682 ) The `nomad operator raft` and `nomad operator snapshot state` subcommands for inspecting on-disk raft state were hidden and undocumented. Expose and document these so that advanced operators have support for these tools.	2021-12-16 10:32:11 -05:00
Tim Gross	536e3c5282	`nomad eval list` command (#11675 ) Use the new filtering and pagination capabilities of the `Eval.List` RPC to provide filtering and pagination at the command line. Also includes note that `nomad eval status -json` is deprecated and will be replaced with a single evaluation view in a future version of Nomad.	2021-12-15 11:58:38 -05:00
Tim Gross	f8a133a810	cli: ensure `-stale` flag is respected by `nomad operator debug` (#11678 ) When a cluster doesn't have a leader, the `nomad operator debug` command can safely use stale queries to gracefully degrade the consistency of almost all its queries. The query parameter for these API calls was not being set by the command. Some `api` package queries do not include `QueryOptions` because they target a specific agent, but they can potentially be forwarded to other agents. If there is no leader, these forwarded queries will fail. Provide methods to call these APIs with `QueryOptions`.	2021-12-15 10:44:03 -05:00
Luiz Aoqui	05bb65779c	api: return error when `LicenseGet` status is not `200` (#11644 )	2021-12-14 19:47:09 -05:00
Tim Gross	a0cf5db797	provide `-no-shutdown-delay` flag for job/alloc stop (#11596 ) Some operators use very long group/task `shutdown_delay` settings to safely drain network connections to their workloads after service deregistration. But during incident response, they may want to cause that drain to be skipped so they can quickly shed load. Provide a `-no-shutdown-delay` flag on the `nomad alloc stop` and `nomad job stop` commands that bypasses the delay. This sets a new desired transition state on the affected allocations that the allocation/task runner will identify during pre-kill on the client. Note (as documented here) that using this flag will almost always result in failed inbound network connections for workloads as the tasks will exit before clients receive updated service discovery information and won't be gracefully drained.	2021-12-13 14:54:53 -05:00
Tim Gross	5a68373e7f	Version 1.2.3 Merge tag 'v1.2.3' into merge-release-1.2.3-branch Version 1.2.3	2021-12-13 10:12:07 -05:00
Tim Gross	46e1d29298	golang security update 1.17.5	2021-12-10 13:50:22 -05:00
Tim Gross	624ecab901	evaluations list pagination and filtering (#11648 ) API queries can request pagination using the `NextToken` and `PerPage` fields of `QueryOptions`, when supported by the underlying API. Add a `NextToken` field to the `structs.QueryMeta` so that we have a common field across RPCs to tell the caller where to resume paging from on their next API call. Include this field on the `api.QueryMeta` as well so that it's available for future versions of List HTTP APIs that wrap the response with `QueryMeta` rather than returning a simple list of structs. In the meantime callers can get the `X-Nomad-NextToken`. Add pagination to the `Eval.List` RPC by checking for pagination token and page size in `QueryOptions`. This will allow resuming from the last ID seen so long as the query parameters and the state store itself are unchanged between requests. Add filtering by job ID or evaluation status over the results we get out of the state store. Parse the query parameters of the `Eval.List` API into the arguments expected for filtering in the RPC call.	2021-12-10 13:43:03 -05:00
Lukas W	0e5958d671	CLI: Return non-zero exit code when deployment fails in `nomad run` (#11550 ) * Exit non-zero from run command if deployment fails * Fix typo in deployment monitor introduced in 0edda11	2021-12-09 09:09:28 -05:00
Vyacheslav Morov	6a244f18ad	cli: Add var args to plan output. (#11631 )	2021-12-07 10:43:52 -05:00
Tim Gross	03e697a69d	scheduler: config option to reject job registration (#11610 ) During incident response, operators may find that automated processes elsewhere in the organization can be generating new workloads on Nomad clusters that are unable to handle the workload. This changeset adds a field to the `SchedulerConfiguration` API that causes all job registration calls to be rejected unless the request has a management ACL token.	2021-12-06 15:20:34 -05:00
Derek Strickland	8595e3ed6a	Add change log entry for PR 11592 (#11609 )	2021-12-02 16:18:56 -05:00
Tim Gross	5097546153	changelog: new metrics in Nomad Enterprise (#11591 ) This changelog is for a PR that landed in Nomad Enterprise only.	2021-12-01 09:15:12 -05:00
Michael Schurter	3d248153f4	Merge pull request #11579 from hashicorp/b-getscalingpolicy-rpc-index-response rpc: fix scaling policy get index response when policy is found.	2021-11-30 10:43:20 -08:00
Tim Gross	6e1311a265	client: respect `client_auto_join` after connection loss (#11585 ) The `consul.client_auto_join` configuration block tells the Nomad client whether to use Consul service discovery to find Nomad servers. By default it is set to `true`, but contrary to the documentation it was only respected during the initial client registration. If a client missed a heartbeat, failed a `Node.UpdateStatus` RPC, or if there was no Nomad leader, the client would fall back to Consul even if `client_auto_join` was set to `false`. This changeset returns early from the client's trigger for Consul discovery if the `client_auto_join` field is set to `false`.	2021-11-30 13:20:42 -05:00
James Rasell	a9a624574f	changelog: add entry for #11579	2021-11-26 11:16:17 +01:00
Tim Gross	74768eb7d3	scheduler: fix panic in system jobs when nodes filtered by class (#11565 ) In the system scheduler, if a subset of clients are filtered by class, we hit a code path where the `AllocMetric` has been copied, but the `Copy` method does not instantiate the various maps. This leads to an assignment to a nil map. This changeset ensures that the maps are non-nil before continuing. The `Copy` method relies on functions in the `helper` package that all return nil slices or maps when passed zero-length inputs. This changeset to fix the panic bug intentionally defers updating those functions because it'll have potential impact on memory usage. See https://github.com/hashicorp/nomad/issues/11564 for more details.	2021-11-24 12:59:15 -05:00
Tim Gross	ba38008596	scheduler: fix panic in system jobs when nodes filtered by class (#11565 ) In the system scheduler, if a subset of clients are filtered by class, we hit a code path where the `AllocMetric` has been copied, but the `Copy` method does not instantiate the various maps. This leads to an assignment to a nil map. This changeset ensures that the maps are non-nil before continuing. The `Copy` method relies on functions in the `helper` package that all return nil slices or maps when passed zero-length inputs. This changeset to fix the panic bug intentionally defers updating those functions because it'll have potential impact on memory usage. See https://github.com/hashicorp/nomad/issues/11564 for more details.	2021-11-24 12:28:47 -05:00
Jai Bhagat	9dc6ad7b7d	chore: changelog entry	2021-11-23 18:28:33 -05:00
Luiz Aoqui	9d6842dd4d	Don't emit scaling event error when a deployment is underway (#11556 )	2021-11-23 10:20:18 -05:00
James Rasell	751c8217d1	core: allow setting and propagation of eval priority on job de/registration (#11532 ) This change modifies the Nomad job register and deregister RPCs to accept an updated option set which includes eval priority. This param is optional and overrides the use of the job priority to set the eval priority. In order to ensure all evaluations as a result of the request use the same eval priority, the priority is shared to the allocReconciler and deploymentWatcher. This creates a new distinction between eval priority and job priority. The Nomad agent HTTP API has been modified to allow setting the eval priority on job update and delete. To keep consistency with the current v1 API, job update accepts this as a payload param; job delete accepts this as a query param. Any user supplied value is validated within the agent HTTP handler removing the need to pass invalid requests to the server. The register and deregister opts functions now allow for setting the eval priority on requests. The change includes a small change to the DeregisterOpts function which handles nil opts. This brings the function in line with the RegisterOpts.	2021-11-23 09:23:31 +01:00
Luiz Aoqui	d3c1a03edd	Version 1.2.1 Merge tag 'v1.2.1' into merge-release-1.2.1-branch Version 1.2.1	2021-11-22 10:47:04 -05:00
Tim Gross	fc1d4814d9	qemu: add `args_allowlist` to sandbox VM command line inputs The QEMU driver allows arbitrary command line options, but many of these options give access to host resources that operators may not want to expose such as devices. Add an optional allowlist to the plugin configuration so that operators can limit the resources for QEMU.	2021-11-19 11:11:52 -05:00
Tim Gross	7f6fca6db9	changelog batch (#11517 )	2021-11-17 11:24:32 -05:00
Tim Gross	e729133134	api: return 404 for alloc FS list/stat endpoints (#11482 ) * api: return 404 for alloc FS list/stat endpoints If the alloc filesystem doesn't have a file requested by the List Files or Stat File API, we currently return a HTTP 500 error with the expected "file not found" error message. Return a HTTP 404 error instead. * update FS Handler Previously the FS handler would interpret a 500 status as a 404 in the adapter layer by checking if the response body contained the text or if the response status was 500 and then throw an error code for 404. Co-authored-by: Jai Bhagat <jaybhagat841@gmail.com>	2021-11-17 11:15:07 -05:00
Tim Gross	863486ffb0	deps: update go-getter to 1.5.9 (#11481 ) go-getter 1.5.9 includes a patch in 1.5.6 that automatically unpacks uncompressed tar archives. Previously Nomad only unpacked compressed archives, but documented that it unpacked all archives.	2021-11-17 11:14:44 -05:00
James Rasell	519851cf1a	changelog: add entry for #11504	2021-11-15 12:01:52 +01:00
Dave May	3c04d7927b	cli: refactor operator debug capture (#11466 ) * debug: refactor Consul API collection * debug: refactor Vault API collection * debug: cleanup test timing * debug: extend test to multiregion * debug: save cmdline flags in bundle * debug: add cli version to output * Add changelog entry	2021-11-05 19:43:10 -04:00
Tim Gross	73e3b15305	build: bump go version to 1.17.3 (#11461 )	2021-11-05 15:34:24 -04:00
James Rasell	99955eb80f	Merge pull request #11426 from hashicorp/b-set-dereg-eval-priority-correctly rpc: set the deregistration eval priority to the job priority.	2021-11-05 15:53:10 +01:00
James Rasell	2cc661c523	Merge pull request #11429 from hashicorp/b-set-scale-eval-priority-correctly rpc: set the job scale eval priority to the job priority.	2021-11-05 15:52:31 +01:00
Alessandro De Blasis	07c670fdc0	cli: show `host_network` in `nomad status` (#11432 ) Enhance the CLI in order to return the host network in two flavors (default, verbose) of the `node status` command. Fixes: #11223. Signed-off-by: Alessandro De Blasis <alex@deblasis.net>	2021-11-05 09:02:46 -04:00
Florian Apolloner	ef88795af3	Added a `-hcl2-strict` flag to allow for lenient hcl variable parsing. (#11284 ) Co-authored-by: James Rasell <jrasell@hashicorp.com>	2021-11-04 16:33:09 +01:00
James Rasell	674761436e	Merge pull request #11165 from hashicorp/b-gh-11149 jobspec2: ensure consistent error handling between var-file & var.	2021-11-04 16:24:00 +01:00
James Rasell	4125e13698	changelog: add entry for #11165	2021-11-04 15:35:02 +01:00
James Rasell	2b866b1d34	changelog: fixup entry extension for #11167	2021-11-04 15:28:34 +01:00
Michael Schurter	3718557041	Merge pull request #11416 from hashicorp/f-rejected-info core: bump rejected plans from debug -> info	2021-11-03 16:49:28 -07:00
Michael Schurter	ef3fc79225	Merge pull request #11334 from hashicorp/f-chroot-skip-allocdir client: never embed alloc_dir in chroot	2021-11-03 16:48:09 -07:00
Charlie Voiselle	71643263a6	Parse `job > group > consul` block in HCL1 (#11423 )	2021-11-03 13:49:32 -04:00
Luiz Aoqui	5d204c8ced	Revert "Return SchedulerConfig instead of SchedulerConfigResponse struct (#10799 )" (#11433 )	2021-11-02 17:42:52 -04:00
James Rasell	a2176474a5	changelog: add entry for #11429	2021-11-02 12:58:10 +01:00
James Rasell	4803eb9d88	changelog: add entry for #11426	2021-11-02 11:43:13 +01:00
James Rasell	c071efbd6b	Merge pull request #11411 from hashicorp/f-gh-11406 cli: add json and template flag opts to acl bootstrap command.	2021-11-02 09:48:25 +01:00
Charlie Voiselle	29e7d46dd9	Making RPC Upgrade mode reloadable. (#11144 ) - Making RPC Upgrade mode reloadable. - Add suggestions from code review - remove spurious comment - switch to require(t,...) form for test. - Add to changelog	2021-11-01 16:30:53 -04:00
Luiz Aoqui	655ac2719f	Allow using specific object ID on diff (#11400 )	2021-11-01 15:16:31 -04:00
Michael Schurter	efe5714840	core: bump rejected plans from debug -> info As we have continued to see reports of #9506 we need to elevate this log line as it is the only way to detect when plans are being erroneously rejected. Users who see this log line repeatedly should drain and restart the node in the log line. This seems to work around the issue. Please post any details on #9506!	2021-10-31 12:51:42 -07:00
James Rasell	30ad7985b2	changelog: add entry for #11411 .	2021-10-29 09:08:10 +02:00
Dave May	509c74ce19	debug: update default node-id and docs (#11398 ) * debug: default node-id to all * debug: align cli help and website documentation	2021-10-27 13:43:56 -04:00
Mahmood Ali	cdddd64a42	logging: Log the cause behind agent startup failure (#11353 ) Log the failure error when the agent fails to start. Previously, the agent startup failure error would be emitted to the command UI but not logged. So it doesn't get emitted to syslog or `log_file` if they are set, and it makes debugging much harder. Also, logging the error again before exit makes the error more visible: previously, the operator needed to scroll to the top to find the error. On a sample failure, the output will look like: ``` ==> WARNING: Bootstrap mode enabled! Potentially unsafe operation. ==> Loaded configuration from sample-configs/config-bad ==> Starting Nomad agent... ==> Error starting agent: setting up server node ID failed: mkdir /path-without-permission: read-only file system 2021-10-20T14:38:51.179-0400 [WARN] agent.plugin_loader: skipping external plugins since plugin_dir doesn't exist: plugin_dir=/path-without-permission/plugins 2021-10-20T14:38:51.181-0400 [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=/path-without-permission/plugins 2021-10-20T14:38:51.181-0400 [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=/path-without-permission/plugins 2021-10-20T14:38:51.181-0400 [INFO] agent: detected plugin: name=java type=driver plugin_version=0.1.0 2021-10-20T14:38:51.181-0400 [INFO] agent: detected plugin: name=docker type=driver plugin_version=0.1.0 2021-10-20T14:38:51.181-0400 [INFO] agent: detected plugin: name=mock_driver type=driver plugin_version=0.1.0 2021-10-20T14:38:51.181-0400 [INFO] agent: detected plugin: name=raw_exec type=driver plugin_version=0.1.0 2021-10-20T14:38:51.181-0400 [INFO] agent: detected plugin: name=exec type=driver plugin_version=0.1.0 2021-10-20T14:38:51.181-0400 [INFO] agent: detected plugin: name=qemu type=driver plugin_version=0.1.0 2021-10-20T14:38:51.181-0400 [ERROR] agent: error starting agent: error="setting up server node ID failed: mkdir /path-without-permission: read-only file system" ``` This change adds the final `ERROR` message. It's easy to miss the `==> Error starting agent` above.	2021-10-27 10:41:17 -07:00
Mahmood Ali	daf20f9788	vault: set JobID in Vault metadata (#11397 ) Closes: #11395 .	2021-10-27 07:20:29 -07:00
Mahmood Ali	e06ff1d613	scheduler: stop allocs in unrelated nodes (#11391 ) The system scheduler should leave allocs on draining nodes as-is, but stop allocs on nodes that are no longer part of the job datacenters. Previously, the scheduler did not make the distinction and left system job allocs intact if they were already running. I've added a failing test first, which you can see in https://app.circleci.com/jobs/github/hashicorp/nomad/179661 . Fixes https://github.com/hashicorp/nomad/issues/11373	2021-10-27 07:04:13 -07:00

269 commits