open-nomad

Author	SHA1	Message	Date
Luiz Aoqui	5995ea9981	docs: improve job parse API documentation (#15387 )	2022-11-25 12:46:53 -05:00
Luiz Aoqui	8f91be26ab	scheduler: create placements for non-register MRD (#15325 ) * scheduler: create placements for non-register MRD For multiregion jobs, the scheduler does not create placements on registration because the deployment must wait for the other regions. One of these regions will then trigger the deployment to run. Currently, this is done in the scheduler by considering any eval for a multiregion job as "paused" since it's expected that another region will eventually unpause it. This becomes a problem where evals not triggered by a job registration happen, such as on a node update. These types of regional changes do not have other regions waiting to progress the deployment, and so they were never resulting in placements. The fix is to create a deployment at job registration time. This additional piece of state allows the scheduler to differentiate between a multiregion change, where there are other regions engaged in the deployment so no placements are required, and a regional change, where the scheduler does need to create placements. This deployment starts in the new "initializing" status to signal to the scheduler that it needs to compute the initial deployment state. The multiregion deployment will wait until this deployment state is persisted and its status is set to "pending". Without this state transition it's possible to hit a race condition where the plan applier and the deployment watcher may step on each other and overwrite their changes. * changelog: add entry for #15325	2022-11-25 12:45:34 -05:00
Piotr Kazmierczak	9c85315bd2	bugfix: typos in acl role commands (#15382 ) Co-authored-by: James Rasell <jrasell@users.noreply.github.com>	2022-11-25 10:28:33 +01:00
Tim Gross	8657695322	scheduler: set job on system stack for CSI feasibility check (#15372 ) When the scheduler checks feasibility of each node, it creates a "stack" which carries attributes of the job and task group it needs to check feasibility for. The `system` and `sysbatch` schedulers use a different stack than `service` and `batch` jobs. This stack was missing the call to set the job ID and namespace for the CSI check. This prevents CSI volumes from being scheduled for system jobs whenever the volume is in a non-default namespace. Set the job ID and namespace to match the generic scheduler.	2022-11-23 16:47:35 -05:00
Luiz Aoqui	4208cfcfbd	cli: improve errors for multiregion deployments (#15326 ) Co-authored-by: Piotr Kazmierczak <470696+pkazmierczak@users.noreply.github.com>	2022-11-23 16:40:13 -05:00
Jack	62f7de7ed5	cli: `wait` flag for use with `deployment status -monitor` (#15262 )	2022-11-23 16:36:13 -05:00
Tim Gross	918a91f8bf	fix linter for go.mod tidy (#15373 ) This passed lint in #15335 but that PR was not rebased on incoming changes before merging, so this was missed and has shown up on `main`	2022-11-23 16:26:00 -05:00
Jai	49b7e78275	refact: add namespace to nested detail pages (#15261 ) * refact: add namespace to job breadcrumb * refact: edit markup for correctness * styling: add child span style * chore: undo prev commits after design exploration * ui: add namespace to alloc details * ui: add namespace to task details * ui: add namespace to taskgroup	2022-11-23 11:36:47 -05:00
dependabot[bot]	077d4fac84	build(deps): bump github.com/aws/aws-sdk-go from 1.44.126 to 1.44.142 (#15335 ) Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.44.126 to 1.44.142. - [Release notes](https://github.com/aws/aws-sdk-go/releases) - [Commits](https://github.com/aws/aws-sdk-go/compare/v1.44.126...v1.44.142) --- updated-dependencies: - dependency-name: github.com/aws/aws-sdk-go dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-11-23 10:24:29 -05:00
dependabot[bot]	178a0f1bc7	build(deps): bump github.com/hashicorp/vault/sdk from 0.6.0 to 0.6.1 (#15233 ) Bumps [github.com/hashicorp/vault/sdk](https://github.com/hashicorp/vault) from 0.6.0 to 0.6.1. - [Release notes](https://github.com/hashicorp/vault/releases) - [Changelog](https://github.com/hashicorp/vault/blob/main/CHANGELOG.md) - [Commits](https://github.com/hashicorp/vault/compare/v0.6.0...v0.6.1) --- updated-dependencies: - dependency-name: github.com/hashicorp/vault/sdk dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-11-23 10:07:36 -05:00
dependabot[bot]	788e0cb6aa	build(deps): bump golang.org/x/sys from 0.1.0 to 0.2.0 (#15231 ) Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.1.0 to 0.2.0. - [Release notes](https://github.com/golang/sys/releases) - [Commits](https://github.com/golang/sys/compare/v0.1.0...v0.2.0) --- updated-dependencies: - dependency-name: golang.org/x/sys dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-11-23 10:07:29 -05:00
dependabot[bot]	9b90820b9c	build(deps): bump github.com/docker/docker (#15229 ) Bumps [github.com/docker/docker](https://github.com/docker/docker) from 20.10.19+incompatible to 20.10.21+incompatible. - [Release notes](https://github.com/docker/docker/releases) - [Changelog](https://github.com/moby/moby/blob/master/CHANGELOG.md) - [Commits](https://github.com/docker/docker/compare/v20.10.19...v20.10.21) --- updated-dependencies: - dependency-name: github.com/docker/docker dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-11-23 10:07:22 -05:00
dependabot[bot]	cc479e0a49	build(deps): bump github.com/hashicorp/go-plugin from 1.4.5 to 1.4.6 (#15230 ) Bumps [github.com/hashicorp/go-plugin](https://github.com/hashicorp/go-plugin) from 1.4.5 to 1.4.6. - [Release notes](https://github.com/hashicorp/go-plugin/releases) - [Changelog](https://github.com/hashicorp/go-plugin/blob/master/CHANGELOG.md) - [Commits](https://github.com/hashicorp/go-plugin/compare/v1.4.5...v1.4.6) --- updated-dependencies: - dependency-name: github.com/hashicorp/go-plugin dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-11-23 10:07:07 -05:00
Sam	4689822628	Fix missing host header in http check (#15337 )	2022-11-23 08:58:13 -05:00
James Rasell	32dfa431f3	sso: add ACL auth-method HTTP API CRUD endpoints (#15338 ) * core: remove custom auth-method TTLs and use ACL token TTLs. * agent: add ACL auth-method HTTP endpoints for CRUD actions. * api: add ACL auth-method client.	2022-11-23 09:38:02 +01:00
Phil Renaud	3189826a5b	Task sub row alignment changes (#15363 )	2022-11-22 15:49:50 -05:00
Lance Haig	0263e7af34	Add command "nomad tls" (#14296 )	2022-11-22 14:12:07 -05:00
Tim Gross	c66ab53755	Merge pull request #15361 from hashicorp/post-1.4.3-release Post 1.4.3 release	2022-11-22 13:20:43 -05:00
Tim Gross	b781bbe7ea	changelogs for 1.3.8 and 1.2.15	2022-11-22 12:57:55 -05:00
hc-github-team-nomad-core	f63351f9e2	Prepare for next release	2022-11-22 12:56:29 -05:00
hc-github-team-nomad-core	031d75e158	Generate files for 1.4.3 release	2022-11-22 12:56:29 -05:00
Seth Hoenig	dfc3b067ea	e2e: fix 1 of 4 client disconnect tests (#15357 ) This PR modifies the disconnect helper job to run as root, which is necessary for manipulating iptables as it does. Also re-organizes the final test logic to wait for client re-connect before looking for the replacement (3rd) allocation in case that client was needed to run the alloc (also giving the scheduler more time to do its thing). Skips the other 3 tests, which fail and I cannot yet figure out what is going on.	2022-11-22 08:51:53 -06:00
Jai	9e6025f25b	refact: add conditional table logic (#15330 )	2022-11-22 09:19:16 -05:00
Phil Renaud	bebbf340e2	Conditional CSS and an awaiter to help screenshot consistency (#15355 )	2022-11-21 14:55:13 -05:00
Tim Gross	9ba30b9cbd	ensure engineering has merge authority on build pipeline (#15350 ) Adds @hashicorp/nomad-eng to the codeowners list for the build and release workflow files, so that we can fix problems that arise without being bottlenecked on another team.	2022-11-21 14:30:02 -05:00
Tim Gross	c7d3b3a543	pin build/release pipeline to ubuntu 20.04 (#15348 ) The `ubuntu-latest` runner has been migrated to Ubuntu 22.04, which doesn't have all the same multilib packages as 20.04. Although we'll probably want to migrate eventually, we should ship Nomad 1.4.3 with the same toolchain as we did previously so that we're not introducing new issues.	2022-11-21 14:08:45 -05:00
Seth Hoenig	2c7c6334c0	e2e: fixup oversubscription test case for jammy (#15347 ) * e2e: fixup oversubscription test case for jammy jammy uses cgroups v2, need to lookup the max memory limit from the unified hierarchy format * e2e: set constraint to require cgroups v2 on oversub docker test	2022-11-21 12:41:55 -06:00
James Rasell	e2a2ea68fc	client: accommodate Consul 1.14.0 gRPC and agent self changes. (#15309 ) * client: accommodate Consul 1.14.0 gRPC and agent self changes. Consul 1.14.0 changed the way in which gRPC listeners are configured, particularly when using TLS. Prior to the change, a single listener was responsible for handling plain-text and encrypted gRPC requests. In 1.14.0 and beyond, separate listeners will be used for each, defaulting to 8502 and 8503 for plain-text and TLS respectively. The change means that Nomad’s Consul Connect integration would not work when integrated with Consul clusters using TLS and running 1.14.0 or greater. The Nomad Consul fingerprinter identifies the gRPC port Consul has exposed using the "DebugConfig.GRPCPort" value from Consul’s “/v1/agent/self” endpoint. In Consul 1.14.0 and greater, this only represents the plain-text gRPC port which is likely to be disabled in clusters running TLS. In order to fix this issue, Nomad now takes into account the Consul version and configured scheme to optionally use “DebugConfig.GRPCTLSPort” value from Consul’s agent self return. The “consul_grpc_socket” allocrunner hook has also been updated so that the fingerprinted gRPC port attribute is passed in. This provides a better fallback method, when the operator does not configure the “consul.grpc_address” option. * docs: modify Consul Connect entries to detail 1.14.0 changes. * changelog: add entry for #15309 * fixup: tidy tests and clean version match from review feedback. * fixup: use strings tolower func.	2022-11-21 09:19:09 -06:00
Jai	51fff25954	respect casing on service tags (#15329 ) * styles: add service tag style * refact: update service tag on alloc * refact: update service tag in component	2022-11-21 10:18:15 -05:00
Jai	7aa3fd76db	style: wrap secret value in tag (#15331 )	2022-11-21 10:18:02 -05:00
Seth Hoenig	bf4b5f9a8d	consul: add trace logging around service registrations (#15311 ) This PR adds trace logging around the differential done between a Nomad service registration and its corresponding Consul service registration, in an effort to shed light on why a service registration request is being made.	2022-11-21 08:03:56 -06:00
Piotr Kazmierczak	bb66b5e770	acl: sso auth method RPC endpoints (#15221 ) This PR implements RPC endpoints for SSO auth methods. This PR is part of the SSO work captured under ☂️ ticket #13120.	2022-11-21 10:15:39 +01:00
Piotr Kazmierczak	d02241cad5	acl: sso auth method event stream (#15280 ) This PR implements SSO auth method support in the event stream. This PR is part of the SSO work captured under ☂️ ticket #13120.	2022-11-21 10:06:05 +01:00
Phil Renaud	11dc19b307	[ui] Show Consul Connect upstreams / on update info in sidebar (#15324 ) * Added consul connect icon and sidebar info * Show icon to the right of name	2022-11-18 22:49:10 -05:00
Seth Hoenig	eaf842b226	e2e: jammy image needs latest java lts (#15323 )	2022-11-18 14:36:36 -06:00
James Rasell	a7350853ae	api: ensure ACL role upsert decode error returns a 400 status code. (#15253 )	2022-11-18 17:47:43 +01:00
James Rasell	3225cf77b6	api: ensure all request body decode errors return a 400 status code. (#15252 )	2022-11-18 17:04:33 +01:00
Luiz Aoqui	b28494ec9a	docs: add cpu-allocated and memory-allocated (#15299 ) Document the Autoscaler Nomad APM parameters `cpu-allocated` and `memory-allocated` that were implemented in https://github.com/hashicorp/nomad-autoscaler/pull/324 and https://github.com/hashicorp/nomad-autoscaler/pull/334	2022-11-18 10:55:17 -05:00
Tim Gross	05a46e6648	make eval cancelation really async with `Eval.Ack` (#15298 ) Ensure we never block in the `Eval.Ack`	2022-11-18 08:38:17 -05:00
Luiz Aoqui	909435a0e7	scheduler: log stack in case of panic (#15303 )	2022-11-17 18:59:33 -05:00
stswidwinski	7b6e856a29	Add mount propagation to protobuf definition of mounts (#15096 ) * Add mount propagation to protobuf definition of mounts * Fix formatting * Add mount propagation to the simple roundtrip test. * changelog: add entry for #15096 Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>	2022-11-17 18:14:59 -05:00
Tim Gross	b74a868aae	make eval cancelation async with `Eval.Ack` (#15294 ) In #14621 we added an eval cancelation reaper goroutine with a channel that allowed us to wake it up. But we forgot to actually send on this channel from `Eval.Ack` and are still committing the cancelations synchronously. Fix this by sending on the buffered channel to wake up the reaper instead.	2022-11-17 16:40:41 -05:00
Tim Gross	d0f9e887f7	autopilot: include only servers from the same region (#15290 ) When we migrated to the updated autopilot library in Nomad 1.4.0, the interface for finding servers changed. Previously autopilot would get the serf members and call `IsServer` on each of them, leaving it up to the implementor to filter out clients (and in Nomad's case, other regions). But in the "new" autopilot library, the equivalent interface is `KnownServers` for which we did not filter by region. This causes spurious attempts for the cross-region stats fetching, which results in TLS errors and a lot of log noise. Filter the member set by region to fix the regression.	2022-11-17 12:09:36 -05:00
Tim Gross	510eb435dc	remove deprecated `AllocUpdateRequestType` raft entry (#15285 ) After Deployments were added in Nomad 0.6.0, the `AllocUpdateRequestType` raft log entry was no longer in use. Mark this as deprecated, remove the associated dead code, and remove references to the metrics it emits from the docs. We'll leave the entry itself just in case we encounter old raft logs that we need to be able to safely load.	2022-11-17 12:08:04 -05:00
Seth Hoenig	845ff10281	e2e: disable systemd stub dns in jammy image (#15286 )	2022-11-17 09:50:44 -06:00
stswidwinski	75f80e2fdd	Fix goroutine leakage (#15180 ) * Fix goroutine leakage * cl: add cl entry Co-authored-by: Seth Hoenig <shoenig@duck.com>	2022-11-17 09:47:11 -06:00
Seth Hoenig	74f1fe437e	ci: use hashicorp/setup-golang for setting up go compiler, cache (#15271 ) This PR changes test-core to make use of https://github.com/hashicorp/setup-golang to consolidate the setting up of the Go compiler and the Go modules cache used for the CI job. Fixes: #14905	2022-11-17 07:50:45 -06:00
Tim Gross	dd3a07302e	keyring: update handle to state inside replication loop (#15227 ) * keyring: update handle to state inside replication loop When keyring replication starts, we take a handle to the state store. But whenever a snapshot is restored, this handle is invalidated and no longer points to a state store that is receiving new keys. This leaks a bunch of memory too! In addition to operator-initiated restores, when fresh servers are added to existing clusters with large-enough state, the keyring replication can get started quickly enough that it's running before the snapshot from the existing clusters has been restored. Fix this by updating the handle to the state store on each pass.	2022-11-17 08:40:12 -05:00
Ayrat Badykov	c94c231c08	fix create snapshot request docs (#15242 )	2022-11-17 08:43:40 +01:00
Tim Gross	6415fb4284	eval broker: shed all but one blocked eval per job after ack (#14621 ) When an evaluation is acknowledged by a scheduler, the resulting plan is guaranteed to cover up to the `waitIndex` set by the worker based on the most recent evaluation for that job in the state store. At that point, we no longer need to retain blocked evaluations in the broker that are older than that index. Move all but the highest priority / highest `ModifyIndex` blocked eval into a canceled set. When the `Eval.Ack` RPC returns from the eval broker it will signal a reap of a batch of cancelable evals to write to raft. This paces the cancelations, limited by how frequently the schedulers are acknowledging evals; this should reduce the risk of cancelations overwhelming raft relative to scheduler progress. In order to avoid straggling batches when the cluster is quiet, we also include a periodic sweep through the cancelable list.	2022-11-16 16:10:11 -05:00
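
Note on the eval cancelation commits above (#14621, #15294, #15298): the messages describe waking a reaper goroutine by sending on a buffered channel, and ensuring `Eval.Ack` never blocks. A minimal Go sketch of that general pattern follows. The `reaper` type, the `signal` method, and the channel capacity of 1 are hypothetical names and choices made for illustration; they are not taken from the Nomad source, and the non-blocking `select` is only one assumed way to get the "never block" behavior.

```go
package main

import "fmt"

// reaper is a hypothetical stand-in for a background goroutine's handle;
// it is not Nomad's actual type.
type reaper struct {
	// wake is buffered (capacity 1, an assumption) so that one pending
	// wake-up can be queued while the reaper is busy.
	wake chan struct{}
}

func newReaper() *reaper {
	return &reaper{wake: make(chan struct{}, 1)}
}

// signal asks the reaper to run without ever blocking the caller.
// It reports whether a new wake-up was queued; false means one was
// already pending, so the extra signal is safely dropped.
func (r *reaper) signal() bool {
	select {
	case r.wake <- struct{}{}:
		return true
	default:
		return false
	}
}

func main() {
	r := newReaper()
	fmt.Println(r.signal()) // true: buffer was empty
	fmt.Println(r.signal()) // false: a wake-up is already pending
	<-r.wake                // the reaper consumes the wake-up
	fmt.Println(r.signal()) // true: buffer is empty again
}
```

Dropping redundant signals is harmless in this sketch because a single wake-up is enough for the consumer to process everything that has accumulated; the periodic sweep mentioned in #14621 would cover anything left over.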

24115 commits