open-nomad

Author	SHA1	Message	Date
James Rasell	939b23936a	Merge pull request #10744 from hashicorp/b-remove-duplicate-imports chore: remove duplicate import statements	2021-06-11 16:42:34 +02:00
Mahmood Ali	74efd3626e	Merge pull request #10742 from hashicorp/deflake-tests-20210608 Deflaking Test 2021 June edition	2021-06-11 09:14:40 -04:00
James Rasell	c168108bb7	Merge pull request #10739 from hashicorp/f-remove-unused-types-pkg core: remove unused types pkg and PeriodicCallback type.	2021-06-11 13:27:22 +02:00
James Rasell	492e308846	tests: remove duplicate import statements.	2021-06-11 09:39:22 +02:00
James Rasell	3b16e4d0b3	jobspec2: remove duplicate imports statements.	2021-06-11 09:38:47 +02:00
James Rasell	050b5408c7	drivers: remove duplicate import statements.	2021-06-11 09:38:09 +02:00
James Rasell	2898e5d379	e2e: remove duplicate import statements.	2021-06-11 09:37:23 +02:00
Mahmood Ali	9b35bf1858	deflake TestNomad_BootstrapExpect and other leader tests The test fails reliably locally on my machine. The test uses non-dev mode where Raft actions get committed to disk, causing operations to exceed the tight 50ms Raft deadlines. So, here we ensure that non-dev servers use default Raft config files with longer timeouts. Also, noticed that the test queries a server that may be a follower with stale state. I've updated the test to ensure we query the leader for its state. The Barrier call ensures that the leader is a "stable" leader with committed entries. Protects against a window where a new leader reports the previous term before it commits a raft log entry.	2021-06-10 22:04:10 -04:00
Mahmood Ali	ff73cc279e	tests: deflake TestAgentProfile_RemoteClient TestAgentProfile_RemoteClient test must wait for the client node to be registered in raft state store, and not merely that the server has a network connection from the client. In https://app.circleci.com/pipelines/github/hashicorp/nomad/15539/workflows/8dcbc3f3-946b-4da0-b089-9093788bc0c9/jobs/147919, notice how the `node registration complete` log line occurred after the test had already failed. This is another case of flakiness due to not waiting for client registration.	2021-06-10 22:00:15 -04:00
Mahmood Ali	8009d9837c	tests: deflake TestMonitor_Monitor_RemoteServer and cross-region tests Ensure that all servers are joined to each other before the test proceeds, instead of just joining them to the first server and relying on background serf propagation. Relying on background serf propagation is a cause of flakiness, especially for tests with multiple regions. The server receiving the RPC may not be aware of the region and fail to forward the RPC accordingly. For example, consider the `TestMonitor_Monitor_RemoteServer` failure in https://app.circleci.com/pipelines/github/hashicorp/nomad/16402/workflows/7f327235-7d0c-40ba-9757-600522afca51/jobs/158045 where you can observe: * `nomad-117` is joined to `nomad-118` and `nomad-119` * `nomad-119` is the foreign region * `nomad-117` gains leadership in the default region, `nomad-118` is the non-leader * search logs for `nomad: adding server` and notice that `nomad-118` only added `nomad-118` and `nomad-118`, but not `nomad-119`! * so the query to the non-leader in the test fails to be forwarded to the appropriate region.	2021-06-10 21:27:55 -04:00
Mahmood Ali	071c556b3d	tests: deflake CSI forwarding tests This updates `client.Ready()` so it returns once the client node is registered at the servers. Previously, it returned when the fingerprinters' first batch completed, without ensuring that the node is stored in the Raft data. The tests may then fail later with unknown node errors. `client.Ready()` seems to be called only in CSI and some client stats now. This class of bug, assuming the client is registered without checking, is a source of flakiness elsewhere. Other tests use other mechanisms for checking node readiness, though not consistently.	2021-06-10 21:26:34 -04:00
Isabel Suchanek	785eb40985	Merge pull request #10740 from hashicorp/docs-deploy-monitor docs: add deployment monitor to docs, changelog	2021-06-10 13:53:03 -07:00
Isabel Suchanek	c6c52bc53e	docs: add deployment monitor to docs, changelog Updates the deployment status and job run docs	2021-06-10 10:51:33 -07:00
James Rasell	25883eca43	core: remove unused types pkg and PeriodicCallback type.	2021-06-10 15:57:13 +02:00
Mahmood Ali	b372a1d2b4	update release to 1.1.1 (#10735)	2021-06-10 08:57:30 -04:00
Mahmood Ali	448282ff84	prepare for 1.1.2 dev cycle	2021-06-10 08:04:25 -04:00
Nomad Release Bot	4fe52bc753	remove generated files	2021-06-10 08:04:25 -04:00
Nomad Release bot	7cc7389afd	Generate files for 1.1.1 release	2021-06-10 08:04:25 -04:00
Mahmood Ali	fd15ac1821	prepare changelog for 1.1.1/1.0.7 release	2021-06-10 08:04:25 -04:00
Shishir Mahajan	f50f10504f	Update containerd task driver options. - hostname - auth Signed-off-by: Shishir Mahajan <smahajan@roblox.com>	2021-06-10 08:03:49 -04:00
Mahmood Ali	aa77c2731b	tests: use standard library testing.TB Glint pulled in an updated version of mitchellh/go-testing-interface which broke some existing tests because the update added a Parallel() method to testing.T. This switches to the standard library testing.TB which doesn't have a Parallel() method.	2021-06-09 16:18:45 -07:00
Isabel Suchanek	dfaef2468c	cli: add monitor flag to deployment status Adding '-verbose' will print out the allocation information for the deployment. This also changes the job run command so that it now blocks until deployment is complete and adds timestamps to the output so that it's more in line with the output of node drain. This uses glint to print in place when running in a tty. Because glint doesn't yet support cmd/powershell, Windows workflows use a different library to print in place, which results in slightly different formatting: 1) different margins, and 2) no spinner indicating deployment in progress.	2021-06-09 16:18:45 -07:00
Mahmood Ali	baacc3c19b	golang: update to 1.16.5 (#10733) 1.16.5 contains some security fixes for zip that are notable to get: https://github.com/golang/go/issues?q=milestone%3AGo1.16.5+label%3ACherryPickApproved	2021-06-09 11:51:41 -04:00
Mike Wickett	409075d51f	website: update alert banner (#10728)	2021-06-09 11:02:10 -04:00
Mahmood Ali	0976af471c	driver/docker: ignore cpuset errors for short-lived tasks follow up (#10730) minor refactor and changelog	2021-06-09 11:00:39 -04:00
Seth Hoenig	2eaf7c8ef8	Merge pull request #10732 from hashicorp/docs-update-cl docs: update cl	2021-06-09 09:57:20 -05:00
Seth Hoenig	f4b4727a64	docs: update cl	2021-06-09 09:50:29 -05:00
Seth Hoenig	7ce74d80eb	Merge pull request #10729 from hashicorp/f-cns-acl-check_cp-ent consul: move consul acl tests into ent files	2021-06-09 09:45:13 -05:00
Mahmood Ali	c2026dfa28	Merge pull request #10416 from hashicorp/b-cores-docker driver/docker: ignore error if container exists before cgroup can be written	2021-06-09 10:34:02 -04:00
Seth Hoenig	dbdc479970	consul: move consul acl tests into ent files (cherry-pick ent back to oss) This PR moves a lot of Consul ACL token validation tests into ent files, so that we can verify correct behavior difference between OSS and ENT Nomad versions.	2021-06-09 08:38:42 -05:00
Michael Schurter	fff95b0697	docs: improve wait_for_index metrics description (#10717) Old description of `{plan,worker}.wait_for_index` described the metric in terms of waiting for a snapshot which has two problems: 1. "Snapshot" is an overloaded term in Nomad and operators can't be expected to know which use we're referring to here. 2. The most important thing about the metric is what we're waiting on before taking a snapshot: the raft index of the object to be processed (plan or eval). The new description tries to cram all of that context into the tiny space provided. See #5791 for details about the `wait_for_index` mechanism in general.	2021-06-09 08:53:06 -04:00
Seth Hoenig	d656777dd7	Merge pull request #10720 from hashicorp/f-cns-acl-check consul: correctly check consul acl token namespace when using consul oss	2021-06-08 15:43:42 -05:00
Seth Hoenig	87be8c4c4b	consul: correctly check consul acl token namespace when using consul oss This PR fixes the Nomad Object Namespace <-> Consul ACL Token relationship check when using Consul OSS (or Consul ENT without namespace support). Nomad v1.1.0 introduced a regression where Nomad would fail the validation when submitting Connect jobs and allow_unauthenticated set to true, with Consul OSS - because it would do the namespace check against the Consul ACL token assuming the "default" namespace, which does not work because Consul OSS does not have namespaces. Instead of making the bad assumption, expand the namespace check to handle each special case explicitly. Fixes #10718	2021-06-08 13:55:57 -05:00
Michael Schurter	3196326a73	Merge pull request #10721 from hashicorp/b-icanthazip e2e: use api.ipify.org	2021-06-08 09:45:04 -07:00
Tim Gross	e44b039ea0	docs: warn not to set `network_mode` for Connect-enabled Docker task	2021-06-08 10:14:15 -04:00
James Rasell	bf2a5baf73	Merge pull request #10723 from hashicorp/f-changelog-10712 changelog: add entry for #10712	2021-06-08 15:05:00 +02:00
James Rasell	1061e3e5eb	changelog: add entry for #10712	2021-06-08 14:28:56 +02:00
Brandon Romano	ee946c7ef6	Merge pull request #10679 from hashicorp/ks.website/updates-alert-banner chore: updates alert-banner — hashiconf	2021-06-08 04:17:27 -07:00
Michael Schurter	319650d481	e2e: use api.ipify.org ipv4.icanhazip.com returns ipv6 addresses	2021-06-07 15:12:42 -07:00
Seth Hoenig	c13bf8b917	Merge pull request #10715 from hashicorp/f-cns-attrs consul: probe consul namespace feature before using namespace api	2021-06-07 16:11:17 -05:00
Seth Hoenig	209e2d6d81	consul: pr cleanup namespace probe function signatures	2021-06-07 15:41:01 -05:00
Tim Gross	3d6f8a3ecc	changelog for backporting #10714	2021-06-07 15:46:15 -04:00
dependabot[bot]	c056a9b922	build(deps): bump lodash from 4.17.20 to 4.17.21 in /website (#10716) Bumps [lodash](https://github.com/lodash/lodash) from 4.17.20 to 4.17.21. - [Release notes](https://github.com/lodash/lodash/releases) - [Commits](https://github.com/lodash/lodash/compare/4.17.20...4.17.21) --- updated-dependencies: - dependency-name: lodash dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-06-07 15:01:01 -04:00
Kendall Strautman	9f0d4cecba	Update website/data/alert-banner.js	2021-06-07 10:35:31 -07:00
Florian Apolloner	ad472e8079	Fixed global-search keyboard shortcut for non-english keyboard layouts. Closes #10646	2021-06-07 13:32:38 -04:00
Seth Hoenig	519429a2de	consul: probe consul namespace feature before using namespace api This PR changes Nomad's wrapper around the Consul NamespaceAPI so that it will detect if the Consul Namespaces feature is enabled before making a request to the Namespaces API. Namespaces are not enabled in Consul OSS, and require a suitable license to be used with Consul ENT. Previously Nomad would check for a 404 status code when making a request to the Namespaces API to "detect" if Consul OSS was being used. This does not work for Consul ENT with Namespaces disabled, which returns a 500. Now we avoid requesting the namespace API altogether if Consul is detected to be the OSS sku, or if the Namespaces feature is not licensed. Since Consul can be upgraded from OSS to ENT, or a new license applied, we cache the value for 1 minute, refreshing on demand if expired. Fixes https://github.com/hashicorp/nomad-enterprise/issues/575 Note that the ticket originally describes using attributes from https://github.com/hashicorp/nomad/issues/10688. This turns out not to be possible due to a chicken-egg situation between bootstrapping the agent and setting up the consul client. Also fun: the Consul fingerprinter creates its own Consul client, because there is [currently] no way to pass the agent's client through the fingerprint factory.	2021-06-07 12:19:25 -05:00
James Rasell	15bb40db77	Merge pull request #10712 from hashicorp/b-gh-10711 cmd: validate the type flag when querying plugin status.	2021-06-07 18:14:20 +02:00
Mahmood Ali	9de37cf1d8	update changelog for GH-10710 (#10713) Co-authored-by: Tim Gross <tgross@hashicorp.com>	2021-06-07 10:27:53 -04:00
Mahmood Ali	2c73552b4d	pool: track usage of incoming streams (#10710) Track usage of incoming streams on a connection. Connections without reference counts get marked as unused and reaped in a periodic job. This fixes a bug where `alloc exec` and `alloc fs` sessions get terminated unexpectedly. Previously, when a client's heartbeats switch between servers, the pool connection reaper eventually identifies the connection as unused and closes it even if it has active exec/fs sessions. Fixes #10579	2021-06-07 10:22:37 -04:00
James Rasell	888371a012	cmd: validate the type flag when querying plugin status.	2021-06-07 13:53:28 +02:00

21447 commits