open-nomad

Author	SHA1	Message	Date
Tim Gross	3169839653	docs: always use -ignore-system on node drain with CSI (#8606) Postrun hooks for allocation runners don't currently block the registration of terminal health with the servers, which is what allows system jobs to be drained. So draining nodes with jobs that claim CSI volumes requires the `-ignore-system` flag to ensure that the postrun hook for service jobs gets a chance to execute.	2020-08-07 11:22:28 -04:00
Tim Gross	7d53ed88d6	csi: client RPCs should return wrapped errors for checking (#8605) When the client-side actions of a CSI client RPC succeed but we get disconnected during the RPC or we fail to checkpoint the claim state, we want to be able to retry the client RPC without getting blocked by the client-side state (ex. mount points) already having been cleaned up in previous calls.	2020-08-07 11:01:36 -04:00
Tim Gross	81b604fa13	csi: controller unpublish should check current alloc count (#8604) Using the count of node claims from earlier in the `CSIVolume.Unpublish` RPC doesn't correctly account for cases where the RPC was interrupted but checkpointed. Instead, we'll check the current allocation count and status to determine whether we need to send a controller unpublish.	2020-08-07 10:43:45 -04:00
Seth Hoenig	d6a60ff4b1	Merge pull request #8603 from hashicorp/f-upgrade-consul-api deps: upgrade import of consul/api	2020-08-07 08:46:19 -05:00
Seth Hoenig	fb1c85a956	deps: upgrade import of consul/api Upgrade our consul/api import to the equivalent of consul@v1.8.1 which includes a bug fix necessary for #6913. If consul would publish a proper api/ submodule tag we could reference that.	2020-08-06 21:02:33 -05:00
Michael Lange	286e56ed82	Make eq-by helper resilient to a lack of prop since handlebars doesn't short-circuit evaluation	2020-08-06 17:59:26 -07:00
Michael Lange	476002d727	Key the annotations each loop by annotationKey for stable dom nodes	2020-08-06 17:58:43 -07:00
Michael Lange	a04d4f2d76	Add integration test for line-chart annotation staggering	2020-08-06 17:37:09 -07:00
Michael Lange	59d98b80ca	Add missing word "two" to test name Co-authored-by: Buck Doyle <buck@hashicorp.com>	2020-08-06 15:43:29 -07:00
Tim Gross	2854298089	csi: release claims via csi_hook postrun unpublish RPC (#8580) Add a Postrun hook to send the `CSIVolume.Unpublish` RPC to the server. This may forward client RPCs to the node plugins or to the controller plugins, depending on whether other allocations on this node have claims on this volume. By making clients responsible for running the `CSIVolume.Unpublish` RPC (and making the RPC available to a `nomad volume detach` command), the volumewatcher becomes only used by the core GC job and we no longer need async volume GC from job deregister and node update.	2020-08-06 14:51:46 -04:00
Michael Schurter	c7c603eda5	build: update from Go 1.14.6 to Go 1.14.7 Go 1.14.7 fixes CVE-2020-16845 which is not believed to impact Nomad.	2020-08-06 11:50:29 -07:00
Michael Schurter	057e1c021f	Merge pull request #8597 from hashicorp/b-vault-revoke-log-line vault: log once per interval if batching revocation	2020-08-06 11:32:47 -07:00
Tim Gross	314458ebdb	csi: update volumewatcher to use unpublish RPC (#8579) This changeset updates `nomad/volumewatcher` to take advantage of the `CSIVolume.Unpublish` RPC. This lets us eliminate a bunch of code and associated tests. The raft batching code can be safely dropped, as the characteristic times of the CSI RPCs are on the order of seconds or even minutes, so batching up raft RPCs added complexity without any real world performance wins. Includes refactor w/ test cleanup and dead code elimination in volumewatcher	2020-08-06 14:31:18 -04:00
Tim Gross	eaa14ab64c	csi: add unpublish RPC (#8572) This changeset is plumbing for a `nomad volume detach` command that will be reused by the volumewatcher claim GC as well.	2020-08-06 13:51:29 -04:00
Tim Gross	4bbf18703f	csi: retry controller client RPCs on next controller (#8561) The documentation encourages operators to run multiple controller plugin instances for HA, but the client RPCs don't take advantage of this by retrying when the RPC fails in cases when the plugin is unavailable (because the node has drained or the alloc has failed but we haven't received an updated fingerprint yet). This changeset tries all known controllers on ready nodes before giving up, and adds tests that exercise the client RPC routing and retries.	2020-08-06 13:24:24 -04:00
Luiz Aoqui	0fadad46d3	Merge pull request #8595 from hashicorp/docs/fix-connect-log-level docs: fix Consul Connect log_level meta key	2020-08-06 11:00:09 -04:00
Buck Doyle	67f8d95917	Add override for null events collection This removes some errors in the console if there are no autoscaling events.	2020-08-06 07:51:56 -05:00
James Rasell	38f23b79df	Merge pull request #8574 from shishir-a412ed/f-ui-containerd-driver Add nomad-driver-containerd to nomad UI docs.	2020-08-06 09:36:11 +02:00
Michael Schurter	2385fee0d2	vault: log once per interval if batching revocation This log line should be rare since: 1. Most tokens should be revoked synchronously, not via this async batched method. Async revocation only takes place when Vault connectivity is lost and after leader election so no revocations are missed. 2. There should rarely be more than one batch (1,000 tokens) to revoke since the above conditions should be brief and infrequent. 3. Interval is 5 minutes, so this log line will be emitted at most once every 5 minutes. What makes this log line rare is also what makes it interesting: due to a bug prior to Nomad 0.11.2 some tokens may never get revoked. Therefore Nomad tries to re-revoke them on every leader election. This caused a massive buildup of old tokens that would never be properly revoked and purged. Nomad 0.11.3 mostly fixed this but still had a bug in purging revoked tokens via Raft (fixed in #8553). The nomad.vault.distributed_tokens_revoked metric is only ticked upon successful revocation and purging, making any bugs or slowness in the process difficult to detect. Logging before a potentially slow revocation+purge operation is performed will give users much better indications of what activity is going on should the process fail to make it to the metric.	2020-08-05 15:39:21 -07:00
Luiz Aoqui	602d3373ed	docs: fix Consul Connect log_level meta key	2020-08-05 17:01:03 -04:00
Buck Doyle	9074d33f28	UI: Add truncation of rendered search results (#8571) This closes #8549. Thanks to @optiz0r for the bug report. Having the global search attempt to render every returned result is obviously a mistake!	2020-08-05 15:58:44 -05:00
Michael Lange	3b59b52bca	Compare scale events by their UID instead of reference equality	2020-08-05 12:02:23 -07:00
Michael Lange	ecaee67ff1	Use the correct gray for the info details	2020-08-05 12:02:22 -07:00
Michael Lange	033618c46e	Safestr the annotation style property	2020-08-05 12:02:22 -07:00
Michael Lange	ebdb0c4101	Conditionally show the scaling timeline or accordion	2020-08-05 12:02:22 -07:00
Michael Lange	4c4e2e505f	Unit test coverage for the ScaleEventsChart data domain logic	2020-08-05 12:02:22 -07:00
Michael Lange	792fa64101	Integration tests for the ScaleEventsChart component	2020-08-05 12:02:22 -07:00
Michael Lange	39583e0ce1	Force mock error scale events to be annotations	2020-08-05 12:02:22 -07:00
Michael Lange	09f6bca470	New ScaleEventsChart component Displays all scale events in the form of an annotated line chart. When annotations are clicked, the timestamp, message, and meta properties for the event are displayed below the chart.	2020-08-05 12:02:22 -07:00
Michael Lange	21f3b7dfcc	Add activeAnnotation property to line-chart	2020-08-05 12:02:22 -07:00
Michael Lange	2903d1f504	Stagger line chart annotations when they are too close	2020-08-05 12:02:22 -07:00
Michael Lange	b332e186b2	Add curve options to line chart	2020-08-05 12:02:22 -07:00
Michael Lange	a891e907f5	Test coverage for line chart annotations	2020-08-05 12:02:22 -07:00
Michael Lange	24b6aeb746	Story for line chart annotations	2020-08-05 12:02:22 -07:00
Michael Lange	8445e22faf	Add annotations to the line chart component	2020-08-05 12:02:22 -07:00
Michael Lange	299f2b6453	Make the default time series date format for line chart more useful	2020-08-05 12:02:21 -07:00
Shishir Mahajan	874f948520	Fix review comments.	2020-08-05 11:51:00 -07:00
Shishir Mahajan	088b0694b4	Add nomad-driver-containerd to nomad UI docs.	2020-08-04 11:29:05 -07:00
Chris Baker	a5dc6df0ff	Merge pull request #8583 from hashicorp/cgbaker-patch-1 Update CHANGELOG.md	2020-07-31 11:16:48 -05:00
Chris Baker	07e8b405d2	Update CHANGELOG.md Co-authored-by: Mahmood Ali <mahmood@hashicorp.com>	2020-07-31 11:14:13 -05:00
Chris Baker	e920bd22bb	Update CHANGELOG.md Co-authored-by: Mahmood Ali <mahmood@hashicorp.com>	2020-07-31 11:13:40 -05:00
Chris Baker	8ba61e60d6	Update CHANGELOG.md label in changelog listed wrong issue number	2020-07-31 11:05:53 -05:00
Drew Bailey	c06a84e4a2	ignore VAULT_NAMESPACE (#8581) VAULT_NAMESPACE in 0.12.1 and previous versions is already ignored. Revert the change that used it as a default since it will break OSS users.	2020-07-31 10:33:21 -04:00
Tim Gross	5dba653b43	csi/e2e: add 2nd controller for node drain testing (#8573)	2020-07-31 08:03:49 -04:00
Buck Doyle	c2ce0a1dec	Add linting for acceptance accessibility audits (#8570) This makes use of the PR I recently had merged to eslint-plugin-ember-a11y-testing to add linting that ensures an accessibility audit is called at least once per acceptance test file. When I have added linting for component tests, it can apply there too. I added exclusions for the filesystem browser tests, which are covered by behaviors/fs and for the search test which will involve significant overrides to Ember Power Select default templates.	2020-07-30 12:40:05 -05:00
James Rasell	90903bb625	Merge pull request #8555 from hashicorp/remove-size-detail-from-docs-homepage docs: remove Nomad binary size from README and website.	2020-07-30 19:20:23 +02:00
Mahmood Ali	490b9ce3a0	Handle Scaling Policies in Job Plan endpoint (#8567) Fixes https://github.com/hashicorp/nomad/issues/8544 This PR fixes a bug where using `nomad job plan ...` always reports no change if the submitted job contains scaling. The issue has three contributing factors: 1. The plan endpoint doesn't populate the required scaling policy ID, unlike the job register endpoint. 2. The plan endpoint suppresses errors on job insertion; the job insertion fails here because the scaling policy is missing the required ID. 3. The scheduler reports no update necessary when the relevant job isn't in store (because the insertion failed). This PR fixes the first two factors. Changing the scheduler to be more strict might make sense, but may violate some idempotency invariant or make the scheduler more brittle.	2020-07-30 12:27:36 -04:00
Michael Lange	acecdbf4a2	Merge pull request #8569 from hashicorp/d/update-scale-request-params Docs: Update Reason to Message in the job scale docs	2020-07-30 09:18:18 -07:00
Michael Lange	42517d87b4	Merge pull request #8568 from hashicorp/b-ui/scale-post-message Use the correct Message property instead of Reason in scale POST request	2020-07-30 09:13:15 -07:00
Michael Lange	af446cec10	Update Reason to Message in the job scale docs	2020-07-30 09:06:08 -07:00


19331 commits