Commit graph

21795 commits

Author SHA1 Message Date
Mahmood Ali 97966c7a71
e2e: Run system jobs on all datacenters (#11060)
Target all e2e datacenters for system and sysbatch e2e tests.  They
require that the system jobs run on all linux clients.

However, the jobs currently only target the `dc1` datacenter, but the nightly
e2e cluster has 4 clients spread in `dc1` and `dc2` datacenters, causing
the tests to fail.

I missed this problem in the e2e dev cluster because it only used a single
`dc1` datacenter.
2021-08-17 11:01:47 -04:00
Mahmood Ali c37339a8c8
Merge pull request #9160 from hashicorp/f-sysbatch
core: implement system batch scheduler
2021-08-16 09:30:24 -04:00
James Rasell 534368780b
Merge pull request #11051 from hashicorp/b-gh-11047
tlsutil: update testing certificates close to expiry.
2021-08-16 09:42:01 +02:00
James Rasell 54d6785bcc
tlsutil: update testing certificates close to expiry. 2021-08-13 11:09:40 +02:00
Blake Covarrubias 0778ffab8c docs: Remove note on ingress gateway hosts field needing a port number
Update the ingress gateway documentation to remove the note stating
that a port must be specified for values in the `hosts` field when
the ingress gateway is listening on a non-standard HTTP port.

Specifying a port was required in Consul 1.8.0, but that requirement
was removed in 1.8.1 with hashicorp/consul#8190 which made Consul
include the port number when constructing the Envoy configuration.

Related Consul docs PR: hashicorp/consul#10827
2021-08-11 14:55:05 -07:00
Mahmood Ali 5ae9df80bf
docs: Consul Connect tweaks (#11040)
Tweaks to the commands in Consul Connect page.

For multi-command scripts, having the leading `$` is a bit annoying, as it makes copying the text harder. Also, the `copy` button would only copy the first command and ignore the rest.

Also, the `echo 1 > ...` commands are required to run as root, unlike the rest! I made them use `| sudo tee` pattern to ease copy & paste as well.

Lastly, update the CNI plugin links to 1.0.0. It's fresh off the oven - just got released less than an hour ago: https://github.com/containernetworking/plugins/releases/tag/v1.0.0 .
2021-08-11 17:14:26 -04:00
Grant Griffiths 7af1e0b270 Add -k consul dependency back in
Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>
2021-08-11 09:03:25 -07:00
Mahmood Ali 47d2dcd60a
Merge pull request #11034 from tgross/docs-cni-install
docs: note CNI requirement for bridge networking
2021-08-11 11:07:25 -04:00
Tim Gross de957b48ff docs: note CNI requirement for bridge networking
Using `bridge` networking requires that you have CNI plugins installed
on the client, but this isn't in the jobspec `network` docs which are
the first place someone will look when trying to configure task
networking.
2021-08-11 10:18:35 -04:00
Michael Schurter a7aae6fa0c
Merge pull request #10848 from ggriffiths/listsnapshot_secrets
CSI Listsnapshot secrets support
2021-08-10 15:59:33 -07:00
Mahmood Ali ea003188fa
system: re-evaluate node on feasibility changes (#11007)
Fix a bug where system jobs may fail to be placed on a node that
initially was not eligible for system job placement.

This change causes the scheduler to re-evaluate the node if any
attribute used in feasibility checks changes.

Fixes https://github.com/hashicorp/nomad/issues/8448
2021-08-10 17:17:44 -04:00
Mahmood Ali bfc766357e
deployments: canary=0 is implicitly autopromote (#11013)
In a multi-task-group job, treat 0 canary groups as auto-promote.

This change fixes an edge case where Nomad required a manual promotion
if the job had any group with canary=0 and the rest of the groups had
auto_promote set.

Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2021-08-10 17:06:40 -04:00
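
The auto-promote rule described in #11013 can be sketched as follows. This is a minimal illustration, not Nomad's actual code: the `group` type and the `allAutoPromote` helper are hypothetical names introduced here, and the only behavior taken from the commit message is that a group with `canary=0` no longer blocks automatic promotion.

```go
package main

import "fmt"

// group is a hypothetical, simplified view of a task group's update settings.
type group struct {
	Name        string
	Canary      int
	AutoPromote bool
}

// allAutoPromote reports whether a deployment could be promoted
// automatically. Groups without canaries are skipped, so they are
// treated as implicitly auto-promoted.
func allAutoPromote(groups []group) bool {
	for _, g := range groups {
		if g.Canary == 0 {
			continue // nothing to promote for this group
		}
		if !g.AutoPromote {
			return false
		}
	}
	return true
}

func main() {
	// One group has no canaries, the other has canaries and auto_promote.
	mixed := []group{
		{Name: "web", Canary: 0, AutoPromote: false},
		{Name: "api", Canary: 2, AutoPromote: true},
	}
	fmt.Println(allAutoPromote(mixed)) // true

	// A group with canaries but without auto_promote still needs manual promotion.
	manual := []group{
		{Name: "api", Canary: 2, AutoPromote: false},
	}
	fmt.Println(allAutoPromote(manual)) // false
}
```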
Mahmood Ali efcc8bf082
Speed up client startup and registration (#11005)
Speed up client startup, by retrying more until the servers are known.

Currently, if client fingerprinting is fast and finishes before the
client connects to a server, node registration may be delayed by 15
seconds or so!

Ideally, we'd wait until the client discovers the servers and then retry
immediately, but that requires significant code changes.

Here, we simply retry the node registration request every second. That's
basically the equivalent of checking every second whether the client has
discovered servers. It should be a cheap operation.

When testing this change on my local computer, where both servers and
clients are co-located, the time from startup till node registration
dropped from 34 seconds to 8 seconds!
2021-08-10 17:06:18 -04:00
Michael Schurter 71372b9b7e docs: add brief description to portworx readme
Also hclfmt hcl snippet.
2021-08-10 10:58:26 -07:00
Luiz Aoqui c1d1906628
ui: add missing pipe separator in parameterized and periodic jobs (#11020) 2021-08-10 13:48:20 -04:00
Michael Schurter ec08bd6ac7
Merge pull request #10995 from miao1007/patch-1
docs: Add replication_token link with authoritative_region
2021-08-10 10:48:02 -07:00
Jai 29a7fe6efa
Merge pull request #10666 from hashicorp/b-ui/search-namespaces
ui: Fix fuzzy search namespace-handling
2021-08-10 13:13:20 -04:00
Mike Wickett 7e914d8f46
chore: update alert banner (#11022) 2021-08-10 12:56:13 -04:00
Jai Bhagat a9b9132f35 edit hierarchy to lead with namespace before job 2021-08-10 10:35:36 -04:00
Luiz Aoqui d283e90c35
ui: only display "Dispatch Job" button on parameterized jobs (#11019) 2021-08-09 17:49:08 -04:00
Luiz Aoqui 5b50b385e8
make: embed the Nomad UI data by default (#11018) 2021-08-09 16:53:44 -04:00
Lir (Rookout) f720179ba0
Some Rookout docs tweaks (#10989) 2021-08-09 11:19:36 +02:00
Michael Schurter c39ca0773d
Merge pull request #10951 from hashicorp/b-cn-proxy
consul/connect: avoid warn messages on connect proxy errors
2021-08-06 15:25:40 -07:00
Michael Schurter a87f748d6d
Merge pull request #11010 from hashicorp/docs-10875
docs: add backward incompat note about #10875
2021-08-06 08:28:48 -07:00
Michael Schurter 6d14c181dd docs: add backward incompat note about #10875
Fixes #11002
2021-08-05 15:08:55 -07:00
James Rasell c2b975163f
Merge pull request #11006 from hashicorp/f-gh-10929-changelog
changelog: add entry for #10929
2021-08-05 17:32:19 +02:00
James Rasell a9a04141a3
consul/connect: avoid warn messages on connect proxy errors
When creating a TCP proxy bridge for Connect tasks, we are at the
mercy of either end for managing the connection state. For long
lived gRPC connections the proxy could reasonably expect to stay
open until the context was cancelled. For the HTTP connections used
by connect native tasks, we experience connection disconnects.
The proxy gets recreated as needed on follow up requests, however
we also emit a WARN log when the connection is broken. This PR
lowers the WARN to a TRACE, because these disconnects are to be
expected.

Ideally we would be able to proxy at the HTTP layer, however Consul
or the connect native task could be configured to expect mTLS, preventing
Nomad from acting as a man-in-the-middle (MITM) on the requests.

We also can't manage the proxy lifecycle more intelligently, because
we have no control over the HTTP client or server and how they wish
to manage connection state.

What we have now works, it's just noisy.

Fixes #10933
2021-08-05 11:27:35 +02:00
James Rasell c7449b4810
changelog: add entry for #10929 2021-08-05 10:48:36 +02:00
James Rasell 61436b437a
Merge pull request #10929 from AchilleAsh/fix-token-docker-auth-config
fix: load token in docker auth config
2021-08-05 10:44:39 +02:00
Grant Griffiths 0f6cc14e36 Remove install dependency on consul for simplicity
Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>
2021-08-05 01:09:53 -07:00
Luiz Aoqui 7341615fac
changelog: add entry for #10934 (#11001) 2021-08-04 11:33:18 -04:00
Grant Griffiths 3f01836643 Add Portworx CSI Driver example
Signed-off-by: Grant Griffiths <ggriffiths@purestorage.com>
2021-08-04 06:19:08 -07:00
James Rasell 13d2b26e27
Merge pull request #10996 from hashicorp/b-fix-doublespace-general-cli-opts
cli: fix minor format error within `-ca-cert` help text.
2021-08-04 09:21:19 +02:00
Luiz Aoqui a81e6a427d
ui: fix job dispatch page when job doesn't have any meta fields (#10934) 2021-08-03 13:50:43 -04:00
Mahmood Ali 28bc234e84 e2e: fix tests
Use basic sleeps in busybox images. busybox images are very light, while
ping has permission complications and may fail for network-related
issues.
2021-08-03 11:38:35 -04:00
Seth Hoenig 3371214431 core: implement system batch scheduler
This PR implements a new "System Batch" scheduler type. Jobs can
make use of this new scheduler by setting their type to 'sysbatch'.

Like the name implies, sysbatch can be thought of as a hybrid between
system and batch jobs - it is for running short lived jobs intended to
run on every compatible node in the cluster.

As with batch jobs, sysbatch jobs can also be periodic and/or parameterized
dispatch jobs. A sysbatch job is considered complete when it has been run
on all compatible nodes until reaching a terminal state (success or failed
on retries).

Feasibility and preemption are governed the same as with system jobs. In
this PR, the update stanza is not yet supported. The update stanza is still
limited in functionality for the underlying system scheduler, and is
not useful yet for sysbatch jobs. Further work in #4740 will improve
support for the update stanza and deployments.

Closes #2527
2021-08-03 10:30:47 -04:00
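
The completion rule stated in the sysbatch commit message (a job is complete once it has run to a terminal state on every compatible node) can be sketched roughly as follows. Everything here is hypothetical and simplified for illustration: the `allocStatus` values, `terminal`, and `sysbatchDone` are not Nomad's types or scheduler code.

```go
package main

import "fmt"

// allocStatus is a hypothetical, simplified allocation state.
type allocStatus string

const (
	statusRunning  allocStatus = "running"
	statusComplete allocStatus = "complete"
	statusFailed   allocStatus = "failed"
)

// terminal reports whether an allocation has finished, successfully or not.
func terminal(s allocStatus) bool {
	return s == statusComplete || s == statusFailed
}

// sysbatchDone reports whether every compatible node has an allocation
// that reached a terminal state. allocs maps node ID to allocation status.
func sysbatchDone(compatibleNodes []string, allocs map[string]allocStatus) bool {
	for _, node := range compatibleNodes {
		status, ok := allocs[node]
		if !ok || !terminal(status) {
			return false
		}
	}
	return true
}

func main() {
	nodes := []string{"node-a", "node-b"}

	inProgress := map[string]allocStatus{
		"node-a": statusComplete,
		"node-b": statusRunning,
	}
	fmt.Println(sysbatchDone(nodes, inProgress)) // false

	finished := map[string]allocStatus{
		"node-a": statusComplete,
		"node-b": statusFailed,
	}
	fmt.Println(sysbatchDone(nodes, finished)) // true
}
```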
James Rasell 78a489418d
cli: fix minor format error within -ca-cert help text. 2021-08-03 16:05:06 +02:00
みゃお 8d970d97d3
[doc]Add replication_token link with authoritative_region
replication_token always works together with authoritative_region, add a link for better doc.
2021-08-03 18:56:00 +08:00
Mahmood Ali 0bc12fba7c
Only initialize task.VolumeMounts when not-nil (#10990)
1.1.3 had a bug where task.VolumeMounts will be an empty slice instead of nil. Eventually, it gets canonicalized and is set to `nil`, but it seems to confuse dry-run planning.

The regression was introduced in https://github.com/hashicorp/nomad/pull/10855/files#diff-56b3c82fcbc857f8fb93a903f1610f6e6859b3610a4eddf92bad9ea27fdc85ecL1028-R1037 . Curiously, it's the only place where `len(apiTask.VolumeMounts)` check was dropped. I assume it was dropped accidentally.

Fixes #10981
2021-08-02 13:08:10 -04:00
Mike Wickett 85278ab3f7
website: update consent manager (#10977) 2021-08-02 12:56:20 -04:00
Derek Strickland 7210f855e8
Merge pull request #10976 from itorres/api-docs-allocation-restart-sample
API docs: Fix allocation restart example
2021-08-02 08:48:45 -04:00
James Rasell 15dc41d17f
Merge pull request #10987 from hashicorp/f-docs-order-external-drivers-alphabetically
docs: order external driver overview alphabetically.
2021-08-02 12:50:16 +02:00
James Rasell 167b6c50ff
docs: order external driver overview alphabetically. 2021-08-02 10:51:37 +02:00
Lir (Rookout) 216d0392a8
Rookout driver docs (#10950)
Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2021-08-02 10:09:45 +02:00
Mahmood Ali 8585aa8991
Merge pull request #10969 from hashicorp/merge-release-1.1.3
Merge release 1.1.3
2021-07-30 11:23:25 -04:00
Ignacio Torres Masdeu 3f784f17f7
Fix allocation restart API docs example 2021-07-30 16:45:21 +02:00
Mike Nomitch 6a8158fd5a
Adds documentation for file mode to sink docs (#10972) 2021-07-29 16:09:18 -04:00
Mahmood Ali b590d43b0d
website: publish 1.1.3 release (#10970) 2021-07-29 12:35:18 -04:00
Mahmood Ali 327ad78ea5 prepare for next dev cycle 2021-07-29 12:32:09 -04:00
Kent 'picat' Gruber b70ad3c190 Add configuration for /website using NPM 2021-07-29 11:03:26 -04:00