open-nomad

Author	SHA1	Message	Date
Mahmood Ali	84a3522133	Consider all system jobs for a new node (#11054) When a node becomes ready, create an eval for all system jobs across namespaces. The previous code uses `job.ID` to deduplicate evals, but that ignores the job namespace. Thus if there are multiple jobs in different namespaces sharing the same ID/Name, only one will be considered for running in the new node. Thus, Nomad may skip running some system jobs in that node.	2021-08-18 09:50:37 -04:00
Mahmood Ali	97966c7a71	e2e: Run system jobs on all datacenters (#11060) Target all e2e datacenters for system and sysbatch e2e tests. They require that the system jobs run on all linux clients. However, the jobs currently only target the `dc1` datacenter, but the nightly e2e cluster has 4 clients spread across the `dc1` and `dc2` datacenters, causing the tests to fail. I missed this problem in the e2e dev cluster because it only used a single dc1 datacenter.	2021-08-17 11:01:47 -04:00
Mahmood Ali	c37339a8c8	Merge pull request #9160 from hashicorp/f-sysbatch core: implement system batch scheduler	2021-08-16 09:30:24 -04:00
James Rasell	534368780b	Merge pull request #11051 from hashicorp/b-gh-11047 tlsutil: update testing certificates close to expiry.	2021-08-16 09:42:01 +02:00
James Rasell	54d6785bcc	tlsutil: update testing certificates close to expiry.	2021-08-13 11:09:40 +02:00
Mahmood Ali	5ae9df80bf	docs: Consul Connect tweaks (#11040) Tweaks to the commands in Consul Connect page. For multi-command scripts, having the leading `$` is a bit annoying, as it makes copying the text harder. Also, the `copy` button would only copy the first command and ignore the rest. Also, the `echo 1 > ...` commands are required to run as root, unlike the rest! I made them use `| sudo tee` pattern to ease copy & paste as well. Lastly, update the CNI plugin links to 1.0.0. It's fresh off the oven - just got released less than an hour ago: https://github.com/containernetworking/plugins/releases/tag/v1.0.0 .	2021-08-11 17:14:26 -04:00
Mahmood Ali	47d2dcd60a	Merge pull request #11034 from tgross/docs-cni-install docs: note CNI requirement for bridge networking	2021-08-11 11:07:25 -04:00
Tim Gross	de957b48ff	docs: note CNI requirement for bridge networking Using `bridge` networking requires that you have CNI plugins installed on the client, but this isn't in the jobspec `network` docs which are the first place someone will look when trying to configure task networking.	2021-08-11 10:18:35 -04:00
Michael Schurter	a7aae6fa0c	Merge pull request #10848 from ggriffiths/listsnapshot_secrets CSI Listsnapshot secrets support	2021-08-10 15:59:33 -07:00
Mahmood Ali	ea003188fa	system: re-evaluate node on feasibility changes (#11007) Fix a bug where system jobs may fail to be placed on a node that initially was not eligible for system job placement. This change causes the reschedule to re-evaluate the node if any attribute used in feasibility checks changes. Fixes https://github.com/hashicorp/nomad/issues/8448	2021-08-10 17:17:44 -04:00
Mahmood Ali	bfc766357e	deployments: canary=0 is implicitly autopromote (#11013) In a multi-task-group job, treat 0 canary groups as auto-promote. This change fixes an edge case where Nomad requires a manual promotion, if the job had any group with canary=0 and rest of groups having auto_promote set. Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2021-08-10 17:06:40 -04:00
Mahmood Ali	efcc8bf082	Speed up client startup and registration (#11005) Speed up client startup, by retrying more until the servers are known. Currently, if client fingerprinting is fast and finishes before the client connects to a server, node registration may be delayed by 15 seconds or so! Ideally, we'd wait until the client discovers the servers and then retry immediately, but that requires significant code changes. Here, we simply retry the node registration request every second. That's basically the equivalent of checking whether the client discovered servers every second. Should be a cheap operation. When testing this change on my local computer, where both servers and clients are co-located, the time from startup till node registration dropped from 34 seconds to 8 seconds!	2021-08-10 17:06:18 -04:00
Luiz Aoqui	c1d1906628	ui: add missing pipe separator in parameterized and periodic jobs (#11020)	2021-08-10 13:48:20 -04:00
Michael Schurter	ec08bd6ac7	Merge pull request #10995 from miao1007/patch-1 docs: Add replication_token link with authoritative_region	2021-08-10 10:48:02 -07:00
Jai	29a7fe6efa	Merge pull request #10666 from hashicorp/b-ui/search-namespaces ui: Fix fuzzy search namespace-handling	2021-08-10 13:13:20 -04:00
Mike Wickett	7e914d8f46	chore: update alert banner (#11022)	2021-08-10 12:56:13 -04:00
Jai Bhagat	a9b9132f35	edit hierarchy to lead with namespace before job	2021-08-10 10:35:36 -04:00
Luiz Aoqui	d283e90c35	ui: only display "Dispatch Job" button on parameterized jobs (#11019)	2021-08-09 17:49:08 -04:00
Luiz Aoqui	5b50b385e8	make: embed the Nomad UI data by default (#11018)	2021-08-09 16:53:44 -04:00
Lir (Rookout)	f720179ba0	Some Rookout docs tweaks (#10989)	2021-08-09 11:19:36 +02:00
Michael Schurter	c39ca0773d	Merge pull request #10951 from hashicorp/b-cn-proxy consul/connect: avoid warn messages on connect proxy errors	2021-08-06 15:25:40 -07:00
Michael Schurter	a87f748d6d	Merge pull request #11010 from hashicorp/docs-10875 docs: add backward incompat note about #10875	2021-08-06 08:28:48 -07:00
Michael Schurter	6d14c181dd	docs: add backward incompat note about #10875 Fixes #11002	2021-08-05 15:08:55 -07:00
James Rasell	c2b975163f	Merge pull request #11006 from hashicorp/f-gh-10929-changelog changelog: add entry for #10929	2021-08-05 17:32:19 +02:00
James Rasell	a9a04141a3	consul/connect: avoid warn messages on connect proxy errors When creating a TCP proxy bridge for Connect tasks, we are at the mercy of either end for managing the connection state. For long lived gRPC connections the proxy could reasonably expect to stay open until the context was cancelled. For the HTTP connections used by connect native tasks, we experience connection disconnects. The proxy gets recreated as needed on follow up requests, however we also emit a WARN log when the connection is broken. This PR lowers the WARN to a TRACE, because these disconnects are to be expected. Ideally we would be able to proxy at the HTTP layer, however Consul or the connect native task could be configured to expect mTLS, preventing Nomad from MiTM the requests. We also can't manage the proxy lifecycle more intelligently, because we have no control over the HTTP client or server and how they wish to manage connection state. What we have now works, it's just noisy. Fixes #10933	2021-08-05 11:27:35 +02:00
James Rasell	c7449b4810	changelog: add entry for #10929	2021-08-05 10:48:36 +02:00
James Rasell	61436b437a	Merge pull request #10929 from AchilleAsh/fix-token-docker-auth-config fix: load token in docker auth config	2021-08-05 10:44:39 +02:00
Luiz Aoqui	7341615fac	changelog: add entry for #10934 (#11001)	2021-08-04 11:33:18 -04:00
James Rasell	13d2b26e27	Merge pull request #10996 from hashicorp/b-fix-doublespace-general-cli-opts cli: fix minor format error within `-ca-cert` help text.	2021-08-04 09:21:19 +02:00
Luiz Aoqui	a81e6a427d	ui: fix job dispatch page when job doesn't have any meta fields (#10934)	2021-08-03 13:50:43 -04:00
Mahmood Ali	28bc234e84	e2e: fix tests Use basic sleeps in busybox images. busybox images are very light, and ping has permissions complications and may fail for network-related issues.	2021-08-03 11:38:35 -04:00
Seth Hoenig	3371214431	core: implement system batch scheduler This PR implements a new "System Batch" scheduler type. Jobs can make use of this new scheduler by setting their type to 'sysbatch'. Like the name implies, sysbatch can be thought of as a hybrid between system and batch jobs - it is for running short lived jobs intended to run on every compatible node in the cluster. As with batch jobs, sysbatch jobs can also be periodic and/or parameterized dispatch jobs. A sysbatch job is considered complete when it has been run on all compatible nodes until reaching a terminal state (success or failed on retries). Feasibility and preemption are governed the same as with system jobs. In this PR, the update stanza is not yet supported. The update stanza is still limited in functionality for the underlying system scheduler, and is not useful yet for sysbatch jobs. Further work in #4740 will improve support for the update stanza and deployments. Closes #2527	2021-08-03 10:30:47 -04:00
James Rasell	78a489418d	cli: fix minor format error within `-ca-cert` help text.	2021-08-03 16:05:06 +02:00
みゃお	8d970d97d3	[doc]Add replication_token link with authoritative_region replication_token always works together with authoritative_region, add a link for better doc.	2021-08-03 18:56:00 +08:00
Mahmood Ali	0bc12fba7c	Only initialize task.VolumeMounts when not-nil (#10990) 1.1.3 had a bug where task.VolumeMounts will be an empty slice instead of nil. Eventually, it gets canonicalized and is set to `nil`, but it seems to confuse dry-run planning. The regression was introduced in https://github.com/hashicorp/nomad/pull/10855/files#diff-56b3c82fcbc857f8fb93a903f1610f6e6859b3610a4eddf92bad9ea27fdc85ecL1028-R1037 . Curiously, it's the only place where `len(apiTask.VolumeMounts)` check was dropped. I assume it was dropped accidentally. Fixes #10981	2021-08-02 13:08:10 -04:00
Mike Wickett	85278ab3f7	website: update consent manager (#10977)	2021-08-02 12:56:20 -04:00
Derek Strickland	7210f855e8	Merge pull request #10976 from itorres/api-docs-allocation-restart-sample API docs: Fix allocation restart example	2021-08-02 08:48:45 -04:00
James Rasell	15dc41d17f	Merge pull request #10987 from hashicorp/f-docs-order-external-drivers-alphabetically docs: order external driver overview alphabetically.	2021-08-02 12:50:16 +02:00
James Rasell	167b6c50ff	docs: order external driver overview alphabetically.	2021-08-02 10:51:37 +02:00
Lir (Rookout)	216d0392a8	Rookout driver docs (#10950) Co-authored-by: James Rasell <jrasell@users.noreply.github.com>	2021-08-02 10:09:45 +02:00
Mahmood Ali	8585aa8991	Merge pull request #10969 from hashicorp/merge-release-1.1.3 Merge release 1.1.3	2021-07-30 11:23:25 -04:00
Ignacio Torres Masdeu	3f784f17f7	Fix allocation restart API docs example	2021-07-30 16:45:21 +02:00
Mike Nomitch	6a8158fd5a	Adds documentation for file mode to sink docs (#10972)	2021-07-29 16:09:18 -04:00
Mahmood Ali	b590d43b0d	website: publish 1.1.3 release (#10970)	2021-07-29 12:35:18 -04:00
Mahmood Ali	327ad78ea5	prepare for next dev cycle	2021-07-29 12:32:09 -04:00
Mahmood Ali	70f541287b	e2e: wait for allocs and deployments (#10967) As we moved to using `-detach` for registering jobs, we should wait until allocs and deployments are created before asserting their properties. Fixing `TestNodeDrainIgnoreSystem` and `TestRescheduleProgressDeadlineFail` tests as they seem particularly flaky, failing 9 and 7 times (respectively) in the last two weeks.	2021-07-29 10:52:04 -04:00
Nomad Release Bot	423799b2dd	remove generated files	2021-07-29 04:17:44 +00:00
Nomad Release Bot	6b52b26517	Release v1.1.3	2021-07-29 04:17:00 +00:00
Nomad Release bot	b5dff8be42	Generate files for 1.1.3 release	2021-07-29 03:43:03 +00:00
Mahmood Ali	12d4a0b759	prepare release docker image for fetching	2021-07-28 23:33:59 -04:00

21689 commits