open-nomad

Author	SHA1	Message	Date
Mahmood Ali	5b42796f1e	Merge pull request #7704 from hashicorp/b-agent-shutdown-order agent: shutdown agent http server last	2020-04-20 10:37:26 -04:00
Mahmood Ali	3e741a0caa	Merge pull request #7748 from hashicorp/b-noisy-http-logs agent: route http logs through hclog	2020-04-20 10:37:15 -04:00
Mahmood Ali	1c0e1cabc9	update changelog [ci skip]	2020-04-20 10:36:39 -04:00
Mahmood Ali	4e1366f285	agent: route http logs through hclog Pipe http server log to hclog, so that it uses the same logging format as rest of nomad logs. Also, supports emitting them as json logs, when json formatting is set. The http server logs are emitted as Trace level, as they are typically repsent HTTP client errors (e.g. failed tls handshakes, invalid headers, etc). Though, Panic logs represent server errors and are relayed as Error level.	2020-04-20 10:33:40 -04:00
Mahmood Ali	86aa8105b2	Merge pull request #7749 from hashicorp/b-docker-panic driver/docker: protect against nil container	2020-04-20 10:31:46 -04:00
Mahmood Ali	6bfef2c945	add changelog [ci skip]	2020-04-20 10:31:09 -04:00
Jeffrey 'jf' Lim	35418efb60	demo/vagrant/Vagrantfile: Update Nomad version (0.11.0) (#7579 )	2020-04-20 09:29:12 -04:00
Anthony Scalisi	9664c6b270	fix spelling errors (#6985 )	2020-04-20 09:28:19 -04:00
Charles Z	e4a669598e	label csi as beta from 0.11 release notes (#7745 )	2020-04-20 08:48:04 -04:00
Mahmood Ali	dff071c3b9	driver/docker: protect against nil container Protect against a panic when we attempt to start a container with a name that conflicts with an existing one. If the existing one is being deleted while nomad first attempts to create the container, the createContainer will fail with `container already exists`, but we get nil container reference from the `containerByName` lookup, and cause a crash. I'm not certain how we get into the state, except for being very unlucky. I suspect that this case may be the result of a concurrent restart or the docker engine API not being fully consistent (e.g. an earlier call purged the container, but docker didn't free up resources yet to create a new container with the same name immediately yet). If that's the case, then re-attempting creation will hopefully succeed, or we'd at least fail enough times for the alloc to be rescheduled to another node.	2020-04-19 15:34:45 -04:00
Jeffrey 'jf' Lim	eab600d3e1	Fix/improve "job plan" messaging (#7580 )	2020-04-17 15:53:16 -04:00
Yishan Lin	164314f7fa	Merge pull request #7741 from hashicorp/yishan/docs-rebased-preemption-update docs: update preemption page	2020-04-17 11:03:27 -07:00
Yishan Lin	b95309dc4b	docs: update preemption page This page has not been updated (yet) to reflect that support for all 3 job types (service, batch, system) which shipped in 0.9.2. The current page implies that preemption is only available for system jobs. This is early preparation for Nomad 0.12, where we plan to move Preemption from Enterprise feature suite to OSS for all.	2020-04-17 09:34:07 -07:00
Brandon Romano	3b22f5aa72	Merge pull request #7717 from hashicorp/website-alert website: Adjust the website alert to point to the blog post	2020-04-14 11:36:43 -07:00
Brandon Romano	f520757617	Adjust the website alert to point to the blog post	2020-04-14 11:17:06 -07:00
Michael Schurter	165ddda744	Merge pull request #7682 from hashicorp/b-comment-fix core: fix comment on system stack	2020-04-13 15:13:23 -07:00
Chris Baker	a37446acaa	documents the scaling block in the JSON Job docs (#7706 ) * documents the scaling block in the JSON Job docs resolves #7656 * add task-specific restart to JSON Job docs companion to #7603 * [docs] improved and corrected scaling docs * Update website/pages/api-docs/json-jobs.mdx Co-Authored-By: Michael Schurter <mschurter@hashicorp.com> Co-authored-by: Michael Schurter <mschurter@hashicorp.com>	2020-04-13 16:33:49 -05:00
Chris Baker	a6e0d17433	update `restart` documentation (#7603 ) * update `restart` documentation #7288 added support for task-specific `restart` policy. this PR updates the docs to reflect that. * added an explicit example of task-specific restart policy * Update website/pages/docs/job-specification/restart.mdx	2020-04-13 16:29:43 -05:00
Drew Bailey	f3b168e369	Merge pull request #7663 from hashicorp/b-taskrunner-shutdown_delay Run task shutdown_delay regardless of service registration	2020-04-13 13:27:24 -04:00
Drew Bailey	da11c31e4c	Update CHANGELOG.md	2020-04-13 12:41:13 -04:00
Mahmood Ali	b78680eee7	agent: shutdown agent http server last Shutdown http server last, after nomad client/server components terminate. Before this change, if the agent is taking an unexpectedly long time to shutdown, the operator cannot query the http server directly: they cannot access agent specific http endpoints and need to query another agent about the troublesome agent. Unexpectedly long shutdown can happen in normal cases, e.g. a client might hung is if one of the allocs it is running has a long shutdown_delay. Here, we switch to ensuring that the http server is shutdown last. I believe this doesn't require extra care in agent shutting down logic while operators may be able to submit write http requests. We already need to cope with operators submiting these http requests to another agent or by servers updating the client allocations.	2020-04-13 10:50:07 -04:00
Tim Gross	4e9bd1e1d1	refactor: consolidate private methods for CSI RPC (#7702 ) Follow-up for a method missed in the refactor for #7688. The `volAndPluginLookup` method is only ever called from the server's `CSI` RPC and never the `ClientCSI` RPC, so move it into that scope.	2020-04-13 10:46:43 -04:00
Tim Gross	ab3086a1f4	e2e: testing reliability (#7701 ) * pin CSI plugin versions * ensure failing CSI tests clean up * allow NOMAD_SHA env var to override makefile	2020-04-13 10:25:24 -04:00
Mahmood Ali	e6551455b9	Merge pull request #7693 from greut/bump-testify api: testify v1.5.1	2020-04-11 09:09:44 -04:00
Yoan Blanc	790df29996	api: testify v1.5.1 Signed-off-by: Yoan Blanc <yoan@dosimple.ch>	2020-04-11 13:55:10 +02:00
Tim Gross	f37e986b1b	refactor: make nodeForControllerPlugin private to ClientCSI (#7688 ) The current design of `ClientCSI` RPC requires that callers in the server know about the free-standing `nodeForControllerPlugin` function. This makes it difficult to send `ClientCSI` RPC messages from subpackages of `nomad` and adds a bunch of boilerplate to every server-side caller of a controller RPC. This changeset makes it so that the `ClientCSI` RPCs will populate and validate the controller's client node ID if it hasn't been passed by the caller, centralizing the logic of picking and validating controller targets into the `nomad.ClientCSI` struct.	2020-04-10 16:47:21 -04:00
Seth Hoenig	43804d8ca9	Merge pull request #7684 from hashicorp/b-connect-sidecar-name connect: enable configuring sidecar_task.name	2020-04-10 10:04:25 -06:00
Seth Hoenig	7ff7d2a288	Merge pull request #7683 from hashicorp/b-no-sidecar-panic connect: correctly handle missing sidecar_service task stanza	2020-04-10 09:49:59 -06:00
Seth Hoenig	0407eaaf88	connect: extract common task keys	2020-04-10 09:49:19 -06:00
Drew Bailey	591aea1edd	changelog	2020-04-10 11:14:39 -04:00
Drew Bailey	8bfee62b70	Run task shutdown_delay regardless of service registration task shutdown_delay will currently only run if there are registered services for the task. This implementation detail isn't explicity stated anywhere and is defined outside of the service stanza. This change moves shutdown_delay to be evaluated after prekill hooks are run, outside of any task runner hooks. just use time.sleep	2020-04-10 11:06:26 -04:00
Seth Hoenig	db865e05d8	connect: enable configuring sidecar_task.name Before, the submitted jobspec for sidecar_task would pass through 2 key validation steps - once for the subset specific to connect sidecar task definitions, and once again for the set of normal task definition where the task would actually get unmarshalled. The valid keys for the normal task definition did not include "name", which is supposed to be configurable for the sidecar task. To fix this, just eliminate the double validation step, and instead pass-in the correct set of keys to validate against to the one generic task parser. Fixes #7680	2020-04-09 21:01:16 -06:00
Seth Hoenig	20802da8fd	connect: correctly deal with nil sidecar_service task stanza Before, if the sidecar_service stanza of a connect enabled service was missing, the job submission would cause a panic in the nomad agent. Since the panic was happening in the API handler the agent itself continued running, but this change will the condition more gracefully. By fixing the `Copy` method, the API handler now returns the proper error. $ nomad job run foo.nomad Error submitting job: Unexpected response code: 500 (1 error occurred: * Task group api validation failed: 2 errors occurred: * Missing tasks for task group * Task group service validation failed: 1 error occurred: * Service[0] count-api validation failed: 1 error occurred: * Consul Connect must be native or use a sidecar service	2020-04-09 20:28:17 -06:00
Michael Schurter	4b475db408	core: fix comment on system stack This makes me do a double take every time I run into it, so what if we just changed it?	2020-04-09 15:19:11 -07:00
Mahmood Ali	1640f58776	Merge pull request #7676 from hashicorp/vendor-golang-org-x-20200409 Upgrade all golang.org/x packages	2020-04-09 17:18:57 -04:00
Seth Hoenig	58d844f591	Merge pull request #7678 from hashicorp/docs-connect-config-link-404 docs: fix link to envoy proxy documentation on consul site	2020-04-09 13:56:58 -06:00
Seth Hoenig	6cfecc6d03	docs: fix link to envoy proxy documentation on consul site	2020-04-09 13:46:59 -06:00
Mahmood Ali	735a478cc2	Upgrade all golang.org/x packages Upgrade all golang.org/x packages to pick up fixes and improvements. Some packages date back to 2018 and so much improvement happened since then!	2020-04-09 15:23:25 -04:00
Michael Schurter	4bd6372284	Merge pull request #7675 from hashicorp/release-post-0110 Prepare for 0.11.1 release	2020-04-09 12:20:39 -07:00
Michael Schurter	084c6bb94b	docs: add #7673 to changelog	2020-04-09 12:18:34 -07:00
Mahmood Ali	c64a79f6d1	Merge pull request #7673 from hashicorp/b-http2-cached-connections vendor: upgrade golang.org/x/net packages	2020-04-09 15:14:40 -04:00
Michael Schurter	c763cb3b37	docs: prep changelog for 0.11.1	2020-04-09 12:11:54 -07:00
Michael Schurter	f93ee566cf	release: bump version to 0.11.1 for development	2020-04-09 12:11:13 -07:00
James Rasell	20fd17c166	Merge pull request #7666 from hashicorp/add-0.11.0-changelog-release-date changelog: add 0.11.0 release date.	2020-04-09 21:10:15 +02:00
Mahmood Ali	63d15d7e5c	vendor: upgrade golang.org/net/... golang.org/net packages are ancient - upgrading them to pick up important fixes, e.g. https://go-review.googlesource.com/c/go/+/87298/	2020-04-09 14:57:39 -04:00
Mahmood Ali	1271c4ce96	Merge pull request #7672 from hashicorp/dev-e2e-tweaks-20200409 e2e: add a convenient creation script	2020-04-09 11:30:48 -04:00
Mahmood Ali	c8eddb9f6b	fixup! e2e: add a convenient creation script	2020-04-09 11:04:26 -04:00
Mahmood Ali	8a4937d9ce	e2e: add a convenient creation script Add a convenience Makefile for creating e2e environment for manual debugging.	2020-04-09 10:54:30 -04:00
James Rasell	cb2719b7e6	changelog: add 0.11.0 release date.	2020-04-09 10:35:17 +02:00
Mahmood Ali	030e40ac5c	Merge pull request #7652 from hashicorp/v-gomod-msgpaack dev: Use go mod for managing hashicorp/go-msgpack	2020-04-08 14:42:39 -04:00

1 2 3 4 5 ...

17908 commits