open-nomad

Author	SHA1	Message	Date
Danielle Lancashire	5a87b3ab4b	env_aws: Disable Retries and set Session cfg Previously, Nomad used hand rolled HTTP requests to interact with the EC2 metadata API. Recently however, we switched to using the AWS SDK for this fingerprinting. The default behaviour of the AWS SDK is to perform retries with exponential backoff when a request fails. This is problematic for Nomad, because interacting with the EC2 API is in our client start path. Here we revert to our pre-existing behaviour of not performing retries in the fast path, as if the metadata service is unavailable, it's likely that nomad is not running in AWS.	2019-12-16 10:56:32 +01:00
Michael Schurter	9b08968c2e	Merge pull request #6855 from hashicorp/b-interp-connect-task connect: canonicalize before adding sidecar	2019-12-13 09:26:44 -08:00
Mahmood Ali	4a1cc67f58	Merge pull request #6820 from hashicorp/f-skip-docker-logging-knob driver: allow disabling log collection	2019-12-13 11:41:20 -05:00
Mahmood Ali	a7361612b6	Merge pull request #6556 from hashicorp/c-vendor-multierror-20191025 Update go-multierror library	2019-12-13 11:32:42 -05:00
Mahmood Ali	46bc3b57e6	address review comments	2019-12-13 11:21:00 -05:00
Mahmood Ali	b3a1e571e5	tests: fix error format assertion multierror library changed formatting slightly.	2019-12-13 11:01:20 -05:00
Buck Doyle	d1b689c0f4	Update changelog with #6817	2019-12-13 09:16:31 -06:00
Mahmood Ali	ea30ab9c56	Update go-multierror to 72917a1 To pick up https://github.com/hashicorp/go-multierror/pull/28	2019-12-13 10:13:31 -05:00
Buck Doyle	6ff6a41df8	Fix flapping status light test (#6852) I unintentionally introduced a flapping test in #6817. The draining status of the node will be randomly chosen and that flag takes precedence over eligibility. This forces the draining flag to be false rather than random so the test should no longer flap. See here for an example failure: https://circleci.com/gh/hashicorp/nomad/26368	2019-12-13 09:02:02 -06:00
Preetha Appan	5f7cfa5050	update changelog	2019-12-13 08:22:14 -06:00
Mahmood Ali	5a81cd3309	Merge pull request #6839 from hashicorp/b-cgroup-cleanup executor: stop joining executor to container cgroup	2019-12-13 09:05:09 -05:00
Michael Schurter	5a25766714	docs: add #6855 to changelog Also make Connect related fixes more consistent in the changelog. I suspect users won't care if a Connect related fix is in the server's admission controller or in the client's groupservice hook or somewhere else, so I think grouping them by `consul/connect:` makes the most sense.	2019-12-12 20:58:49 -08:00
Michael Schurter	95fd2643d7	connect: canonicalize before adding sidecar Fixes #6853 Canonicalize jobs first before adding any sidecars. This fixes a bug where sidecar tasks were added without interpolated names and broke validation. Sidecar tasks must be canonicalized independently. Also adds a group network to the mock connect job because it wasn't a valid connect job before!	2019-12-12 20:55:56 -08:00
Mahmood Ali	570d81966f	Merge pull request #6854 from hashicorp/update-changelog Add notarization details to changelog	2019-12-12 20:56:53 -05:00
Michele	161780ba00	Add clarifying update	2019-12-12 15:28:47 -08:00
Michele	49d0ef63aa	Add apple notarization note	2019-12-12 15:24:18 -08:00
Preetha	2692949556	Merge pull request #6849 from hashicorp/b-debug-preemption Use debug logging for scheduler internals	2019-12-12 16:15:46 -06:00
Preetha Appan	afff27b69b	More error->debug for logging in the bin packing iterator	2019-12-12 15:50:16 -06:00
ebarriosjr	b953239227	driver/pot: Added extra_hosts and args commands (#6577)	2019-12-12 16:29:45 -05:00
Buck Doyle	09067b4eb7	UI: Fix client sorting (#6817) There are two changes here, and some caveats/commentary: 1. The “State” table column was actually sorting only by status. The state was not an actual property, just something calculated in each client row, as a product of status, isEligible, and isDraining. This PR adds isDraining as a component of compositeState so it can be used for sorting. 2. The Sortable mixin declares dependent keys that cause the sort to be live-updating, but only if the members of the array change, such as if a new client is added, but not if any of the sortable properties change. This PR adds a SortableFactory function that generates a mixin whose listSorted computed property includes dependent keys for the sortable properties, so the table will live-update if any of the sortable properties change, not just the array members. There’s a warning if you use SortableFactory without dependent keys and via the original Sortable interface, so we can eventually migrate away from it.	2019-12-12 13:06:54 -06:00
Michael Lange	ed8fd28a10	Merge pull request #6808 from hashicorp/b-ui/unclosed-log-streams UI: Unclosed log streams	2019-12-12 10:55:49 -08:00
Preetha Appan	3458b41290	Use debug logging for scheduler internals We currently log an error if preemption is unable to find a suitable set of allocations to preempt. This commit changes that to debug level since not finding preemptable allocations is not an error condition.	2019-12-12 12:05:29 -06:00
Tim Gross	e439e927ed	e2e: run client/allocs metrics tests nightly (#6842) Refactor the metrics end-to-end tests so they can be run with our e2e test framework. Runs fabio/prometheus and a collection of jobs that will cause metrics to be measured. We then query Prometheus to ensure we're publishing those allocation metrics and some metrics from the clients as well. Includes adding a placeholder for running the same tests on Windows.	2019-12-12 12:45:16 -05:00
Mahmood Ali	d80ae6765b	simplify cgroup path lookup	2019-12-11 12:43:25 -05:00
Seth Hoenig	2600c95af7	Merge pull request #6838 from hashicorp/f-parallelize-state-store-tests tests: parallelize state store tests	2019-12-11 11:05:52 -06:00
Mahmood Ali	94ab62dfb4	executor: stop joining executor to container cgroup Stop joining libcontainer executor process into the newly created task container cgroup, to ensure that the cgroups are fully destroyed on shutdown, and to make it consistent with other plugin processes. Previously, the executor process was added to the container cgroup so the executor process resources get aggregated along with user processes in our metric aggregation. However, adding the executor process to the container cgroup adds some complications without much benefit: First, it complicates cleanup. We must ensure that the executor is removed from the container cgroup on shutdown. However, we had a bug where we missed removing it from the systemd cgroup, because the executor uses `containerState.CgroupPaths` on launch, which includes systemd, but `cgroups.GetAllSubsystems`, which doesn't. Second, it may have adverse side-effects. When a user process is cpu bound or uses too much memory, the executor should remain functioning without risk of being killed (by OOM killer) or throttled. Third, it is inconsistent with other drivers and plugins. Logmon and DockerLogger processes aren't in the task cgroups. Neither are containerd processes, though it is equivalent to executor in responsibility. Fourth, in my experience when the executor process moves cgroup while it's running, the cgroup aggregation is odd. The cgroup `memory.usage_in_bytes` doesn't seem to capture the full memory usage of the executor process and becomes a red herring when investigating memory issues. For all the reasons above, I opted to have executor remain in nomad agent cgroup and we can revisit this when we have a better story for plugin process cgroup management.	2019-12-11 11:28:09 -05:00
Mahmood Ali	739e5e8811	drivers/exec: test all cgroups are destroyed	2019-12-11 11:12:29 -05:00
Seth Hoenig	d45dec1ca8	tests: parallelize state store tests It has been decided we're going to live in a many core world. Let's take advantage of that and parallelize these state store tests which all run in memory and are largely CPU bound. An unscientific benchmark demonstrating the improvement: [mp state (master)] $ go test PASS ok github.com/hashicorp/nomad/nomad/state 5.162s [mp state (f-parallelize-state-store-tests)] $ go test PASS ok github.com/hashicorp/nomad/nomad/state 1.527s	2019-12-11 09:36:37 -06:00
Tim Gross	b25713a837	doc: spread is inherited from job to group (#6837)	2019-12-11 09:59:26 -05:00
Drew Bailey	a8bb422500	Merge pull request #6834 from hashicorp/monitor-changelog add 6828 to changelog	2019-12-11 08:17:12 -05:00
Michael Schurter	cdfaa3ca8a	Merge pull request #6833 from hashicorp/sentinel-imports-note Make note of Sentinel standard imports	2019-12-10 13:56:01 -08:00
Chris Arcand	deb84a41f6	Make note of Sentinel standard imports > Sentinel-embedded applications can choose to whitelist or blacklist certain standard imports. Please reference the documentation for the Sentinel-enabled application you're using to determine if all standard imports are available.	2019-12-10 14:44:51 -06:00
Drew Bailey	fc6f49dba9	add 6828 to changelog	2019-12-10 15:02:34 -05:00
Tim Gross	5289c1e0aa	doc: explain ALLOC_INDEX uniqueness guarantees (#6830) The `ALLOC_INDEX` isn't guaranteed to be unique, and this has caused some user confusion. The servers make a best-effort attempt to make this value unique from 0 to count-1 but when you have canaries on the task group, there are reused indexes because you have multiple job versions running at the same time. If a user needs a unique number for interpolating a value in their application, they can get this by combining the job version and the alloc index. Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>	2019-12-10 10:30:26 -05:00
Danielle	246a4e898b	Merge pull request #6828 from hashicorp/b/nomad-monitor-panic command: error when no node is found for `monitor`	2019-12-10 14:29:32 +01:00
Danielle Lancashire	cd764ab0e9	command: error when no node is found for `monitor` Currently `nomad monitor -node-id` will panic when a node-id does not match any nodes, as there is no empty result bounds checking. Here we return an error to the user when no nodes are found.	2019-12-10 13:10:47 +01:00
Chris Dickson	4d8ba272d1	client: expose allocated CPU per task (#6784)	2019-12-09 15:40:22 -05:00
Seth Hoenig	2508892973	Merge pull request #6800 from hashicorp/b-update-freeport tests: swap lib/freeport for tweaked helper/freeport	2019-12-09 09:50:26 -06:00
Tim Gross	74a01477fd	Merge pull request #6631 from hashicorp/dependabot/npm_and_yarn/ui/lodash.mergewith-4.6.2 Bump lodash.mergewith from 4.6.1 to 4.6.2 in /ui	2019-12-09 09:47:14 -05:00
Tim Gross	06e30473c0	Merge pull request #6629 from hashicorp/dependabot/npm_and_yarn/ui/lodash.defaultsdeep-4.6.1 Bump lodash.defaultsdeep from 4.6.0 to 4.6.1 in /ui	2019-12-09 09:47:05 -05:00
Seth Hoenig	f0c3dca49c	tests: swap lib/freeport for tweaked helper/freeport Copy the updated version of freeport (sdk/freeport), and tweak it for use in Nomad tests. This means staying below port 10000 to avoid conflicts with the lib/freeport that is still transitively used by the old version of consul that we vendor. Also provide implementations to find ephemeral ports of macOS and Windows environments. Ports acquired through freeport are supposed to be returned to freeport, which this change now also introduces. Many tests are modified to include calls to a cleanup function for Server objects. This should help quite a bit with some flakey tests, but not all of them. Our port problems will not go away completely until we upgrade our vendor version of consul. With Go modules, we'll probably do a 'replace' to swap out other copies of freeport with the one now in 'nomad/helper/freeport'.	2019-12-09 08:37:32 -06:00
dependabot[bot]	28ff457228	Bump lodash.mergewith from 4.6.1 to 4.6.2 in /ui Bumps [lodash.mergewith](https://github.com/lodash/lodash) from 4.6.1 to 4.6.2. - [Release notes](https://github.com/lodash/lodash/releases) - [Commits](https://github.com/lodash/lodash/commits) Signed-off-by: dependabot[bot] <support@github.com>	2019-12-09 13:37:45 +00:00
Tim Gross	7ab879f347	Merge pull request #6628 from hashicorp/dependabot/npm_and_yarn/ui/handlebars-4.1.2 Bump handlebars from 4.1.1 to 4.1.2 in /ui	2019-12-09 08:37:10 -05:00
Tim Gross	48608acd37	Bump fstream from 1.0.11 to 1.0.12 in /ui (#6630) Bumps [fstream](https://github.com/npm/fstream) from 1.0.11 to 1.0.12. - [Release notes](https://github.com/npm/fstream/releases) - [Commits](https://github.com/npm/fstream/compare/v1.0.11...v1.0.12) Signed-off-by: dependabot[bot] <support@github.com>	2019-12-09 08:36:57 -05:00
dependabot[bot]	0d89b2f5ea	Bump lodash.merge from 4.6.1 to 4.6.2 in /ui (#6632) Bumps [lodash.merge](https://github.com/lodash/lodash) from 4.6.1 to 4.6.2. - [Release notes](https://github.com/lodash/lodash/releases) - [Commits](https://github.com/lodash/lodash/commits) Signed-off-by: dependabot[bot] <support@github.com>	2019-12-09 08:36:44 -05:00
Mahmood Ali	0b7085ba3a	driver: allow disabling log collection Operators commonly have docker logs aggregated using various tools and don't need nomad to manage their docker logs. Worse, Nomad uses a somewhat heavy docker api call to collect them and it seems to cause problems when a client runs hundreds of log collections. Here we add a knob to disable log aggregation completely for nomad. When log collection is disabled, we avoid running logmon and docker_logger for the docker tasks in this implementation. The downside here is once disabled, `nomad logs ...` commands and API no longer return logs and operators must correlate alloc-ids with their aggregated log info. This is meant as a stop gap measure. Ideally, we'd follow up with at least two changes: First, we should optimize behavior when we can such that operators don't need to disable docker log collection. Potentially by reverting to using pre-0.9 syslog aggregation in linux environments, though with different trade-offs. Second, when/if logs are disabled, nomad logs endpoints should look up docker logs api on demand. This ensures that the cost of log collection is paid sparingly.	2019-12-08 14:15:03 -05:00
Mahmood Ali	ded2a725db	Merge pull request #6788 from hashicorp/b-timeout-logmon-stop logmon: add timeout to RPC operations	2019-12-06 19:12:06 -05:00
abhip	c81b3f5cbd	Update consensus.html.md (#6813) The URL for the Raft algorithm PDF is no longer valid. The correct URL is https://raft.github.io/raft.pdf and the website is https://raft.github.io/	2019-12-06 06:17:30 -08:00
Seth Hoenig	c83d479bf8	Merge pull request #6814 from hashicorp/f-use-golangci-lint swap gometalint for golangci-lint	2019-12-06 08:16:17 -06:00
dependabot[bot]	12e34aef1d	Bump lodash.defaultsdeep from 4.6.0 to 4.6.1 in /ui Bumps [lodash.defaultsdeep](https://github.com/lodash/lodash) from 4.6.0 to 4.6.1. - [Release notes](https://github.com/lodash/lodash/releases) - [Commits](https://github.com/lodash/lodash/compare/4.6.0...4.6.1) Signed-off-by: dependabot[bot] <support@github.com>	2019-12-06 14:13:32 +00:00

16684 commits