Commit graph

17068 commits

Author SHA1 Message Date
Preetha Appan afff27b69b More error->debug for logging in the bin packing iterator 2019-12-12 15:50:16 -06:00
ebarriosjr b953239227 driver/pot: Added extra_hosts and args commands (#6577) 2019-12-12 16:29:45 -05:00
Buck Doyle 09067b4eb7
UI: Fix client sorting (#6817)
There are two changes here, and some caveats/commentary:

1. The “State” table column was actually sorting only by status. The state was not an actual property, just something calculated in each client row, as a product of status, isEligible, and isDraining. This PR adds isDraining as a component of compositeState so it can be used for sorting.

2. The Sortable mixin declares dependent keys that cause the sort to be live-updating, but only if the members of the array change (for example, when a new client is added), not if any of the sortable properties change. This PR adds a SortableFactory function that generates a mixin whose listSorted computed property includes dependent keys for the sortable properties, so the table will live-update if any of the sortable properties change, not just the array members. There’s a warning if you use SortableFactory without dependent keys or via the original Sortable interface, so we can eventually migrate away from it.
2019-12-12 13:06:54 -06:00
Michael Lange ed8fd28a10
Merge pull request #6808 from hashicorp/b-ui/unclosed-log-streams
UI: Unclosed log streams
2019-12-12 10:55:49 -08:00
Preetha Appan 3458b41290 Use debug logging for scheduler internals
We currently log an error if preemption is unable to find a suitable set of
allocations to preempt. This commit changes that to debug level since not finding
preemptable allocations is not an error condition.
2019-12-12 12:05:29 -06:00
Tim Gross e439e927ed
e2e: run client/allocs metrics tests nightly (#6842)
Refactor the metrics end-to-end tests so they can be run with our e2e
test framework. Runs fabio/prometheus and a collection of jobs that
will cause metrics to be measured. We then query Prometheus to ensure
we're publishing those allocation metrics and some metrics from the
clients as well.

Includes adding a placeholder for running the same tests on Windows.
2019-12-12 12:45:16 -05:00
Mahmood Ali d80ae6765b simplify cgroup path lookup 2019-12-11 12:43:25 -05:00
Seth Hoenig 2600c95af7
Merge pull request #6838 from hashicorp/f-parallelize-state-store-tests
tests: parallelize state store tests
2019-12-11 11:05:52 -06:00
Mahmood Ali 94ab62dfb4 executor: stop joining executor to container cgroup
Stop joining libcontainer executor process into the newly created task
container cgroup, to ensure that the cgroups are fully destroyed on
shutdown, and to make it consistent with other plugin processes.

Previously, the executor process was added to the container cgroup so that
executor process resources were aggregated along with user processes in
our metric aggregation.

However, adding the executor process to the container cgroup adds some
complications without much benefit:

First, it complicates cleanup.  We must ensure that the executor is
removed from the container cgroup on shutdown.  Even so, we had a bug where
we missed removing it from the systemd cgroup, because executor uses
`containerState.CgroupPaths` on launch, which includes systemd, but
`cgroups.GetAllSubsystems` for removal, which doesn't.

Second, it may have adverse side-effects.  When a user process is cpu
bound or uses too much memory, executor should remain functioning
without risk of being killed (by OOM killer) or throttled.

Third, it is inconsistent with other drivers and plugins.  Logmon and
DockerLogger processes aren't in the task cgroups.  Neither are
containerd processes, though it is equivalent to executor in
responsibility.

Fourth, in my experience when executor process moves cgroup while it's
running, the cgroup aggregation is odd.  The cgroup
`memory.usage_in_bytes` doesn't seem to capture the full memory usage of
the executor process and becomes a red herring when investigating memory
issues.

For all the reasons above, I opted to have executor remain in nomad
agent cgroup and we can revisit this when we have a better story for
plugin process cgroup management.
2019-12-11 11:28:09 -05:00
Mahmood Ali 739e5e8811 drivers/exec: test all cgroups are destroyed 2019-12-11 11:12:29 -05:00
Seth Hoenig d45dec1ca8 tests: parallelize state store tests
It has been decided we're going to live in a many core world.
Let's take advantage of that and parallelize these state store
tests which all run in memory and are largely CPU bound.

An unscientific benchmark demonstrating the improvement:

[mp state (master)] $ go test
PASS
ok  	github.com/hashicorp/nomad/nomad/state	5.162s

[mp state (f-parallelize-state-store-tests)] $ go test
PASS
ok  	github.com/hashicorp/nomad/nomad/state	1.527s
2019-12-11 09:36:37 -06:00
Tim Gross b25713a837
doc: spread is inherited from job to group (#6837) 2019-12-11 09:59:26 -05:00
Drew Bailey a8bb422500
Merge pull request #6834 from hashicorp/monitor-changelog
add 6828 to changelog
2019-12-11 08:17:12 -05:00
Michael Schurter cdfaa3ca8a
Merge pull request #6833 from hashicorp/sentinel-imports-note
Make note of Sentinel standard imports
2019-12-10 13:56:01 -08:00
Chris Arcand deb84a41f6 Make note of Sentinel standard imports
> Sentinel-embedded applications can choose to whitelist or blacklist
certain standard imports. Please reference the documentation for the
Sentinel-enabled application you're using to determine if all standard
imports are available.
2019-12-10 14:44:51 -06:00
Drew Bailey fc6f49dba9
add 6828 to changelog 2019-12-10 15:02:34 -05:00
Tim Gross 5289c1e0aa
doc: explain ALLOC_INDEX uniqueness guarantees (#6830)
The `ALLOC_INDEX` isn't guaranteed to be unique, and this has caused
some user confusion. The servers make a best-effort attempt to make
this value unique from 0 to count-1 but when you have canaries on the
task group, there are reused indexes because you have multiple job
versions running at the same time. If a user needs a unique number for
interpolating a value in their application, they can get this by
combining the job version and the alloc index.

Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
2019-12-10 10:30:26 -05:00
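
The combination described in the commit above can be sketched in a few lines of Go. This is only an illustration: `uniqueAllocKey` is a hypothetical helper, and how an application obtains the job version and alloc index is left to the caller, since the commit message does not say.

```go
package main

import "fmt"

// uniqueAllocKey is a hypothetical helper, not part of Nomad. It joins a
// job version and an alloc index into one identifier, so two allocations
// that share an index across job versions still get distinct keys.
func uniqueAllocKey(jobVersion uint64, allocIndex int) string {
	return fmt.Sprintf("v%d-%d", jobVersion, allocIndex)
}

func main() {
	// During a canary deployment, index 0 can exist in two job versions.
	fmt.Println(uniqueAllocKey(3, 0))
	fmt.Println(uniqueAllocKey(4, 0))
}
```

The key is only as unique as the pair it is built from, so it distinguishes allocations from different job versions that happen to share an index.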
Danielle 246a4e898b
Merge pull request #6828 from hashicorp/b/nomad-monitor-panic
command: error when no node is found for `monitor`
2019-12-10 14:29:32 +01:00
Danielle Lancashire cd764ab0e9
command: error when no node is found for monitor
Currently `nomad monitor -node-id` will panic when a node-id does not
match any nodes, as there is no empty result bounds checking. Here we
return an error to the user when no nodes are found.
2019-12-10 13:10:47 +01:00
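
A minimal Go sketch of the kind of empty-result check this commit describes. The `node` type, `firstNode`, and the error text are hypothetical and are not taken from the Nomad source; the point is only that indexing into an empty slice panics, while checking the length first lets the command return an error.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a hypothetical stand-in for a node lookup result.
type node struct{ ID string }

// errNoNodeFound is an illustrative error value, not Nomad's actual message.
var errNoNodeFound = errors.New("no node found for the given node-id")

// firstNode returns the first match, or an error when the lookup is empty.
// Without the length check, nodes[0] would panic on an empty slice.
func firstNode(nodes []node) (node, error) {
	if len(nodes) == 0 {
		return node{}, errNoNodeFound
	}
	return nodes[0], nil
}

func main() {
	if _, err := firstNode(nil); err != nil {
		fmt.Println("error:", err)
	}
}
```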
Chris Dickson 4d8ba272d1 client: expose allocated CPU per task (#6784) 2019-12-09 15:40:22 -05:00
Seth Hoenig 2508892973
Merge pull request #6800 from hashicorp/b-update-freeport
tests: swap lib/freeport for tweaked helper/freeport
2019-12-09 09:50:26 -06:00
Tim Gross 74a01477fd
Merge pull request #6631 from hashicorp/dependabot/npm_and_yarn/ui/lodash.mergewith-4.6.2
Bump lodash.mergewith from 4.6.1 to 4.6.2 in /ui
2019-12-09 09:47:14 -05:00
Tim Gross 06e30473c0
Merge pull request #6629 from hashicorp/dependabot/npm_and_yarn/ui/lodash.defaultsdeep-4.6.1
Bump lodash.defaultsdeep from 4.6.0 to 4.6.1 in /ui
2019-12-09 09:47:05 -05:00
Seth Hoenig f0c3dca49c tests: swap lib/freeport for tweaked helper/freeport
Copy the updated version of freeport (sdk/freeport), and tweak it for use
in Nomad tests. This means staying below port 10000 to avoid conflicts with
the lib/freeport that is still transitively used by the old version of
consul that we vendor. Also provide implementations to find ephemeral ports
on macOS and Windows environments.

Ports acquired through freeport are supposed to be returned to freeport,
which this change now also introduces. Many tests are modified to include
calls to a cleanup function for Server objects.

This should help quite a bit with some flakey tests, but not all of them.
Our port problems will not go away completely until we upgrade our vendor
version of consul. With Go modules, we'll probably do a 'replace' to swap
out other copies of freeport with the one now in 'nomad/helper/freeport'.
2019-12-09 08:37:32 -06:00
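
The take-and-return pattern mentioned in this commit can be sketched as a small pool in Go. This is not the freeport implementation: `portPool`, `take`, and `giveBack` are hypothetical names, the port block is arbitrary, and the sketch does not check whether the operating system actually has those ports available.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// portPool is a hypothetical stand-in for a freeport-style allocator.
// It hands out ports from a fixed block and expects them to be returned.
type portPool struct {
	mu   sync.Mutex
	free []int
}

// newPortPool fills the pool with count consecutive ports starting at first.
func newPortPool(first, count int) *portPool {
	p := &portPool{}
	for i := 0; i < count; i++ {
		p.free = append(p.free, first+i)
	}
	return p
}

// take removes n ports from the pool, or fails if fewer than n remain.
func (p *portPool) take(n int) ([]int, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if n > len(p.free) {
		return nil, errors.New("not enough free ports")
	}
	ports := append([]int(nil), p.free[:n]...)
	p.free = p.free[n:]
	return ports, nil
}

// giveBack returns ports to the pool so later callers can reuse them.
func (p *portPool) giveBack(ports []int) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.free = append(p.free, ports...)
}

func main() {
	pool := newPortPool(9000, 4) // illustrative block kept below port 10000
	ports, err := pool.take(3)
	if err != nil {
		fmt.Println("take failed:", err)
		return
	}
	defer pool.giveBack(ports) // the cleanup step a test would register
	fmt.Println("using ports", ports)
}
```

If the ports are never given back, the pool drains and later calls to `take` fail, which is one way leaked ports can surface as flaky tests.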
dependabot[bot] 28ff457228
Bump lodash.mergewith from 4.6.1 to 4.6.2 in /ui
Bumps [lodash.mergewith](https://github.com/lodash/lodash) from 4.6.1 to 4.6.2.
- [Release notes](https://github.com/lodash/lodash/releases)
- [Commits](https://github.com/lodash/lodash/commits)

Signed-off-by: dependabot[bot] <support@github.com>
2019-12-09 13:37:45 +00:00
Tim Gross 7ab879f347
Merge pull request #6628 from hashicorp/dependabot/npm_and_yarn/ui/handlebars-4.1.2
Bump handlebars from 4.1.1 to 4.1.2 in /ui
2019-12-09 08:37:10 -05:00
Tim Gross 48608acd37
Bump fstream from 1.0.11 to 1.0.12 in /ui (#6630)
Bumps [fstream](https://github.com/npm/fstream) from 1.0.11 to 1.0.12.
- [Release notes](https://github.com/npm/fstream/releases)
- [Commits](https://github.com/npm/fstream/compare/v1.0.11...v1.0.12)

Signed-off-by: dependabot[bot] <support@github.com>
2019-12-09 08:36:57 -05:00
dependabot[bot] 0d89b2f5ea Bump lodash.merge from 4.6.1 to 4.6.2 in /ui (#6632)
Bumps [lodash.merge](https://github.com/lodash/lodash) from 4.6.1 to 4.6.2.
- [Release notes](https://github.com/lodash/lodash/releases)
- [Commits](https://github.com/lodash/lodash/commits)

Signed-off-by: dependabot[bot] <support@github.com>
2019-12-09 08:36:44 -05:00
Mahmood Ali 0b7085ba3a driver: allow disabling log collection
Operators commonly have docker logs aggregated using various tools and
don't need nomad to manage their docker logs.  Worse, Nomad uses a
somewhat heavy docker api call to collect them and it seems to cause
problems when a client runs hundreds of log collections.

Here we add a knob to disable log aggregation completely for nomad.
When log collection is disabled, we avoid running logmon and
docker_logger for the docker tasks in this implementation.

The downside here is once disabled, `nomad logs ...` commands and API
no longer return logs and operators must correlate alloc-ids with their
aggregated log info.

This is meant as a stop gap measure.  Ideally, we'd follow up with at
least two changes:

First, we should optimize behavior when we can such that operators don't
need to disable docker log collection.  Potentially by reverting to
using pre-0.9 syslog aggregation in linux environments, though with
different trade-offs.

Second, when/if logs are disabled, nomad logs endpoints should look up
docker logs api on demand.  This ensures that the cost of log collection
is paid sparingly.
2019-12-08 14:15:03 -05:00
Mahmood Ali ded2a725db
Merge pull request #6788 from hashicorp/b-timeout-logmon-stop
logmon: add timeout to RPC operations
2019-12-06 19:12:06 -05:00
abhip c81b3f5cbd Update consensus.html.md (#6813)
The URL for the Raft algorithm PDF is no longer valid. The correct URL is https://raft.github.io/raft.pdf and the website is https://raft.github.io/
2019-12-06 06:17:30 -08:00
Seth Hoenig c83d479bf8
Merge pull request #6814 from hashicorp/f-use-golangci-lint
swap gometalinter for golangci-lint
2019-12-06 08:16:17 -06:00
dependabot[bot] 12e34aef1d
Bump lodash.defaultsdeep from 4.6.0 to 4.6.1 in /ui
Bumps [lodash.defaultsdeep](https://github.com/lodash/lodash) from 4.6.0 to 4.6.1.
- [Release notes](https://github.com/lodash/lodash/releases)
- [Commits](https://github.com/lodash/lodash/compare/4.6.0...4.6.1)

Signed-off-by: dependabot[bot] <support@github.com>
2019-12-06 14:13:32 +00:00
dependabot[bot] 66ff8a191f
Bump handlebars from 4.1.1 to 4.1.2 in /ui
Bumps [handlebars](https://github.com/wycats/handlebars.js) from 4.1.1 to 4.1.2.
- [Release notes](https://github.com/wycats/handlebars.js/releases)
- [Changelog](https://github.com/wycats/handlebars.js/blob/master/release-notes.md)
- [Commits](https://github.com/wycats/handlebars.js/compare/v4.1.1...v4.1.2)

Signed-off-by: dependabot[bot] <support@github.com>
2019-12-06 14:13:27 +00:00
dependabot[bot] 2020fbc7a4
Bump fstream from 1.0.11 to 1.0.12 in /ui
Bumps [fstream](https://github.com/npm/fstream) from 1.0.11 to 1.0.12.
- [Release notes](https://github.com/npm/fstream/releases)
- [Commits](https://github.com/npm/fstream/compare/v1.0.11...v1.0.12)

Signed-off-by: dependabot[bot] <support@github.com>
2019-12-06 14:13:19 +00:00
Charlie Voiselle ad09167b08 Added Discuss badge to README (#6695) 2019-12-06 06:09:23 -08:00
Fredrik Hoem Grelland 3e164b42fe Update network.html.md (#6782)
There is an undocumented way of mapping a dynamically allocated port to the container. It applies in bridge networking (necessary for Consul Connect enabled services) to expose the service *directly*. This is needed when you use upstream Connect services but also need to expose the service by normal means. According to the current documentation, you need to use static ports in order to do so. Introduced in #6189 but undocumented.
2019-12-06 06:08:56 -08:00
James Rasell 6fcb19890b docs: add jrasell/chemtrail to community resources. (#6798) 2019-12-06 06:07:00 -08:00
Seth Hoenig 37c7e60cd5 spellcheck: add misspell to linter dependencies
You'd think since golangci-lint embeds misspell we could use that,
but it fails to run if it finds no Go source files, which is the
case in our website/ directory that we want to check.
2019-12-05 19:27:14 -06:00
Danielle Lancashire d2075ebae9 spellcheck: Fix spelling of retrieve 2019-12-05 18:59:47 -06:00
Danielle Lancashire 0071a06c8b chore: Switch from gometalinter to golangci-lint
gometalinter has been deprecated, with golangci-lint as its spiritual
and recommended successor. Here we switch to using it with an equivalent
configuration, albeit with newer versions of some linters.

To maintain compatibility with existing settings, we have a couple of
things disabled here, specifically:

- tests
        We have a lot of unused code in our tests that chokes deadcode.
        We should attempt to clean this up soon so that we can lint our
        test code.
- govet.check-shadowing = false
        This breaks on redefining `err` which we do all over the nomad
        codebase.
2019-12-05 18:58:13 -06:00
Michael Lange 13d9e5ead7 Don't initialize variables before you need them 2019-12-05 10:51:51 -08:00
Mahmood Ali 574dac202c
Merge pull request #6809 from hoh/patch-1
docs: port name requires quotes in hcl
2019-12-05 10:09:28 -05:00
Hugo Herter a98e59e0fc docs: port name requires quotes in hcl
When trying to run this example, Nomad v0.10.2 raises the following error:
`Error getting job struct: Error parsing job file from example-ipv6.hcl: error parsing: At 33:22: Unknown token: 27:16 IDENT db`

Adding quotes around the port map `db` fixes the problem and the job works as expected.
2019-12-05 12:37:24 +01:00
Michael Lange 2c6392ff1e Light test coverage for the stream-logger class 2019-12-05 00:19:36 -08:00
Michael Lange 9504d10ce7 Fix a race condition where a response sent after the logger stops never gets canceled 2019-12-05 00:19:35 -08:00
Michael Schurter c26ec9a59c
Merge pull request #6807 from hashicorp/docs-097
docs: add #6792 backport to 0.9.7 changelog
2019-12-04 14:49:16 -08:00
Michael Schurter ba88ff874b docs: add #6792 backport to 0.9.7 changelog 2019-12-04 14:48:45 -08:00
Michael Schurter 1abc36c42c
Merge pull request #6805 from hashicorp/release-0102
Release 0102
2019-12-04 14:38:44 -08:00
Michael Schurter 8daeeb84c8
Merge pull request #6806 from hashicorp/docs-0102
docs: release 0.10.2 and 0.9.7
2019-12-04 14:27:53 -08:00