Commit graph

2184 commits

Author SHA1 Message Date
Preetha Appan f443c7d321
Fix whitespace 2019-02-14 12:49:26 -06:00
Chris Baker ab02f4588e
Update job_init.go
minor
2019-02-14 12:49:26 -06:00
Preetha Appan d405881e34
expand job init example with spread and affinity 2019-02-14 12:49:26 -06:00
Michael Schurter 3b84e08fa4
Merge pull request #5297 from hashicorp/b-docker-logging
Docker: Fix logging config parsing
2019-02-11 06:57:52 -08:00
Michael Schurter e3e1797850 consul: squelch noisy useless logs
Only log when syncing actually did something.
2019-02-04 11:07:57 -08:00
Iskander (Alex) Sharipov 7b1a4eaef9
nomad/command: fix strings.Contains args order
Swapped call args order to meet the expected behavior.

Signed-off-by: Iskander Sharipov <quasilyte@gmail.com>
2019-02-02 09:43:24 +03:00
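
The mistake this commit describes is easy to make because both parameters of `strings.Contains` are strings, so swapped arguments still compile. A minimal sketch of the pitfall follows; it is not the code from `nomad/command`, and the `output` and `needle` values are hypothetical:

```go
package main

import (
	"fmt"
	"strings"
)

// containsSwapped reproduces the bug: the arguments are reversed, so it asks
// whether the short needle contains the whole haystack.
func containsSwapped(haystack, needle string) bool {
	return strings.Contains(needle, haystack)
}

// containsFixed uses the documented order: strings.Contains(s, substr).
func containsFixed(haystack, needle string) bool {
	return strings.Contains(haystack, needle)
}

func main() {
	output := "Error querying job: permission denied" // hypothetical CLI output
	needle := "permission denied"                     // hypothetical substring

	fmt.Println(containsSwapped(output, needle)) // false
	fmt.Println(containsFixed(output, needle))   // true
}
```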
Michael Schurter cad3f1022a cli: do not duplicate reschedule headers per group
Fixes #5291
2019-02-01 09:28:36 -08:00
Alex Dadgar 84d0afccae Generate files for 0.9.0-beta2 2019-01-30 13:31:50 -08:00
Preetha Appan 8e621a167b
fix tests 2019-01-30 14:46:24 -06:00
Alex Dadgar 41265d4d61 Change types of weights on spread/affinity 2019-01-30 12:20:38 -08:00
Alex Dadgar bc804dda2e Nomad 0.9.0-beta1 generated code 2019-01-30 10:49:44 -08:00
Nick Ethier c21ce7b523
add circbufwriter package 2019-01-28 11:35:21 -05:00
Nick Ethier 3ef163b03b
executor: prevent logger from blocking when stderr pipe is detached 2019-01-25 23:08:01 -05:00
Michael Schurter 13f061a83f
Merge pull request #5196 from hashicorp/f-plugin-utils
Make plugins/shared external and make pluginutils/
2019-01-23 06:59:32 -08:00
Michael Schurter 32daa7b47b goimports until make check is happy 2019-01-23 06:27:14 -08:00
Preetha Appan 5f1d467ed2
nil check node resources to prevent panic 2019-01-22 19:34:02 -06:00
Michael Schurter be0bab7c3f move pluginutils -> helper/pluginutils
I wanted a different color bikeshed, so I get to paint it
2019-01-22 15:50:08 -08:00
Alex Dadgar 4bdccab550 goimports 2019-01-22 15:44:31 -08:00
Alex Dadgar cdcd3c929c loader and singleton 2019-01-22 15:11:57 -08:00
Alex Dadgar 6c2782f037 move catalog + grpcutils 2019-01-22 15:11:57 -08:00
Mahmood Ali 05e32fb525
Merge pull request #5213 from hashicorp/b-api-separate
Slimmer /api package
2019-01-18 20:52:53 -05:00
Mahmood Ali 5df63fda7c
Merge pull request #5190 from hashicorp/f-memory-usage
Track Basic Memory Usage as reported by cgroups
2019-01-18 16:46:02 -05:00
Mahmood Ali 6bdb9864de api: remove MockJob from exported functions
`api.MockJob` is a test utility that is only used by the `command/agent`
package. This moves it into that package and removes it from the public
API.
2019-01-18 14:51:31 -05:00
Michael Schurter 48afda786b
Merge pull request #5187 from hashicorp/test-consul
Port a bunch of pre-0.9 Consul tests to 0.9
2019-01-15 07:41:50 -08:00
Alex Dadgar 471fdb3ccf
Merge pull request #5173 from hashicorp/b-log-levels
Plugins use parent loggers
2019-01-14 16:14:30 -08:00
Mahmood Ali 9909d98bee Track Basic Memory Usage as reported by cgroups
Track current memory usage, `memory.usage_in_bytes`, in addition to
`memory.max_memory_usage_in_bytes` and friends.  This number is closer
to what Docker reports.

Related to https://github.com/hashicorp/nomad/issues/5165 .
2019-01-14 18:47:52 -05:00
Michael Schurter fc1bb95ef8 Remove old comment; it's been fixed! 2019-01-14 09:56:53 -08:00
Preetha Appan 7bd1440710
Refactor statedb factory config to set it directly in client config 2019-01-12 10:38:20 -06:00
Preetha Appan f059ef8a47
Modified destroy failure handling to rely on allocrunner's destroy method
Added a unit test with a custom statedb implementation that errors, used
to verify destroy errors
2019-01-12 10:37:12 -06:00
Alex Dadgar 5621086f50 Enable json logs 2019-01-11 11:36:37 -08:00
Alex Dadgar 14ed757a56 Plugins use parent loggers
This PR fixes various instances of plugins being launched without using
the parent loggers. This meant that logs would not all go to the same
output, formatting would break, etc.
2019-01-11 11:36:37 -08:00
Preetha Appan b46728a88b
Make spread weight a pointer with default value if unset 2019-01-11 10:31:21 -06:00
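
Making the weight a pointer lets the code tell "unset" apart from an explicit zero. A minimal sketch of that pattern, assuming a simplified hypothetical `Spread` type and an assumed default of 50 (neither is taken from this log):

```go
package main

import "fmt"

// Spread is a hypothetical, simplified stand-in for a jobspec spread block.
type Spread struct {
	Attribute string
	Weight    *int // nil means the user did not set a weight
}

// defaultWeight is an assumed default chosen for this sketch.
const defaultWeight = 50

// Canonicalize fills in the default only when the weight was left unset,
// so an explicit 0 is preserved.
func (s *Spread) Canonicalize() {
	if s.Weight == nil {
		w := defaultWeight
		s.Weight = &w
	}
}

// intPtr returns a pointer to a copy of i.
func intPtr(i int) *int { return &i }

func main() {
	unset := &Spread{Attribute: "${node.datacenter}"}
	zero := &Spread{Attribute: "${node.datacenter}", Weight: intPtr(0)}

	unset.Canonicalize()
	zero.Canonicalize()

	fmt.Println(*unset.Weight, *zero.Weight) // 50 0
}
```

With a plain `int` field, the zero value would be indistinguishable from a weight the user deliberately set to 0.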
Chris Baker e9db2ae822 Merge branch 'master' of github.com:hashicorp/nomad into f-1157-validate-node-meta-variables 2019-01-09 18:56:49 +00:00
Chris Baker d5b1a56f3b increased config validation coverage for dev mode 2019-01-09 18:56:40 +00:00
Michael Schurter ac169008f0
Merge pull request #5045 from hashicorp/b-drivermanager-tests-drain
drain: fix node drain monitoring
2019-01-09 10:23:28 -08:00
Mahmood Ali 90f3cea187
Merge pull request #5157 from hashicorp/r-drivers-no-cstructs
drivers: avoid referencing client/structs package
2019-01-09 13:06:46 -05:00
Mahmood Ali 03a9e812c8 cli: support hitting pre-0.9 nomad agents
node.NodeResources is nil when operating against pre-0.9.
2019-01-08 19:32:26 -05:00
Chris Baker d8a3a74c43 move if dev check into config validation, to support dev-mode
validation in the future
2019-01-08 22:21:48 +00:00
Michael Schurter 8a6b1acaa6 drain: fix node drain monitoring
The whole approach to monitoring drains has ordering issues and lacks
state to output useful error messages.

AFAICT to get the tests passing reliably I needed to change the behavior
of monitoring.

Parts of these tests are skipped in CI, and they should be rewritten as
e2e tests.
2019-01-08 09:35:16 -08:00
Chris Baker 220e9e838f refactored config validation into a new method, modified Meta.Client
tests appropriately
2019-01-08 15:07:36 +00:00
Mahmood Ali 916a40bb9e move cstructs.DeviceNetwork to drivers pkg 2019-01-08 09:11:47 -05:00
Chris Baker 91449d6809 Merge branch 'master' of github.com:hashicorp/nomad into f-1157-validate-node-meta-variables 2019-01-08 02:17:35 +00:00
Chris Baker bf00f93d87 moved interp key regex out to a helper function 2019-01-08 00:11:47 +00:00
Alex Dadgar 8a35d7b1dd Test recovery 2019-01-07 14:49:41 -08:00
Chris Baker f99e18aaf4 gofmt to make check happy 2019-01-07 18:01:59 +00:00
Chris Baker a61afad5bb added validation on client metadata keys 2019-01-07 17:16:38 +00:00
Nick Ethier ab3c5c0a8b
fix test 2018-12-20 13:54:29 -05:00
Nick Ethier fad553ab6a
command: wait for drivers to be ready before test 2018-12-20 13:52:33 -05:00
Nick Ethier 5b9bba08c6
fix tests 2018-12-20 01:05:17 -05:00
Nick Ethier 060ceb3635
fix test 2018-12-20 01:01:53 -05:00
Nick Ethier a96afb6c91
fix tests that fail as a result of async client startup 2018-12-20 00:53:44 -05:00
Nick Ethier 82175d1328
client/drivermananger: add driver manager
The driver manager is modeled after the device manager and is started by the client.
It's responsible for handling driver lifecycle and reattachment state, as well as
processing the incoming fingerprint and task events from each driver. The manager
exposes a method for registering event handlers for task events that is used by the
task runner to update the server when a task has been updated with an event.

Since driver fingerprinting has been implemented by the driver manager, it is no
longer needed in the fingerprint manager and has been removed.
2018-12-18 22:55:18 -05:00
Alex Dadgar 4c57d2ec4d Add plugin API versioning to plugin loader and plugins 2018-12-18 16:48:00 -08:00
Nick Ethier 09dadf0a23
Merge branch 'master' into f-grpc-executor
* master: (71 commits)
  Fix output of 'nomad deployment fail' with no arg
  Always create a running allocation when testing task state
  tests: ensure exec tests pass valid task resources (#4992)
  some changes for more idiomatic code
  fix iops related tests
  fixed bug in loop delay
  gofmt
  improved code for readability
  client: updateAlloc release lock after read
  fixup! device attributes in `nomad node status -verbose`
  drivers/exec: support device binds and mounts
  fix iops bug and increase test matrix coverage
  tests: tag image explicitly
  changelog
  ci: install lxc-templates explicitly
  tests: skip checking rdma cgroup
  ci: use Ubuntu 16.04 (Xenial) in TravisCI
  client: update driver info on new fingerprint
  drivers/docker: enforce volumes.enabled (#4983)
  client: Style: use fluent style for building loggers
  ...
2018-12-13 14:41:09 -05:00
Brian Lalor 31ef34838e
Fix output of 'nomad deployment fail' with no arg 2018-12-13 13:22:17 -05:00
Mahmood Ali d497729826
Merge pull request #4978 from hashicorp/f-device-tweaks
Display device attributes in `nomad node status -verbose`
2018-12-12 19:45:07 -05:00
Mahmood Ali 00c9385a2b fixup! device attributes in nomad node status -verbose 2018-12-12 09:17:31 -05:00
Alex Dadgar 86d9ad4397 fix iops bug and increase test matrix coverage 2018-12-11 15:28:21 -08:00
Mahmood Ali 69b2355274
Merge pull request #4975 from hashicorp/fix-master-20181209
Some test fixes and remedies
2018-12-11 18:00:21 -05:00
Alex Dadgar 1531b6d534
Merge pull request #4970 from hashicorp/f-no-iops
Deprecate IOPS
2018-12-11 12:51:22 -08:00
Mahmood Ali 5a487ac884 tests: prevent indefinite blocking in some tests
Noticed a few places where tests seem to block indefinitely and panic
after the test run reaches the test package timeout.

I intend to follow up with the proper fix later, but timing out is much
better than indefinitely blocking.
2018-12-11 09:35:26 -05:00
Michael Schurter 8808ab9cea
Merge pull request #4953 from hashicorp/b-script-context-wrapper
consul: add ScriptExecutor context wrapper
2018-12-10 17:22:53 -08:00
Michael Schurter 4c5f3ae82c
Merge pull request #4952 from hashicorp/b-script-context
consul: fix script checks exiting after 1 run
2018-12-10 17:22:15 -08:00
Mahmood Ali 14668f48d1 device attributes in nomad node status -verbose
This reports device attributes like the following:

```
$ nomad node status -self -verbose
ID          = f7adb958-29e1-2a5a-2303-9d61ffaab33a
Name        = mars.local
Class       = <none>
DC          = dc1
Drain       = false
Eligibility = eligible
Status      = ready
Uptime      = 12h40m13s

Drivers
Driver       Detected  Healthy  Message                               Time
docker       true      true     healthy                               2018-12-10T11:47:19-05:00
...

Attributes
cpu.arch                      = amd64
cpu.frequency                 = 2200
cpu.modelname                 = Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
cpu.numcores                  = 12
...

Device Group Attributes
Device Group = nomad/file/mock
block_device = sda1
filesystem   = ext4
size         = 63.2 GB

Meta
```
2018-12-10 12:18:24 -05:00
Mahmood Ali 9f69b8bfec Rename helper_stats -> helper_devices 2018-12-10 12:18:24 -05:00
Nick Ethier 47df1dde10
Merge branch 'master' into f-grpc-executor 2018-12-06 21:42:38 -05:00
Nick Ethier 29ef54c0ee
executor: merge plugin shim with executor package 2018-12-06 21:13:45 -05:00
Alex Dadgar c918a96490 Warn if IOPS is being used 2018-12-06 16:17:09 -08:00
Alex Dadgar 1e3c3cb287 Deprecate IOPS
IOPS has been modelled as a resource since Nomad 0.1 but has never
actually been detected and there is no plan in the short term to add
detection. This is because IOPS is a bit simplistic of a unit to define
the performance requirements from the underlying storage system. In its
current state it adds unnecessary confusion and can be removed without
impacting any users. This PR leaves IOPS defined at the jobspec parsing
level and in the api/ resources since these are the two public uses of
the field. These should be considered deprecated and only exist to allow
users to stop using them during the Nomad 0.9.x release. In the future,
there should be no expectation that the field will exist.
2018-12-06 15:09:26 -08:00
Nick Ethier 8b20de4801
executor: use grpc instead of netrpc as plugin protocol
* Added protobuf spec for executor
* Separated executor structs into their own package
2018-12-05 11:03:56 -05:00
Mahmood Ali f8efc40b8b tests: stop integration tests tasks explicitly
Also update to the newly recommended `nomad job` subcommands
2018-12-04 11:50:59 -05:00
Michael Schurter 8fa5e90095 consul: add ScriptExecutor context wrapper
Since d335a82859ca2177bc6deda0c2c85b559daf2db3 ScriptExecutors now take
a timeout duration instead of a context. This broke the script check
removal code which used context cancelation propagation to remove
script checks while they were executing.

This commit adds a wrapper around ScriptExecutors that obeys context
cancelation again. The only downside is that it leaks a goroutine until
the underlying Exec call completes or times out.

Since check removal is relatively rare, check timeouts usually low, and
scripts usually fast, the risk of leaking a goroutine seems very small.
2018-12-03 20:26:31 -08:00
Michael Schurter 6459c19ffc consul: fix script checks exiting after 1 run
Fixes a regression caused in d335a82859ca2177bc6deda0c2c85b559daf2db3

The removal of the inner context made the remaining cancels cancel the
outer context and cause script checks to exit prematurely.
2018-12-03 18:50:02 -08:00
Danielle Tomlinson 51a9f7369e
Merge pull request #4936 from hashicorp/f-legacy-refactor
Refactor and repackage client/driver
2018-11-30 13:38:06 +01:00
Danielle Tomlinson ffc5e5d56b executors: Unify go-plugin handshake 2018-11-30 10:59:23 +01:00
Danielle Tomlinson fdfe93aa25 fixup: executorplugin: fix rkt build 2018-11-30 10:47:08 +01:00
Danielle Tomlinson d26a310db0 client: Move executor plugins into own package 2018-11-30 10:46:13 +01:00
Danielle Tomlinson 9b3e731f88 command: Remove Extraneous field in nodedrain test 2018-11-30 10:46:13 +01:00
Nick Ethier 80ae7e34f4
Merge pull request #4906 from hashicorp/f-metric-prefix-master
Port metric prefix filtering to master
2018-11-29 22:27:47 -05:00
Nick Ethier b1484aec33
nomad: fix hclog usage 2018-11-29 22:27:39 -05:00
Alex Dadgar 4ee603c382 Device hook and devices affect computed node class
This PR introduces a device hook that retrieves the device mount
information for an allocation. It also updates the computed node class
computation to take into account devices.

TODO Fix the task runner unit test. The environment variable is being
lost even though it is being properly set in the prestart hook.
2018-11-27 17:25:33 -08:00
Nick Ethier ed65610ec6
command/agent: additional tests for telemetry config parsing 2018-11-19 23:22:33 -05:00
Nick Ethier b81e4e18f0
agent: support filter_default telemetry option 2018-11-19 23:21:48 -05:00
Nick Ethier 85b221a1d6
nomad: add flag to disable publishing of job_summary metrics for dispatched jobs 2018-11-19 23:21:19 -05:00
Nick Ethier 9e64ce7d73
docker: properly launch docker logger process 2018-11-19 22:59:12 -05:00
Mahmood Ali 9479015f51
Merge pull request #4884 from hashicorp/f-alloc-devices-cli
Report alloc device statistics in API and CLI
2018-11-16 18:04:54 -05:00
Mahmood Ali 6f9126f475 show Device Stats header in alloc status 2018-11-16 17:34:37 -05:00
Mahmood Ali 00ffd02ced Show stable order of device attributes 2018-11-16 17:34:37 -05:00
Preetha 5f094633fa
Merge pull request #4889 from hashicorp/f-service-meta
Pass service metadata "external-source" for consul UI integration
2018-11-16 12:24:21 -06:00
Preetha Appan 18708d3f0b
Pass service metadata "external-source" for consul UI integration 2018-11-16 11:28:56 -06:00
Mahmood Ali 33c96a803a tweak whitespace in device stats output 2018-11-16 10:37:39 -05:00
Mahmood Ali 159b8f866a Display device stats in nomad alloc status 2018-11-16 10:26:32 -05:00
Mahmood Ali d88a3f8413 Prepare to reuse device resources printing 2018-11-16 10:26:32 -05:00
Michael Schurter 6f3712ed48 gofmt -s -w command/helper_stats_test.go
Fixes the static checks build
2018-11-15 14:14:05 -08:00
Mahmood Ali 24b37e0aaf Display StatsObject nested objects as well 2018-11-15 08:09:54 -05:00
Mahmood Ali ee9353fbd6 Use disk display format for devices 2018-11-14 22:13:23 -05:00
Mahmood Ali 0712be643f Print verbose device in nomad node status -stats 2018-11-14 22:13:23 -05:00
Mahmood Ali 93e8fc53f9 device stats summary in node status
Sample output with a mock device:

```
Host Resource Utilization
CPU             Memory          Disk
2651/26400 MHz  9.6 GiB/16 GiB  98 GiB/234 GiB

Device Resource Utilization
nomad/file/mock[README.md]    511 bytes
nomad/file/mock[e2e.go]       239 bytes
nomad/file/mock[e2e_test.go]  128 bytes

Allocations
No allocations placed
```
2018-11-14 22:13:23 -05:00
Mahmood Ali c62ec124c0 Set clean config for mock driver
The default job here contains some exec task config (for setting
command and args) that isn't used by the mock driver. Now, the alloc
runner seems stricter about validating fields and errors on unexpected
fields.

Updating configs in tests so we can have an explicit task config
whenever driver is set explicitly.
2018-11-13 10:21:40 -05:00
Mahmood Ali c7610d8c22 mark and skip failing consul tests 2018-11-13 10:21:40 -05:00