Commit Graph

242 Commits

Author SHA1 Message Date
Mahmood Ali 5df63fda7c
Merge pull request #5190 from hashicorp/f-memory-usage
Track Basic Memory Usage as reported by cgroups
2019-01-18 16:46:02 -05:00
Chris Baker 265f935d10 renamed misspelled function, fixed incorrect comment 2019-01-18 20:00:49 +00:00
Danielle Tomlinson 12eb4631ba
Merge pull request #5204 from hashicorp/dani/loader-typo-2
Fix typo in PluginLoader
2019-01-18 11:22:22 +01:00
Danielle Tomlinson ae5ad68600 pluginloader: typo: s/validePluginConfig/validatePluginConfig 2019-01-17 19:10:32 +01:00
Danielle Tomlinson 707988fee1 plugins: Require an exe extension on windows 2019-01-17 18:43:14 +01:00
Danielle Tomlinson 7fca934509 chore: General Cleanup 2019-01-17 18:43:14 +01:00
Mahmood Ali b7faf76343 chore: Stylistic cleanup
Co-Authored-By: dantoml <dani@tomlinson.io>
2019-01-17 18:43:14 +01:00
Danielle Tomlinson 62e06eda56 chore: Cleanup formatting 2019-01-17 18:43:13 +01:00
Danielle Tomlinson 477c0b1d23 plugins: Load plugins on windows 2019-01-17 18:43:13 +01:00
Mahmood Ali 9909d98bee Track Basic Memory Usage as reported by cgroups
Track current memory usage, `memory.usage_in_bytes`, in addition to
`memory.max_memory_usage_in_bytes` and friends.  This number is closer
to what Docker reports.

Related to https://github.com/hashicorp/nomad/issues/5165 .
2019-01-14 18:47:52 -05:00
Nick Ethier 3b395d7100
drivers: plumb grpc client logger 2019-01-12 12:18:23 -05:00
Nick Ethier 7e306afde3
executor: fix failing stats related test 2019-01-12 12:18:23 -05:00
Nick Ethier b0d9440474
docker: add test for stats collection 2019-01-12 12:18:22 -05:00
Nick Ethier 9fea54e0dc
executor: implement streaming stats API
plugins/driver: update driver interface to support streaming stats

client/tr: use streaming stats api

TODO:
 * how to handle errors and closed channel during stats streaming
 * prevent tight loop if Stats(ctx) returns an error

drivers: update drivers TaskStats RPC to handle streaming results

executor: better error handling in stats rpc

docker: better control and error handling of stats rpc

driver: allow stats to return a recoverable error
2019-01-12 12:18:22 -05:00
Mahmood Ali b08f59cdda
Merge pull request #5162 from hashicorp/f-extract-lxc
Extract LXC from nomad
2019-01-09 13:07:05 -05:00
Mahmood Ali 90f3cea187
Merge pull request #5157 from hashicorp/r-drivers-no-cstructs
drivers: avoid referencing client/structs package
2019-01-09 13:06:46 -05:00
Mahmood Ali f679975956 fixup! remove unused field 2019-01-08 12:58:12 -05:00
Mahmood Ali f015b88ea7 remove unused field 2019-01-08 12:19:44 -05:00
Mahmood Ali 62a7f951c0 remove lxc references 2019-01-08 09:28:20 -05:00
Mahmood Ali 426c981c34 Remove some dead code 2019-01-08 09:11:48 -05:00
Mahmood Ali 64f80343fc drivers: re-export ResourceUsage structs
Re-export the ResourceUsage structs in drivers package to avoid drivers
depending directly on the internal client/structs package.

I attempted moving the structs to drivers, but that caused some import
cycles that were a bit hard to disentangle.  Alternatively, I added an
alias here that's sufficient for our purposes of avoiding external
drivers depending on internal packages, while allowing us to restructure
packages in future without breaking source compatibility.
2019-01-08 09:11:47 -05:00
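
The alias approach described in this commit relies on Go type aliases. The single-file sketch below only illustrates the language feature. The type names are hypothetical stand-ins, and in a real layout the two declarations would live in different packages.

```go
package main

import "fmt"

// internalResourceUsage stands in for a struct defined in an internal
// package that external drivers should not import directly.
type internalResourceUsage struct {
	MemoryRSS uint64
}

// ResourceUsage is a type alias (note the "="), not a new defined type.
// Both names denote the same type, so values pass between them without
// conversion.
type ResourceUsage = internalResourceUsage

// report accepts the public name; callers never mention the internal one.
func report(u ResourceUsage) string {
	return fmt.Sprintf("rss=%d", u.MemoryRSS)
}

func main() {
	internal := internalResourceUsage{MemoryRSS: 1024}
	// No conversion is needed because the alias and the original are identical.
	fmt.Println(report(internal))
}
```

Because the alias introduces no new type, the underlying struct can later move to another package while code written against the public name continues to compile.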
Mahmood Ali 916a40bb9e move cstructs.DeviceNetwork to drivers pkg 2019-01-08 09:11:47 -05:00
Mahmood Ali 9369b123de use drivers.FSIsolation 2019-01-08 09:11:47 -05:00
Danielle Tomlinson 8df20f49f7 drivers: Add internal interface for Shutdown
This allows us to correctly terminate internal state during runs of the
nomad test suite, e.g. closing eventer contexts correctly.
2019-01-08 13:48:49 +01:00
Alex Dadgar fb5dc9058e regenerate protos 2019-01-07 14:49:40 -08:00
Alex Dadgar c9825a9c36 recover 2019-01-07 14:49:40 -08:00
Alex Dadgar a6b36df4de remove nil logger 2019-01-07 14:48:01 -08:00
Preetha Appan 2fb2de3cef
Standardize driver health description messages for all drivers 2019-01-06 22:06:38 -06:00
Danielle Tomlinson 43f2dc0c36 chore: Fix environement->environment typo 2019-01-03 13:31:26 +01:00
Danielle Tomlinson 45174ac3e9
Merge pull request #5041 from hashicorp/dani/b-driver-healt
drivers: Cleanup root user fingerprinting
2019-01-03 13:16:28 +01:00
Danielle Tomlinson 63b5e1a9e9 plugins: Add consistent message for requires root 2018-12-20 12:54:01 +01:00
Alex Dadgar 9d34802f7a Store device envs separately and pass to drivers 2018-12-19 14:23:09 -08:00
Alex Dadgar fff09162aa proto 2018-12-19 13:54:19 -08:00
Nick Ethier ce1a5cba0e
drivermanager: use allocID and task name to route task events 2018-12-18 23:01:51 -05:00
Alex Dadgar 730a6f5b9a lint 2018-12-18 16:48:00 -08:00
Alex Dadgar 4c57d2ec4d Add plugin API versioning to plugin loader and plugins 2018-12-18 16:48:00 -08:00
Alex Dadgar 1dabde6e0b base fixes 2018-12-18 16:48:00 -08:00
Alex Dadgar 74e7e0fba7 protos 2018-12-18 16:48:00 -08:00
Alex Dadgar b653ae2af7 utilities 2018-12-18 15:48:52 -08:00
Alex Dadgar b9ee03b2c1 protos 2018-12-18 15:48:52 -08:00
Nick Ethier 0c50a51c19
executor: encode mounts and devices correctly when using grpc 2018-12-15 00:08:23 -05:00
Nick Ethier 09dadf0a23
Merge branch 'master' into f-grpc-executor
* master: (71 commits)
  Fix output of 'nomad deployment fail' with no arg
  Always create a running allocation when testing task state
  tests: ensure exec tests pass valid task resources (#4992)
  some changes for more idiomatic code
  fix iops related tests
  fixed bug in loop delay
  gofmt
  improved code for readability
  client: updateAlloc release lock after read
  fixup! device attributes in `nomad node status -verbose`
  drivers/exec: support device binds and mounts
  fix iops bug and increase test matrix coverage
  tests: tag image explicitly
  changelog
  ci: install lxc-templates explicitly
  tests: skip checking rdma cgroup
  ci: use Ubuntu 16.04 (Xenial) in TravisCI
  client: update driver info on new fingerprint
  drivers/docker: enforce volumes.enabled (#4983)
  client: Style: use fluent style for building loggers
  ...
2018-12-13 14:41:09 -05:00
Alex Dadgar 1531b6d534
Merge pull request #4970 from hashicorp/f-no-iops
Deprecate IOPS
2018-12-11 12:51:22 -08:00
Nick Ethier 86e9c11ec2
executor: don't drop errors when configuring libcontainer cfg, add nil check on resources 2018-12-07 14:03:42 -05:00
Nick Ethier 2283cb2c39
executor: use drivers.Resources as resource model 2018-12-06 21:22:02 -05:00
Nick Ethier 29ef54c0ee
executor: merge plugin shim with executor package 2018-12-06 21:13:45 -05:00
Nick Ethier 71353a88d4
executor: remove structs package 2018-12-06 20:54:14 -05:00
Alex Dadgar 1e3c3cb287 Deprecate IOPS
IOPS has been modelled as a resource since Nomad 0.1 but has never
actually been detected and there is no plan in the short term to add
detection. This is because IOPS is a bit simplistic of a unit to define
the performance requirements from the underlying storage system. In its
current state it adds unnecessary confusion and can be removed without
impacting any users. This PR leaves IOPS defined at the jobspec parsing
level and in the api/ resources since these are the two public uses of
the field. These should be considered deprecated and only exist to allow
users to stop using them during the Nomad 0.9.x release. In the future,
there should be no expectation that the field will exist.
2018-12-06 15:09:26 -08:00
Mahmood Ali 9e825f880c Use absolute path in example device plugin
deviceDir is used for specifying mount/device host paths, and those
should be absolute paths.
2018-12-06 15:46:35 -05:00
Nick Ethier 8b20de4801
executor: use grpc instead of netrpc as plugin protocol
* Added protobuf spec for executor
 * Separated executor structs into their own package
2018-12-05 11:03:56 -05:00
Danielle Tomlinson 8ba0a816f3 plugins: Add support for serving driver plugins 2018-12-01 17:30:54 +01:00
Danielle Tomlinson 393b76ed7f plugins: Move driver testing support to subpackage
This allows us to drop a cyclical import, but is suboptimal as it
requires BaseDriver tests to move. This falls firmly into the realm of
being a hack. Alternatives welcome.
2018-12-01 17:29:39 +01:00
Danielle Tomlinson 2db5ae38d8 client: Rename drivers/shared/env => client/taskenv 2018-11-30 12:18:39 +01:00
Danielle Tomlinson ffc5e5d56b executors: Unify go-plugin handshake 2018-11-30 10:59:23 +01:00
Danielle Tomlinson fdfe93aa25 fixup: executorplugin: fix rkt build 2018-11-30 10:47:08 +01:00
Danielle Tomlinson d26a310db0 client: Move executor plugins into own package 2018-11-30 10:46:13 +01:00
Danielle Tomlinson d582ea1d8b drivers: Create drivers/shared/structs
This creates a drivers/shared/structs package and moves the buffer size
checks into it.
2018-11-30 10:46:13 +01:00
Danielle Tomlinson 0544a57abe drivers: Move client/drivers/executor to drivers/shared/executor 2018-11-30 10:46:13 +01:00
Danielle Tomlinson 1a29811169 drivers: Move client/drivers/env to drivers/shared/env
As part of deprecating legacy drivers, we're moving the env package to a
new drivers/shared tree, as it is used by the modern docker and rkt
driver packages, and is useful for 3rd party plugins.
2018-11-30 10:46:13 +01:00
Chris Baker b43090a267
Merge pull request #4932 from hashicorp/b-1172-rkt-env-vars
change to testing utilities to fix rkt tests
2018-11-29 09:18:10 -05:00
Chris Baker da35fda145 testing: in MkAllocDir, do not update TaskConfig with All() from the task builder, just with Env() (because it pollutes environment variables with node attributes and fails the rkt tests) 2018-11-28 22:19:48 +00:00
Preetha 1f526db414
Merge pull request #4919 from hashicorp/f-fingerprint-attribute-type
Modify fingerprint interface to use typed attribute struct
2018-11-28 14:18:28 -06:00
Michael Schurter 1bd9a9f9dd
Merge pull request #4894 from hashicorp/f-device-hook
Device hook and devices affect computed node class
2018-11-28 12:10:43 -06:00
Preetha Appan f89dbcd9cc
modify fingerprint interface to use typed attribute struct 2018-11-28 10:01:03 -06:00
Mahmood Ali 6d34d2fade Add Driver Plugin for LXC 2018-11-27 21:40:43 -05:00
Alex Dadgar 4ee603c382 Device hook and devices affect computed node class
This PR introduces a device hook that retrieves the device mount
information for an allocation. It also updates the computed node class
computation to take into account devices.

TODO Fix the task runner unit test. The environment variable is being
lost even though it is being properly set in the prestart hook.
2018-11-27 17:25:33 -08:00
Chris Baker a1fb1f3830
Merge pull request #4891 from hashicorp/b-1150-rkt-volume-names
drivers/rkt: fix invalid volumes
2018-11-27 18:55:00 -05:00
Chris Baker c0bc9d069d change to docs in the driver proto to reflect standard pattern 2018-11-27 23:52:24 +00:00
Preetha Appan b9a22f8047
Fix panic in test setup when task does not have resources
This affects exec/rawexec drivers
2018-11-26 21:42:45 -06:00
Preetha Appan 125869686b
Fix nil dereference in copy method 2018-11-26 15:53:15 -06:00
Chris Baker 9bd4317139 modified TaskConfig to include AllocID
use this for volume names in drivers/rkt to address #1150
2018-11-26 18:54:26 +00:00
Mahmood Ali 141092e46d Formatting and typo fixes 2018-11-25 11:53:21 -05:00
Nick Ethier 1f3fe02e62
docker: sync access to exit result within a handle 2018-11-20 20:41:32 -05:00
Nick Ethier aa9f45ae47
docker: fix tests 2018-11-19 22:59:18 -05:00
Nick Ethier 4be8a86ef9
plugins/driver: remove NodeResources from task Resources and use PercentTicks field for docker driver 2018-11-19 22:59:17 -05:00
Nick Ethier ced5d5c445
docker: move recoverable error proto to shared structs 2018-11-19 22:59:16 -05:00
Nick Ethier 69049d37f5
drivers: added NodeResources to drivers.TaskConfig 2018-11-19 22:59:16 -05:00
Nick Ethier 3d7cdea19e
drivers/docker: more work porting tests from old driver plugin 2018-11-19 22:59:16 -05:00
Nick Ethier 117b9e6584
drivers: support recoverable errors in the plugin RPC layer 2018-11-19 22:59:15 -05:00
Nick Ethier 8f8698b3e1
docker: started work on porting docker driver to new plugin framework 2018-11-19 22:59:15 -05:00
Mahmood Ali b74ccc742c Expose Device Stats in /client/stats API endpoint 2018-11-14 14:41:19 -05:00
Mahmood Ali c5de71a424 Allow nullable fields in StatValues
In state values, we need to be able to distinguish between zero values
(e.g. `false`) and unset values (e.g. `nil`).

We can alternatively use protobuf `oneOf` and nested map to ensure
consistency of fields that are set together, but the golang
representation does not represent that well and introduces a mismatch
between representations.  Thus, I opted not to use it.
2018-11-14 14:41:19 -05:00
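
One common way to make a field nullable in Go is to use a pointer, so that `nil` means unset and a pointer to the zero value means explicitly set. The sketch below illustrates that idea under assumed names. The struct here is a simplified, hypothetical stand-in and is not copied from the commit.

```go
package main

import "fmt"

// StatValue is a simplified, hypothetical stand-in: a pointer field
// is nil when the value was never set.
type StatValue struct {
	BoolVal *bool
}

// describe distinguishes an unset value from an explicit false.
func describe(v StatValue) string {
	if v.BoolVal == nil {
		return "unset"
	}
	return fmt.Sprintf("%t", *v.BoolVal)
}

// boolPtr is a small helper for building pointer values inline.
func boolPtr(b bool) *bool {
	return &b
}

func main() {
	fmt.Println(describe(StatValue{}))                        // unset
	fmt.Println(describe(StatValue{BoolVal: boolPtr(false)})) // false
	fmt.Println(describe(StatValue{BoolVal: boolPtr(true)}))  // true
}
```

With a plain `bool` field, the first two cases would be indistinguishable.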
Mahmood Ali 713c9fe683 Move Stat{Object|Value} to plugins/shared/structs
Moving them as they may be useful for other packages/plugins besides
devices.
2018-11-14 09:01:26 -05:00
Mahmood Ali 1f4db08f42 Regenerate proto files with protoc-gen-go@v1.2.0 2018-11-14 09:01:26 -05:00
Mahmood Ali 1e92161f14
Merge pull request #4858 from hashicorp/b-fix-master-20181109
Fix some tests in master
2018-11-13 16:08:26 -05:00
Alex Dadgar 17e8446484
Merge pull request #4868 from hashicorp/b-plugin-ctx
Plugin clients handle plugin dying
2018-11-13 10:26:53 -08:00
Mahmood Ali ac3b4571eb Address review comments 2018-11-13 10:21:40 -05:00
Mahmood Ali fa146d9b85 fix plugin test 2018-11-13 10:21:40 -05:00
Alex Dadgar 693f244cce Plugin clients handle plugin dying
This PR plumbs the plugin's done ctx through the base and driver plugin
clients (device already had it). Further, it adds generic handling of
gRPC stream errors.
2018-11-12 17:09:27 -08:00
Mahmood Ali 032f86bc78 Add helper functions for checking unix root 2018-11-08 10:00:49 -08:00
Alex Dadgar c4f9e22aeb fix race 2018-11-07 12:22:07 -08:00
Alex Dadgar b4661df231 reserve uses donectx 2018-11-07 10:43:15 -08:00
Alex Dadgar f0c7a8159b tests 2018-11-07 10:43:15 -08:00
Alex Dadgar 204ca8230c Device manager
Introduce a device manager that manages the lifecycle of device plugins
on the client. It fingerprints, collects stats, and forwards Reserve
requests to the correct plugin. The manager, also handles device plugins
failing and validates their output.
2018-11-07 10:43:15 -08:00
Alex Dadgar feb83a2be3 assign devices 2018-11-07 10:32:03 -08:00
Mahmood Ali 53543b3e32 register the java plugin 2018-11-06 12:41:39 -08:00
Michael Schurter 392d548b85
Merge pull request #4828 from hashicorp/b-restore
Implement client agent restarting
2018-11-05 18:50:15 -06:00
Michael Schurter d29d09023e client: do not run terminal allocs 2018-11-05 12:32:05 -08:00
Michael Schurter 2bbd88888c client: first pass at implementing task restoring
Task restoring works but dead tasks may be restarted
2018-11-05 12:32:05 -08:00
Mahmood Ali a17521475d
Merge pull request #4826 from hashicorp/b-driver-exec-tweaks-20181031
Register exec driver plugin among some fixes
2018-11-02 10:11:05 -04:00