open-nomad

Commit Graph

Author	SHA1	Message	Date
Tim Gross	32b94bf1a4	csi: stub fingerprint on instance manager shutdown (#7388) Run the plugin fingerprint one last time with a closed client during instance manager shutdown. This will return quickly and will give us a correctly-populated `PluginInfo` marked as unhealthy so the Nomad client can update the server about plugin health.	2020-03-23 13:59:25 -04:00
Tim Gross	de4ad6ca38	csi: add Provider field to CSI CLIs and APIs (#7285) Derive a provider name and version for plugins (and the volumes that use them) from the CSI identity API `GetPluginInfo`. Expose the vendor name as `Provider` in the API and CLI commands.	2020-03-23 13:58:30 -04:00
Danielle Lancashire	247e86bb35	csi: VolumeCapabilities for ControllerPublishVolume This commit introduces support for providing VolumeCapabilities during requests to `ControllerPublishVolume` as this is a required field.	2020-03-23 13:58:30 -04:00
Danielle Lancashire	34acb596e3	plugins/csi: Implement ControllerValidateCapabilities RPC	2020-03-23 13:58:30 -04:00
Danielle Lancashire	6b7ee96a88	csi: Move VolumeCapabilities helper to package	2020-03-23 13:58:30 -04:00
Danielle Lancashire	6762442199	csiclient: Add grpc.CallOption support to NodeUnpublishVolume	2020-03-23 13:58:30 -04:00
Tim Gross	60901fa764	csi: implement CSI controller detach request/response (#7107) This changeset implements the minimal structs on the client-side we need to compile the work-in-progress implementation of the server-to-controller RPCs. It doesn't include implementing the `ClientCSI.DettachVolume` RPC on the client.	2020-03-23 13:58:29 -04:00
Danielle Lancashire	a5c96ce2e1	csi: Add grpc.CallOption support to NodePublishVolume	2020-03-23 13:58:29 -04:00
Danielle Lancashire	add55e37b8	csi: Expose gRPC Options on NodeUnstageVolume	2020-03-23 13:58:29 -04:00
Danielle Lancashire	65d9ddc9af	csi: Expose grpc.CallOptions for NodeStageVolume	2020-03-23 13:58:29 -04:00
Danielle Lancashire	51270ae0f4	csi: Support for NodeUnpublishVolume RPCs	2020-03-23 13:58:29 -04:00
Danielle Lancashire	a4b96aff33	csi: Nil check ToCSIRepresentation implementations	2020-03-23 13:58:29 -04:00
Danielle Lancashire	02c4612e65	csi: Add NodePublishVolume RPCs	2020-03-23 13:58:29 -04:00
Danielle Lancashire	98f00a9220	csi: Add NodeUnstageVolume RPCs to CSIPlugin	2020-03-23 13:58:29 -04:00
Danielle Lancashire	5c447396fa	csi: Add NodeUnstageVolume as a CSI Dependency	2020-03-23 13:58:29 -04:00
Danielle Lancashire	f208770e94	csi: Add NodeStageVolume to fake client	2020-03-23 13:58:29 -04:00
Danielle Lancashire	07651a5231	csi: Add NodeStageVolume RPC	2020-03-23 13:58:29 -04:00
Danielle Lancashire	317b680744	csi: Add csi.NodeStageVolume to the NodeClient Implements a fake version of NodeStageVolume as a dependency of implementing the client.NodeStageVolume request	2020-03-23 13:58:29 -04:00
Danielle Lancashire	ab1edd4e24	csi: Add Nomad Model for VolumeCapabilities This commit introduces a nomad model for interacting with CSI VolumeCapabilities as a pre-requisite for implementing NodeStageVolume and NodeMountVolume correctly. These fields have a few special characteristics that I've tried to model here - specifically, we make a basic attempt to avoid printing data that should be redacted during debug logs (additional mount flags), and also attempt to make debuggability of other integer fields easier by implementing the fmt.Stringer and fmt.GoStringer interfaces as necessary. We do not currently implement a CSI Protobuf -> Nomad implementation transformation as this is currently not needed by any used RPCs.	2020-03-23 13:58:29 -04:00
Danielle Lancashire	de5d373001	csi: Setup gRPC Clients with a logger	2020-03-23 13:58:29 -04:00
Danielle Lancashire	c16812280c	csi: Add NodeGetCapabilities RPC	2020-03-23 13:58:29 -04:00
Danielle Lancashire	05525c98ae	plugins_csi: Add GetControllerCapabilities RPC	2020-03-23 13:58:28 -04:00
Danielle Lancashire	72ee2d4c1c	csi: Add initial plumbing for controller rpcs	2020-03-23 13:58:28 -04:00
Danielle Lancashire	426c26d7c0	CSI Plugin Registration (#6555) This changeset implements the initial registration and fingerprinting of CSI Plugins as part of #5378. At a high level, it introduces the following: * A `csi_plugin` stanza as part of a Nomad task configuration, to allow a task to expose that it is a plugin. * A new task runner hook: `csi_plugin_supervisor`. This hook does two things. When the `csi_plugin` stanza is detected, it will automatically configure the plugin task to receive bidirectional mounts to the CSI intermediary directory. At runtime, it will then perform an initial heartbeat of the plugin and handle submitting it to the new `dynamicplugins.Registry` for further use by the client, and then run a lightweight heartbeat loop that will emit task events when health changes. * The `dynamicplugins.Registry` for handling plugins that run as Nomad tasks, in contrast to the existing catalog that requires `go-plugin` type plugins and to know the plugin configuration in advance. * The `csimanager` which fingerprints CSI plugins, in a similar way to `drivermanager` and `devicemanager`. It currently only fingerprints the NodeID from the plugin, and assumes that all plugins are monolithic. Missing features * We do not use the live updates of the `dynamicplugin` registry in the `csimanager` yet. * We do not deregister the plugins from the client when they shut down yet, they just become indefinitely marked as unhealthy. This is deliberate until we figure out how we should manage deploying new versions of plugins/transitioning them.	2020-03-23 13:58:28 -04:00
Chris Baker	d6364e13bc	fix typo in comment	2020-03-13 09:09:46 -05:00
Mahmood Ali	88cfe504a0	update grpc Upgrade grpc to v1.27.1 and protobuf plugins to v1.3.4.	2020-03-03 08:39:54 -05:00
Mahmood Ali	503d99e960	update protobuf generated files	2020-03-02 15:19:46 -05:00
Mahmood Ali	0b7085ba3a	driver: allow disabling log collection Operators commonly have docker logs aggregated using various tools and don't need nomad to manage their docker logs. Worse, Nomad uses a somewhat heavy docker api call to collect them and it seems to cause problems when a client runs hundreds of log collections. Here we add a knob to disable log aggregation completely for nomad. When log collection is disabled, we avoid running logmon and docker_logger for the docker tasks in this implementation. The downside here is once disabled, `nomad logs ...` commands and API no longer return logs and operators must correlate alloc-ids with their aggregated log info. This is meant as a stop gap measure. Ideally, we'd follow up with at least two changes: First, we should optimize behavior when we can such that operators don't need to disable docker log collection. Potentially by reverting to using pre-0.9 syslog aggregation in linux environments, though with different trade-offs. Second, when/if logs are disabled, nomad logs endpoints should look up docker logs api on demand. This ensures that the cost of log collection is paid sparingly.	2019-12-08 14:15:03 -05:00
Tim Gross	24be1b2b6f	fix plugin launcher SetConfig msgpack params (#6776) * fix plugin launcher SetConfig msgpack params The plugin launcher tool was passing the wrong byte array into `SetConfig`, resulting in msgpack decoding errors. This was fixed in a949050 (#6725) but accidentally reverted in 6aff18d (#6590). Co-Authored-By: Chris Baker <1675087+cgbaker@users.noreply.github.com>	2019-11-26 10:49:22 -05:00
Lang Martin	2d5737419c	plugins device: remove trace level containing config contents	2019-11-25 14:49:40 -05:00
Chris Baker	943c4f083e	the plugin launcher tool was passing the wrong byte array into SetConfig, resulting in msgpack decoding errors	2019-11-19 14:53:34 +00:00
Danielle Lancashire	4fbcc668d0	volumes: Add support for mount propagation This commit introduces support for configuring mount propagation when mounting volumes with the `volume_mount` stanza on Linux targets. Similar to Kubernetes, we expose 3 options for configuring mount propagation: - private, which is equivalent to `rprivate` on Linux, which does not allow the container to see any new nested mounts after the chroot was created. - host-to-task, which is equivalent to `rslave` on Linux, which allows new mounts that have been created _outside of the container_ to be visible inside the container after the chroot is created. - bidirectional, which is equivalent to `rshared` on Linux, which allows both the container to see new mounts created on the host, but importantly _allows the container to create mounts that are visible in other containers and on the host_. private and host-to-task are safe, but bidirectional mounts can be dangerous, as if the code inside a container creates a mount, and does not clean it up before tearing down the container, it can cause bad things to happen inside the kernel. To add a layer of safety here, we require that the user has ReadWrite permissions on the volume before allowing bidirectional mounts, as a defense in depth / validation case, although creating mounts should also require a privileged execution environment inside the container.	2019-10-14 14:09:58 +02:00
Tim Gross	d965a15490	driver/networking: don't recreate existing network namespaces	2019-09-25 14:58:17 -04:00
Grégoire Delattre	c6ac788258	Fix the ExecTask function in DriverExecTaskNotSupported (#6145) This fixes the ExecTask definition to match with the DriverPlugin interface.	2019-08-29 11:36:29 -04:00
Mahmood Ali	3791a70aa9	Merge pull request #5676 from hashicorp/f-b-upgrade-ugorji-dep-20190508 Update ugorji/go to latest	2019-08-23 18:29:49 -04:00
Lucas BEE	35a1b72962	Add NetworkIsolation in TaskConfig (#6135) NetworkIsolation was left out of the task config when using an external task driver plugin	2019-08-15 13:05:55 -04:00
Lucas BEE	406642f34a	Fix missing plugin driver capabilities (#6128) NetIsolationModes and MustInitiateNetwork were left out of the driver Capabilities when using an external task driver plugin Signed-off-by: Lucas BEE <pouulet@gmail.com>	2019-08-14 09:10:10 -04:00
Danielle Lancashire	6ef8d5233e	client: Add volume_hook for mounting volumes	2019-08-12 15:39:08 +02:00
Nick Ethier	971c8c9c2b	Driver networking support Adds support for passing network isolation config into drivers and implements support in the rawexec driver as a proof of concept	2019-07-31 01:03:20 -04:00
Nick Ethier	63c5504d56	ar: fix lint errors	2019-07-31 01:03:19 -04:00
Nick Ethier	2d60ef64d9	plugins/driver: make DriverNetworkManager interface optional	2019-07-31 01:03:19 -04:00
Nick Ethier	ab84630132	plugin/driver: fix tests and add new dep to vendor	2019-07-31 01:03:17 -04:00
Nick Ethier	548f78ef15	ar: initial driver based network management	2019-07-31 01:03:17 -04:00
Nick Ethier	66c514a388	Add network lifecycle management Adds new Prerun and Postrun hooks to manage set up of network namespaces on linux. Work still needs to be done to make the code platform agnostic and support Docker style network initialization.	2019-07-31 01:03:17 -04:00
Jasmine Dahilig	2157f6ddf1	add formatting for hcl parsing error messages (#5972)	2019-07-19 10:04:39 -07:00
Mahmood Ali	72d81da4e0	Signal plugin shutdown for driver.TaskStats The driver plugin stub client must call `grpcutils.HandleGrpcErr` to handle plugin shutdown similar to other functions. This ensures that TaskStats returns `ErrPluginShutdown` when the plugin shuts down.	2019-07-11 13:57:35 +08:00
Chris Baker	9442c26cff	docker: DestroyTask was not cleaning up Docker images because it was erroring early due to an attempt to inspect an image that had already been removed	2019-06-03 19:04:27 +00:00
Mahmood Ali	cf1f3625b4	Update ugorji/go to latest Our testing so far indicates that ugorji/go/codec maintains backward compatibility with the version we are using now, for purposes of Nomad serialization. Using latest ugorji/go allows us to get back to using upstream library, and get the optimization benefits in RPC paths (including code generation optimizations). ugorji/go introduced two significant changes: * time binary format in `debb8e2d2e`. Setting `h.BasicHandle.TimeNotBuiltin = true` restores old behavior * ugorji/go started honoring `json` tag as well: v1.1.4 is the latest but has a bug in handling RawString that's fixed in `d09a80c1e0`.	2019-05-09 19:35:58 -04:00
Mahmood Ali	57c18fec4e	Add basic drivers conformance tests Add consolidated testing package to serve as conformance tests for all drivers.	2019-05-09 16:49:08 -04:00
Mahmood Ali	13de875750	implement streaming exec handling in driver grpc handlers Also add a helper that adapts the high level interface to the low-level interface of nomad exec interfaces.	2019-05-09 16:49:08 -04:00
Mahmood Ali	f47d3d5f8a	add nomad streaming exec core data structures and interfaces In this commit, we add two driver interfaces for supporting `nomad exec` invocation: * A high level `ExecTaskStreamingDriver`, that operates on io reader/writers. Drivers should prefer using this interface * A low level `ExecTaskStreamingRawDriver` that operates on the raw stream of input structs; useful when a driver delegates handling to driver backend (e.g. across RPC/grpc). The interfaces are optional for a driver, as `nomad exec` support is opt-in. Existing drivers continue to compile without exec support, until their maintainers add such support. Furthermore, we create protobuf structures to represent exec stream entities: `ExecTaskStreamingRequest` and `ExecTaskStreamingResponse`. We aim to reuse the protobuf generated code as much as possible, without translation to avoid conversion overhead. `ExecTaskStream` abstracts fetching and sending stream entities. It's influenced by the grpc bi-directional stream interface, to avoid needing any adapter. I considered using channels, but the asynchronicity and concurrency make buffer reuse too complicated, which would put more pressure on GC and slow exec operation.	2019-04-30 14:02:29 -04:00
Mahmood Ali	0c8ee8c404	Simplify proto conversion and handle swap Convert all cpu and memory usage fields regardless of stated measured fields, and handle swap fields	2019-03-30 15:18:28 -04:00
Mahmood Ali	f4a68f556f	deserialize total ticks	2019-03-30 07:14:57 -04:00
Mahmood Ali	9656d79eba	Always report TotalTicks when percent is measured Fix a case where TotalTicks doesn't get serialized across executor grpc calls. Here, I opted to implicitly add the field, rather than explicitly mark it as a measured field, because it's a derived field and to preserve 0.8 behavior where total ticks aren't explicitly marked as a measured field.	2019-03-29 22:34:28 -04:00
Mahmood Ali	b08a2744f8	Merge pull request #5428 from hashicorp/b-dropped-logs-on-task-restart client/logmon: restart log collection correctly when a task is restarted	2019-03-21 14:02:08 -04:00
Nick Ethier	83936bea3c	logmon: fix logmon handling in driver test harness	2019-03-20 21:14:08 -04:00
Mahmood Ali	fb55717b0c	Regenerate Proto files (#5421) Noticed that the protobuf files are out of sync with ones generated by 1.2.0 protoc go plugin. The cause for these files seems to be related to release processes, e.g. [0.9.0-beta1 preparation](`ecec3d38de (diff-da4da188ee496377d456025c2eab4e87)`), and [0.9.0-beta3 preparation](`b849d84f2f`). This restores the changes to that of the pinned protoc version and fails build if protobuf files are out of sync. Sample failing Travis job is that of the first commit change: https://travis-ci.org/hashicorp/nomad/jobs/506285085	2019-03-14 10:56:27 -04:00
Michael Schurter	d74755900e	Generate files for 0.9.0-beta3 release	2019-02-26 09:44:49 -08:00
Michael Schurter	38821954b7	plugins: squelch context Canceled error logs As far as I can tell this is the most straightforward and resilient way to skip error logging on context cancellation with grpc streams. You cannot compare the error against context.Canceled directly as it is of type `*status.statusError`. The next best solution I found was: ```go resp, err := stream.Recv() if code, ok := err.(interface{ Code() code.Code }); ok { if code.Code == code.Canceled { return } } ``` However I think checking ctx.Err() directly makes the code much easier to read and is resilient against grpc API changes.	2019-02-21 15:32:18 -08:00
Michael Schurter	6c0cc65b2e	simplify hcl2 parsing helper No need to pass in the entire eval context	2019-02-04 11:07:57 -08:00
Alex Dadgar	bc804dda2e	Nomad 0.9.0-beta1 generated code	2019-01-30 10:49:44 -08:00
Nick Ethier	bb9a8afe9b	executor: fix bug and add tests for incorrect stats timestamp reporting	2019-01-28 21:57:45 -05:00
Mahmood Ali	389e043129	drivers: pass logger through driver plugin client This fixes a panic whenever driver plugin attempts to log a message.	2019-01-25 09:38:41 -05:00
Nick Ethier	c41f96943d	plugins/drivers: change stats interval to duration type in proto	2019-01-24 22:19:18 -05:00
Mahmood Ali	698f0a1575	Merge pull request #5229 from hashicorp/r-grabbag-201901019 Grab bag of small changes	2019-01-23 13:06:51 -05:00
Mahmood Ali	651bbe5330	spell check	2019-01-23 10:54:52 -05:00
Michael Schurter	32daa7b47b	goimports until make check is happy	2019-01-23 06:27:14 -08:00
Michael Schurter	be0bab7c3f	move pluginutils -> helper/pluginutils I wanted a different color bikeshed, so I get to paint it	2019-01-22 15:50:08 -08:00
Alex Dadgar	4bdccab550	goimports	2019-01-22 15:44:31 -08:00
Alex Dadgar	b7a65676fe	gofmt	2019-01-22 15:43:34 -08:00
Alex Dadgar	2ca0e97361	Split hclspec	2019-01-22 15:43:34 -08:00
Alex Dadgar	5ca6dd7988	move hclutils	2019-01-22 15:43:34 -08:00
Alex Dadgar	72a5691897	Driver tests do not use hcl2/hcl, hclspec, or hclutils	2019-01-22 15:43:34 -08:00
Alex Dadgar	b2c7268843	move reattach config	2019-01-22 15:11:58 -08:00
Alex Dadgar	cdcd3c929c	loader and singleton	2019-01-22 15:11:57 -08:00
Alex Dadgar	6c2782f037	move catalog + grpcutils	2019-01-22 15:11:57 -08:00
Alex Dadgar	f3a2d05e11	increase log level	2019-01-22 14:20:54 -08:00
Nick Ethier	7da9f5eac7	drivers: regen proto	2019-01-18 18:53:45 -05:00
Nick Ethier	e3c6f89b9a	drivers: use consts for task handle version	2019-01-18 18:31:01 -05:00
Nick Ethier	04772aac7c	drivers: fix test	2019-01-18 18:31:01 -05:00
Nick Ethier	6804450c69	cleanup code comments and small fixes from refactor	2019-01-18 18:31:01 -05:00
Nick Ethier	05bd369d1f	driver: add pre09 migration logic	2019-01-18 18:31:01 -05:00
Nick Ethier	e5a6fc9271	executor: add pre 0.9 client and wrapper	2019-01-18 18:30:58 -05:00
Mahmood Ali	5df63fda7c	Merge pull request #5190 from hashicorp/f-memory-usage Track Basic Memory Usage as reported by cgroups	2019-01-18 16:46:02 -05:00
Chris Baker	265f935d10	renamed misspelled function, fixed incorrect comment	2019-01-18 20:00:49 +00:00
Danielle Tomlinson	12eb4631ba	Merge pull request #5204 from hashicorp/dani/loader-typo-2 Fix typo in PluginLoader	2019-01-18 11:22:22 +01:00
Danielle Tomlinson	ae5ad68600	pluginloader: typo: s/validePluginConfig/validatePluginConfig	2019-01-17 19:10:32 +01:00
Danielle Tomlinson	707988fee1	plugins: Require an exe extension on windows	2019-01-17 18:43:14 +01:00
Danielle Tomlinson	7fca934509	chore: General Cleanup	2019-01-17 18:43:14 +01:00
Mahmood Ali	b7faf76343	chore: Stylistic cleanup Co-Authored-By: dantoml <dani@tomlinson.io>	2019-01-17 18:43:14 +01:00
Danielle Tomlinson	62e06eda56	chore: Cleanup formatting	2019-01-17 18:43:13 +01:00
Danielle Tomlinson	477c0b1d23	plugins: Load plugins on windows	2019-01-17 18:43:13 +01:00
Mahmood Ali	9909d98bee	Track Basic Memory Usage as reported by cgroups Track current memory usage, `memory.usage_in_bytes`, in addition to `memory.max_memory_usage_in_bytes` and friends. This number is closer to what Docker reports. Related to https://github.com/hashicorp/nomad/issues/5165.	2019-01-14 18:47:52 -05:00
Nick Ethier	3b395d7100	drivers: plumb grpc client logger	2019-01-12 12:18:23 -05:00
Nick Ethier	7e306afde3	executor: fix failing stats related test	2019-01-12 12:18:23 -05:00
Nick Ethier	b0d9440474	docker: add test for stats collection	2019-01-12 12:18:22 -05:00
Nick Ethier	9fea54e0dc	executor: implement streaming stats API plugins/driver: update driver interface to support streaming stats client/tr: use streaming stats api TODO: * how to handle errors and closed channel during stats streaming * prevent tight loop if Stats(ctx) returns an error drivers: update drivers TaskStats RPC to handle streaming results executor: better error handling in stats rpc docker: better control and error handling of stats rpc driver: allow stats to return a recoverable error	2019-01-12 12:18:22 -05:00
Mahmood Ali	b08f59cdda	Merge pull request #5162 from hashicorp/f-extract-lxc Extract LXC from nomad	2019-01-09 13:07:05 -05:00
Mahmood Ali	90f3cea187	Merge pull request #5157 from hashicorp/r-drivers-no-cstructs drivers: avoid referencing client/structs package	2019-01-09 13:06:46 -05:00
Mahmood Ali	f679975956	fixup! remove unused field	2019-01-08 12:58:12 -05:00
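
Note on 4fbcc668d0 (mount propagation): that message names three `volume_mount` propagation options and the Linux mode each one corresponds to, plus a ReadWrite requirement for bidirectional mounts. A minimal Go sketch of that mapping is below. The names `linuxPropagation` and `checkPropagation` are hypothetical and not taken from the Nomad source, and the permission check is modeled as a plain boolean rather than Nomad's actual ACL handling.

```go
package main

import (
	"errors"
	"fmt"
)

// linuxPropagation maps a propagation option from the commit message to the
// Linux propagation mode it is described as equivalent to.
// Hypothetical helper, not Nomad's implementation.
func linuxPropagation(mode string) (string, error) {
	switch mode {
	case "private":
		return "rprivate", nil
	case "host-to-task":
		return "rslave", nil
	case "bidirectional":
		return "rshared", nil
	default:
		return "", fmt.Errorf("unknown propagation mode %q", mode)
	}
}

var errNeedsReadWrite = errors.New("bidirectional propagation requires ReadWrite access to the volume")

// checkPropagation rejects unknown modes, and rejects bidirectional mounts
// when the caller lacks ReadWrite access (assumed here to be a boolean).
func checkPropagation(mode string, hasReadWrite bool) error {
	if _, err := linuxPropagation(mode); err != nil {
		return err
	}
	if mode == "bidirectional" && !hasReadWrite {
		return errNeedsReadWrite
	}
	return nil
}

func main() {
	for _, m := range []string{"private", "host-to-task", "bidirectional"} {
		flag, _ := linuxPropagation(m)
		fmt.Printf("%s -> %s\n", m, flag)
	}
	// A read-only volume may not use bidirectional propagation.
	fmt.Println(checkPropagation("bidirectional", false))
}
```

Keeping the lookup separate from the permission check means the two "safe" modes never touch the access check at all, which matches the message's framing of the ReadWrite requirement as an extra layer for bidirectional mounts only.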

325 Commits