Commit graph

26 commits

Author SHA1 Message Date
Mahmood Ali b51f00a7f3 logmon client to handle grpc closing errors 2019-04-25 14:32:24 -04:00
Mahmood Ali 81f4f07ed7 rename fifo methods for clarity 2019-04-01 16:52:58 -04:00
Mahmood Ali e87afe465b clarify closeDone blocking and field name 2019-04-01 16:10:34 -04:00
Mahmood Ali 2b1f858e1b log when fifo fails to open 2019-04-01 13:18:03 -04:00
Mahmood Ali 967452a3f0 fifo: Use plain fifo file in Unix
This PR switches to using plain fifo files instead of golang structs
managed by containerd/fifo library.

The library main benefit is management of opening fifo files.  In Linux,
a reader `open()` request would block until a writer opens the file (and
vice-versa).  The library uses goroutines so that it's the first IO
operation that blocks.

This benefit isn't really useful for us: Given that logmon simply
streams output in a separate process, blocking of opening or first read
is effectively the same.

The library additionally makes further complications for managing state
and tracking read/write permission that seems overhead for our use,
compared to using a file directly.

Looking here, I made the following incidental changes:
* document that we do handle if fifo files are already created, as we
rely on that behavior for logmon restarts
* use type system to lock read vs write: currently, fifo library returns
`io.ReadWriteCloser` even if fifo is opened for writing only!
2019-04-01 13:18:03 -04:00
Mahmood Ali b08a2744f8
Merge pull request #5428 from hashicorp/b-dropped-logs-on-task-restart
client/logmon: restart log collection correctly when a task is restarted
2019-03-21 14:02:08 -04:00
Mahmood Ali 729458f110 fix TestLogmon_Start_restart 2019-03-21 13:36:46 -04:00
Nick Ethier b252d712df
logmon: fix test assertion 2019-03-20 21:37:17 -04:00
Nick Ethier c1f5011181
logmon: remove sleeps from tests 2019-03-20 10:45:09 -04:00
Nick Ethier e14041bdec
logmon: add tests for rotation and open/closing of fifos 2019-03-19 14:41:23 -04:00
Nick Ethier dc18b8928a
logmon: make Start rpc idempotent and simplify hook 2019-03-19 14:02:36 -04:00
Nick Ethier 7dc3d83634
client/logmon: restart log collection correctly when a task is restarted 2019-03-15 23:59:18 -04:00
Mahmood Ali fb55717b0c
Regenerate Proto files (#5421)
Noticed that the protobuf files are out of sync with ones generated by 1.2.0 protoc go plugin.

The cause for these files seem to be related to release processes, e.g. [0.9.0-beta1 preperation](ecec3d38de (diff-da4da188ee496377d456025c2eab4e87)), and [0.9.0-beta3 preperation](b849d84f2f).

This restores the changes to that of the pinned protoc version and fails build if protobuf files are out of sync.  Sample failing Travis job is that of the first commit change: https://travis-ci.org/hashicorp/nomad/jobs/506285085
2019-03-14 10:56:27 -04:00
Michael Schurter a2e3ea6dc9 logmon: fix reattach configuration
There were multiple bugs here:

1. Reattach unmarshalling always returned an error because you can't
   unmarshal into a nil pointer.
2. The hook data wasn't being saved because it was put on the request
   struct, not the response struct.
3. The plugin configuration should only have reattach *or* a command
   set. Not both.
4. Setting Done=true meant the hook was never re-run on agent restart so
   reattaching was never attempted.
2019-02-21 15:32:18 -08:00
Alex Dadgar bc804dda2e Nomad 0.9.0-beta1 generated code 2019-01-30 10:49:44 -08:00
Mahmood Ali 5df63fda7c
Merge pull request #5190 from hashicorp/f-memory-usage
Track Basic Memory Usage as reported by cgroups
2019-01-18 16:46:02 -05:00
Danielle Tomlinson 99da4c780d logmon: Reattach to existing loggers
This commit prevents us from creating duplicate logmon hooks when
restoring allocations by persisting the logmon reattach config using
HookData.
2019-01-16 14:56:10 +01:00
Mahmood Ali 9909d98bee Track Basic Memory Usage as reported by cgroups
Track current memory usage, `memory.usage_in_bytes`, in addition to
`memory.max_memory_usage_in_bytes` and friends.  This number is closer
what Docker reports.

Related to https://github.com/hashicorp/nomad/issues/5165 .
2019-01-14 18:47:52 -05:00
Alex Dadgar 8a35d7b1dd Test recovery 2019-01-07 14:49:41 -08:00
Marcin Matlaszek 39eec70f31
Recover from any possible io error when invoking Write on FileRotator
As of now, FileRotator uses bufio.Write under the hood to write data to
configured output file. Due to the way how bufio handles any occurred io
error - saves it into `err` variable never resetting it automatically -
any operation like `Write`, `Flush` etc will become a no-op, returning the very same,
saved error (eg. Out of disk space) even when the problem is fixed (eg. disk
space is available again).

That automatically means that FileRotator will stop writing any logs,
reporting the same error over and over again, even if it's no longer
valid.

This PR fixes it by resetting the bufio Writer, which resets any errors
and tries to write requested data.
2018-12-07 18:22:29 +01:00
Danielle Tomlinson 6b72e96eba client: Move driver/logging to logmon/logging
The logging package is used by logmon and the legacy mock_driver. Because the
legacy drivers are going away, I'm moving it here to signify its actual
ownership.
2018-11-30 10:46:13 +01:00
Preetha Appan f89dbcd9cc
modify fingerprint interface to use typed attribute struct 2018-11-28 10:01:03 -06:00
Mahmood Ali 1f4db08f42 Regenerate proto files with protoc-gen-go@v1.2.0 2018-11-14 09:01:26 -05:00
Alex Dadgar 204ca8230c Device manager
Introduce a device manager that manages the lifecycle of device plugins
on the client. It fingerprints, collects stats, and forwards Reserve
requests to the correct plugin. The manager, also handles device plugins
failing and validates their output.
2018-11-07 10:43:15 -08:00
Nick Ethier 352c05cdf4 plugin/drivers: plumb in stdout/stderr paths 2018-10-16 16:53:31 -07:00
Nick Ethier 5dee1141d1 executor v2 (#4656)
* client/executor: refactor client to remove interpolation

* executor: POC libcontainer based executor

* vendor: use hashicorp libcontainer fork

* vendor: add libcontainer/nsenter dep

* executor: updated executor interface to simplify operations

* executor: implement logging pipe

* logmon: new logmon plugin to manage task logs

* driver/executor: use logmon for log management

* executor: fix tests and windows build

* executor: fix logging key names

* executor: fix test failures

* executor: add config field to toggle between using libcontainer and standard executors

* logmon: use discover utility to discover nomad executable

* executor: only call libcontainer-shim on main in linux

* logmon: use seperate path configs for stdout/stderr fifos

* executor: windows fixes

* executor: created reusable pid stats collection utility that can be used in an executor

* executor: update fifo.Open calls

* executor: fix build

* remove executor from docker driver

* executor: Shutdown func to kill and cleanup executor and its children

* executor: move linux specific universal executor funcs to seperate file

* move logmon initialization to a task runner hook

* client: doc fixes and renaming from code review


* taskrunner: use shared config struct for logmon fifo fields

* taskrunner: logmon only needs to be started once per task
2018-10-16 16:53:31 -07:00