Fix#5418
When using a docker logger that doesn't support log streaming through
API, currently docker logger runs a tight loop of Docker API calls
unexpectedly. This change ensures we stop fetching logs early.
Also, this adds some basic backoff strategy when Docker API logging
fails unexpectedly, to avoid accidentally DoSing the docker daemon.
Noticed that the protobuf files are out of sync with ones generated by 1.2.0 protoc go plugin.
The cause for these files seem to be related to release processes, e.g. [0.9.0-beta1 preperation](ecec3d38de (diff-da4da188ee496377d456025c2eab4e87)), and [0.9.0-beta3 preperation](b849d84f2f).
This restores the changes to that of the pinned protoc version and fails build if protobuf files are out of sync. Sample failing Travis job is that of the first commit change: https://travis-ci.org/hashicorp/nomad/jobs/506285085
Sometimes the nomad docker_logger may be killed by a service manager
when restarting the client for upgrades or reliability reasons.
Currently if this happens, we leak the users container and try to
reschedule over it.
This commit adds a new step to the recovery process that will spawn a
new docker logger process that will fetch logs from _the current
timestamp_. This is to avoid restarting users tasks because our logging
sidecar has failed.
This commit adds some extra resiliency to the docker logger in the case
of API failure from the docker daemon, by restarting the stream from the
current point in time if the stream returns and the container is still
running.
Uses the home directory and windows path expansion, as c:\tmp doesn't
necessarily exist, and mktemp would involve unnecessarily complicating
the commands.
Track current memory usage, `memory.usage_in_bytes`, in addition to
`memory.max_memory_usage_in_bytes` and friends. This number is closer
what Docker reports.
Related to https://github.com/hashicorp/nomad/issues/5165 .
This PR fixes various instances of plugins being launched without using
the parent loggers. This meant that logs would not all go to the same
output, break formatting etc.
Introduce a device manager that manages the lifecycle of device plugins
on the client. It fingerprints, collects stats, and forwards Reserve
requests to the correct plugin. The manager, also handles device plugins
failing and validates their output.