This commit introduces a rotating file logger for Nomad Agent Logs. The
logger implementation itself is a lift and shift from Consul, with tests
updated to fit with the Nomad pattern of using require, and not having a
testutil for creating tempdirs cleanly.
Show full ID on individual alloc or node status views. Shortening
the ID isn't very helpful in these cases, and makes looking up the full
id slightly more complicated when user needs to interact with API.
List views are unmodified and show short id unless `-vebose` flag is passed.
Before
```
$ nomad node status -self | head -n2
ID = 21fc51f9
Name = mars-2.local
$ nomad alloc status 15ae54cd | head -n3
ID = 15ae54cd-08dd-3681-03cf-4c23ace7e7c3
Eval ID = a6b15f86
Name = example.cache[0]
```
After:
```
$ nomad node status -self | head -n2
ID = 21fc51f9-fd39-0fa0-fb41-f34c7aa36101
Name = mars-2.local
$ nomad alloc status 15ae54cd | head -n3
ID = 15ae54cd-08dd-3681-03cf-4c23ace7e7c3
Eval ID = a6b15f86-ca8e-e536-b544-4bfb43137ff3
Name = example.cache[0]
```
This fixes two bugs:
First, FS Logs API endpoint only propagated error back to user if it was
encoded with code, which isn't common. Other errors get suppressed and
callers get an empty response with 200 error code. Now, these endpoints
return a 500 status code along with the error message.
Before
```
$ curl -v "http://127.0.0.1:4646/v1/client/fs/logs/qwerqwera?follow=false&offset=0&origin=start®ion=global&task=redis&type=stdout"; echo
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 4646 (#0)
> GET /v1/client/fs/logs/qwerqwera?follow=false&offset=0&origin=start®ion=global&task=redis&type=stdout HTTP/1.1
> Host: 127.0.0.1:4646
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Vary: Accept-Encoding
< Vary: Origin
< Date: Fri, 04 Oct 2019 19:47:21 GMT
< Content-Length: 0
<
* Connection #0 to host 127.0.0.1 left intact
```
After
```
$ curl -v "http://127.0.0.1:4646/v1/client/fs/logs/qwerqwera?follow=false&offset=0&origin=start®ion=global&task=redis&type=stdout"; echo
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 4646 (#0)
> GET /v1/client/fs/logs/qwerqwera?follow=false&offset=0&origin=start®ion=global&task=redis&type=stdout HTTP/1.1
> Host: 127.0.0.1:4646
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 500 Internal Server Error
< Vary: Accept-Encoding
< Vary: Origin
< Date: Fri, 04 Oct 2019 19:48:12 GMT
< Content-Length: 60
< Content-Type: text/plain; charset=utf-8
<
* Connection #0 to host 127.0.0.1 left intact
alloc lookup failed: index error: UUID must be 36 characters
```
Second, we return 400 status code for request validation errors.
Before
```
$ curl -v "http://127.0.0.1:4646/v1/client/fs/logs/qwerqwera"; echo
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 4646 (#0)
> GET /v1/client/fs/logs/qwerqwera HTTP/1.1
> Host: 127.0.0.1:4646
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 500 Internal Server Error
< Vary: Accept-Encoding
< Vary: Origin
< Date: Fri, 04 Oct 2019 19:47:29 GMT
< Content-Length: 22
< Content-Type: text/plain; charset=utf-8
<
* Connection #0 to host 127.0.0.1 left intact
must provide task name
```
After
```
$ curl -v "http://127.0.0.1:4646/v1/client/fs/logs/qwerqwera"; echo
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 4646 (#0)
> GET /v1/client/fs/logs/qwerqwera HTTP/1.1
> Host: 127.0.0.1:4646
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 400 Bad Request
< Vary: Accept-Encoding
< Vary: Origin
< Date: Fri, 04 Oct 2019 19:49:18 GMT
< Content-Length: 22
< Content-Type: text/plain; charset=utf-8
<
* Connection #0 to host 127.0.0.1 left intact
must provide task name
```
This fixes a bug in the CLI handling of node lookup failures when
querying allocation and FS endpoints.
Allocation and FS endpoint are handled by the client; one can query the
relevant client directly, or query a server to have it forwarded
transparently to relevant client. Querying the client directly is
benefecial to avoid loading servers with IO.
As an optimization, the CLI attempts to query the client directly, but
then falls back to using server forwarding path if it encounters network
or connection errors (e.g. clients are locked down or in a separate
inaccessible network).
Here, we fix a bug where if the CLI fails to find to lookup the client
details because it lacks ACL capability or other unexpected reasons, the
CLI will not go through fallback path.
This sets a default-but-query-configurable Faker seed in development,
via faker-seed. It also changes uses of Math.random to use Faker’s
randomness so auto-generated data remains stable in development.
fixes https://github.com/hashicorp/nomad/issues/6382
The prestart hook for templates blocks while it resolves vault secrets.
If the secret is not found it continues to retry. If a task is shutdown
during this time, the prestart hook currently does not receive
shutdownCtxCancel, causing it to hang.
This PR joins the two contexts so either killCtx or shutdownCtx cancel
and stop the task.
I noticed while working on #6166 that some of the factory properties
that used Faker’s randomisation features are using their output
rather than a function that would call the randomiser. This means that
the randomisation happens once and the value is used for every model
generated by the factory. This wraps the randomiser calls in functions
so different models can have different values.
Golang 1.13 is pickier with importpaths and aliasing and fails
compilation currently.
Here, for go-msgpack dependency, we use upstream ugorji/go with a single
change
23165f7bc3
.
For consistency and to ease noticing descripency, I made ugorji/go and
hashicorp/go-msgpack reference the same sha.
This is a dependency management update and has no functional change to
product.
Without passing the network isolation configuration to the executor,
java tasks are not placed in the same network namespace as the other
processes in their task group, which breaks Consul Connect.
In a job registration request, ensure that the request namespace "header" and job
namespace field match. This should be the case already in prod, as http
handlers ensures that the values match [1].
This mitigates bugs that exploit bugs where we may check a value but act
on another, resulting into bypassing ACL system.
[1] https://github.com/hashicorp/nomad/blob/v0.9.5/command/agent/job_endpoint.go#L415-L418