Commit graph

1454 commits

Author SHA1 Message Date
Drew Bailey 24929776a2
shutdown delay for task groups
copy struct values

ensure groupserviceHook implements RunnerPreKillhook

run deregister first

test that shutdown times are delayed

move magic number into variable
2019-12-16 11:38:16 -05:00
Seth Hoenig f0c3dca49c tests: swap lib/freeport for tweaked helper/freeport
Copy the updated version of freeport (sdk/freeport), and tweak it for use
in Nomad tests. This means staying below port 10000 to avoid conflicts with
the lib/freeport that is still transitively used by the old version of
consul that we vendor. Also provide implementations to find ephemeral ports
of macOS and Windows environments.

Ports acquired through freeport are supposed to be returned to freeport,
which this change now also introduces. Many tests are modified to include
calls to a cleanup function for Server objects.

This should help quite a bit with some flakey tests, but not all of them.
Our port problems will not go away completely until we upgrade our vendor
version of consul. With Go modules, we'll probably do a 'replace' to swap
out other copies of freeport with the one now in 'nomad/helper/freeport'.
2019-12-09 08:37:32 -06:00
Michael Schurter 3008473f9b
Merge branch 'master' into release-0102 2019-12-04 14:13:34 -08:00
Mahmood Ali 7b8cfee162 tests: deflake TestHTTP_FreshClientAllocMetrics
The test asserts that alloc counts get reported accurately in metrics by
inspecting the metrics endpoint directly.  Sadly, the metrics as
collected by `armon/go-metrics` seem to be stateful and may contain info
from other tests.

This means that the test can fail depending on the order of returned
metrics.

Inspecting the metrics output of one failing run, you can see the
duplicate guage entries but for different node_ids:

```
    {
      "Name": "service-name.default-0a3ba4b6-2109-485e-be74-6864228aed3d.client.allocations.terminal",
      "Value": 10,
      "Labels": {
        "datacenter": "dc1",
        "node_class": "none",
        "node_id": "67402bf4-00f3-bd8d-9fa8-f4d1924a892a"
      }
    },
    {
      "Name": "service-name.default-0a3ba4b6-2109-485e-be74-6864228aed3d.client.allocations.terminal",
      "Value": 0,
      "Labels": {
        "datacenter": "dc1",
        "node_class": "none",
        "node_id": "a2945b48-7e66-68e2-c922-49b20dd4e20c"
      }
    },
```
2019-11-22 18:41:21 -05:00
Nomad Release bot db6420367d Generate files for 0.10.2-rc1 release 2019-11-22 18:42:49 +00:00
Mahmood Ali be6c60455e
Merge pull request #6669 from hashicorp/b-cors-allow-credentials
Allow UI to query client directly for task logs/state
2019-11-20 15:14:01 -05:00
Preetha 42c1c85285
Merge pull request #6421 from hashicorp/b-acl-bootstrap-codes
api: acl bootstrap errors aren't 500
2019-11-20 10:36:08 -06:00
Preetha be4a51d5b8
Merge pull request #6349 from hashicorp/b-host-stats
client: Return empty values when host stats fail
2019-11-20 10:13:02 -06:00
Mahmood Ali 6f8bb5e90b api: acl bootstrap errors aren't 500
Noticed that ACL endpoints return 500 status code for user errors.  This
is confusing and can lead to false monitoring alerts.

Here, I introduce a concept of RPCCoded errors to be returned by RPC
that signal a code in addition to error message.  Codes for now match
HTTP codes to ease reasoning.

```
$ nomad acl bootstrap
Error bootstrapping: Unexpected response code: 500 (ACL bootstrap already done (reset index: 9))

$ nomad acl bootstrap
Error bootstrapping: Unexpected response code: 400 (ACL bootstrap already done (reset index: 9))
```
2019-11-19 15:51:57 -05:00
Nick Ethier bd454a4c6f
client: improve group service stanza interpolation and check_re… (#6586)
* client: improve group service stanza interpolation and check_restart support

Interpolation can now be done on group service stanzas. Note that some task runtime specific information
that was previously available when the service was registered poststart of a task is no longer available.

The check_restart stanza for checks defined on group services will now properly restart the allocation upon
check failures if configured.
2019-11-18 13:04:01 -05:00
Drew Bailey 9b63828658
serverID to target remote leader or server
handle the case where we request a server-id which is this current server

update docs, error on node and server id params

more accurate names for tests

use shared no leader err, formatting

rm bad comment

remove redundant variable
2019-11-14 10:07:35 -05:00
Drew Bailey b644e1f47d
add server-id to monitor specific server 2019-11-14 09:53:41 -05:00
Drew Bailey acd97d0731
Merge pull request #6670 from hashicorp/api/fallthrough-test
test rootfallthrough handler
2019-11-13 10:51:31 -05:00
Lars Lehtonen 1dbf44bc40 command/agent: Prune Dead Code (#6682)
* remove unused MockPeriodicJob() from tests
* remove unused getIndex() from tests
* remove unused checkIndex() from tests
* remove unused assertIndex() from tests
* remove unused Agent.findLoopbackDevice()
2019-11-13 08:20:01 -05:00
Drew Bailey f5310ff63f
fix so assertions are test case driven 2019-11-12 14:28:21 -05:00
Charlie Voiselle 835831a3d8 Added service wrapper code (#6220)
This is the basic code to add the Windows Service Manager hooks to Nomad.

Includes vendoring golang.org/x/sys/windows/svc and added Docs:
* guide for installing as a windows service.
* configuration for logging to file from PR #6429
2019-11-11 15:16:07 -05:00
Drew Bailey f989f38594
test /ui/ path 2019-11-11 12:12:42 -05:00
Drew Bailey a0548824f3
test rootfallthrough handler 2019-11-11 12:08:44 -05:00
Mahmood Ali b2145f2d02 Allow UI to query client directly
Nomad web UI currently fails when querying client nodes for allocation
state end endpoints, due to CORS policy.

The issue is that CORS requests that are marked `withCredentials` need
the http server to include a `Access-Control-Allow-Credentials` [1].

But Nomad Task Logs and filesystem requests include authenticating
information and thus marked with `credentials=true`[2][3].

It's worth noting that the browser currently sends credentials and
authentication token to servers anyway; it's just that the response is
not made available to caller nomad ui javascript.  For task logs
specifically, nomad ui retries again by querying the web ui address
(typically pointing to a nomad server) which will forward the request
to the nomad client agent appropriately.

[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Access-Control-Allow-Credentials
[2] 101d0373ee/ui/app/components/task-log.js (L50)
[3] 101d0373ee/ui/app/services/token.js (L25-L39)
2019-11-11 15:13:30 +00:00
Lars Lehtonen 08d5342812 command/agent: TestAgent_ServerConfig() fix dropped errors (#6659) 2019-11-11 09:46:46 -05:00
Drew Bailey 04439a5a78
better func name, swap conditional 2019-11-11 08:35:56 -05:00
Drew Bailey c85df2dac7
returns a 404 if not found instead of redirect to ui 2019-11-08 15:34:35 -05:00
Drew Bailey 7b2ad28ef6
unlock before returning, no need for label
comment, trigger build

return length written
2019-11-05 11:44:29 -05:00
Drew Bailey d3b48a3e45
simplify logch goroutine 2019-11-05 11:44:28 -05:00
Drew Bailey df57f70a68
wireup plain=true|false query param 2019-11-05 11:44:28 -05:00
Drew Bailey f4a7e3dc75
coordinate closing of doneCh, use interface to simplify callers
comments
2019-11-05 11:44:26 -05:00
Drew Bailey fe542680dc
log-json -> json
fix typo command/agent/monitor/monitor.go

Co-Authored-By: Chris Baker <1675087+cgbaker@users.noreply.github.com>

Update command/agent/monitor/monitor.go

Co-Authored-By: Chris Baker <1675087+cgbaker@users.noreply.github.com>

address feedback, lock to prevent send on closed channel

fix lock/unlock for dropped messages
2019-11-05 09:51:59 -05:00
Drew Bailey 84c8e79f90
simplify assert message 2019-11-05 09:51:56 -05:00
Drew Bailey 8726b685de
address feedback 2019-11-05 09:51:56 -05:00
Drew Bailey e4b3e1d7d4
allow more time for streaming message
remove unused struct
2019-11-05 09:51:55 -05:00
Drew Bailey 318b6c91bf
monitor command takes no args
rm extra new line

fix lint errors

return after close

fix, simplify test
2019-11-05 09:51:55 -05:00
Drew Bailey 0e759c401c
moving endpoints over to frames 2019-11-05 09:51:54 -05:00
Drew Bailey fb23c1325d
fix deadlock issue, switch to frames envelope 2019-11-05 09:51:54 -05:00
Drew Bailey 32f62edbb0
return 400 if invalid log_json param is given
Addresses feedback around monitor implementation

subselect on stopCh to prevent blocking forever.

Set up a separate goroutine to check every 3 seconds for dropped
messages.

rename returned ch to avoid confusion
2019-11-05 09:51:53 -05:00
Drew Bailey 17d876d5ef
rename function, initialize log level better
underscores instead of dashes for query params
2019-11-05 09:51:53 -05:00
Drew Bailey da6229d704
update go-hclog dep
remove duplicate lock
2019-11-05 09:51:52 -05:00
Drew Bailey db65b1f4a5
agent:read acl policy for monitor 2019-11-05 09:51:52 -05:00
Drew Bailey f46fd5b3e1
only look up rpchandler for node if we have nodeid
fix some comments and nomad monitor -h output
2019-11-05 09:51:51 -05:00
Drew Bailey 3b9c33a5f0
new hclog with standardlogger intercept 2019-11-05 09:51:49 -05:00
Drew Bailey a45ae1cd58
enable json formatting, use queryoptions 2019-11-05 09:51:49 -05:00
Drew Bailey 786989dbe3
New monitor pkg for shared monitor functionality
Adds new package that can be used by client and server RPC endpoints to
facilitate monitoring based off of a logger

clean up old code

small comment about write

rm old comment about minsize

rename to Monitor

Removes connection logic from monitor command

Keep connection logic in endpoints, use a channel to send results from
monitoring

use new multisink logger and interfaces

small test for dropped messages

update go-hclogger and update sink/intercept logger interfaces
2019-11-05 09:51:49 -05:00
Drew Bailey e076204820
get local rpc endpoint working 2019-11-05 09:51:48 -05:00
Drew Bailey 976c43157c
remove log_writer
prefix output with proper spacing

update gzip handler, adjust first byte flow to allow gzip handler bypass

wip, first stab at wiring up rpc endpoint
2019-11-05 09:51:48 -05:00
Drew Bailey 0de94466b2
Display error when remote side ended monitor
multisink logger

remove usage of logwriter
2019-11-05 09:51:48 -05:00
Drew Bailey f60e44afc7
Adds nomad monitor command
Adds nomad monitor command. Like consul monitor, this command allows you
to stream logs from a nomad agent in real time with a a specified log
level

add endpoint tests

Upgrade go-hclog to latest version

The current version of go-hclog pads log prefixes to equal lengths
so info becomes [INFO ] and debug becomes [DEBUG]. This breaks
hashicorp/logutils/level.go Check function. Upgrading to the latest
version removes this padding and fixes log filtering that uses logutils
Check
2019-11-05 09:51:47 -05:00
Drew Bailey b386119d15
Add Agent Monitor to receive streaming logs
Queries /v1/agent/monitor and receives streaming logs from client
2019-11-05 09:51:47 -05:00
Drew Bailey b0184e2032
Adds AgentMonitor Endpoint
AgentMonitor is an endpoint to stream logs for a given agent. It allows
callers to pass in a supplied log level, which may be different than the
agents config allowing for temporary debugging with lower log levels.

Pass in logWriter when setting up Agent
2019-11-05 09:51:46 -05:00
Michael Schurter 9fed8d1bed client: fix panic from 0.8 -> 0.10 upgrade
makeAllocTaskServices did not do a nil check on AllocatedResources
which causes a panic when upgrading directly from 0.8 to 0.10. While
skipping 0.9 is not supported we intend to fix serious crashers caused
by such upgrades to prevent cluster outages.

I did a quick audit of the client package and everywhere else that
accesses AllocatedResources appears to be properly guarded by a nil
check.
2019-11-01 07:47:03 -07:00
Mahmood Ali 3f6e50617a
Merge pull request #6047 from hashicorp/b-ignore-server-if-disabled
Only warn against BootstrapExpect set in CLI flag
2019-10-29 10:55:44 -04:00
Michael Schurter 39437a5c5b
Merge branch 'master' into release-0100 2019-10-22 08:17:57 -07:00