Commit Graph

2596 Commits

Author SHA1 Message Date
Drew Bailey 48c451709e
update license command output to reflect api changes 2020-05-05 10:28:58 -04:00
Mahmood Ali 78ae7b885a
Merge pull request #7810 from hashicorp/spread-configuration
spread scheduling algorithm
2020-05-01 13:15:19 -04:00
Mahmood Ali b9e3cde865 tests and some clean up 2020-05-01 13:13:30 -04:00
Charlie Voiselle 663fb677cf Add SchedulerAlgorithm to SchedulerConfig 2020-05-01 13:13:29 -04:00
Drew Bailey 581ad558a8
temporarily test for 404 until endpoint is ready 2020-05-01 11:24:37 -04:00
Drew Bailey 41c7d49eb7
properly format license output 2020-04-30 14:46:26 -04:00
Drew Bailey 42075ef30e
allow test to check if server is enterprise 2020-04-30 14:46:21 -04:00
Drew Bailey acacecc67b
add license reset command to commands
help text formatting

remove reset

no signed option
2020-04-30 14:46:20 -04:00
Drew Bailey a266284f60
test all commands oss err 2020-04-30 14:46:19 -04:00
Drew Bailey 59b76f90e8
hcl fmt from editor
license cli formatting, license endpoints ent only

test oss error

type assertions
2020-04-30 14:46:18 -04:00
Drew Bailey 74abe6ef48
license cli commands
cli changes, formatting
2020-04-30 14:46:17 -04:00
Lang Martin e32b5b12dd
command: deployment status without a prefix lists deployments (#7821) 2020-04-28 15:11:32 -04:00
Mahmood Ali b8fb32f5d2 http: adjust log level for request failure
Failed requests due to API client errors are to be marked as DEBUG.

The Error log level should be reserved to signal problems with the
cluster and are actionable for nomad system operators.  Logs due to
misbehaving API clients don't represent a system level problem and seem
spurius to nomad maintainers at best.  These log messages can also be
attack vectors for deniel of service attacks by filling servers disk
space with spurious log messages.
2020-04-22 16:19:59 -04:00
Mahmood Ali 5b42796f1e
Merge pull request #7704 from hashicorp/b-agent-shutdown-order
agent: shutdown agent http server last
2020-04-20 10:37:26 -04:00
Mahmood Ali 4e1366f285 agent: route http logs through hclog
Pipe http server log to hclog, so that it uses the same logging format
as rest of nomad logs.  Also, supports emitting them as json logs, when
json formatting is set.

The http server logs are emitted as Trace level, as they are typically
repsent HTTP client errors (e.g. failed tls handshakes, invalid headers,
etc).

Though, Panic logs represent server errors and are relayed as Error
level.
2020-04-20 10:33:40 -04:00
Jeffrey 'jf' Lim eab600d3e1
Fix/improve "job plan" messaging (#7580) 2020-04-17 15:53:16 -04:00
Mahmood Ali b78680eee7 agent: shutdown agent http server last
Shutdown http server last, after nomad client/server components
terminate.

Before this change, if the agent is taking an unexpectedly long time to
shutdown, the operator cannot query the http server directly: they
cannot access agent specific http endpoints and need to query another
agent about the troublesome agent.

Unexpectedly long shutdown can happen in normal cases, e.g. a client
might hung is if one of the allocs it is running has a long
shutdown_delay.

Here, we switch to ensuring that the http server is shutdown last.

I believe this doesn't require extra care in agent shutting down logic
while operators may be able to submit write http requests.  We already
need to cope with operators submiting these http requests to another
agent or by servers updating the client allocations.
2020-04-13 10:50:07 -04:00
Mahmood Ali 14d6fec05a tests: deflake some SetServer related tests
Some tests assert on numbers on numbers of servers, e.g.
TestHTTP_AgentSetServers and TestHTTP_AgentListServers_ACL . Though, in dev and
test modes, the agent starts with servers having duplicate entries for
advertised and normalized RPC values, then settles with one unique value after
Raft/Serf re-sets servers with one single unique value.

This leads to flakiness, as the test will fail if assertion runs before Serf
update takes effect.

Here, we update the inital dev handling so it only adds a unique value if the
advertised and normalized values are the same.

Sample log lines illustrating the problem:

```
=== CONT  TestHTTP_AgentSetServers
    TestHTTP_AgentSetServers: testlog.go:34: 2020-04-06T21:47:51.016Z [INFO]  nomad.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:127.0.0.1:9008 Address:127.0.0.1:9008}]"
    TestHTTP_AgentSetServers: testlog.go:34: 2020-04-06T21:47:51.016Z [INFO]  nomad: serf: EventMemberJoin: TestHTTP_AgentSetServers.global 127.0.0.1
    TestHTTP_AgentSetServers: testlog.go:34: 2020-04-06T21:47:51.035Z [DEBUG] client.server_mgr: new server list: new_servers=[127.0.0.1:9008, 127.0.0.1:9008] old_servers=[]
...
    TestHTTP_AgentSetServers: agent_endpoint_test.go:759:
                Error Trace:    agent_endpoint_test.go:759
                                                        http_test.go:1089
                                                        agent_endpoint_test.go:705
                Error:          "[127.0.0.1:9008 127.0.0.1:9008]" should have 1 item(s), but has 2
                Test:           TestHTTP_AgentSetServers
```
2020-04-07 09:27:48 -04:00
Mahmood Ali ed4c4d13a4 fixup! backend: support WS authentication handshake in alloc/exec 2020-04-03 14:20:31 -04:00
Mahmood Ali e63e096136 backend: support WS authentication handshake in alloc/exec
The javascript Websocket API doesn't support setting custom headers
(e.g. `X-Nomad-Token`).  This change adds support for having an
authentication handshake message: clients can set `ws_handshake` URL
query parameter to true and send a single handshake message with auth
token first before any other mssage.

This is a backward compatible change: it does not affect nomad CLI path, as it
doesn't set `ws_handshake` parameter.
2020-04-03 11:18:54 -04:00
Mahmood Ali 990cfb6fef agent config parsing tests for scheduler config 2020-04-03 07:54:32 -04:00
Chris Baker 277d29c6e7
Merge pull request #7572 from hashicorp/f-7422-scaling-events
finalizing scaling API work
2020-04-01 13:49:22 -05:00
Seth Hoenig 9aa9721143 connect: fix bug where absent connect.proxy stanza needs default config
In some refactoring, a bug was introduced where if the connect.proxy
stanza in a submitted job was nil, the default proxy configuration
would not be initialized with default values, effectively breaking
Connect.

      connect {
        sidecar_service {} # should work
      }

In contrast, by setting an empty proxy stanza, the config values would
be inserted correctly.

      connect {
        sidecar_service {
	  proxy {} # workaround
	}
      }

This commit restores the original behavior, where having a proxy
stanza present is not required.

The unit test for this case has also been corrected.
2020-04-01 11:19:32 -06:00
Chris Baker 40d6b3bbd1 adding raft and state_store support to track job scaling events
updated ScalingEvent API to record "message string,error bool" instead
of confusing "reason,error *string"
2020-04-01 16:15:14 +00:00
Seth Hoenig 14c7cebdea connect: enable automatic expose paths for individual group service checks
Part of #6120

Building on the support for enabling connect proxy paths in #7323, this change
adds the ability to configure the 'service.check.expose' flag on group-level
service check definitions for services that are connect-enabled. This is a slight
deviation from the "magic" that Consul provides. With Consul, the 'expose' flag
exists on the connect.proxy stanza, which will then auto-generate expose paths
for every HTTP and gRPC service check associated with that connect-enabled
service.

A first attempt at providing similar magic for Nomad's Consul Connect integration
followed that pattern exactly, as seen in #7396. However, on reviewing the PR
we realized having the `expose` flag on the proxy stanza inseperably ties together
the automatic path generation with every HTTP/gRPC defined on the service. This
makes sense in Consul's context, because a service definition is reasonably
associated with a single "task". With Nomad's group level service definitions
however, there is a reasonable expectation that a service definition is more
abstractly representative of multiple services within the task group. In this
case, one would want to define checks of that service which concretely make HTTP
or gRPC requests to different underlying tasks. Such a model is not possible
with the course `proxy.expose` flag.

Instead, we now have the flag made available within the check definitions themselves.
By making the expose feature resolute to each check, it is possible to have
some HTTP/gRPC checks which make use of the envoy exposed paths, as well as
some HTTP/gRPC checks which make use of some orthongonal port-mapping to do
checks on some other task (or even some other bound port of the same task)
within the task group.

Given this example,

group "server-group" {
  network {
    mode = "bridge"
    port "forchecks" {
      to = -1
    }
  }

  service {
    name = "myserver"
    port = 2000

    connect {
      sidecar_service {
      }
    }

    check {
      name     = "mycheck-myserver"
      type     = "http"
      port     = "forchecks"
      interval = "3s"
      timeout  = "2s"
      method   = "GET"
      path     = "/classic/responder/health"
      expose   = true
    }
  }
}

Nomad will automatically inject (via job endpoint mutator) the
extrapolated expose path configuration, i.e.

expose {
  path {
    path            = "/classic/responder/health"
    protocol        = "http"
    local_path_port = 2000
    listener_port   = "forchecks"
  }
}

Documentation is coming in #7440 (needs updating, doing next)

Modifications to the `countdash` examples in https://github.com/hashicorp/demo-consul-101/pull/6
which will make the examples in the documentation actually runnable.

Will add some e2e tests based on the above when it becomes available.
2020-03-31 17:15:50 -06:00
Seth Hoenig 41244c5857 jobspec: parse multi expose.path instead of explicit slice 2020-03-31 17:15:27 -06:00
Seth Hoenig 0266f056b8 connect: enable proxy.passthrough configuration
Enable configuration of HTTP and gRPC endpoints which should be exposed by
the Connect sidecar proxy. This changeset is the first "non-magical" pass
that lays the groundwork for enabling Consul service checks for tasks
running in a network namespace because they are Connect-enabled. The changes
here provide for full configuration of the

  connect {
    sidecar_service {
      proxy {
        expose {
          paths = [{
		path = <exposed endpoint>
                protocol = <http or grpc>
                local_path_port = <local endpoint port>
                listener_port = <inbound mesh port>
	  }, ... ]
       }
    }
  }

stanza. Everything from `expose` and below is new, and partially implements
the precedent set by Consul:
  https://www.consul.io/docs/connect/registration/service-registration.html#expose-paths-configuration-reference

Combined with a task-group level network port-mapping in the form:

  port "exposeExample" { to = -1 }

it is now possible to "punch a hole" through the network namespace
to a specific HTTP or gRPC path, with the anticipated use case of creating
Consul checks on Connect enabled services.

A future PR may introduce more automagic behavior, where we can do things like

1) auto-fill the 'expose.path.local_path_port' with the default value of the
   'service.port' value for task-group level connect-enabled services.

2) automatically generate a port-mapping

3) enable an 'expose.checks' flag which automatically creates exposed endpoints
   for every compatible consul service check (http/grpc checks on connect
   enabled services).
2020-03-31 17:15:27 -06:00
Seth Hoenig 1ce4eb17fa client: use consistent name for struct receiver parameter
This helps reduce the number of squiggly lines in Goland.
2020-03-31 17:15:27 -06:00
Lang Martin 8d4f39fba1
csi: add node events to report progress mounting and unmounting volumes (#7547)
* nomad/structs/structs: new NodeEventSubsystemCSI

* client/client: pass triggerNodeEvent in the CSIConfig

* client/pluginmanager/csimanager/instance: add eventer to instanceManager

* client/pluginmanager/csimanager/manager: pass triggerNodeEvent

* client/pluginmanager/csimanager/volume: node event on [un]mount

* nomad/structs/structs: use storage, not CSI

* client/pluginmanager/csimanager/volume: use storage, not CSI

* client/pluginmanager/csimanager/volume_test: eventer

* client/pluginmanager/csimanager/volume: event on error

* client/pluginmanager/csimanager/volume_test: check event on error

* command/node_status: remove an extra space in event detail format

* client/pluginmanager/csimanager/volume: use snake_case for details

* client/pluginmanager/csimanager/volume_test: snake_case details
2020-03-31 17:13:52 -04:00
Yoan Blanc 225c9c1215 fixup! vendor: explicit use of hashicorp/go-msgpack
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-03-31 09:48:07 -04:00
Yoan Blanc 761d014071 vendor: explicit use of hashicorp/go-msgpack
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-03-31 09:45:21 -04:00
Seth Hoenig b3664c628c
Merge pull request #7524 from hashicorp/docs-consul-acl-minimums
consul: annotate Consul interfaces with ACLs
2020-03-30 13:27:27 -06:00
Mahmood Ali 7df337e4c4
Merge pull request #7534 from hashicorp/b-windows-dev-network
windows: support -dev mode
2020-03-30 14:35:28 -04:00
Seth Hoenig 0a812ab689 consul: annotate Consul interfaces with ACLs 2020-03-30 10:17:28 -06:00
Drew Bailey a98dc8c768
update audit examples to an endpoint that is audited 2020-03-30 10:03:11 -04:00
Mahmood Ali dedf1cd3d7 tests: remove TestHTTP_NodeDrain_Compat
Nomad 0.11 servers no longer support having pre-0.8 clients.
2020-03-30 07:06:52 -04:00
Mahmood Ali 8b2b3f99d3 tests: deflake TestHTTP_NodeDrain
A node may be recognized as not running any allocs and have its drain
flag reset before the test queries it.
2020-03-30 07:06:52 -04:00
Mahmood Ali b0cc23ae63 tests: deflake TestConsul_PeriodicSync 2020-03-30 07:06:47 -04:00
Mahmood Ali ec6afa5795 windows: support -dev mode
Support running `nomad agent -dev` in Windows, by setting proper network
interface.

Prior to this change, `nomad` uses `lo` interface but Windows uses
"Loopback Pseudo-Interface 1" to refer to loopback device interface:
https://github.com/golang/go/blob/go1.14.1/src/net/net_windows_test.go#L304-L318
.
2020-03-28 12:01:51 -04:00
Drew Bailey a66b4be0f3
remove auditing for /ui/ 2020-03-27 10:12:42 -04:00
Drew Bailey de687edb2e
wrap http.Handlers
better comments
2020-03-27 09:35:10 -04:00
Lang Martin 50ff9ccd44
csi: plugin deregistration on plugin job GC (#7502)
* nomad/structs/csi: delete just one plugin type from a node

* nomad/structs/csi: add DeleteAlloc

* nomad/state/state_store: add deleteJobFromPlugin

* nomad/state/state_store: use DeleteAlloc not DeleteNodeType

* move CreateTestCSIPlugin to state to avoid an import cycle

* nomad/state/state_store_test: delete a plugin by deleting its jobs

* nomad/*_test: move CreateTestCSIPlugin to state

* nomad/state/state_store: update one plugin per transaction

* command/plugin_status_test: move CreateTestCSIPlugin

* nomad: csi: handle nils CSIPlugin methods, clarity
2020-03-26 17:07:18 -04:00
Drew Bailey b96a4da6fc
sync changes made to oss files from ent 2020-03-25 10:57:44 -04:00
Drew Bailey 218bfff6dd
add in change missed from ent 2020-03-25 10:53:38 -04:00
Drew Bailey 97cc19276d
add auditor 2020-03-25 10:48:23 -04:00
Drew Bailey 7329a88758
allow all build contexts to use noOpAuditor 2020-03-25 10:38:40 -04:00
Mahmood Ali 1c1186b344
Merge pull request #7487 from hashicorp/b-xss-oss
agent: prevent XSS by controlling Content-Type
2020-03-25 09:56:11 -04:00
Michael Schurter 29622013fa remove double negative from comment
Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>
2020-03-25 09:45:43 -04:00
Michael Schurter 1a27b8a07d test: assert monitor endpoint sets proper headers 2020-03-25 09:45:43 -04:00
Michael Schurter d6d44a8214 test: assert fs endpoints are xss safe 2020-03-25 09:45:43 -04:00