Commit graph

3474 commits

Author SHA1 Message Date
Tim Gross 01c704ab9d csi: add PublishContext to CSIVolumeClaimResponse (#7113)
The `ControllerPublishVolumeResponse` CSI RPC includes the publish
context intended to be passed by the orchestrator as an opaque value
to the node plugins. This changeset adds it to our response to a
volume claim request to proxy the controller's response back to the
client node.
2020-03-23 13:58:29 -04:00
Tim Gross fb1aad66ee csi: implement releasing volume claims for terminal allocs (#7076)
When an alloc is marked terminal, and after node unstage/unpublish
have been called, the client will sync the terminal alloc state with
the server via `Node.UpdateAlloc` RPC.

This changeset implements releasing the volume claim for each volume
associated with the terminal alloc. It doesn't yet implement the RPC
call we need to make to the `ControllerUnpublishVolume` CSI RPC.
2020-03-23 13:58:29 -04:00
Tim Gross d4cd272de3 csi: implement VolumeClaimRPC (#7048)
When the client receives an allocation which includes a CSI volume,
the alloc runner will block its main `Run` loop. The alloc runner will
issue a `VolumeClaim` RPC to the Nomad servers. This changeset
implements the portions of the `VolumeClaim` RPC endpoint that have
not been previously completed.
2020-03-23 13:58:29 -04:00
Lang Martin 421d7ed2e4 nomad: csi_endpoint send register & deregister requests to raft (#7059) 2020-03-23 13:58:29 -04:00
Lang Martin 7b675f89ac csi: fix index maintenance for CSIVolume and CSIPlugin tables (#7049)
* state_store: csi volumes/plugins store the index in the txn

* nomad: csi_endpoint_test require index checks need uint64()

* nomad: other tests using int 0 not uint64(0)

* structs: pass index into New, but not other struct methods

* state_store: csi plugin indexes, use new struct interface

* nomad: csi_endpoint_test check index/query meta (on explicit 0)

* structs: NewCSIVolume takes an index arg now

* scheduler/test: NewCSIVolume takes an index arg now
2020-03-23 13:58:29 -04:00
Lang Martin a0a6766740 CSI: Scheduler knows about CSI constraints and availability (#6995)
* structs: piggyback csi volumes on host volumes for job specs

* state_store: CSIVolumeByID always includes plugins, matches usecase

* scheduler/feasible: csi volume checker

* scheduler/stack: add csi volumes

* contributing: update rpc checklist

* scheduler: add volumes to State interface

* scheduler/feasible: introduce new checker collection tgAvailable

* scheduler/stack: taskGroupCSIVolumes checker is transient

* state_store CSIVolumeDenormalizePlugins comment clarity

* structs: remote TODO comment in TaskGroup Validate

* scheduler/feasible: CSIVolumeChecker hasPlugins improve comment

* scheduler/feasible_test: set t.Parallel

* Update nomad/state/state_store.go

Co-Authored-By: Danielle <dani@hashicorp.com>

* Update scheduler/feasible.go

Co-Authored-By: Danielle <dani@hashicorp.com>

* structs: lift ControllerRequired to each volume

* state_store: store plug.ControllerRequired, use it for volume health

* feasible: csi match fast path remove stale host volume copied logic

* scheduler/feasible: improve comments

Co-authored-by: Danielle <dani@builds.terrible.systems>
2020-03-23 13:58:29 -04:00
Tim Gross 8673ea5cba csi: add empty CSI volume publication GC to scheduled core jobs (#7014)
This changeset adds a new core job `CoreJobCSIVolumePublicationGC` to
the leader's loop for scheduling core job evals. Right now this is an
empty method body without even a config file stanza. Later changesets
will implement the logic of volume publication GC.
2020-03-23 13:58:29 -04:00
Lang Martin 88316208a0 csi: server-side plugin state tracking and api (#6966)
* structs: CSIPlugin indexes jobs acting as plugins and node updates

* schema: csi_plugins table for CSIPlugin

* nomad: csi_endpoint use vol.Denormalize, plugin requests

* nomad: csi_volume_endpoint: rename to csi_endpoint

* agent: add CSI plugin endpoints

* state_store_test: use generated ids to avoid t.Parallel conflicts

* contributing: add note about registering new RPC structs

* command: agent http register plugin lists

* api: CSI plugin queries, ControllerHealthy -> ControllersHealthy

* state_store: copy on write for volumes and plugins

* structs: copy on write for volumes and plugins

* state_store: CSIVolumeByID returns an unhealthy volume, denormalize

* nomad: csi_endpoint use CSIVolumeDenormalizePlugins

* structs: remove struct errors for missing objects

* nomad: csi_endpoint return nil for missing objects, not errors

* api: return meta from Register to avoid EOF error

* state_store: CSIVolumeDenormalize keep allocs in their own maps

* state_store: CSIVolumeDeregister error on missing volume

* state_store: CSIVolumeRegister set indexes

* nomad: csi_endpoint use CSIVolumeDenormalizePlugins tests
2020-03-23 13:58:29 -04:00
Lang Martin 61cfc806ad csi_volume_endpoint_test: gen uuids to avoid t.Parallel conflicts 2020-03-23 13:58:29 -04:00
Lang Martin 334979a754 nomad/rpc: indicate missing region in error message 2020-03-23 13:58:29 -04:00
Lang Martin 5b31b140c3 csi: do not use namespace specific identifiers 2020-03-23 13:58:29 -04:00
Lang Martin e922531aaf structs: move the content of csi_volumes into csi 2020-03-23 13:58:29 -04:00
Lang Martin 04b6e7c7fb server: rpc register CSIVolume 2020-03-23 13:58:29 -04:00
Lang Martin 8f33fb9a6d csi volume endpoint: new RPC endpoint for CSI volumes 2020-03-23 13:58:29 -04:00
Lang Martin 4bb4dd98eb state_store: CSIVolume insert, get, delete, claim
state_store: change claim counts

state_store: get volumes by all, by driver

state_store: process volume claims

state_store: csi volume register error on update
2020-03-23 13:58:29 -04:00
Lang Martin 0422b967db schema: csi_volumes schema 2020-03-23 13:58:29 -04:00
Lang Martin 857cd37ab5 fsm: dispatch CSIVolume register, deregister, claim 2020-03-23 13:58:29 -04:00
Lang Martin f9d9faf673 structs: eliminate MaxReaders & MaxWriters 2020-03-23 13:58:29 -04:00
Lang Martin 3a7e1b6d14 client structs: move CSIVolumeAttachmentMode and CSIVolumeAccessMode 2020-03-23 13:58:29 -04:00
Lang Martin 637ce9dfad structs: new CSIVolume, request types 2020-03-23 13:58:29 -04:00
Danielle Lancashire 57ae1d2cd6 csimanager: Fingerprint Node Service capabilities 2020-03-23 13:58:29 -04:00
Danielle Lancashire 564f5cec93 csimanager: Fingerprint controller capabilities 2020-03-23 13:58:29 -04:00
Danielle Lancashire 426c26d7c0 CSI Plugin Registration (#6555)
This changeset implements the initial registration and fingerprinting
of CSI Plugins as part of #5378. At a high level, it introduces the
following:

* A `csi_plugin` stanza as part of a Nomad task configuration, to
  allow a task to expose that it is a plugin.

* A new task runner hook: `csi_plugin_supervisor`. This hook does two
  things. When the `csi_plugin` stanza is detected, it will
  automatically configure the plugin task to receive bidirectional
  mounts to the CSI intermediary directory. At runtime, it will then
  perform an initial heartbeat of the plugin and handle submitting it to
  the new `dynamicplugins.Registry` for further use by the client, and
  then run a lightweight heartbeat loop that will emit task events
  when health changes.

* The `dynamicplugins.Registry` for handling plugins that run
  as Nomad tasks, in contrast to the existing catalog that requires
  `go-plugin` type plugins and to know the plugin configuration in
  advance.

* The `csimanager` which fingerprints CSI plugins, in a similar way to
  `drivermanager` and `devicemanager`. It currently only fingerprints
  the NodeID from the plugin, and assumes that all plugins are
  monolithic.

Missing features

* We do not use the live updates of the `dynamicplugin` registry in
  the `csimanager` yet.

* We do not deregister the plugins from the client when they shutdown
  yet, they just become indefinitely marked as unhealthy. This is
  deliberate until we figure out how we should manage deploying new
  versions of plugins/transitioning them.
2020-03-23 13:58:28 -04:00
Drew Bailey b09abef332
Audit config, seams for enterprise audit features
allow oss to parse sink duration

clean up audit sink parsing

ent eventer config reload

fix typo

SetEnabled to eventer interface

client acl test

rm dead code

fix failing test
2020-03-23 13:47:42 -04:00
Jasmine Dahilig 73a64e4397 change jobspec lifecycle stanza to use sidecar attribute instead of
block_until status
2020-03-21 17:52:57 -04:00
Jasmine Dahilig 1485b342e2 remove deadline code for now 2020-03-21 17:52:56 -04:00
Jasmine Dahilig d54a83afee fix linting errors 2020-03-21 17:52:53 -04:00
Jasmine Dahilig a0fe570317 clean up restore test 2020-03-21 17:52:52 -04:00
Jasmine Dahilig 7ed08eb75a partial test for restore functionality 2020-03-21 17:52:52 -04:00
Jasmine Dahilig 81d051d7e8 fix bug in lifecycle scheduler test mocks 2020-03-21 17:52:51 -04:00
Jasmine Dahilig b7f08c9d13 add appropriate lifecycle deadline default of 120s 2020-03-21 17:52:48 -04:00
Jasmine Dahilig 0cc9212a54 add test cases for scheduler alloc placement with lifecycle resources 2020-03-21 17:52:47 -04:00
Jasmine Dahilig 0d2988652c add lifecycle job mock 2020-03-21 17:52:47 -04:00
Jasmine Dahilig c27223207c update task hook coordinator tests 2020-03-21 17:52:46 -04:00
Mahmood Ali b880607bad update scheduler to account for hooks 2020-03-21 17:52:45 -04:00
Jasmine Dahilig 12393f90e7 add test for lifecycle coordinator 2020-03-21 17:52:42 -04:00
Jasmine Dahilig f6e58d6dad add canonicalize in the right place 2020-03-21 17:52:41 -04:00
Jasmine Dahilig 4498c8c24f add canonicalization 2020-03-21 17:52:39 -04:00
Jasmine Dahilig 67262d841b add validation tests and more validation 2020-03-21 17:52:39 -04:00
Mahmood Ali 214d128bd9 it's running now 2020-03-21 17:52:37 -04:00
Jasmine Dahilig fc13fa9739 change TaskLifecycle RunLevel to Hook and add Deadline time duration 2020-03-21 17:52:37 -04:00
Mahmood Ali 4ebeac721a update structs with lifecycle 2020-03-21 17:52:36 -04:00
Yoan Blanc 67692789b7
vendor: vault api and sdk
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-03-21 17:57:48 +01:00
Mahmood Ali 53e20e5cc2 Deflake TestRPC_Limits_Streaming test
The test starts enough connections to hit the limit, then closes the
connection and immediately starts one expecting the new one to succeed.

We must wait until the server side recognizes the connection
closing and free up a limits slot.  The current test attempts to achieve
that by waiting to get an error on conn.Read, however, this error is
returned from local client without waiting for server update.

As such, I change the logic so it retries on connection rejection but
force the first non-EOF failure to be a deadline error.
2020-03-20 17:21:43 -04:00
Mahmood Ali 0da7130a1a Protect against args being modified 2020-03-18 08:11:16 -04:00
Mahmood Ali 52fd31af80 server: node connections must not be forwarded
This fixes a bug where a forwarded node update request may be assumed
to be the actual direct client connection if the server just lost
leadership.

When a nomad non-leader server receives a Node.UpdateStatus request, it
forwards the RPC request to the leader, and holds on the request
Yamux connection in a cache to allow for server<->client forwarding.

When the leader handles the request, it must differentiate between a
forwarded connection vs the actual connection.  This is done in
https://github.com/hashicorp/nomad/blob/v0.10.4/nomad/node_endpoint.go#L412

Now, consider if the non-leader server forwards to the connection to a
recently deposed nomad leader, which in turn forwards the RPC request to
the new leader.

Without this change, the deposed leader will mistake the forwarded
connection for the actual client connection and cache it mapped to the
client ID.  If the server attempts to connect to that client, it will
attempt to start a connection/session to the other server instead and
the call will hang forever.

This change ensures that we only add node connection mapping if the
request is not a forwarded request, regardless of circumstances.
2020-03-17 16:39:01 -04:00
Mahmood Ali 9d88f1d568 tests: deflake deploymentwatcher package
This deflake the tests in the deploymentwatcher package.  The package
uses a mock deployment watcher backend, where the watcher in a
background goroutine calls UpdateDeploymentStatus .  If the mock isn't
configured to expect the call, the background goroutine will fail.  One
UpdateDeploymentStatus call is made at the end of the background
goroutine, which may occur after the test completes, thus explaining the
flakiness.
2020-03-12 15:42:01 -04:00
Michael Schurter 2dcc85bed1 jobspec: fixup vault_grace deprecation
Followup to #7170

- Moved canonicalization of VaultGrace back into `api/` package.
- Fixed tests.
- Made docs styling consistent.
2020-03-10 14:58:49 -07:00
Michael Schurter b72b3e765c
Merge pull request #7170 from fredrikhgrelland/consul_template_upgrade
Update consul-template to v0.24.1 and remove deprecated vault grace
2020-03-10 14:15:47 -07:00
Mahmood Ali 005bd37758 tests: deflake TestServer_ReconcileMember
TestServer_ReconcileMember assumes that S3 isn't the leader:
`reconcileMembers` call would fail when attempting to remove itself!
2020-03-06 14:14:41 -05:00
Mahmood Ali 17ee94b52b fix typo 2020-03-03 16:55:54 -05:00
Mahmood Ali acbfeb5815 Simplify Bootstrap logic in tests
This change updates tests to honor `BootstrapExpect` exclusively when
forming test clusters and removes test only knobs, e.g.
`config.DevDisableBootstrap`.

Background:

Test cluster creation is fragile.  Test servers don't follow the
BootstapExpected route like production clusters.  Instead they start as
single node clusters and then get rejoin and may risk causing brain
split or other test flakiness.

The test framework expose few knobs to control those (e.g.
`config.DevDisableBootstrap` and `config.Bootstrap`) that control
whether a server should bootstrap the cluster.  These flags are
confusing and it's unclear when to use: their usage in multi-node
cluster isn't properly documented.  Furthermore, they have some bad
side-effects as they don't control Raft library: If
`config.DevDisableBootstrap` is true, the test server may not
immediately attempt to bootstrap a cluster, but after an election
timeout (~50ms), Raft may force a leadership election and win it (with
only one vote) and cause a split brain.

The knobs are also confusing as Bootstrap is an overloaded term.  In
BootstrapExpect, we refer to bootstrapping the cluster only after N
servers are connected.  But in tests and the knobs above, it refers to
whether the server is a single node cluster and shouldn't wait for any
other server.

Changes:

This commit makes two changes:

First, it relies on `BootstrapExpected` instead of `Bootstrap` and/or
`DevMode` flags.  This change is relatively trivial.

Introduce a `Bootstrapped` flag to track if the cluster is bootstrapped.
This allows us to keep `BootstrapExpected` immutable.  Previously, the
flag was a config value but it gets set to 0 after cluster bootstrap
completes.
2020-03-02 13:47:43 -05:00
Fredrik Hoem Grelland edb3bd0f3f Update consul-template to v0.24.1 and remove deprecated vault_grace (#7170) 2020-02-23 16:24:53 +01:00
Seth Hoenig 0f99cdd0d9
Merge pull request #7192 from hashicorp/b-connect-stanza-ignore
consul/connect: in-place update sidecar service registrations on changes
2020-02-21 09:24:53 -06:00
Seth Hoenig 07b9b24ceb
nomad: note why AddressMode is not part of CSD hash
Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
2020-02-21 09:24:42 -06:00
Seth Hoenig 54b5173eca consul/connect: in-place update sidecar service registrations on changes
Fix a bug where consul service definitions would not be updated if changes
were made to the service in the Nomad job. Currently this only fixes the
bug for cases where the fix is a matter of updating consul agent's service
registration. There is related bug where destructive changes are required
(see #6877) which will be fixed in another PR.

The enable_tag_override configuration setting for the parent service is
applied to the sidecar service.

Fixes #6459
2020-02-19 13:07:04 -06:00
Mahmood Ali 98ad59b1de update rest of consul packages 2020-02-16 16:25:04 -06:00
Mahmood Ali f492ab6d9e implement MinQuorum 2020-02-16 16:04:59 -06:00
Mahmood Ali 3dcc65d58d Update consul autopilot dependency 2020-02-16 15:41:43 -06:00
Mahmood Ali cf53ee57cd remove unused dropButLastChannel 2020-02-13 18:56:53 -05:00
Mahmood Ali fd51982018 tests: Avoid StartAsLeader raft config flag
It's being deprecated
2020-02-13 18:56:53 -05:00
Mahmood Ali 367133a399 Use latest raft patterns 2020-02-13 18:56:52 -05:00
Seth Hoenig 543354aabe
Merge pull request #7106 from hashicorp/f-ctag-override
client: enable configuring enable_tag_override for services
2020-02-13 12:34:48 -06:00
Michael Schurter 8c332a3757
Merge pull request #7102 from hashicorp/test-limits
Fix some race conditions and flaky tests
2020-02-13 10:19:11 -08:00
Mahmood Ali bc70beeb4a
Merge pull request #7044 from hashicorp/f-use-multiplexv2
rpc: Use MultiplexV2 for connections
2020-02-13 12:07:20 -05:00
Drew Bailey 24a5d36fcf
Merge pull request #7112 from hashicorp/f-include-pro-tag
include pro tag in serveral oss.go files
2020-02-13 11:26:41 -05:00
Seth Hoenig 2829b4cd23
Merge pull request #7129 from hashicorp/b-consistent-ct-name
command: use consistent CONSUL_HTTP_TOKEN name
2020-02-12 12:27:46 -06:00
Seth Hoenig 7f33b92e0b command: use consistent CONSUL_HTTP_TOKEN name
Consul CLI uses CONSUL_HTTP_TOKEN, so Nomad should use the same.
Note that consul-template uses CONSUL_TOKEN, which Nomad also uses,
so be careful to preserve any reference to that in the consul-template
context.
2020-02-12 10:42:33 -06:00
Seth Hoenig ce50345b7a nomad: assert consul token is unset on job register in tests 2020-02-12 10:17:42 -06:00
Seth Hoenig 02151dee45 nomad: unset consul token on job register 2020-02-12 09:58:51 -06:00
Drew Bailey 6bd6c6638c
include pro tag in serveral oss.go files 2020-02-10 15:56:14 -05:00
Seth Hoenig 0e44094d1a client: enable configuring enable_tag_override for services
Consul provides a feature of Service Definitions where the tags
associated with a service can be modified through the Catalog API,
overriding the value(s) configured in the agent's service configuration.

To enable this feature, the flag enable_tag_override must be configured
in the service definition.

Previously, Nomad did not allow configuring this flag, and thus the default
value of false was used. Now, it is configurable.

Because Nomad itself acts as a state machine around the the service definitions
of the tasks it manages, it's worth describing what happens when this feature
is enabled and why.

Consider the basic case where there is no Nomad, and your service is provided
to consul as a boring JSON file. The ultimate source of truth for the definition
of that service is the file, and is stored in the agent. Later, Consul performs
"anti-entropy" which synchronizes the Catalog (stored only the leaders). Then
with enable_tag_override=true, the tags field is available for "external"
modification through the Catalog API (rather than directly configuring the
service definition file, or using the Agent API). The important observation
is that if the service definition ever changes (i.e. the file is changed &
config reloaded OR the Agent API is used to modify the service), those
"external" tag values are thrown away, and the new service definition is
once again the source of truth.

In the Nomad case, Nomad itself is the source of truth over the Agent in
the same way the JSON file was the source of truth in the example above.
That means any time Nomad sets a new service definition, any externally
configured tags are going to be replaced. When does this happen? Only on
major lifecycle events, for example when a task is modified because of an
updated job spec from the 'nomad job run <existing>' command. Otherwise,
Nomad's periodic re-sync's with Consul will now no longer try to restore
the externally modified tag values (as long as enable_tag_override=true).

Fixes #2057
2020-02-10 08:00:55 -06:00
Michael Schurter c5073f61a7 test: add timeout to ease debugging 2020-02-07 15:50:53 -08:00
Michael Schurter 9905dec6a3 test: workaround limits race 2020-02-07 15:50:53 -08:00
Michael Schurter 14c5ef3a8d test: fix race around reused default rpc addr
The default RPC addr was a global which is fine for normal runtime use
when it only has a single user.

However many tests modify it and cause races. Follow our convention of
returning defaults from funcs instead of using globals.
2020-02-07 15:50:53 -08:00
Mahmood Ali e106d373b2 rpc: Use MultiplexV2 for connections
MultiplexV2 is a new connection multiplex header that supports multiplex both
RPC and streaming requests over the same Yamux connection.

MultiplexV2 was added in 0.8.0 as part of
https://github.com/hashicorp/nomad/pull/3892 .  So Nomad 0.11 can expect it to
be supported.  Though, some more rigorous testing is required before merging
this.

I want to call out some implementation details:

First, the current connection pool reuses the Yamux stream for multiple RPC calls,
and doesn't close them until an error is encountered.  This commit doesn't
change it, and sets the `RpcNomad` byte only at stream creation.

Second, the StreamingRPC session gets closed by callers and cannot be reused.
Every StreamingRPC opens a new Yamux session.
2020-02-03 19:31:39 -05:00
Drew Bailey 9a65556211
add state store test to ensure PlacedCanaries is updated 2020-02-03 13:58:01 -05:00
Drew Bailey f51a3d1f37
nomad state store must be modified through raft, rm local state change 2020-02-03 13:57:34 -05:00
Drew Bailey 74779f23e6
keep placed canaries aligned with alloc status 2020-02-03 13:57:33 -05:00
Michael Schurter 9bedd0202e sentinel: copy jobs to prevent mutation
It's unclear whether Sentinel code can mutate values passed to the eval,
so ensure it cannot by copying the job.
2020-02-03 08:48:51 -05:00
Seth Hoenig 6bfa50acdc nomad: remove unused default schedular variable
This is from a merge conflict resolution that went the wrong direction.

I assumed the block had been added, but really it had been removed. Now,
it is removed once again.
2020-01-31 19:06:37 -06:00
Seth Hoenig d3cd6afd7e nomad: min cluster version for connect ACLs is now v0.10.4 2020-01-31 19:06:19 -06:00
Seth Hoenig 587a5d4a8d nomad: make TaskGroup.UsesConnect helper a public helper 2020-01-31 19:05:11 -06:00
Seth Hoenig ee89a754f1 nomad: fix leftover missed refactoring in consul policy checking 2020-01-31 19:05:06 -06:00
Seth Hoenig 4ee55fcd6c nomad,client: apply more comment/style PR tweaks 2020-01-31 19:04:52 -06:00
Seth Hoenig be7c671919 nomad,client: apply smaller PR suggestions
Apply smaller suggestions like doc strings, variable names, etc.

Co-Authored-By: Nick Ethier <nethier@hashicorp.com>
Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
2020-01-31 19:04:40 -06:00
Seth Hoenig 78a7d1e426 comments: cleanup some leftover debug comments and such 2020-01-31 19:04:35 -06:00
Seth Hoenig 8219c78667 nomad: handle SI token revocations concurrently
Be able to revoke SI token accessors concurrently, and also
ratelimit the requests being made to Consul for the various
ACL API uses.
2020-01-31 19:04:14 -06:00
Seth Hoenig 2c7ac9a80d nomad: fixup token policy validation 2020-01-31 19:04:08 -06:00
Seth Hoenig 9df33f622f nomad: proxy requests for Service Identity tokens between Clients and Consul
Nomad jobs may be configured with a TaskGroup which contains a Service
definition that is Consul Connect enabled. These service definitions end
up establishing a Consul Connect Proxy Task (e.g. envoy, by default). In
the case where Consul ACLs are enabled, a Service Identity token is required
for these tasks to run & connect, etc. This changeset enables the Nomad Server
to recieve RPC requests for the derivation of SI tokens on behalf of instances
of Consul Connect using Tasks. Those tokens are then relayed back to the
requesting Client, which then injects the tokens in the secrets directory of
the Task.
2020-01-31 19:03:53 -06:00
Seth Hoenig 93cf770edb client: enable nomad client to request and set SI tokens for tasks
When a job is configured with Consul Connect aware tasks (i.e. sidecar),
the Nomad Client should be able to request from Consul (through Nomad Server)
Service Identity tokens specific to those tasks.
2020-01-31 19:03:38 -06:00
Seth Hoenig 2b66ce93bb nomad: ensure a unique ClusterID exists when leader (gh-6702)
Enable any Server to lookup the unique ClusterID. If one has not been
generated, and this node is the leader, generate a UUID and attempt to
apply it through raft.

The value is not yet used anywhere in this changeset, but is a prerequisite
for gh-6701.
2020-01-31 19:03:26 -06:00
Seth Hoenig f030a22c7c command, docs: create and document consul token configuration for connect acls (gh-6716)
This change provides an initial pass at setting up the configuration necessary to
enable use of Connect with Consul ACLs. Operators will be able to pass in a Consul
Token through `-consul-token` or `$CONSUL_TOKEN` in the `job run` and `job revert`
commands (similar to Vault tokens).

These values are not actually used yet in this changeset.
2020-01-31 19:02:53 -06:00
Michael Schurter dd7712795d
Merge branch 'master' into b-tls-validation 2020-01-30 11:05:15 -08:00
Mahmood Ali a9f551542d Merge pull request #160 from hashicorp/b-mtls-hostname
server: validate role and region for RPC w/ mTLS
2020-01-30 12:59:17 -06:00
Michael Schurter c82b14b0c4 core: add limits to unauthorized connections
Introduce limits to prevent unauthorized users from exhausting all
ephemeral ports on agents:

 * `{https,rpc}_handshake_timeout`
 * `{http,rpc}_max_conns_per_client`

The handshake timeout closes connections that have not completed the TLS
handshake by the deadline (5s by default). For RPC connections this
timeout also separately applies to first byte being read so RPC
connections with TLS enabled have `rpc_handshake_time * 2` as their
deadline.

The connection limit per client prevents a single remote TCP peer from
exhausting all ephemeral ports. The default is 100, but can be lowered
to a minimum of 26. Since streaming RPC connections create a new TCP
connection (until MultiplexV2 is used), 20 connections are reserved for
Raft and non-streaming RPCs to prevent connection exhaustion due to
streaming RPCs.

All limits are configurable and may be disabled by setting them to `0`.

This also includes a fix that closes connections that attempt to create
TLS RPC connections recursively. While only users with valid mTLS
certificates could perform such an operation, it was added as a
safeguard to prevent programming errors before they could cause resource
exhaustion.
2020-01-30 10:38:25 -08:00
Drew Bailey da4af9bef3
fix tests, update changelog 2020-01-29 13:55:39 -05:00
Drew Bailey a61bf32314
Allow nomad monitor command to lookup server UUID
Allows addressing servers with nomad monitor using the servers name or
ID.

Also unifies logic for addressing servers for client_agent_endpoint
commands and makes addressing logic region aware.

rpc getServer test
2020-01-29 13:55:29 -05:00
Mahmood Ali 9611324654
Merge pull request #6922 from hashicorp/b-alloc-canoncalize
Handle Upgrades and Alloc.TaskResources modification
2020-01-28 15:12:41 -05:00
Mahmood Ali 90cae566e5
Merge pull request #6935 from hashicorp/b-default-preemption-flag
scheduler: allow configuring default preemption for system scheduler
2020-01-28 15:11:06 -05:00
Mahmood Ali af17b4afc7 Support customizing full scheduler config 2020-01-28 14:51:42 -05:00
Mahmood Ali f7a51a14c6
Merge pull request #6977 from hashicorp/b-leadership-flapping-2
Handle Nomad leadership flapping (attempt 2)
2020-01-28 11:40:41 -05:00
Mahmood Ali 687d2b7054 tests: defer closing shutdownCh 2020-01-28 09:53:48 -05:00
Mahmood Ali ded4233c27 tweak leadership flapping log messages 2020-01-28 09:49:36 -05:00
Mahmood Ali 79823ae07d handle channel close signal
Always deliver last value then send close signal.
2020-01-28 09:44:34 -05:00
Mahmood Ali d202924a93 include test and address review comments 2020-01-28 09:06:52 -05:00
Nick Ethier 5cbb94e16e consul: add support for canary meta 2020-01-27 09:53:30 -05:00
Mahmood Ali e436d2701a Handle Nomad leadership flapping
Fixes a deadlock in leadership handling if leadership flapped.

Raft propagates leadership transition to Nomad through a NotifyCh channel.
Raft blocks when writing to this channel, so channel must be buffered or
aggressively consumed[1]. Otherwise, Raft blocks indefinitely in `raft.runLeader`
until the channel is consumed[1] and does not move on to executing follower
related logic (in `raft.runFollower`).

While Raft `runLeader` defer function blocks, raft cannot process any other
raft operations.  For example, `run{Leader|Follower}` methods consume
`raft.applyCh`, and while runLeader defer is blocked, all raft log applications
or config lookup will block indefinitely.

Sadly, `leaderLoop` and `establishLeader` makes few Raft calls!
`establishLeader` attempts to auto-create autopilot/scheduler config [3]; and
`leaderLoop` attempts to check raft configuration [4].  All of these calls occur
without a timeout.

Thus, if leadership flapped quickly while `leaderLoop/establishLeadership` is
invoked and hit any of these Raft calls, Raft handler _deadlock_ forever.

Depending on how many times it flapped and where exactly we get stuck, I suspect
it's possible to get in the following case:

* Agent metrics/stats http and RPC calls hang as they check raft.Configurations
* raft.State remains in Leader state, and server attempts to handle RPC calls
  (e.g. node/alloc updates) and these hang as well

As we create goroutines per RPC call, the number of goroutines grow over time
and may trigger a out of memory errors in addition to missed updates.

[1] d90d6d6bda/config.go (L190-L193)
[2] d90d6d6bda/raft.go (L425-L436)
[3] 2a89e47746/nomad/leader.go (L198-L202)
[4] 2a89e47746/nomad/leader.go (L877)
2020-01-22 13:08:34 -05:00
Mahmood Ali 129c884105 extract leader step function 2020-01-22 10:55:48 -05:00
Mahmood Ali f36cc54efd actually always canonicalize alloc.Job
alloc.Job may be stale as well and need to migrate it.  It does cost
extra cycles but should be negligible.
2020-01-15 09:02:48 -05:00
Mahmood Ali b1b714691c address review comments 2020-01-15 08:57:05 -05:00
Mahmood Ali 1ab682f622 scheduler: allow configuring default preemption for system scheduler
Some operators want a greater control over when preemption is enabled,
especially during an upgrade to limit potential side-effects.
2020-01-13 08:30:49 -05:00
Drew Bailey ff4bfb8809
Merge pull request #6841 from hashicorp/f-agent-pprof-acl
Remote agent pprof endpoints
2020-01-10 14:52:39 -05:00
Mahmood Ali bfa33cf471 canonicalize allocs from plan results too 2020-01-10 10:41:12 -05:00
Nick Ethier 1f28633954
Merge pull request #6816 from hashicorp/b-multiple-envoy
connect: configure envoy to support multiple sidecars in the same alloc
2020-01-09 23:25:39 -05:00
Drew Bailey b702dede49
adds qc param, address pr feedback 2020-01-09 15:15:11 -05:00
Drew Bailey 45210ed901
Rename profile package to pprof
Address pr feedback, rename profile package to pprof to more accurately
describe its purpose. Adds gc param for heap lookup profiles.
2020-01-09 15:15:10 -05:00
Drew Bailey 1b8af920f3
address pr feedback 2020-01-09 15:15:09 -05:00
Drew Bailey 4ced73875b
leave acl checking to rpc endpoints
fix test expectation

test wrapNonJSON
2020-01-09 15:15:08 -05:00
Drew Bailey 279512c7f8
provide helpful error, cleanup logic 2020-01-09 15:15:08 -05:00
Drew Bailey 7bbba613a5
prevent doubly wrapping with rpc error 2020-01-09 15:15:07 -05:00
Drew Bailey fd42020ad6
RPC server EnableDebug option
Passes in agent enable_debug config to nomad server and client configs.
This allows for rpc endpoints to have more granular control if they
should be enabled or not in combination with ACLs.

enable debug on client test
2020-01-09 15:15:07 -05:00
Drew Bailey 31c0aca10a
rename forward func, add comment for why we forward 2020-01-09 15:15:06 -05:00
Drew Bailey 9a80938fb1
region forwarding; prevent recursive forwards for impossible requests
prevent region forwarding loop, backfill tests

fix failing test
2020-01-09 15:15:06 -05:00
Drew Bailey 46121fe3fd
move shared structs out of client and into nomad 2020-01-09 15:15:05 -05:00
Drew Bailey aec81a0b99
api agent endpoints
helper func to return serverPart based off of serverID
2020-01-09 15:15:05 -05:00
Drew Bailey 3672414888
test pprof headers and profile methods
tidy up, add comments

clean up seconds param assignment
2020-01-09 15:15:04 -05:00
Drew Bailey fc37448683
warn when enabled debug is on when registering
m -> a receiver name

return codederrors, fix query
2020-01-09 15:15:04 -05:00
Drew Bailey 50288461c9
Server request forwarding for Agent.Profile
Return rpc errors for profile requests, set up remote forwarding to
target leader or server id for profile requests.

server forwarding, endpoint tests
2020-01-09 15:15:03 -05:00
Mahmood Ali d740d347ce Migrate old alloc structs on read
This commit ensures that Alloc.AllocatedResources is properly populated
when read from persistence stores (namely Raft and client state store).
The alloc struct may have been written previously by an arbitrary old
version that may only populate Alloc.TaskResources.
2020-01-09 08:46:50 -05:00
James Rasell df2dc48790 Fix error parsing config when setting consul.timeout. (#6907)
When parsing a config file which had the consul.timeout param set,
Nomad was reporting an error causing startup to fail. This seems
to be caused by the HCL decoder interpreting the timeout type as
an int rather than a string. This is caused by the struct
TimeoutHCL param having a hcl key of timeout alongside a Timeout
struct param of type time.Duration (int). Ensuring the decoder
ignores the Timeout struct param ensure the decoder runs
correctly.
2020-01-07 13:40:55 -05:00
Nick Ethier 677e9cdc16 connect: configure envoy such that multiple sidecars can run in the same alloc 2020-01-06 11:26:27 -05:00
Michael Schurter 92cdc9de01 nomad/state: remove dead upgrade path code
It is uncalled so there hsould be no runtime changes.
2019-12-20 11:10:22 -08:00
Drew Bailey d9e41d2880
docs for shutdown delay
update docs, address pr comments

ensure pointer is not nil

use pointer for diff tests, set vs unset
2019-12-16 11:38:35 -05:00
Drew Bailey ae145c9a37
allow only positive shutdown delay
more explicit test case, remove select statement
2019-12-16 11:38:30 -05:00
Drew Bailey 24929776a2
shutdown delay for task groups
copy struct values

ensure groupserviceHook implements RunnerPreKillhook

run deregister first

test that shutdown times are delayed

move magic number into variable
2019-12-16 11:38:16 -05:00
Seth Hoenig 270233e23d tests: remove trace statements from nodeDrainWatcher.watch
Avoid logging in the `watch` function as much as possible, since
it is not waited on during a server shutdown. When the logger
logs after a test passes, it may or may not cause the testing
framework to panic.

More info in:
https://github.com/golang/go/issues/29388#issuecomment-453648436
2019-12-16 07:08:11 -06:00
Michael Schurter 95fd2643d7 connect: canonicalize before adding sidecar
Fixes #6853

Canonicalize jobs first before adding any sidecars. This fixes a bug
where sidecar tasks were added without interpolated names and broke
validation. Sidecar tasks must be canonicalized independently.

Also adds a group network to the mock connect job because it wasn't a
valid connect job before!
2019-12-12 20:55:56 -08:00
Seth Hoenig d45dec1ca8 tests: parallelize state store tests
It has been decided we're going to live in a many core world.
Let's take advantage of that and parallelize these state store
tests which all run in memory and are largely CPU bound.

An unscientific benchmark demonstrating the improvement:

[mp state (master)] $ go test
PASS
ok  	github.com/hashicorp/nomad/nomad/state	5.162s

[mp state (f-parallelize-state-store-tests)] $ go test
PASS
ok  	github.com/hashicorp/nomad/nomad/state	1.527s
2019-12-11 09:36:37 -06:00
Seth Hoenig f0c3dca49c tests: swap lib/freeport for tweaked helper/freeport
Copy the updated version of freeport (sdk/freeport), and tweak it for use
in Nomad tests. This means staying below port 10000 to avoid conflicts with
the lib/freeport that is still transitively used by the old version of
consul that we vendor. Also provide implementations to find ephemeral ports
of macOS and Windows environments.

Ports acquired through freeport are supposed to be returned to freeport,
which this change now also introduces. Many tests are modified to include
calls to a cleanup function for Server objects.

This should help quite a bit with some flakey tests, but not all of them.
Our port problems will not go away completely until we upgrade our vendor
version of consul. With Go modules, we'll probably do a 'replace' to swap
out other copies of freeport with the one now in 'nomad/helper/freeport'.
2019-12-09 08:37:32 -06:00
Danielle Lancashire d2075ebae9 spellcheck: Fix spelling of retrieve 2019-12-05 18:59:47 -06:00
Mahmood Ali b3e557cae3 address feedback review
apply `s/requestAuthToken/requestACLToken/g`
2019-11-26 08:39:04 -05:00
Mahmood Ali 02e20c720b acl_endpoint: permission denied for unauthenticated requests
If ACL Request is unauthenticated, we should honor the anonymous token.
This PR makes few changes:

* `GetPolicy` endpoints may return policy if anonymous policy allows it,
or return permission denied otherwise.
* `ListPolicies` returns an empty policy list, or one with anonymous
policy if one exists.

Without this PR, the we return an incomprehensible error.

Before:
```
$ curl http://localhost:4646/v1/acl/policy/doesntexist; echo
acl token lookup failed: index error: UUID must be 36 characters
$ curl http://localhost:4646/v1/acl/policies; echo
acl token lookup failed: index error: UUID must be 36 characters
```

After:
```
$ curl http://localhost:4646/v1/acl/policy/doesntexist; echo
Permission denied
$ curl http://localhost:4646/v1/acl/policies; echo
[]
```
2019-11-22 08:43:09 -05:00
Michael Schurter 4b6762511d
Merge pull request #6021 from hashicorp/f-anonymous-policy-access
api: Update policy endpoint to permit anonymous access
2019-11-20 15:33:45 -08:00
Buck Doyle 5fcc00d0f9 Add gofmt changes 2019-11-20 12:47:01 -06:00
Buck Doyle dc9c0d5ead
Add explanatory comment 2019-11-20 11:45:44 -06:00
Buck Doyle bea9837510
Remove extraneous else block 2019-11-20 11:37:45 -06:00
Buck Doyle d6a3e571bd
Remove extraneous whitespace 2019-11-20 11:37:01 -06:00
Buck Doyle db77a24ed3
Merge branch 'master' into f-policy-json 2019-11-20 11:20:07 -06:00
Mahmood Ali 7fb4c35831 comments and casing 2019-11-19 16:03:55 -05:00
Mahmood Ali 97d0fd009d 404 if token isn't found 2019-11-19 15:52:53 -05:00
Mahmood Ali 6f8bb5e90b api: acl bootstrap errors aren't 500
Noticed that ACL endpoints return 500 status code for user errors.  This
is confusing and can lead to false monitoring alerts.

Here, I introduce a concept of RPCCoded errors to be returned by RPC
that signal a code in addition to error message.  Codes for now match
HTTP codes to ease reasoning.

```
$ nomad acl bootstrap
Error bootstrapping: Unexpected response code: 500 (ACL bootstrap already done (reset index: 9))

$ nomad acl bootstrap
Error bootstrapping: Unexpected response code: 400 (ACL bootstrap already done (reset index: 9))
```
2019-11-19 15:51:57 -05:00
Michael Schurter 796758b8a5 core: add semver constraint
The existing version constraint uses logic optimized for package
managers, not schedulers, when checking prereleases:

- 1.3.0-beta1 will *not* satisfy ">= 0.6.1"
- 1.7.0-rc1 will *not* satisfy ">= 1.6.0-beta1"

This is due to package managers wishing to favor final releases over
prereleases.

In a scheduler versions more often represent the earliest release all
required features/APIs are available in a system. Whether the constraint
or the version being evaluated are prereleases has no impact on
ordering.

This commit adds a new constraint - `semver` - which will use Semver
v2.0 ordering when evaluating constraints. Given the above examples:

- 1.3.0-beta1 satisfies ">= 0.6.1" using `semver`
- 1.7.0-rc1 satisfies ">= 1.6.0-beta1" using `semver`

Since existing jobspecs may rely on the old behavior, a new constraint
was added and the implicit Consul Connect and Vault constraints were
updated to use it.
2019-11-19 08:40:19 -08:00
Nick Ethier bd454a4c6f
client: improve group service stanza interpolation and check_re… (#6586)
* client: improve group service stanza interpolation and check_restart support

Interpolation can now be done on group service stanzas. Note that some task runtime specific information
that was previously available when the service was registered poststart of a task is no longer available.

The check_restart stanza for checks defined on group services will now properly restart the allocation upon
check failures if configured.
2019-11-18 13:04:01 -05:00
Luiz Aoqui e499c5bddc
Merge pull request #6698 from hashicorp/f-add-drain-start-time
api: add `StartedAt` in `Node.DrainStrategy`
2019-11-15 15:38:38 -05:00
Luiz Aoqui e862b61daa
api: use the same initial time for all drain properties 2019-11-14 16:06:09 -05:00
Drew Bailey 9b63828658
serverID to target remote leader or server
handle the case where we request a server-id which is this current server

update docs, error on node and server id params

more accurate names for tests

use shared no leader err, formatting

rm bad comment

remove redundant variable
2019-11-14 10:07:35 -05:00
Drew Bailey b644e1f47d
add server-id to monitor specific server 2019-11-14 09:53:41 -05:00
Drew Bailey 2185c1a89e
Allows monitor to target leader server
Allows user to pass in node-id=leader to forward monitor request to
remote a remote leader.
2019-11-14 09:53:40 -05:00
Luiz Aoqui 5bd7cdd5c3
api: add StartedAt in Node.DrainStrategy 2019-11-13 17:54:40 -05:00
Lars Lehtonen 22a3c21dd0
nomad: fix dropped test error 2019-11-13 12:49:41 -08:00
Michael Schurter 08afb7d605 vault: allow overriding implicit vault constraint
There's a bug in version parsing that breaks this constraint when using
a prerelease enterprise version of Vault (eg 1.3.0-beta1+ent). While
this does not fix the underlying bug it does provide a workaround for
future issues related to the implicit constraint. Like the implicit
Connect constraint: *all* implicit constraints should be overridable to
allow users to workaround bugs or other factors should the need arise.
2019-11-12 12:26:36 -08:00
Mahmood Ali c4c37cb42e vault: check token_explicit_max_ttl as well
Vault 1.2.0 deprecated `explicit_max_ttl` in favor of
`token_explicit_max_ttl`.
2019-11-12 08:47:23 -05:00
Lars Lehtonen adbab29228 nomad: TestEvalBroker_Dequeue_Empty_Timeout() proper goroutine error handling (#6657) 2019-11-08 14:35:06 -05:00
Drew Bailey 7420446458
Merge pull request #6639 from hashicorp/return-after-forward
return after request has been forwarded
2019-11-08 09:48:35 -05:00
Lars Lehtonen 39b68e0b88 TestEvalBroker_Dequeue_Blocked() proper goroutine error handling (#6651)
TestEvalBroker_Dequeue_Blocked() improve test readability
2019-11-08 08:52:23 -05:00
Nick Ethier e947aaed4f
nomad: fix bug that didn't allow for multiple connect services in same tg 2019-11-08 04:33:39 -05:00
Lars Lehtonen 6deae70e35 TestEvalBroker_PauseResumeNackTimeout() proper goroutine error handling (#6649)
TestEvalBroker_PauseResumeNackTimeout() improve test readability
2019-11-07 16:04:59 -05:00
Lars Lehtonen 2638cbb31d nomad: TestEvalBroker_EnqueueAll_Dequeue_Fair() proper goroutine error handling (#6636)
nomad: TestEvalBroker_EnqueueAll_Dequeue_Fair() improve test readability
2019-11-07 10:39:29 -05:00
Drew Bailey a5e2e1805f
return after request has been forwarded 2019-11-07 08:33:53 -05:00
Lars Lehtonen e64f98837c nomad: fix dropped error in TestJobEndpoint_Deregister_ACL (#6602) 2019-11-06 16:40:45 -05:00
Drew Bailey f4a7e3dc75
coordinate closing of doneCh, use interface to simplify callers
comments
2019-11-05 11:44:26 -05:00
Drew Bailey fe542680dc
log-json -> json
fix typo command/agent/monitor/monitor.go

Co-Authored-By: Chris Baker <1675087+cgbaker@users.noreply.github.com>

Update command/agent/monitor/monitor.go

Co-Authored-By: Chris Baker <1675087+cgbaker@users.noreply.github.com>

address feedback, lock to prevent send on closed channel

fix lock/unlock for dropped messages
2019-11-05 09:51:59 -05:00
Drew Bailey 298b8358a9
move forwarded monitor request into helper 2019-11-05 09:51:56 -05:00
Drew Bailey 8726b685de
address feedback 2019-11-05 09:51:56 -05:00
Drew Bailey 0e759c401c
moving endpoints over to frames 2019-11-05 09:51:54 -05:00
Drew Bailey 17d876d5ef
rename function, initialize log level better
underscores instead of dashes for query params
2019-11-05 09:51:53 -05:00
Drew Bailey 8178beecf0
address feedback, use agent_endpoint instead of monitor 2019-11-05 09:51:53 -05:00
Drew Bailey db65b1f4a5
agent:read acl policy for monitor 2019-11-05 09:51:52 -05:00
Drew Bailey 2533617888
rpc acl tests for both monitor endpoints 2019-11-05 09:51:51 -05:00
Drew Bailey 3c33747e1f
client monitor endpoint tests 2019-11-05 09:51:50 -05:00
Drew Bailey 4bc68855d0
use intercepting loggers for rpchandlers 2019-11-05 09:51:50 -05:00
Drew Bailey 3b9c33a5f0
new hclog with standardlogger intercept 2019-11-05 09:51:49 -05:00
Drew Bailey a45ae1cd58
enable json formatting, use queryoptions 2019-11-05 09:51:49 -05:00
Drew Bailey 786989dbe3
New monitor pkg for shared monitor functionality
Adds new package that can be used by client and server RPC endpoints to
facilitate monitoring based off of a logger

clean up old code

small comment about write

rm old comment about minsize

rename to Monitor

Removes connection logic from monitor command

Keep connection logic in endpoints, use a channel to send results from
monitoring

use new multisink logger and interfaces

small test for dropped messages

update go-hclogger and update sink/intercept logger interfaces
2019-11-05 09:51:49 -05:00
Lars Lehtonen 0a4542fadc nomad: fix test goroutine (#6593) 2019-10-31 08:23:32 -04:00
Seth Hoenig 98592113a3
Merge pull request #6582 from hashicorp/b-vault-createToken-log-msg
nomad: fix vault.CreateToken log message printing wrong error
2019-10-29 17:35:05 -05:00
Mahmood Ali 7f2e4dc5d8
Merge pull request #6574 from hashicorp/b-gh-6570-vault-role-validation
vault: honor new `token_period` in vault token role
2019-10-29 10:18:59 -04:00
Seth Hoenig 838c6e3329 nomad: fix vault.CreateToken log message printing wrong error
Fixes typo in word "failed".

Fixes bug where incorrect error is printed. The old code would only
ever print a nil error, instead of the validationErr which is being
created.
2019-10-28 23:05:32 -05:00
Mahmood Ali c5d8d66787 Fix admissionValidators
`admissionValidators` doesn't aggregate errors correctly, as it
aggregates errors in `errs` reference yet it always returns the nil
`err`.

Here, we avoid shadowing `err`, and move variable declarations to where
they are used.
2019-10-28 10:52:53 -04:00
Mahmood Ali abb930249a consul connect: do basic validation before mutating job
`groupConnectHook` assumes that Networks is a non-empty slice, but TG
hasn't been validated yet and validation may depend on mutation results.
As such, we do basic check here before dereferencing network slice
elements.
2019-10-28 10:49:02 -04:00
Mahmood Ali bb45a7a776 add tests for consul connect validation 2019-10-28 10:41:51 -04:00
Mahmood Ali 4c64658397 vault: Support new role field token_role
Vault 1.2.0 deprecated `period` field in favor of `token_period` in auth
role:

>  * Token store roles use new, common token fields for the values
>    that overlap with other auth backends. `period`, `explicit_max_ttl`, and
>    `bound_cidrs` will continue to work, with priority being given to the
>    `token_` prefixed versions of those parameters. They will also be returned
>    when doing a read on the role if they were used to provide values initially;
>    however, in Vault 1.4 if `period` or `explicit_max_ttl` is zero they will no
>    longer be returned. (`explicit_max_ttl` was already not returned if empty.)
https://github.com/hashicorp/vault/blob/master/CHANGELOG.md#120-july-30th-2019
2019-10-28 09:33:26 -04:00
Seth Hoenig 8b03477f46
Merge pull request #6448 from hashicorp/f-set-connect-sidecar-tags
connect: enable setting tags on consul connect sidecar service in job…
2019-10-17 15:14:09 -05:00
Seth Hoenig 039fbd3f3b connect: enable setting tags on consul connect sidecar service in jobspec (#6415) 2019-10-17 19:25:20 +00:00
Mahmood Ali 4e4a9b252c
Merge pull request #6290 from hashicorp/r-generated-code-refactor
dev: avoid codecgen code in downstream projects
2019-10-15 08:22:31 -04:00
Danielle fee482ae6c
Merge pull request #6331 from hashicorp/dani/f-volume-mount-propagation
volumes: Add support for mount propagation
2019-10-14 14:29:40 +02:00
Danielle Lancashire 4fbcc668d0
volumes: Add support for mount propagation
This commit introduces support for configuring mount propagation when
mounting volumes with the `volume_mount` stanza on Linux targets.

Similar to Kubernetes, we expose 3 options for configuring mount
propagation:

- private, which is equivalent to `rprivate` on Linux, which does not allow the
           container to see any new nested mounts after the chroot was created.

- host-to-task, which is equivalent to `rslave` on Linux, which allows new mounts
                that have been created _outside of the container_ to be visible
                inside the container after the chroot is created.

- bidirectional, which is equivalent to `rshared` on Linux, which allows both
                 the container to see new mounts created on the host, but
                 importantly _allows the container to create mounts that are
                 visible in other containers an don the host_

private and host-to-task are safe, but bidirectional mounts can be
dangerous, as if the code inside a container creates a mount, and does
not clean it up before tearing down the container, it can cause bad
things to happen inside the kernel.

To add a layer of safety here, we require that the user has ReadWrite
permissions on the volume before allowing bidirectional mounts, as a
defense in depth / validation case, although creating mounts should also require
a priviliged execution environment inside the container.
2019-10-14 14:09:58 +02:00
Mahmood Ali 4b2ba62e35 acl: check ACL against object namespace
Fix a bug where a millicious user can access or manipulate an alloc in a
namespace they don't have access to.  The allocation endpoints perform
ACL checks against the request namespace, not the allocation namespace,
and performs the allocation lookup independently from namespaces.

Here, we check that the requested can access the alloc namespace
regardless of the declared request namespace.

Ideally, we'd enforce that the declared request namespace matches
the actual allocation namespace.  Unfortunately, we haven't documented
alloc endpoints as namespaced functions; we suspect starting to enforce
this will be very disruptive and inappropriate for a nomad point
release.  As such, we maintain current behavior that doesn't require
passing the proper namespace in request.  A future major release may
start enforcing checking declared namespace.
2019-10-08 12:59:22 -04:00
Mahmood Ali 674a457865 use RequestNamespace(), the canonical way to get namespace 2019-09-27 07:40:58 -04:00
Mahmood Ali e29ee4c400 nomad: defensive check for namespaces in job registration call
In a job registration request, ensure that the request namespace "header" and job
namespace field match.  This should be the case already in prod, as http
handlers ensures that the values match [1].

This mitigates bugs that exploit bugs where we may check a value but act
on another, resulting into bypassing ACL system.

[1] https://github.com/hashicorp/nomad/blob/v0.9.5/command/agent/job_endpoint.go#L415-L418
2019-09-26 17:02:47 -04:00
Lang Martin fb41dd86ba default raft protocol v2 2019-09-24 14:37:55 -04:00
Lang Martin 31d7f116dd nomad/server comments 2019-09-24 14:36:18 -04:00
Tim Gross cd9c23617f
client/connect: ConsulProxy LocalServicePort/Address (#6358)
Without a `LocalServicePort`, Connect services will try to use the
mapped port even when delivering traffic locally. A user can override
this behavior by pinning the port value in the `service` stanza but
this prevents us from using the Consul service name to reach the
service.

This commits configures the Consul proxy with its `LocalServicePort`
and `LocalServiceAddress` fields.
2019-09-23 14:30:48 -04:00
Danielle Lancashire 78b61de45f
config: Hoist volume.config.source into volume
Currently, using a Volume in a job uses the following configuration:

```
volume "alias-name" {
  type = "volume-type"
  read_only = true

  config {
    source = "host_volume_name"
  }
}
```

This commit migrates to the following:

```
volume "alias-name" {
  type = "volume-type"
  source = "host_volume_name"
  read_only = true
}
```

The original design was based due to being uncertain about the future of storage
plugins, and to allow maxium flexibility.

However, this causes a few issues, namely:
- We frequently need to parse this configuration during submission,
scheduling, and mounting
- It complicates the configuration from and end users perspective
- It complicates the ability to do validation

As we understand the problem space of CSI a little more, it has become
clear that we won't need the `source` to be in config, as it will be
used in the majority of cases:

- Host Volumes: Always need a source
- Preallocated CSI Volumes: Always needs a source from a volume or claim name
- Dynamic Persistent CSI Volumes*: Always needs a source to attach the volumes
                                   to for managing upgrades and to avoid dangling.
- Dynamic Ephemeral CSI Volumes*: Less thought out, but `source` will probably point
                                  to the plugin name, and a `config` block will
                                  allow you to pass meta to the plugin. Or will
                                  point to a pre-configured ephemeral config.
*If implemented

The new design simplifies this by merging the source into the volume
stanza to solve the above issues with usability, performance, and error
handling.
2019-09-13 04:37:59 +02:00
Mahmood Ali 4b8280e51d remove generated code 2019-09-06 19:24:15 +00:00
Nomad Release bot dc7d728a82 Generate files for 0.10.0-beta1 release 2019-09-06 18:47:09 +00:00
Mahmood Ali 01f42053e4 dev: avoid codecgen code in downstream projects
This is an attempt to ease dependency management for external driver
plugins, by avoiding requiring them to compile ugorji/go generated
files.  Plugin developers reported some pain with the brittleness of
ugorji/go dependency in particular, specially when using go mod, the
default go mod manager in golang 1.13.

Context
--------

Nomad uses msgpack to persist and serialize internal structs, using
ugorji/go library.  As an optimization, we use ugorji/go code generation
to speedup process and aovid the relection-based slow path.

We commit these generated files in repository when we cut and tag the
release to ease reproducability and debugging old releases.  Thus,
downstream projects that depend on release tag, indirectly depends on
ugorji/go generated code.

Sadly, the generated code is brittle and specific to the version of
ugorji/go being used.  When go mod picks another version of ugorji/go
then nomad (go mod by default uses release according to semver),
downstream projects face compilation errors.

Interestingly, downstream projects don't commonly serialize nomad
internal structs.  Drivers and device plugins use grpc instead of
msgpack for the most part.  In the few cases where they use msgpag (e.g.
decoding task config), they do without codegen path as they run on
driver specific structs not the nomad internal structs.  Also, the
ugorji/go serialization through reflection is generally backward
compatible (mod some ugorji/go regression bugs that get introduced every
now and then :( ).

Proposal
---------

The proposal here is to keep committing ugorji/go codec generated files
for releases but to use a go tag for them.

All nomad development through the makefile, including releasing, CI and
dev flow, has the tag enabled.

Downstream plugin projects, by default, will skip these files and life
proceed as normal for them.

The downside is that nomad developers who use generated code but avoid
using make must start passing additional go tag argument.  Though this
is not a blessed configuration.
2019-09-06 09:22:00 -04:00
Mahmood Ali 6d73ca0cfb
Merge pull request #6250 from hashicorp/f-raft-protocol-v3
Update default raft protocol to version 3
2019-09-04 09:34:41 -04:00
Mahmood Ali c94a5ef1f8 tests: give up on TestAutopilot_CleanupStaleRaftServer for now 2019-09-04 09:10:53 -04:00
Nick Ethier 6a90a9f505
structs: canonicalize tg Services and Networks (#6257) 2019-09-04 08:55:47 -04:00
Mahmood Ali 6cefd8f97e tests: attempt to fix TestAutopilot_CleanupStaleRaftServer
Also add a utility function for waiting for stable leadership
2019-09-04 08:49:33 -04:00
Mahmood Ali 035a7a94d9 tests: update time sensitive tests
Fix tests whose messages seem timing dependent.
2019-09-04 08:45:25 -04:00
Mahmood Ali 0beb757b6f tests: disable server auto join by default
Tests typically call join cluster directly rather than rely on consul
discovery.  Worse, consul discovery seems to cause additional leadership
transitions when a server is shutdown in tests than tests expect.
2019-09-04 07:54:54 -04:00
Mahmood Ali 3e2ab6e2a3 address review feedback 2019-09-03 21:44:39 -04:00
Mahmood Ali 0a6d73020c use current nomad version in testing 2019-09-03 21:42:41 -04:00
Mahmood Ali 9bd56587cd Fix raft tests
Wait until leadership stabalizes and all non-voters get promoted before
killing leader
2019-09-03 14:53:29 -04:00
Michael Schurter 5957030d18
connect: add unix socket to proxy grpc for envoy (#6232)
* connect: add unix socket to proxy grpc for envoy

Fixes #6124

Implement a L4 proxy from a unix socket inside a network namespace to
Consul's gRPC endpoint on the host. This allows Envoy to connect to
Consul's xDS configuration API.

* connect: pointer receiver on structs with mutexes

* connect: warn on all proxy errors
2019-09-03 08:43:38 -07:00
Buck Doyle 21ec6a237c Merge branch 'master' into f-policy-json
# Conflicts:
#	CHANGELOG.md
2019-09-03 09:56:25 -05:00
Jasmine Dahilig 4edebe389a
add default update stanza and max_parallel=0 disables deployments (#6191) 2019-09-02 10:30:09 -07:00
Buck Doyle ab96785fc9 Change test to use valid HCL for rules 2019-08-29 16:09:02 -05:00
Buck Doyle 4a159f5dcf Change parsing error to set rules to nil 2019-08-29 15:50:34 -05:00
Buck Doyle 5495a7e689 Add standard error-handling for parse failure 2019-08-29 11:12:02 -05:00
Buck Doyle 8b06712d21 Merge branch 'master' into f-policy-json 2019-08-29 11:11:21 -05:00
Mahmood Ali 3da10b5cb3 scheduler: tests for multiple drivers in TG 2019-08-29 09:03:31 -04:00
Mahmood Ali a67f5f0565 update tests to run with v2 2019-08-28 16:42:08 -04:00
Mahmood Ali 6eabf53b91 Default raft protocol to version 3 2019-08-28 15:56:59 -04:00
Michael Schurter f5792635ca
Merge pull request #6218 from hashicorp/f-consul-defaults
consul: use Consul's defaults and env vars
2019-08-28 11:54:44 -07:00
Nick Ethier 9e96971a75
cli: display group ports and address in alloc status command output (#6189)
* cli: display group ports and address in alloc status command output

* add assertions for port.To = -1 case and convert assertions to testify
2019-08-27 23:59:36 -04:00
Nick Ethier cbb27e74bc
Add environment variables for connect upstreams (#6171)
* taskenv: add connect upstream env vars + test

* set taskenv upstreams instead of appending

* Update client/taskenv/env.go

Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
2019-08-27 23:41:38 -04:00
Michael Schurter 3b0e1d8ef7 consul: use Consul's defaults and env vars
Use Consul's API package defaults and env vars as Nomad's defaults.
2019-08-27 14:56:52 -07:00
Mahmood Ali 3791a70aa9
Merge pull request #5676 from hashicorp/f-b-upgrade-ugorji-dep-20190508
Update ugorji/go to latest
2019-08-23 18:29:49 -04:00
Jerome Gravel-Niquet cbdc1978bf Consul service meta (#6193)
* adds meta object to service in job spec, sends it to consul

* adds tests for service meta

* fix tests

* adds docs

* better hashing for service meta, use helper for copying meta when registering service

* tried to be DRY, but looks like it would be more work to use the
helper function
2019-08-23 12:49:02 -04:00
Michael Schurter 95b8048553
Merge pull request #6121 from hashicorp/f-connect-bootstrap
connect: task hook for bootstrapping envoy sidecar
2019-08-22 10:58:31 -07:00
Michael Schurter 59e0b67c7f connect: task hook for bootstrapping envoy sidecar
Fixes #6041

Unlike all other Consul operations, boostrapping requires Consul be
available. This PR tries Consul 3 times with a backoff to account for
the group services being asynchronously registered with Consul.
2019-08-22 08:15:32 -07:00
Danielle Lancashire 2e5f28029f
remove hidden field from host volumes
We're not shipping support for "hidden" volumes in 0.10 any more, I'll
convert this to an issue+mini RFC for future enhancement.
2019-08-22 08:48:05 +02:00
Danielle 0428284aee
Merge pull request #6180 from hashicorp/dani/readonly-acl
Fine grained ACLs for Host Volumes
2019-08-21 22:22:14 +02:00
Danielle Lancashire 91bb67f713
acls: Break mount acl into mount-rw and mount-ro 2019-08-21 21:17:30 +02:00
Nick Ethier c8556daf37
structs: validate no tcp checks for connect services (#6169) 2019-08-21 12:42:53 -04:00
Michael Schurter 050cc32fde
Merge pull request #6157 from hashicorp/f-connect-register
Register connect enabled group services with Consul
2019-08-20 14:45:38 -07:00
Tim Gross 7dc6ee2d27 structs: add taskgroup networks and services to plan diffs
Adds a check for differences in `job.Diff` so that task group networks
and services, including new Consul connect stanzas, show up in the job
plan outputs.
2019-08-20 16:18:30 -04:00
Michael Schurter b008fd1724 connect: register group services with Consul
Fixes #6042

Add new task group service hook for registering group services like
Connect-enabled services.

Does not yet support checks.
2019-08-20 12:25:10 -07:00
Tim Gross a0e923f46c add optional task field to group service checks 2019-08-20 09:35:31 -04:00
Mahmood Ali d699a70875
Merge pull request #5911 from hashicorp/b-rpc-consistent-reads
Block rpc handling until state store is caught up
2019-08-20 09:29:37 -04:00
Nick Ethier 24f5a4c276
sidecar_task override in connect admission controller (#6140)
* structs: use seperate SidecarTask struct for sidecar_task stanza and add merge

* nomad: merge SidecarTask into proxy task during connect Mutate hook
2019-08-20 01:22:46 -04:00
Nick Ethier 965f00b2fc
Builtin Admission Controller Framework (#6116)
* nomad: add admission controller framework

* nomad: add admission controller framework and Consul Connect hooks

* run admission controllers before checking permissions

* client: add default node meta for connect configurables

* nomad: remove validateJob func since it has been moved to admission controller

* nomad: use new TaskKind type

* client: use consts for connect sidecar image and log level

* Apply suggestions from code review

Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>

* nomad: add job register test with connect sidecar

* Update nomad/job_endpoint_hooks.go

Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
2019-08-15 11:22:37 -04:00
Preetha Appan 72e45dd01e
More code review feedback 2019-08-12 17:41:40 -05:00
Preetha 76c8a11b31
Apply suggestions from code review
Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
2019-08-12 17:03:30 -05:00
Preetha Appan 219dc05541
Fix type for kind 2019-08-12 14:39:50 -05:00
Preetha Appan 35506c516d
Improve validation logic and add table driven tests 2019-08-12 14:39:50 -05:00
Preetha Appan d324a9864e
Add validation for kind field if it is a consul connect proxy 2019-08-12 14:39:50 -05:00
Danielle Lancashire b38c1d810e
job_endpoint: Validate volume permissions 2019-08-12 15:39:09 +02:00
Danielle Lancashire 33db40d4e6
structs: Document VolumeMount 2019-08-12 15:39:08 +02:00
Danielle Lancashire 861caa9564
HostVolumeConfig: Source -> Path 2019-08-12 15:39:08 +02:00
Danielle Lancashire e132a30899
structs: Unify Volume and VolumeRequest 2019-08-12 15:39:08 +02:00
Danielle Lancashire 6d7b417e54
structs: Add declarations of basic structs for volume support 2019-08-12 15:39:08 +02:00
Nick Ethier 1871c1edbc
Add sidecar_task stanza parsing (#6104)
* jobspec: breakup parse.go into smaller files

* add sidecar_task parsing to jobspec and api

* jobspec: combine service parsing logic for task and group service stanzas

* api: use slice of ConsulUpstream values instead of pointers
2019-08-09 15:18:53 -04:00
Preetha Appan a393ea79e8
Add field "kind" to task for use in connect tasks 2019-08-07 18:43:36 -05:00
Jasmine Dahilig 8d980edd2e
add create and modify timestamps to evaluations (#5881) 2019-08-07 09:50:35 -07:00
Michael Schurter 3e4796799a
Merge pull request #6003 from pete-woods/add-job-status-metrics
nomad: add job status metrics
2019-08-07 08:02:16 -07:00
Michael Schurter d2862b33e6
Merge pull request #6045 from hashicorp/f-connect-groupservice
consul: add Connect structs
2019-08-06 15:43:38 -07:00
Michael Schurter ef9d100d2f
Merge pull request #6082 from hashicorp/b-vault-deadlock
vault: fix deadlock in SetConfig
2019-08-06 15:30:17 -07:00
Michael Schurter ecb1a65bb9
Merge pull request #6077 from hashicorp/b-vault-revlock
vault: fix race in accessor revocations
2019-08-06 14:28:47 -07:00
Michael Schurter b8e127b3c0 vault: ensure SetConfig calls are serialized
This is a defensive measure as SetConfig should only be called serially.
2019-08-06 11:17:10 -07:00
Michael Schurter 5022341b27 vault: fix deadlock in SetConfig
This seems to be the minimum viable patch for fixing a deadlock between
establishConnection and SetConfig.

SetConfig calls tomb.Kill+tomb.Wait while holding v.lock.
establishConnection needs to acquire v.lock to exit but SetConfig is
holding v.lock until tomb.Wait exits. tomb.Wait can't exit until
establishConnect does!

```
  SetConfig -> tomb.Wait
     ^              |
     |              v
  v.lock <- establishConnection
```
2019-08-06 10:40:14 -07:00
Michael Schurter 17fd82d6ad consul: add Connect structs
Refactor all Consul structs into {api,structs}/services.go because
api/tasks.go didn't make sense anymore and structs/structs.go is
gigantic.
2019-08-06 08:15:07 -07:00
Michael Schurter d0a83eb818 vault: fix race in accessor revocations 2019-08-05 15:08:04 -07:00
Preetha Appan 8b298621ef
Add more comments to clarify job.Stable field 2019-08-05 15:00:53 -05:00
Preetha Appan e6a496bac0
Code review feedback 2019-07-31 01:04:08 -04:00
Preetha Appan 99eca85206
Scheduler changes to support network at task group level
Also includes unit tests for binpacker and preemption.
The tests verify that network resources specified at the
task group level are properly accounted for
2019-07-31 01:04:08 -04:00
Michael Schurter 4501fe3c4d
structs: deepcopy shared alloc resources
Also DRY up Networks code by using Networks.Copy
2019-07-31 01:04:06 -04:00
Michael Schurter fb487358fb
connect: add group.service stanza support 2019-07-31 01:04:05 -04:00
Nick Ethier a03f6a95a2
structs: refactor network validation to seperate fn 2019-07-31 01:03:16 -04:00
Danielle 1e7571eb85
fix structs comment
Co-Authored-By: nickethier <ncethier@gmail.com>
2019-07-31 01:03:16 -04:00
Nick Ethier aa7c08679e
structs: Add validations for task group networks 2019-07-31 01:03:16 -04:00
Nick Ethier 6c160df689
fix tests from introducing new struct fields 2019-07-31 01:03:16 -04:00
Nick Ethier 8650429e38
Add network stanza to group
Adds a network stanza and additional options to the task group level
in prep for allowing shared networking between tasks of an alloc.
2019-07-31 01:03:12 -04:00
Preetha Appan d048029b5a
remove generated code and change version to 0.10.0 2019-07-30 15:56:05 -05:00
Nomad Release bot e39fb11531 Generate files for 0.9.4 release 2019-07-30 19:05:18 +00:00
Buck Doyle 0a1a0419cb Combine conditionals 2019-07-29 10:38:07 -05:00
Buck Doyle 0a082c1e5e Update assertion to use better failure-reporting 2019-07-29 10:35:07 -05:00
Buck Doyle c3deb7703d Update policy endpoint to permit anonymous access 2019-07-26 13:07:42 -05:00
Pete Woods 9096aa3d23
Add job status metrics
This avoids having to write services to repeatedly hit the jobs API
2019-07-26 10:12:49 +01:00
Buck Doyle 77f5a38c8f Add parsed rules to policy response 2019-07-25 10:43:57 -05:00
Preetha Appan 6b4c40f5a8
remove generated code 2019-07-23 12:07:49 -05:00
Nomad Release bot 04187c8b86 Generate files for 0.9.4-rc1 release 2019-07-22 21:42:36 +00:00
Jasmine Dahilig 2157f6ddf1
add formatting for hcl parsing error messages (#5972) 2019-07-19 10:04:39 -07:00
Lang Martin f282da4ced blocked_evals_test disable calls Flush 2019-07-18 10:32:13 -04:00
Lang Martin 8f7a20839e worker comment system -> core 2019-07-18 10:32:13 -04:00
Lang Martin 83d20169f6 blocked_evals reset system evals on Flush 2019-07-18 10:32:13 -04:00
Lang Martin 6e3425babf blocked_evals_test Test_UnblockNode 2019-07-18 10:32:12 -04:00
Lang Martin ea275d5ce7 fsm attach UnblockNode on node updates 2019-07-18 10:32:12 -04:00
Lang Martin 3bf618f217 blocked_evals system evals indexed by job and node 2019-07-18 10:32:12 -04:00
Michael Schurter 81b4b6f19b
Merge pull request #5791 from hashicorp/b-plan-snapshotindex
nomad: include snapshot index when submitting plans
2019-07-17 09:25:00 -07:00
Mahmood Ali ad39bcef60 rpc: use tls wrapped connection for streaming rpc
This ensures that server-to-server streaming RPC calls use the tls
wrapped connections.

Prior to this, `streamingRpcImpl` function uses tls for setting header
and invoking the rpc method, but returns unwrapped tls connection.
Thus, streaming writes fail with tls errors.

This tls streaming bug existed since 0.8.0[1], but PR #5654[2]
exacerbated it in 0.9.2.  Prior to PR #5654, nomad client used to
shuffle servers at every heartbeat -- `servers.Manager.setServers`[3]
always shuffled servers and was called by heartbeat code[4].  Shuffling
servers meant that a nomad client would heartbeat and establish a
connection against all nomad servers eventually.  When handling
streaming RPC calls, nomad servers used these local connection to
communicate directly to the client.  The server-to-server forwarding
logic was left mostly unexercised.

PR #5654 means that a nomad client may connect to a single server only
and caused the server-to-server forward streaming RPC code to get
exercised more and unearthed the problem.

[1] https://github.com/hashicorp/nomad/blob/v0.8.0/nomad/rpc.go#L501-L515
[2] https://github.com/hashicorp/nomad/pull/5654
[3] https://github.com/hashicorp/nomad/blob/v0.9.1/client/servers/manager.go#L198-L216
[4] https://github.com/hashicorp/nomad/blob/v0.9.1/client/client.go#L1603
2019-07-12 14:41:44 +08:00
Mahmood Ali 9c9bec62fd rpc: add positive tests for server streaming RPC 2019-07-12 14:32:52 +08:00
Lang Martin 0b97175a16 node_endpoint preserve both messages as rpcs and in raft 2019-07-10 13:56:20 -04:00
Lang Martin ee4848167c core_sched add compat comment for later removal 2019-07-10 13:56:20 -04:00
Lang Martin c13c97c6c2 structs drop deprecation warning, revert unnecessary comment change 2019-07-10 13:56:20 -04:00
Lang Martin a95225d754 NodeDeregisterBatch -> NodeBatchDeregister match JobBatch pattern 2019-07-10 13:56:20 -04:00
Lang Martin a8e72a5b68 state_store error if called without node_ids 2019-07-10 13:56:20 -04:00
Lang Martin 44cbca9b98 fsm new NodeDeregisterBatchRequestType sorted at the end of the case 2019-07-10 13:56:20 -04:00
Lang Martin 91e139dcb5 structs NodeDeregisterBatchRequestType must go at the end 2019-07-10 13:56:20 -04:00
Lang Martin 1cc6b4062c fsm label batch_deregister_node metrics explicitly
Co-Authored-By: Mahmood Ali <mahmood@notnoop.com>
2019-07-10 13:56:20 -04:00
Lang Martin ad3549f906 core_sched use the new rpc names 2019-07-10 13:56:20 -04:00
Lang Martin ce0f03651a fsm support new NodeDeregisterBatchRequest 2019-07-10 13:56:20 -04:00
Lang Martin fa5649998e node endpoint support new NodeDeregisterBatchRequest 2019-07-10 13:56:19 -04:00
Lang Martin 683ab8d1d2 structs add NodeDeregisterBatchRequest 2019-07-10 13:56:19 -04:00
Lang Martin 82349aba5d node_endpoint argument setup 2019-07-10 13:56:19 -04:00
Lang Martin 6dbf5d7d13 fsm return an error on both NodeDeregisterRequest fields set 2019-07-10 13:56:19 -04:00
Lang Martin fbc78ba96c fsm variable names for consistency 2019-07-10 13:56:19 -04:00
Lang Martin 09fd05bd8f node_endpoint raft store then shutdown, test deprecation 2019-07-10 13:56:19 -04:00
Lang Martin 4610c70777 util simplify partitionAll 2019-07-10 13:56:19 -04:00
Lang Martin d22d9fb5b2 core_sched check ServersMeetMinimumVersion 2019-07-10 13:56:19 -04:00
Lang Martin 3bf41211fb fsm honor new and old style NodeDeregisterRequests 2019-07-10 13:56:19 -04:00
Lang Martin 3fb82e83a5 structs add back NodeDeregisterRequest.NodeID, compatibility 2019-07-10 13:56:19 -04:00
Lang Martin a4472e3d34 core_sched check ServersMeetMinimumVersion, send old node deregister 2019-07-10 13:56:19 -04:00
Lang Martin 8e53c105fc state_store just one index update, test deletion 2019-07-10 13:56:19 -04:00
Lang Martin 3e2d1f0338 node_endpoint improve error messages 2019-07-10 13:56:19 -04:00
Lang Martin 5a6a947e98 state_store improve error messages 2019-07-10 13:56:19 -04:00
Lang Martin fd14cedf95 drainer watch_nodes_test batch of 1 2019-07-10 13:56:19 -04:00
Lang Martin b176066d42 node_endpoint deregister the batch of nodes 2019-07-10 13:56:19 -04:00
Lang Martin a97407e030 fsm NodeDeregisterRequest is now a batch 2019-07-10 13:56:19 -04:00
Lang Martin d5ff2834ca core_sched batch node deregistration requests 2019-07-10 13:56:19 -04:00
Lang Martin 10848841be util partitionAll for paging 2019-07-10 13:56:19 -04:00
Lang Martin be2d6853cb state_store DeleteNode operates on a batch of ids 2019-07-10 13:56:19 -04:00
Lang Martin 77cf037bff struct NodeDeregisterRequest has a batch of NodeIDs 2019-07-10 13:56:19 -04:00
Mahmood Ali ea3a98357f Block rpc handling until state store is caught up
Here, we ensure that when leader only responds to RPC calls when state
store is up to date.  At leadership transition or launch with restored
state, the server local store might not be caught up with latest raft
logs and may return a stale read.

The solution here is to have an RPC consistency read gate, enabled when
`establishLeadership` completes before we respond to RPC calls.
`establishLeadership` is gated by a `raft.Barrier` which ensures that
all prior raft logs have been applied.

Conversely, the gate is disabled when leadership is lost.

This is very much inspired by https://github.com/hashicorp/consul/pull/3154/files
2019-07-02 16:07:37 +08:00
Preetha Appan 3cb798235d
Missed one revert of backwards compatibility for node drain 2019-07-01 16:46:05 -05:00
Preetha Appan aa2b4b4e00
Undo removal of node drain compat changes
Decided to remove that in 0.10
2019-07-01 15:12:01 -05:00
Preetha Appan 3484f18984
Fix more tests 2019-06-26 16:30:53 -05:00
Preetha Appan ff1b80dba6
Fix node drain test 2019-06-26 16:12:07 -05:00
Preetha Appan 23319e04d6
Restore accidentally deleted block 2019-06-26 13:59:14 -05:00
Michael Schurter 69ba495f0c nomad: expand comments on subtle plan apply behaviors 2019-06-26 08:49:24 -07:00
Preetha Appan 66fa6a67ec
newline 2019-06-25 19:41:09 -05:00
Preetha Appan 10e7d6df6d
Remove compat code associated with many previous versions of nomad
This removes compat code for namespaces (0.7), Drain(0.8) and other
older features from releases older than Nomad 0.7
2019-06-25 19:05:25 -05:00
Michael Schurter e4bc943a68 nomad: SnapshotAfter -> SnapshotMinIndex
Rename SnapshotAfter to SnapshotMinIndex. The old name was not
technically accurate. SnapshotAtOrAfter is more accurate, but wordy and
still lacks context about what precisely it is at or after (the index).

SnapshotMinIndex was chosen as it describes the action (snapshot), a
constraint (minimum), and the object of the constraint (index).
2019-06-24 12:16:46 -07:00
Michael Schurter 0f8164b2f1 nomad: evaluate plans after previous plan index
The previous commit prevented evaluating plans against a state snapshot
which is older than the snapshot at which the plan was created.  This is
correct and prevents failures trying to retrieve referenced objects that
may not exist until the plan's snapshot. However, this is insufficient
to guarantee consistency if the following events occur:

1. P1, P2, and P3 are enqueued with snapshot @ 100
2. Leader evaluates and applies Plan P1 with snapshot @ 100
3. Leader evaluates Plan P2 with snapshot+P1 @ 100
4. P1 commits @ 101
4. Leader evaluates applies Plan P3 with snapshot+P2 @ 100

Since only the previous plan is optimistically applied to the state
store, the snapshot used to evaluate a plan may not contain the N-2
plan!

To ensure plans are evaluated and applied serially we must consider all
previous plan's committed indexes when evaluating further plans.

Therefore combined with the last PR, the minimum index at which to
evaluate a plan is:

    min(previousPlanResultIndex, plan.SnapshotIndex)
2019-06-24 12:16:46 -07:00
Michael Schurter e10fea1d7a nomad: include snapshot index when submitting plans
Plan application should use a state snapshot at or after the Raft index
at which the plan was created otherwise it risks being rejected based on
stale data.

This commit adds a Plan.SnapshotIndex which is set by workers when
submitting plan. SnapshotIndex is set to the Raft index of the snapshot
the worker used to generate the plan.

Plan.SnapshotIndex plays a similar role to PlanResult.RefreshIndex.
While RefreshIndex informs workers their StateStore is behind the
leader's, SnapshotIndex is a way to prevent the leader from using a
StateStore behind the worker's.

Plan.SnapshotIndex should be considered the *lower bound* index for
consistently handling plan application.

Plans must also be committed serially, so Plan N+1 should use a state
snapshot containing Plan N. This is guaranteed for plans *after* the
first plan after a leader election.

The Raft barrier on leader election ensures the leader's statestore has
caught up to the log index at which it was elected. This guarantees its
StateStore is at an index > lastPlanIndex.
2019-06-24 12:16:46 -07:00
Chris Baker 59fac48d92 alloc lifecycle: 404 when attempting to stop non-existent allocation 2019-06-20 21:27:22 +00:00
Preetha 586e50d1a4
Merge pull request #5841 from hashicorp/f-raft-snapshot-metrics
Raft and state store indexes as metrics
2019-06-19 12:01:03 -05:00
Preetha Appan dc0ac81609
Change interval of raft stats collection to 10s 2019-06-19 11:58:46 -05:00
Preetha Appan 104d66f10c
Changed name of metric 2019-06-17 15:51:31 -05:00
Chris Baker e0170e1c67 metrics: add namespace label to allocation metrics 2019-06-17 20:50:26 +00:00
Preetha Appan c54b4a5b17
Emit metrics with raft commit and apply index and statestore latest index 2019-06-14 16:30:27 -05:00
Jasmine Dahilig ed9740db10
Merge pull request #5664 from hashicorp/f-http-hcl-region
backfill region from hcl for jobUpdate and jobPlan
2019-06-13 12:25:01 -07:00
Jasmine Dahilig 51e141be7a backfill region from job hcl in jobUpdate and jobPlan endpoints
- updated region in job metadata that gets persisted to nomad datastore
- fixed many unrelated unit tests that used an invalid region value
(they previously passed because hcl wasn't getting picked up and
the job would default to global region)
2019-06-13 08:03:16 -07:00
Nick Ethier 1b7fa4fe29
Optional Consul service tags for nomad server and agent services (#5706)
Optional Consul service tags for nomad server and agent services
2019-06-13 09:00:35 -04:00
Mahmood Ali e31159bf1f Prepare for 0.9.4 dev cycle 2019-06-12 18:47:50 +00:00
Nomad Release bot 4803215109 Generate files for 0.9.3 release 2019-06-12 16:11:16 +00:00
Mahmood Ali 07f2c77c44 comment DenormalizeAllocationDiffSlice applies to terminal allocs only 2019-06-12 08:28:43 -04:00
Lang Martin fe8a4781d8 config merge maintains *HCL string fields used for duration conversion 2019-06-11 16:34:04 -04:00
Mahmood Ali 392f5bac44 Stop updating allocs.Job on stopping or preemption 2019-06-10 18:30:20 -04:00
Mahmood Ali 6c8e329819 test that stopped alloc jobs aren't modified
When an alloc is stopped, test that we don't update the job found in
alloc with new job  that is no longer relevent for this alloc.
2019-06-10 17:14:26 -04:00
Mahmood Ali d30c3d10b0
Merge pull request #5747 from hashicorp/b-test-fixes-20190521-1
More test fixes
2019-06-05 19:09:18 -04:00
Mahmood Ali 87173111de
Merge pull request #5746 from hashicorp/b-no-updating-inmem-node
set node.StatusUpdatedAt in raft
2019-06-05 19:05:21 -04:00
Mahmood Ali 97957fbf75 Prepare for 0.9.3 dev cycle 2019-06-05 14:54:00 +00:00
Nomad Release bot 43bfbf3fcc Generate files for 0.9.2 release 2019-06-05 11:59:27 +00:00
Michael Schurter 073893f529 nomad: disable service+batch preemption by default
Enterprise only.

Disable preemption for service and batch jobs by default.

Maintain backward compatibility in a x.y.Z release. Consider switching
the default for new clusters in the future.
2019-06-04 15:54:50 -07:00
Michael Schurter a8fc50cc1b nomad: revert use of SnapshotAfter in planApply
Revert plan_apply.go changes from #5411

Since non-Command Raft messages do not update the StateStore index,
SnapshotAfter may unnecessarily block and needlessly fail in idle
clusters where the last Raft message is a non-Command message.

This is trivially reproducible with the dev agent and a job that has 2
tasks, 1 of which fails.

The correct logic would be to SnapshotAfter the previous plan's index to
ensure consistency. New clusters or newly elected leaders will not have
a previous plan, so the index the leader was elected should be used
instead.
2019-06-03 15:34:21 -07:00
Mahmood Ali a4ead8ff79 remove 0.9.2-rc1 generated code 2019-05-23 11:14:24 -04:00
Nomad Release bot 6d6bc59732 Generate files for 0.9.2-rc1 release 2019-05-22 19:29:30 +00:00
Lang Martin d46613ff44 structs check TaskGroup.Update for nil 2019-05-22 12:34:57 -04:00
Lang Martin 10a3fd61b0 comment replace COMPAT 0.7.0 for job.Update with more current info 2019-05-22 12:34:57 -04:00
Lang Martin 67ebcc47dd structs comment todo DeploymentStatus & DeploymentStatusDescription 2019-05-22 12:34:57 -04:00
Lang Martin 21bf9fdf90 structs job warnings for taskgroup with mixed auto_promote settings 2019-05-22 12:34:57 -04:00
Lang Martin 0f6f543a5f deployment_watcher auto promote iff every task group is auto promotable 2019-05-22 12:34:57 -04:00
Lang Martin d27d6f8ede structs validate requires Canary for AutoPromote 2019-05-22 12:32:08 -04:00
Lang Martin 0c668ecc7a log error on autoPromoteDeployment failure 2019-05-22 12:32:08 -04:00
Lang Martin f23f9fd99e describe a pending deployment without auto_promote more explicitly 2019-05-22 12:32:08 -04:00
Lang Martin 34230577df describe a pending deployment with auto_promote accurately 2019-05-22 12:32:08 -04:00
Lang Martin b5fd735960 add update AutoPromote bool 2019-05-22 12:32:08 -04:00
Lang Martin 3c5a9fed22 deployments_watcher_test new TestWatcher_AutoPromoteDeployment 2019-05-22 12:32:08 -04:00
Lang Martin 0bebf5d7f8 deployment_watcher when it's ok to autopromote, do so 2019-05-22 12:32:08 -04:00
Lang Martin 0cf4168ed9 deployments_watcher comments 2019-05-22 12:32:08 -04:00
Lang Martin 0c403eafde state_store typo in a comment 2019-05-22 12:32:08 -04:00
Lang Martin e1e28307be new deploymentwatcher/doc.go for package level documentation 2019-05-22 12:32:08 -04:00
Mahmood Ali 9ff5f163b5 update callers in tests 2019-05-21 21:10:17 -04:00
Mahmood Ali 6bdbeed319 set node.StatusUpdatedAt in raft
Fix a case where `node.StatusUpdatedAt` was manipulated directly in
memory.

This ensures that StatusUpdatedAt is set in raft layer, and ensures that
the field is updated when node drain/eligibility is updated too.
2019-05-21 16:13:32 -04:00
Mahmood Ali 2159d0f3ac tests: fix some nomad/drainer test data races 2019-05-21 14:40:58 -04:00
Mahmood Ali 3b0152d778 tests: fix deploymentwatcher tests data races 2019-05-21 14:29:45 -04:00
Michael Schurter 689794e08d nomad: fix deadlock in UnblockClassAndQuota
Previous commit could introduce a deadlock if the capacityChangeCh was
full and the receiving side exited before freeing a slot for the sending
side could send. Flush would then block forever waiting to acquire the
lock just to throw the pending update away.

The race is around getting/setting the chan field, not chan operations,
so only lock around getting the chan field.
2019-05-20 15:41:52 -07:00
Michael Schurter 8c99214f69 nomad: fix race in BlockedEvals
I assume the mutex was being released before sending on capacityChangeCh
to avoid blocking in the critical section, but:

1. This is race.
2. capacityChangeCh has a *huge* buffer (8096). If it's full things
   already seem Very Bad, and a little backpressure seems appropriate.
2019-05-20 15:26:20 -07:00
Michael Schurter 05a9c6aedb
Merge pull request #5411 from hashicorp/b-snapshotafter
Block plan application until state store has caught up to raft
2019-05-20 14:03:10 -07:00
Mahmood Ali cd64ada95d Run TestClientAllocations_Restart_ACL test 2019-05-17 20:30:23 -04:00
Michael Schurter 0e39927782 nomad: emit more detailed error
Avoid returning context.DeadlineExceeded as it lacks helpful information
and is often ignored or handled specially by callers.
2019-05-17 14:37:42 -07:00
Michael Schurter b80a7e0feb nomad: wait for state store to sync in plan apply
Wait for state store to catch up with raft when applying plans.
2019-05-17 14:37:12 -07:00
Michael Schurter 1bc731da47 nomad: remove unused NotifyGroup struct
I don't think it's been used for a long time.
2019-05-17 13:30:23 -07:00
Michael Schurter 9732bc37ff nomad: refactor waitForIndex into SnapshotAfter
Generalize wait for index logic in the state store for reuse elsewhere.
Also begin plumbing in a context to combine handling of timeouts and
shutdown.
2019-05-17 13:30:23 -07:00
Preetha c8fdf20c66
Merge pull request #5717 from hashicorp/b-plan-apply-preemptions
Fix bug in plan applier introduced in PR-5602
2019-05-16 11:01:05 -05:00
Preetha 2dcd4291f8
Merge pull request #5702 from hashicorp/f-filter-by-create-index
Filter deployments by create index
2019-05-15 21:50:41 -05:00
Preetha 555dd23c2c
remove stray newline
Co-Authored-By: Danielle <dani@builds.terrible.systems>
2019-05-15 21:11:52 -05:00
Preetha Appan 2b787aad7e
Fix bug in plan applier introduced in PR-5602
This fixes a bug in the state store during plan apply. When
denormalizing preempted allocations it incorrectly set the preemptor's
job during the update. This eventually causes a panic downstream in the
client. Added a test assertion that failed before and passes after this fix
2019-05-15 20:34:06 -05:00
Danielle d202582502
Merge pull request #5699 from hashicorp/dani/b-eval-broker-lifetime
Eval Broker: Prevent redundant enqueue's when a node is not a leader
2019-05-15 23:30:52 +01:00
Danielle Lancashire 2fb93a6229 evalbroker: test for no enqueue on disabled 2019-05-15 11:02:21 +02:00
Nick Ethier ade97bc91f
fixup #5172 and rebase against master 2019-05-14 14:37:34 -04:00
Nick Ethier cab6a95668
Merge branch 'master' into pr/5172
* master: (912 commits)
  Update redirects.txt
  Added redirect for Spark guide link
  client: log when server list changes
  docs: mention regression in task config validation
  fix update to changelog
  update CHANGELOG with datacenter config validation https://github.com/hashicorp/nomad/pull/5665
  typo: "atleast" -> "at least"
  implement nomad exec for rkt
  docs: fixed typo
  use pty/tty terminology similar to github.com/kr/pty
  vendor github.com/kr/pty
  drivers: implement streaming exec for executor based drivers
  executors: implement streaming exec
  executor: scaffolding for executor grpc handling
  client: expose allocated memory per task
  client improve a comment in updateNetworks
  stalebot: Add 'thinking' as an exempt label (#5684)
  Added Sparrow link
  update links to use new canonical location
  Add redirects for restructing done in GH-5667
  ...
2019-05-14 14:10:33 -04:00
Michael Schurter d7e5ace1ed client: do not restart dead tasks until server is contacted
Fixes #1795

Running restored allocations and pulling what allocations to run from
the server happen concurrently. This means that if a client is rebooted,
and has its allocations rescheduled, it may restart the dead allocations
before it contacts the server and determines they should be dead.

This commit makes tasks that fail to reattach on restore wait until the
server is contacted before restarting.
2019-05-14 10:53:27 -07:00
Danielle Lancashire d9815888ed evalbroker: Simplify nextDelayedEval locking 2019-05-14 14:06:27 +02:00
Danielle Lancashire 38562afbc1 evalbroker: No new enqueues when disabled
Currently when an evalbroker is disabled, it still recieves delayed
enqueues via log application in the fsm. This causes an ever growing
heap of evaluations that will never be drained, and can cause memory
issues in larger clusters, or when left running for an extended period
of time without a leader election.

This commit prevents the enqueuing of evaluations while we are
disabled, and relies on the leader restoreEvals routine to handle
reconciling state during a leadership transition.

Existing dequeues during an Enabled->Disabled broker state transition are
handled by the enqueueLocked function dropping evals.
2019-05-14 13:59:10 +02:00