
17106 commits

Author SHA1 Message Date
Drew Bailey a716d57ad7
clean up 2020-02-04 11:59:28 -05:00
Drew Bailey 75053a0d10
get test passing, new util func to wait for not pending 2020-02-04 11:56:37 -05:00
Drew Bailey 5117a22c30
add e2e test for system sched ineligible nodes 2020-02-04 11:56:33 -05:00
Seth Hoenig f4a66ebd28 e2e: wait 2m rather than 10s after disabling consul acls
I'm fairly sure Consul / Nomad clients are often not yet ready when the
next test starts running after the ConsulACLs test disables ACLs.

Running locally, things tend to work, but in TeamCity this seems to
be a recurring problem. Even locally, though, the "show status" step
after disabling ACLs sometimes shows some nodes still initializing,
suggesting we're right on the border of not waiting long enough:

    nomad node status
    ID        DC   Name              Class   Drain  Eligibility  Status
    0e4dfce2  dc1  EC2AMAZ-JB3NF9P   <none>  false  eligible     ready
    6b90aa06  dc2  ip-172-31-16-225  <none>  false  eligible     ready
    7068558a  dc2  ip-172-31-20-143  <none>  false  eligible     ready
    e0ae3c5c  dc1  ip-172-31-25-165  <none>  false  eligible     ready
    15b59ed6  dc1  ip-172-31-23-199  <none>  false  eligible     initializing

Going to try waiting a full 2 minutes after disabling ACLs, hopefully that
will help things Just Work. In the future, we should probably be parsing the
output of the status checks and actually confirming all nodes are ready.

Even better, maybe that's something shipyard will have built-in.
2020-02-04 10:51:03 -06:00
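The follow-up suggested in f4a66ebd28 (parsing the status output and confirming that all nodes are ready, rather than sleeping for a fixed time) could look roughly like the sketch below. It is not code from the repository: `notReadyNodes` is a hypothetical helper, and it assumes the first non-empty line is the header and that Status is the last whitespace-separated column, as in the listing in that commit message.

```go
package main

import (
	"bufio"
	"strings"
)

// notReadyNodes scans the tabular output of `nomad node status` and returns
// the IDs (first column) of nodes whose final column is not "ready".
// Assumption: the first non-empty line is the header row and Status is the
// last whitespace-separated column.
func notReadyNodes(output string) []string {
	var pending []string
	sc := bufio.NewScanner(strings.NewReader(output))
	header := true
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 {
			continue // skip blank lines
		}
		if header {
			header = false // skip the column headings
			continue
		}
		if fields[len(fields)-1] != "ready" {
			pending = append(pending, fields[0])
		}
	}
	return pending
}
```

A test could poll the status command until `notReadyNodes` returns an empty slice, with an overall timeout as a backstop.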
Tim Gross 0b48baf0ba
e2e: rename linux runner to avoid implicit build tag (#7070)
Go implicitly treats files ending with `_linux.go` as build tagged for
Linux only. This broke the e2e provisioning framework on macOS once we
tried importing it into the `e2e/consulacls` module.
2020-02-04 10:55:38 -05:00
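To illustrate the implicit build tag behind 0b48baf0ba, here is a simplified sketch of how a file name suffix implies a target OS. It is an approximation of the Go toolchain's rule, not its implementation: `implicitGOOS` is a hypothetical helper, the OS and architecture lists are deliberately short subsets, and the file names in the usage note are made up.

```go
package main

import "strings"

// Illustrative subsets only; the real toolchain knows many more values.
var knownOS = map[string]bool{"linux": true, "darwin": true, "windows": true, "freebsd": true}
var knownArch = map[string]bool{"amd64": true, "arm64": true, "386": true, "arm": true}

// implicitGOOS reports the operating system implied by a Go file name of the
// form name_GOOS.go or name_GOOS_GOARCH.go (an optional _test suffix is
// ignored). A bare "linux.go" carries no constraint because the OS must
// follow an underscore.
func implicitGOOS(filename string) (string, bool) {
	if !strings.HasSuffix(filename, ".go") {
		return "", false
	}
	name := strings.TrimSuffix(filename, ".go")
	name = strings.TrimSuffix(name, "_test")
	parts := strings.Split(name, "_")
	n := len(parts)
	if n >= 3 && knownOS[parts[n-2]] && knownArch[parts[n-1]] {
		return parts[n-2], true
	}
	if n >= 2 && knownOS[parts[n-1]] {
		return parts[n-1], true
	}
	return "", false
}
```

Under these assumptions, a hypothetical `runner_linux.go` is compiled only on Linux, while the same file renamed to something like `linux_runner.go` is compiled everywhere, which is the effect the rename relies on.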
Tim Gross 940110b2de
e2e: improve provisioning defaults and documentation (#7062)
This changeset improves the ergonomics of running the Nomad e2e test
provisioning process by defaulting to a blank `nomad_sha` in the
Terraform configuration. By default, a user will now need to pass in
one of the Nomad version flags. But they won't have to manually edit
the `provisioning.json` file for the common case of deploying a
released version of Nomad, and won't need to put dummy values for
`nomad_sha`.

Includes general documentation improvements.
2020-02-04 10:37:00 -05:00
Seth Hoenig 15b5a73660
Merge pull request #7060 from hashicorp/f-e2e-more-missed-debug
e2e: turn no-ACLs connect tests back on
2020-02-04 08:47:10 -06:00
Seth Hoenig 653c8fe9a5 e2e: turn no-ACLs connect tests back on
Also clean up more missed debugging things >.>
2020-02-03 20:46:36 -06:00
Mahmood Ali e106d373b2 rpc: Use MultiplexV2 for connections
MultiplexV2 is a new connection multiplex header that supports multiplexing both
RPC and streaming requests over the same Yamux connection.

MultiplexV2 was added in 0.8.0 as part of
https://github.com/hashicorp/nomad/pull/3892 .  So Nomad 0.11 can expect it to
be supported.  Though, some more rigorous testing is required before merging
this.

I want to call out some implementation details:

First, the current connection pool reuses the Yamux stream for multiple RPC calls,
and doesn't close them until an error is encountered.  This commit doesn't
change it, and sets the `RpcNomad` byte only at stream creation.

Second, the StreamingRPC session gets closed by callers and cannot be reused.
Every StreamingRPC opens a new Yamux session.
2020-02-03 19:31:39 -05:00
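The header scheme described in e106d373b2 can be pictured with the sketch below: one byte announces MultiplexV2 on the connection, and each stream then begins with a byte that selects plain RPC or streaming RPC. This is an illustration, not Nomad's code. The constant names echo the commit message, but their numeric values, the `recorder` type, and both functions are assumptions made for the example, and Yamux itself is left out.

```go
package main

import (
	"errors"
	"io"
)

// Mode bytes. The numeric values are assumptions chosen for this sketch.
const (
	rpcNomad       byte = 0x01 // ordinary request/response RPC
	rpcStreaming   byte = 0x05 // streaming RPC
	rpcMultiplexV2 byte = 0x06 // connection carries both kinds of stream
)

// recorder is an in-memory io.Writer standing in for a network connection
// or a multiplexed stream, so the bytes written can be inspected.
type recorder struct{ sent []byte }

func (r *recorder) Write(p []byte) (int, error) {
	r.sent = append(r.sent, p...)
	return len(p), nil
}

// openConn writes the connection-level header once, before any stream exists.
func openConn(w io.Writer) error {
	_, err := w.Write([]byte{rpcMultiplexV2})
	return err
}

// openStream writes the per-stream mode byte. It is sent only when the
// stream is created, so a pooled stream reused for many RPC calls does not
// repeat it.
func openStream(w io.Writer, mode byte) error {
	if mode != rpcNomad && mode != rpcStreaming {
		return errors.New("unknown stream mode")
	}
	_, err := w.Write([]byte{mode})
	return err
}
```

In this picture a pooled RPC stream calls `openStream` once with `rpcNomad` and is then reused, whereas each streaming call opens a fresh stream with `rpcStreaming` and closes it when done.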
Michael Schurter 7febd6beb8
docs: fix typo, ordering, & style in changelog 2020-02-03 13:59:57 -08:00
Mahmood Ali 2424870937
Merge pull request #7055 from hashicorp/r-dev-tweaks-20200203
Grab bag of dev tweaks
2020-02-03 14:25:06 -05:00
Drew Bailey 00637fe317
Merge pull request #6975 from hashicorp/b-update-placed-canaries
keep placed canaries aligned in raft store
2020-02-03 14:24:32 -05:00
Michael Schurter 0ecc943cf5
docs: add link & reorg #6690 in changelog 2020-02-03 11:03:45 -08:00
Drew Bailey 9a65556211
add state store test to ensure PlacedCanaries is updated 2020-02-03 13:58:01 -05:00
Drew Bailey f51a3d1f37
nomad state store must be modified through raft, rm local state change 2020-02-03 13:57:34 -05:00
Drew Bailey 74779f23e6
keep placed canaries aligned with alloc status 2020-02-03 13:57:33 -05:00
Drew Bailey 38965bf3b3
Merge pull request #7053 from hashicorp/b-client-monitor-acl-panic
Fix panic when monitoring a local client node
2020-02-03 13:45:46 -05:00
Michael Schurter ef498b26ab
docs: fix misspelling 2020-02-03 10:32:22 -08:00
Michael Lange 1d9c7515ad
Merge pull request #6979 from hashicorp/f/codeowners
Add the digital marketing team as the code owners for the website dir
2020-02-03 10:28:12 -08:00
Drew Bailey 29dc0688a2
Merge pull request #6996 from hashicorp/system-sched-ineligible-updates
System sched ignore ineligible updates
2020-02-03 13:22:30 -05:00
Drew Bailey 06f4a0d946
update changelog 2020-02-03 13:20:07 -05:00
Drew Bailey d830998572
agent Profile req nil check s.agent.Server()
clean up logic and tests
2020-02-03 13:20:05 -05:00
Drew Bailey c4f45f9bde
Fix panic when monitoring a local client node
Fixes a panic caused by accessing a.agent.Server() when the agent is
running as a client. This PR removes a redundant ACL check, since ACLs are
validated at the RPC layer. It also nil-checks the agent server and uses
Client() when appropriate.
2020-02-03 13:20:04 -05:00
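The nil check described in c4f45f9bde follows a common Go pattern, sketched below with stand-in types. Nothing here is taken from the Nomad source: `Agent`, `Server`, `Client`, and `monitorSource` are hypothetical and exist only to show why an unchecked `Server()` call panics on a client-only agent.

```go
package main

import "errors"

// Server and Client are hypothetical stand-ins for the two agent roles.
type Server struct{ name string }
type Client struct{ name string }

// Name dereferences its receiver, so calling it on a nil pointer panics.
func (s *Server) Name() string { return s.name }
func (c *Client) Name() string { return c.name }

// Agent may run a server, a client, or both; an absent role is nil.
type Agent struct {
	server *Server
	client *Client
}

func (a *Agent) Server() *Server { return a.server }
func (a *Agent) Client() *Client { return a.client }

// monitorSource picks the component to monitor: the server when present,
// otherwise the client. It never dereferences a nil role.
func monitorSource(a *Agent) (string, error) {
	if srv := a.Server(); srv != nil {
		return srv.Name(), nil
	}
	if c := a.Client(); c != nil {
		return c.Name(), nil
	}
	return "", errors.New("agent is running neither a server nor a client")
}
```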
Mahmood Ali 06a283ea93
Merge pull request #7045 from hashicorp/b-rpc-fixes
Some fixes to connection pooling
2020-02-03 13:10:15 -05:00
Seth Hoenig e3684f84cb
Merge pull request #7054 from hashicorp/f-remove-leftover-debug-line
e2e: remove leftover e2e debug println
2020-02-03 12:02:19 -06:00
Mahmood Ali a4d58c3178 pool: Clear connection before releasing
This is to be consistent with the other connection cleanup handlers, as well as Consul's https://github.com/hashicorp/consul/blob/v1.6.3/agent/pool/pool.go#L468-L479 .
2020-02-03 12:41:11 -05:00
Mahmood Ali fbd1c270f7 make: emit explanation for /api isolation
Emit a slightly helpful message when /api depends on nomad internal
packages.
2020-02-03 12:22:10 -05:00
Mahmood Ali 7171488e81 run "make hclfmt" 2020-02-03 12:15:53 -05:00
Seth Hoenig 057179edea e2e: remove leftover debug println statement 2020-02-03 11:15:38 -06:00
Mahmood Ali 65fb8ba637 dev: Tweaks to cluster dev scripts
Consolidate all nomad data dir in a single root
`/tmp/nomad-dev-cluster`.  Eases clean up.

Allow running script from any path - don't require devs to cd into
`dev/cluster` directory first.

Also, block while nomad processes are running and propagate
SIGTERM/SIGINT to the nomad processes to shut them down.
2020-02-03 11:50:43 -05:00
Mahmood Ali ea18569f2c prehook: fix enterprise repo remote value 2020-02-03 11:29:25 -05:00
Mahmood Ali 12dc311659 vagrant: disable audio interference
Avoid Vagrant/VirtualBox interfering with host audio when the VM boots.
2020-02-03 11:26:41 -05:00
Mahmood Ali d43e005291
Merge pull request #7051 from hashicorp/b-copy-jobs-oss
sentinel: copy jobs to prevent mutation
2020-02-03 10:49:44 -05:00
Drew Bailey a255993ecd
update changelog 2020-02-03 09:04:09 -05:00
Drew Bailey 1c046a74d8
comment for filtering reason 2020-02-03 09:02:09 -05:00
Drew Bailey e71f132455
add test for node eligibility 2020-02-03 09:02:09 -05:00
Drew Bailey 6b492630dd
make diffSystemAllocsForNode aware of eligibility
diffSystemAllocs -> diffSystemAllocsForNode. This function is only used
for diffing system allocations, but it lacked awareness of eligible
nodes and of the node ID on which the allocation was going to be placed.

This change now ignores a change if its existing allocation is on an
ineligible node. For a new allocation, it also checks tainted and
ineligible nodes in the same function instead of nil-ing out the diff
after computation in diffSystemAllocs.
2020-02-03 09:02:08 -05:00
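The per-node rule described in 6b492630dd can be sketched as follows. This is a heavily simplified illustration, not the scheduler's actual `diffSystemAllocsForNode`: the `alloc` and `diffResult` types, the `diffForNode` name, the boolean maps, and the single `changed` flag are all assumptions introduced to keep the example small.

```go
package main

// alloc is a hypothetical, pared-down allocation record.
type alloc struct {
	Name   string
	NodeID string
}

// diffResult groups the outcome for a single node.
type diffResult struct {
	Place  []string // task group names that need a new allocation
	Update []alloc  // existing allocations to update
	Ignore []alloc  // existing allocations left untouched
}

// diffForNode computes the diff for one node. Existing allocations on an
// ineligible node are ignored even when the job changed; new placements are
// skipped when the node is tainted or ineligible.
func diffForNode(nodeID string, eligible, tainted map[string]bool,
	required []string, existing []alloc, changed bool) diffResult {

	var res diffResult
	have := make(map[string]bool)
	for _, a := range existing {
		if a.NodeID != nodeID {
			continue // belongs to another node
		}
		have[a.Name] = true
		switch {
		case !eligible[nodeID]:
			res.Ignore = append(res.Ignore, a)
		case changed:
			res.Update = append(res.Update, a)
		default:
			res.Ignore = append(res.Ignore, a)
		}
	}
	for _, name := range required {
		if have[name] {
			continue // already running here
		}
		if tainted[nodeID] || !eligible[nodeID] {
			continue // never place on tainted or ineligible nodes
		}
		res.Place = append(res.Place, name)
	}
	return res
}
```

Deciding both cases inside the per-node function avoids the earlier approach of computing a diff and discarding it afterwards.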
Drew Bailey e613a258da
ignore computed diffs if node is ineligible
test flaky, add temp sleeps for debugging

fix computed class
2020-02-03 09:02:08 -05:00
Michael Schurter 9bedd0202e sentinel: copy jobs to prevent mutation
It's unclear whether Sentinel code can mutate values passed to the eval,
so ensure it cannot by copying the job.
2020-02-03 08:48:51 -05:00
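The defensive copy described in 9bedd0202e can be sketched as below. The `Job` type here is a hypothetical two-field stand-in, and `evalPolicy` with its callback is an assumed shape for "code that might mutate its input"; neither reflects the Sentinel integration itself.

```go
package main

// Job is a hypothetical, minimal job with one reference-typed field.
type Job struct {
	ID   string
	Meta map[string]string
}

// Copy returns a deep copy so that changes to the copy's map do not reach
// the original.
func (j *Job) Copy() *Job {
	if j == nil {
		return nil
	}
	c := *j // copies value fields such as ID
	if j.Meta != nil {
		c.Meta = make(map[string]string, len(j.Meta))
		for k, v := range j.Meta {
			c.Meta[k] = v
		}
	}
	return &c
}

// evalPolicy hands the policy a copy of the job, so the caller's job is
// unchanged even if the policy code mutates what it receives.
func evalPolicy(j *Job, policy func(*Job) bool) bool {
	return policy(j.Copy())
}
```

A shallow copy (`c := *j` alone) would not be enough here, because the map field would still be shared with the original.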
Michael Lange 71688376bd
Merge pull request #7047 from hashicorp/f-ui/node-drain-icons
UI: Node drain status light icons
2020-01-31 22:45:04 -08:00
Seth Hoenig 792a014b1a
Merge pull request #7027 from hashicorp/dev-connect-acls
connect acls - rebase all the things
2020-01-31 19:37:06 -06:00
Seth Hoenig 22eb91fbdf nomad/docs: increment version number to 0.10.4 2020-01-31 19:06:46 -06:00
Seth Hoenig 8872b44b58 docs: update changelog to mention connect with acls 2020-01-31 19:06:42 -06:00
Seth Hoenig 6bfa50acdc nomad: remove unused default scheduler variable
This is from a merge conflict resolution that went the wrong direction.

I assumed the block had been added, but really it had been removed. Now,
it is removed once again.
2020-01-31 19:06:37 -06:00
Seth Hoenig d3cd6afd7e nomad: min cluster version for connect ACLs is now v0.10.4 2020-01-31 19:06:19 -06:00
Seth Hoenig db7bcba027 tests: set consul token for nomad client for testing SIDS TR hook 2020-01-31 19:06:15 -06:00
Seth Hoenig 9b20ca5b25 e2e: setup consul ACLs a little more correctly 2020-01-31 19:06:11 -06:00
Seth Hoenig 83c717a624 e2e: remove redundant extra API call for getting allocs 2020-01-31 19:06:07 -06:00
Seth Hoenig b212654b92 e2e: agent token was only being set for server0 2020-01-31 19:06:03 -06:00
Seth Hoenig f7a1e9cee3 e2e: use hclfmt on consul acls policy config files 2020-01-31 19:05:59 -06:00