Commit graph

18179 commits

Author SHA1 Message Date
Tim Gross 0a9b32e94e
docs: fix broken internal link on job plan page (#7640) 2020-04-06 16:39:11 -04:00
Michael Lange 35d4b48644
Merge pull request #7577 from hashicorp/f-ui/csi-beta-label
UI: Change CSI to Storage and mark it as beta
2020-04-06 13:21:11 -07:00
Nick Ethier 567609e101
tr/service_hook: reset initialized flag during deregister 2020-04-06 16:05:36 -04:00
Buck Doyle f10906e006
UI: add exec handling for dead jobs/task states (#7637)
This closes #7456. It hides the terminal when the job is dead and
displays an error when trying to open an exec session for a task
that isn’t running. There’s a skipped test for the latter behaviour
that I’ll have to come back for.
2020-04-06 14:08:22 -05:00
Buck Doyle fc7de8b153
UI: add live-updating to exec sidebar (#7499)
This closes #7454. It makes use of the existing watchable tools to
allow the exec popup sidebar to be live-updating. It also adds
alphabetic sorting of task groups and tasks.
2020-04-06 13:52:42 -05:00
Drew Bailey 476ca7e7b6
Merge pull request #7638 from hashicorp/audit-docs-queryparams
add note about query params for filtering
2020-04-06 13:08:55 -04:00
Drew Bailey 71b687e7d8
add note about query params for filtering 2020-04-06 13:07:38 -04:00
Drew Bailey 4ab7c03641
Merge pull request #7618 from hashicorp/b-shutdown-delay-updates
Fixes bug that prevented group shutdown_delay updates
2020-04-06 13:05:20 -04:00
Michael Lange bf9887083b
Merge pull request #7630 from hashicorp/f-ui/csi-acceptance-tests
UI: CSI Acceptance Tests
2020-04-06 09:37:45 -07:00
Drew Bailey 0d4bb6bf92
guard against nil maps 2020-04-06 12:25:50 -04:00
Drew Bailey 0c26161e56
update changelog 2020-04-06 11:58:27 -04:00
Tim Gross 50f807060a
e2e: csi tests can only run on linux (#7635) 2020-04-06 11:57:59 -04:00
Drew Bailey 3b8afce9e6
test added and removed 2020-04-06 11:53:46 -04:00
Drew Bailey 0d550049e9
ensure shutdown delay can be removed 2020-04-06 11:33:04 -04:00
Drew Bailey 9874e7b21d
Group shutdown delay fixes
Group shutdown delay updates were not properly handled in Update hook.
This commit also ensures that plan output is displayed.
2020-04-06 11:29:12 -04:00
Tim Gross 73dc2ad443 e2e/csi: add waiting for alloc stop 2020-04-06 10:15:55 -04:00
Tim Gross 027277a0d9 csi: make volume GC in job deregister safely async
The `Job.Deregister` call will block on the client CSI controller RPCs
while the alloc still exists on the Nomad client node. So we need to
make the volume claim reaping async from the `Job.Deregister`. This
allows `nomad job stop` to return immediately. In order to make this
work, this changeset changes the volume GC so that the GC jobs are on a
by-volume basis rather than a by-job basis; we won't have to query
the (possibly deleted) job at the time of volume GC. We smuggle the
volume ID and whether it's a purge into the GC eval ID the same way we
smuggled the job ID previously.
2020-04-06 10:15:55 -04:00
Tim Gross 5a3b45864d csi: fix unpublish workflow ID mismatches
The CSI plugins uses the external volume ID for all operations, but
the Client CSI RPCs uses the Nomad volume ID (human-friendly) for the
mount paths. Pass the External ID as an arg in the RPC call so that
the unpublish workflows have it without calling back to the server to
find the external ID.

The controller CSI plugins need the CSI node ID (or in other words,
the storage provider's view of node ID like the EC2 instance ID), not
the Nomad node ID, to determine how to detach the external volume.
2020-04-06 10:15:55 -04:00
Tim Gross 161f9aedc3
scheduler: prevent a reported NPE for CSI (#7633) 2020-04-06 09:42:27 -04:00
Mahmood Ali 23be53a366
Merge pull request #7612 from hashicorp/b-auth-alloc-exec-ws
Authenticate alloc/exec websocket requests
2020-04-06 09:24:51 -04:00
Michael Lange 2955d356e7 Test coverage for the volume detail page 2020-04-04 17:13:40 -07:00
Michael Lange 25f4f5a61d Sort allocation tables by modify index 2020-04-04 17:11:58 -07:00
Michael Lange a1d2e585a1 Update breadcrumb to match side menu 2020-04-04 17:11:29 -07:00
Michael Lange 6b798518b9 Fix the allocations page compoent to support multiple prop keys
It was designed to be used this way, but allocationFor has never
worked as intended 🤦
2020-04-04 10:56:12 -07:00
Michael Lange 76bead58a3 Add page size select tests to volumes list tests 2020-04-04 09:58:34 -07:00
Michael Lange e8e41c5757 Acceptance tests for the volumes list page 2020-04-03 19:28:12 -07:00
Michael Lange 0d90d082bc Page object for volumes list 2020-04-03 19:28:11 -07:00
Michael Lange 59427662d0 Handle namespaces in the mirage handler for volumes 2020-04-03 19:28:10 -07:00
Michael Lange fb44f76800 Correctly handle the namespace query param and forbidden state 2020-04-03 19:28:09 -07:00
Michael Lange 280fa5d53b Annotate volume row and make the tr clickable 2020-04-03 19:27:44 -07:00
Michael Lange 62aa943a95 Filter out volumes that don't match the chosen namespace 2020-04-03 19:27:11 -07:00
Michael Lange 62b7a07189 Sort alphabetically, A first 2020-04-03 19:26:26 -07:00
Michael Lange 1729d41509
Merge pull request #7574 from hashicorp/f-ui/configurable-page-sizes
UI Configurable Page Sizes
2020-04-03 16:06:17 -07:00
Lang Martin 1750426d04
csi: run volume claim GC on job stop -purge (#7615)
* nomad/state/state_store: error message copy/paste error

* nomad/structs/structs: add a VolumeEval to the JobDeregisterResponse

* nomad/job_endpoint: synchronously, volumeClaimReap on job Deregister

* nomad/core_sched: make volumeClaimReap available without a CoreSched

* nomad/job_endpoint: Deregister return early if the job is missing

* nomad/job_endpoint_test: job Deregistion is idempotent

* nomad/core_sched: conditionally ignore alloc status in volumeClaimReap

* nomad/job_endpoint: volumeClaimReap all allocations, even running

* nomad/core_sched_test: extra argument to collectClaimsToGCImpl

* nomad/job_endpoint: job deregistration is not idempotent
2020-04-03 17:37:26 -04:00
Mahmood Ali 6c950e971d
Merge pull request #7622 from hashicorp/tests-deflake-TestAutopilot_RollingUpdate
tests: deflake TestAutopilot_RollingUpdate
2020-04-03 17:29:55 -04:00
Mahmood Ali 816a93ed4a tests: deflake TestAutopilot_RollingUpdate
I hypothesize that the flakiness in rolling update is due to shutting
down s3 server before s4 is properly added as a voter.

The chain of the flakiness is as follows:

1. Bootstrap with s1, s2, s3
2. Add s4
3. Wait for servers to register with 3 voting peers
   * But we already have 3 voters (s1, s2, and s3)
   * s4 is added as a non-voter in Raft v3 and must wait until autopilot promots it
4. Test proceeds without s4 being a voter
5. s3 shutdown
6. cluster changes stall due to leader election and too many pending configuration
changes (e.g. removing s3 from raft, promoting s4).

Here, I have the test wait until s4 is marked as a voter before shutting
down s3, so we don't have too many configuration changes at once.

In https://circleci.com/gh/hashicorp/nomad/57092, I noticed the
following events:

```
TestAutopilot_RollingUpdate: autopilot_test.go:204: adding server s4
    TestAutopilot_RollingUpdate: testlog.go:34: 2020-04-03T20:08:19.789Z [INFO]  nomad/serf.go:60: nomad: adding server: server="nomad-137.global (Addr: 127.0.0.1:9177) (DC: dc1)"
    TestAutopilot_RollingUpdate: testlog.go:34: 2020-04-03T20:08:19.789Z [INFO]  raft/raft.go:1018: nomad.raft: updating configuration: command=AddNonvoter server-id=c54b5bf4-1159-34f6-032d-56aefeb08425 server-addr=127.0.0.1:9177 servers="[{Suffrage:Voter ID:df01ba65-d1b2-17a9-f792-a4459b3a7c09 Address:127.0.0.1:9171} {Suffrage:Voter ID:c3337778-811e-2675-87f5-006309888387 Address:127.0.0.1:9173} {Suffrage:Voter ID:186d5e15-c473-e2b3-b5a4-3259a84e10ef Address:127.0.0.1:9169} {Suffrage:Nonvoter ID:c54b5bf4-1159-34f6-032d-56aefeb08425 Address:127.0.0.1:9177}]"

    TestAutopilot_RollingUpdate: autopilot_test.go:218: shutting down server s3
    TestAutopilot_RollingUpdate: testlog.go:34: 2020-04-03T20:08:19.797Z [INFO]  raft/replication.go:456: nomad.raft: aborting pipeline replication: peer="{Nonvoter c54b5bf4-1159-34f6-032d-56aefeb08425 127.0.0.1:9177}"
    TestAutopilot_RollingUpdate: autopilot_test.go:235: waiting for s4 to stabalize and be promoted
    TestAutopilot_RollingUpdate: testlog.go:34: 2020-04-03T20:08:19.975Z [ERROR] raft/raft.go:1656: nomad.raft: failed to make requestVote RPC: target="{Voter c3337778-811e-2675-87f5-006309888387 127.0.0.1:9173}" error="dial tcp 127.0.0.1:9173: connect: connection refused"
    TestAutopilot_RollingUpdate: retry.go:121: autopilot_test.go:241: don't want "c3337778-811e-2675-87f5-006309888387"
        autopilot_test.go:241: didn't find map[c54b5bf4-1159-34f6-032d-56aefeb08425:true] in []raft.ServerID{"df01ba65-d1b2-17a9-f792-a4459b3a7c09", "186d5e15-c473-e2b3-b5a4-3259a84e10ef"}
```

Note how s3, c3337778, is present in the peers list in the final
failure, but s4, c54b5bf4, is added as a Nonvoter and isn't present in
the final peers list.
2020-04-03 17:15:41 -04:00
Tim Gross 966286fee5
fix encoding/decoding tags for api.Task (#7620)
When `nomad job inspect` encodes the response, if the decoded JSON
from the API doesn't exactly match the API struct, the field value
will be omitted even if it has a value. We only want the JSON struct
tag to `omitempty`.
2020-04-03 16:45:49 -04:00
Tim Gross d81797ea33
e2e: improve test reliability for CSI (#7616)
This changeset:

* adds eval status to the error messages emitted when we have
  placement failure in tests. The implementation here isn't quite
  perfect but it's a lot better than "condition not met".
* enforces the ordering of teardown of the CSI test
* doesn't pass the purge flag to one of the two CSI tests, so that we
  exercise both code paths.
2020-04-03 15:52:58 -04:00
Ben Buzbee 769a3cd8b3 Rename OCIRuntime to Runtime; allow gpu conflicts is they are the same runtime; add conflict test 2020-04-03 12:15:11 -07:00
Mahmood Ali 340f9a5e91 ui: explicit reference to window.localStorage 2020-04-03 14:31:19 -04:00
Mahmood Ali ed4c4d13a4 fixup! backend: support WS authentication handshake in alloc/exec 2020-04-03 14:20:31 -04:00
Ben Buzbee d4f26d1eee Support custom docker runtimes
This enables customers who want to use gvisor and have it configured on their clients.
2020-04-03 11:07:37 -07:00
Buck Doyle 2940aa14e5 Remove redundant step assertion 2020-04-03 12:54:47 -05:00
Buck Doyle b9a2d20445 Remove redundant pause 2020-04-03 12:53:57 -05:00
Buck Doyle e6ecd2bf4f Remove redundant assertions
These are more things that are already covered elsewhere.
2020-04-03 12:52:39 -05:00
Buck Doyle 4de1255a31 Remove redundant assertions from token exec test
This only needs to check that the token is sent, the rest of
the assertions were covered by the previous test.
2020-04-03 12:35:51 -05:00
Buck Doyle cb6f110b97 Remove intermediate storage variable 2020-04-03 12:27:03 -05:00
Buck Doyle b12f97bb81 Change to setting token directly
Most tests bypass setting the token via the UI, instead choosing
to set it in localStorage directly, because the acceptance tests
for the token UI are sufficient to exercise that part of the UI,
so this speeds up the test a bit.
2020-04-03 12:26:25 -05:00
Buck Doyle 0ec5e95f46 Add space 2020-04-03 12:21:44 -05:00
Buck Doyle fbe40a5d36
UI: add handling for exec command-editing keys (#7601)
This is a minimal implementation that closes #7463. It doesn’t include
true support for moving around within the command to edit using arrow
keys because it gets too complex when managing wrapping at the edge of
the terminal. Instead, arrow keys are ignored. It also ignores ^A and
^E, which are cursor manipulations that pose similar problems to arrow
keys. It does support ^U, which deletes the entire command.

It also allows a command to be pasted, which was previously unsupported.
This is accomplished by migrating from Xterm.js’s onKey handler to
onData, which is recommended here:
https://github.com/xtermjs/xterm.js/issues/2673#issuecomment-574897733

onData is a higher-level handler that issues events with the final
interpreted data instead of the individual key events. That means the
processing in this PR has changed from inspecting DOM key events to
inspecting their ASCII equivalents, which I’ve extracted into a utility
dictionary for use in tests and implementation.

One consequence of ignoring most control characters is that if you paste
a string that includes a control character, that character will be
stripped. It’s somewhat strange for compound sequences like arrow keys; 
if you run copy('/bin/b' + '\x1b[D' + 'ash') in a Javascript console and
paste what’s on the clipboard, you get "/bin/b[Dash". That’s because
the left arrow key, as in that centre portion of the string,
is represented by the escape character and a coded sequence. Stripping
the control character leaves the coded sequence as part of the paste.
That seems like an acceptable compromise vs either ignoring any pasted
string with control characters (confusing UX) or trying to interpret and
strip all such compound control sequences (difficult to be exhaustive).
2020-04-03 12:14:47 -05:00