Commit Graph

15536 Commits

Author SHA1 Message Date
Mahmood Ali 8a82260319 log unrecoverable errors 2019-07-17 11:01:59 +07:00
Mahmood Ali ec7e258d71 address review feedback 2019-07-17 10:43:13 +07:00
Lang Martin 47e725011f
Merge pull request #5960 from shvar/master
take NodeID from url in api for node eligibility
2019-07-16 16:09:12 -04:00
Yishan Lin d786b08f5f
Add interoperability support line to Nomad Downloads documentation page.
Added line around interoperability to Downloads page.
2019-07-16 10:51:22 -07:00
Yishan Lin 8662b00bdc Added line around interoperability to Nomad Downloads page. 2019-07-15 14:11:11 -07:00
Buck Doyle 9322dfc46f
UI: Add copy button for client/allocation UUIDs (#5926)
The button shows a success icon and tooltip on click, and resets
after two seconds.
2019-07-15 12:14:32 -05:00
Preetha 09e950dd48
Merge pull request #5938 from RenaudWasTaken/master
Updated the TensorRT demo to use the official NVIDIA image
2019-07-15 11:30:31 -05:00
Preetha cd1d1cb7d9
Merge pull request #5952 from cneira/jail-task-driver
Added Community task driver for FreeBSD jails
2019-07-15 11:12:25 -05:00
Eli Shvartsman 692fd19884 take NodeID from url in api for node eligibility 2019-07-15 18:34:53 +03:00
Mahmood Ali d25183d0a0 sort changelog entries 2019-07-15 10:56:47 +08:00
Mahmood Ali 116ca3e5fb changelog GH-5954 2019-07-15 10:55:31 +08:00
Mahmood Ali f40aa97954
Merge pull request #5954 from hashicorp/b-fix-streaming-rpc-tls
rpc: use tls wrapped connection for streaming rpc
2019-07-13 07:29:48 +08:00
cneira ef214a8790 fixup 2019-07-12 17:08:23 -04:00
cneira 2f7061a40f Merge branch 'jail-task-driver' of https://github.com/cneira/nomad into jail-task-driver 2019-07-12 16:52:22 -04:00
cneira 438d27c652 fixup 2019-07-12 16:52:19 -04:00
Mahmood Ali 69d3ec73d5 update changelog 2019-07-13 00:47:43 +08:00
Carlos Neira 33e1cf4ba6
Update jail-task-driver.html.md 2019-07-12 11:45:56 -04:00
Carlos Neira c9112dd9bf
Fixed LXC reference 2019-07-12 11:27:47 -04:00
Mahmood Ali ad39bcef60 rpc: use tls wrapped connection for streaming rpc
This ensures that server-to-server streaming RPC calls use the tls
wrapped connections.

Prior to this, the `streamingRpcImpl` function used tls for setting the
header and invoking the rpc method, but returned the unwrapped connection.
Thus, streaming writes failed with tls errors.

This tls streaming bug has existed since 0.8.0[1], but PR #5654[2]
exacerbated it in 0.9.2.  Prior to PR #5654, a nomad client used to
shuffle servers at every heartbeat: `servers.Manager.setServers`[3]
always shuffled servers and was called by the heartbeat code[4].  Shuffling
servers meant that a nomad client would eventually heartbeat and establish
a connection against all nomad servers.  When handling
streaming RPC calls, nomad servers used these local connections to
communicate directly with the client.  The server-to-server forwarding
logic was left mostly unexercised.

PR #5654 means that a nomad client may connect to a single server only,
which caused the server-to-server forwarding code for streaming RPC to be
exercised more and unearthed the problem.

[1] https://github.com/hashicorp/nomad/blob/v0.8.0/nomad/rpc.go#L501-L515
[2] https://github.com/hashicorp/nomad/pull/5654
[3] https://github.com/hashicorp/nomad/blob/v0.9.1/client/servers/manager.go#L198-L216
[4] https://github.com/hashicorp/nomad/blob/v0.9.1/client/client.go#L1603
2019-07-12 14:41:44 +08:00
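The failure mode described in the commit above can be illustrated with a minimal Go sketch. It is not the Nomad code: `markedConn`, `dialBuggy`, `dialFixed`, and `transcript` are hypothetical names, and the "wrapper" only prefixes writes with `tls:` so the two paths are visible, where real code would return a TLS connection. The point is only that the header goes through the wrapper in both versions, while the buggy version hands back the raw connection for the stream that follows.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// wrapFn is a hypothetical stand-in for a TLS wrapper function.
type wrapFn func(io.ReadWriter) io.ReadWriter

// markedConn is a toy "wrapped" connection: it prefixes every write with
// "tls:" so wrapped and unwrapped writes can be told apart on the wire.
type markedConn struct{ inner io.ReadWriter }

func (m markedConn) Read(p []byte) (int, error) { return m.inner.Read(p) }

func (m markedConn) Write(p []byte) (int, error) {
	if _, err := m.inner.Write([]byte("tls:")); err != nil {
		return 0, err
	}
	return m.inner.Write(p)
}

// dialBuggy sends the header through the wrapped connection but returns the
// raw one, so later stream writes bypass the wrapper.
func dialBuggy(raw io.ReadWriter, wrap wrapFn) (io.ReadWriter, error) {
	conn := wrap(raw)
	if _, err := conn.Write([]byte("header;")); err != nil {
		return nil, err
	}
	return raw, nil
}

// dialFixed returns the same wrapped connection it used for the header.
func dialFixed(raw io.ReadWriter, wrap wrapFn) (io.ReadWriter, error) {
	conn := wrap(raw)
	if _, err := conn.Write([]byte("header;")); err != nil {
		return nil, err
	}
	return conn, nil
}

// transcript dials, writes one stream payload, and returns the bytes that
// reached the underlying buffer.
func transcript(dial func(io.ReadWriter, wrapFn) (io.ReadWriter, error)) string {
	var wire bytes.Buffer
	wrap := func(c io.ReadWriter) io.ReadWriter { return markedConn{inner: c} }
	conn, err := dial(&wire, wrap)
	if err != nil {
		return "error: " + err.Error()
	}
	if _, err := conn.Write([]byte("payload")); err != nil {
		return "error: " + err.Error()
	}
	return wire.String()
}

func main() {
	fmt.Println("buggy:", transcript(dialBuggy))
	fmt.Println("fixed:", transcript(dialFixed))
}
```

In the buggy transcript the payload reaches the wire without the marker, which corresponds to a peer that expects TLS records receiving plain bytes.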
Mahmood Ali 9c9bec62fd rpc: add positive tests for server streaming RPC 2019-07-12 14:32:52 +08:00
Omar Khawaja 22ebf2bbc1
TF config enable services (#5947)
* enable vault, consul, and nomad services to make them persistent after reboot

* update AMI
2019-07-11 22:36:58 +02:00
cneira 82baa8c5a7 Added Community task driver for FreeBSD jails 2019-07-11 13:43:16 -04:00
Preetha 0a2e21353f
Merge pull request #5912 from hashicorp/f-systemd-nofile
systemd: set a high but non-infinite fd limit
2019-07-11 12:31:12 -05:00
Mahmood Ali 1a299c7b28 client/taskrunner: fix stats retry logic
Previously, if a channel was closed, we retried the Stats call.  But if that call
failed, we went into a backoff loop without ever calling Stats again.

Here, we use a utility function for calling driverHandle.Stats call that retries
as one expects.

I aimed to preserve the logging formats but made small improvements as I saw fit.
2019-07-11 13:58:07 +08:00
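The retry behavior that the commit above describes can be sketched in Go roughly as follows. This is an illustration under stated assumptions, not the utility function from the commit: `statsFn`, `callWithRetry`, and `flakyStats` are hypothetical names, and the doubling backoff is an assumed policy. The property being shown is that every pass through the loop calls the function again instead of only waiting.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// statsFn is a hypothetical stand-in for a driver Stats call that returns a
// channel of samples.
type statsFn func() (<-chan int, error)

// callWithRetry invokes fn up to maxAttempts times, sleeping with a doubling
// backoff between failed tries. It returns the channel, the number of calls
// made, and the last error if every attempt failed.
func callWithRetry(fn statsFn, maxAttempts int, base time.Duration) (<-chan int, int, error) {
	var lastErr error
	backoff := base
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		ch, err := fn() // the call is re-issued on every iteration
		if err == nil {
			return ch, attempt, nil
		}
		lastErr = err
		if attempt < maxAttempts {
			time.Sleep(backoff)
			backoff *= 2
		}
	}
	return nil, maxAttempts, lastErr
}

// flakyStats returns a statsFn that fails the first `failures` times, plus a
// pointer to its call counter (demo helper only).
func flakyStats(failures int) (statsFn, *int) {
	calls := 0
	fn := func() (<-chan int, error) {
		calls++
		if calls <= failures {
			return nil, errors.New("transient stats failure")
		}
		ch := make(chan int)
		close(ch)
		return ch, nil
	}
	return fn, &calls
}

func main() {
	fn, calls := flakyStats(2)
	_, attempts, err := callWithRetry(fn, 5, time.Millisecond)
	fmt.Println("attempts:", attempts, "calls:", *calls, "err:", err)
}
```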
Mahmood Ali 72d81da4e0 Signal plugin shutdown for driver.TaskStats
The driver plugin stub client must call `grpcutils.HandleGrpcErr` to handle plugin
shutdown, as the other functions do.  This ensures that TaskStats returns
`ErrPluginShutdown` when the plugin shuts down.
2019-07-11 13:57:35 +08:00
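The idea in the commit above, translating a call error into a shutdown error once the plugin is gone, can be sketched with the standard library alone. This is a simplified assumption and not the behavior or signature of `grpcutils.HandleGrpcErr`: `handleErr`, `taskStats`, `errPluginShutdown`, and `demo` are hypothetical names, and a cancelled context stands in for plugin shutdown.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errPluginShutdown is a local stand-in for a shutdown sentinel error.
var errPluginShutdown = errors.New("plugin is shut down")

// handleErr is a hypothetical, simplified error translator: if the plugin's
// context is already done, any call error is reported as a shutdown.
func handleErr(pluginCtx context.Context, err error) error {
	if err == nil {
		return nil
	}
	if pluginCtx.Err() != nil {
		return errPluginShutdown
	}
	return err
}

// taskStats is a hypothetical stub method that routes its call error through
// handleErr, as the other stub methods are assumed to do.
func taskStats(pluginCtx context.Context, call func() error) error {
	return handleErr(pluginCtx, call())
}

// demo returns the error seen while the plugin is running and the error seen
// after it has shut down, for the same failing call.
func demo() (running, stopped error) {
	ctx, cancel := context.WithCancel(context.Background())
	failing := func() error { return errors.New("stream closed") }
	running = taskStats(ctx, failing)
	cancel() // simulate plugin shutdown
	stopped = taskStats(ctx, failing)
	return running, stopped
}

func main() {
	running, stopped := demo()
	fmt.Println("while running:", running)
	fmt.Println("after shutdown:", stopped)
}
```

Callers can then compare against the sentinel to decide whether to reattach to the plugin or to treat the failure as an ordinary error.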
Lang Martin b8b45711f3
Merge pull request #5784 from hashicorp/b-batch-node-dereg
batch node deregistration
2019-07-10 14:24:54 -04:00
Lang Martin 66e4b68946 Changelog 2019-07-10 13:56:57 -04:00
Lang Martin 0b97175a16 node_endpoint preserve both messages as rpcs and in raft 2019-07-10 13:56:20 -04:00
Lang Martin ee4848167c core_sched add compat comment for later removal 2019-07-10 13:56:20 -04:00
Lang Martin c13c97c6c2 structs drop deprecation warning, revert unnecessary comment change 2019-07-10 13:56:20 -04:00
Lang Martin a95225d754 NodeDeregisterBatch -> NodeBatchDeregister match JobBatch pattern 2019-07-10 13:56:20 -04:00
Lang Martin a8e72a5b68 state_store error if called without node_ids 2019-07-10 13:56:20 -04:00
Lang Martin 44cbca9b98 fsm new NodeDeregisterBatchRequestType sorted at the end of the case 2019-07-10 13:56:20 -04:00
Lang Martin b9f90701ea checklist NodeDeregisterBatchRequestType must go at the end 2019-07-10 13:56:20 -04:00
Lang Martin 91e139dcb5 structs NodeDeregisterBatchRequestType must go at the end 2019-07-10 13:56:20 -04:00
Lang Martin 1cc6b4062c fsm label batch_deregister_node metrics explicitly
Co-Authored-By: Mahmood Ali <mahmood@notnoop.com>
2019-07-10 13:56:20 -04:00
Lang Martin 909f3b0534 new file: contributing/checklist-rpc-endpoint.md 2019-07-10 13:56:20 -04:00
Lang Martin ad3549f906 core_sched use the new rpc names 2019-07-10 13:56:20 -04:00
Lang Martin ce0f03651a fsm support new NodeDeregisterBatchRequest 2019-07-10 13:56:20 -04:00
Lang Martin fa5649998e node endpoint support new NodeDeregisterBatchRequest 2019-07-10 13:56:19 -04:00
Lang Martin 683ab8d1d2 structs add NodeDeregisterBatchRequest 2019-07-10 13:56:19 -04:00
Lang Martin 82349aba5d node_endpoint argument setup 2019-07-10 13:56:19 -04:00
Lang Martin 6dbf5d7d13 fsm return an error on both NodeDeregisterRequest fields set 2019-07-10 13:56:19 -04:00
Lang Martin fbc78ba96c fsm variable names for consistency 2019-07-10 13:56:19 -04:00
Lang Martin 09fd05bd8f node_endpoint raft store then shutdown, test deprecation 2019-07-10 13:56:19 -04:00
Lang Martin 4610c70777 util simplify partitionAll 2019-07-10 13:56:19 -04:00
Lang Martin d22d9fb5b2 core_sched check ServersMeetMinimumVersion 2019-07-10 13:56:19 -04:00
Lang Martin 3bf41211fb fsm honor new and old style NodeDeregisterRequests 2019-07-10 13:56:19 -04:00
Lang Martin 3fb82e83a5 structs add back NodeDeregisterRequest.NodeID, compatibility 2019-07-10 13:56:19 -04:00
Lang Martin a4472e3d34 core_sched check ServersMeetMinimumVersion, send old node deregister 2019-07-10 13:56:19 -04:00