15457 Commits

Author SHA1 Message Date
tariq87 98c8103a05
Update index.html.md 2019-07-19 09:21:20 +05:30
Mahmood Ali 790e18b973
Merge pull request #5976 from hashicorp/b-consul-template-update-20190718
Update consul-template dependency to latest
2019-07-19 06:49:13 +07:00
Michael Schurter db4de5fae9
Merge pull request #5975 from hashicorp/b-check-watcher-deadlock
consul: fix deadlock in check-based restarts
2019-07-18 13:13:40 -07:00
Lang Martin ee64b00141
Merge pull request #5900 from hashicorp/f-system-sched-blocked-evals
System scheduler blocked evals
2019-07-18 16:13:15 -04:00
Michael Schurter 6d095b3b36 consul: add test for check watcher deadlock 2019-07-18 08:24:09 -07:00
Mahmood Ali acbb75635b changelog GH-5837 and GH-5948 2019-07-18 22:24:07 +07:00
Lang Martin f282da4ced blocked_evals_test disable calls Flush 2019-07-18 10:32:13 -04:00
Lang Martin 8f7a20839e worker comment system -> core 2019-07-18 10:32:13 -04:00
Lang Martin 83d20169f6 blocked_evals reset system evals on Flush 2019-07-18 10:32:13 -04:00
Lang Martin 6e3425babf blocked_evals_test Test_UnblockNode 2019-07-18 10:32:12 -04:00
Lang Martin ea275d5ce7 fsm attach UnblockNode on node updates 2019-07-18 10:32:12 -04:00
Lang Martin 8157a7b6f8 system_sched submits failed evals as blocked 2019-07-18 10:32:12 -04:00
Lang Martin 3bf618f217 blocked_evals system evals indexed by job and node 2019-07-18 10:32:12 -04:00
Michael Schurter 826d2503e6
Update command/agent/consul/check_watcher.go
Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>
2019-07-18 07:08:27 -07:00
Mahmood Ali cd6f1d3102 Update consul-template dependency to latest
To pick up the fix in
https://github.com/hashicorp/consul-template/pull/1231.
2019-07-18 07:32:03 +07:00
Michael Schurter 5407584bc3 consul: fix deadlock in check-based restarts
Fixes #5395
Alternative to #5957

Make task restarting asynchronous when handling check-based restarts.
This matches the pre-0.9 behavior where TaskRunner.Restart was an
asynchronous signal. The check-based restarting code was not designed
to handle blocking in TaskRunner.Restart. 0.9 made it reentrant, which
could easily overwhelm the buffered update chan and cause a deadlock.

Many thanks to @byronwolfman for his excellent debugging, PR, and
reproducer!

I created this alternative as changing the functionality of
TaskRunner.Restart has a much larger impact. This approach reverts to
old known-good behavior and minimizes the number of places changes are
made.
2019-07-17 15:22:21 -07:00
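
The asynchronous restart described in the commit above can be sketched in Go roughly as follows. This is not Nomad's code: `checkUpdate`, `fakeTask`, and `runWatcher` are hypothetical names. The only assumption taken from the message is that a restart blocks while pushing updates into a buffered channel that the watcher loop itself drains. Running the restart in its own goroutine leaves the loop free to keep draining.

```go
package main

import "sync"

// checkUpdate is a hypothetical message sent to the watcher loop.
type checkUpdate struct{ checkID string }

// fakeTask is a hypothetical stand-in for a task runner whose Restart
// blocks while it re-registers its checks with the watcher.
type fakeTask struct {
	updates chan<- checkUpdate
	checks  int
}

// Restart sends one update per check and blocks until each is accepted.
func (t *fakeTask) Restart() {
	for i := 0; i < t.checks; i++ {
		t.updates <- checkUpdate{checkID: "check"}
	}
}

// runWatcher triggers one restart asynchronously and then drains the
// update channel. It returns the number of updates the loop processed.
// Calling task.Restart() synchronously here would block forever once
// checks exceeds bufSize, because nothing would be reading the channel.
func runWatcher(bufSize, checks int) int {
	updates := make(chan checkUpdate, bufSize)
	task := &fakeTask{updates: updates, checks: checks}

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		task.Restart() // asynchronous: the loop below keeps draining
	}()

	// Close the channel once the restart has finished so the loop ends.
	go func() {
		wg.Wait()
		close(updates)
	}()

	processed := 0
	for range updates {
		processed++
	}
	return processed
}
```

Calling `runWatcher(2, 10)` sends ten updates through a channel with room for two and still completes, since the sender never runs on the goroutine that drains the channel.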
Buck Doyle 90c9b89b5e
UI: Add page titles (#5924)
This uses ember-page-title to add dynamic page titles throughout the
route hierarchy. When there’s more than one region, the current
region is added before the final entry of “- Nomad”.
2019-07-17 15:02:58 -05:00
Chris Baker 8a75afcb39
Merge pull request #5870 from hashicorp/b-nmd-1529-alloc-stop-missing-header
api: return X-Nomad-Index header on allocation stop
2019-07-17 13:25:17 -04:00
Michael Schurter 81b4b6f19b
Merge pull request #5791 from hashicorp/b-plan-snapshotindex
nomad: include snapshot index when submitting plans
2019-07-17 09:25:00 -07:00
Mahmood Ali 8a48369d38
Merge pull request #5948 from hashicorp/b-stats-recover-plugin-shutdown
Collect driver stats when driver plugins are restarted
2019-07-17 12:14:56 +08:00
Mahmood Ali 5d09b04f69
Merge pull request #5837 from hashicorp/b-consul-restore-sync-2
Avoid de-registering slowly restored services
2019-07-17 12:02:24 +08:00
Mahmood Ali 8a82260319 log unrecoverable errors 2019-07-17 11:01:59 +07:00
Mahmood Ali ec7e258d71 address review feedback 2019-07-17 10:43:13 +07:00
Lang Martin 47e725011f
Merge pull request #5960 from shvar/master
take NodeID from url in api for node eligibility
2019-07-16 16:09:12 -04:00
Yishan Lin d786b08f5f
Add interoperability support line to Nomad Downloads documentation page.
Added line around interoperability to Downloads page.
2019-07-16 10:51:22 -07:00
Yishan Lin 8662b00bdc Added line around interoperability to Nomad Downloads page. 2019-07-15 14:11:11 -07:00
Buck Doyle 9322dfc46f
UI: Add copy button for client/allocation UUIDs (#5926)
The button shows a success icon and tooltip on click, and resets
after two seconds.
2019-07-15 12:14:32 -05:00
Preetha 09e950dd48
Merge pull request #5938 from RenaudWasTaken/master
Updated the TensorRT demo to use the official NVIDIA image
2019-07-15 11:30:31 -05:00
Preetha cd1d1cb7d9
Merge pull request #5952 from cneira/jail-task-driver
Added Community task driver for FreeBSD jails
2019-07-15 11:12:25 -05:00
Eli Shvartsman 692fd19884 take NodeID from url in api for node eligibility 2019-07-15 18:34:53 +03:00
Mahmood Ali d25183d0a0 sort changelog entries 2019-07-15 10:56:47 +08:00
Mahmood Ali 116ca3e5fb changelog GH-5954 2019-07-15 10:55:31 +08:00
Mahmood Ali f40aa97954
Merge pull request #5954 from hashicorp/b-fix-streaming-rpc-tls
rpc: use tls wrapped connection for streaming rpc
2019-07-13 07:29:48 +08:00
cneira ef214a8790 fixup 2019-07-12 17:08:23 -04:00
cneira 2f7061a40f Merge branch 'jail-task-driver' of https://github.com/cneira/nomad into jail-task-driver 2019-07-12 16:52:22 -04:00
cneira 438d27c652 fixup 2019-07-12 16:52:19 -04:00
Mahmood Ali 69d3ec73d5 update changelog 2019-07-13 00:47:43 +08:00
Carlos Neira 33e1cf4ba6
Update jail-task-driver.html.md 2019-07-12 11:45:56 -04:00
Carlos Neira c9112dd9bf
Fixed LXC reference 2019-07-12 11:27:47 -04:00
Mahmood Ali ad39bcef60 rpc: use tls wrapped connection for streaming rpc
This ensures that server-to-server streaming RPC calls use the tls
wrapped connections.

Prior to this, the `streamingRpcImpl` function used tls for setting the
header and invoking the rpc method, but returned the unwrapped
connection. Thus, streaming writes failed with tls errors.

This tls streaming bug has existed since 0.8.0[1], but PR #5654[2]
exacerbated it in 0.9.2.  Prior to PR #5654, a nomad client shuffled
servers at every heartbeat: `servers.Manager.setServers`[3] always
shuffled servers and was called by the heartbeat code[4].  Shuffling
servers meant that a nomad client would eventually heartbeat against,
and establish a connection to, every nomad server.  When handling
streaming RPC calls, nomad servers used these local connections to
communicate directly with the client.  The server-to-server forwarding
logic was left mostly unexercised.

After PR #5654, a nomad client may connect to only a single server.
This caused the server-to-server streaming RPC forwarding code to be
exercised more, which unearthed the problem.

[1] https://github.com/hashicorp/nomad/blob/v0.8.0/nomad/rpc.go#L501-L515
[2] https://github.com/hashicorp/nomad/pull/5654
[3] https://github.com/hashicorp/nomad/blob/v0.9.1/client/servers/manager.go#L198-L216
[4] https://github.com/hashicorp/nomad/blob/v0.9.1/client/client.go#L1603
2019-07-12 14:41:44 +08:00
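
The shape of the fix described in the commit above can be sketched in Go as follows. This is an illustration under stated assumptions, not the contents of `streamingRpcImpl`: `wrapFunc`, `taggedConn`, `openStream`, and `demoStreaming` are hypothetical names, and `taggedConn` is a toy wrapper standing in for a real TLS connection. The point it shows is that the connection used to write the header must be the same one handed back to the caller.

```go
package main

import (
	"io"
	"net"
)

// wrapFunc is a hypothetical hook that upgrades a raw connection,
// for example by wrapping it in TLS.
type wrapFunc func(net.Conn) (net.Conn, error)

// taggedConn is a toy wrapper standing in for a TLS connection. It only
// marks that wrapping happened; it performs no encryption.
type taggedConn struct{ net.Conn }

// openStream wraps raw when wrap is set, writes the header over the
// resulting connection, and returns that same connection. Returning raw
// instead would make later writes bypass the wrapper.
func openStream(raw net.Conn, wrap wrapFunc, header []byte) (net.Conn, error) {
	conn := raw
	if wrap != nil {
		wrapped, err := wrap(raw)
		if err != nil {
			raw.Close()
			return nil, err
		}
		conn = wrapped
	}
	if _, err := conn.Write(header); err != nil {
		conn.Close()
		return nil, err
	}
	return conn, nil
}

// demoStreaming reports whether openStream hands back the wrapped
// connection when a wrapper is configured.
func demoStreaming() bool {
	client, server := net.Pipe()
	defer server.Close()
	go io.Copy(io.Discard, server) // drain so Write on the pipe can complete

	wrap := func(c net.Conn) (net.Conn, error) {
		return &taggedConn{Conn: c}, nil
	}
	conn, err := openStream(client, wrap, []byte("hdr"))
	if err != nil {
		return false
	}
	defer conn.Close()

	_, ok := conn.(*taggedConn)
	return ok
}
```

In a real server the wrapper would perform a TLS handshake. The toy type keeps the sketch free of certificates while still making the returned connection's type observable.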
Mahmood Ali 9c9bec62fd rpc: add positive tests for server streaming RPC 2019-07-12 14:32:52 +08:00
Omar Khawaja 22ebf2bbc1
TF config enable services (#5947)
* enable vault, consul, and nomad services to make them persistent after reboot

* update AMI
2019-07-11 22:36:58 +02:00
cneira 82baa8c5a7 Added Community task driver for FreeBSD jails 2019-07-11 13:43:16 -04:00
Preetha 0a2e21353f
Merge pull request #5912 from hashicorp/f-systemd-nofile
systemd: set a high but non-infinite fd limit
2019-07-11 12:31:12 -05:00
Mahmood Ali 1a299c7b28 client/taskrunner: fix stats retry logic
Previously, if a channel was closed, we retried the Stats call.  But if that call
failed, we went into a backoff loop without ever calling Stats again.

Here, we use a utility function that calls driverHandle.Stats and retries
as one expects.

I aimed to preserve the logging formats but made small improvements as I saw fit.
2019-07-11 13:58:07 +08:00
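
The retry behavior described in the commit above might look something like the following Go sketch. It is not the utility function from the commit: `statsFunc`, `callStatsWithRetry`, and `demoRetry` are hypothetical names, and the doubling backoff and attempt limit are assumptions made for illustration. What it demonstrates is that every pass through the backoff loop calls Stats again.

```go
package main

import (
	"errors"
	"time"
)

// statsFunc is a hypothetical stand-in for a driver handle's Stats call.
type statsFunc func() (<-chan int, error)

// callStatsWithRetry invokes stats up to maxAttempts times (maxAttempts
// should be at least 1), sleeping with a doubling backoff between
// failures. It returns the channel, the number of attempts made, and the
// last error if every attempt failed.
func callStatsWithRetry(stats statsFunc, maxAttempts int, base time.Duration) (<-chan int, int, error) {
	var lastErr error
	backoff := base
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		ch, err := stats() // the call is re-issued on every iteration
		if err == nil {
			return ch, attempt, nil
		}
		lastErr = err
		if attempt < maxAttempts {
			time.Sleep(backoff)
			backoff *= 2
		}
	}
	return nil, maxAttempts, lastErr
}

// demoRetry simulates a Stats call that fails twice before succeeding and
// returns how many attempts were needed, or -1 if all attempts failed.
func demoRetry() int {
	calls := 0
	flaky := func() (<-chan int, error) {
		calls++
		if calls < 3 {
			return nil, errors.New("stats unavailable")
		}
		ch := make(chan int)
		close(ch)
		return ch, nil
	}
	_, attempts, err := callStatsWithRetry(flaky, 5, time.Millisecond)
	if err != nil {
		return -1
	}
	return attempts
}
```

A production version would likely also stop when a context is cancelled and cap the backoff. Both are left out to keep the sketch short.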
Mahmood Ali 72d81da4e0 Signal plugin shutdown for driver.TaskStats
The driver plugin stub client must call `grpcutils.HandleGrpcErr` to handle plugin
shutdown, as the other functions do.  This ensures that TaskStats returns
`ErrPluginShutdown` when the plugin shuts down.
2019-07-11 13:57:35 +08:00
Lang Martin b8b45711f3
Merge pull request #5784 from hashicorp/b-batch-node-dereg
batch node deregistration
2019-07-10 14:24:54 -04:00
Lang Martin 66e4b68946 Changelog 2019-07-10 13:56:57 -04:00
Lang Martin 0b97175a16 node_endpoint preserve both messages as rpcs and in raft 2019-07-10 13:56:20 -04:00
Lang Martin ee4848167c core_sched add compat comment for later removal 2019-07-10 13:56:20 -04:00