Commit graph

15528 commits

Author SHA1 Message Date
Mahmood Ali 790e18b973
Merge pull request #5976 from hashicorp/b-consul-template-update-20190718
Update consul-template dependency to latest
2019-07-19 06:49:13 +07:00
Michael Schurter db4de5fae9
Merge pull request #5975 from hashicorp/b-check-watcher-deadlock
consul: fix deadlock in check-based restarts
2019-07-18 13:13:40 -07:00
Lang Martin ee64b00141
Merge pull request #5900 from hashicorp/f-system-sched-blocked-evals
System scheduler blocked evals
2019-07-18 16:13:15 -04:00
Lang Martin a6817359d8 jobs_test AutoRevert and AutoPromote merged differently 2019-07-18 13:37:50 -04:00
Lang Martin e3b34c35a8 jobs update stanza canonicalize and default AutoPromote 2019-07-18 13:36:40 -04:00
Lang Martin 698e9d4940 tasks_test assert merging behavior around Canonicalize 2019-07-18 13:36:06 -04:00
Michael Schurter 6d095b3b36 consul: add test for check watcher deadlock 2019-07-18 08:24:09 -07:00
Mahmood Ali acbb75635b changelog GH-5837 and GH-5948 2019-07-18 22:24:07 +07:00
Lang Martin a0fe1ffdd5 default e.getAllPids in executor_basic 2019-07-18 10:57:27 -04:00
Lang Martin f282da4ced blocked_evals_test disable calls Flush 2019-07-18 10:32:13 -04:00
Lang Martin 8f7a20839e worker comment system -> core 2019-07-18 10:32:13 -04:00
Lang Martin 83d20169f6 blocked_evals reset system evals on Flush 2019-07-18 10:32:13 -04:00
Lang Martin 6e3425babf blocked_evals_test Test_UnblockNode 2019-07-18 10:32:12 -04:00
Lang Martin ea275d5ce7 fsm attach UnblockNode on node updates 2019-07-18 10:32:12 -04:00
Lang Martin 8157a7b6f8 system_sched submits failed evals as blocked 2019-07-18 10:32:12 -04:00
Lang Martin 3bf618f217 blocked_evals system evals indexed by job and node 2019-07-18 10:32:12 -04:00
Michael Schurter 826d2503e6
Update command/agent/consul/check_watcher.go
Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>
2019-07-18 07:08:27 -07:00
Michael Schurter a3fcb8fcca e2e: debug log level for everyone! 2019-07-18 06:55:27 -07:00
Mahmood Ali cd6f1d3102 Update consul-template dependency to latest
To pick up the fix in
https://github.com/hashicorp/consul-template/pull/1231 .
2019-07-18 07:32:03 +07:00
Michael Schurter 5407584bc3 consul: fix deadlock in check-based restarts
Fixes #5395
Alternative to #5957

Make task restarting asynchronous when handling check-based restarts.
This matches the pre-0.9 behavior where TaskRunner.Restart was an
asynchronous signal. The check-based restarting code was not designed
to handle blocking in TaskRunner.Restart. 0.9 made it reentrant, which
could easily overwhelm the buffered update chan and deadlock.

Many thanks to @byronwolfman for his excellent debugging, PR, and
reproducer!

I created this alternative as changing the functionality of
TaskRunner.Restart has a much larger impact. This approach reverts to
old known-good behavior and minimizes the number of places changes are
made.
2019-07-17 15:22:21 -07:00
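The deadlock described in the commit message above can be illustrated with a minimal Go sketch. It is not Nomad's code: the `watcher` type, its `restart` and `run` methods, the `finishes` helper, the buffer size, and the timeout are all hypothetical stand-ins. The sketch assumes only what the message states: the restart call blocks, it re-enters the watcher through a buffered update channel, and the watcher loop is the only reader of that channel.

```go
package main

import (
	"fmt"
	"time"
)

// watcher is a hypothetical, simplified check watcher. Its run loop is the
// only reader of the buffered updates channel.
type watcher struct {
	updates chan string
}

// restart stands in for a blocking task restart that re-enters the watcher:
// it queues n follow-up updates and blocks whenever the buffer is full.
func (w *watcher) restart(n int) {
	for i := 0; i < n; i++ {
		w.updates <- "rewatch"
	}
}

// run handles one failed check, then drains the n follow-up updates and
// closes done once all of them have been handled.
func (w *watcher) run(n int, async bool, done chan<- struct{}) {
	if async {
		go w.restart(n) // asynchronous signal: the loop stays free to drain
	} else {
		w.restart(n) // blocks the only reader once the buffer fills
	}
	for i := 0; i < n; i++ {
		<-w.updates
	}
	close(done)
}

// finishes reports whether run completes before a short timeout.
func finishes(async bool) bool {
	w := &watcher{updates: make(chan string, 2)} // buffer smaller than n
	done := make(chan struct{})
	go w.run(5, async, done)
	select {
	case <-done:
		return true
	case <-time.After(200 * time.Millisecond):
		return false
	}
}

func main() {
	fmt.Println("synchronous restart finishes:", finishes(false))
	fmt.Println("asynchronous restart finishes:", finishes(true))
}
```

In the synchronous case the third send blocks because the two-slot buffer is full and the loop that would drain it is the caller itself, so the run never completes. Starting the restart in its own goroutine lets the loop keep reading, which is the general idea behind making the restart an asynchronous signal.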
Michael Schurter ea68c930fe e2e: enable_debug=true for all agents
Enables the pprof http endpoint for debugging.
2019-07-17 15:20:45 -07:00
Lang Martin 9d0c0c459d executor_unix and _windows stub getAllPids ByScanning 2019-07-17 17:34:06 -04:00
Lang Martin e071f6b022 executor_universal_linux getAllPids chooses cgroup when available 2019-07-17 17:33:55 -04:00
Lang Martin e1bab541ad executor use e.getAllPids() 2019-07-17 17:33:11 -04:00
Lang Martin 18597c4917 resource_container_linux new getAllPidsByCgroup 2019-07-17 17:31:36 -04:00
Lang Martin 2e981a812e pid_collector getAllPids -> getAllPidsByScanning 2019-07-17 17:31:20 -04:00
Buck Doyle 90c9b89b5e
UI: Add page titles (#5924)
This uses ember-page-title to add dynamic page titles throughout the
route hierarchy. When there’s more than one region, the current
region is added before the final entry of “- Nomad”.
2019-07-17 15:02:58 -05:00
Chris Baker 8a75afcb39
Merge pull request #5870 from hashicorp/b-nmd-1529-alloc-stop-missing-header
api: return X-Nomad-Index header on allocation stop
2019-07-17 13:25:17 -04:00
Michael Schurter 81b4b6f19b
Merge pull request #5791 from hashicorp/b-plan-snapshotindex
nomad: include snapshot index when submitting plans
2019-07-17 09:25:00 -07:00
Mahmood Ali 8a48369d38
Merge pull request #5948 from hashicorp/b-stats-recover-plugin-shutdown
Collect driver stats when driver plugins are restarted
2019-07-17 12:14:56 +08:00
Mahmood Ali 5d09b04f69
Merge pull request #5837 from hashicorp/b-consul-restore-sync-2
Avoid de-registering slowly restored services
2019-07-17 12:02:24 +08:00
Mahmood Ali 8a82260319 log unrecoverable errors 2019-07-17 11:01:59 +07:00
Mahmood Ali ec7e258d71 address review feedback 2019-07-17 10:43:13 +07:00
Lang Martin 47e725011f
Merge pull request #5960 from shvar/master
take NodeID from url in api for node eligibility
2019-07-16 16:09:12 -04:00
Yishan Lin d786b08f5f
Add interoperability support line to Nomad Downloads documentation page.
Added line around interoperability to Downloads page.
2019-07-16 10:51:22 -07:00
Yishan Lin 8662b00bdc Added line around interoperability to Nomad Downloads page. 2019-07-15 14:11:11 -07:00
Buck Doyle 9322dfc46f
UI: Add copy button for client/allocation UUIDs (#5926)
The button shows a success icon and tooltip on click, and resets
after two seconds.
2019-07-15 12:14:32 -05:00
Preetha 09e950dd48
Merge pull request #5938 from RenaudWasTaken/master
Updated the TensorRT demo to use the official NVIDIA image
2019-07-15 11:30:31 -05:00
Preetha cd1d1cb7d9
Merge pull request #5952 from cneira/jail-task-driver
Added Community task driver for FreeBSD jails
2019-07-15 11:12:25 -05:00
Eli Shvartsman 692fd19884 take NodeID from url in api for node eligibility 2019-07-15 18:34:53 +03:00
Mahmood Ali d25183d0a0 sort changelog entries 2019-07-15 10:56:47 +08:00
Mahmood Ali 116ca3e5fb changelog GH-5954 2019-07-15 10:55:31 +08:00
Mahmood Ali f40aa97954
Merge pull request #5954 from hashicorp/b-fix-streaming-rpc-tls
rpc: use tls wrapped connection for streaming rpc
2019-07-13 07:29:48 +08:00
cneira ef214a8790 fixup 2019-07-12 17:08:23 -04:00
cneira 2f7061a40f Merge branch 'jail-task-driver' of https://github.com/cneira/nomad into jail-task-driver 2019-07-12 16:52:22 -04:00
cneira 438d27c652 fixup 2019-07-12 16:52:19 -04:00
Mahmood Ali 69d3ec73d5 update changelog 2019-07-13 00:47:43 +08:00
Carlos Neira 33e1cf4ba6
Update jail-task-driver.html.md 2019-07-12 11:45:56 -04:00
Carlos Neira c9112dd9bf
Fixed LXC reference 2019-07-12 11:27:47 -04:00
Mahmood Ali ad39bcef60 rpc: use tls wrapped connection for streaming rpc
This ensures that server-to-server streaming RPC calls use the tls
wrapped connections.

Prior to this, the `streamingRpcImpl` function used tls for setting the
header and invoking the rpc method, but returned the unwrapped connection.
Thus, streaming writes failed with tls errors.

This tls streaming bug existed since 0.8.0[1], but PR #5654[2]
exacerbated it in 0.9.2.  Prior to PR #5654, nomad client used to
shuffle servers at every heartbeat -- `servers.Manager.setServers`[3]
always shuffled servers and was called by heartbeat code[4].  Shuffling
servers meant that a nomad client would heartbeat and establish a
connection against all nomad servers eventually.  When handling
streaming RPC calls, nomad servers used these local connections to
communicate directly with the client.  The server-to-server forwarding
logic was left mostly unexercised.

After PR #5654, a nomad client may connect to only a single server, so
the server-to-server streaming RPC forwarding code is exercised more
often, which unearthed the problem.

[1] https://github.com/hashicorp/nomad/blob/v0.8.0/nomad/rpc.go#L501-L515
[2] https://github.com/hashicorp/nomad/pull/5654
[3] https://github.com/hashicorp/nomad/blob/v0.9.1/client/servers/manager.go#L198-L216
[4] https://github.com/hashicorp/nomad/blob/v0.9.1/client/client.go#L1603
2019-07-12 14:41:44 +08:00
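The bug pattern described in the commit message above (handshake over the wrapped connection, then handing back the unwrapped one) can be sketched in Go as follows. This is an illustration under loud assumptions, not Nomad's `streamingRpcImpl`: it does not use `crypto/tls`, a toy XOR transform stands in for encryption, and `xorWriter`, `openStream`, `decode`, and `roundTrip` are hypothetical names.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

const key = 0x5a // toy key; XOR is a stand-in for TLS, not real encryption

// xorWriter is a hypothetical stand-in for a TLS-wrapped connection:
// everything written through it is transformed before reaching the wire.
type xorWriter struct {
	w io.Writer
}

func (x xorWriter) Write(p []byte) (int, error) {
	out := make([]byte, len(p))
	for i, b := range p {
		out[i] = b ^ key
	}
	return x.w.Write(out)
}

// openStream writes a header through the wrapped writer and then returns the
// stream the caller should keep using. With returnWrapped=false it mimics the
// bug: the header is protected, but later writes bypass the wrapper.
func openStream(raw io.Writer, returnWrapped bool) (io.Writer, error) {
	wrapped := xorWriter{w: raw}
	if _, err := wrapped.Write([]byte("HDR")); err != nil {
		return nil, err
	}
	if returnWrapped {
		return wrapped, nil
	}
	return raw, nil
}

// decode is what the peer does to everything it reads from the wire.
func decode(b []byte) string {
	out := make([]byte, len(b))
	for i, c := range b {
		out[i] = c ^ key
	}
	return string(out)
}

// roundTrip opens a stream, writes a payload, and returns what the peer sees.
func roundTrip(returnWrapped bool) string {
	var wire bytes.Buffer
	stream, err := openStream(&wire, returnWrapped)
	if err != nil {
		return ""
	}
	if _, err := stream.Write([]byte("logs")); err != nil {
		return ""
	}
	return decode(wire.Bytes())
}

func main() {
	fmt.Printf("wrapped stream:   %q\n", roundTrip(true))
	fmt.Printf("unwrapped stream: %q\n", roundTrip(false))
}
```

With the wrapped stream the peer decodes the header and payload intact. With the unwrapped stream the header decodes correctly but the payload does not, which loosely mirrors how the header and method invocation succeeded while later streaming writes failed.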