Commit graph

14925 commits

Author SHA1 Message Date
Michael Schurter e07f73bfe0 client: do not restart dead tasks until server is contacted (try 2)
Refactoring of 104067bc2b2002a4e45ae7b667a476b89addc162

Switch the MarkLive method for a chan that is closed by the client.
Thanks to @notnoop for the idea!

The old approach called a method on most existing ARs and TRs on every
runAllocs call. The new approach does a once.Do call in runAllocs to
accomplish the same thing with less work. Able to remove the gate
abstraction that did much more than was needed.
2019-05-14 10:53:27 -07:00
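The channel-plus-`once.Do` approach described in this commit can be sketched roughly as follows. This is a minimal illustration, not Nomad's actual code: `serverGate`, `markContacted`, and `waitThenRestart` are hypothetical names, and the real client wires this into its alloc and task runners.

```go
package main

import (
	"fmt"
	"sync"
)

// serverGate is a hypothetical helper (not Nomad's real type). It owns a
// channel that is closed exactly once, when the server is first contacted.
type serverGate struct {
	once      sync.Once
	contacted chan struct{}
}

func newServerGate() *serverGate {
	return &serverGate{contacted: make(chan struct{})}
}

// markContacted can be called on every update from the server. Only the
// first call closes the channel; later calls are cheap no-ops.
func (g *serverGate) markContacted() {
	g.once.Do(func() { close(g.contacted) })
}

// waitThenRestart stands in for a task that failed to reattach on restore:
// it blocks until the gate opens and only then reports that it restarted.
func waitThenRestart(g *serverGate, task string, wg *sync.WaitGroup, out chan<- string) {
	defer wg.Done()
	<-g.contacted
	out <- task
}

func main() {
	g := newServerGate()
	out := make(chan string, 2)
	var wg sync.WaitGroup
	for _, task := range []string{"redis", "web"} {
		wg.Add(1)
		go waitThenRestart(g, task, &wg, out)
	}

	g.markContacted() // first contact with the server opens the gate
	g.markContacted() // repeated calls do nothing

	wg.Wait()
	close(out)
	fmt.Println("restarted tasks:", len(out))
}
```

Closing a channel wakes every waiter at once, so no per-task method call is needed on each update.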
Michael Schurter 8589233a0e drivers/mock: implement InspectTask 2019-05-14 10:53:27 -07:00
Michael Schurter d7e5ace1ed client: do not restart dead tasks until server is contacted
Fixes #1795

Running restored allocations and pulling what allocations to run from
the server happen concurrently. This means that if a client is rebooted,
and has its allocations rescheduled, it may restart the dead allocations
before it contacts the server and determines they should be dead.

This commit makes tasks that fail to reattach on restore wait until the
server is contacted before restarting.
2019-05-14 10:53:27 -07:00
Michael Schurter 2b7f398726 e2e: fix nomad service for systemd<230 2019-05-14 10:53:26 -07:00
Yishan Lin 20638e7119
Merge pull request #5703 from hashicorp/yishan/corrected-website-redirects
Fixed Spark links in redirects.txt.
2019-05-14 10:36:31 -07:00
Yishan Lin 7ffd608456 Update redirects.txt
Fixed Spark redirects post-website restructuring for the guides.
2019-05-14 08:56:13 -07:00
Michael Schurter f072c7421c
Merge pull request #5695 from hashicorp/f-squelch-logline
client: log when server list changes
2019-05-14 08:38:05 -07:00
Michael Schurter be973d32e9
Merge pull request #4590 from hashicorp/d-fix-stagger
docs: fix description of update.stagger
2019-05-14 08:36:08 -07:00
Michael Schurter f871b8998f
Merge pull request #5693 from hashicorp/docs-task-config
docs: mention regression in task config validation
2019-05-14 07:50:39 -07:00
Michael Schurter 94ab5c8b43
Merge pull request #5657 from hashicorp/docs-plugin-link
docs: add lots of links to plugin guide
2019-05-14 07:50:09 -07:00
Yishan Lin a850c3141a Added redirect for Spark guide link 2019-05-13 16:16:14 -07:00
Michael Schurter 3b1f8991a1 client: log when server list changes
Stop logging in the happy path when nothing has changed.
2019-05-13 15:42:55 -07:00
Michael Schurter 1e4330bf2b docs: mention regression in task config validation 2019-05-13 14:08:46 -07:00
Michael Schurter 48db8135da
Merge pull request #5492 from hashicorp/f-allocated-mem
client: expose allocated memory per task
2019-05-13 13:31:22 -07:00
Jasmine Dahilig e3b69ca98f
fix update to changelog 2019-05-13 13:14:01 -07:00
Jasmine Dahilig 27161d8a12
update CHANGELOG with datacenter config validation https://github.com/hashicorp/nomad/pull/5665 2019-05-13 13:10:29 -07:00
Jasmine Dahilig 30d346ca15
Merge pull request #5665 from hashicorp/b-empty-datacenters
add non-empty string validation for datacenters
2019-05-13 10:23:26 -07:00
Mahmood Ali 2ddc39973d
Merge pull request #5668 from hashicorp/flaky-test-20190430
fix flaky test by allowing for call invocation overhead
2019-05-13 12:33:44 -04:00
Lang Martin 1d03a43ce2
Merge pull request #5642 from hashicorp/b-network-fingerprinting-ipv4
network fingerprinting multiple IPs on the configured network device
2019-05-13 11:46:53 -04:00
Mahmood Ali ed5b008ed0
Merge pull request #5690 from hashicorp/f-nomad-exec-part-04-rkt
implement nomad exec for rkt
2019-05-13 10:03:55 -04:00
Mahmood Ali dd8762e348 typo: "atleast" -> "at least" 2019-05-13 10:01:19 -04:00
Mahmood Ali d1526571a5 implement nomad exec for rkt
Implement the streaming exec handler for the rkt driver
2019-05-12 18:59:00 -04:00
Danielle 495ff647de
Merge pull request #5685 from jweissig/patch-9
docs: fixed typo
2019-05-11 21:00:47 +02:00
Justin Weissig e137b7f2e3
docs: fixed typo
Fixed typo: programatic/programmatic
2019-05-11 10:40:39 -07:00
Mahmood Ali f58932afe9
Merge pull request #5634 from hashicorp/f-nomad-exec-parts-03-executors
nomad exec part 3: executor based drivers
2019-05-10 21:24:23 -04:00
Mahmood Ali b4df061fef use pty/tty terminology similar to github.com/kr/pty 2019-05-10 19:17:14 -04:00
Mahmood Ali 7fdb7564e8 vendor github.com/kr/pty 2019-05-10 19:17:14 -04:00
Mahmood Ali a4640db7a6 drivers: implement streaming exec for executor based drivers
These simply delegate the call to the backend executor.
2019-05-10 19:17:14 -04:00
Mahmood Ali 3055fd53df executors: implement streaming exec
Implements streaming exec handling in both executors (i.e. universal and
libcontainer).

For creation of TTY, some incidental complexity leaked in.  The universal
executor uses github.com/kr/pty for creation of TTYs.

On the other hand, libcontainer expects a console socket and creates the
underlying console object itself on process start.  The caller can then use
`libcontainer.utils.RecvFd()` to get the tty master end.

I chose github.com/kr/pty for managing TTYs here.  I tried
`github.com/containerd/console` package (which is already imported), but the
package did not work as expected on macOS.
2019-05-10 19:17:14 -04:00
Mahmood Ali 085d2ef759 executor: scaffolding for executor grpc handling
Prepare executor to handle streaming exec API calls that reuse drivers protobuf
structs.
2019-05-10 19:17:14 -04:00
Mahmood Ali ea241d5da7
Merge pull request #5674 from hashicorp/b-ui/flaky-client-detail-test
UI: Fixed flaky client-detail test
2019-05-10 18:51:00 -04:00
Michael Schurter 1c4e585fa7 client: expose allocated memory per task
Related to #4280

This PR adds
`client.allocs.<job>.<group>.<alloc>.<task>.memory.allocated` as a gauge
in bytes to metrics to ease calculating how close a task is to OOMing.

```
'nomad.client.allocs.memory.allocated.example.cache.6d98cbaf-d6bc-2a84-c63f-bfff8905a9d8.redis.rusty': 268435456.000
'nomad.client.allocs.memory.cache.example.cache.6d98cbaf-d6bc-2a84-c63f-bfff8905a9d8.redis.rusty': 5677056.000
'nomad.client.allocs.memory.kernel_max_usage.example.cache.6d98cbaf-d6bc-2a84-c63f-bfff8905a9d8.redis.rusty': 0.000
'nomad.client.allocs.memory.kernel_usage.example.cache.6d98cbaf-d6bc-2a84-c63f-bfff8905a9d8.redis.rusty': 0.000
'nomad.client.allocs.memory.max_usage.example.cache.6d98cbaf-d6bc-2a84-c63f-bfff8905a9d8.redis.rusty': 8908800.000
'nomad.client.allocs.memory.rss.example.cache.6d98cbaf-d6bc-2a84-c63f-bfff8905a9d8.redis.rusty': 876544.000
'nomad.client.allocs.memory.swap.example.cache.6d98cbaf-d6bc-2a84-c63f-bfff8905a9d8.redis.rusty': 0.000
'nomad.client.allocs.memory.usage.example.cache.6d98cbaf-d6bc-2a84-c63f-bfff8905a9d8.redis.rusty': 8208384.000
```
2019-05-10 11:12:12 -07:00
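As a rough illustration of how the new gauge eases the OOM calculation, the sketch below divides a usage value by the allocated value, using the sample numbers above. The helper name `memoryHeadroom` is hypothetical, and choosing `memory.usage` as the numerator is an assumption; `rss` or `max_usage` may be more appropriate depending on the driver.

```go
package main

import (
	"fmt"
	"os"
)

// memoryHeadroom is a hypothetical helper: given a usage gauge and the
// allocated gauge (both in bytes), it returns the fraction of the
// allocation in use and the bytes remaining before the limit.
func memoryHeadroom(usageBytes, allocatedBytes float64) (usedFraction, remainingBytes float64, err error) {
	if allocatedBytes <= 0 {
		return 0, 0, fmt.Errorf("allocated memory must be positive, got %v", allocatedBytes)
	}
	return usageBytes / allocatedBytes, allocatedBytes - usageBytes, nil
}

func main() {
	// Values taken from the sample metrics output:
	// memory.usage = 8208384, memory.allocated = 268435456.
	used, remaining, err := memoryHeadroom(8208384, 268435456)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("used %.2f%% of allocation, %.0f bytes remaining\n", used*100, remaining)
}
```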
Charlie Voiselle dba077d5dd
Merge pull request #5683 from hashicorp/docs-describe-sched-restart
Added Sparrow link
2019-05-10 11:25:27 -04:00
Lang Martin f6bc45dd23 client: improve a comment in updateNetworks 2019-05-10 11:25:04 -04:00
Danielle 79ced20e20 stalebot: Add 'thinking' as an exempt label (#5684) 2019-05-10 11:00:35 -04:00
Danielle d529023040
Merge pull request #5375 from hashicorp/dani/stale-issues
Setup probot/stale
2019-05-10 16:53:05 +02:00
Charlie Voiselle 1af7e4c4d7 Added Sparrow link 2019-05-10 10:35:21 -04:00
Charlie Voiselle 6f1dfbbe24
Added info about scheduler fail and success cases (#5675)
* Added info about scheduler fail and success cases

* fix "and and"; copypasta

* Moving links to bottom for consistency

* Adopted @schmichael wording
2019-05-10 10:26:43 -04:00
Mahmood Ali 0ac57895db
Merge pull request #5682 from hashicorp/b-fix-website-links-20180510
Fix website links
2019-05-10 10:08:27 -04:00
Mahmood Ali 3a3165f127 update links to use new canonical location 2019-05-10 09:41:19 -04:00
Mahmood Ali f8be7608a2 Add redirects for restructuring done in GH-5667
https://github.com/hashicorp/nomad/pull/5667 restructured lots of
guides; this adds redirects to ensure that old links work.
2019-05-10 09:41:18 -04:00
Danielle 3215d04db2
Merge pull request #5677 from jweissig/patch-8
doc: wording
2019-05-10 12:16:33 +02:00
Justin Weissig 9f9785dde8
doc: wording
Fixed wording: "as show below" -> "as shown below".
2019-05-10 00:12:55 -07:00
Mahmood Ali 919827f2df
Merge pull request #5632 from hashicorp/f-nomad-exec-parts-01-base
nomad exec part 1: plumbing and docker driver
2019-05-09 18:09:27 -04:00
Mahmood Ali 2a555a7e74 add e2e tests for nomad exec 2019-05-09 16:49:08 -04:00
Mahmood Ali 09931bcdce add api support for nomad exec
Adds nomad exec support in our API, by hitting the websocket endpoint.

We introduce API structs that correspond to the drivers streaming exec structs.

For creating the websocket connection, we reuse the transport setting from api
http client.
2019-05-09 16:49:08 -04:00
Mahmood Ali 13c83ee38e drivers/docker: implement streaming exec 2019-05-09 16:49:08 -04:00
Mahmood Ali 57c18fec4e Add basic drivers conformance tests
Add consolidated testing package to serve as conformance tests for all drivers.
2019-05-09 16:49:08 -04:00
Mahmood Ali 050e88c668 vendor github.com/gorilla/websocket 2019-05-09 16:49:08 -04:00
Mahmood Ali 66982a1660 agent: add websocket handler for nomad exec
This adds a websocket endpoint for handling `nomad exec`.

The endpoint is a websocket interface because we require bi-directional
streaming (to handle both input and output), which plain HTTP 1.0 is not
well suited for. Using websocket also makes implementing the web UI a bit
simpler. I considered using Go's HTTP hijack capability to treat the HTTP
request as a plain connection, but that could make the web interface too
complicated.

Furthermore, the API endpoint operates against the raw core nomad exec streaming
data structures, defined in protobuf, with a JSON serializer.  Our APIs use JSON
interfaces in general, and protobuf generates JSON-friendly Go structs.
Reusing the structs here simplifies the interface and reduces conversion overhead.
2019-05-09 16:49:08 -04:00
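The idea of exchanging JSON-serialized frames over one bi-directional connection can be sketched as below. This is only an illustration under stated assumptions: `execFrame` is a hypothetical message shape rather than the protobuf-derived structs the commit refers to, and `net.Pipe` stands in for the websocket connection so the example needs nothing beyond the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"os"
)

// execFrame is a hypothetical frame, NOT Nomad's real exec struct.
// []byte fields are base64-encoded by encoding/json.
type execFrame struct {
	Stdin  []byte `json:"stdin,omitempty"`
	Stdout []byte `json:"stdout,omitempty"`
	Exited bool   `json:"exited,omitempty"`
}

// serveEcho plays the remote side: it echoes each stdin frame back as a
// stdout frame. A frame with no stdin marks end of input in this sketch.
func serveEcho(conn net.Conn) {
	defer conn.Close()
	dec, enc := json.NewDecoder(conn), json.NewEncoder(conn)
	for {
		var in execFrame
		if err := dec.Decode(&in); err != nil {
			return
		}
		if len(in.Stdin) == 0 {
			enc.Encode(execFrame{Exited: true})
			return
		}
		if err := enc.Encode(execFrame{Stdout: in.Stdin}); err != nil {
			return
		}
	}
}

// runSession sends each input over the connection and collects the
// output frames, reading and writing on the same connection.
func runSession(inputs []string) ([]string, error) {
	client, server := net.Pipe() // stand-in for a websocket connection
	go serveEcho(server)
	defer client.Close()

	enc, dec := json.NewEncoder(client), json.NewDecoder(client)
	var outputs []string
	for _, s := range inputs {
		if err := enc.Encode(execFrame{Stdin: []byte(s)}); err != nil {
			return nil, err
		}
		var out execFrame
		if err := dec.Decode(&out); err != nil {
			return nil, err
		}
		outputs = append(outputs, string(out.Stdout))
	}
	// An empty frame signals end of input; expect an exit frame back.
	if err := enc.Encode(execFrame{}); err != nil {
		return nil, err
	}
	var last execFrame
	if err := dec.Decode(&last); err != nil {
		return nil, err
	}
	if !last.Exited {
		return nil, fmt.Errorf("expected exit frame")
	}
	return outputs, nil
}

func main() {
	outs, err := runSession([]string{"ls", "pwd"})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(outs)
}
```

A real client would use a websocket library in place of `net.Pipe` and would read output concurrently rather than in lockstep with input.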