3176 commits

Author SHA1 Message Date
Alex Dadgar c9765deff1 address comments 2018-10-16 16:53:29 -07:00
Alex Dadgar 80f6ce50c0 vault hook 2018-10-16 16:53:29 -07:00
Michael Schurter 30d377eba4 tr: improve skip log line 2018-10-16 16:53:29 -07:00
Michael Schurter ef213b864b tr: pass context to hooks 2018-10-16 16:53:29 -07:00
Michael Schurter 3a4f387fd3 tr: fix setting done in existing hooks 2018-10-16 16:53:29 -07:00
Michael Schurter b360f6f96e fix hclog level 2018-10-16 16:53:29 -07:00
Michael Schurter ae89b7da95 reimplement success state for tr hooks and state persistence
splits apart local and remote persistence

removes some locking *for now*
2018-10-16 16:53:29 -07:00
Michael Schurter 4f43ff5c51 pass statedb into allocrunnerv2 2018-10-16 16:53:29 -07:00
Michael Schurter 582c76a420 remove unused allocrunner shim 2018-10-16 16:53:29 -07:00
Michael Schurter c5504bd939 tr: cleanup main loop and shutdown hook impl 2018-10-16 16:53:29 -07:00
Michael Schurter 561260d6fe tr: skip error/success saving
All hooks only need to be run once.
Since only one hook can fail per run there's no need to
track errors on a per hook basis.
2018-10-16 16:53:29 -07:00
Michael Schurter 67874e761f tr: don't lock for immutable fields 2018-10-16 16:53:29 -07:00
Michael Schurter f473cd03d6 tr: start update/shutdown logic 2018-10-16 16:53:29 -07:00
Michael Schurter 637ef264ae Copy TR.Config vals to TR
I think I like this pattern better as some Config vals are mutable
(Alloc) and some aren't and some are used to derive other values and
never used directly.

Promoting them onto the TR struct is a little more work but is hopefully
more clear as to how each value is used.
2018-10-16 16:53:29 -07:00
Michael Schurter 0f7dcfdc9a example redis job "runs" on arv2! see below
Tons left to do and lots of churn:
1. No state saving
2. No shutdown or gc
3. Removed AR factory *for now*
4. Made all "Config" structs local to the package they configure
5. Added allocID to GC to avoid a lookup

Really hating how many things use *structs.Allocation. It's not bad
without state saving, but if AllocRunner starts updating its copy things
get racy fast.
2018-10-16 16:53:29 -07:00
Michael Schurter 9a6aa38b0f begin adding AllocRunner.Update 2018-10-16 16:53:29 -07:00
Michael Schurter eae54e2954 artifact task hook 2018-10-16 16:53:29 -07:00
Alex Dadgar b9bed81e6e Initial V2 alloc runner 2018-10-16 16:53:28 -07:00
Alex Dadgar a78cefec18 use int64 2018-10-16 15:34:32 -07:00
Preetha Appan 7c0d8c646c Change CPU/Disk/MemoryMB to int everywhere in new resource structs 2018-10-16 16:21:42 -05:00
Alex Dadgar 01f8e5b95f renames 2018-10-04 14:57:25 -07:00
Alex Dadgar 52f9cd7637 fixing tests 2018-10-04 14:26:19 -07:00
Alex Dadgar bac5cb1e8b Scheduler uses allocated resources 2018-10-02 17:08:25 -07:00
Alex Dadgar 5c8697667e Node reserved resources 2018-09-29 18:44:55 -07:00
Alex Dadgar 3183153315 Node resources on client 2018-09-29 17:23:41 -07:00
Alex Dadgar 9971b3393f yamux 2018-09-17 14:22:40 -07:00
Alex Dadgar ca28afa3b2 small fixes 2018-09-15 16:42:38 -07:00
Alex Dadgar 7739ef51ce agent + consul 2018-09-13 10:43:40 -07:00
Michael Schurter 08862fc177 fix race around error handling 2018-09-05 17:34:17 -07:00
Michael Schurter 6def5bc4f9 client: set host name when migrating over tls
Not setting the host name led the Go HTTP client to expect a certificate
with a DNS-resolvable name. Since Nomad uses `${role}.${region}.nomad`
names ephemeral dir migrations were broken when TLS was enabled.

Added an e2e test to ensure this doesn't break again as it's very
difficult to test and the TLS configuration is very easy to get wrong.
2018-09-05 17:24:17 -07:00
Alex Dadgar c6576ddac1 Fix make check errors 2018-09-04 16:03:52 -07:00
Alex Dadgar 089b533047 Fix kill timeout exceeding 5m on Docker driver
Fixes an issue where the Docker API client would timeout before the kill
timeout was hit.
2018-08-17 16:01:09 -07:00
Alex Dadgar 49a1ba9297 Merge pull request #4535 from hashicorp/f-keep-docker-container-0.8.4
Option to prevent removal of container on exit
2018-07-26 11:11:22 -07:00
Charlie Voiselle f319a149cd Option to prevent removal of container on exit 2018-07-26 11:10:48 -07:00
Michael Schurter ddf948001e Merge pull request #4462 from omame/omame/cpu_cfs_period
Add support for specifying cpu_cfs_period in the Docker driver
2018-07-25 09:34:38 -07:00
Daniele Valeriani b0a14caca2 Add test for cpu_cfs_period 2018-07-16 22:43:34 +02:00
Michael Schurter 91588cb861 rkt: revert to redis 3.2 to favor stability 2018-07-09 16:15:32 -07:00
Michael Schurter c56f899ee9 rkt: speed up tests
Disable networking when it's not needed and improve failure message for
UserGroup test by including the full ps output on failure.
2018-07-09 14:02:27 -07:00
Michael Schurter a1d4f77ce0 rkt: skip retrieving network information when net=none
Even when net=none we would attempt to retrieve network information from
rkt which would spew useless log lines such as:

```
testlog.go:30: 20:37:31.409209 [DEBUG] driver.rkt: failed getting network info for pod UUID 8303cfe6-0c10-4288-84f5-cb79ad6dbf1c attempt 2: no networks found. Sleeping for 970ms
```

It would also delay tests for ~60s during the network information retry
period.

So skip this when net=none. It's unlikely anyone actually uses net=none
outside of tests, so I doubt anyone will notice this change.

Official docs:
https://coreos.com/rkt/docs/latest/networking/overview.html#no-loopback-only-networking
2018-07-09 13:44:43 -07:00
Michael Schurter 0fbc84b81d tests: make alloc id consistent in helper
It worked, but the old code used a different alloc id for the path than
the actual alloc! Use the same alloc id everywhere to prevent confusing
test output.
2018-07-09 13:37:35 -07:00
Michael Schurter f3b8815c96 rkt: fix failing TestRktDriver_UserGroup test
Started failing due to the docker redis image switching from Debian
jessie to stretch:
53f8680550 (diff-acff46b161a3b7d6ed01ba79a032acc9)

Switched from Debian based image to Alpine to get a working `ps` command
again (albeit busybox's stripped down implementation)
2018-07-09 12:19:02 -07:00
Daniele Valeriani 748f6afd89 Validate the value of cpu_cfs_period 2018-07-02 22:30:22 +02:00
Daniele Valeriani 9364446a03 Remove an unnecessary conversion 2018-07-02 17:47:23 +02:00
Daniele Valeriani 906952a2c8 Add support for specifying cpu_cfs_period in the Docker driver 2018-07-02 16:37:04 +02:00
Preetha b567750824 Merge pull request #4392 from burdandrei/telemetry-parametrized-jobs
Parametrized/periodic jobs per child tagged metric emmision
2018-06-21 17:13:36 -05:00
Preetha 043f4c208b Merge pull request #3882 from burdandrei/telemetry-add-node-class-tag
Added node class to tagged metrics
2018-06-21 17:04:35 -05:00
Andrei Burd 444ee45aff Parametrized/periodic jobs per child tagged metric emmision 2018-06-21 10:40:56 +03:00
James Rasell 75f95ccf09 Merge branch 'master' into f_gh_4381 2018-06-19 17:51:57 +02:00
Alex Dadgar b61051b3cd Merge pull request #4409 from hashicorp/r-client-packages
Refactor client packages
2018-06-13 17:32:25 -07:00
Alex Dadgar 22757d964e lint 2018-06-13 16:06:39 -07:00