Michael Schurter
a5d3e3fb0a
Implement alloc updates in arv2
...
Updates are applied asynchronously but sequentially
2018-10-16 16:53:30 -07:00
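One common way to get "asynchronous but sequential" updates in Go is a buffered channel drained by a single goroutine. The sketch below is only an illustration of that pattern, not the arv2 code: the `updater` type, its methods, and the use of `int` as the update payload are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// updater is a hypothetical stand-in for a runner that accepts updates.
type updater struct {
	ch      chan int
	wg      sync.WaitGroup
	applied []int // only touched by the worker goroutine until Close returns
}

func newUpdater() *updater {
	u := &updater{ch: make(chan int, 16)}
	u.wg.Add(1)
	go func() {
		defer u.wg.Done()
		// A single consumer applies updates one at a time, in send order.
		for v := range u.ch {
			u.applied = append(u.applied, v)
		}
	}()
	return u
}

// Update queues an update and returns without waiting for it to be applied
// (it blocks only if the buffer is full).
func (u *updater) Update(v int) { u.ch <- v }

// Close stops accepting updates, waits for the worker, and returns what was applied.
func (u *updater) Close() []int {
	close(u.ch)
	u.wg.Wait()
	return u.applied
}

func main() {
	u := newUpdater()
	for i := 1; i <= 3; i++ {
		u.Update(i)
	}
	fmt.Println(u.Close())
}
```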
Michael Schurter
39b3f3a85b
call handle.Network() instead of storing it
2018-10-16 16:53:30 -07:00
Michael Schurter
7132b67c1e
Add Network method to Handle interface
...
Should probably be moved to an Inspect method in the Driver Plugin world
2018-10-16 16:53:30 -07:00
Michael Schurter
a4b4d7b266
consul service hook
...
Deregistration works but is difficult to test because terminal updates
are not fully implemented in the new client/ar/tr.
2018-10-16 16:53:29 -07:00
Michael Schurter
5be982e674
restore vault client
2018-10-16 16:53:29 -07:00
Michael Schurter
ce04915c9f
log before killing tasks
2018-10-16 16:53:29 -07:00
Michael Schurter
a2bf851805
no need for TaskStateUpdated to return an error
...
also updated comments
2018-10-16 16:53:29 -07:00
Alex Dadgar
fd3bc1bd39
Update state with server
2018-10-16 16:53:29 -07:00
Alex Dadgar
bc905cc61d
Define and thread through state updating interface
2018-10-16 16:53:29 -07:00
Michael Schurter
9a63d6103d
tr: add validate task hook
2018-10-16 16:53:29 -07:00
Michael Schurter
7f4ec50906
missed locking around c.allocs access
2018-10-16 16:53:29 -07:00
Alex Dadgar
c93cfc89c0
wip
2018-10-16 16:53:29 -07:00
Alex Dadgar
7ddc0eb65c
Fix deadlock
2018-10-16 16:53:29 -07:00
Alex Dadgar
3779077052
Remove SetState from interface
2018-10-16 16:53:29 -07:00
Alex Dadgar
e1ba73b515
compile
2018-10-16 16:53:29 -07:00
Michael Schurter
6ebdf532ea
wip split event emitting and state transitions
2018-10-16 16:53:29 -07:00
Michael Schurter
516d641db0
client: implement all-or-nothing alloc restoration
...
Restoring calls NewAR -> Restore -> Run
NewAR now calls NewTR
AR.Restore calls TR.Restore
AR.Run calls TR.Run
2018-10-16 16:53:29 -07:00
Alex Dadgar
e401c660e7
Implement lifecycle hooks on the task runner
2018-10-16 16:53:29 -07:00
Alex Dadgar
89b4ba9cc8
comments
2018-10-16 16:53:29 -07:00
Alex Dadgar
86e81947b4
Hook renames
2018-10-16 16:53:29 -07:00
Alex Dadgar
2599cf9d74
remove comment
2018-10-16 16:53:29 -07:00
Alex Dadgar
88aa0299a9
Template hook
2018-10-16 16:53:29 -07:00
Alex Dadgar
c9765deff1
address comments
2018-10-16 16:53:29 -07:00
Alex Dadgar
80f6ce50c0
vault hook
2018-10-16 16:53:29 -07:00
Michael Schurter
30d377eba4
tr: improve skip log line
2018-10-16 16:53:29 -07:00
Michael Schurter
ef213b864b
tr: pass context to hooks
2018-10-16 16:53:29 -07:00
Michael Schurter
3a4f387fd3
tr: fix setting done in existing hooks
2018-10-16 16:53:29 -07:00
Michael Schurter
b360f6f96e
fix hclog level
2018-10-16 16:53:29 -07:00
Michael Schurter
ae89b7da95
reimplement success state for tr hooks and state persistence
...
splits apart local and remote persistence
removes some locking *for now*
2018-10-16 16:53:29 -07:00
Michael Schurter
4f43ff5c51
pass statedb into allocrunnerv2
2018-10-16 16:53:29 -07:00
Michael Schurter
582c76a420
remove unused allocrunner shim
2018-10-16 16:53:29 -07:00
Michael Schurter
c5504bd939
tr: cleanup main loop and shutdown hook impl
2018-10-16 16:53:29 -07:00
Michael Schurter
561260d6fe
tr: skip error/success saving
...
All hooks only need to be run once.
Since only one hook can fail per run there's no need to
track errors on a per hook basis.
2018-10-16 16:53:29 -07:00
Michael Schurter
67874e761f
tr: don't lock for immutable fields
2018-10-16 16:53:29 -07:00
Michael Schurter
f473cd03d6
tr: start update/shutdown logic
2018-10-16 16:53:29 -07:00
Michael Schurter
637ef264ae
Copy TR.Config vals to TR
...
I think I like this pattern better as some Config vals are mutable
(Alloc) and some aren't and some are used to derive other values and
never used directly.
Promoting them onto the TR struct is a little more work but is hopefully
more clear as to how each value is used.
2018-10-16 16:53:29 -07:00
Michael Schurter
0f7dcfdc9a
example redis job "runs" on arv2! see below
...
Tons left to do and lots of churn:
1. No state saving
2. No shutdown or gc
3. Removed AR factory *for now*
4. Made all "Config" structs local to the package they configure
5. Added allocID to GC to avoid a lookup
Really hating how many things use *structs.Allocation. It's not bad
without state saving, but if AllocRunner starts updating its copy things
get racy fast.
2018-10-16 16:53:29 -07:00
Michael Schurter
9a6aa38b0f
begin adding AllocRunner.Update
2018-10-16 16:53:29 -07:00
Michael Schurter
eae54e2954
artifact task hook
2018-10-16 16:53:29 -07:00
Alex Dadgar
b9bed81e6e
Initial V2 alloc runner
2018-10-16 16:53:28 -07:00
Alex Dadgar
a78cefec18
use int64
2018-10-16 15:34:32 -07:00
Preetha Appan
7c0d8c646c
Change CPU/Disk/MemoryMB to int everywhere in new resource structs
2018-10-16 16:21:42 -05:00
Christian Winther
0c5154100c
fix: increase log rotator line scan limit
...
When gelf/json logging is used, it's fairly easy to exceed the 16k limit, which cuts the JSON output into multiple strings.
The result is invalid JSON lines, which can cause all kinds of problems in the logging server.
This fixes https://github.com/hashicorp/nomad/issues/4699
Signed-off-by: Christian Winther <jippignu@gmail.com>
2018-10-09 18:57:18 +02:00
Alex Dadgar
01f8e5b95f
renames
2018-10-04 14:57:25 -07:00
Alex Dadgar
52f9cd7637
fixing tests
2018-10-04 14:26:19 -07:00
Alex Dadgar
bac5cb1e8b
Scheduler uses allocated resources
2018-10-02 17:08:25 -07:00
Alex Dadgar
5c8697667e
Node reserved resources
2018-09-29 18:44:55 -07:00
Alex Dadgar
3183153315
Node resources on client
2018-09-29 17:23:41 -07:00
Alex Dadgar
9971b3393f
yamux
2018-09-17 14:22:40 -07:00
Alex Dadgar
ca28afa3b2
small fixes
2018-09-15 16:42:38 -07:00
Alex Dadgar
7739ef51ce
agent + consul
2018-09-13 10:43:40 -07:00
Michael Schurter
08862fc177
fix race around error handling
2018-09-05 17:34:17 -07:00
Michael Schurter
6def5bc4f9
client: set host name when migrating over tls
...
Not setting the host name led the Go HTTP client to expect a certificate
with a DNS-resolvable name. Since Nomad uses `${role}.${region}.nomad`
names, ephemeral dir migrations were broken when TLS was enabled.
Added an e2e test to ensure this doesn't break again as it's very
difficult to test and the TLS configuration is very easy to get wrong.
2018-09-05 17:24:17 -07:00
Alex Dadgar
c6576ddac1
Fix make check errors
2018-09-04 16:03:52 -07:00
Alex Dadgar
089b533047
Fix kill timeout exceeding 5m on Docker driver
...
Fixes an issue where the Docker API client would timeout before the kill
timeout was hit.
2018-08-17 16:01:09 -07:00
Alex Dadgar
49a1ba9297
Merge pull request #4535 from hashicorp/f-keep-docker-container-0.8.4
...
Option to prevent removal of container on exit
2018-07-26 11:11:22 -07:00
Charlie Voiselle
f319a149cd
Option to prevent removal of container on exit
2018-07-26 11:10:48 -07:00
Michael Schurter
ddf948001e
Merge pull request #4462 from omame/omame/cpu_cfs_period
...
Add support for specifying cpu_cfs_period in the Docker driver
2018-07-25 09:34:38 -07:00
Daniele Valeriani
b0a14caca2
Add test for cpu_cfs_period
2018-07-16 22:43:34 +02:00
Michael Schurter
91588cb861
rkt: revert to redis 3.2 to favor stability
2018-07-09 16:15:32 -07:00
Michael Schurter
c56f899ee9
rkt: speed up tests
...
Disable networking when it's not needed and improve failure message for
UserGroup test by including the full ps output on failure.
2018-07-09 14:02:27 -07:00
Michael Schurter
a1d4f77ce0
rkt: skip retrieving network information when net=none
...
Even when net=none we would attempt to retrieve network information from
rkt which would spew useless log lines such as:
```
testlog.go:30: 20:37:31.409209 [DEBUG] driver.rkt: failed getting network info for pod UUID 8303cfe6-0c10-4288-84f5-cb79ad6dbf1c attempt 2: no networks found. Sleeping for 970ms
```
It would also delay tests for ~60s during the network information retry
period.
So skip this when net=none. It's unlikely anyone actually uses net=none
outside of tests, so I doubt anyone will notice this change.
Official docs:
https://coreos.com/rkt/docs/latest/networking/overview.html#no-loopback-only-networking
2018-07-09 13:44:43 -07:00
Michael Schurter
0fbc84b81d
tests: make alloc id consistent in helper
...
It worked, but the old code used a different alloc id for the path than
the actual alloc! Use the same alloc id everywhere to prevent confusing
test output.
2018-07-09 13:37:35 -07:00
Michael Schurter
f3b8815c96
rkt: fix failing TestRktDriver_UserGroup test
...
Started failing due to the docker redis image switching from Debian
jessie to stretch:
53f8680550 (diff-acff46b161a3b7d6ed01ba79a032acc9)
Switched from Debian based image to Alpine to get a working `ps` command
again (albeit busybox's stripped down implementation)
2018-07-09 12:19:02 -07:00
Daniele Valeriani
748f6afd89
Validate the value of cpu_cfs_period
2018-07-02 22:30:22 +02:00
Daniele Valeriani
9364446a03
Remove an unnecessary conversion
2018-07-02 17:47:23 +02:00
Daniele Valeriani
906952a2c8
Add support for specifying cpu_cfs_period in the Docker driver
2018-07-02 16:37:04 +02:00
Preetha
b567750824
Merge pull request #4392 from burdandrei/telemetry-parametrized-jobs
...
Parametrized/periodic jobs per child tagged metric emission
2018-06-21 17:13:36 -05:00
Preetha
043f4c208b
Merge pull request #3882 from burdandrei/telemetry-add-node-class-tag
...
Added node class to tagged metrics
2018-06-21 17:04:35 -05:00
Andrei Burd
444ee45aff
Parametrized/periodic jobs per child tagged metric emission
2018-06-21 10:40:56 +03:00
James Rasell
75f95ccf09
Merge branch 'master' into f_gh_4381
2018-06-19 17:51:57 +02:00
Alex Dadgar
b61051b3cd
Merge pull request #4409 from hashicorp/r-client-packages
...
Refactor client packages
2018-06-13 17:32:25 -07:00
Alex Dadgar
22757d964e
lint
2018-06-13 16:06:39 -07:00
Alex Dadgar
af558df94c
Fix test using a lot of memory
2018-06-13 15:52:25 -07:00
Alex Dadgar
300b1a7a15
Tests only use testlog package logger
2018-06-13 15:40:56 -07:00
Chelsea Komlo
03075b603a
Merge pull request #4399 from hashicorp/r-reload-refactor
...
Refactor logic for dynamic reloading
2018-06-13 13:35:12 -04:00
Alex Dadgar
9bab9edf27
test fixes
2018-06-12 17:45:39 -07:00
Alex Dadgar
90c2108bfb
Fix gc tests + parallel destroy + small test fixes
2018-06-12 10:23:45 -07:00
Alex Dadgar
f5ff509fa5
Refactor - wip
2018-06-12 10:23:45 -07:00
Alex Dadgar
ff2ab8f58e
Fix vault template test
2018-06-12 09:57:28 -07:00
Alex Dadgar
d0043691fb
remove structs + bump version
2018-06-11 13:52:19 -07:00
Alex Dadgar
af5753d2cd
bump version + generated files
2018-06-11 13:39:42 -07:00
Nick Ethier
f36eb14360
Merge pull request #4403 from hashicorp/b-fix-dispatched-optional-meta
...
Fix dispatched optional meta correctly
2018-06-11 16:17:14 -04:00
Nick Ethier
e75e3ae665
nomad: use require pkg for tests
2018-06-11 13:50:50 -04:00
Nick Ethier
3aa6241b5c
client/driver/env: fix optional meta test
2018-06-11 12:29:13 -04:00
Nick Ethier
c65882cafd
client/driver/env: use 'job.Dispatch' to trigger optional meta logic
2018-06-11 12:15:19 -04:00
Nick Ethier
ccb5372813
Revert "Revert "client/driver/env: interpolate empty optional meta params as empty strings""
...
This reverts commit c17e0fc9dc5fd288935ab2b68fb441b4d25ac189.
2018-06-11 11:59:23 -04:00
Michael Schurter
c198cfd8ea
executor: fix log line formatting
2018-06-08 14:55:39 -07:00
Michael Schurter
d1a60e700e
executor: fix Windows blocking on pipe close
...
Sending the Ctrl-Break signal to PowerShell <6 causes it to drop into
debug mode. Closing its output pipe at that point will block
indefinitely and prevent the process from being killed by Nomad.
See the upstream powershell issue for details:
https://github.com/PowerShell/PowerShell/issues/4254
2018-06-08 14:48:05 -07:00
Chelsea Holland Komlo
f74e74b22d
add client logic to determine whether TLS RPC connections should reload
2018-06-08 14:38:58 -04:00
James Rasell
b9009c419c
Add 'nomad.advertise.address' to client meta via NomadFingerPrint
...
This change removes the addition of the advertise address to the
exported task env vars and instead moves this work into the
NomadFingerprint.Fingerprint which adds this value to the client
attrs. This can then be used within a Nomad job like
${attr.nomad.advertise.address}.
2018-06-08 09:44:10 +02:00
Alex Dadgar
d9b35fab52
Revert "client/driver/env: interpolate empty optional meta params as empty strings"
...
This reverts commit 84926f759a63a90be7bbcf0fad78deb3f02af23d.
2018-06-07 16:27:47 -07:00
Nick Ethier
b3c767fae0
client/driver: drop docker pull progress estimate if it's < 0
2018-06-07 15:23:31 -04:00
James Rasell
367a8b5152
Add the local clients advertise address to interpolation env vars
...
This commit adds the Nomad local client advertise address in the
form host:port to the environment variables passed to each task.
2018-06-07 09:45:15 +02:00
Alex Dadgar
98705824ed
Merge pull request #4185 from jesusvazquez/add-counter-metric-for-oom-killer-events
...
Add driver.docker counter metric for OOM Killer events
2018-06-04 15:12:51 -07:00
Alex Dadgar
23cd56dc78
remove generated structs
2018-06-01 16:11:28 -07:00
Alex Dadgar
bf5b5747ab
fix test message
2018-06-01 15:51:54 -07:00
Alex Dadgar
3e3d3c7445
Disable Exec on non-linux platforms
...
This PR disables exec on non-linux platforms
2018-06-01 15:48:14 -07:00
Alex Dadgar
c0386819b3
bump version/lint/generated files
2018-06-01 15:23:10 -07:00
Preetha Appan
ce6d4a8d7a
Fix tests and move isClient to constructor
2018-06-01 15:59:53 -05:00
Alex Dadgar
a62dd2aadb
Merge pull request #4350 from hashicorp/b-raw-exec-cgroups
...
Raw exec can use cgroups to manage PIDs
2018-06-01 17:37:49 +00:00
Alex Dadgar
8da42940c9
wait for result
2018-06-01 10:14:53 -07:00
Alex Dadgar
40fec81315
Merge pull request #4277 from hashicorp/f-retry-join-clients
...
Add go-discover support to Nomad clients
2018-06-01 16:57:40 +00:00
Alex Dadgar
460ecb8705
Comments
2018-05-31 18:05:03 -07:00
Alex Dadgar
de98774f2c
Add test and docs
2018-05-31 18:05:03 -07:00
Alex Dadgar
ff28b04c46
Use more appropriate name than cgroup
2018-05-31 18:05:03 -07:00
Alex Dadgar
37e900b1d3
Only use freezer/devices when in the basic cgroup only
2018-05-31 18:05:03 -07:00
Alex Dadgar
ffd9270f2f
Use cgroup when possible
2018-05-31 18:05:03 -07:00
Alex Dadgar
0ff0ed290d
Fix TestDockerDriver_StartNVersions
2018-05-31 17:14:59 -07:00
Alex Dadgar
7e6dd498c9
Remove debug logging
2018-05-31 15:52:42 -07:00
Alex Dadgar
b1b908527f
spelling
2018-05-31 15:29:55 -07:00
Alex Dadgar
a3b29553a5
Force close stdout/stderr after grace
...
This commit changes the force closing of the stdout/stderr file
descriptor from closing immediately to being closed after a grace
period. This allows the created process to close its own file and allows
copying of the data.
2018-05-31 15:21:36 -07:00
Alex Dadgar
5e787e2d72
test build
2018-05-31 12:22:31 -07:00
Alex Dadgar
ead1b7f423
Log more info for TestExecutor_IsolationAndConstraints
2018-05-31 11:57:44 -07:00
Alex Dadgar
b05740ad13
Merge pull request #4341 from hashicorp/f-docker-pids
...
Support Docker Pids Limit
2018-05-31 17:59:29 +00:00
Chelsea Holland Komlo
064b5481e0
add server join info to server and client
2018-05-31 10:50:03 -07:00
Alex Dadgar
f4d4bbdc97
test pid limit
2018-05-30 12:55:24 -07:00
Chelsea Holland Komlo
94d510e969
Support Docker Pids Limit
2018-05-25 19:54:14 -04:00
Alex Dadgar
1685c8ebe4
cleanup
2018-05-24 16:25:20 -07:00
Alex Dadgar
2eacdb6bd6
Force closing of pipe to child process
2018-05-24 16:03:48 -07:00
Chelsea Holland Komlo
38f611a7f2
refactor NewTLSConfiguration to pass in verifyIncoming/verifyOutgoing
...
add missing fields to TLS merge method
2018-05-23 18:35:30 -04:00
Preetha
9084bb025e
Merge pull request #4303 from hashicorp/b-docker-client-nil-panic
...
Add nil check before setting timeout on docker client
2018-05-21 19:34:44 -07:00
Jesus Vazquez
23d959e42c
Add job, task, taskgroup to open method
2018-05-21 20:37:18 +02:00
Jesus Vazquez
0a062a04c7
Remove allocID from dockerhandle struct
2018-05-21 20:33:01 +02:00
Jesus Vazquez
e5a81815bb
Rename labels job, task_group and task
2018-05-21 20:32:50 +02:00
Jesus Vazquez
ffe1b1a1b6
Remove allocid label from driver.docker.oom counter metric
2018-05-21 20:30:56 +02:00
Alex Dadgar
38762d9bde
Merge pull request #4282 from hashicorp/f-rotator
...
Avoid splitting log line across two files
2018-05-21 17:52:13 +00:00
Alex Dadgar
d95698e2c5
Merge pull request #4298 from justenwalker/docker-driver-digest-tags
...
driver/docker: pull image with digest
2018-05-21 17:46:14 +00:00
Nick Ethier
6392009dd6
client/driver: use correct repo address when using docker-credential helper (#4266)
2018-05-15 17:39:48 -04:00
Justen Walker
a8989f33bb
driver/docker: add test for dockerImageRef
2018-05-14 14:24:03 -04:00
Justen Walker
194b2231d6
driver/docker: fix up TestParseDockerImage
2018-05-14 14:23:48 -04:00
Justen Walker
25b2807ce3
driver/docker: fix TestDockerDriver_ForcePull_RepoDigest
2018-05-14 14:23:02 -04:00
Nick Ethier
c4d07a2200
client/driver: guard authHelper test from running on windows
2018-05-14 13:46:57 -04:00
Justen Walker
b23ca7574c
driver/docker: cleanup parseDockerImage
2018-05-14 11:11:51 -04:00
Justen Walker
60f7f1aa08
driver/docker: pull image with digest
...
GH #4290
Add digest support to the docker driver image config. This commit
factors out some common code to print the repo:tag (dockerImageRef) for
events/logs as well as parsing the image to retrieve the repo,tag
(parseDockerImage) so that the results are consistent/sane for both
repo:tag and repo@sha256:... references.
When pulling an image with a digest, the tag is blank and the repo
contains the digest. See:
https://github.com/fsouza/go-dockerclient/blob/master/image_test.go#L471
2018-05-14 10:42:58 -04:00
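The repo/tag split described above can be sketched as follows. This is not the driver's `parseDockerImage` or `dockerImageRef`; the functions `parseImage` and `imageRef` are hypothetical, and defaulting a missing tag to `latest` is an assumption made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// parseImage splits an image reference into repo and tag (hypothetical helper).
func parseImage(image string) (repo, tag string) {
	// Digest reference: the repo keeps the digest and the tag stays blank.
	if strings.Contains(image, "@") {
		return image, ""
	}
	// The tag follows the last colon, unless that colon is a registry port
	// (in which case a slash appears after it).
	if i := strings.LastIndex(image, ":"); i >= 0 && !strings.Contains(image[i+1:], "/") {
		return image[:i], image[i+1:]
	}
	// Assumption for this sketch: no tag means "latest".
	return image, "latest"
}

// imageRef renders repo and tag back into one string for events and logs.
func imageRef(repo, tag string) string {
	if tag == "" {
		return repo
	}
	return repo + ":" + tag
}

func main() {
	for _, img := range []string{"redis:3.2", "localhost:5000/redis", "redis@sha256:abc123"} {
		repo, tag := parseImage(img)
		fmt.Printf("%s -> repo=%q tag=%q ref=%q\n", img, repo, tag, imageRef(repo, tag))
	}
}
```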
Preetha Appan
de66ec7394
Add nil check before setting timeout on docker client
2018-05-11 17:09:26 -05:00
Alex Dadgar
7ad5c76734
Add new line test
2018-05-11 10:52:09 -07:00
Alex Dadgar
3671ed139d
Avoid splitting log line across two files
...
We attempt to avoid splitting a log line between two files by detecting
if we are near the file size limit and scanning for new lines and only
flushing those.
BenchmarkRotator/1KB-8 300000 5613 ns/op
BenchmarkRotator/2KB-8 200000 8384 ns/op
BenchmarkRotator/4KB-8 100000 14604 ns/op
BenchmarkRotator/8KB-8 50000 25002 ns/op
BenchmarkRotator/16KB-8 30000 47572 ns/op
BenchmarkRotator/32KB-8 20000 92080 ns/op
BenchmarkRotator/64KB-8 10000 165883 ns/op
BenchmarkRotator/128KB-8 5000 294405 ns/op
BenchmarkRotator/256KB-8 2000 572374 ns/op
2018-05-10 15:11:01 -07:00
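The idea of flushing only whole lines when close to the size limit might look roughly like this. It is a simplified sketch rather than the rotator's actual code: `flushLen` is a hypothetical helper, `remaining` is assumed to be non-negative, and lines longer than an entire file are not handled.

```go
package main

import (
	"bytes"
	"fmt"
)

// flushLen returns how many bytes of buf should go to the current file.
// remaining is the space left before the file reaches its size limit.
func flushLen(buf []byte, remaining int) int {
	if len(buf) <= remaining {
		return len(buf) // everything fits, no split possible
	}
	// Near the limit: flush only up to and including the last newline that fits.
	if i := bytes.LastIndexByte(buf[:remaining], '\n'); i >= 0 {
		return i + 1
	}
	// No complete line fits; flush nothing so the caller can rotate first.
	return 0
}

func main() {
	buf := []byte("a\nbb\ncc")
	n := flushLen(buf, 4)
	fmt.Printf("flush %q, carry %q to the next file\n", buf[:n], buf[n:])
}
```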
Alex Dadgar
f5d91b5338
Benchmark for rotator
...
BenchmarkRotator/1KB-8 200000 5572 ns/op
BenchmarkRotator/2KB-8 200000 8338 ns/op
BenchmarkRotator/4KB-8 100000 14246 ns/op
BenchmarkRotator/8KB-8 50000 25279 ns/op
BenchmarkRotator/16KB-8 30000 48602 ns/op
BenchmarkRotator/32KB-8 20000 92159 ns/op
BenchmarkRotator/64KB-8 10000 154766 ns/op
BenchmarkRotator/128KB-8 5000 296872 ns/op
BenchmarkRotator/256KB-8 3000 551793 ns/op
2018-05-10 14:15:15 -07:00
Nick Ethier
91603a377e
client/driver: parse repo instead of attempting to pull repo info
2018-05-09 22:34:25 -04:00
Nick Ethier
38a33f9c75
client/driver: add test for docker auth helper
2018-05-09 22:33:56 -04:00
Alex Dadgar
e067a9ae06
naming of constants
2018-05-09 16:46:52 -07:00
Chelsea Holland Komlo
796bae6f1b
allow configurable cipher suites
...
disallow 3DES and RC4 ciphers
add documentation for tls_cipher_suites
2018-05-09 17:15:31 -04:00
Alex Dadgar
0e79e1a46e
Keep stream and logs in sync for detecting closed pipe
2018-05-09 11:22:52 -07:00
Preetha
e7ae6e98d9
Merge pull request #4259 from hashicorp/f-deployment-improvements
2018-05-08 16:37:10 -05:00
Nick Ethier
3598925ca4
client/driver: use correct repo address when using docker-credential helper
2018-05-08 15:17:28 -04:00
Nick Ethier
54c86a0292
client/driver/env: interpolate empty optional meta params as empty strings
2018-05-07 20:19:51 -04:00
Nick Ethier
016ab7a105
client/driver: remove unused const 'dockerPullProgressEmitInterval'
2018-05-07 16:24:48 -04:00
Michael Schurter
f1d13683e6
consul: remove services with/without canary tags
...
Guard against Canary being set to false at the same time as an
allocation is being stopped: this could cause RemoveTask to be called
with the wrong Canary value and leaking a service.
Deleting both Canary values is the safest route.
2018-05-07 14:55:01 -05:00
Michael Schurter
50e04c976e
consul: support canary tags for services
...
Also refactor Consul ServiceClient to take a struct instead of a massive
set of arguments. Meant updating a lot of code but it should be far
easier to extend in the future as you will only need to update a single
struct instead of every single call site.
Adds an e2e test for canary tags.
2018-05-07 14:55:01 -05:00