Commit graph

3667 commits

Author SHA1 Message Date
Mahmood Ali 32551fb0e5 emit TaskRestartSignal event on vault restart
When Vault token expires and task is restarted, emit `TaskRestartSignal`
similar to v0.8.7
2019-02-22 15:56:14 -05:00
Mahmood Ali 8cb4bbcc08 address review comments 2019-02-22 15:56:14 -05:00
Mahmood Ali 216eaa4843 tests: port TestTaskRunner_VaultManager_Signal
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L1427
2019-02-22 15:53:04 -05:00
Mahmood Ali 8e9e732319 tests: port TestTaskRunner_VaultManager_Restart
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L1352
2019-02-22 15:53:04 -05:00
Mahmood Ali 33122ca7c0 tests: port TestTaskRunner_UnregisterConsul_Retries
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L620
2019-02-22 15:53:04 -05:00
Mahmood Ali 0128b0ce7a tests: port TestTaskRunner_Template_NewVaultToken
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L1275
2019-02-22 15:53:04 -05:00
Mahmood Ali cfb80583af tests: port TestTaskRunner_Template_Artifact
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L1195
2019-02-22 15:52:59 -05:00
Mahmood Ali 1b14214a88 tests: port TestAllocRunner_RetryArtifact
Port TestAllocRunner_RetryArtifact from https://github.com/hashicorp/nomad/blob/v0.8.7/client/alloc_runner_test.go#L610-L672

I changed the test name because it doesn't actually test that artifact
hooks is retried
2019-02-22 15:50:39 -05:00
Mahmood Ali c827e6e05a tests: port TestAllocRunner_MoveAllocDir test 2019-02-22 15:50:39 -05:00
Michael Schurter a2e3ea6dc9 logmon: fix reattach configuration
There were multiple bugs here:

1. Reattach unmarshalling always returned an error because you can't
   unmarshal into a nil pointer.
2. The hook data wasn't being saved because it was put on the request
   struct, not the response struct.
3. The plugin configuration should only have reattach *or* a command
   set. Not both.
4. Setting Done=true meant the hook was never re-run on agent restart so
   reattaching was never attempted.
2019-02-21 15:32:18 -08:00
Michael Schurter f5e0dba9d1 fingerprint: improve initial fingerpint message
The initial fingerprint message is actually fairly useful, so I bumped
it to Debug and fixed the output formatting.
2019-02-21 15:32:18 -08:00
Michael Schurter 01cabdff88 client: restart on recoverable StartTask errors
Fixes restarting on recoverable errors from StartTask.

Ports TestTaskRunner_Run_RecoverableStartError from 0.8 which discovered
the bug.
2019-02-21 15:30:49 -08:00
Michael Schurter e3f321cd27 test: port TestTaskRunner_RestartSignalTask_NotRunning from 0.8 2019-02-21 15:30:49 -08:00
Michael Schurter f3aa945a00 test: port TestTaskRunner_DriverNetwork from 0.8 2019-02-21 15:30:49 -08:00
Michael Schurter 518405ac33
Merge pull request #5322 from hashicorp/b-artifact-retries
Fix regression by restarting on artifact download errors
2019-02-21 15:28:51 -08:00
Mahmood Ali 6d30284ec9
Merge pull request #5341 from hashicorp/ci-windows-docker
Run Docker tests in Windows AppVeyor CI
2019-02-21 13:17:33 -05:00
Michael Schurter 2553800eb8 tests: port TestAllocRunner_Destroy from 0.8
Also add destroy(ar) helper to fix a bunch of shutdown races in AR
tests.
2019-02-20 12:35:09 -08:00
Michael Schurter 6580ed668e client: don't redownload completed artifacts on retries
Track the download status of each artifact independently so that if only
one of many artifacts fails to download, completed artifacts aren't
downloaded again.
2019-02-20 08:45:12 -08:00
Michael Schurter 908bfab4c2 client: artifact errors are retry-able
0.9.0beta2 contains a regression where artifact download errors would
not cause a task restart and instead immediately fail the task.

This restores the pre-0.9 behavior of retrying all artifact errors and
adds missing tests.
2019-02-20 07:21:27 -08:00
Michael Schurter 79ccf00b72 tests: add new task runner test helper
Adds a new helper and removes a duplicated test.
2019-02-20 07:21:27 -08:00
Mahmood Ali 33ff8c3e8d tests: expect Docker on AppVeyor
Prepare to run docker on AppVeyor Windows environment
2019-02-20 07:41:47 -05:00
Michael Schurter 159042a1a3 client: fix setting alloc unhealthy at deadline
During the 0.9 client refactor the code to fail a deployment when the
deadline was reached was broken. This restores and tests that behavior.
2019-02-19 07:44:14 -08:00
Mahmood Ali 87be233aca
test: improve readability of duration
Co-Authored-By: schmichael <michael.schurter@gmail.com>
2019-02-14 08:12:06 -08:00
Mahmood Ali 16d3414842
test: improve failure message
Co-Authored-By: schmichael <michael.schurter@gmail.com>
2019-02-14 08:11:37 -08:00
Michael Schurter 4814f0fb0b tests: port TestTaskRunner_Download_List from 0.8 2019-02-12 15:48:04 -08:00
Michael Schurter a152e3ef17 consul: fix task deregistration hook
Broke ShutdownDelay but the test was timing dependent so it just
appeared flaky. Made the test slower so that it should never incorrectly
pass.
2019-02-12 15:36:02 -08:00
Michael Schurter 4ad879e75e tests: port TaskRunner_DeriveToken tests from 0.8 2019-02-12 15:36:02 -08:00
Michael Schurter 6743ed9fdc tests: port TestTaskRunner_BlockForVault from 0.8
Also fix race conditions in the mock vault client.
2019-02-12 13:46:09 -08:00
Michael Schurter 6c0cc65b2e simplify hcl2 parsing helper
No need to pass in the entire eval context
2019-02-04 11:07:57 -08:00
Michael Schurter fec2752fb2 client: log when allocs have been processed
Will hopefully help us catch deadlocks/livelocks/slowdowns in the
add/remove allocs pipeline which should be fast.
2019-02-04 11:07:57 -08:00
Michael Schurter 2db91425e3 Remove 0.9.0-beta2 generated files 2019-02-01 08:28:44 -08:00
Alex Dadgar 84d0afccae Generate files for 0.9.0-beta2 2019-01-30 13:31:50 -08:00
Alex Dadgar 449e582ffc
Merge pull request #5281 from hashicorp/f-affinity-weight-int
Change types of weights on spread/affinity
2019-01-30 13:25:56 -08:00
Alex Dadgar d2e5ede119 remove generated structs 2019-01-30 12:38:34 -08:00
Nick Ethier e7ea26449e
client: fix bug during 0.8 state up grade that causes external drivers to fail 2019-01-30 14:22:29 -05:00
Alex Dadgar bc804dda2e Nomad 0.9.0-beta1 generated code 2019-01-30 10:49:44 -08:00
Alex Dadgar 5062c54874 Fix usage of fsi variable 2019-01-29 14:07:55 -08:00
Alex Dadgar 6f418ebaf0 Always populate task dir environment variables
Fixes an issue where if a task was restarted after restating the client,
the task dir environment variables would not be populated. This PR fixes
this for both upgrades from 0.8.X and for normal 0.9 restarts.
2019-01-29 13:17:10 -08:00
Nick Ethier bcbed3c532
Merge pull request #5248 from hashicorp/b-rawexec-leak
Fix leaked executor in raw_exec
2019-01-28 21:18:31 -05:00
Alex Dadgar 5da21635fb Fix env templates having interpolated destinations
Fixes an issue where env templates that had interpolated destinations
would not work.

Fixes https://github.com/hashicorp/nomad/issues/5250
2019-01-28 10:28:53 -08:00
Nick Ethier 8d7a47340c
drivermanager: don't store nil reattach configs 2019-01-25 23:07:04 -05:00
Alex Dadgar d6412fd8e7 Fix double restart counting for templates
This PR fixes an issue where template restarts would count twice since
it was emitting a restarting event.
2019-01-25 15:38:13 -08:00
Nick Ethier be976d9c9a
Merge branch 'master' into f-driver-upgradepath-test
* master: (23 commits)
  tests: avoid assertion in goroutine
  spell check
  ci: run checkscripts
  tests: deflake TestRktDriver_StartWaitRecoverWaitStop
  drivers/rkt: Remove unused github.com/rkt/rkt
  drivers/rkt: allow development on non-linux
  cli: Hide `nomad docker_logger` from help output
  api: test api and structs are in sync
  goimports until make check is happy
  nil check node resources to prevent panic
  tr: use context in as select statement
  move pluginutils -> helper/pluginutils
  vet
  goimports
  gofmt
  Split hclspec
  move hclutils
  Driver tests do not use hcl2/hcl, hclspec, or hclutils
  move reattach config
  loader and singleton
  ...
2019-01-23 21:01:24 -05:00
Nick Ethier 5b9013528e
drivers: add docker upgrade path and e2e test 2019-01-23 14:44:42 -05:00
Nick Ethier a36c4320ff
Merge pull request #5227 from hashicorp/b-client-highcpu-usage
Fix bug related to high cpu usage
2019-01-23 14:27:51 -05:00
Michael Schurter 13f061a83f
Merge pull request #5196 from hashicorp/f-plugin-utils
Make plugins/shared external and make pluginutls/
2019-01-23 06:59:32 -08:00
Preetha 05bf183ba3
Merge pull request #5225 from hashicorp/b-notaskevent-terminalallocs
Don't emit task events after alloc is in a terminal DesiredState
2019-01-23 08:54:10 -06:00
Michael Schurter 32daa7b47b goimports until make check is happy 2019-01-23 06:27:14 -08:00
Nick Ethier bcc3935228
tr: use context in as select statement 2019-01-22 20:11:39 -05:00
Michael Schurter be0bab7c3f move pluginutils -> helper/pluginutils
I wanted a different color bikeshed, so I get to paint it
2019-01-22 15:50:08 -08:00
Alex Dadgar 4bdccab550 goimports 2019-01-22 15:44:31 -08:00
Alex Dadgar b7a65676fe gofmt 2019-01-22 15:43:34 -08:00
Alex Dadgar 2ca0e97361 Split hclspec 2019-01-22 15:43:34 -08:00
Alex Dadgar 5ca6dd7988 move hclutils 2019-01-22 15:43:34 -08:00
Alex Dadgar 72a5691897 Driver tests do not use hcl2/hcl, hclspec, or hclutils 2019-01-22 15:43:34 -08:00
Alex Dadgar b2c7268843 move reattach config 2019-01-22 15:11:58 -08:00
Alex Dadgar cdcd3c929c loader and singleton 2019-01-22 15:11:57 -08:00
Alex Dadgar 6c2782f037 move catalog + grpcutils 2019-01-22 15:11:57 -08:00
Preetha Appan 38422642cb
Use DesiredState to determine whether to stop sending task events 2019-01-22 16:43:32 -06:00
Preetha Appan 862c9b7de5
dont emit events for terminal allocs 2019-01-22 16:26:33 -06:00
Michael Schurter 1fa376cac6
Merge pull request #5211 from hashicorp/test-porting-08
Port some 0.8 TaskRunner tests
2019-01-22 14:05:53 -08:00
Michael Schurter 8ced0adb67 test: port TestTaskRunner_CheckWatcher_Restart
Added ability to adjust the number of events the TaskRunner keeps as
there's no way to observe all events otherwise.

Task events differ slightly from 0.8 because 0.9 emits Terminated every
time a task exits instead of only when it exits on its own (not due to
restart or kill).

0.9 does not emit Killing/Killed for restarts like 0.8 which seems fine
as `Restart Signaled/Terminated/Restarting` is more descriptive.

Original v0.8 events emitted:
```
	expected := []string{
		"Received",
		"Task Setup",
		"Started",
		"Restart Signaled",
		"Killing",
		"Killed",
		"Restarting",
		"Started",
		"Restart Signaled",
		"Killing",
		"Killed",
		"Restarting",
		"Started",
		"Restart Signaled",
		"Killing",
		"Killed",
		"Not Restarting",
	}
```
2019-01-22 09:46:46 -08:00
Michael Schurter 1719752a9d test: port RestartTask from 0.8 2019-01-22 08:08:08 -08:00
Michael Schurter 9edff19625 test: port SignalFailure test from 0.8
Also fix signal error handling in mock_driver.
2019-01-22 08:08:08 -08:00
Preetha Appan 299a5fc821
Rename TaskKillRequest/Response to TaskPreKillRequest/Response 2019-01-22 09:54:02 -06:00
Preetha Appan 5a5b9c5666
Fix log comments 2019-01-22 09:45:58 -06:00
Preetha Appan 06e15f8381
Rename TaskKillHook to TaskPreKillHook to more closely match usage
Also added/fixed comments
2019-01-22 09:41:56 -06:00
Michael Schurter 3b02af9386
Fix comment
Co-Authored-By: preetapan <preetha@hashicorp.com>
2019-01-22 09:41:21 -06:00
Preetha Appan 09291c689b
Rename TaskKillHook to TaskPreKillHook to more closely match usage
Also added/fixed comments
2019-01-22 09:41:21 -06:00
Mahmood Ali a9b73e6b86
Merge pull request #5216 from hashicorp/b-fix-tests-20180118
tests: deflake client TestFS_Logs_TaskPending test
2019-01-21 09:54:15 -05:00
Mahmood Ali d19ba5bd8e tests: deflake client TestFS_Logs_TaskPending test 2019-01-18 21:26:48 -05:00
Nick Ethier 47127de671
ar: return error from hooks if occured 2019-01-18 18:31:02 -05:00
Nick Ethier e3c6f89b9a
drivers: use consts for task handle version 2019-01-18 18:31:01 -05:00
Nick Ethier 6804450c69
cleanup code comments and small fixes from refactor 2019-01-18 18:31:01 -05:00
Nick Ethier 05bd369d1f
driver: add pre09 migration logic 2019-01-18 18:31:01 -05:00
Mahmood Ali 5df63fda7c
Merge pull request #5190 from hashicorp/f-memory-usage
Track Basic Memory Usage as reported by cgroups
2019-01-18 16:46:02 -05:00
Chris Baker 290c3f36ad set TaskGroupName in task_runner 2019-01-18 20:25:11 +00:00
Chris Baker 8917961caa documenting test for task runner failure to set TaskGroupName 2019-01-18 20:00:49 +00:00
Michael Schurter cfadacfd95
Merge pull request #5203 from hashicorp/b-terminated
client: restore Terminated event on every exit
2019-01-18 08:54:15 -08:00
Danielle Tomlinson bf21612e2b
Merge pull request #5174 from hashicorp/dani/windows
Some Windows fixes and CI
2019-01-18 11:21:53 +01:00
Preetha Appan e0b68a19c6
Fix one more place that should be using taskResources
taskResources handles new resource fields in a backwards compatible way
2019-01-17 15:52:51 -06:00
Michael Schurter a20ac7c1de client: restore Terminated event on every exit
v0.9.0-dev started emitting a Terminated event every time a task process
exited. While this wasn't true in previous versions, it's a useful task
event because it's the only place for job operators to view the task's
exit code.

This behavior is asserted in the e2e/taskevents tests.
2019-01-17 10:02:25 -08:00
Danielle Tomlinson 11c733faa8 allocwatcher: Stat_t is unavailable on win 2019-01-17 18:43:14 +01:00
Danielle Tomlinson 62e06eda56 chore: Cleanup formatting 2019-01-17 18:43:13 +01:00
Danielle Tomlinson 580b8c5dda client/fs: Skip delete-while-streaming test on win 2019-01-17 18:43:13 +01:00
Danielle Tomlinson 4dbddd0620 client/fs: windows error message for not found 2019-01-17 18:43:13 +01:00
Danielle Tomlinson 915bab2365 vaultclient: use require for error assertions 2019-01-17 18:43:13 +01:00
Danielle Tomlinson dc55d3e353 vaultclient: Update tests for vault 1.0 2019-01-17 18:43:13 +01:00
Danielle Tomlinson 7a5d511349 fingerprinter: Use HCLogger for windows 2019-01-17 18:43:13 +01:00
Danielle Tomlinson a695b3562c
Merge pull request #5193 from hashicorp/dani/logmon-reattach
logmon: Reattach to existing loggers
2019-01-16 17:34:13 +01:00
Danielle Tomlinson 99da4c780d logmon: Reattach to existing loggers
This commit prevents us from creating duplicate logmon hooks when
restoring allocations by persisting the logmon reattach config using
HookData.
2019-01-16 14:56:10 +01:00
Michael Schurter daa7d029a1 test: porting TestTaskRunner_SimpleRun_Dispatch
Porting test from 0.8 to 0.9.
2019-01-15 15:22:13 -08:00
Michael Schurter 48afda786b
Merge pull request #5187 from hashicorp/test-consul
Port a bunch of pre-0.9 Consul tests to 0.9
2019-01-15 07:41:50 -08:00
Alex Dadgar 471fdb3ccf
Merge pull request #5173 from hashicorp/b-log-levels
Plugins use parent loggers
2019-01-14 16:14:30 -08:00
Mahmood Ali 9909d98bee Track Basic Memory Usage as reported by cgroups
Track current memory usage, `memory.usage_in_bytes`, in addition to
`memory.max_memory_usage_in_bytes` and friends.  This number is closer
what Docker reports.

Related to https://github.com/hashicorp/nomad/issues/5165 .
2019-01-14 18:47:52 -05:00
Nick Ethier c619e70d39
Merge pull request #5018 from hashicorp/f-executor-stats
executor: streaming stats api
2019-01-14 15:02:35 -05:00
Michael Schurter 4e7ea460e8 test: port some pre-0.9 DeploymentHealth tests
Skipping a failing one as I need to move to some other work and don't
want to leave this work orphaned on my machine.
2019-01-14 09:56:53 -08:00
Michael Schurter ff2f23f5f9 test: assert service interpolation behavior
Ported from pre-0.9 tests.
2019-01-14 09:56:53 -08:00
Michael Schurter 5746be5844 test: add some extra logging 2019-01-14 09:56:53 -08:00
Michael Schurter e877bb6370 test: assert shutdown delay deregs first
Restore a pre-0.9 test that asserts Consul services are deregistered
before a task's shutdown delay.
2019-01-14 09:56:53 -08:00