Commit Graph

14303 Commits

Author SHA1 Message Date
Danielle Tomlinson 6d1112bcd7 Update changelog for new beta 3 enhancements 2019-02-26 17:23:32 +01:00
Michael Schurter e39a10a1f4 tests: move unix-specific test to its own file
Other logmon tests should be portable.
2019-02-26 07:56:44 -08:00
Mahmood Ali 45b6392d4e
tests: port some fingerprint tests from 0.8 (#5359)
Port some integration tests of driver fingerprinting.

Some tests (e.g. `TestFingerprintManager_Run_DriversInBlacklist`) have
been subsituted by more isolated tests in
`client/pluginmanager/drivermanager/manager_test.go`
2019-02-26 10:54:16 -05:00
Danielle Tomlinson e250aad31b
Merge pull request #5355 from hashicorp/dani/windows-dockerstats
docker: Support Stats on Windows
2019-02-26 16:39:48 +01:00
Michael Schurter 3b2a592e93 client: restart task on logmon failures
This code chooses to be conservative as opposed to optimal: when failing
to reattach to logmon simply return a recoverable error instead of
immediately trying to restart logmon.

The recoverable error will cause the task's restart policy to be
applied and a new logmon will be launched upon restart.

Trying to do the optimal approach of simply starting a new logmon
requires error string comparison and should be tested against a task
actively logging to assert the behavior (are writes blocked? dropped?).
2019-02-25 15:42:45 -08:00
Preetha a49f6d16e6
Merge pull request #5358 from hashicorp/ar-tests
More alloc runner tests ported from 0.8.7
2019-02-25 13:32:12 -06:00
Danielle Tomlinson 6c96c7695d
Merge pull request #5362 from hashicorp/dani/f-fingerprint-undetected
docker: Return undetected before first detection
2019-02-25 15:17:15 +01:00
Danielle Tomlinson e3dc80bea3 docker: Return undetected before first detection
This commit causes the docker driver to return undetected before it
first establishes a connection to the docker daemon.

This fixes a bug where hosts without docker installed would return as
unhealthy, rather than undetected.
2019-02-25 11:02:42 +01:00
Michael Schurter 8830b00866 client: test logmon_hook 2019-02-23 15:36:48 -08:00
Preetha Appan 43679f4ce1
More alloc runner tests ported from 0.8.7 2019-02-22 17:58:06 -06:00
Mahmood Ali 7c9b40a6f1
Merge pull request #5349 from hashicorp/port-tests-20190221
Port some 0.8.7 alloc runner tests
2019-02-22 18:23:50 -05:00
Charlie Voiselle be93fc91db
Merge pull request #5357 from hashicorp/doc-add-coc
Created .github; Added CoC
2019-02-22 14:15:26 -08:00
Charlie Voiselle 604c49beb8
Merge pull request #5344 from hashicorp/b-nexteval-for-failed-follow-up
Set NextEval when making `failed-follow-up` evals
2019-02-22 14:14:41 -08:00
Mahmood Ali 32551fb0e5 emit TaskRestartSignal event on vault restart
When Vault token expires and task is restarted, emit `TaskRestartSignal`
similar to v0.8.7
2019-02-22 15:56:14 -05:00
Mahmood Ali 8cb4bbcc08 address review comments 2019-02-22 15:56:14 -05:00
Mahmood Ali 216eaa4843 tests: port TestTaskRunner_VaultManager_Signal
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L1427
2019-02-22 15:53:04 -05:00
Mahmood Ali 8e9e732319 tests: port TestTaskRunner_VaultManager_Restart
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L1352
2019-02-22 15:53:04 -05:00
Mahmood Ali 33122ca7c0 tests: port TestTaskRunner_UnregisterConsul_Retries
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L620
2019-02-22 15:53:04 -05:00
Mahmood Ali 0128b0ce7a tests: port TestTaskRunner_Template_NewVaultToken
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L1275
2019-02-22 15:53:04 -05:00
Mahmood Ali cfb80583af tests: port TestTaskRunner_Template_Artifact
From https://github.com/hashicorp/nomad/blob/v0.8.7/client/task_runner_test.go#L1195
2019-02-22 15:52:59 -05:00
Mahmood Ali 1b14214a88 tests: port TestAllocRunner_RetryArtifact
Port TestAllocRunner_RetryArtifact from https://github.com/hashicorp/nomad/blob/v0.8.7/client/alloc_runner_test.go#L610-L672

I changed the test name because it doesn't actually test that artifact
hooks is retried
2019-02-22 15:50:39 -05:00
Mahmood Ali c827e6e05a tests: port TestAllocRunner_MoveAllocDir test 2019-02-22 15:50:39 -05:00
Preetha 911c93f7bd
Merge pull request #5350 from hashicorp/b-json-logging-meta
Support json logging for CLI output for agent
2019-02-22 13:40:56 -06:00
Preetha Appan 8f9ec85fe6
fix import order 2019-02-22 13:40:13 -06:00
Charlie Voiselle 42718cd9f5 Created .github; Added CoC 2019-02-22 10:49:05 -08:00
Charlie Voiselle 006afdca9b Added comments
* caller should created eval id
* prev/next eval used in failed-follow-up
2019-02-22 10:22:52 -08:00
Michael Schurter 0da04cfb43
Merge pull request #5325 from hashicorp/test-port-08-2
Do not wrap errors from StartTask as that loses Recoverable
2019-02-22 07:06:01 -08:00
Danielle Tomlinson 8aff115fca docker: Support stats on Windows 2019-02-22 14:19:58 +01:00
Michael Schurter a2e3ea6dc9 logmon: fix reattach configuration
There were multiple bugs here:

1. Reattach unmarshalling always returned an error because you can't
   unmarshal into a nil pointer.
2. The hook data wasn't being saved because it was put on the request
   struct, not the response struct.
3. The plugin configuration should only have reattach *or* a command
   set. Not both.
4. Setting Done=true meant the hook was never re-run on agent restart so
   reattaching was never attempted.
2019-02-21 15:32:18 -08:00
Michael Schurter f5e0dba9d1 fingerprint: improve initial fingerpint message
The initial fingerprint message is actually fairly useful, so I bumped
it to Debug and fixed the output formatting.
2019-02-21 15:32:18 -08:00
Michael Schurter 38821954b7 plugins: squelch context Canceled error logs
As far as I can tell this is the most straightforward and resilient way
to skip error logging on context cancellation with grpc streams. You
cannot compare the error against context.Canceled directly as it is of
type `*status.statusError`. The next best solution I found was:

```go
resp, err := stream.Recv()
if code, ok := err.(interface{ Code() code.Code }); ok {
	if code.Code == code.Canceled {
		return
	}
}
```

However I think checking ctx.Err() directly makes the code much easier
to read and is resilient against grpc API changes.
2019-02-21 15:32:18 -08:00
Michael Schurter 01cabdff88 client: restart on recoverable StartTask errors
Fixes restarting on recoverable errors from StartTask.

Ports TestTaskRunner_Run_RecoverableStartError from 0.8 which discovered
the bug.
2019-02-21 15:30:49 -08:00
Michael Schurter e3f321cd27 test: port TestTaskRunner_RestartSignalTask_NotRunning from 0.8 2019-02-21 15:30:49 -08:00
Michael Schurter f3aa945a00 test: port TestTaskRunner_DriverNetwork from 0.8 2019-02-21 15:30:49 -08:00
Michael Schurter 518405ac33
Merge pull request #5322 from hashicorp/b-artifact-retries
Fix regression by restarting on artifact download errors
2019-02-21 15:28:51 -08:00
Omar Khawaja 084230a12b
reformat lxc guide (#5339)
* reformat lxc guide

* reformat lxc guide
2019-02-21 17:40:22 -05:00
Michael Schurter 3a9730a890
Merge pull request #5348 from hashicorp/docs-env-json
docs: clarify use of toJSON for passwords
2019-02-21 14:36:28 -08:00
Preetha Appan 3ab2e431b6
Move logger initialization to earlier step 2019-02-21 12:41:54 -06:00
Mahmood Ali 6d30284ec9
Merge pull request #5341 from hashicorp/ci-windows-docker
Run Docker tests in Windows AppVeyor CI
2019-02-21 13:17:33 -05:00
Michael Schurter 7dd6259e93 docs: clarify use of toJSON for passwords
Fixes #5347
2019-02-21 09:47:18 -08:00
Charlie Voiselle c28c195f42 Set NextEval when making `failed-follow-up` evals
This allows users to locate failed-follow-up evals more easily
2019-02-20 16:07:11 -08:00
Danielle Tomlinson b08e6e5925
Merge pull request #5332 from hashicorp/dani/docker_logger_container_leak
docker: Avoid leaking containers during Reattach
2019-02-20 22:53:06 +01:00
Michael Schurter b7fad990b7
Merge pull request #5336 from hashicorp/test-ar-destroy
tests: port TestAllocRunner_Destroy from 0.8
2019-02-20 13:07:51 -08:00
Michael Schurter 2553800eb8 tests: port TestAllocRunner_Destroy from 0.8
Also add destroy(ar) helper to fix a bunch of shutdown races in AR
tests.
2019-02-20 12:35:09 -08:00
Danielle Tomlinson 2610e2d9ef docker: Avoid leaking containers during Reattach
Currently if a docker_logger cannot be reattached to, we will leak the
container that was being used. This is problematic if e.g using static
ports as it means you can never recover your task, or if a service is
expensive to run and will then be running without supervision.
2019-02-20 17:47:06 +01:00
Michael Schurter 6580ed668e client: don't redownload completed artifacts on retries
Track the download status of each artifact independently so that if only
one of many artifacts fails to download, completed artifacts aren't
downloaded again.
2019-02-20 08:45:12 -08:00
Danielle Tomlinson 953755ce24
Merge pull request #5335 from hashicorp/dani/docker-logger-spawn
Increase resiliency of docker driver logging
2019-02-20 17:16:05 +01:00
Michael Schurter a1645edb0b Update drivers/docker/docklog/docker_logger.go
Co-Authored-By: dantoml <dani@tomlinson.io>
2019-02-20 17:12:56 +01:00
Danielle Tomlinson 2f18441a47 docker: Respawn docker logger during recovery
Sometimes the nomad docker_logger may be killed by a service manager
when restarting the client for upgrades or reliability reasons.

Currently if this happens, we leak the users container and try to
reschedule over it.

This commit adds a new step to the recovery process that will spawn a
new docker logger process that will fetch logs from _the current
timestamp_. This is to avoid restarting users tasks because our logging
sidecar has failed.
2019-02-20 17:12:56 +01:00
Michael Schurter 908bfab4c2 client: artifact errors are retry-able
0.9.0beta2 contains a regression where artifact download errors would
not cause a task restart and instead immediately fail the task.

This restores the pre-0.9 behavior of retrying all artifact errors and
adds missing tests.
2019-02-20 07:21:27 -08:00