Michael Schurter
f92ffe5af5
Merge pull request #3105 from hashicorp/f-876-restart-unhealthy
...
Restart unhealthy tasks
2017-09-17 19:38:32 -07:00
epipho
a16c97394f
Fix incorrect docker stats
2017-09-16 00:43:03 -04:00
Michael Schurter
67a4a169a9
Name const after what it represents
2017-09-15 14:57:18 -07:00
Michael Schurter
79a7bf3d7c
Cleanup and test restart failure code
2017-09-15 14:54:37 -07:00
Michael Schurter
06ca379da0
Add comments
2017-09-15 14:34:36 -07:00
Michael Schurter
4dbaa52aba
Fold SetFailure into SetRestartTriggered
2017-09-14 16:48:39 -07:00
Michael Schurter
ed77c0944b
DRY up restart handling a bit.
...
All 3 error/failure cases share restart logic, but 2 of them have
special cased conditions.
2017-09-14 16:48:39 -07:00
Michael Schurter
73fb71ca10
RestartDelay isn't needed as checks are re-added on restarts
...
@dadgar made the excellent observation in #3105 that TaskRunner removes
and re-registers checks on restarts. This means checkWatcher doesn't
need to do *any* internal restart tracking. Individual checks can just
remove themselves and be re-added when the task restarts.
2017-09-14 16:48:39 -07:00
Michael Schurter
06dd86adbd
Remove unused lastStart field
2017-09-14 16:47:41 -07:00
Michael Schurter
0447f79288
Removed partially implemented allocLock
2017-09-14 16:47:41 -07:00
Michael Schurter
ade29ecbed
Improve check watcher logging and add tests
...
Also expose a mock Consul Agent to allow testing ServiceClient and
checkWatcher from TaskRunner without actually talking to a real Consul.
2017-09-14 16:47:41 -07:00
Michael Schurter
a137676358
Add comments and move delay calc to TaskRunner
2017-09-14 16:46:54 -07:00
Michael Schurter
8a87475498
Use existing restart policy infrastructure
2017-09-14 16:46:54 -07:00
Michael Schurter
22690c5f4c
Add check watcher for restarting unhealthy tasks
2017-09-14 16:46:54 -07:00
Alex Dadgar
d306da846c
changelog and feedback
2017-09-14 14:08:58 -07:00
Alex Dadgar
07ed83fdd5
Non-locked accessors to common Node fields
...
This PR removes locking around commonly accessed node attributes that do
not need to be locked. The locking could cause nodes to TTL as the
heartbeat code path was acquiring a lock that could be held for an
excessively long time. An example of this is when Vault is inaccessible,
since the fingerprint is run with a lock held but the Vault
fingerprinter makes the API calls with a large timeout.
Fixes https://github.com/hashicorp/nomad/issues/2689
2017-09-14 14:08:26 -07:00
Chelsea Komlo
536d38454b
Merge pull request #3191 from hashicorp/b-tagged-metrics-panic
...
Fix panic in emitting tagged allocation metrics
2017-09-11 14:28:50 -04:00
Armon Dadgar
d4aed839d2
Merge pull request #3185 from hashicorp/f-acl-reset
...
Add ability to reset ACL bootstrap process
2017-09-11 10:47:17 -07:00
Armon Dadgar
3d5ecaafff
Address @dadgar feedback
2017-09-11 10:30:59 -07:00
Alex Dadgar
b3958faa14
Merge pull request #3187 from hashicorp/b-windows-docker
...
Fix MemorySwappiness on Windows Docker
2017-09-11 09:56:49 -07:00
Alex Dadgar
1cd8f7523f
Merge pull request #3184 from hashicorp/b-docker-logging
...
Fix docker user specified syslogging
2017-09-11 09:31:33 -07:00
Chelsea Holland Komlo
848af92183
fix panic in emitting tagged metrics
2017-09-11 15:32:37 +00:00
Alex Dadgar
d3a9463358
Fix MemorySwappiness on Windows Docker
...
Fixes https://github.com/hashicorp/nomad/issues/3181
2017-09-10 17:46:45 -07:00
Alex Dadgar
3ec7946b3e
Fix invalid CPU stats on Windows
...
This PR fixes an issue introduced in Nomad 0.6.0 due to
https://github.com/shirou/gopsutil/issues/420 . The issue arised from the
fact that the Windows stats from gopsutil reports CPUs in
percentages where we expected ticks.
2017-09-10 15:30:48 -07:00
Alex Dadgar
637ae9580a
Fix docker user specified syslogging
2017-09-10 14:57:48 -07:00
James Nugent
448145872f
client: Guard against "NaN" values from floats
...
This commit protects against finding `0.NaN` tokens in JSON streams
because of infinity representation on serialization.
2017-09-08 16:21:07 -05:00
Alex Dadgar
31f9e099d9
Merge pull request #3148 from clinta/purge-stopped
...
Always purge stopped containers
2017-09-05 17:18:05 -07:00
Alex Dadgar
6fdaf38389
Fix repo name passed to docker credential helpers
...
This PR fixes the server url passed to docker credential helpers and
fixes stderr capture.
Fixes https://github.com/hashicorp/nomad/issues/2957
2017-09-05 16:43:21 -07:00
Alex Dadgar
21564c7c04
Parse Docker mounts correctly ( #3163 )
...
* Parse Docker mounts correctly
This PR fixes the parsing of Docker mounts and adds testing to ensure no
regressions.
Fixes https://github.com/hashicorp/nomad/issues/3156
* Review feedback
2017-09-05 14:02:57 -07:00
Chelsea Holland Komlo
0ef43c3c5f
final code review fixups
2017-09-05 18:47:44 +00:00
Chelsea Holland Komlo
dea1fa089b
fix up travis test failure via race condition
2017-09-05 15:04:59 +00:00
Chelsea Holland Komlo
a8cbd0b559
fixups from code review
2017-09-05 14:13:34 +00:00
Chelsea Holland Komlo
f72e4aad13
labels depend on full setup of client beforehand
2017-09-05 14:13:34 +00:00
Chelsea Holland Komlo
87a814397d
refactor to use baseLabels
2017-09-05 14:13:34 +00:00
Chelsea Holland Komlo
b2953d905a
pass in commonly used values
2017-09-05 14:13:34 +00:00
Chelsea Holland Komlo
c634043069
create base labels to be used in every metric
2017-09-05 14:13:34 +00:00
Chelsea Holland Komlo
f5ea83da8d
emit metrics using labels, add option for backwards compatibility
2017-09-05 14:12:57 +00:00
Chelsea Holland Komlo
0175f80775
add metrics options to client config
2017-09-05 14:12:57 +00:00
Armon Dadgar
b8bf35f087
ACL RPCs allow stale reads for scalability
2017-09-04 13:07:44 -07:00
Armon Dadgar
f31cd6a618
client: fixing policy resolution after ACL endpoint enforcement
2017-09-04 13:05:53 -07:00
Armon Dadgar
ddcc5f89bc
Add ErrPermissionDenied, rename TokenNotFound
2017-09-04 13:05:53 -07:00
Armon Dadgar
76a03f2d8e
Address @dadgar feedback
2017-09-04 13:05:53 -07:00
Armon Dadgar
e3f32ca6f1
client: adding token resolution logic
2017-09-04 13:05:36 -07:00
Armon Dadgar
688897561b
client: adding token cache for ACL resolution
2017-09-04 13:05:36 -07:00
Armon Dadgar
c2e72e8a9c
client: create ACL and Policy cache
2017-09-04 13:05:35 -07:00
Armon Dadgar
792f176a44
agent: thread ACL config to client
2017-09-04 13:04:45 -07:00
Clint Armstrong
b5c2636313
Always purge stopped containers
2017-08-31 14:28:48 -04:00
Clint Armstrong
7e35ab6abb
fix logging re-init
2017-08-30 12:36:31 -04:00
Michael Schurter
78823d559b
Squelch logspam when unable to get disk usage stats
...
To reproduce logspam:
```
$ docker plugin install --grant-all-permissions vieux/sshfs
$ nomad agent -dev
...
2017/08/25 17:09:03.282868 [WARN] client: error fetching host disk usage stats for /var/lib/docker/plugins/a8b4a69b07e5180f828d19e1e9e102ccc0e26f9c9939eaef85357260c30b20a7/rootfs/mnt/volumes: permission denied
... repeats every collection period ...
```
2017-08-28 12:04:32 -07:00
Alex Dadgar
876732833f
Merge pull request #3073 from clinta/docker-500
...
Allow retry of 500 API errors to be handled by restart policies
2017-08-24 16:57:36 -07:00