Chris Baker
59fac48d92
alloc lifecycle: 404 when attempting to stop non-existent allocation
2019-06-20 21:27:22 +00:00
Preetha
586e50d1a4
Merge pull request #5841 from hashicorp/f-raft-snapshot-metrics
...
Raft and state store indexes as metrics
2019-06-19 12:01:03 -05:00
Preetha Appan
dc0ac81609
Change interval of raft stats collection to 10s
2019-06-19 11:58:46 -05:00
Preetha Appan
104d66f10c
Changed name of metric
2019-06-17 15:51:31 -05:00
Chris Baker
e0170e1c67
metrics: add namespace label to allocation metrics
2019-06-17 20:50:26 +00:00
Preetha Appan
c54b4a5b17
Emit metrics with raft commit and apply index and statestore latest index
2019-06-14 16:30:27 -05:00
Jasmine Dahilig
ed9740db10
Merge pull request #5664 from hashicorp/f-http-hcl-region
...
backfill region from hcl for jobUpdate and jobPlan
2019-06-13 12:25:01 -07:00
Jasmine Dahilig
51e141be7a
backfill region from job hcl in jobUpdate and jobPlan endpoints
...
- updated region in job metadata that gets persisted to nomad datastore
- fixed many unrelated unit tests that used an invalid region value
(they previously passed because hcl wasn't getting picked up and
the job would default to global region)
2019-06-13 08:03:16 -07:00
Nick Ethier
1b7fa4fe29
Optional Consul service tags for nomad server and agent services (#5706)
...
Optional Consul service tags for nomad server and agent services
2019-06-13 09:00:35 -04:00
Mahmood Ali
e31159bf1f
Prepare for 0.9.4 dev cycle
2019-06-12 18:47:50 +00:00
Nomad Release bot
4803215109
Generate files for 0.9.3 release
2019-06-12 16:11:16 +00:00
Mahmood Ali
07f2c77c44
comment DenormalizeAllocationDiffSlice applies to terminal allocs only
2019-06-12 08:28:43 -04:00
Lang Martin
fe8a4781d8
config merge maintains *HCL string fields used for duration conversion
2019-06-11 16:34:04 -04:00
Mahmood Ali
392f5bac44
Stop updating allocs.Job on stopping or preemption
2019-06-10 18:30:20 -04:00
Mahmood Ali
6c8e329819
test that stopped alloc jobs aren't modified
...
When an alloc is stopped, test that we don't update the job found in
the alloc with a new job that is no longer relevant to this alloc.
2019-06-10 17:14:26 -04:00
Mahmood Ali
d30c3d10b0
Merge pull request #5747 from hashicorp/b-test-fixes-20190521-1
...
More test fixes
2019-06-05 19:09:18 -04:00
Mahmood Ali
87173111de
Merge pull request #5746 from hashicorp/b-no-updating-inmem-node
...
set node.StatusUpdatedAt in raft
2019-06-05 19:05:21 -04:00
Mahmood Ali
97957fbf75
Prepare for 0.9.3 dev cycle
2019-06-05 14:54:00 +00:00
Nomad Release bot
43bfbf3fcc
Generate files for 0.9.2 release
2019-06-05 11:59:27 +00:00
Michael Schurter
073893f529
nomad: disable service+batch preemption by default
...
Enterprise only.
Disable preemption for service and batch jobs by default.
Maintain backward compatibility in an x.y.Z release. Consider switching
the default for new clusters in the future.
2019-06-04 15:54:50 -07:00
Michael Schurter
a8fc50cc1b
nomad: revert use of SnapshotAfter in planApply
...
Revert plan_apply.go changes from #5411
Since non-Command Raft messages do not update the StateStore index,
SnapshotAfter may unnecessarily block and needlessly fail in idle
clusters where the last Raft message is a non-Command message.
This is trivially reproducible with the dev agent and a job that has 2
tasks, 1 of which fails.
The correct logic would be to SnapshotAfter the previous plan's index to
ensure consistency. New clusters or newly elected leaders will not have
a previous plan, so the index at which the leader was elected should be
used instead.
2019-06-03 15:34:21 -07:00
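The "correct logic" described in the commit above can be sketched as a small index-selection rule. This is only an illustration of the message's reasoning, not code from the repository: the function name and both parameters are hypothetical, and a zero previous-plan index is assumed to mean "no plan applied yet".

```go
package main

// minSnapshotIndex picks the index a plan applier would wait for before
// taking a snapshot. Both parameters are hypothetical names:
//   - prevPlanIndex is the index of the previously applied plan, with 0
//     assumed to mean that no plan has been applied yet.
//   - leaderElectedIndex is the index at which this leader was elected.
func minSnapshotIndex(prevPlanIndex, leaderElectedIndex uint64) uint64 {
	if prevPlanIndex > 0 {
		// A previous plan exists, so waiting for its index is enough to
		// see its results in the snapshot.
		return prevPlanIndex
	}
	// New cluster or newly elected leader: fall back to the election index.
	return leaderElectedIndex
}
```

Because neither index comes from a non-Command Raft message that the state store never observes, a wait on the returned value would not block indefinitely in an idle cluster.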
Mahmood Ali
a4ead8ff79
remove 0.9.2-rc1 generated code
2019-05-23 11:14:24 -04:00
Nomad Release bot
6d6bc59732
Generate files for 0.9.2-rc1 release
2019-05-22 19:29:30 +00:00
Lang Martin
d46613ff44
structs check TaskGroup.Update for nil
2019-05-22 12:34:57 -04:00
Lang Martin
10a3fd61b0
comment replace COMPAT 0.7.0 for job.Update with more current info
2019-05-22 12:34:57 -04:00
Lang Martin
67ebcc47dd
structs comment todo DeploymentStatus & DeploymentStatusDescription
2019-05-22 12:34:57 -04:00
Lang Martin
21bf9fdf90
structs job warnings for taskgroup with mixed auto_promote settings
2019-05-22 12:34:57 -04:00
Lang Martin
0f6f543a5f
deployment_watcher auto promote iff every task group is auto promotable
2019-05-22 12:34:57 -04:00
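The "iff every task group is auto promotable" rule from the commit above can be sketched as follows. The type and function names here are hypothetical stand-ins rather than the actual Nomad structs, and treating a nil update block or an empty set of groups as "not promotable" is an assumption.

```go
package main

// updateStrategy is a hypothetical, reduced stand-in for a task group's
// update block.
type updateStrategy struct {
	Canary      int
	AutoPromote bool
}

// canAutoPromote reports whether a deployment may be promoted
// automatically: only if every task group opts in. Groups with a nil
// update block, and deployments with no groups, are treated as not
// promotable (an assumption of this sketch).
func canAutoPromote(groups map[string]*updateStrategy) bool {
	if len(groups) == 0 {
		return false
	}
	for _, u := range groups {
		if u == nil || !u.AutoPromote {
			return false
		}
	}
	return true
}
```

A single group without auto_promote is enough to keep the whole deployment waiting for manual promotion.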
Lang Martin
d27d6f8ede
structs validate requires Canary for AutoPromote
2019-05-22 12:32:08 -04:00
Lang Martin
0c668ecc7a
log error on autoPromoteDeployment failure
2019-05-22 12:32:08 -04:00
Lang Martin
f23f9fd99e
describe a pending deployment without auto_promote more explicitly
2019-05-22 12:32:08 -04:00
Lang Martin
34230577df
describe a pending deployment with auto_promote accurately
2019-05-22 12:32:08 -04:00
Lang Martin
b5fd735960
add update AutoPromote bool
2019-05-22 12:32:08 -04:00
Lang Martin
3c5a9fed22
deployments_watcher_test new TestWatcher_AutoPromoteDeployment
2019-05-22 12:32:08 -04:00
Lang Martin
0bebf5d7f8
deployment_watcher when it's ok to autopromote, do so
2019-05-22 12:32:08 -04:00
Lang Martin
0cf4168ed9
deployments_watcher comments
2019-05-22 12:32:08 -04:00
Lang Martin
0c403eafde
state_store typo in a comment
2019-05-22 12:32:08 -04:00
Lang Martin
e1e28307be
new deploymentwatcher/doc.go for package level documentation
2019-05-22 12:32:08 -04:00
Mahmood Ali
9ff5f163b5
update callers in tests
2019-05-21 21:10:17 -04:00
Mahmood Ali
6bdbeed319
set node.StatusUpdatedAt in raft
...
Fix a case where `node.StatusUpdatedAt` was manipulated directly in
memory.
This ensures that StatusUpdatedAt is set in raft layer, and ensures that
the field is updated when node drain/eligibility is updated too.
2019-05-21 16:13:32 -04:00
Mahmood Ali
2159d0f3ac
tests: fix some nomad/drainer test data races
2019-05-21 14:40:58 -04:00
Mahmood Ali
3b0152d778
tests: fix deploymentwatcher tests data races
2019-05-21 14:29:45 -04:00
Michael Schurter
689794e08d
nomad: fix deadlock in UnblockClassAndQuota
...
Previous commit could introduce a deadlock if the capacityChangeCh was
full and the receiving side exited before freeing a slot so that the
sending side could send. Flush would then block forever waiting to acquire the
lock just to throw the pending update away.
The race is around getting/setting the chan field, not chan operations,
so only lock around getting the chan field.
2019-05-20 15:41:52 -07:00
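The locking pattern described in the commit above (hold the mutex only while reading or replacing the channel field, never while sending) can be sketched like this. It is a minimal illustration under assumed names: the `blocked` type, its methods, and the string payload are hypothetical and much simpler than the real BlockedEvals.

```go
package main

import "sync"

// blocked is a hypothetical, reduced stand-in for a tracker that owns a
// buffered capacity-change channel which flush may replace.
type blocked struct {
	mu               sync.Mutex
	size             int
	capacityChangeCh chan string
}

func newBlocked(size int) *blocked {
	return &blocked{size: size, capacityChangeCh: make(chan string, size)}
}

// current returns the channel field under the lock. Only the field access
// is protected; channel operations are already safe for concurrent use.
func (b *blocked) current() chan string {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.capacityChangeCh
}

// unblock sends outside the critical section, so a full channel can only
// stall this sender and never a caller that needs the lock.
func (b *blocked) unblock(class string) {
	ch := b.current()
	ch <- class
}

// flush swaps in a fresh channel, discarding pending updates. It cannot
// deadlock behind a blocked sender because senders do not hold the lock.
func (b *blocked) flush() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.capacityChangeCh = make(chan string, b.size)
}
```

The trade-off is that a sender may deliver to a channel that has just been replaced. That is acceptable here because flush throws pending updates away anyway.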
Michael Schurter
8c99214f69
nomad: fix race in BlockedEvals
...
I assume the mutex was being released before sending on capacityChangeCh
to avoid blocking in the critical section, but:
1. This is a race.
2. capacityChangeCh has a *huge* buffer (8096). If it's full things
already seem Very Bad, and a little backpressure seems appropriate.
2019-05-20 15:26:20 -07:00
Michael Schurter
05a9c6aedb
Merge pull request #5411 from hashicorp/b-snapshotafter
...
Block plan application until state store has caught up to raft
2019-05-20 14:03:10 -07:00
Mahmood Ali
cd64ada95d
Run TestClientAllocations_Restart_ACL test
2019-05-17 20:30:23 -04:00
Michael Schurter
0e39927782
nomad: emit more detailed error
...
Avoid returning context.DeadlineExceeded as it lacks helpful information
and is often ignored or handled specially by callers.
2019-05-17 14:37:42 -07:00
Michael Schurter
b80a7e0feb
nomad: wait for state store to sync in plan apply
...
Wait for state store to catch up with raft when applying plans.
2019-05-17 14:37:12 -07:00
Michael Schurter
1bc731da47
nomad: remove unused NotifyGroup struct
...
I don't think it's been used for a long time.
2019-05-17 13:30:23 -07:00
Michael Schurter
9732bc37ff
nomad: refactor waitForIndex into SnapshotAfter
...
Generalize wait for index logic in the state store for reuse elsewhere.
Also begin plumbing in a context to combine handling of timeouts and
shutdown.
2019-05-17 13:30:23 -07:00
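A generalized "wait for index" with a context, as described in the commit above, might look roughly like the sketch below. It is not the actual state store code: the `store` type, its notification channel, and the method names are hypothetical, and the descriptive timeout error follows the spirit of the "emit more detailed error" commit rather than its exact wording.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// store is a hypothetical stand-in for a state store that tracks the
// latest index it has applied.
type store struct {
	mu     sync.Mutex
	index  uint64
	notify chan struct{} // closed and replaced whenever index advances
}

func newStore() *store {
	return &store{notify: make(chan struct{})}
}

// setIndex records a newer index and wakes all current waiters.
func (s *store) setIndex(i uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if i > s.index {
		s.index = i
		close(s.notify)
		s.notify = make(chan struct{})
	}
}

// waitForIndex blocks until the store has reached at least index or ctx is
// done. A single context covers both timeouts and shutdown.
func (s *store) waitForIndex(ctx context.Context, index uint64) (uint64, error) {
	for {
		s.mu.Lock()
		cur, ch := s.index, s.notify
		s.mu.Unlock()
		if cur >= index {
			return cur, nil
		}
		select {
		case <-ch:
			// Index advanced; loop and re-check.
		case <-ctx.Done():
			// Wrap the context error with the indexes involved so callers
			// get more than a bare deadline error.
			return cur, fmt.Errorf("waiting for index %d (latest %d): %w",
				index, cur, ctx.Err())
		}
	}
}

// waitWithTimeout bounds the wait with a timeout derived from d.
func (s *store) waitWithTimeout(index uint64, d time.Duration) (uint64, error) {
	ctx, cancel := context.WithTimeout(context.Background(), d)
	defer cancel()
	return s.waitForIndex(ctx, index)
}
```

A caller that also needs to stop on shutdown would pass a context cancelled by the shutdown path to `waitForIndex` directly instead of using the timeout wrapper.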