Preetha Appan
9fed0d2103
Get reschedule policy from the alloc directly
2018-03-14 16:10:32 -05:00
Preetha Appan
e89bbf7289
Update comment about WaitTime
2018-03-14 16:10:32 -05:00
Preetha Appan
e2656ef546
Cleaner handling of batched evals
2018-03-14 16:10:32 -05:00
Preetha Appan
47e0280d96
More small review feedback
2018-03-14 16:10:32 -05:00
Preetha Appan
2ba976dec8
Remove unnecessary check against 5 second window for determining immediate scheduling eligibility
2018-03-14 16:10:32 -05:00
Preetha Appan
5373ade731
Scheduler and Reconciler changes to support delayed rescheduling
2018-03-14 16:10:32 -05:00
Josh Soref
e0f6a33fe5
spelling: system
2018-03-11 19:01:19 +00:00
Josh Soref
a89e1b8395
spelling: strategy
2018-03-11 18:58:19 +00:00
Josh Soref
f8eb766fb5
spelling: reschedulable
2018-03-11 18:48:12 +00:00
Josh Soref
ed8db9992e
spelling: feasibility
2018-03-11 18:07:09 +00:00
Josh Soref
bf9283c606
spelling: corresponding
2018-03-11 17:51:41 +00:00
Josh Soref
ca4ceb0e5c
spelling: commits
2018-03-11 17:47:45 +00:00
Preetha Appan
7b6ba7a1f4
Fixes bug in reconciler where previously rescheduled allocs are rescheduled again. Simplified logic and added test case to catch this.
2018-02-20 12:07:56 -06:00
Preetha Appan
7c57303dd2
Clarify comment
2018-02-05 16:37:07 -06:00
Preetha Appan
d48c411692
Reconciler should consider failed allocs when marking deployment as failed.
2018-02-02 19:40:25 -06:00
Preetha Appan
a1237d627a
code review feedback
2018-01-31 09:58:05 -06:00
Preetha Appan
5ad892026a
Add a field to track the next allocation during a replacement
2018-01-31 09:58:05 -06:00
Preetha Appan
2ed4de7e7b
Track previous node id correctly, plus unit test
2018-01-31 09:58:05 -06:00
Preetha Appan
dd4917c2f0
Add more clarification in comment
2018-01-31 09:58:05 -06:00
Preetha Appan
09bef7d1ce
Preallocate slice for skipped nodes
2018-01-31 09:58:05 -06:00
Preetha Appan
237beb49ae
Better score threshold
2018-01-31 09:58:05 -06:00
Preetha Appan
fa18c0def4
Add one more unit test
2018-01-31 09:58:05 -06:00
Preetha Appan
a75540cec6
Limit iterator uses a score threshold and a maxSkip value to be able to skip lower scoring nodes
2018-01-31 09:58:05 -06:00
Preetha Appan
b6268a5fab
Beef up unit test for rescheduling batch jobs
2018-01-31 09:56:53 -06:00
Preetha Appan
ea4a889e28
Address more code review feedback
2018-01-31 09:56:53 -06:00
Preetha Appan
bd89d2b39e
Make sure that reschedule trackers are not added for node drain replacements
2018-01-31 09:56:53 -06:00
Preetha Appan
a662b38801
Improve reconciler unit tests
2018-01-31 09:56:53 -06:00
Preetha Appan
fee4ccf154
Prevent side effect modification of select options when preferred nodes are set
2018-01-31 09:56:53 -06:00
Preetha Appan
21b7b79d5d
Add helper methods, use require and other code review feedback
2018-01-31 09:56:53 -06:00
Preetha Appan
d0f9d59abb
Reconile with changes to structs for reschedule tracking
2018-01-31 09:56:53 -06:00
Preetha Appan
fbb1936dee
Fix some comments and lint warnings, remove unused method
2018-01-31 09:56:53 -06:00
Preetha Appan
031c566ada
Reschedule previous allocs and track their reschedule attempts
2018-01-31 09:56:53 -06:00
Preetha Appan
fd2fbefa4c
Add a field to track the next allocation during a replacement
2018-01-24 17:55:05 -06:00
Alex Dadgar
6dda0ebaed
gofmt
2018-01-04 14:45:15 -08:00
Alex Dadgar
2f561609b7
Fix detection of successful batch allocations
...
This PR restores older behavior of detecting successful batch
allocations (04d86ffd1006fde9dfb2ca8c1237fe60b995b0e3). This has the
side effect that we correctly filter desired status stop but not
successful batch allocations and create their replacements.
2018-01-04 14:20:32 -08:00
Preetha
1712b03705
Merge branch 'master' into 0.8
2018-01-03 16:06:38 -06:00
Preetha Appan
51bd0b59c7
Return an error if evaluation doesn't exist in state store at plan apply time.
2017-12-18 14:55:36 -06:00
Preetha Appan
3c36abfe14
Update eval modify index as part of plan apply.
2017-12-18 10:03:55 -06:00
Preetha Appan
3b4d7ac2a3
Fix some typos
2017-12-14 13:29:27 -06:00
Michael Schurter
45494f7304
Fix port labels on mock Alloc/Job/Node
2017-12-08 14:50:06 -08:00
Alex Dadgar
44240ce440
Merge pull request #3375 from hashicorp/b-batch
...
Allow batch jobs to be rerun if purged
2017-10-13 17:11:45 -07:00
Alex Dadgar
c1cc51dbee
sync
2017-10-13 14:36:02 -07:00
Alex Dadgar
746cd7403f
Allow batch jobs to be rerun if purged
...
This PR allows batch jobs to be rerun if they have been purged.
2017-10-13 12:40:37 -07:00
Michael Schurter
a66c53d45a
Remove `structs` import from `api`
...
Goes a step further and removes structs import from api's tests as well
by moving GenerateUUID to its own package.
2017-09-29 10:36:08 -07:00
Alex Dadgar
4173834231
Enable more linters
2017-09-26 15:26:33 -07:00
Alex Dadgar
3904bde9a3
Fix batch handling of complete allocs/node drains
...
This PR fixes:
* An issue in which a node-drain that contains a complete batch alloc
would cause a replacement
* An issue in which allocations with the same name during a scale
down/stop event wouldn't be properly stopped.
* An issue in which batch allocations from previous job versions may not
have been stopped properly.
Fixes https://github.com/hashicorp/nomad/issues/3210
2017-09-14 15:08:57 -07:00
Alex Dadgar
84d06f6abe
Sync namespace changes
2017-09-07 17:04:21 -07:00
Alex Dadgar
0aef02a4f9
fix test
2017-08-21 14:07:54 -07:00
Alex Dadgar
27256ebcc6
Placing allocs counts towards placement limit
...
This PR makes placing new allocations count towards the limit. We do not
restrict how many new placements are made by the limit but we still
count towards the limit. This has the nice affect that if you have a
group with count = 5 and max_parallel = 1 but only 3 allocs exist for it
and a change is made, you will create 2 more at the new version but not
destroy one, taking you down to two running as you would have
previously.
Fixes https://github.com/hashicorp/nomad/issues/3053
2017-08-21 12:41:19 -07:00
Alex Dadgar
2453f13fc5
fixes
2017-08-15 12:27:05 -07:00
Alex Dadgar
0570e09feb
Fix panic occuring from improper bitmap size
...
This PR fixes an allignment calculation when determining the bitmap
size.
Fixes https://github.com/hashicorp/nomad/issues/3008
2017-08-12 15:37:02 -07:00
Luke Farnell
f0ced87b95
fixed all spelling mistakes for goreport
2017-08-07 17:13:05 -04:00
Alex Dadgar
7b13c0d702
Lost allocs replaced even if deployment failed
...
This PR allows the scheduler to replace lost allocations even if the job
has a failed or paused deployment. The prior behavior was confusing to
users.
Fixes https://github.com/hashicorp/nomad/issues/2958
2017-08-03 17:42:14 -07:00
Alex Dadgar
7d2b84ab01
Review fixes
2017-08-01 14:18:52 -07:00
Alex Dadgar
2650bb1d12
Distinct Property supports arbitrary limit
...
This PR enhances the distinct_property constraint such that a limit can
be specified in the RTarget/value parameter. This allows constraints
such as:
```
constraint {
distinct_property = "${meta.rack}"
value = "2"
}
```
This restricts any given rack from running more than 2 allocations from
the task group.
Fixes https://github.com/hashicorp/nomad/issues/1146
2017-07-31 16:52:13 -07:00
Alex Dadgar
4f69355a66
Fix incorrect destructive update with distinct_property constraint
...
This PR fixes an issue in which an update to a task group with a
distinct property constraint would result in an incorrect destructive
update.
2017-07-31 11:17:35 -07:00
Michael Schurter
5f1f91a46c
Use go-testing-interface instead of testing
...
This drops the testings stdlib pkg from our dependencies. Saves a
whopping 46kb on our binary (was really hoping for more of a win there),
but also avoids potential ugliness with how testing sets flags.
2017-07-25 15:35:19 -07:00
Alex Dadgar
492239d3ee
Improve multiple group handling in a deployment
...
This PR resolves a bug in which a job with multiple task groups would
create new deployment objects each, thus clearing out all other task
groups deployment state.
2017-07-25 11:27:47 -07:00
Alex Dadgar
184bfd4836
Better comment
2017-07-20 12:31:08 -07:00
Alex Dadgar
248315a2d9
Handle destructive changes before placements
...
This PR updates the generic scheduler to handle destructive changes
before handling placements. This is important because the destructive
change may be due to a lowering of resources. If this is the case, the
handling of the destructive changes first may make it possible for the
placement to happen.
To reason about this imagine there is one node with CPU = 500.
If the group originally had:
* `count = 1`
* `cpu = 400`
And then the job was updated such that the group had:
* `count = 4`
* `cpu = 120`
If the original alloc isn't discounted first, nothing would be able to
place.
2017-07-20 12:24:27 -07:00
Alex Dadgar
ce265e0aff
Update full node test to test more advanced case
2017-07-20 12:23:40 -07:00
Alex Dadgar
a9ec1d6ca7
Fix update limit calculation to avoid panic
...
This PR fixes the rolling update limit calculation to avoid a panic when
there are more allocations for a deployment that haven't determined
their health than the max_parallel count of the task group.
Fixes https://github.com/hashicorp/nomad/issues/2820
2017-07-19 11:11:47 -07:00
Alex Dadgar
22e84d00ab
Fix deep copy of driver config
2017-07-17 17:53:21 -07:00
Alex Dadgar
641e178416
Stop before trying to place
2017-07-17 17:18:12 -07:00
Alex Dadgar
66a90326e1
Treat destructive updates atomically
2017-07-16 10:35:38 -07:00
Alex Dadgar
f86760db3c
Basic logs
2017-07-07 16:49:08 -07:00
Alex Dadgar
20005f925a
Rolling node drains using max_parallel and stagger
...
This PR adds rolling node drains done at max_parallel and stagger of the
update spec. It brings it inline with old behavior.
2017-07-07 12:12:48 -07:00
Alex Dadgar
3a29b38108
Status description shows requiring promotion
2017-07-07 12:12:48 -07:00
Alex Dadgar
9f016606aa
Fix some tests, eval monitor shows deployment id and deployment cancels based on version
2017-07-07 12:12:48 -07:00
Alex Dadgar
9aa1f2fea2
Respond to comments
2017-07-07 12:10:04 -07:00
Alex Dadgar
454083ba1b
Remove canary
2017-07-07 12:10:04 -07:00
Alex Dadgar
d352d85bb9
Test scheduler's handling of canaries/inplace updates
2017-07-07 12:10:04 -07:00
Alex Dadgar
83c60483f2
Test marking as complete
2017-07-07 12:10:04 -07:00
Alex Dadgar
477c713df5
Plan apply handles canaries and success is set via update
2017-07-07 12:10:04 -07:00
Alex Dadgar
1e8b5e75a5
Fix handling of failed job
2017-07-07 12:10:04 -07:00
Alex Dadgar
e229d3650b
Attach eval id
2017-07-07 12:10:04 -07:00
Alex Dadgar
af1935e1e1
Mark complete
2017-07-07 12:10:04 -07:00
Alex Dadgar
8424a3b380
Change canary handling
2017-07-07 12:10:04 -07:00
Alex Dadgar
c10d7ab871
Remove promoted bit from allocation
2017-07-07 12:10:04 -07:00
Alex Dadgar
09dfa2fc10
Rename CreateDeployments and remove cancelling behavior in state_store
2017-07-07 12:10:04 -07:00
Alex Dadgar
067ed86a47
Client watches for allocation health using task state and Consul checks
...
This PR adds watching of allocation health at the client. The client can
watch for health based on the tasks running on time and also based on
the consul checks passing.
2017-07-07 12:10:04 -07:00
Alex Dadgar
e7034691ea
deployment status
2017-07-07 12:07:07 -07:00
Alex Dadgar
d04877d23c
initial impl
2017-07-07 12:03:11 -07:00
Alex Dadgar
27a6e6b6d1
update description of the alloc update factory function
2017-07-07 12:03:11 -07:00
Alex Dadgar
ce2319be9b
cleanup limit detection
2017-07-07 12:03:11 -07:00
Alex Dadgar
b2573b01f9
Fix canary handling
2017-07-07 12:03:11 -07:00
Alex Dadgar
7952240d69
Deployment tests
2017-07-07 12:03:11 -07:00
Alex Dadgar
ce55559f12
Non-Canary/Deployment Tests
2017-07-07 12:03:11 -07:00
Alex Dadgar
d111dd5c10
Pull out in-place updating into a passed in function; reduce inputs to reconciler
2017-07-07 12:03:11 -07:00
Alex Dadgar
c77944ed29
assign names
2017-07-07 12:03:11 -07:00
Alex Dadgar
ecacd44888
handle batch filtering
2017-07-07 12:03:11 -07:00
Alex Dadgar
4c123500ee
Remove old
2017-07-07 12:03:11 -07:00
Alex Dadgar
270e26c600
Populate desired state per tg
2017-07-07 12:03:11 -07:00
Alex Dadgar
23dcd175ef
Show canaries on plan
2017-07-07 12:03:11 -07:00
Alex Dadgar
cf5baba808
handle annotations
2017-07-07 12:03:11 -07:00
Alex Dadgar
a46f7c3eb8
Todos
2017-07-07 12:03:11 -07:00
Alex Dadgar
00d962b8b5
Some comments and cleanup
2017-07-07 12:03:11 -07:00
Alex Dadgar
994ad285b7
Split reconcile file
2017-07-07 12:03:11 -07:00
Alex Dadgar
07b1c3e5db
Only upsert a job if the spec changes and push deployment creation into reconciler
2017-07-07 12:03:11 -07:00
Alex Dadgar
0d42b5d421
initial reconciler
2017-07-07 12:01:17 -07:00