Alex Dadgar
3183153315
Node resources on client
2018-09-29 17:23:41 -07:00
Alex Dadgar
cc92cd92cd
Merge pull request #4642 from hashicorp/b-vet
...
Fix vet errors and use newer go version in travis
2018-09-04 17:04:02 -07:00
Alex Dadgar
c6576ddac1
Fix make check errors
2018-09-04 16:03:52 -07:00
Preetha Appan
751c0eb5a5
code review feedback
2018-09-04 16:10:11 -05:00
Preetha Appan
9bc0962527
Track top k nodes by norm score rather than top k nodes per scorer
2018-09-04 16:10:11 -05:00
Preetha Appan
6ed527c636
Use heap to store top K scoring nodes.
...
Scoring metadata is now aggregated by scorer type to make it easier
to parse when reading it in the CLI.
2018-09-04 16:10:11 -05:00
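A minimal sketch of the top-K-by-heap technique named in commit 6ed527c636 above. It is not Nomad's scheduler code: the `scored` type, the `topK` function, and the node IDs are hypothetical names chosen for illustration. It assumes a min-heap of at most k entries, so the lowest retained score is always at the root and can be replaced when a better candidate arrives.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

// scored is a hypothetical (node, normalized score) pair.
type scored struct {
	NodeID string
	Score  float64
}

// minHeap orders entries by ascending score, so index 0 is the
// lowest score currently retained.
type minHeap []scored

func (h minHeap) Len() int            { return len(h) }
func (h minHeap) Less(i, j int) bool  { return h[i].Score < h[j].Score }
func (h minHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x interface{}) { *h = append(*h, x.(scored)) }
func (h *minHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// topK returns the k highest-scoring items, best first.
func topK(items []scored, k int) []scored {
	h := &minHeap{}
	for _, it := range items {
		if h.Len() < k {
			heap.Push(h, it)
		} else if k > 0 && it.Score > (*h)[0].Score {
			// Replace the current minimum and restore heap order.
			(*h)[0] = it
			heap.Fix(h, 0)
		}
	}
	out := append([]scored(nil), *h...)
	sort.Slice(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

func main() {
	nodes := []scored{{"n1", 0.2}, {"n2", 0.9}, {"n3", 0.5}, {"n4", 0.7}}
	for _, s := range topK(nodes, 2) {
		fmt.Println(s.NodeID, s.Score)
	}
}
```

Keeping only k entries bounds memory at O(k) and makes each insertion O(log k), which is the usual reason to prefer a heap over sorting every candidate.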
Preetha Appan
659cfa3f64
Parsing and API layer for spread stanza
2018-09-04 16:10:11 -05:00
Preetha Appan
1c0b123777
Fix test
2018-09-04 16:10:11 -05:00
Preetha Appan
c407e3626f
More review comments
2018-09-04 16:10:11 -05:00
Preetha Appan
4b3b618e4a
Remove unused field
2018-09-04 16:10:11 -05:00
Preetha Appan
9f0caa9c3d
Affinity parsing, api and structs
2018-09-04 16:10:11 -05:00
Nick Ethier
41e010cdc2
nomad: add 'Dispatch' field to Job
...
New `Dispatch` field is used to denote if the Job is a child dispatched job of
a parameterized job.
2018-06-11 11:59:03 -04:00
Alex Dadgar
72effb8632
code review
2018-06-06 14:52:26 -07:00
Alex Dadgar
f4fccd7ed2
Monitoring non-draining node exits
2018-06-05 17:58:44 -07:00
Alex Dadgar
7f25fcc1bd
Merge pull request #4354 from hashicorp/b-job-modify
...
Deployment adds JobSpecModifyIndex
2018-05-31 17:57:38 +00:00
Alex Dadgar
f2b2e0482b
code review fixes
2018-05-31 10:57:08 -07:00
Preetha Appan
4f835790d7
Set node eligibility to true when old client calls disable
2018-05-30 16:54:07 -05:00
Alex Dadgar
195e19827b
Deployment adds JobSpecModifyIndex
...
Deployment tracks the Job.JobModifyIndex so that PUTS against /v1/jobs
can be more easily correlated with the created deployment.
Fixes https://github.com/hashicorp/nomad/issues/4301
2018-05-30 11:33:56 -07:00
Nick Ethier
4b64db3a0f
api: emit different monitor message if node's drain strategy is never set
2018-05-24 06:39:09 -04:00
Preetha
159888a856
Merge pull request #4274 from hashicorp/f-force-rescheduling
...
Add CLI and API support for forcing rescheduling of failed allocs
2018-05-21 16:24:22 -07:00
Preetha Appan
4a400f045b
Fix docs and method documentation in API
2018-05-21 17:20:59 -05:00
Preetha Appan
3a8040e36f
Add new method EvaluateWithOptions to avoid breaking go API client
2018-05-11 14:18:53 -05:00
Preetha Appan
b12df3c64b
Added CLI for evaluating job given ID, and modified client API for evaluate to take a request payload
2018-05-09 15:04:27 -05:00
Chelsea Holland Komlo
d51611040f
Add driver health information to node list stub
2018-05-09 11:21:54 -04:00
Alex Dadgar
f4af30fbb5
Canary tags structs
2018-05-07 14:50:01 -05:00
Alex Dadgar
f95ab4ade8
Mark canaries on creation, and unmark on promotion
2018-05-07 14:50:01 -05:00
Alex Dadgar
224b3092ae
change default to 10m and docs
2018-05-07 14:50:01 -05:00
Alex Dadgar
8a81038cdb
Set Reschedule from deployment watcher
2018-05-07 14:50:01 -05:00
Alex Dadgar
e5caaf3358
Small test fix
2018-05-07 14:50:01 -05:00
Alex Dadgar
1336002255
Progress deadline in deployment state
2018-05-07 14:50:01 -05:00
Alex Dadgar
ee50789c22
Initial implementation
2018-05-07 14:50:01 -05:00
Michael Schurter
0d534d30d6
Merge pull request #4251 from hashicorp/f-grpc-checks
...
Support Consul gRPC Health Checks
2018-05-04 14:55:16 -07:00
Michael Schurter
70b02875b7
Merge pull request #4234 from hashicorp/b-4159
...
Fix race in StreamFramer and truncation in api/AllocFS.Logs
2018-05-04 14:24:07 -07:00
Michael Schurter
f6a4713141
consul: make grpc checks more like http checks
2018-05-04 11:08:11 -07:00
Michael Schurter
382caec1e1
consul: initial grpc implementation
...
Needs to be more like http.
2018-05-04 11:08:11 -07:00
Michael Schurter
526af6a246
framer: fix early exit/truncation in framer
2018-05-02 10:46:16 -07:00
Michael Schurter
949938534b
api: never return EOF from Logs error chan
...
Closing the frames chan is the only race-free way to signal to receivers
that all frames have been sent and no errors have occurred.
If EOF is sent on error chan receivers may not receive the last frame
(or frames since the chan is buffered) before receiving the error.
Closing frames is the idiomatic way of signaling there is no more data
to be read from a chan.
2018-05-02 10:46:16 -07:00
Michael Schurter
6f0e2e808b
tests: test logs from client<->api package
2018-05-02 10:46:16 -07:00
Preetha Appan
274bed1892
Add RescheduleTracker to allocs list stub struct
2018-05-01 14:53:47 -05:00
Alex Dadgar
15ad3f94af
Fix command line
2018-04-26 15:46:22 -07:00
James Rasell
b7c2ce2991
Update node-drain logging message to be clearer for operators.
...
This change updates the console log message when performing a node
drain and particularly when a node has marked all allocs for
migration. Previously it logged 'drain complete' which was a little
confusing to operators as the node is not drained at this point.
Closes #4183
2018-04-24 07:50:01 +01:00
Nick Ethier
1f99c3a0f7
minor code review fixes to api/jobs
2018-04-17 10:18:36 -04:00
Nick Ethier
03a89060cf
api: add test for canonicalized jobs/parse
2018-04-16 19:21:09 -04:00
Nick Ethier
de4176606d
command/agent: add Canonicalize option to parse args
2018-04-16 19:21:09 -04:00
Nick Ethier
f2db03e56c
command/agent: add /v1/jobs/parse endpoint
...
The parse endpoint accepts a hcl jobspec body within a json object
and returns the parsed json object for the job. This allows users to
register jobs with the nomad json api without specifically needing
a nomad binary to parse their hcl encoded jobspec file.
2018-04-16 19:21:06 -04:00
Preetha
bdc17ebf10
Merge pull request #4139 from hashicorp/b-reschedule-invalid-system-jobs
...
Make system jobs fail validation if they contain a reschedule stanza
2018-04-11 20:01:19 -05:00
Preetha Appan
db549d1388
add canonicalize for reschedulepolicy to simplify validation logic
2018-04-11 18:47:27 -05:00
Preetha Appan
ae1419826e
Always merge with default reschedule policy if it's not nil
2018-04-11 15:26:01 -05:00
Preetha Appan
a7b7b662ed
Make system jobs fail validation if they contain a reschedule stanza
2018-04-11 14:56:20 -05:00
Charlie Voiselle
ba88f00ccb
Changed "til" to "until"
...
Should be "till" or "until"; chose "until" because it is unambiguous as to meaning.
2018-04-11 12:36:28 -05:00
Alex Dadgar
3c51d5a5ea
Fix ineffectual assignment
2018-04-05 11:29:39 -07:00
Alex Dadgar
f1d3b47499
Don't assume the read index won't be zero if no jobs have been registered
2018-04-03 18:24:59 -07:00
Michael Schurter
7046db8818
cli: remove unreachable drain message
2018-03-30 14:15:12 -07:00
Michael Schurter
f912cd4272
cli: log if a node goes down during draining
2018-03-30 14:02:42 -07:00
Michael Schurter
06874d8b3d
drain: fix monitor node index handling
2018-03-30 12:43:53 -07:00
Michael Schurter
7199a2b960
cli: differentiate normal output vs info
2018-03-30 11:42:11 -07:00
Michael Schurter
0260bda046
cli: add color to drain output
2018-03-30 11:15:12 -07:00
Michael Schurter
6e10e0f84e
drain: fix cli blocking when allocs already stopped
2018-03-30 10:18:14 -07:00
Michael Schurter
ee3eddbac3
drain: block cli until all allocs stop
...
Before, the drain CLI would block until the node was marked as completing
drain operations. While technically correct, it could lead operators (or
more likely: scripts) to shut down drained nodes before all of their
allocations had *actually* terminated.
This change makes the CLI block until all allocations have terminated
(unless ignoring system jobs).
2018-03-29 10:56:09 -07:00
Alex Dadgar
de4b3772f1
Create evals for system jobs when drain is unset
...
This PR creates evals for system jobs when:
* Drain is unset and mark eligible is true
* Eligibility is restored to the node
2018-03-27 15:53:24 -07:00
Chelsea Holland Komlo
003bc209b9
use time.Time for node events for compatibility
2018-03-27 15:43:57 -04:00
Preetha Appan
33e170c15d
s/linear/constant/g
2018-03-26 14:45:09 -05:00
Alex Dadgar
d7f246efe1
Drain cli, api, http
2018-03-21 20:27:32 -07:00
Michael Schurter
a854c7bdae
docs: improve DrainRequest.MarkEligible comment
2018-03-21 16:55:22 -07:00
Michael Schurter
a7ab75d853
test: index no longer guaranteed on job list
...
Also switch to require and add t.Helper to appropriate funcs.
2018-03-21 16:55:22 -07:00
Michael Schurter
1cc012966b
api: fix tests to expect default migrate strategy
2018-03-21 16:51:45 -07:00
Michael Schurter
2832853bfa
Add DesiredTransition.ShouldMigrate to api pkg
2018-03-21 16:51:45 -07:00
Alex Dadgar
92b636dd32
Fix deadline handling
2018-03-21 16:51:44 -07:00
Alex Dadgar
7b2bad8c5e
Toggle Drain allows resetting eligibility
...
This PR allows marking a node as eligible for scheduling while toggling
drain. By default the `nomad node drain -disable` command will mark it
as eligible but the drainer will maintain ineligibility.
2018-03-21 16:51:44 -07:00
Alex Dadgar
02019f216a
Correct defaulting
2018-03-21 16:51:44 -07:00
Alex Dadgar
a37329189a
Improve DeadlineTime helper
2018-03-21 16:51:44 -07:00
Alex Dadgar
d47c68f764
Add eligibility to node view
2018-03-21 16:51:44 -07:00
Alex Dadgar
8289cc3c6f
HTTP and API
2018-03-21 16:51:44 -07:00
Alex Dadgar
010228577e
Drain cli, api, http
2018-03-21 16:51:43 -07:00
Michael Schurter
d1ec65d765
switch to new raft DesiredTransition message
2018-03-21 16:49:48 -07:00
Alex Dadgar
db4a634072
RPC, FSM, State Store for marking DesiredTransition
...
fix build tag
2018-03-21 16:49:48 -07:00
Michael Schurter
c0542474db
drain: initial drainv2 structs and impl
2018-03-21 16:49:48 -07:00
Chelsea Komlo
6fc9231dac
Merge pull request #3856 from hashicorp/f-client-add-health-checks
...
Client driver health checks for Docker
2018-03-21 18:05:00 -04:00
Chelsea Holland Komlo
0bde357731
add concept of health checks to fingerprinters and nodes
...
fix up feedback from code review
add driver info for all drivers to node
2018-03-21 15:15:25 -04:00
Preetha Appan
26b6e70d13
Addressed code review feedback
2018-03-21 09:15:29 -05:00
Preetha Appan
30fff15de7
Show reschedule policy in addition to when the eval will get rescheduled
2018-03-20 13:34:29 -05:00
Alex Dadgar
d498fa950a
Remove fake advertise address and fix TestAPI_OperatorAutopilotServerHealth
2018-03-19 15:49:12 -07:00
Preetha
6df57c177c
Merge pull request #4002 from hashicorp/b-reschedule-systemjob-panic
...
Fix incorrect initialization of reschedule policy for system jobs.
2018-03-19 13:06:55 -05:00
Preetha Appan
1d748663ce
Fix linting
2018-03-19 12:46:55 -05:00
Preetha Appan
161bc66355
Fix incorrect initialization of reschedule policy for system jobs.
2018-03-19 12:16:13 -05:00
Alex Dadgar
9e05c9a50e
Merge pull request #3997 from hashicorp/b-serf-addr
...
RPC Advertise used exclusively for Clients
2018-03-19 09:30:20 -07:00
Alex Dadgar
7e69d16c15
relax timing
2018-03-19 09:30:00 -07:00
Preetha Appan
462b431291
Fix ineffectual assignment lint warnings
2018-03-17 10:49:48 -05:00
Alex Dadgar
6448523b21
fix api job tests
2018-03-16 16:46:22 -07:00
Preetha Appan
9a5e6edf1f
Rename DelayCeiling to MaxDelay
2018-03-14 16:10:32 -05:00
Preetha Appan
342c3fb961
Added FollowupEvalID field and helper methods to calculate reschedule eligibility based on delay
2018-03-14 16:10:32 -05:00
Preetha Appan
5f50c3d618
Add new reschedule options to API layer and unit tests
2018-03-14 16:10:32 -05:00
Alex Dadgar
de6ebb6e6c
small cleanup
2018-03-13 18:08:22 -07:00
Alex Dadgar
63e14b7d63
nodeevents -> events
2018-03-13 18:08:22 -07:00
Chelsea Holland Komlo
d30c269fbe
code review feedback
2018-03-13 18:05:40 -07:00
Chelsea Holland Komlo
00d9923454
Ensure node updates don't strip node events
...
Add node events to CLI
2018-03-13 18:05:40 -07:00
Michael Schurter
7dd7fbcda2
non-Existent -> nonexistent
...
Reverting from #3963
https://www.merriam-webster.com/dictionary/existent
2018-03-12 11:59:33 -07:00
Josh Soref
e2167ef047
spelling: unmarshal
2018-03-11 19:07:44 +00:00
Josh Soref
85fabc63c8
spelling: expected
2018-03-11 17:57:01 +00:00
Josh Soref
7f6e4012a0
spelling: existent
2018-03-11 18:30:37 +00:00
Josh Soref
1c07f37d37
spelling: describing
2018-03-11 17:54:04 +00:00
Josh Soref
79e6481463
spelling: configurations
2018-03-11 17:47:24 +00:00
Josh Soref
95f8098e99
spelling: configuration
2018-03-11 17:48:04 +00:00
Josh Soref
dfec2cb25e
spelling: canonicalize
2018-03-11 17:46:20 +00:00
Alex Dadgar
0715a4884b
fix test
2018-02-15 13:59:03 -08:00
Alex Dadgar
57a0b7a774
allow setting timeout on any api config
2018-02-15 13:59:03 -08:00
Alex Dadgar
aa98f8ba7b
Enhance API pkg to utilize Server's Client Tunnel
...
This PR enhances the API package by routing client-only RPCs through
the server when they are low cost. Filesystem access first attempts a
direct connection to the node and then falls back to a server-routed
request.
2018-02-15 13:59:03 -08:00
Kyle Havlovitz
54b691f538
Merge pull request #3852 from hashicorp/autopilot-cleanup
...
Clean up some leftover autopilot differences from Consul
2018-02-14 10:42:32 -08:00
Kyle Havlovitz
709b693d39
Clean up some leftover autopilot differences from Consul
2018-02-08 10:27:26 -08:00
Preetha Appan
12da54ae2b
Fix tests broken by default change to RestartPolicy
2018-02-02 19:25:45 -06:00
Preetha Appan
8ac05d4462
Reuse default policy structs in api, and other code review feedback
2018-01-31 16:33:00 -06:00
Preetha Appan
9d15e0c05b
Code review feedback
2018-01-31 09:58:05 -06:00
Preetha Appan
5714a6b8bf
Add method on API alloc to calculate attempted and remaining reschedule events
2018-01-31 09:58:05 -06:00
Preetha Appan
13444c08a4
Remove unused method
2018-01-31 09:58:05 -06:00
Preetha Appan
e09ea8c0b0
Address code review comments
2018-01-31 09:58:05 -06:00
Preetha Appan
0c56a12a77
Add RescheduleTracker to allocations API struct
2018-01-31 09:56:53 -06:00
Preetha Appan
1f834d1a31
Add reschedule policy to API, and HCL parsing support.
2018-01-31 09:56:53 -06:00
Kyle Havlovitz
cb2321353c
Merge pull request #3812 from hashicorp/autopilot-config-change
...
Refactor redundancy_zone/upgrade_version out of client meta
2018-01-30 16:14:26 -08:00
Kyle Havlovitz
2ccf565bf6
Refactor redundancy_zone/upgrade_version out of client meta
2018-01-29 20:03:38 -08:00
Alex Dadgar
31e212f467
Merge pull request #3684 from jrasell/f_gh_3678
...
Update `node-status` verbose command to include node address.
2018-01-29 15:32:14 -08:00
Preetha Appan
fd2fbefa4c
Add a field to track the next allocation during a replacement
2018-01-24 17:55:05 -06:00
Kyle Havlovitz
901c8791f4
Merge pull request #3670 from hashicorp/autopilot
...
Add Autopilot feature from Consul
2018-01-19 12:52:56 -08:00
Kyle Havlovitz
8d41f4ad40
Formatting/test adjustments
2018-01-18 15:03:35 -08:00
Kyle Havlovitz
12ff22ea70
Merge branch 'master' into autopilot
2018-01-18 13:29:25 -08:00
Michael Schurter
9fcda9c5aa
Make example service name make sense
...
Switch from global-redis-check for the example job's service name to
redis-cache. The former name is really confusing and someone finally
called us out on it:
https://groups.google.com/d/msg/nomad-tool/3RTh6CyYkWk/vEe_Sj7lAAAJ
Also specifically mention that the `service.name` parameter is what is
advertised in Consul.
2018-01-17 14:08:30 -08:00
Kyle Havlovitz
7b980c42d8
Add raft remove by id endpoint/command
2018-01-16 13:35:32 -08:00
Michael Schurter
224046ece8
Invert and test CheckRestart merge logic
2018-01-09 15:17:07 -08:00
Michael Schurter
7c282f174b
Fix service.check_restart stanza propagation
...
There was a bug in jobspec parsing, a bug in CheckRestart merging, and a
bug in CheckRestart canonicalization. All are now tested.
2018-01-09 15:15:36 -08:00
Preetha
1712b03705
Merge branch 'master' into 0.8
2018-01-03 16:06:38 -06:00
Fabian Holler
f99aaa9134
revert change to increase min. CPU resource value from 20 to 100
...
In the commit 622d3ddb92ea7e656ef831641c02024cb5a5d6d1
"Fixed test and moved constants into standalone func" the minimum CPU
resource value for a job was increased from 20 to 100.
This can break the nomad setup for people that used lower CPU
values and are at the maximum MHz value of the available CPU on a
machine.
Change the minimum back to 20 MHz to ensure downwards compatibility.
2018-01-02 16:09:44 +01:00
James Rasell
45e8f977f7
Update `node-status` verbose command to include node address.
...
This change updates the `nomad node-status -verbose` command to
also include the address of the node. This is helpful for cluster
administrators to quickly discover information and access nodes
when required.
2017-12-21 08:58:35 +00:00
Kyle Havlovitz
1c07066064
Add autopilot functionality based on Consul's autopilot
2017-12-18 14:29:41 -08:00
Kyle Havlovitz
b775fc7b33
Added support for v2 raft APIs and -raft-protocol option
2017-12-12 10:17:16 -06:00
Michael Schurter
4ae115dc59
Allow custom ports for services and checks
...
Fixes #3380
Adds address_mode to checks (but no auto) and allows services and checks
to set literal port numbers when using address_mode=driver.
This allows SDNs, overlays, etc to advertise internal and host addresses
as well as do checks against either.
2017-12-08 12:03:00 -08:00
Chelsea Holland Komlo
021336eaee
fix up test fixture to properly parse
2017-12-06 16:23:55 -05:00
Chelsea Holland Komlo
b08611cfac
move kill_signal to task level, extend to docker
2017-12-06 14:36:39 -05:00
Preetha
0e6484a397
Merge pull request #3536 from angrycub/b-resource-memory-test-fix
...
Fixed test and moved constants into standalone func
2017-11-13 17:00:14 -06:00
Charlie Voiselle
1419e7a1c3
gofmt and goimports
2017-11-13 12:25:02 -05:00
Preetha Appan
be5fd87b9a
Fixes bug with display message logic due to deprecating GenericSource. Also added more test cases to cover a bunch more edge cases
2017-11-13 11:14:57 -06:00
Charlie Voiselle
26acd7f025
Fixed test and moved constants into standalone func
...
In #3520, work was done to true up the defaults for Nomad resource
stanzas with the documentation. This fixes the tests that I
accidentally broke in the process. Some questions were raised about
using dynamic elements as part of expects, which is why I opted to
copy the MinResources pattern. During this refactor I also noticed
that structs.go had a similar issue and an inconsistent minimum for CPU.
2017-11-13 12:05:30 -05:00
Alex Dadgar
b68f18bcc5
Merge pull request #3520 from hashicorp/b-memory-default-incorrect
...
`memory` default is 10, not 300
2017-11-10 12:43:37 -08:00
Charlie Voiselle
18750980ff
Simplified based on review comments
2017-11-09 20:09:37 -05:00
Charlie Voiselle
c15e91951b
Adopted pattern from LogConfig to handle panic when no resource stanza at all
...
In testing, realized that Resources might not be present at all. Testing this case caused a panic. Added a means to collect clean defaults in that case.
2017-11-09 10:57:57 -05:00
Charlie Voiselle
eda45a0cee
Canonicalize task and remove merge with MinResources
...
The current code would merge the job with the output of MinResources, causing the nil to be replaced with MemoryMB=helper.IntToPtr(10). When later `Canonicalize`d, the 10 would cause the default of 300 not to be applied. Running `Canonicalize` on the task itself guarantees that CPU, MemoryMB, and IOPS are set, so we can remove the Merge with MinResources.
2017-11-09 10:10:11 -05:00
Chelsea Komlo
bb96d79cae
Merge pull request #3480 from loomnetwork/fix-jobs-tests
...
jobs tests were sending invalid jobs
2017-11-08 11:49:19 -05:00
Charlie Voiselle
ff5d1a90a4
Rolled back documentation change in favor of correcting the default
2017-11-07 17:17:49 -05:00
Preetha Appan
61a49e08af
Fix formatting
2017-11-03 12:48:55 -05:00
Preetha Appan
56574ed87d
Adds deprecation notice to API struct definition.
2017-11-03 10:12:37 -05:00
Preetha Appan
5f09c968b3
Move logic for determining event display message to task_runner, added two new fields DisplayMessage and Details.
2017-11-03 09:13:01 -05:00
Preetha Appan
39d70be009
Add ModifyTime to Allocation and update it both on plan applies and client initiated updates
2017-11-01 15:13:48 -05:00