Commit graph

1135 commits

Author SHA1 Message Date
Sean Chittenden 7c24487850
Fix up various error handling 2016-06-16 14:40:09 -07:00
Sean Chittenden 71cd9984ae
Immediately query Consul upon initialization if we have no peers.
Also don't attempt to join the Server with itself.
2016-06-16 14:27:10 -07:00
Sean Chittenden 65319252b9
Rework server_auto_join to use a timer instead of the peer count.
It is perfectly viable for an admin to downsize a Nomad Server cluster
down to 1, 2, or `num % 2 == 0` (however ill-advised such activities
may be).  And instead of using `bootstrap_expect`, use a timeout-based
strategy.  If the `bootstrapFn` hasn't observed a leader in 15s it will
fall back to Consul and will poll every ~60s until it sees a leader.
2016-06-16 12:14:03 -07:00
Sean Chittenden b0fecbefc1
Define BootstrapExpect as an int32 so it can be manipulated atomically. 2016-06-16 12:00:15 -07:00
Alex Dadgar ea5d11e628 remove consul reference 2016-06-15 17:23:02 -07:00
Alex Dadgar bf14fd355f plan displays launch time of periodic jobs 2016-06-15 13:34:45 -07:00
Sean Chittenden 14f9d2a947
Use the config's log output 2016-06-15 12:40:51 -07:00
Sean Chittenden 5b0def194a
Namespace the log messages 2016-06-15 12:40:51 -07:00
Sean Chittenden bffc82d668
Do not consider the number of Serf members when considering falling back to Consul. 2016-06-15 12:40:51 -07:00
Sean Chittenden 324af8d7f1
Guard the auto-join functionality behind its consul.server_auto_join tunable 2016-06-15 12:40:51 -07:00
Sean Chittenden 5e0ced2ae7
Shuffle all datacenters vs only the nearest N datacenters.
Per discussion, we want to be aggressive about fanning out vs possibly
fixating on only local DCs.  With RPC forwarding in place, a random walk
may be less optimal from a network latency perspective, but it is guaranteed
to eventually result in a converged state because all DCs are candidates
during the bootstrapping process.
2016-06-15 12:40:51 -07:00
Sean Chittenden 2123460cf0
Bump various Consul search limits
Client: Search limit increased from 4 random DCs to 8 random DCs, plus nearest.
Server: Search factor increased from 3 to 5 times the bootstrap_expect.

This should allow for faster convergence in large environments (e.g.
sub-5min for 10K Consul DCs).
2016-06-15 12:40:51 -07:00
Sean Chittenden e8d1264dbc
Short-circuit the bootstrapFn if we have a leader 2016-06-15 12:40:51 -07:00
Sean Chittenden f05514335b
Teach Nomad servers how to fall back to Consul. 2016-06-15 12:40:51 -07:00
Alex Dadgar aea21affdb Document consul configuration 2016-06-14 15:21:57 -07:00
Sean Chittenden 6e22b680ce
Disambiguate auto_join from auto_register, rename reg to auto_advertise.
Provide an option that describes the value to the user vs the
operation performed by the software.  Momentarily introducing
`auto_join`
2016-06-14 12:11:38 -07:00
Sean Chittenden 4f14d51013
Fix up validation and allow existing unset timeouts to continue to be unset 2016-06-13 18:55:15 -07:00
Sean Chittenden c3a3fdc230
Upon further review, the Timeout needs to be validated for more than script checks.
This value is used for Consul HTTP and TCP checks.
2016-06-13 18:28:27 -07:00
Sean Chittenden baac19cad6
Remove diff check for ServiceID, may it R.I.P. 2016-06-13 18:22:53 -07:00
Sean Chittenden 79c675cf72
Guard against an interval and timeout being less than 1s 2016-06-13 18:19:40 -07:00
Sean Chittenden af8db7ec18
Don't export ServiceCheck validate 2016-06-13 18:17:43 -07:00
Sean Chittenden 08c88102a7
There is no "docker" check type 2016-06-13 18:15:07 -07:00
Alex Dadgar 8bbf4a55e5 Fix IDs and domain scoping 2016-06-13 16:30:58 -07:00
Alex Dadgar 8e231fa382 Rename ConsulService back to Service 2016-06-12 16:36:49 -07:00
Diptanu Choudhury 3024c080e8 Removing artifact check for java and qemu drivers 2016-06-12 12:57:35 +02:00
Alex Dadgar 480a281031 Merge pull request #1243 from hashicorp/f-run-modify-index
Add check-index flag to nomad run
2016-06-11 16:12:53 -07:00
Sean Chittenden 2f036231e5 Merge pull request #1201 from hashicorp/f-dyn-server-list
Dynamic Server Lists/Client Bootstrapping via consul.
2016-06-11 18:58:25 -04:00
Alex Dadgar 59b0a7b3f6 Merge pull request #1256 from hashicorp/b-node-gc
Improve partial garbage collection of allocations
2016-06-11 15:41:00 -07:00
Sean Chittenden bbd8dfa798
golint(1) compliance pass (e.g. Rpc* -> RPC) 2016-06-10 23:38:28 -04:00
Alex Dadgar 98bf249625 Partial GC allocations 2016-06-10 18:32:37 -07:00
Alex Dadgar 7ccc7d20a0 test 2016-06-10 15:48:59 -07:00
Alex Dadgar b064b392fc Only unblock if missed class was added after eval snapshot index 2016-06-10 15:24:06 -07:00
Sean Chittenden 948663c89a
Fix another unit test not expecting ServiceID 2016-06-10 16:50:35 -04:00
Sean Chittenden d99467ef5e
Always create a consul.Syncer. Use a default Consul Config if necessary. 2016-06-10 15:55:27 -04:00
Sean Chittenden 3d64daafd9
Fold RaftPeers() into its only call site now 2016-06-10 15:54:39 -04:00
Sean Chittenden 0ba1da9c9c
Always pass in a snapshot before calling constructNodeServerInfoResponse() 2016-06-10 15:54:39 -04:00
Sean Chittenden 1df6fc253f
Rename updateNodeUpdateResponse to constructNodeServerInfoResponse 2016-06-10 15:54:39 -04:00
Sean Chittenden 077203fe93
Update the structure of ConsulService to match reality.
ConsulService is the configuration for a Consul Service
2016-06-10 15:54:39 -04:00
Sean Chittenden 197feae679
Sync services with Consul by comparing the AgentServiceReg w/ ConsulService
The source of truth is the local Nomad Agent.  Any services not local that
have a matching prefix are removed.  Changed services are re-registered
and missing services are re-added.
2016-06-10 15:54:39 -04:00
Sean Chittenden 9a223936bb
Generate and sync Consul ServiceIDs consistently 2016-06-10 15:54:39 -04:00
Sean Chittenden 95c9d1a63e
Per-comment, remove structs.Allocation's Services attribute.
Nuke PopulateServiceIDs() now that it's also no longer needed.
2016-06-10 15:54:39 -04:00
Sean Chittenden 7956eb0c80
Rename structs.Task's Service attribute to ConsulService 2016-06-10 15:54:39 -04:00
Sean Chittenden fda03c5c9e
Change the signature of the PeriodicCallback to return an error
I *KNEW* I should have done this when I wrote it, but didn't want to
go back and audit the handlers to include the appropriate return
handling, but now that the code is taking shape, make this change.
2016-06-10 15:54:39 -04:00
Sean Chittenden 4973ec32bb
Rename structs.Services to structs.ConsulServices 2016-06-10 15:54:39 -04:00
Sean Chittenden 060300007e
Use a monotonically incrementing number to create unique node names.
Also remove the space from the "name" of the node
2016-06-10 15:50:11 -04:00
Sean Chittenden 1ec7d6c266
Push down the server list even on node registration and evaluation
Be mindful of the cost of taking a snapshot from the statestore and
reuse the snapshot if one has already been taken.
2016-06-10 15:50:11 -04:00
Sean Chittenden bff57a0dce
Reconcile, clean up, and centralize API version numbers (major and minor).
Reduce future confusion by introducing a minor version that is gossiped out
via the `mvn` Serf tag (Minor Version Number; `vsn` is already being used
to communicate `Major Version Number`).

Background: hashicorp/consul/issues/1346#issuecomment-151663152
2016-06-10 15:50:11 -04:00
Sean Chittenden dde6a4074d
Nuke trace-level logging in heartbeats 2016-06-10 15:50:11 -04:00
Sean Chittenden d76c042a13
Invert error handling logic 2016-06-10 15:50:11 -04:00
Sean Chittenden 1fe979a5e4
Remove types.ShutdownChannel and replace with chan struct{} 2016-06-10 15:50:11 -04:00
Sean Chittenden 438becb28b
Pass the datacenter name in the heartbeat
Servers that are part of a different datacenter are added as backup
servers instead of primary servers.
2016-06-10 15:50:11 -04:00
Sean Chittenden 89168b0c51
Invert check definition so the error is first 2016-06-10 15:50:11 -04:00
Sean Chittenden dc78baedfd
Fix typo in the comment to reflect the actual function name. 2016-06-10 15:50:11 -04:00
Sean Chittenden 410d85cc78
Rename the package from client/rpc_proxy to client/rpcproxy
Also rename `NewRpcProxy()` to just `New()` to avoid package stutter.
2016-06-10 15:50:11 -04:00
Sean Chittenden 1aefdb1e15
Use the correctly typed rand.Int* variant 2016-06-10 15:50:11 -04:00
Sean Chittenden 3a1dc9a194
Use rand.Int*n() where appropriate 2016-06-10 15:50:11 -04:00
Sean Chittenden e727fd8c3c
Centralize the creation of a consul/api.Config struct.
While documented, the consul.timeout parameter wasn't ever set
except one-off in the Consul fingerprinter.
2016-06-10 15:50:11 -04:00
Sean Chittenden f695d6d70d
Reconcile consul's address configuration section.
There were conflicting directives previously, both consul.addr and
consul.address were required to achieve the desired behavior.  The
documentation said `consul.address` was the canonical name for the
parameter, so consolidate configuration parameters to `consul.address`.
2016-06-10 15:50:11 -04:00
Sean Chittenden e60580b279
Define a type for the PeriodicCallback handlers and ShutdownChannel 2016-06-10 15:50:11 -04:00
Sean Chittenden 17116fc5a7
Rebalance Nomad client RPCs among different Nomad servers.
Implement client/rpc_proxy.RpcProxy.
2016-06-10 15:50:11 -04:00
Sean Chittenden b509da2d0c
Create a nomad/structs/config to break an import cycle.
Flattening and normalizing the various Consul config structures and
services has led to an import cycle.  Break this by creating a new package
that is intended to be terminal in the import DAG.
2016-06-10 15:48:36 -04:00
Sean Chittenden 6d162e1e03
Fix copy pasta comment.
These parameters are used to bootstrap Nomad servers, not Consul servers.
2016-06-10 15:48:36 -04:00
Sean Chittenden 4e2835d5ff
Use the correctly typed rand.Int* variant 2016-06-10 15:48:36 -04:00
Sean Chittenden 49deaae2ae
Seed random once in main 2016-06-10 15:48:36 -04:00
Sean Chittenden db97a88f94
Fix small typo 2016-06-10 15:48:36 -04:00
Sean Chittenden 66b4b2a99f
Use rand.Int*n() where appropriate 2016-06-10 15:48:36 -04:00
Sean Chittenden e36686a17d
Use consul/lib's RandomStagger
Removes four redundant copies of the method in the process.
2016-06-10 15:48:36 -04:00
Sean Chittenden e0e7d94450
Use consul/lib's RateScaledInterval 2016-06-10 15:48:36 -04:00
Alex Dadgar 527afa5119 Merge pull request #1244 from hashicorp/b-eval-reblock-test-hardening
Don't dequeue requeued evals in tests
2016-06-09 11:35:42 -07:00
Alex Dadgar 5d181d203c Add check-index flag to nomad run 2016-06-08 17:56:32 -07:00
Alex Dadgar b7e3a45fef fix channel being nil on restore 2016-06-07 15:03:08 -07:00
Alex Dadgar ecdce9a641 don't dequeue 2016-06-07 09:51:20 -07:00
Alex Dadgar cc95d5d332 GC Nodes even if they have terminal allocations 2016-06-03 16:24:41 -07:00
Alex Dadgar 5f3e27ecd8 Fix case in periodic dispatch and blocked evals where lock was not released 2016-06-03 13:46:57 -07:00
Alex Dadgar 3100b4a086 Change eval_endpoint test to not retry but block longer 2016-06-03 12:02:49 -07:00
Alex Dadgar 299a0bb4b3 up timeout for dequeue in test 2016-06-03 11:36:50 -07:00
Alex Dadgar 0f84d8968b Merge pull request #1221 from hashicorp/b-nil-wait
fix wait result being nil and some panics in the cli
2016-05-31 16:50:38 -07:00
Alex Dadgar 629542f64e flaky test 2016-05-31 23:50:14 +00:00
Alex Dadgar 7196133f0a Merge pull request #1220 from hashicorp/f-plan-failure-reasons
plan shows failure reasons and ordered annotations
2016-05-31 15:32:22 -07:00
Alex Dadgar b1298bb658 plan shows failure reasons and ordered annotations 2016-05-31 21:51:23 +00:00
Alex Dadgar 13f0ff03c1 Merge pull request #1209 from hashicorp/b-blocked-eval-fixes
Fix race condition in which a reblocked evaluation could be dropped
2016-05-31 13:26:58 -07:00
Alex Dadgar 060318845f Comments addressed 2016-05-31 11:39:03 -07:00
Alex Dadgar 75bd7a50f7 changelog 2016-05-27 17:43:20 -07:00
Alex Dadgar cc00a66e38 validate that tasks don't contain slashes 2016-05-27 17:17:10 -07:00
Alex Dadgar 1f9f015c1b Fix race condition in which a reblocked evaluation could be dropped 2016-05-27 16:53:10 -07:00
Alex Dadgar 6a236872b4 address comment 2016-05-25 10:30:47 -07:00
Alex Dadgar a3336b7761 test fixes and delete 2016-05-24 20:20:06 -07:00
Alex Dadgar 3fd51ecece Periodically unblock failed evaluations 2016-05-24 20:10:56 -07:00
Alex Dadgar bfdd5846e1 Track unblock indexes and check evals on block to see if they missed an update while in the scheduler 2016-05-24 20:10:56 -07:00
Alex Dadgar 15936822a4 Worker annotates evals with their snapshot index 2016-05-24 20:10:56 -07:00
Alex Dadgar 18d9e89065 Reuse the same evaluation and reblock it until there is no more work to do 2016-05-24 20:10:56 -07:00
Alex Dadgar 3cbb89c61e Merge pull request #1188 from hashicorp/f-no-failed-allocs
Failed Allocation Metrics stored in Evaluation
2016-05-24 20:06:28 -07:00
Alex Dadgar fcc57fbc66 rename SpawnedBlockedEval and simplify map safety check 2016-05-24 18:12:59 -07:00
Alex Dadgar b5ad18a7ea Don't restart successfully finished batch allocations 2016-05-24 17:23:18 -07:00
Alex Dadgar 1feb57b047 Evals track blocked evals they create 2016-05-19 13:09:52 -07:00
Alex Dadgar 8f5f12ae81 Scheduler no longer produces failed allocations; failed alloc metrics stored in evaluation 2016-05-18 18:11:40 -07:00
Alex Dadgar 1c6d3e129a EnqueueAll inserts all evaluations before unblocking dequeue calls 2016-05-18 12:13:59 -07:00
Alex Dadgar 045f7807e0 eval_broker.Enqueue no longer returns an error 2016-05-18 11:35:15 -07:00
Alex Dadgar 0c653c3c8f Fix determining whether a job is edited 2016-05-17 15:48:35 -07:00
Alex Dadgar a5ab96d40e Merge pull request #1168 from hashicorp/f-plan-endpoint
Job.Plan endpoint
2016-05-16 13:15:40 -07:00
Alex Dadgar a231f6f998 Switch to using the harness 2016-05-16 12:49:18 -07:00
Alex Dadgar 5085c25f8b Rename Cas to JobModifyIndex 2016-05-16 11:48:44 -07:00
Sean Chittenden dc28ab0cb5
Speling police 2016-05-15 09:41:34 -07:00
Diptanu Choudhury 2e2e2e500e Using a helper method to create service identifiers 2016-05-14 00:43:25 -07:00
Diptanu Choudhury d94e651c6e Merge branches 'master' and 'f-config-consul-block' of github.com:hashicorp/nomad into f-config-consul-block 2016-05-13 13:05:11 -07:00
Alex Dadgar bed4cb7a9f Fixes 2016-05-13 11:53:11 -07:00
Alex Dadgar 7a44ec5ccc Remove plan from the response 2016-05-12 11:29:38 -07:00
Alex Dadgar b634486ebd Plan api 2016-05-11 18:51:48 -07:00
Alex Dadgar 2bd962ca0d add endpoint tests 2016-05-11 17:02:14 -07:00
Diptanu Choudhury 347cb890d2 Removed allocID and task name from consul service 2016-05-11 16:26:41 -07:00
Alex Dadgar ab4c184916 Undo testing.go 2016-05-11 15:44:27 -07:00
Alex Dadgar 81f0286dd8 Merge branch 'master' into f-plan-endpoint 2016-05-11 15:39:36 -07:00
Alex Dadgar 724d0ab489 Merge pull request #1166 from hashicorp/f-job-diff2
Job Diff structure
2016-05-11 15:38:56 -07:00
Alex Dadgar 24bfaa70ac Fix switching diff structures 2016-05-11 15:36:28 -07:00
Alex Dadgar 3a4f99c976 Merge branch 'f-job-diff2' into f-plan-endpoint 2016-05-11 15:28:07 -07:00
Alex Dadgar 59e244d18e Add service diff 2016-05-11 15:25:59 -07:00
Alex Dadgar 2f74bd9a74 remove old diff 2016-05-11 11:27:00 -07:00
Alex Dadgar ccc3caae4a Merge branch 'f-job-diff2' into f-plan-endpoint 2016-05-11 11:25:25 -07:00
Alex Dadgar 967011e25c contextual diff 2016-05-11 11:11:26 -07:00
Alex Dadgar b4bb28c425 Job diff using generic structures 2016-05-10 22:23:34 -07:00
Alex Dadgar 8b45e2c474 Check if network asks have changed when checking task updates 2016-05-05 21:32:01 -07:00
Alex Dadgar ab0b57a9a1 Initial plan endpoint implementation - WIP 2016-05-05 11:21:58 -07:00
Alex Dadgar 789722bf3f Index fields using a map 2016-05-04 14:29:08 -07:00
Alex Dadgar 11289526b5 change the packages 2016-05-03 18:53:12 -07:00
Diptanu Choudhury d41939fefd Merge pull request #1144 from hashicorp/unique-service-names
Ensuring check names are unique
2016-05-03 14:36:49 -07:00
Alex Dadgar bea01efa5d Diff Task Configs, handle the nil cases, better type setting and more tests 2016-05-03 14:23:44 -07:00
Diptanu Choudhury 8a582c1b6c Ensuring check names are unique 2016-05-03 13:16:02 -07:00
Alex Dadgar 082d6901f7 visitor 2016-05-03 09:50:49 -07:00
Diptanu Choudhury 26a823873a Merge pull request #1143 from hashicorp/unique-service-names
Ensuring service names are unique
2016-05-03 01:15:31 -07:00
Alex Dadgar fa042c45b9 flatmap for walking the task config 2016-05-02 22:25:06 -07:00
Alex Dadgar e7b1c309ef Job diff infrastructure 2016-05-02 18:32:23 -07:00
Diptanu Choudhury 506ea1a206 Ensuring service names are unique 2016-05-02 13:43:07 -07:00
Alex Dadgar ff0dd9b81c Task is not eligible for update if User, Meta, or Resources change 2016-04-25 17:20:25 -07:00
Alex Dadgar de2b26650f Merge pull request #1097 from hashicorp/service-no-port
Invalidating services when they contain check of type tcp and http bu…
2016-04-19 09:18:52 -07:00
Alex Dadgar fce63bdc5c Ensure the label exists on checks and small enhancements 2016-04-18 19:38:47 -07:00
Alex Dadgar ea08b45c70 Always trigger evals on drain update 2016-04-18 18:43:52 -07:00
Diptanu Choudhury f617fb89fd Invalidating services when they contain check of type tcp and http but no ports 2016-04-15 12:50:55 +04:00
Alex Dadgar d94204554f Merge pull request #1012 from hashicorp/f-partition-gc
core: Limit GC size
2016-04-14 13:00:53 -07:00
Alex Dadgar b34ab80c93 Address comments 2016-04-14 11:41:04 -07:00
Alex Dadgar f38ed64938 Validate driver config on job register 2016-04-13 15:55:46 -07:00
Alex Dadgar 2d4b3f3d46 Merge pull request #1070 from hashicorp/f-interpret-artifacts
Interpret artifact source
2016-04-13 14:58:55 -07:00
Alex Dadgar 273dfaf2c7 Periodic jobs always are evaluated in UTC TZ 2016-04-12 09:47:25 -07:00
Alex Dadgar dc63c24e59 interpret the artifact source 2016-04-11 18:46:16 -07:00
Alex Dadgar 034bae90bb Revert "Remove client status from allocation TerminalStatus"
This reverts commit 819e1e4b3967c7029ee8221144666ff460fdd7ed.
2016-04-08 14:22:06 -07:00
Alex Dadgar ca938f205c Force GC garbage collects nodes last and fix eval GC to cleanup deregistered batch jobs 2016-04-08 11:42:02 -07:00
Diptanu Choudhury 26d1b60369 Adding raft peers in agent info 2016-04-05 10:30:46 -07:00
Diptanu Choudhury d472dc2988 Adding the raft leader addr to server stats 2016-04-03 16:38:39 -07:00
Diptanu Choudhury 778167e26f Using the node in the req to unblock evals 2016-03-31 18:19:01 -07:00
Diptanu Choudhury fd3c05a2be Making driver fingerprinters run periodically and unblocking evals when node registration happens 2016-03-31 18:11:27 -07:00
Alex Dadgar 066d006868 Limit GC size 2016-03-30 15:17:13 -07:00
Diptanu Choudhury 802844397d Supporting interpolation in service name, check name/protocol/path 2016-03-28 15:52:57 -07:00
Diptanu Choudhury b791c91c6f Added support for interpolation in check cmd and args 2016-03-28 15:02:00 -07:00
Diptanu Choudhury 5439d4c23c Interpolating service tags 2016-03-28 15:02:00 -07:00
Diptanu Choudhury 2963c3f1f4 Renamed checks cmd to command to be consistent with exec driver config 2016-03-28 14:05:12 -07:00
Diptanu Choudhury c872711d01 Merge pull request #986 from hashicorp/scripts
Implement script checks
2016-03-25 18:22:35 -07:00
Alex Dadgar b9a80f14f1 Limit garbage collection of batch jobs 2016-03-25 16:46:48 -07:00
Alex Dadgar 5fc83bd868 Dead->Complete 2016-03-25 12:56:54 -07:00
Alex Dadgar 09f63fd3c0 Remove client status from allocation TerminalStatus 2016-03-25 12:53:37 -07:00
Alex Dadgar 25dc8a0dcb Explain restart decision and display in alloc-status 2016-03-25 12:47:14 -07:00
Alex Dadgar 647c20e465 Merge pull request #978 from hashicorp/f-enforce-user
Operator specifiable blacklist for tasks using certain users
2016-03-25 12:44:05 -07:00
Alex Dadgar adbc7ed0d8 Merge pull request #981 from hashicorp/f-eval-force
Deregister creates an eval for a job even if it doesn't exist
2016-03-25 11:54:15 -07:00
Alex Dadgar 5293b6a5e9 small fixes 2016-03-25 11:38:18 -07:00
Alex Dadgar a0b364714a Merge pull request #974 from hashicorp/b-remove-gc-field
Remove the GC field on the job and use the job type
2016-03-25 10:41:08 -07:00
Alex Dadgar b80e61a66c Merge pull request #975 from hashicorp/f-rename-complete-alloc
Successful allocations are marked as complete instead of dead
2016-03-25 10:35:11 -07:00
Diptanu Choudhury caaa94a136 Enabling script checks 2016-03-24 14:54:26 -07:00
Diptanu Choudhury 2a5f300f92 Added a check type for consul service to delegate certain checks 2016-03-24 14:54:26 -07:00
Alex Dadgar 8968ff30c8 deregister creates an eval for a job even if it doesn't exist 2016-03-24 11:41:46 -07:00
Alex Dadgar 45dfae8f6f Operator specifiable blacklist for tasks using certain users 2016-03-24 10:55:14 -07:00
Alex Dadgar fc0e48b0ec Merge pull request #950 from mssola/raw-exec-user
client: added the "user" option to tasks
2016-03-24 08:57:13 -07:00
Diptanu Choudhury 76343a3748 Merge pull request #972 from hashicorp/scripts
Moving consul service to executor
2016-03-24 00:12:45 -07:00
Alex Dadgar 94522e7bed Successful allocations are marked as complete instead of dead 2016-03-23 18:08:19 -07:00
Alex Dadgar 47390c5186 remove the GC field on the job and use the job type 2016-03-23 18:02:01 -07:00
Diptanu Choudhury 62242595fc Using the name of the task and the alloc id in the service name 2016-03-23 17:35:29 -07:00
Diptanu Choudhury 83f41993aa removing ids of services and checks on the server side 2016-03-23 17:12:53 -07:00
Diptanu Choudhury 8b91606b22 Making the docker driver register services and checks 2016-03-23 12:59:22 -07:00
Diptanu Choudhury 7e2789d1fc Added the sync logic 2016-03-23 10:43:35 -07:00
Miquel Sabaté Solà 8c56c64615 client/driver: added the user config value
This way Nomad allows to set which user has to run the task.

Signed-off-by: Miquel Sabaté Solà <msabate@suse.com>
2016-03-23 12:57:31 +01:00
Diptanu Choudhury 2c6ec8cf89 Added consul sync 2016-03-23 00:50:41 -07:00
Alex Dadgar 5f2a9e5973 Make artifact escape analysis work on windows 2016-03-22 20:54:36 -07:00
Alex Dadgar ce55bef7a7 Clarify log config disk usage message 2016-03-22 10:27:14 -07:00
Alex Dadgar 2de9299cab ProposedAllocs dedups in-place updated allocations 2016-03-21 18:09:32 -07:00
Alex Dadgar a6a98daa2e Fix copy to distinguish between nil and empty map/structs 2016-03-21 16:29:21 -07:00
Alex Dadgar 823261c974 add destination to artifact block and validate it 2016-03-18 12:01:46 -07:00
Alex Dadgar bf74e2f790 display server leaders per region 2016-03-17 16:04:09 -07:00
Alex Dadgar 52aac4421f Fix validate test 2016-03-17 11:29:41 -07:00
Alex Dadgar 914207a5c2 Allow count zero 2016-03-17 11:02:59 -07:00
Alex Dadgar e6d50f96ef Respond to comments and fix test 2016-03-15 20:21:52 -07:00
Alex Dadgar 4dd7ff37ef validate that java/qemu have artifacts and better artifact validation 2016-03-15 15:05:36 -07:00
Alex Dadgar 9f878a16bf Download artifacts and remove old code for drivers 2016-03-15 13:28:57 -07:00
Alex Dadgar 081556db93 update the getter 2016-03-15 13:28:57 -07:00
Alex Dadgar 68059fa85c Go-getter options are generic and validation of checksums 2016-03-15 13:28:57 -07:00
Alex Dadgar 277a91336d Switch from string to artifact struct 2016-03-15 13:28:57 -07:00
Alex Dadgar 315fd954ea Parse artifacts and options 2016-03-15 13:28:57 -07:00
Sean Chittenden a16b030e92 Add two tests for invalid service names 2016-03-14 17:54:49 -07:00
Sean Chittenden 935fcc020f Restrict the service name to match RFC-1123 and RFC-2782 2016-03-14 17:44:59 -07:00
Alex Dadgar 4f0ed036c2 leader: nodes TTLing log promoted from DEBUG to WARN 2016-03-10 13:16:31 -08:00
Alex Dadgar 8ad65876bf Add allocation index environment variable 2016-03-09 18:09:51 -08:00
Alex Dadgar 330aec45b9 Remove outstanding reset from plan applier 2016-03-04 17:25:37 -08:00
Alex Dadgar 5bcfdeb41d Fix bug introduced with paused nack timers 2016-03-04 16:17:14 -08:00
Alex Dadgar 74726278b9 core: Pause NackTimeout while in the plan_queue as progress is being made 2016-03-04 12:59:35 -08:00
Alex Dadgar ad92e50a24 Avoid serializing Allocation.Resources 2016-03-01 14:09:25 -08:00
Alex Dadgar 61972c9ddc Refactor task runner to include driver starting into restart policy and add recoverable errors 2016-02-28 16:56:05 -08:00
Alex Dadgar a9d410dbee Store the job on allocations that are from stopped jobs 2016-02-24 14:50:59 -08:00
Armon Dadgar 6a883e1389 nomad: do not skip node plan if we receive an evaluation result 2016-02-22 22:32:02 -08:00
Alex Dadgar 88ddfbed31 Revert "Debug messages around the plan and plan response"
This reverts commit 7646657e6b8a892210779eaf5708341b94b29b24.
2016-02-22 22:24:52 -08:00
Alex Dadgar 64cabb5a97 Revert "Debug log in plan apply"
This reverts commit 532df0d27e02e0cb30b6e3a28b9c4b5778bd71b3.
2016-02-22 22:24:27 -08:00
Alex Dadgar fa8e2d31ee Revert "err logs in worker and scheduler"
This reverts commit 7befc586521b70eb84013bff367310e4cfa45c27.
2016-02-22 22:23:57 -08:00
Alex Dadgar c2242552a1 Debug messages around the plan and plan response 2016-02-22 20:36:11 -08:00
Alex Dadgar b72b3ce190 Debug log in plan apply 2016-02-22 16:12:16 -08:00
Alex Dadgar f48eabe753 err logs in worker and scheduler 2016-02-22 14:47:59 -08:00
Alex Dadgar 0b65fb410a add debug error if there is a partial apply and refresh index isn't set 2016-02-22 13:03:40 -08:00
Armon Dadgar fd5d7906f4 nomad: cleanup handling of refresh index 2016-02-22 11:43:41 -08:00
Alex Dadgar 5eeb9c5ef6 Merge pull request #834 from hashicorp/f-serverbatch
Batch client updates of allocation status
2016-02-21 21:39:06 -08:00
Armon Dadgar 5a9c7f7002 Fixing comment 2016-02-21 21:03:24 -08:00
Armon Dadgar 7fc7cd9453 nomad: batch client updates for 50msec 2016-02-21 18:51:34 -08:00
Armon Dadgar 06425211ab nomad: alloc update_alloc endpoint to take multiple allocations 2016-02-21 18:00:46 -08:00
Armon Dadgar 6e8d6e61fc nomad: fixing test name 2016-02-21 18:00:33 -08:00
Armon Dadgar 74fb2d5a65 nomad: fixing tests 2016-02-21 17:59:12 -08:00
Armon Dadgar 0e1f9a3def nomad: FSM supports multiple client updates 2016-02-21 17:58:56 -08:00
Armon Dadgar 4c4581a07e nomad/state: Allow UpdateAllocFromClient to take multiple allocs 2016-02-21 17:49:46 -08:00
Armon Dadgar e4524b173c Address review feedback 2016-02-21 13:32:45 -08:00
Armon Dadgar 6ddc4fd17b nomad: normalize job for evictions 2016-02-21 11:51:51 -08:00
Armon Dadgar 0e24df554e nomad: allow for job normalization 2016-02-21 11:42:54 -08:00
Armon Dadgar b36d96baf7 nomad: allow sharing a single Job in a plan and alloc update request 2016-02-21 11:31:27 -08:00
Armon Dadgar 0dbd4c46c9 nomad: make PopulateServiceIDs more efficient 2016-02-21 11:15:00 -08:00
Alex Dadgar 2a19e179bc Switch to using go/codec and use code generation 2016-02-20 18:05:17 -08:00
Armon Dadgar a08391455a nomad: simple encode/decode benchmark 2016-02-20 17:43:17 -08:00
Armon Dadgar 623b473ee6 Revert "Lzw compress raft entries"
This reverts commit a687ee6df316ffadac5305f4be9cf9a5642cd6af.

The LZW compression reduces the file size by ~60%, but increases
our encode and decode costs by 2x in a CPU bottlenecked path.
2016-02-20 17:36:39 -08:00
Alex Dadgar d42e0a7dfd Add force node gc 2016-02-20 16:11:29 -08:00
Alex Dadgar a3ac4bbc5a Merge pull request #828 from hashicorp/f-gc-endpoint
Job GC endpoint
2016-02-20 16:03:39 -08:00
Alex Dadgar d14d885805 Merge pull request #826 from hashicorp/f-compress-raft
LZW compress raft entries
2016-02-20 16:03:05 -08:00
Alex Dadgar 821b9c13db Merge pull request #823 from hashicorp/f-bitmap
Switch port collision checking to use bitmap instead of map
2016-02-20 16:02:48 -08:00
Armon Dadgar 50135ac402 nomad: guard against a negative pool size 2016-02-20 15:56:08 -08:00
Alex Dadgar 143972b6d9 Job GC endpoint 2016-02-20 15:50:41 -08:00
Armon Dadgar 3746bf7cd3 nomad: use CPU count to determine pool size 2016-02-20 13:42:13 -08:00
Armon Dadgar 1076fb44d9 nomad: Adding Size to EvaluatePool 2016-02-20 13:42:13 -08:00
Armon Dadgar eb0f00103e nomad: Allow resize of EvaluatePool 2016-02-20 13:42:13 -08:00
Armon Dadgar d69b8d055b nomad: extra comment 2016-02-20 13:41:49 -08:00
Alex Dadgar 6e8a57ba1a Lzw compress raft entries 2016-02-20 13:21:38 -08:00
Armon Dadgar 813b02c8b4 nomad: plan apply uses the EvaluatePool 2016-02-20 13:12:14 -08:00
Armon Dadgar 8724ef5f57 nomad: adding a worker pool for plan apply 2016-02-20 13:07:09 -08:00
Armon Dadgar 9784bb7285 nomad: cache bitmaps to avoid GC pressure 2016-02-20 12:18:22 -08:00
Armon Dadgar bdcc922247 nomad: use bitmap for port collision checking 2016-02-20 12:08:27 -08:00
Armon Dadgar 579c0fce7f nomad: adding simple bitmap implementation 2016-02-20 11:56:48 -08:00
Armon Dadgar 2d7236eb95 nomad: Use AllocsByNodeTerminal in plan apply 2016-02-20 11:26:38 -08:00
Armon Dadgar abf7e52689 nomad: adding compound index on alloc terminal status 2016-02-20 11:24:06 -08:00
Alex Dadgar e2a4c4ccc5 Client stores when it receives a task 2016-02-19 14:49:43 -08:00
Alex Dadgar 9e4b66a24c Merge pull request #808 from hashicorp/b-seed-rand
Seed the server's random number generator
2016-02-17 14:05:43 -08:00
Alex Dadgar df0767e114 Merge pull request #812 from hashicorp/f-disable-cores
Disable half the workers, freeing half the CPUs, on the leader
2016-02-17 14:05:27 -08:00
Alex Dadgar 75d7fce90c Disable half the workers, freeing half the CPUs 2016-02-17 13:50:06 -08:00
Alex Dadgar e1dc47de91 Remove blank line 2016-02-17 11:48:52 -08:00
Alex Dadgar 25c5e543f4 Use crypto random seed 2016-02-17 11:47:02 -08:00
Alex Dadgar 18d2d9c091 Killing a driver handle is retried with an exponential backoff 2016-02-16 21:00:49 -08:00
Alex Dadgar 01cadf7cb0 Seed the server's random number generator 2016-02-16 19:40:02 -08:00
Alex Dadgar d4d8197244 lower cpu minimum 2016-02-16 10:25:45 -08:00
Alex Dadgar e0eb8911e8 Merge branch 'master' of https://github.com/hashicorp/nomad 2016-02-12 16:13:28 -08:00
Alex Dadgar f6e0349d3b go vet 2016-02-12 16:08:58 -08:00
Alex Dadgar c000dce72b Add human readable output to fs 2016-02-12 15:36:52 -08:00
Diptanu Choudhury d64b109afc Bumped up disk requirements in a test 2016-02-11 22:50:34 -08:00
Diptanu Choudhury c84a000416 Fixed some more tests 2016-02-11 22:33:41 -08:00
Diptanu Choudhury a254d664cb Bumping up default disk size 2016-02-11 18:15:55 -08:00
Diptanu Choudhury c6e395fa0e Fixed tests 2016-02-11 17:34:46 -08:00
Diptanu Choudhury 7eba3f3b5f Merge pull request #763 from hashicorp/f-logrotator
Implemented log rotation for drivers
2016-02-11 15:02:39 -08:00
Diptanu Choudhury 3db617472c Fixed an issue with purge 2016-02-11 14:44:35 -08:00
Diptanu Choudhury e7f8997bb1 Added a validator for log storage 2016-02-11 12:30:47 -08:00
Alex Dadgar fa4b9b9964 Merge pull request #790 from hashicorp/b-eval-broker-delivery-limit
Fix panic when Ack occurs at delivery limit
2016-02-11 11:09:28 -08:00
Alex Dadgar d2e88f0116 Fix panic when Ack occurs at delivery limit 2016-02-11 11:07:18 -08:00
Diptanu Choudhury 9e994cb8d0 Fixed some tests 2016-02-11 10:42:56 -08:00
Alex Dadgar bb36684fec Copy handles empty case 2016-02-11 09:08:20 -08:00
Alex Dadgar 4d7ed4f164 Strip as much copystructure as possible 2016-02-10 17:54:43 -08:00
Diptanu Choudhury 675b679e3f Added a LogConfig to the mock tasks 2016-02-10 17:46:35 -08:00
Diptanu Choudhury 43206bfc1c Made a DefaultLogConfig method 2016-02-10 16:44:31 -08:00
Alex Dadgar 0c4c3fc4ee safe but slow 2016-02-10 13:44:53 -08:00
Diptanu Choudhury 9d6eed1fb4 Merge branch 'master' into f-logrotator 2016-02-09 12:25:44 -08:00
Alex Dadgar a64f07b809 Store alloc creation time 2016-02-08 21:58:05 -08:00
Diptanu Choudhury b8e4941862 Fixed merge conflicts 2016-02-08 10:24:27 -08:00
Alex Dadgar 2627b742fa Fix == 2016-02-05 15:15:56 -08:00
Alex Dadgar 40a373f9e0 Ensure there are no periods in the service name 2016-02-05 14:42:35 -08:00
Diptanu Choudhury 5b9dcf9926 Added parsing logic for the logrotator 2016-02-04 23:28:01 -08:00
Alex Dadgar 5018f5dd1e Only interpret vars wrapped in braces 2016-02-04 17:26:46 -08:00
Alex Dadgar 25cb7fc03d Fix computed class when the job has multiple task groups 2016-02-03 21:22:18 -08:00
Alex Dadgar 6f20d3f435 Restart on-success shouldn't be user specifiable 2016-02-02 17:35:06 -08:00
Alex Dadgar db7997d08c Merge branch 'master' of https://github.com/hashicorp/nomad 2016-02-02 16:13:34 -08:00
Alex Dadgar 2b17dcdf78 Zero IOPS by default 2016-02-02 16:13:25 -08:00
Alex Dadgar 30ccc2f46d Merge pull request #742 from hashicorp/b-restart-attempts
Restart policy handling / validation adjustments
2016-02-02 15:47:07 -08:00
Ryan Uber 198a1a3cd4 structs: more restart policy validation 2016-02-02 14:32:30 -08:00
Alex Dadgar 4748bfd869 Tests 2016-02-02 13:50:30 -08:00
Alex Dadgar f83b26029e Fix test 2016-02-02 13:27:50 -08:00
Alex Dadgar 9c9fe2ea2d Network check 2016-02-02 13:27:50 -08:00
Alex Dadgar 5142f1e4c0 Validate min resources and default them 2016-02-02 13:27:50 -08:00
Alex Dadgar 54aebfd329 Merge pull request #731 from hashicorp/f-reduce-client-alloc-pulls
Client only pulls update allocations from server
2016-02-02 13:26:36 -08:00
Alex Dadgar ea00dbd576 Comment on AllocModifyIndex 2016-02-01 18:15:12 -08:00
Alex Dadgar a72d39bd04 Don't share task state with the alloc in the task runner 2016-02-01 17:47:53 -08:00
Alex Dadgar 2d98c0eadd Fix double pull with introduction of AllocModifyIndex 2016-02-01 15:43:59 -08:00
Alex Dadgar 36df3aaac7 Remove running, system scheduler, and fix tg overriding eligibility 2016-01-31 20:56:52 -08:00
Alex Dadgar c55eb0816c Address comments 2016-01-31 18:46:45 -08:00
Alex Dadgar 20c6c3e8a5 Disable blocked eval tracker when leadership is lost 2016-01-31 18:04:45 -08:00
Alex Dadgar 52a27e84c5 Leader reaps and cancels duplicate evals 2016-01-31 18:04:45 -08:00
Alex Dadgar dc978066e2 dedup blocked evals by job id 2016-01-31 18:04:45 -08:00
Alex Dadgar dd19b7e848 Buffered unblock 2016-01-31 18:04:45 -08:00