1308 Commits

Author SHA1 Message Date
Alex Dadgar 3693de99d8 Merge pull request #1783 from hashicorp/f-consul-template
Consul template manager
2016-10-06 15:05:01 -07:00
Alex Dadgar acdbc2e1c9 Fix diff test 2016-10-06 14:14:59 -07:00
Alex Dadgar 19a6aefd68 more vendoring 2016-10-06 12:36:44 -07:00
Alex Dadgar d2837dec44 Do not allow path to escape the alloc dir for the FS commands 2016-10-03 14:58:44 -07:00
Alex Dadgar 4eaabd675c Consul Template Manager 2016-10-03 12:59:31 -07:00
Diptanu Choudhury d50c395421 Getting snapshot of allocation from remote node (#1741)
* Added the alloc dir move

* Moving allocdirs when starting allocations

* Added the migrate flag to ephemeral disk

* Stopping migration if the allocation doesn't need migration any more

* Added the GetAllocDir method

* refactored code

* Added a test for alloc runner

* Incorporated review comments
2016-10-03 09:59:57 -07:00
Diptanu Choudhury 9b98d70aec Fixed task group validation tests 2016-09-27 17:57:30 -07:00
Michael Schurter 80085ddda5 Merge pull request #1735 from hashicorp/b-bootstrap-flapping
Retry all servers on RPC call failure
2016-09-27 16:33:15 -07:00
Alex Dadgar fca2becaa5 Parse 2016-09-26 15:36:11 -07:00
Alex Dadgar b4c3f015bc Template diff 2016-09-26 15:36:11 -07:00
Alex Dadgar e925f8ca58 Add template structs 2016-09-26 15:36:11 -07:00
Diptanu Choudhury dae7f88118 Not setting a drained node as preferred node (#1740) 2016-09-23 21:15:50 -07:00
Michael Schurter 2ab5264595 Retry all servers on RPC call failure
rpcproxy is refactored into serverlist which prioritizes good servers
over servers in a remote DC or who have had a failure.

Registration, heartbeating, and alloc status updating will retry faster
when new servers are discovered.

Consul discovery will be retried more quickly when no servers are
available (eg on startup or an outage).
2016-09-23 11:44:48 -07:00
Alex Dadgar d64ef28c39 Handle the various valid root cases 2016-09-21 17:30:57 -07:00
Alex Dadgar f99d84d2c3 Renew root tokens where applicable 2016-09-21 16:49:15 -07:00
Alex Dadgar 08d79893b0 Merge pull request #1727 from kaskavalci/patch-4
Change error message
2016-09-21 13:53:13 -07:00
Alex Dadgar 59df3dcf58 Vault diff 2016-09-21 13:49:34 -07:00
Alex Dadgar 12de69a66f Struct and parse 2016-09-21 11:31:09 -07:00
Alex Dadgar 5a9919bfbf Fix VaultPolicies returning non-empty map when there are no policies 2016-09-21 11:14:09 -07:00
Halil Kaskavalci d166f8e3bf Lowercase message 2016-09-21 20:43:15 +03:00
Halil Kaskavalci 389beda383 Change error message
Local Disk -> Ephemeral Disk
2016-09-21 17:57:19 +03:00
Alex Dadgar 50efdb00e9 Merge pull request #1713 from hashicorp/f-alloc-runner-vault
Vault integration in client
2016-09-20 16:15:55 -07:00
Alex Dadgar 64de46432a Merge pull request #1677 from hashicorp/f-vault-implicit-constraint
Vault implicit Task Group constraint + allow root tokens
2016-09-20 16:15:32 -07:00
Alex Dadgar ada5d8dd0c alloc-status 2016-09-14 18:27:13 -07:00
Diptanu Choudhury 1b3c5e98c8 Renaming LocalDisk to EphemeralDisk (#1710)
Renaming LocalDisk to EphemeralDisk
2016-09-14 15:43:42 -07:00
Alex Dadgar 6702a29071 Vault token threaded 2016-09-14 13:30:01 -07:00
Diptanu Choudhury 6028682ad2 Adding LocalDisk to alloc.Job 2016-09-01 17:41:50 -07:00
Alex Dadgar 874dfbc84a Implicit vault constraint 2016-09-01 14:23:40 -07:00
Alex Dadgar b0adaa5301 Allow root token 2016-09-01 12:05:08 -07:00
Alex Dadgar 3c9936ae4a Merge pull request #1659 from hashicorp/f-revoke-accessors
Token revocation and keeping only a single Vault client active among servers
2016-08-31 14:10:46 -07:00
Alex Dadgar 6047414fb9 address comments 2016-08-31 14:10:33 -07:00
Diptanu Choudhury d94bb45ad3 Added some more comments 2016-08-31 14:06:31 -07:00
Diptanu Choudhury 64c57d9136 Added a test 2016-08-31 13:40:43 -07:00
Diptanu Choudhury 52e9946da9 Implemented SetPrefferingNodes in stack 2016-08-30 16:17:50 -07:00
Diptanu Choudhury bfee7b30a3 Introducing shared resources in alloc 2016-08-29 13:49:25 -07:00
Alex Dadgar 48696ba0cc Use tomb to shutdown
Token revocation

Remove from the statestore

Revoke tokens

Don't error when Vault is disabled as this could cause issue if the operator ever goes from enabled to disabled

update server interface to allow enable/disable and config loading

test the new functions

Leader revoke

Use active
2016-08-28 14:06:25 -07:00
Diptanu Choudhury 3dec7cd2c9 Added LocalDisk to diff 2016-08-26 20:38:50 -07:00
Diptanu Choudhury 13497913f9 Ensuring resources are re-calculated properly in fsm 2016-08-26 20:13:11 -07:00
Diptanu Choudhury e79cb67391 Changing implementation of AllocsFit 2016-08-26 17:28:29 -05:00
Diptanu Choudhury 2f681b6415 Added copy method to LocalDisk 2016-08-26 14:24:47 -05:00
Diptanu Choudhury b844dc3600 Fixed more tests 2016-08-26 01:51:19 -05:00
Diptanu Choudhury 3447658bba Added scheduler tests to ensure disk constraints are honored 2016-08-25 15:31:56 -05:00
Diptanu Choudhury ffaf6c6299 Fixed some tests 2016-08-25 13:56:39 -05:00
Diptanu Choudhury 8105613c25 Added an upgrade path for existing jobs with no local disk 2016-08-25 13:00:20 -05:00
Diptanu Choudhury ec73c768f1 Making the scheduler use LocalDisk instead of Resources.DiskMB 2016-08-25 12:27:42 -05:00
Diptanu Choudhury d156f32f94 Implemented job spec parsing for sticky volumes 2016-08-24 13:51:15 -05:00
Diptanu Choudhury 05fe72e89e fixed the exec fingerprinter test 2016-08-23 16:40:56 -05:00
Alex Dadgar db2806a9c1 Merge pull request #1629 from hashicorp/f-derive-token
Server Deriving Tokens on behalf of Clients
2016-08-23 13:58:47 -07:00
Alex Dadgar 1da8566322 Merge pull request #1580 from hashicorp/f-disk-usage-monitoring
Monitor and enforce shared allocation directory disk usage
2016-08-23 09:49:53 -07:00
Alex Dadgar 76d324a8f0 fix comment 2016-08-22 11:41:47 -07:00
Diptanu Choudhury 4ca623bcfe blocking chained allocations until previous allocation has terminated 2016-08-22 11:34:24 -05:00
Alex Dadgar 19be6b57b2 fixes 2016-08-19 20:02:32 -07:00
Alex Dadgar 7f18074637 remove debug 2016-08-19 17:30:16 -07:00
Alex Dadgar c394a45b72 create token tests 2016-08-19 16:40:37 -07:00
Alex Dadgar 909f552d9e tests 2016-08-19 16:40:37 -07:00
Alex Dadgar 10cd844ca8 Commit Vault Accessors to vault and return the response 2016-08-19 16:40:37 -07:00
Alex Dadgar 901000f789 Raft message, fsm and state store table 2016-08-19 16:40:37 -07:00
Alex Dadgar 19752edfaf Pipeline Vault token creation 2016-08-19 16:40:37 -07:00
Alex Dadgar 123a26ffea Rate limiting 2016-08-19 16:40:37 -07:00
Alex Dadgar 94b870a58b Start 2016-08-19 16:40:37 -07:00
Alex Dadgar e7e8d5ab50 Merge pull request #1597 from hashicorp/f-node-secret-id
Nodes generate Secret ID and used for retrieving allocations/registering
2016-08-19 10:51:21 -07:00
Alex Dadgar 16285a0dc6 Enforce serverside secret id match 2016-08-19 10:50:49 -07:00
Diptanu Choudhury f8cd51b6e9 Enabling vault if token is present 2016-08-18 12:03:50 -07:00
Ivo Verberk 2a17895a83 Disk resource monitoring and enforcement 2016-08-18 07:59:03 +02:00
Alex Dadgar 895c31f605 Nodes generate Secret ID and used for retrieving allocations and registering 2016-08-17 16:31:47 -07:00
Alex Dadgar cabc9f829d Merge pull request #1614 from hashicorp/f-nomad-0.5
F nomad 0.5
2016-08-17 16:27:50 -07:00
Alex Dadgar 9bd9948c5b Job Register endpoint validates token 2016-08-17 16:25:38 -07:00
Alex Dadgar f9f019fa62 LookupToken 2016-08-17 16:25:38 -07:00
Alex Dadgar a981fb4e0e test renewal 2016-08-17 16:25:38 -07:00
Alex Dadgar a8efce874f Token renewal and beginning of tests 2016-08-17 16:25:38 -07:00
Alex Dadgar 713e310670 Renew loop 2016-08-17 16:25:38 -07:00
Alex Dadgar 750a44b2c0 Create a Vault interface for the server 2016-08-17 16:25:38 -07:00
Alex Dadgar 6e2f0a2776 Server has Vault API client 2016-08-17 16:25:38 -07:00
Diptanu Choudhury 59892e4004 Merge pull request #1612 from hashicorp/diptanu-0-5-branch
Chained Allocs
2016-08-17 16:23:32 -07:00
Alex Dadgar 4135b4ece7 Address field name feedback 2016-08-17 16:23:29 -07:00
vishalnayak bd0a89bdae Added mapstructure tag for VaultToken 2016-08-17 16:23:29 -07:00
Alex Dadgar ad088db6eb Store the Vault token once per job 2016-08-17 16:23:29 -07:00
Alex Dadgar 7d899b6c60 Pass Vault config to client 2016-08-17 16:23:29 -07:00
Alex Dadgar 14b4312502 Add vault struct 2016-08-17 16:23:29 -07:00
Alex Dadgar eac2675faf Add enabled field 2016-08-17 16:23:29 -07:00
Alex Dadgar c913e4396f Add Vault config to server 2016-08-17 16:23:29 -07:00
Alex Dadgar 1584cfe93e small fixes 2016-08-17 16:23:29 -07:00
Alex Dadgar 0ca4a9fa4f Change token/role names 2016-08-17 16:23:29 -07:00
Alex Dadgar adb3ce847f change config variable names to match vault 2016-08-17 16:23:29 -07:00
Alex Dadgar fab7893774 vendor + api 2016-08-17 16:23:29 -07:00
Alex Dadgar b32128aa23 Initial config block 2016-08-17 16:23:29 -07:00
Alex Dadgar de6e662eb4 Fix service validate test 2016-08-17 11:09:40 -07:00
Alex Dadgar a3bcc1cbb1 Fix network dynamic port test 2016-08-17 11:08:21 -07:00
Alex Dadgar be51f1b265 Fix TaskDiff test 2016-08-17 11:07:11 -07:00
Kenjiro Nakayama b6c39349b7 struct: tiny: output case number of the diff test 2016-08-17 19:15:59 +09:00
Diptanu Choudhury 2e22fea61d Merge pull request #1599 from hoffoo/initial_check_state
Add support for initial check status
2016-08-16 15:16:47 -07:00
Alex Dadgar 8a23780aee Fix bitmap test and check bitmap bounds 2016-08-16 15:16:35 -07:00
Marin 8fc52974a3 fix initial status tests 2016-08-16 14:34:36 -07:00
Marin 69bc3a8fc8 Add support for initial check status 2016-08-16 12:05:15 -07:00
Diptanu Choudhury c1a455983d Added the chained alloc for system scheduler 2016-08-16 10:49:45 -07:00
Alex Dadgar ce0b78525d inclusive range 2016-08-15 13:13:04 -07:00
Diptanu Choudhury 761cc40cd2 Fixed a make vet warning 2016-08-12 12:09:44 -07:00
Diptanu Choudhury dd7e69006e Not running tests in parallel 2016-08-11 21:53:27 -07:00
Diptanu Choudhury 01e08a64ee Merge pull request #1569 from hashicorp/fix-network-port-collisions
Fix network port collisions when asking for dyn ports
2016-08-11 16:19:45 -07:00
Diptanu Choudhury d81b20c1a6 Fix network port collisions when asking for dyn ports 2016-08-11 16:18:45 -07:00
Alex Dadgar 007a538515 Fix core scheduler tests 2016-08-11 14:36:22 -07:00
Alex Dadgar 6e5c47a315 Merge pull request #1526 from hashicorp/b-random-ports
Set difference when picking random ports
2016-08-10 16:37:57 -07:00
Alex Dadgar 5a37e720c5 Fixes plus address feedback 2016-08-10 16:37:26 -07:00
Alex Dadgar b8fd989d3a Try stochastic and fallback to precise 2016-08-10 11:47:20 -07:00
Diptanu Choudhury ab94c8eed9 Marking allocations which are not terminal and are on down nodes as lost 2016-08-09 13:11:58 -07:00
Diptanu Choudhury 3cc684211a Added a test to ensure summaries are correct when a node goes down 2016-08-09 10:16:17 -07:00
Diptanu Choudhury c63a78b9a3 Removing the check related to checking version of server before reconciling in leader 2016-08-05 17:48:37 -07:00
Diptanu Choudhury 1518f23d0a Making servers reconcile job summaries when they acquire leadership 2016-08-05 16:47:36 -07:00
Alex Dadgar 1b620bcdd8 Add a test 2016-08-05 16:23:41 -07:00
Alex Dadgar 9089a279a1 Set difference when picking random ports 2016-08-05 16:08:35 -07:00
Diptanu Choudhury 6dc5b1972c Setting job's create index as summary create index during reconciliation 2016-08-04 15:14:01 -07:00
Alex Dadgar 2fb67fefb5 Merge pull request #1516 from hashicorp/f-lost-state-sched
Make scheduler mark allocations as lost
2016-08-04 11:36:02 -07:00
Diptanu Choudhury 88d383c47f Updated tests and comments 2016-08-04 11:29:36 -07:00
Alex Dadgar e33bda76bf test sched doesn't mark complete as lost + core_sched tests 2016-08-04 11:24:17 -07:00
Diptanu Choudhury c24e8ba7d8 Not updating summary if job is de-registered 2016-08-03 17:00:08 -07:00
Diptanu Choudhury 74caed0c7a Added an endpoint for users to reconcile job summaries 2016-08-03 16:12:47 -07:00
Alex Dadgar ac3328e812 Make scheduler mark allocations as lost 2016-08-03 15:57:46 -07:00
Diptanu Choudhury 1b60e0823a Added a test for restoring the summaries in fsm 2016-08-03 11:58:36 -07:00
Alex Dadgar 4197e62e78 Remove old way of marking lost 2016-08-03 11:20:56 -07:00
Diptanu Choudhury b95cf91ee3 using the job associated with the alloc to determine if job is present 2016-08-02 19:14:05 -07:00
Diptanu Choudhury 6f8c40fca7 Not updating summary if create index of summary not same as job's create index 2016-08-02 18:59:45 -07:00
Diptanu Choudhury b2d388bcba Merge pull request #1508 from hashicorp/b-dont-update-job
Do not update the job of allocations that are being stopped
2016-08-02 18:58:39 -07:00
Alex Dadgar 2332a58944 Do not update the job of allocations that are being stopped 2016-08-02 17:53:31 -07:00
Diptanu Choudhury 87fdeb5393 Updated the logic to update job summary 2016-08-02 16:08:20 -07:00
Diptanu Choudhury 3966a46996 Updating the summary after we have updated the current allocation when client updates the alloc 2016-08-02 15:06:39 -07:00
Diptanu Choudhury 92e32e46f1 Updating the summary after we have updated the current allocation 2016-08-02 14:59:41 -07:00
Diptanu Choudhury 3aa4f39094 Checking if a job is nil before updating the allocation 2016-08-01 17:05:48 -07:00
Diptanu Choudhury b69b7129a6 Using the parent transaction to query the allocation while updating summary 2016-08-01 16:46:05 -07:00
Diptanu Choudhury b0e1f02e26 Not updating job summaries if jobs are not present 2016-07-28 15:24:27 -07:00
Diptanu Choudhury 0dd8a84de0 Marking the desired state of an allocation as stop if the node on which it runs disconnects 2016-07-27 17:07:08 -07:00
Diptanu Choudhury b857d7c6c1 Copying job summary before mutating it 2016-07-27 14:46:46 -07:00
Diptanu Choudhury 1bab053490 Updated some tests 2016-07-26 15:11:48 -07:00
Diptanu Choudhury 10a5c06a5a Running the tests in verbose mode 2016-07-26 14:02:47 -07:00
Diptanu Choudhury 9943053239 Fixed a test 2016-07-25 22:22:55 -07:00
Diptanu Choudhury d1a6bdb4ba Making the queued allocations bind late 2016-07-25 22:11:11 -07:00
Diptanu Choudhury 5bded8d54d Setting the right indexes while creating Job Summary 2016-07-25 17:51:20 -07:00
Diptanu Choudhury 3089833397 Reconciling the queued allocations during restore 2016-07-25 17:31:40 -07:00
Diptanu Choudhury cc37ec33cf Renamed Job.GetJobSummary to Job.Summary 2016-07-25 17:31:40 -07:00
Diptanu Choudhury 6193529040 Fixed more tests 2016-07-25 17:31:40 -07:00
Diptanu Choudhury de2c79f421 Added test for blocking query of job summary endpoint 2016-07-25 17:26:38 -07:00
Diptanu Choudhury cce5f483ae Added some more tests 2016-07-25 17:26:38 -07:00
Diptanu Choudhury f1c9427c37 Added code to create missing job summaries 2016-07-25 17:26:38 -07:00
Diptanu Choudhury dabb83063b Review comments 2016-07-25 17:26:38 -07:00
Diptanu Choudhury 50842b88c7 Fixed some bugs 2016-07-25 17:26:38 -07:00
Diptanu Choudhury 1405687a88 Fixed some error messages and conditions 2016-07-25 17:26:38 -07:00
Diptanu Choudhury ef97956333 Added support for retrieving job summary in api 2016-07-25 17:26:38 -07:00
Diptanu Choudhury 632ced5eb2 Adding the summary to the Job Stub 2016-07-25 17:26:38 -07:00
Diptanu Choudhury a5bb0ca6fc Moved the job endpoint around 2016-07-25 17:26:38 -07:00
Diptanu Choudhury 7bafb7c675 Updating the job summary while mutating jobs and allocation objects 2016-07-25 17:26:38 -07:00
Diptanu Choudhury 109b05cb29 Applying changes to job updates via FSM 2016-07-25 17:26:38 -07:00
Diptanu Choudhury affbf5b6e4 Updating the job summary table only if an evaluation has any Queued Allocations 2016-07-25 17:26:38 -07:00
Diptanu Choudhury 2ff2acbfc0 Added the job summary related endpoints 2016-07-25 17:26:38 -07:00
Diptanu Choudhury 1cc0bc392b Setting the number of queued allocations per task group 2016-07-25 17:26:38 -07:00
Alex Dadgar e26f826189 fix job gc tests 2016-07-25 14:56:23 -07:00
Alex Dadgar 0db55c1dce Revert "Fix job gc tests"
This reverts commit 4be50ac8c78b09d603d9680064391d449b268436.
2016-07-25 14:53:07 -07:00
Alex Dadgar e61aa2484a Fix job gc tests 2016-07-25 14:49:57 -07:00
Alex Dadgar 42df093939 Merge pull request #1456 from hashicorp/b-system-job
Node Register handles transitioning to ready and creating evals
2016-07-25 12:46:35 -07:00
Alex Dadgar c4d7f62189 add down to up test 2016-07-25 12:46:18 -07:00
Alex Dadgar 90748cedad Add killing event and mark task as not running when killed 2016-07-21 15:49:54 -07:00
Alex Dadgar ebac5cb283 Node.Register handles the case of transitioning to ready and creating evals 2016-07-21 15:22:02 -07:00
Alex Dadgar af09ef0832 fix validation tests 2016-07-20 16:43:20 -07:00
Alex Dadgar e0114fee05 InitFields to Canonicalize 2016-07-20 16:08:52 -07:00
Alex Dadgar 0634eeb3e0 Sanitize incoming slices/maps 2016-07-20 16:00:02 -07:00
Diptanu Choudhury c8a52f36d5 Merge pull request #1429 from nak3/default-resources
Update comments for the DefaultResources and DefaultLogConfig
2016-07-20 10:09:36 -07:00
Diptanu Choudhury d7e397d3f9 Merge pull request #1439 from nak3/fix-error-message
Add missed service name of the error message for the invalid port
2016-07-20 10:08:45 -07:00
Kenjiro Nakayama 473eb6561a Stop using index for task and task group validation 2016-07-20 22:23:35 +09:00
Kenjiro Nakayama c24e886a5f Add missed service name of the error message for the invalid port 2016-07-20 20:41:24 +09:00
Kenjiro Nakayama 466d7ac1ec Update comments for the DefaultResources and DefaultLogConfig 2016-07-19 15:37:54 +09:00
Alex Dadgar c28027bc9e Merge pull request #1421 from hashicorp/f-system-count-zero
Allow count 0 on system jobs
2016-07-13 14:39:23 -06:00
Alex Dadgar 6bc7009f8c Allow count 0 on system jobs 2016-07-13 13:50:08 -06:00
Diptanu Choudhury 3836d6e54e Merge pull request #1383 from hashicorp/f-job-summary
Job Summary - Part 1
2016-07-13 13:34:27 -06:00
Diptanu Choudhury e35369ec83 Fixed typos in comments 2016-07-13 13:25:07 -06:00
Diptanu Choudhury 487c66b84d Removing the queued state of Job Summary and alloc desired status false 2016-07-13 13:20:46 -06:00
Alex Dadgar c8e7b909c7 Merge pull request #1404 from hashicorp/f-streaming
Implement a streaming API and tail in the fs command
2016-07-12 17:23:04 -06:00
Diptanu Choudhury daa83a4f3e Renamed jobsummary to job_summary 2016-07-12 16:00:35 -06:00
Diptanu Choudhury 5d782abd50 Refactored the test 2016-07-12 14:37:51 -06:00
Diptanu Choudhury 00b9b4c6e8 Accounting lost state of allocations 2016-07-12 14:27:45 -06:00
Alex Dadgar b87cf12f6f Merge pull request #1403 from hashicorp/f-hold-rpc
Gracefully handle short lived outages by holding RPC calls
2016-07-12 13:52:33 -06:00
Diptanu Choudhury e8d1aee3f4 Added a method for listing jobs whose id matches a prefix 2016-07-12 11:41:13 -06:00
Diptanu Choudhury 313d7aa7f5 Added a test to ensure client alloc updates are happening properly 2016-07-12 11:41:13 -06:00
Diptanu Choudhury 91b828d299 Updated logic to handle change in desired status of allocation when client status is still pending 2016-07-12 11:41:13 -06:00
Diptanu Choudhury 6937c0f7f3 Added test for job summary restore 2016-07-12 11:41:13 -06:00
Diptanu Choudhury 5e6f9ef69e Added methods to save and restore job summary snapshots 2016-07-12 11:41:13 -06:00
Diptanu Choudhury ba71757dfb handled the logic of task group count going up 2016-07-12 11:41:13 -06:00
Diptanu Choudhury 67953b1583 Added a test to ensure correctness of job summary when client updates alloc 2016-07-12 11:41:13 -06:00
Diptanu Choudhury 837b70f285 Added test to make sure summary gets deleted when job gets deleted 2016-07-12 11:41:13 -06:00
Diptanu Choudhury 0606840080 Implemented logic to update the job summary when allocs are inserted 2016-07-12 11:41:13 -06:00
Diptanu Choudhury 083f81d17f Implemented job state accounting logic for upsert job 2016-07-12 11:41:13 -06:00
Diptanu Choudhury ebf9fbf1d6 Added a schema for summarizing status of jobs 2016-07-12 11:41:13 -06:00
Diptanu Choudhury 4ea9ceee38 Handling allocations with client state pending 2016-07-12 11:29:23 -06:00
Diptanu Choudhury 2cf2ed6758 Changing the state of an allocation to lost if the node on which it was running was marked as down 2016-07-11 18:24:04 -06:00
Diptanu Choudhury bc0bfc3ae5 Merge pull request #1398 from hashicorp/b-check-timeout
Fixed the validation logic for check timeout and interval
2016-07-10 12:16:50 -07:00
Alex Dadgar 51ae7ace25 initial tail impl 2016-07-10 13:57:04 -04:00
Armon Dadgar 75abbc74a5 nomad: modify forward RPC to hold when no known leader 2016-07-10 13:36:55 -04:00
Armon Dadgar 699c4fc68c nomad: Add RPCHoldTimeout to tune RPC hold interval 2016-07-10 13:35:48 -04:00
Diptanu Choudhury b4fe764f07 Added a test 2016-07-08 22:33:04 -07:00
Diptanu Choudhury 19f0867816 Fixed the validation logic for check timeout 2016-07-08 22:26:03 -07:00
Diptanu Choudhury 48b9684b1e Using net.JoinHostPort instead of handcrafting addrs 2016-07-08 16:45:14 -07:00
Diptanu Choudhury b180223f4b Allowing ports to be overridden in check definitions 2016-07-08 14:14:25 -07:00
Alex Dadgar 099cee067d comments 2016-06-28 10:02:06 -07:00
Alex Dadgar 3f0a47f9e4 Disallow EvalGC to reap batch jobs evals/allocs and make JobGC only oneshot GCs everything 2016-06-27 22:54:03 -07:00
Alex Dadgar 6ca552c451 Reblock test 2016-06-24 10:26:13 -07:00
Alex Dadgar fd3e469d5e Remove requeue because it is a subset of EnqueueAll now 2016-06-24 10:14:34 -07:00
Alex Dadgar 2f8bb4b235 When enqueuing into eval broker always pass blocked eval's token 2016-06-23 22:40:22 -07:00
Alex Dadgar ccf93d7e44 UnblockFailed needs to untrack the job 2016-06-23 15:35:21 -07:00
Alex Dadgar b1c2a9ddb9 UnblockFailed needs to untrack the job 2016-06-23 15:26:26 -07:00
Alex Dadgar 3a8a27bcff refresh index eval id in log 2016-06-22 13:48:41 -07:00
Diptanu Choudhury e43c460534 Fixed name of a test 2016-06-22 13:04:54 -07:00
Diptanu Choudhury 0fe8746692 GC-ing dead batch jobs 2016-06-22 11:40:27 -07:00
Alex Dadgar 8ceb7ead20 Do not use snapshot 2016-06-22 09:33:15 -07:00
Alex Dadgar 91f6976423 tighter index bound when creating GC evals 2016-06-22 09:11:25 -07:00
Alex Dadgar 25decca3ca Worker waitForIndex uses StateStore index, not Raft Applied Index 2016-06-22 09:04:22 -07:00
Sean Chittenden 8bdb38d016 Code golf
Pointed out by: @dadgar
2016-06-21 14:26:01 -07:00
Sean Chittenden df4fe2e502 Fix the shuffling of remote datacenters.
Pointed out by: @ryanuber
2016-06-21 13:37:22 -07:00
Sean Chittenden 9e287858de Merge pull request #1310 from hashicorp/b-logger
Create and pass only one `logger` object around per Agent
2016-06-17 12:16:35 -07:00
Sean Chittenden 46e2d54acf Provide `nomad.Config` with a default `LogOutput` of `os.StdErr` 2016-06-17 06:44:10 -07:00
Sean Chittenden 9a60999100 Pass a logger arg to `NewClient` and `NewServer` 2016-06-16 23:29:23 -07:00
Sean Chittenden 871a31a8ec Teach config.ConsulConfig how to construct a consulapi TLS client.
Said differently, centralize the creation of consul's client config
in one place and use it everywhere.
2016-06-16 22:51:06 -07:00
Sean Chittenden d17af396ca Create config.DefaultConsulConfig() 2016-06-16 20:41:05 -07:00
Sean Chittenden a658299235 Misc typos 2016-06-16 16:17:17 -07:00
Sean Chittenden ec77a1869e Test for errors 2016-06-16 14:43:46 -07:00
Sean Chittenden 31313b68cf Don't assign to an atomic w/o using atomic setter func 2016-06-16 14:43:46 -07:00
Sean Chittenden af55b74114 Merge pull request #1276 from hashicorp/f-consul-server-autojoin
Teach Nomad servers how to fall back to Consul.
2016-06-16 14:40:45 -07:00
Sean Chittenden 7c24487850 Fix up various error handling 2016-06-16 14:40:09 -07:00
Sean Chittenden 71cd9984ae Immediately query Consul upon initialization if we have no peers.
Also don't attempt to join the Server with itself.
2016-06-16 14:27:10 -07:00
Sean Chittenden 65319252b9 Rework `server_auto_join` to use a timer instead of the peer count.
It is perfectly viable for an admin to downsize a Nomad Server cluster
down to 1, 2, or `num % 2 == 0` (however ill-advised such activities
may be).  And instead of using `bootstrap_expect`, use a timeout-based
strategy.  If the `bootstrapFn` hasn't observed a leader in 15s it will
fall back to Consul and will poll every ~60s until it sees a leader.
2016-06-16 12:14:03 -07:00
Sean Chittenden b0fecbefc1 Define `BootstrapExepct` as an `int32` so it can be manipulated atomically. 2016-06-16 12:00:15 -07:00
Alex Dadgar ea5d11e628 remove consul reference 2016-06-15 17:23:02 -07:00
Alex Dadgar bf14fd355f plan displays launch time of periodic jobs 2016-06-15 13:34:45 -07:00
Sean Chittenden 14f9d2a947 Use the config's log output 2016-06-15 12:40:51 -07:00
Sean Chittenden 5b0def194a Namespace the log messages 2016-06-15 12:40:51 -07:00
Sean Chittenden bffc82d668 Do not consider the number of Serf members when considering falling back to Consul. 2016-06-15 12:40:51 -07:00
Sean Chittenden 324af8d7f1 Guard the auto-join functionality behind its `consul.server_auto_join` tunable 2016-06-15 12:40:51 -07:00
Sean Chittenden 5e0ced2ae7 Shuffle all datacenters vs only the nearest N datacenters.
Per discussion, we want to be aggressive about fanning out vs possibly
fixating on only local DCs.  With RPC forwarding in place, a random walk
may be less optimal from a network latency perspective, but it is guaranteed
to eventually result in a converged state because all DCs are candidates
during the bootstrapping process.
2016-06-15 12:40:51 -07:00
Sean Chittenden 2123460cf0 Bump various Consul search limits
Client: Search limit increased from 4 random DCs to 8 random DCs, plus nearest.
Server: Search factor increased from 3 to 5 times the bootstrap_expect.

This should allow for faster convergence in large environments (e.g.
sub-5min for 10K Consul DCs).
2016-06-15 12:40:51 -07:00
Sean Chittenden e8d1264dbc Short-circuit the bootstrapFn if we have a leader 2016-06-15 12:40:51 -07:00
Sean Chittenden f05514335b Teach Nomad servers how to fall back to Consul. 2016-06-15 12:40:51 -07:00
Alex Dadgar aea21affdb Document consul configuration 2016-06-14 15:21:57 -07:00
Sean Chittenden 6e22b680ce Disambiguate `auto_join` from `auto_register`, rename reg to `auto_advertise`.
Provide an option that describes the value to the user vs the
operation performed by the software.  Momentarily introducing
`auto_join`
2016-06-14 12:11:38 -07:00
Sean Chittenden 4f14d51013 Fix up validation and allow existing unset timeouts to continue to be unset 2016-06-13 18:55:15 -07:00
Sean Chittenden c3a3fdc230 Upon further review, the Timeout needs to be validated for more than script checks.
This value is used for Consul HTTP and TCP checks.
2016-06-13 18:28:27 -07:00
Sean Chittenden baac19cad6 Remove diff check for ServiceID, may it R.I.P. 2016-06-13 18:22:53 -07:00
Sean Chittenden 79c675cf72 Guard against an interval and timeout being less than 1s 2016-06-13 18:19:40 -07:00
Sean Chittenden af8db7ec18 Don't export ServiceCheck validate 2016-06-13 18:17:43 -07:00
Sean Chittenden 08c88102a7 There is no "docker" check type 2016-06-13 18:15:07 -07:00
Alex Dadgar 8bbf4a55e5 Fix IDs and domain scoping 2016-06-13 16:30:58 -07:00
Alex Dadgar 8e231fa382 Rename ConsulService back to Service 2016-06-12 16:36:49 -07:00
Diptanu Choudhury 3024c080e8 Removing artifact check for java and qemu drivers 2016-06-12 12:57:35 +02:00
Alex Dadgar 480a281031 Merge pull request #1243 from hashicorp/f-run-modify-index
Add check-index flag to nomad run
2016-06-11 16:12:53 -07:00
Sean Chittenden 2f036231e5 Merge pull request #1201 from hashicorp/f-dyn-server-list
Dynamic Server Lists/Client Bootstrapping via consul.
2016-06-11 18:58:25 -04:00