Commit graph

1392 commits

Author SHA1 Message Date
Alex Dadgar f97664512b Upsert Job Histories 2017-04-15 17:08:05 -07:00
Alex Dadgar d489ed3c7d Job History schema 2017-04-15 17:08:05 -07:00
Alex Dadgar a331a234d4 NewEvalBroker comment 2017-04-14 15:26:54 -07:00
Alex Dadgar a9c8b09da8 Push to configs 2017-04-14 15:24:55 -07:00
Alex Dadgar 8aec604e3f Easy feedback fixes 2017-04-14 13:19:14 -07:00
Alex Dadgar df7d59051f Reaping failed evaluations creates follow up eval
Create a follow up evaluation when reaping failed evaluations. This
ensures that a job will still make eventual progress.
2017-04-12 14:47:59 -07:00
Alex Dadgar ef875f6dda Delay Nack re-enqueue
Add a delay when an evaluation is nacked that starts off small but
compounds to a larger delay for subsequent Nacks. This creates some
back pressure.
2017-04-12 13:41:40 -07:00
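The compounding Nack delay described in ef875f6dda can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper name `nackDelay`, the doubling rule, and the cap are all assumptions chosen to show a delay that starts small and grows with each subsequent Nack.

```go
package main

import (
	"fmt"
	"time"
)

// nackDelay returns how long to wait before re-enqueueing an evaluation
// that has been nacked `nacks` times. Hypothetical helper: the doubling
// rule and the cap are assumptions, not the formula used in the commit.
func nackDelay(nacks int, base, limit time.Duration) time.Duration {
	if nacks <= 0 {
		return 0 // never nacked: re-enqueue immediately
	}
	d := base
	for i := 1; i < nacks; i++ {
		d *= 2 // each further Nack doubles the wait
		if d >= limit {
			return limit // stop growing once the cap is reached
		}
	}
	if d > limit {
		return limit
	}
	return d
}

func main() {
	for n := 0; n <= 6; n++ {
		fmt.Printf("nacks=%d delay=%v\n", n, nackDelay(n, time.Second, 20*time.Second))
	}
}
```

Capping the delay keeps a repeatedly failing evaluation from being parked indefinitely while still applying back pressure.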
Alex Dadgar c261fe324e Fix a panic during plan evaluation
This PR fixes a potential source of a panic while evaluating a plan with
all_at_once set to true with partial failures.

Fixes #2531
2017-04-10 14:56:21 -07:00
Michael Schurter a81c387adf Require TLS for server RPC when enabled
Fixes #2525

We used to check a RequireTLS field that was never set. Instead we
can check the TLSConfig.EnableRPC field and require TLS when it is
enabled.

Added a few unfortunately slow integration tests to assert the intended
behavior of misconfigured RPC TLS.

Also disable a lot of noisy test logging when -v isn't specified.
2017-04-06 09:34:36 -07:00
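The check described in a81c387adf amounts to deriving the TLS requirement from the enable flag instead of from a separate field. A rough sketch, assuming a simplified `tlsConfig` struct and a hypothetical `checkRPCConn` helper (neither is taken from the Nomad source; only the `EnableRPC` field name comes from the commit message):

```go
package main

import (
	"errors"
	"fmt"
)

// tlsConfig is a simplified stand-in for the server's TLS settings.
// Only EnableRPC is named in the commit message; the struct is assumed.
type tlsConfig struct {
	EnableRPC bool
}

// errTLSRequired is a hypothetical error for rejected plaintext connections.
var errTLSRequired = errors.New("TLS is required for RPC connections")

// checkRPCConn rejects a non-TLS connection whenever RPC TLS is enabled.
// There is no separate "require" flag that could be left unset.
func checkRPCConn(cfg tlsConfig, isTLS bool) error {
	if cfg.EnableRPC && !isTLS {
		return errTLSRequired
	}
	return nil
}

func main() {
	cfg := tlsConfig{EnableRPC: true}
	fmt.Println("plaintext:", checkRPCConn(cfg, false))
	fmt.Println("tls:", checkRPCConn(cfg, true))
}
```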
Alex Dadgar 81b78f77e1 Track task start/finish time & improve logs errors
This PR tracks when a task starts and finishes. The logs API uses
these times to return better errors when asked for logs that do not
exist.
2017-03-31 16:14:11 -07:00
Alex Dadgar 54a4d01bf6 Fix error checking 2017-03-29 13:59:43 -07:00
Michael Schurter ae3810052d Merge pull request #2482 from hashicorp/f-2289-better-artifact-err
Improve artifact download error message
2017-03-28 12:48:22 -07:00
Michael Schurter 2d07b75308 Recoverable struct field must be exported 2017-03-28 12:18:57 -07:00
Alex Dadgar 7bcf40bf65 Merge pull request #2489 from hashicorp/b-immediate-periodic
Fix dispatch of periodic job
2017-03-28 11:13:50 -07:00
Alex Dadgar 103e8d21fb Fix dispatch of periodic job
This PR fixes an issue in which dispatching a periodic and parameterized
job would immediately create an allocation.

Fixes https://github.com/hashicorp/nomad/issues/2470
2017-03-27 16:55:17 -07:00
Michael Schurter 507862ade3 Add WrapRecoverable helper 2017-03-27 15:37:15 -07:00
Alex Dadgar 787be30f13 Fix periodic job state
This PR fixes an issue in which a periodic job would incorrectly
transition to status dead.

Fixes https://github.com/hashicorp/nomad/issues/2268
2017-03-27 10:35:36 -07:00
Michael Schurter 0e6c564406 Improve artifact download error message
Fixes #2289

Unfortunately took more RecoverableError hijinx than I would have liked.
There might be a better way.
2017-03-24 15:26:05 -07:00
Alex Dadgar 3b323603b5 Fix diff alignment and remove no change DC
Old Output:
```
+/- Job: "example"
    Datacenters {
  Datacenters: "dc1"
}
+/- Task Group: "cache" (1 create/destroy update)
  +/- RestartPolicy {
    +/- Attempts: "10" => "9"
        Delay:    "25000000000"
        Interval: "300000000000"
        Mode:     "delay"
  }
  +/- EphemeralDisk {
        Migrate: "false"
    +/- SizeMB:  "300" => "301"
        Sticky:  "false"
  }
  +/- Task: "redis" (forces create/destroy update)
    +   Meta[key]: "value"
    +/- Config {
          image:           "redis:3.2"
      +/- port_map[0][db]: "6379" => "6380"
    }
    +/- Resources {
          CPU:      "500"
          DiskMB:   "0"
          IOPS:     "0"
      +/- MemoryMB: "256" => "257"
    }
    +/- Service {
          Name:      "global-redis-check"
          PortLabel: "db"
      +/- Check {
            Command:       ""
            InitialStatus: ""
            Interval:      "10000000000"
            Name:          "alive"
            Path:          ""
            PortLabel:     ""
            Protocol:      ""
        +/- Timeout:       "2000000000" => "3000000000"
            Type:          "tcp"
      }
    }
```

New Output:
```
+/- Job: "example"
+/- Task Group: "cache" (1 create/destroy update)
  +/- RestartPolicy {
    +/- Attempts: "10" => "9"
        Delay:    "25000000000"
        Interval: "300000000000"
        Mode:     "delay"
      }
  +/- EphemeralDisk {
        Migrate: "false"
    +/- SizeMB:  "300" => "301"
        Sticky:  "false"
      }
  +/- Task: "redis" (forces create/destroy update)
    +   Meta[key]: "value"
    +/- Config {
          image:           "redis:3.2"
      +/- port_map[0][db]: "6379" => "6380"
        }
    +/- Resources {
          CPU:      "500"
          DiskMB:   "0"
          IOPS:     "0"
      +/- MemoryMB: "256" => "257"
        }
    +/- Service {
          Name:      "global-redis-check"
          PortLabel: "db"
      +/- Check {
            Command:       ""
            InitialStatus: ""
            Interval:      "10000000000"
            Name:          "alive"
            Path:          ""
            PortLabel:     ""
            Protocol:      ""
        +/- Timeout:       "2000000000" => "3000000000"
            Type:          "tcp"
          }
        }
```
2017-03-21 11:42:10 -07:00
Alex Dadgar df9ce2680b Fix output alignment and remove no-change DC
Old Output:

```
+/- Job: "example"
    Datacenters {
  Datacenters: "dc1"
}
+/- Task Group: "cache" (1 create/destroy update)
  +/- RestartPolicy {
    +/- Attempts: "10" => "9"
        Delay:    "25000000000"
        Interval: "300000000000"
        Mode:     "delay"
  }
  +/- EphemeralDisk {
        Migrate: "false"
    +/- SizeMB:  "300" => "301"
        Sticky:  "false"
  }
  +/- Task: "redis" (forces create/destroy update)
    +   Meta[key]: "value"
    +/- Config {
          image:           "redis:3.2"
      +/- port_map[0][db]: "6379" => "6380"
    }
    +/- Resources {
          CPU:      "500"
          DiskMB:   "0"
          IOPS:     "0"
      +/- MemoryMB: "256" => "257"
    }
    +/- Service {
          Name:      "global-redis-check"
          PortLabel: "db"
      +/- Check {
            Command:       ""
            InitialStatus: ""
            Interval:      "10000000000"
            Name:          "alive"
            Path:          ""
            PortLabel:     ""
            Protocol:      ""
        +/- Timeout:       "2000000000" => "3000000000"
            Type:          "tcp"
      }
    }
```

New Output:

```
+/- Job: "example"
+/- Task Group: "cache" (1 create/destroy update)
  +/- RestartPolicy {
    +/- Attempts: "10" => "9"
        Delay:    "25000000000"
        Interval: "300000000000"
        Mode:     "delay"
      }
  +/- EphemeralDisk {
        Migrate: "false"
    +/- SizeMB:  "300" => "301"
        Sticky:  "false"
      }
  +/- Task: "redis" (forces create/destroy update)
    +   Meta[key]: "value"
    +/- Config {
          image:           "redis:3.2"
      +/- port_map[0][db]: "6379" => "6380"
        }
    +/- Resources {
          CPU:      "500"
          DiskMB:   "0"
          IOPS:     "0"
      +/- MemoryMB: "256" => "257"
        }
    +/- Service {
          Name:      "global-redis-check"
          PortLabel: "db"
      +/- Check {
            Command:       ""
            InitialStatus: ""
            Interval:      "10000000000"
            Name:          "alive"
            Path:          ""
            PortLabel:     ""
            Protocol:      ""
        +/- Timeout:       "2000000000" => "3000000000"
            Type:          "tcp"
          }
        }
```
2017-03-21 11:31:06 -07:00
Alex Dadgar 5588fd6c04 Merge pull request #2432 from hashicorp/b-batch-gc
Eval GC will collect allocs from stopped batch job
2017-03-13 10:06:55 -07:00
Alex Dadgar 56ac0970b4 validation errors 2017-03-11 16:23:24 -08:00
Alex Dadgar 3825f7cf1f Eval GC will collect allocs from stopped batch job
This PR fixes a bug in which allocations from stopped batch jobs could
not be garbage collected.
2017-03-11 15:48:57 -08:00
Alex Dadgar 33263ca085 Merge pull request #2418 from hashicorp/f-distinct-property
`distinct_property` constraint
2017-03-09 22:15:52 -08:00
Alex Dadgar 9011a7984c Add metrics to show allocations on the client
This PR adds the following metrics to the client:
client.allocations.migrating
client.allocations.blocked
client.allocations.pending
client.allocations.running
client.allocations.terminal

Also adds some missing fields to the API version of the evaluation.
2017-03-09 12:37:41 -08:00
Ben Barnard abefe537ba Correct error message re length of service name
The error message had an off-by-one error :)
2017-03-08 15:12:30 +01:00
Alex Dadgar d83a8fe9f2 Unoptimized implementation + testing 2017-03-07 14:48:54 -08:00
Alex Dadgar c1d4927088 Debug lines 2017-03-06 10:25:26 -08:00
Alex Dadgar 634cb3365f Remove dead code 2017-03-03 18:27:49 -08:00
Alex Dadgar 3b9bdfef1c Make validate work without a Nomad agent 2017-03-03 15:02:03 -08:00
Alex Dadgar c58494fdb6 Handle periodic parameterized jobs
Fixes https://github.com/hashicorp/nomad/issues/2382
2017-03-01 11:45:20 -08:00
Alex Dadgar e21792091a remove leading slash on vault path 2017-02-28 14:03:18 -08:00
Alex Dadgar 5be806a3df Fix vet script and fix vet problems
This PR fixes our vet script and fixes all the missed vet changes.

It also fixes pointers being printed in `nomad stop <job>` and `nomad
node-status <node>`.
2017-02-27 16:00:19 -08:00
Alex Dadgar c71cd02edf Fix typo 2017-02-27 11:40:24 -08:00
Alex Dadgar 6936c81720 Merge pull request #2302 from hashicorp/f-task-state-setup
Adding a task event for setup
2017-02-22 18:42:57 -08:00
Alex Dadgar d2cbbac76e Merge pull request #2347 from hashicorp/b-parse
Remove defaulting from parse and fix tests
2017-02-22 18:42:43 -08:00
Diptanu Choudhury 98921575af Adding a task event for setup 2017-02-22 18:28:07 -08:00
Alex Dadgar ed670f0226 Revert "Deregistering non-existent job returns 404" 2017-02-22 18:22:02 -08:00
Alex Dadgar eae24d7d3f Fix flakey test TestNetworkIndex_AssignNetwork_Dynamic_Contention 2017-02-22 15:58:21 -08:00
Alex Dadgar 556fb2562f Remove defaulting from parse and fix parser tests
This PR removes defaulting from the parser, fixes some regressions that
were part of the parser refactor, and fixes the tests.
2017-02-22 12:30:05 -08:00
Alex Dadgar 1e5183fb89 Deregistering non-existent job returns 404
Fixes https://github.com/hashicorp/nomad/issues/2326
2017-02-20 20:10:21 -08:00
Alex Dadgar b49fceb491 rebase 2017-02-20 16:43:28 -08:00
Alex Dadgar b67c59f03c Merge branch 'master' into refactor-parser 2017-02-20 15:13:21 -08:00
Alex Dadgar b2ee70472d Merge pull request #2331 from hashicorp/b-dispatch-response
swap structs
2017-02-17 11:44:19 -08:00
Alex Dadgar e06056fa82 swap structs 2017-02-17 11:43:14 -08:00
Alex Dadgar c586cb653b Merge pull request #2318 from hashicorp/f-vault-debug
Server side Vault telemetry
2017-02-17 11:34:37 -08:00
Alex Dadgar 733038bc6c Remove panic 2017-02-17 11:21:49 -08:00
Diptanu Choudhury 7567209857 Making the job spec return api.Job 2017-02-16 13:52:39 -08:00
Alex Dadgar 471d63d5ff Fix diff 2017-02-15 15:23:29 -08:00
Alex Dadgar 7e918003ba Allow specification of timezones 2017-02-15 14:37:06 -08:00
Alex Dadgar 743cf410ef Include alloc on server side err/logs 2017-02-14 16:26:49 -08:00
Alex Dadgar 8bfc4255eb Add server metrics 2017-02-14 16:02:18 -08:00
Alex Dadgar 1657c70426 Merge pull request #2310 from hashicorp/f-serf-reap
Handle Serf Reap event
2017-02-14 11:07:43 -08:00
Alex Dadgar dbc4eac936 Merge pull request #2309 from hashicorp/f-no-root-token
Disallow root policy from being specified
2017-02-14 11:06:31 -08:00
Alex Dadgar b6357f96c5 Merge pull request #2308 from hashicorp/f-leader-task
Leader task
2017-02-14 11:03:32 -08:00
Alex Dadgar 89fea63a53 Merge pull request #2305 from hashicorp/f-operator
Add nomad operator command for interacting with Raft configuration
2017-02-14 11:01:41 -08:00
Alex Dadgar 7d158a5a7f Validate the interval within a restart policy
This PR ensures that the interval specified is not less than 5 seconds.

Fixes https://github.com/hashicorp/nomad/issues/2286
2017-02-13 15:27:36 -08:00
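The validation added in 7d158a5a7f can be pictured as a simple lower-bound check. A minimal sketch, assuming a hypothetical `validateRestartInterval` helper and constant name; only the 5 second minimum comes from the commit message:

```go
package main

import (
	"fmt"
	"time"
)

// minRestartInterval is the lower bound stated in the commit message.
// The constant name is an assumption made for this sketch.
const minRestartInterval = 5 * time.Second

// validateRestartInterval is a hypothetical helper that rejects restart
// policy intervals shorter than the minimum.
func validateRestartInterval(interval time.Duration) error {
	if interval < minRestartInterval {
		return fmt.Errorf("restart interval %v is less than the minimum of %v",
			interval, minRestartInterval)
	}
	return nil
}

func main() {
	for _, d := range []time.Duration{time.Second, 5 * time.Second, 5 * time.Minute} {
		fmt.Printf("interval=%v err=%v\n", d, validateRestartInterval(d))
	}
}
```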
Alex Dadgar 96c17714cf Fix test 2017-02-13 14:31:22 -08:00
Alex Dadgar 7c8bd12509 Handle Serf Reap event
This PR adds handling of the Serf member reap event to remove the peer
from Raft.
2017-02-13 14:22:54 -08:00
Alex Dadgar 0b2e2971fd Disallow root policy from being specified
This PR disallows the specification of a root policy by a Nomad task.
2017-02-13 10:51:29 -08:00
Alex Dadgar 238b4bcafd Add Leader support to client 2017-02-10 17:55:19 -08:00
Alex Dadgar 3ba8faeae3 Add leader task to api and server side 2017-02-10 16:57:47 -08:00
Alex Dadgar 21ef1ce685 Add guard 2017-02-10 16:29:28 -08:00
Alex Dadgar b1cd81e997 Remove todos 2017-02-10 15:41:23 -08:00
Alex Dadgar 2d4d9b79d8 Operator command/endpoint/documentation 2017-02-09 18:04:46 -08:00
Alex Dadgar dea460281d Merge pull request #2282 from hashicorp/f-raft-v2-stage-one
Update to Raft V2 stage one
2017-02-08 15:26:16 -08:00
Alex Dadgar b51ba01d54 Merge pull request #2293 from hashicorp/f-vendor-memdb
Vendor MemDB
2017-02-08 14:51:01 -08:00
Alex Dadgar ae31f4c84e Respond to comments 2017-02-08 14:50:19 -08:00
Alex Dadgar eaf285b208 Fix missing summary restoration 2017-02-08 11:51:48 -08:00
Alex Dadgar 5d293c0f1e Add abandon tests and use snapshot for blocking queries 2017-02-08 11:18:03 -08:00
Alex Dadgar d182aac7a7 Fix nomad tests 2017-02-07 22:10:33 -08:00
Alex Dadgar 04862ca10e Tests compile 2017-02-07 21:30:57 -08:00
Alex Dadgar b69b357c7f Nomad builds 2017-02-07 20:31:23 -08:00
Alex Dadgar 36d018514b Fix test 2017-02-07 11:35:38 -08:00
Alex Dadgar bc2e6b0cc2 Fix state store tests 2017-02-06 16:46:23 -08:00
Alex Dadgar 6f9866ca69 Fix forwarded recoverable error 2017-02-05 13:14:24 -08:00
Alex Dadgar c026a97ce7 Use watchset on getter methods 2017-02-05 12:45:57 -08:00
Alex Dadgar 570efcaebd Update state store and blocking query helper 2017-02-05 12:03:11 -08:00
Alex Dadgar da89b1cde6 Fix bootstrap tests 2017-02-03 13:00:28 -08:00
Alex Dadgar 419bf2c497 Tests build 2017-02-02 17:50:06 -08:00
Alex Dadgar ee368762ae It builds 2017-02-02 16:07:15 -08:00
Alex Dadgar 26db1bd12c Join + Leave peer 2017-02-02 15:49:06 -08:00
Alex Dadgar ac10aed731 Update setupRaft 2017-02-02 15:31:36 -08:00
Alex Dadgar 78cfcd2724 Bump protocol version and update numOtherPeers 2017-02-02 13:52:31 -08:00
Alex Dadgar b951b84b44 Merge pull request #2278 from hashicorp/b-consul-merge
Fix Consul Config Merging/Copying
2017-02-02 11:39:26 -08:00
Alex Dadgar 1562a7aeee Merge pull request #2262 from hashicorp/f-permissions
Template destination file permissions.
2017-02-02 11:28:44 -08:00
Alex Dadgar 7fae2d2cea Fix Consul Config Merging/Copying
This PR fixes config merging/copying code.

Fixes https://github.com/hashicorp/nomad/issues/2264
2017-02-02 11:12:07 -08:00
Alex Dadgar b40f284bd1 Merge branch 'master' into f-permissions 2017-02-01 16:57:46 -08:00
Alex Dadgar f2169cb111 Merge branch 'master' into f-vault-sighup 2017-02-01 16:57:43 -08:00
Alex Dadgar 9c75ec7f57 Add role to merge test 2017-02-01 16:37:08 -08:00
Alex Dadgar 15ffdff497 Vault Client on Server handles SIGHUP
This PR allows the Vault client on the server to handle a SIGHUP. This
allows updating the Vault token and any other configuration without
downtime.
2017-02-01 14:24:10 -08:00
Brad Sickles c67f10166d fmt 2017-02-01 16:37:19 -05:00
Brad Sickles da12d8811a Third time is a charm. 2017-02-01 16:18:12 -05:00
Brad Sickles 5a8f2d3f60 Properly dealing with non-nil errors. 2017-02-01 16:07:19 -05:00
Brad Sickles 429fdb3ce7 Preventing panics of RecoverableError casts 2017-02-01 14:38:59 -05:00
Alex Dadgar 6b7efa1ba6 Template destination file permissions.
This PR allows setting the file permissions of the rendered template.
2017-01-31 20:10:01 -08:00
Alex Dadgar 83106de217 Merge branch 'master' into d-vault 2017-01-27 15:19:34 -08:00
Alex Dadgar f11dedbfc6 explain the skip 2017-01-27 15:19:22 -08:00
Alex Dadgar 183d0bdd15 Cleanup and skip test 2017-01-27 15:06:01 -08:00
Alex Dadgar 0a1b5cabf5 Merge pull request #2247 from hashicorp/b-dispatch-gc
Disallow GC of parameterized jobs
2017-01-27 11:24:04 -08:00