Commit graph

2699 commits

Author SHA1 Message Date
Nick Ethier ade97bc91f
fixup #5172 and rebase against master 2019-05-14 14:37:34 -04:00
Nick Ethier cab6a95668
Merge branch 'master' into pr/5172
* master: (912 commits)
  Update redirects.txt
  Added redirect for Spark guide link
  client: log when server list changes
  docs: mention regression in task config validation
  fix update to changelog
  update CHANGELOG with datacenter config validation https://github.com/hashicorp/nomad/pull/5665
  typo: "atleast" -> "at least"
  implement nomad exec for rkt
  docs: fixed typo
  use pty/tty terminology similar to github.com/kr/pty
  vendor github.com/kr/pty
  drivers: implement streaming exec for executor based drivers
  executors: implement streaming exec
  executor: scaffolding for executor grpc handling
  client: expose allocated memory per task
  client improve a comment in updateNetworks
  stalebot: Add 'thinking' as an exempt label (#5684)
  Added Sparrow link
  update links to use new canonical location
  Add redirects for restructing done in GH-5667
  ...
2019-05-14 14:10:33 -04:00
Preetha Appan 4d3f74e161
Fix test setup to have correct jobcreateindex for deployments 2019-05-13 18:53:47 -05:00
Preetha Appan 07690d6f9e
Add flag similar to --all for allocs to be able to filter deployments by latest 2019-05-13 18:33:41 -05:00
Mahmood Ali 2ddc39973d
Merge pull request #5668 from hashicorp/flaky-test-20190430
fix flaky test by allowing for call invocation overhead
2019-05-13 12:33:44 -04:00
Mahmood Ali dd8762e348 typo: "atleast" -> "at least" 2019-05-13 10:01:19 -04:00
Mahmood Ali 513303347c add CLI commands for nomad exec 2019-05-12 22:04:50 -04:00
Mahmood Ali 919827f2df
Merge pull request #5632 from hashicorp/f-nomad-exec-parts-01-base
nomad exec part 1: plumbing and docker driver
2019-05-09 18:09:27 -04:00
Mahmood Ali 66982a1660 agent: add websocket handler for nomad exec
This adds a websocket endpoint for handling `nomad exec`.

The endpoint is a websocket interface, as we require a bi-directional
streaming (to handle both input and output), which is not very appropriate for
plain HTTP 1.0. Using websocket makes implementing the web ui a bit simpler. I
considered using golang http hijack capability to treat http request as a plain
connection, but the web interface would be too complicated potentially.

Furthermore, the API endpoint operates against the raw core nomad exec streaming
datastructures, defined in protobuf, with json serializer.  Our APIs use json
interfaces in general, and protobuf generates json friendly golang structs.
Reusing the structs here simplify interface and reduce conversion overhead.
2019-05-09 16:49:08 -04:00
Danielle 4a22fa0ee2
Merge pull request #5536 from hashicorp/dani/consul
Consul Catalog Integration Fixes
2019-05-09 13:22:54 +02:00
Danielle Lancashire 0da2924b2a consul: Document example check id 2019-05-09 13:22:22 +02:00
Mahmood Ali d405fcb093 fix flaky test by allowing for call invocation overhead 2019-05-08 18:04:37 -04:00
Preetha 1538913a2a
Merge pull request #5628 from hashicorp/f-preemption-config
Add config to disable preemption for batch/service jobs
2019-05-06 15:40:35 -05:00
Lang Martin 9f3f11df97
Merge pull request #5601 from hashicorp/b-config-parse-direct-hcl
config parse direct hcl
2019-05-06 12:05:19 -04:00
Preetha Appan ad3c263d3f
Rename to match system scheduler config.
Also added docs
2019-05-03 14:06:12 -05:00
Danielle Lancashire d824e00d1a consul: Do not deregister external checks
This commit causes sync to skip deregistering checks that are not
managed by nomad, such as service maintenance mode checks.  This is
handled in the same way as service registrations - by doing a Nomad
specific prefix match.
2019-05-02 16:54:18 +02:00
Danielle Lancashire 0b8e85118e consul: Use a stable identifier for services
The current implementation of Service Registration uses a hash of the
nomad-internal state of a service to register it with Consul, this means that
any update to the service invalidates this name and we then deregister, and
recreate the service in Consul.

While this behaviour slightly simplifies reasoning about service registration,
this becomes problematic when we add consul health checks to a service. When
the service is re-registered, so are the checks, which default to failing for
at least one check period.

This commit migrates us to using a stable identifier based on the
allocation, task, and service identifiers, and uses the difference
between the remote and local state to decide when to push updates.

It uses the existing hashing mechanic to decide when UpdateTask should
regenerate service registrations for providing to Sync, but this should
be removable as part of a future refactor.

It additionally introduces the _nomad-check- prefix for check
definitions, to allow for future allowing of consul features like
maintenance mode.
2019-05-02 16:54:18 +02:00
Chris Baker a40477a7b8
test case for 5540 (#5590)
* client/metrics: modified metrics to use (updated) client copy of allocation instead of (unupdated) server copy

* updated armon/go-metrics to address race condition in DisplayMetrics
2019-04-30 10:31:35 -04:00
Lang Martin 2e643d26a2 config_parse leave the *HCL strings in place after converting times 2019-04-30 10:30:53 -04:00
Lang Martin 3ba6095fe3 config_parse_test additional config confirmation w/ sample json 2019-04-30 10:30:53 -04:00
Lang Martin fe9b31dcf9 config comment for future changes 2019-04-30 10:30:53 -04:00
Lang Martin 598112a1cc tag HCL bookkeeping keys with json:"-" to keep them out of the api 2019-04-30 10:29:14 -04:00
Lang Martin 43407cffe3 config_parse_test remove redundant parse direct test 2019-04-30 10:29:14 -04:00
Lang Martin b8e9c35cd0 config_parse remove unused multi-stage parsing via mapstructure 2019-04-30 10:29:14 -04:00
Lang Martin 1f86770456 config_parse_test test direct hcl parsing 2019-04-30 10:29:14 -04:00
Lang Martin 5ebae65d1a agent/config, config/* mapstructure tags -> hcl tags 2019-04-30 10:29:14 -04:00
Lang Martin 92fd988c9f config_parse add new ParseConfigFileDirectHCL
- parse by using hcl.Decode directly
- handle time.Duration strings in a second pass
- report unexpected keys in a third pass
2019-04-30 10:29:14 -04:00
Preetha Appan 6615d5c868
Add config to disable preemption for batch/service jobs 2019-04-29 18:48:07 -05:00
Danielle Lancashire a8880f9643 alloc_signal: Add autcompletion and cmd tests 2019-04-26 12:47:53 +02:00
Danielle Lancashire 3409e0be89 allocs: Add nomad alloc signal command
This command will be used to send a signal to either a single task within an
allocation, or all of the tasks if <task-name> is omitted. If the sent signal
terminates the allocation, it will be treated as if the allocation has crashed,
rather than as if it was operator-terminated.

Signal validation is currently handled by the driver itself and nomad
does not attempt to restrict or validate them.
2019-04-25 12:43:32 +02:00
Mahmood Ali 60ee243149 fix crash when executor parent nomad process dies
Fixes https://github.com/hashicorp/nomad/issues/5593

Executor seems to die unexpectedly after nomad agent dies or is
restarted.  The crash seems to occur at the first log message after
the nomad agent dies.

To ease debugging we forward executor log messages to executor.log as
well as to Stderr.  `go-plugin` sets up plugins with Stderr pointing to
a pipe being read by plugin client, the nomad agent in our case[1].
When the nomad agent dies, the pipe is closed, and any subsequent
executor logs fail with ErrClosedPipe and SIGPIPE signal.  SIGPIPE
results into executor process dying.

I considered adding a handler to ignore SIGPIPE, but hc-log library
currently panics when logging write operation fails[2]

This we opt to revert to v0.8 behavior of exclusively writing logs to
executor.log, while we investigate alternative options.

[1] https://github.com/hashicorp/nomad/blob/v0.9.0/vendor/github.com/hashicorp/go-plugin/client.go#L528-L535
[2] https://github.com/hashicorp/nomad/blob/v0.9.0/vendor/github.com/hashicorp/go-hclog/int.go#L320-L323
2019-04-23 09:52:46 -04:00
Danielle 198a838b61
Merge pull request #5512 from hashicorp/dani/f-alloc-stop
alloc-lifecycle: nomad alloc stop
2019-04-23 13:05:08 +02:00
Danielle Lancashire 832f607433 allocs: Add nomad alloc stop
This adds a `nomad alloc stop` command that can be used to stop and
force migrate an allocation to a different node.

This is built on top of the AllocUpdateDesiredTransitionRequest and
explicitly limits the scope of access to that transition to expose it
under the alloc-lifecycle ACL.

The API returns the follow up eval that can be used as part of
monitoring in the CLI or parsed and used in an external tool.
2019-04-23 12:50:23 +02:00
Michael Schurter 373748a327
Merge pull request #5486 from hashicorp/b-validate-migrate
api: fix migrate stanza initialization
2019-04-15 09:44:59 -07:00
Chris Baker 3b9237de4a gofmt/goimport and test formatting 2019-04-12 20:55:55 +00:00
Chris Baker eca8a3d537 changes to appease gofmt 2019-04-12 19:12:42 +00:00
Chris Baker b52d1c9274 cli: add support for periodic force evaluation
resolves #3251
2019-04-12 18:56:35 +00:00
Chris Baker 5a43f10aaf cli: add acl token list command, documentation
docs: fix some incorrect acl policy docs (typos, copy-paste errors)
2019-04-12 15:48:36 +00:00
Michael Schurter 5e8e59eefb api: fix migrate stanza initialization
Fixes Migrate to be initialized like RescheduleStrategy.

Fixes #5477
2019-04-11 15:29:19 -07:00
Danielle Lancashire e135876493 allocs: Add nomad alloc restart
This adds a `nomad alloc restart` command and api that allows a job operator
with the alloc-lifecycle acl to perform an in-place restart of a Nomad
allocation, or a given subtask.
2019-04-11 14:25:49 +02:00
Danielle 35f66d901f
Merge pull request #5516 from hashicorp/dani/f-verbose-status
Allow passing -verbose to meta status
2019-04-11 13:31:48 +02:00
Danielle Lancashire b4547a34b0 status: Allow passing -verbose to meta status
A common issue when using nomad is needing to add in the object verb to
a command to include the `-verbose` flag.

This commit allows users to pass `-verbose` via the `nomad status` alias by
adding a placeholder boolean in the metacommand which allows subcommands
to parse the flag.
2019-04-11 13:15:44 +02:00
Chris Baker ce0c330c7c
agent config: cleaner VAULT_ env lookup 2019-04-10 10:34:10 -05:00
Chris Baker a26d4fe1e5
docs: -vault-namespace, VAULT_NAMESPACE, and config
agent: added VAULT_NAMESPACE env-based configuration
2019-04-10 10:34:10 -05:00
Chris Baker d3041cdb17
wip: added config parsing support, CLI flag, still need more testing, VAULT_ var, documentation 2019-04-10 10:34:10 -05:00
Chris Baker 6a2454f56d
"job revert" command: alphabetized flags 2019-04-10 10:34:10 -05:00
Chris Baker 2f4d8d0a2f
cli: plumbed vault token from job revert command through API call 2019-04-10 10:34:10 -05:00
Arshneet Singh 2b50b5499d
Remove redundant assertion and replace regex matches with require 2019-04-10 10:34:10 -05:00
Arshneet Singh 1272fcb9e1
Don't display node name if output isn't verbose. Add tests. 2019-04-10 10:34:10 -05:00
James Rasell 9470507cf4
Add NodeName to the alloc/job status outputs.
Currently when operators need to log onto a machine where an alloc
is running they will need to perform both an alloc/job status
call and then a call to discover the node name from the node list.

This updates both the job status and alloc status output to include
the node name within the information to make operator use easier.

Closes #2359
Cloess #1180
2019-04-10 10:34:10 -05:00
Nomad Release bot e307734e4a Generate files for 0.9.0 release 2019-04-09 01:56:00 +00:00
Nomad Release bot 16b4336ccf Generate files for 0.9.0-rc2 release 2019-04-03 01:54:29 +00:00
Preetha Appan 71e6550f81
Address review comments 2019-03-29 08:57:49 -05:00
Preetha Appan e0566237a4
fix linting 2019-03-28 18:01:40 -05:00
Preetha Appan cc07256bb5
Fix json parsing bug with plugins that don't provide args
This fixes a bug with JSON agent configuration parsing where the AST
for the plugin stanza had unnecessary flattening originating from hcl parsing
library. The workaround fixes the AST by popping off the flattened element and wrapping
it in a list. The workaround comes from similar code in terraform.

There were no existing test cases for json parsing so I added a few.
2019-03-28 16:33:30 -05:00
Nomad Release bot 3ab3dd4105 Generate files for 0.9.0-rc1 release 2019-03-21 19:06:13 +00:00
Michael Schurter a2b2c29216 Fix version.go for 0.9.0-beta3 release 2019-02-26 10:11:30 -08:00
Michael Schurter d74755900e Generate files for 0.9.0-beta3 release 2019-02-26 09:44:49 -08:00
Preetha 911c93f7bd
Merge pull request #5350 from hashicorp/b-json-logging-meta
Support json logging for CLI output for agent
2019-02-22 13:40:56 -06:00
Preetha Appan 8f9ec85fe6
fix import order 2019-02-22 13:40:13 -06:00
Preetha Appan 3ab2e431b6
Move logger initialization to earlier step 2019-02-21 12:41:54 -06:00
Danielle Tomlinson 91c300c310 ui: Support colored output on Windows
This commit uses the go-colorable library to enable support for coloured
UI output on Windows. This acts as a compatibility layer that takes
standard unix-y terminal codes and translates them into the requisite
windows calls as required.
2019-02-20 14:01:35 +01:00
Preetha Appan 0149bbc608
cli Ui implementation that logs to a hclogger
This makes it so any messages output to the UI *after* the agent has started
will be logged in json format correctly
2019-02-19 17:53:14 -06:00
Preetha Appan 35f6db47b8
fix indentation 2019-02-14 12:49:26 -06:00
Preetha Appan f443c7d321
Fix whitespace 2019-02-14 12:49:26 -06:00
Chris Baker ab02f4588e
Update job_init.go
minor
2019-02-14 12:49:26 -06:00
Preetha Appan d405881e34
expand job init example with spread and affinity 2019-02-14 12:49:26 -06:00
Michael Schurter 3b84e08fa4
Merge pull request #5297 from hashicorp/b-docker-logging
Docker: Fix logging config parsing
2019-02-11 06:57:52 -08:00
Michael Schurter e3e1797850 consul: squelch noisy useless logs
Only log when syncing actually did something.
2019-02-04 11:07:57 -08:00
Iskander (Alex) Sharipov 7b1a4eaef9
nomad/command: fix strings.Contains args order
Swapped call args order to meet the expected behavior.

Signed-off-by: Iskander Sharipov <quasilyte@gmail.com>
2019-02-02 09:43:24 +03:00
Michael Schurter cad3f1022a cli: do not duplicate reschedule headers per group
Fixes #5291
2019-02-01 09:28:36 -08:00
Alex Dadgar 84d0afccae Generate files for 0.9.0-beta2 2019-01-30 13:31:50 -08:00
Preetha Appan 8e621a167b
fix tests 2019-01-30 14:46:24 -06:00
Alex Dadgar 41265d4d61 Change types of weights on spread/affinity 2019-01-30 12:20:38 -08:00
Alex Dadgar bc804dda2e Nomad 0.9.0-beta1 generated code 2019-01-30 10:49:44 -08:00
Nick Ethier c21ce7b523
add circbufwriter package 2019-01-28 11:35:21 -05:00
Nick Ethier 3ef163b03b
executor: prevent logger from blocking when stderr pipe is detached 2019-01-25 23:08:01 -05:00
Michael Schurter 13f061a83f
Merge pull request #5196 from hashicorp/f-plugin-utils
Make plugins/shared external and make pluginutls/
2019-01-23 06:59:32 -08:00
Michael Schurter 32daa7b47b goimports until make check is happy 2019-01-23 06:27:14 -08:00
Preetha Appan 5f1d467ed2
nil check node resources to prevent panic 2019-01-22 19:34:02 -06:00
Michael Schurter be0bab7c3f move pluginutils -> helper/pluginutils
I wanted a different color bikeshed, so I get to paint it
2019-01-22 15:50:08 -08:00
Alex Dadgar 4bdccab550 goimports 2019-01-22 15:44:31 -08:00
Alex Dadgar cdcd3c929c loader and singleton 2019-01-22 15:11:57 -08:00
Alex Dadgar 6c2782f037 move catalog + grpcutils 2019-01-22 15:11:57 -08:00
Mahmood Ali 05e32fb525
Merge pull request #5213 from hashicorp/b-api-separate
Slimmer /api package
2019-01-18 20:52:53 -05:00
Mahmood Ali 5df63fda7c
Merge pull request #5190 from hashicorp/f-memory-usage
Track Basic Memory Usage as reported by cgroups
2019-01-18 16:46:02 -05:00
Mahmood Ali 6bdb9864de api: remove MockJob from exported functions
`api.MockJob` is a test utility, that's only used by `command/agent`
package.  This moves it to the package and removes it from the public
API.
2019-01-18 14:51:31 -05:00
Michael Schurter 48afda786b
Merge pull request #5187 from hashicorp/test-consul
Port a bunch of pre-0.9 Consul tests to 0.9
2019-01-15 07:41:50 -08:00
Alex Dadgar 471fdb3ccf
Merge pull request #5173 from hashicorp/b-log-levels
Plugins use parent loggers
2019-01-14 16:14:30 -08:00
Mahmood Ali 9909d98bee Track Basic Memory Usage as reported by cgroups
Track current memory usage, `memory.usage_in_bytes`, in addition to
`memory.max_memory_usage_in_bytes` and friends.  This number is closer
what Docker reports.

Related to https://github.com/hashicorp/nomad/issues/5165 .
2019-01-14 18:47:52 -05:00
Michael Schurter fc1bb95ef8 Remove old comment; it's been fixed! 2019-01-14 09:56:53 -08:00
Preetha Appan 7bd1440710
REfactor statedb factory config to set it directly in client config 2019-01-12 10:38:20 -06:00
Preetha Appan f059ef8a47
Modified destroy failure handling to rely on allocrunner's destroy method
Added a unit test with custom statedb implementation that errors, to
use to verify destroy errors
2019-01-12 10:37:12 -06:00
Alex Dadgar 5621086f50 Enable json logs 2019-01-11 11:36:37 -08:00
Alex Dadgar 14ed757a56 Plugins use parent loggers
This PR fixes various instances of plugins being launched without using
the parent loggers. This meant that logs would not all go to the same
output, break formatting etc.
2019-01-11 11:36:37 -08:00
Preetha Appan b46728a88b
Make spread weight a pointer with default value if unset 2019-01-11 10:31:21 -06:00
Nick Wales 7a7b5da0df Adds optional Consul service tags to nomad server and agent services, gh#4297 2019-01-09 22:02:46 +00:00
Chris Baker e9db2ae822 Merge branch 'master' of github.com:hashicorp/nomad into f-1157-validate-node-meta-variables 2019-01-09 18:56:49 +00:00
Chris Baker d5b1a56f3b increased config validation coverage for dev mode 2019-01-09 18:56:40 +00:00
Michael Schurter ac169008f0
Merge pull request #5045 from hashicorp/b-drivermanager-tests-drain
drain: fix node drain monitoring
2019-01-09 10:23:28 -08:00
Mahmood Ali 90f3cea187
Merge pull request #5157 from hashicorp/r-drivers-no-cstructs
drivers: avoid referencing client/structs package
2019-01-09 13:06:46 -05:00
Mahmood Ali 03a9e812c8 cli: support hitting pre-0.9 nomad agents
node.NodeResources is nil when operating against pre-0.9.
2019-01-08 19:32:26 -05:00
Chris Baker d8a3a74c43 move if dev check into config validation, to support dev-mod
validation in the future
2019-01-08 22:21:48 +00:00
Michael Schurter 8a6b1acaa6 drain: fix node drain monitoring
The whole approach to monitoring drains has ordering issues and lacks
state to output useful error messages.

AFAICT to get the tests passing reliably I needed to change the behavior
of monitoring.

Parts of these tests are skipped in CI, and they should be rewritten as
e2e tests.
2019-01-08 09:35:16 -08:00
Chris Baker 220e9e838f refactored config validation into a new method, modified Meta.Client
tests appropriately
2019-01-08 15:07:36 +00:00
Mahmood Ali 916a40bb9e move cstructs.DeviceNetwork to drivers pkg 2019-01-08 09:11:47 -05:00
Chris Baker 91449d6809 Merge branch 'master' of github.com:hashicorp/nomad into f-1157-validate-node-meta-variables 2019-01-08 02:17:35 +00:00
Chris Baker bf00f93d87 moved interp key regex out to a helper function 2019-01-08 00:11:47 +00:00
Alex Dadgar 8a35d7b1dd Test recovery 2019-01-07 14:49:41 -08:00
Chris Baker f99e18aaf4 gofmt to make check happy 2019-01-07 18:01:59 +00:00
Chris Baker a61afad5bb added validation on client metadata keys 2019-01-07 17:16:38 +00:00
Nick Ethier ab3c5c0a8b
fix test 2018-12-20 13:54:29 -05:00
Nick Ethier fad553ab6a
command: wait for drivers to be ready before test 2018-12-20 13:52:33 -05:00
Nick Ethier 5b9bba08c6
fix tests 2018-12-20 01:05:17 -05:00
Nick Ethier 060ceb3635
fix test 2018-12-20 01:01:53 -05:00
Nick Ethier a96afb6c91
fix tests that fail as a result of async client startup 2018-12-20 00:53:44 -05:00
Nick Ethier 82175d1328
client/drivermananger: add driver manager
The driver manager is modeled after the device manager and is started by the client.
It's responsible for handling driver lifecycle and reattachment state, as well as
processing the incomming fingerprint and task events from each driver. The mananger
exposes a method for registering event handlers for task events that is used by the
task runner to update the server when a task has been updated with an event.

Since driver fingerprinting has been implemented by the driver manager, it is no
longer needed in the fingerprint mananger and has been removed.
2018-12-18 22:55:18 -05:00
Alex Dadgar 4c57d2ec4d Add plugin API versioning to plugin loader and plugins 2018-12-18 16:48:00 -08:00
Nick Ethier 09dadf0a23
Merge branch 'master' into f-grpc-executor
* master: (71 commits)
  Fix output of 'nomad deployment fail' with no arg
  Always create a running allocation when testing task state
  tests: ensure exec tests pass valid task resources (#4992)
  some changes for more idiomatic code
  fix iops related tests
  fixed bug in loop delay
  gofmt
  improved code for readability
  client: updateAlloc release lock after read
  fixup! device attributes in `nomad node status -verbose`
  drivers/exec: support device binds and mounts
  fix iops bug and increase test matrix coverage
  tests: tag image explicitly
  changelog
  ci: install lxc-templates explicitly
  tests: skip checking rdma cgroup
  ci: use Ubuntu 16.04 (Xenial) in TravisCI
  client: update driver info on new fingerprint
  drivers/docker: enforce volumes.enabled (#4983)
  client: Style: use fluent style for building loggers
  ...
2018-12-13 14:41:09 -05:00
Brian Lalor 31ef34838e
Fix output of 'nomad deployment fail' with no arg 2018-12-13 13:22:17 -05:00
Mahmood Ali d497729826
Merge pull request #4978 from hashicorp/f-device-tweaks
Display device attributes in `nomad node status -verbose`
2018-12-12 19:45:07 -05:00
Mahmood Ali 00c9385a2b fixup! device attributes in nomad node status -verbose 2018-12-12 09:17:31 -05:00
Alex Dadgar 86d9ad4397 fix iops bug and increase test matrix coverage 2018-12-11 15:28:21 -08:00
Mahmood Ali 69b2355274
Merge pull request #4975 from hashicorp/fix-master-20181209
Some test fixes and remedies
2018-12-11 18:00:21 -05:00
Alex Dadgar 1531b6d534
Merge pull request #4970 from hashicorp/f-no-iops
Deprecate IOPS
2018-12-11 12:51:22 -08:00
Mahmood Ali 5a487ac884 tests: prevent indefinite blocking in some tests
Noticed few places where tests seem to block indefinitely and panic
after the test run reaches the test package timeout.

I intend to follow up with the proper fix later, but timing out is much
better than indefinitely blocking.
2018-12-11 09:35:26 -05:00
Michael Schurter 8808ab9cea
Merge pull request #4953 from hashicorp/b-script-context-wrapper
consul: add ScriptExecutor context wrapper
2018-12-10 17:22:53 -08:00
Michael Schurter 4c5f3ae82c
Merge pull request #4952 from hashicorp/b-script-context
consul: fix script checks exiting after 1 run
2018-12-10 17:22:15 -08:00
Mahmood Ali 14668f48d1 device attributes in nomad node status -verbose
This reports device attributes like the following:

```
$ nomad node status -self -verbose
ID          = f7adb958-29e1-2a5a-2303-9d61ffaab33a
Name        = mars.local
Class       = <none>
DC          = dc1
Drain       = false
Eligibility = eligible
Status      = ready
Uptime      = 12h40m13s

Drivers
Driver       Detected  Healthy  Message                               Time
docker       true      true     healthy                               2018-12-10T11:47:19-05:00
...

Attributes
cpu.arch                      = amd64
cpu.frequency                 = 2200
cpu.modelname                 = Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
cpu.numcores                  = 12
...

Device Group Attributes
Device Group = nomad/file/mock
block_device = sda1
filesystem   = ext4
size         = 63.2 GB

Meta
```
2018-12-10 12:18:24 -05:00
Mahmood Ali 9f69b8bfec Rename helper_stats -> helper_devices 2018-12-10 12:18:24 -05:00
Nick Ethier 47df1dde10
Merge branch 'master' into f-grpc-executor 2018-12-06 21:42:38 -05:00
Nick Ethier 29ef54c0ee
executor: merge plugin shim with executor package 2018-12-06 21:13:45 -05:00
Alex Dadgar c918a96490 Warn if IOPS is being used 2018-12-06 16:17:09 -08:00
Alex Dadgar 1e3c3cb287 Deprecate IOPS
IOPS have been modelled as a resource since Nomad 0.1 but has never
actually been detected and there is no plan in the short term to add
detection. This is because IOPS is a bit simplistic of a unit to define
the performance requirements from the underlying storage system. In its
current state it adds unnecessary confusion and can be removed without
impacting any users. This PR leaves IOPS defined at the jobspec parsing
level and in the api/ resources since these are the two public uses of
the field. These should be considered deprecated and only exist to allow
users to stop using them during the Nomad 0.9.x release. In the future,
there should be no expectation that the field will exist.
2018-12-06 15:09:26 -08:00
Nick Ethier 8b20de4801
executor: use grpc instead of netrpc as plugin protocol
* Added protobuf spec for executor
 * Seperated executor structs into their own package
2018-12-05 11:03:56 -05:00
Mahmood Ali f8efc40b8b tests: stop integration tests tasks explicitly
Also update the new recommended `nomad job` subcommands
2018-12-04 11:50:59 -05:00
Michael Schurter 8fa5e90095 consul: add ScriptExecutor context wrapper
Since d335a82859ca2177bc6deda0c2c85b559daf2db3 ScriptExecutors now take
a timeout duration instead of a context. This broke the script check
removal code which used context cancelation propagation to remove
script checks while they were executing.

This commit adds a wrapper around ScriptExecutors that obeys context
cancelation again. The only downside is that it leaks a goroutine until
the underlying Exec call completes or timeouts.

Since check removal is relatively rare, check timeouts usually low, and
scripts usually fast, the risk of leaking a goroutine seems very small.
2018-12-03 20:26:31 -08:00
Michael Schurter 6459c19ffc consul: fix script checks exiting after 1 run
Fixes a regression caused in d335a82859ca2177bc6deda0c2c85b559daf2db3

The removal of the inner context made the remaining cancels cancel the
outer context and cause script checks to exit prematurely.
2018-12-03 18:50:02 -08:00
Danielle Tomlinson 51a9f7369e
Merge pull request #4936 from hashicorp/f-legacy-refactor
Refactor and repackage client/driver
2018-11-30 13:38:06 +01:00
Danielle Tomlinson ffc5e5d56b executors: Unify go-plugin handshake 2018-11-30 10:59:23 +01:00
Danielle Tomlinson fdfe93aa25 fixup: executorplugin: fix rkt build 2018-11-30 10:47:08 +01:00
Danielle Tomlinson d26a310db0 client: Move executor plugins into own package 2018-11-30 10:46:13 +01:00
Danielle Tomlinson 9b3e731f88 command: Remove Extraneous field in nodedrain test 2018-11-30 10:46:13 +01:00
Nick Ethier 80ae7e34f4
Merge pull request #4906 from hashicorp/f-metric-prefix-master
Port metric prefix filtering to master
2018-11-29 22:27:47 -05:00
Nick Ethier b1484aec33
nomad: fix hclog usage 2018-11-29 22:27:39 -05:00
Alex Dadgar 4ee603c382 Device hook and devices affect computed node class
This PR introduces a device hook that retrieves the device mount
information for an allocation. It also updates the computed node class
computation to take into account devices.

TODO Fix the task runner unit test. The environment variable is being
lost even though it is being properly set in the prestart hook.
2018-11-27 17:25:33 -08:00
Nick Ethier ed65610ec6
command/agent: additional tests for telemetry config parsing 2018-11-19 23:22:33 -05:00
Nick Ethier b81e4e18f0
agent: suppose filter_default telemetry option 2018-11-19 23:21:48 -05:00
Nick Ethier 85b221a1d6
nomad: add flag to disable publishing of job_summary metrics for dispatched jobs 2018-11-19 23:21:19 -05:00
Nick Ethier 9e64ce7d73
docker: properly launch docker logger process 2018-11-19 22:59:12 -05:00
Mahmood Ali 9479015f51
Merge pull request #4884 from hashicorp/f-alloc-devices-cli
Report alloc device statistics in API and CLI
2018-11-16 18:04:54 -05:00
Mahmood Ali 6f9126f475 show Device Stats header in alloc status 2018-11-16 17:34:37 -05:00
Mahmood Ali 00ffd02ced Show stable order of device attributes 2018-11-16 17:34:37 -05:00
Preetha 5f094633fa
Merge pull request #4889 from hashicorp/f-service-meta
Pass service metadata "external-source" for consul UI integration
2018-11-16 12:24:21 -06:00
Preetha Appan 18708d3f0b
Pass service metadata "external-source" for consul UI integration 2018-11-16 11:28:56 -06:00
Mahmood Ali 33c96a803a tweak whitespace in device stats output 2018-11-16 10:37:39 -05:00
Mahmood Ali 159b8f866a Display device stats in nomad alloc status 2018-11-16 10:26:32 -05:00
Mahmood Ali d88a3f8413 Prepare to reuse device resources printing 2018-11-16 10:26:32 -05:00
Michael Schurter 6f3712ed48 gofmt -s -w command/helper_stats_test.go
Fixes the static checks build
2018-11-15 14:14:05 -08:00
Mahmood Ali 24b37e0aaf Display StatsObject nested objects as well 2018-11-15 08:09:54 -05:00
Mahmood Ali ee9353fbd6 Use disk display format for devices 2018-11-14 22:13:23 -05:00
Mahmood Ali 0712be643f Print verbose device in nomad node status -stats 2018-11-14 22:13:23 -05:00
Mahmood Ali 93e8fc53f9 device stats summary in node status
Sample output with a mock device:

```
Host Resource Utilization
CPU             Memory          Disk
2651/26400 MHz  9.6 GiB/16 GiB  98 GiB/234 GiB

Device Resource Utilization
nomad/file/mock[README.md]    511 bytes
nomad/file/mock[e2e.go]       239 bytes
nomad/file/mock[e2e_test.go]  128 bytes

Allocations
No allocations placed
```
2018-11-14 22:13:23 -05:00
Mahmood Ali c62ec124c0 Set clean config for mock driver
The default job here contains some exec task config (for setting
command and args) that aren't used for mock driver.  Now, the alloc
runner seems stricter about validating fields and errors on unexpected
fields.

Updating configs in tests so we can have an explicit task config
whenever driver is set explicitly.
2018-11-13 10:21:40 -05:00
Mahmood Ali c7610d8c22 mark and skip failing consul failing tests 2018-11-13 10:21:40 -05:00
Preetha Appan fd0ba320da
change path to v1/scheduler/configuration 2018-11-12 15:57:45 -06:00
Preetha Appan 3a10a589d7
Fix failing test 2018-11-10 19:53:47 -06:00
Preetha Appan 7ef126a027
Smaller methods, and added tests for RPC layer 2018-11-10 17:37:33 -06:00
Preetha Appan 75662b50d1
Use response object/querymeta/writemeta in scheduler config API 2018-11-10 10:31:10 -06:00
Alex Dadgar 98398a8a44
Merge pull request #4842 from hashicorp/b-deployment-progress-deadline
Fix multiple bugs with progress deadline handling
2018-11-08 13:31:54 -08:00
Preetha Appan f1a69e529c
Fix vet error 2018-11-08 09:48:43 -06:00
Preetha Appan da75388a9b
review feedback 2018-11-08 09:48:43 -06:00
Preetha Appan d88ec0267f
Comments 2018-11-08 09:48:43 -06:00
Preetha Appan 5f0a9d2cfd
Show preemption output in plan CLI 2018-11-08 09:48:43 -06:00
Alex Dadgar 204ca8230c Device manager
Introduce a device manager that manages the lifecycle of device plugins
on the client. It fingerprints, collects stats, and forwards Reserve
requests to the correct plugin. The manager, also handles device plugins
failing and validates their output.
2018-11-07 10:43:15 -08:00
Michael Schurter 392d548b85
Merge pull request #4828 from hashicorp/b-restore
Implement client agent restarting
2018-11-05 18:50:15 -06:00
Alex Dadgar 1c31970464 Fix multiple tgs with progress deadline handling
Fix an issue in which the deployment watcher would fail the deployment
based on the earliest progress deadline of the deployment regardless of
if the task group has finished.

Further fix an issue where the blocked eval optimization would make it
so no evals were created to progress the deployment. To reproduce this
issue, prior to this commit, you can create a job with two task groups.
The first group has count 1 and resources such that it can not be
placed. The second group has count 3, max_parallel=1, and can be placed.
Run this first and then update the second group to do a deployment. It
will place the first of three, but never progress since there exists a
blocked eval. However, that doesn't capture the fact that there are two
groups being deployed.
2018-11-05 16:06:17 -08:00
Michael Schurter 6bdbfb8129 tests: get consul integration tests building 2018-11-05 12:32:05 -08:00
Preetha b2b52b1ada
Merge pull request #4794 from hashicorp/f-preemption-systemjobs
Preemption for system jobs
2018-11-02 16:28:06 -05:00
Preetha Appan d03201adf8
Fix formatting of allocation score metrics 2018-10-30 12:03:23 -05:00
Preetha Appan 9d316cbbef
Fix return type in tests after refactor 2018-10-30 11:10:46 -05:00
Preetha Appan 8f7eb61823
Introduce a response object for scheduler configuration 2018-10-30 11:06:32 -05:00
Preetha Appan c1c1c230e4
Make preemption config a struct to allow for enabling based on scheduler type 2018-10-30 11:06:32 -05:00
Preetha Appan bd34cbb1f7
Support for new scheduler config API, first use case is to disable preemption 2018-10-30 11:06:32 -05:00
Michael Schurter d71a1b4547 tests: more fixes due to api changes 2018-10-29 15:25:22 -07:00
Michael Schurter 2b1b3d7e1e tests: get tests building if not yet passing 2018-10-16 16:56:57 -07:00
Michael Schurter 1a29337e48 register drivers by default
Do not register mock_driver on release builds.
2018-10-16 16:56:56 -07:00
Nick Ethier 3183b33d24 client: review comments and fixup/skip tests 2018-10-16 16:56:56 -07:00
Nick Ethier f192c3752a client: refactor post allocrunnerv2 finalization 2018-10-16 16:56:56 -07:00
Nick Ethier 4a4c7dbbfc client: begin driver plugin integration
client: fingerprint driver plugins
2018-10-16 16:56:56 -07:00
Alex Dadgar 7946a14aa8 Fix lints 2018-10-16 16:56:56 -07:00
Alex Dadgar 45e41cca03 allocrunnerv2 -> allocrunner 2018-10-16 16:56:56 -07:00
Alex Dadgar 6c9d9d5173 move files around 2018-10-16 16:56:55 -07:00
Michael Schurter f279b1d1b1 tests: test logs endpoint against pending task
Although the really exciting change is making WaitForRunning return the
allocations that it started. This should cut down test boilerplate
significantly.
2018-10-16 16:56:55 -07:00
Michael Schurter 6bcf772f3c tests: test via ServeMux so http codes are set 2018-10-16 16:56:55 -07:00
Michael Schurter 960f3be76c client: expose task state to client
The interesting decision in this commit was to expose AR's state and not
a fully materialized Allocation struct. AR.clientAlloc builds an Alloc
that contains the task state, so I considered simply memoizing and
exposing that method.

However, that would lead to AR having two awkwardly similar methods:
 - Alloc() - which returns the server-sent alloc
 - ClientAlloc() - which returns the fully materialized client alloc

Since ClientAlloc() could be memoized it would be just as cheap to call
as Alloc(), so why not replace Alloc() entirely?

Replacing Alloc() entirely would require Update() to immediately
materialize the task states on server-sent Allocs as there may have been
local task state changes since the server received an Alloc update.

This quickly becomes difficult to reason about: should Update hooks use
the TaskStates? Are state changes caused by TR Update hooks immediately
reflected in the Alloc? Should AR persist its copy of the Alloc? If so,
are its TaskStates canonical or the TaskStates on TR?

So! Forget that. Let's separate the static Allocation from the dynamic
AR & TR state!

 - AR.Alloc() is for static Allocation access (often for the Job)
 - AR.AllocState() is for the dynamic AR & TR runtime state (deployment
   status, task states, etc).

If code needs to know the status of a task: AllocState()
If code needs to know the names of tasks: Alloc()

It should be very easy for a developer to reason about which method they
should call and what they can do with the return values.
2018-10-16 16:56:55 -07:00
Michael Schurter 1c9ccdeab5 tests: fix races caused by sharing a buffer
httptest.ResponseRecorder exposes a bytes.Buffer which we were reading
and writing concurrently to test streaming log APIs. This is a race, so
I wrapped the struct in a lock with some helpers.
2018-10-16 16:56:55 -07:00
Alex Dadgar 84ce8c3487 extra logging 2018-10-16 16:56:55 -07:00
Alex Dadgar 6f0ed6184b Fix client reloading and pass the plugin loaders to server and client 2018-10-16 16:56:55 -07:00
Alex Dadgar 183561cf82 Plugin loader initialization 2018-10-16 16:54:12 -07:00
Nick Ethier 5dee1141d1 executor v2 (#4656)
* client/executor: refactor client to remove interpolation

* executor: POC libcontainer based executor

* vendor: use hashicorp libcontainer fork

* vendor: add libcontainer/nsenter dep

* executor: updated executor interface to simplify operations

* executor: implement logging pipe

* logmon: new logmon plugin to manage task logs

* driver/executor: use logmon for log management

* executor: fix tests and windows build

* executor: fix logging key names

* executor: fix test failures

* executor: add config field to toggle between using libcontainer and standard executors

* logmon: use discover utility to discover nomad executable

* executor: only call libcontainer-shim on main in linux

* logmon: use seperate path configs for stdout/stderr fifos

* executor: windows fixes

* executor: created reusable pid stats collection utility that can be used in an executor

* executor: update fifo.Open calls

* executor: fix build

* remove executor from docker driver

* executor: Shutdown func to kill and cleanup executor and its children

* executor: move linux specific universal executor funcs to seperate file

* move logmon initialization to a task runner hook

* client: doc fixes and renaming from code review


* taskrunner: use shared config struct for logmon fifo fields

* taskrunner: logmon only needs to be started once per task
2018-10-16 16:53:31 -07:00
Michael Schurter a4b4d7b266 consul service hook
Deregistration works but difficult to test due to terminal updates not
being fully implemented in the new client/ar/tr.
2018-10-16 16:53:29 -07:00
Alex Dadgar a78cefec18 use int64 2018-10-16 15:34:32 -07:00
Preetha Appan 7c0d8c646c
Change CPU/Disk/MemoryMB to int everywhere in new resource structs 2018-10-16 16:21:42 -05:00
Alex Dadgar 5a07f9f96e parse affinities and constraints on devices 2018-10-11 14:05:19 -07:00
Alex Dadgar 87cacb427f parse devices 2018-10-08 16:09:41 -07:00
Alex Dadgar 6b08b9d6b6 Define device request structs 2018-10-08 15:38:03 -07:00
Alex Dadgar 01f8e5b95f renames 2018-10-04 14:57:25 -07:00
Alex Dadgar 52f9cd7637 fixing tests 2018-10-04 14:26:19 -07:00
Alex Dadgar bac5cb1e8b Scheduler uses allocated resources 2018-10-02 17:08:25 -07:00
Alex Dadgar 5c8697667e Node reserved resources 2018-09-29 18:44:55 -07:00
Alex Dadgar c031b22d03 Fix autopilot set enable custom upgrades flag 2018-09-25 13:49:35 -07:00
Alex Dadgar ca28afa3b2 small fixes 2018-09-15 16:42:38 -07:00
Alex Dadgar 3c19d01d7a server 2018-09-15 16:23:13 -07:00
Alex Dadgar 7739ef51ce agent + consul 2018-09-13 10:43:40 -07:00
Alex Dadgar 4f89cabd34
Merge pull request #4631 from hashicorp/f-plugin-config
Parse plugin configs
2018-09-04 17:04:13 -07:00
Alex Dadgar cc92cd92cd
Merge pull request #4642 from hashicorp/b-vet
Fix vet errors and use newer go version in travis
2018-09-04 17:04:02 -07:00
Alex Dadgar c6576ddac1 Fix make check errors 2018-09-04 16:03:52 -07:00
Preetha Appan 254e90ba0e
Fix linting 2018-09-04 16:10:11 -05:00
Preetha Appan 4f8e925b54
Move topk and delay heap to separate packages under lib 2018-09-04 16:10:11 -05:00
Preetha Appan 9bc0962527
Track top k nodes by norm score rather than top k nodes per scorer 2018-09-04 16:10:11 -05:00
Preetha Appan 063004502a
Fix linting 2018-09-04 16:10:11 -05:00
Preetha Appan 6ed527c636
Use heap to store top K scoring nodes.
Scoring metadata is now aggregated by scorer type to make it easier
to parse when reading it in the CLI.
2018-09-04 16:10:11 -05:00
Preetha Appan 659cfa3f64
Parsing and API layer for spread stanza 2018-09-04 16:10:11 -05:00
Preetha Appan f3c4eead91
Refactor method to return affinity struct, and add extra test at task level 2018-09-04 16:10:11 -05:00
Preetha Appan 9f0caa9c3d
Affinity parsing, api and structs 2018-09-04 16:10:11 -05:00
Alex Dadgar c0de218747 plugin dir parsing 2018-08-30 13:43:09 -07:00
Alex Dadgar bff1669ee4 Plugin config parsing 2018-08-29 17:06:01 -07:00
Wyatt Anderson 9dccb62489 Add documentation for eligibility toggle endpoint 2018-08-24 10:50:12 -04:00
Chelsea Komlo 0a69cdb304
Merge pull request #4565 from hashicorp/b-compare-cert-alg
Error if TLS Certificate signature algorithm isn't supported in cipher suites
2018-08-15 16:09:46 -04:00
Chelsea Holland Komlo 71a4ced04c fix up test failure due to keyloader instantiated on tls config during parsing 2018-08-15 00:59:29 -04:00
Chelsea Komlo a936c452b5
Merge pull request #4577 from hashicorp/b-panic-job-history
Fix for panic when submitting non-existent version for job history CLI command
2018-08-14 17:34:05 -04:00
Chelsea Holland Komlo ba7a46471f spelling fix 2018-08-14 14:06:04 -04:00
Chelsea Holland Komlo 3e85a197b8 fix panic for job history cli command when used with non-existent job version 2018-08-13 16:57:36 -04:00
Chelsea Holland Komlo e8379c9059 skip update checking if DisableUpdateCheck is set to true 2018-08-10 13:08:13 -04:00
Chelsea Holland Komlo b92098fd08 change function signature to take entire tls config object 2018-08-10 12:37:21 -04:00
Chelsea Holland Komlo 75d631a1c8 fix reload issue for tls certificates in dev mode 2018-07-05 17:08:31 -04:00
Dirk Kok 0cb04c2cbf
Fix typo in nomad node help text
The command `nomad node eligibility` doesn't accept the `-disabled` option, this should be `-disable`.
2018-06-14 15:48:01 +02:00
Alex Dadgar b61051b3cd
Merge pull request #4409 from hashicorp/r-client-packages
Refactor client packages
2018-06-13 17:32:25 -07:00
Alex Dadgar 300b1a7a15 Tests only use testlog package logger 2018-06-13 15:40:56 -07:00
Chelsea Komlo 03075b603a
Merge pull request #4399 from hashicorp/r-reload-refactor
Refactor logic for dynamic reloading
2018-06-13 13:35:12 -04:00
Alex Dadgar 90c2108bfb Fix gc tests + parallel destroy + small test fixes 2018-06-12 10:23:45 -07:00
Alex Dadgar f5ff509fa5 Refactor - wip 2018-06-12 10:23:45 -07:00
Alex Dadgar af5753d2cd bump version + generated files 2018-06-11 13:39:42 -07:00
Chelsea Holland Komlo 3b5d5c7be8 remove logic to reload RPC connections from agent 2018-06-08 13:14:40 -04:00
Alex Dadgar 0181f5defc test less of the monitor on travis 2018-06-07 15:47:03 -07:00
Alex Dadgar 8efe9696ad move log line 2018-06-07 15:12:51 -07:00
Chelsea Komlo d738976234
Merge pull request #4395 from hashicorp/b-vault-second
Fix for dynamically reloading vault
2018-06-07 18:03:00 -04:00
Chelsea Holland Komlo dcc9cdfeb7 fixup! comment and move to always log server reload operation 2018-06-07 17:12:36 -04:00
Chelsea Holland Komlo 9f6bd7bf3a move logic for testing equality for vault config 2018-06-07 16:23:50 -04:00
Chelsea Holland Komlo 282f37b1ee fix for dynamically reloading vault 2018-06-07 15:34:18 -04:00
Alex Dadgar cfaa52e55e
Merge pull request #4380 from hashicorp/b-drain-monitor
Monitoring non-draining node exits
2018-06-06 17:50:30 -07:00
Michael Schurter 0fc624133d
Merge pull request #4384 from hashicorp/b-global-log-flags
agent: global logger should use the same flags
2018-06-06 15:15:15 -07:00
Alex Dadgar 72effb8632 code review 2018-06-06 14:52:26 -07:00
Alex Dadgar c441c17927
Merge pull request #4382 from hashicorp/b-init
Progress deadline included in nomad init
2018-06-06 14:49:10 -07:00
Alex Dadgar d478b50393 indentation 2018-06-06 14:48:51 -07:00
Alex Dadgar 217231347f Handle force draining 2018-06-06 13:05:39 -07:00
Michael Schurter f8e12e6ee7 agent: global logger should use the same flags
Prior to this change logs from the global logger only used seconds:

```
2018/06/06 18:25:58 http: TLS handshake error from ...
```

After this change they properly use the microseconds flag:

```
2018/06/06 18:39:50.702447 http: TLS handshake error ...
```

They still lack a log level unfortunately.
2018-06-06 11:40:08 -07:00
Alex Dadgar 14c1bec157 progress deadline init 2018-06-06 10:30:47 -07:00
Alex Dadgar f4fccd7ed2 Monitoring non-draining node exits 2018-06-05 17:58:44 -07:00
Preetha Appan 82837839eb
Fix bug with determining when agent is a client
This fixes a bug introduced in commit e27caadca6 that sets a boolean flag
when the agent is a client. It incorrectly checked state before initializing
the client. This leads to Nomad clients not deregistering any services registered
in Consul after allocs are destroyed
2018-06-05 19:19:52 -05:00
Alex Dadgar c0386819b3 bump version/lint/generated files 2018-06-01 15:23:10 -07:00
Alex Dadgar 247f1edb11 spelling 2018-06-01 14:53:08 -07:00
Preetha Appan ce6d4a8d7a
Fix tests and move isClient to constructor 2018-06-01 15:59:53 -05:00
Preetha Appan a5bfaa098c
Fix unnecessary deregistration in consul sync
This commit fixes an issue where if a nomad client and server shared the same consul instance, the server would deregister any services and checks registered by clients for running tasks.
2018-06-01 14:48:25 -05:00
Alex Dadgar 40fec81315
Merge pull request #4277 from hashicorp/f-retry-join-clients
Add go-discover support to Nomad clients
2018-06-01 16:57:40 +00:00
Alex Dadgar 62665d8619 Fix node drain monitor 2018-05-31 15:50:05 -07:00
Alex Dadgar aca8d5cece Actually disable the schedulers 2018-05-31 13:11:11 -07:00
Alex Dadgar d098885b79 Disable schedulers for TestHTTP_AllocSnapshot_Atomic 2018-05-31 12:05:44 -07:00
Alex Dadgar 4765b62284 Improve validation/defaulting, handle start-join
This commit:
* Improves how we combine the old retry-* fields and the new stanza and
how it is validated
* Handles the new stanza setting start_join
* Fixes integration test to not bind to the standard port and instead be
randomized.
* Simplifies parsing of the old retry_interval
* Fixes the errors from retry join being masked
* Flags get parsed into new server_join stanza
2018-05-31 10:53:26 -07:00
Alex Dadgar e1bf8780b5 validation errors 2018-05-31 10:53:26 -07:00
Alex Dadgar a02fbe3e0f indentation 2018-05-31 10:53:26 -07:00
Chelsea Holland Komlo 2bf2af4378 ensure default value of 30s is set for server_join stanza 2018-05-31 10:50:04 -07:00
Chelsea Holland Komlo 307458d4a3 ignore default values for retry interval
add additional validation case
2018-05-31 10:50:04 -07:00
Chelsea Holland Komlo ebc758aa0e add stronger protections for nil pointers in server join merge 2018-05-31 10:50:04 -07:00
Chelsea Holland Komlo 10aff14509 update config parse test
documentation fixes
2018-05-31 10:50:04 -07:00
Chelsea Holland Komlo ac1411ce95 RetryInterval should be a time.Duration 2018-05-31 10:50:04 -07:00
Chelsea Holland Komlo e79bc29e1a set retryInterval and other code feedback 2018-05-31 10:50:04 -07:00
Chelsea Holland Komlo de03c884bc add further configuration validation for server_join 2018-05-31 10:50:04 -07:00
Chelsea Holland Komlo df7539b9d0 update documentation for server_join 2018-05-31 10:50:04 -07:00
Chelsea Holland Komlo a4e514e07f update server_join naming and improve logging 2018-05-31 10:50:03 -07:00
Chelsea Holland Komlo 064b5481e0 add server join info to server and client 2018-05-31 10:50:03 -07:00
Preetha Appan 7414395daa
Use constant in test 2018-05-30 17:27:04 -05:00
Preetha Appan 6cbd25945c
Add unit test to verify compatibility code for node drains 2018-05-30 17:14:53 -05:00
Preetha Appan 4f835790d7
Set node eligibility to true when old client calls disable 2018-05-30 16:54:07 -05:00
Preetha Appan 34db410b74
Fix failing test TestClientStatusRequest 2018-05-30 15:11:54 -05:00
Preetha Appan 2752204f26
Fix failing test TestHTTP_AllocAllGC 2018-05-30 15:11:54 -05:00
Chelsea Holland Komlo 19e4a5489b add support for tls PreferServerCipherSuites
add further tests for tls configuration
2018-05-25 13:20:00 -04:00
Chelsea Komlo af15dda45a
Merge pull request #4328 from hashicorp/r-single-tls-config-constructor
Refactor to prefer using NewTLSConfiguration constructor
2018-05-24 13:46:29 -04:00
Alex Dadgar b1de61e012
Merge pull request #4321 from hashicorp/f-network-info
Display bind/advertise addresses on agent startup
2018-05-24 17:30:56 +00:00
Charlie Voiselle bbbd385dff Fixed typo in deployment help text 2018-05-24 12:44:21 -04:00
Nick Ethier b62825b49c
command: fix node drain monitor case 2018-05-24 06:39:12 -04:00
Nick Ethier b1d2437cf6
command: add docs for node drain -monitor flag 2018-05-24 06:37:28 -04:00
Nick Ethier 3c55f89738
command: use 0 as index for monitor request 2018-05-24 06:37:28 -04:00
Nick Ethier b52d2e3e74
command: add '-monitor' flag to node drain 2018-05-24 06:37:25 -04:00
Chelsea Holland Komlo 38f611a7f2 refactor NewTLSConfiguration to pass in verifyIncoming/verifyOutgoing
add missing fields to TLS merge method
2018-05-23 18:35:30 -04:00
Alex Dadgar 51e67daf69 Use Tags when CanaryTags isn't specified
This PR fixes a bug where we weren't defaulting to `tags` when
`canary_tags` was empty and adds documentation.
2018-05-23 13:07:47 -07:00
Alex Dadgar dd52ec402c Display bind/advertise addresses on agent startup
Sample outputs from demo/vagrant/(server/client1).hcl and `nomad agent -dev` mode

Server:

```
==> Nomad agent configuration:

       Advertise Addrs: HTTP: 192.168.1.75:4646; RPC: 192.168.1.75:4647; Serf: 192.168.1.75:4648
            Bind Addrs: HTTP: 0.0.0.0:4646; RPC: 0.0.0.0:4647; Serf: 0.0.0.0:4648
                Client: false
             Log Level: DEBUG
                Region: global (DC: dc1)
                Server: true
               Version: 0.8.4-dev
```

Client:

```
==> Nomad agent configuration:

       Advertise Addrs: HTTP: 192.168.1.75:5656
            Bind Addrs: HTTP: 0.0.0.0:5656
                Client: true
             Log Level: DEBUG
                Region: global (DC: dc1)
                Server: false
               Version: 0.8.4-dev
```

Dev:

```
==> Nomad agent configuration:

       Advertise Addrs: HTTP: 127.0.0.1:4646; RPC: 127.0.0.1:4647; Serf: 127.0.0.1:4648
            Bind Addrs: HTTP: 127.0.0.1:4646; RPC: 127.0.0.1:4647; Serf: 127.0.0.1:4648
                Client: true
             Log Level: DEBUG
                Region: global (DC: dc1)
                Server: true
               Version: 0.8.4-dev
```
2018-05-22 15:14:33 -07:00
Alex Dadgar 44697efd9a safety guard 2018-05-22 14:45:34 -07:00
Alex Dadgar 586895965c Unit test for dev agent 2018-05-22 14:45:34 -07:00
Alex Dadgar 58d2a4c7c2 Do not bypass normal RPC codepath when running both client and server at once 2018-05-22 14:45:34 -07:00
Alex Dadgar 21c5ed850d Register events 2018-05-22 14:06:33 -07:00
Preetha 159888a856
Merge pull request #4274 from hashicorp/f-force-rescheduling
Add CLI and API support for forcing rescheduling of failed allocs
2018-05-21 16:24:22 -07:00
Preetha Appan 64ae37e19f
remove extra return 2018-05-21 18:00:14 -05:00
Chelsea Holland Komlo f0a5018a91 Add autocomplete where missing 2018-05-11 18:05:43 -04:00
Preetha Appan 3a8040e36f
Add new method EvaluateWithOptions to avoid breaking go API client 2018-05-11 14:18:53 -05:00
Preetha Appan e7d8ae70b2
more review feedback 2018-05-11 13:39:55 -05:00
Chelsea Komlo 687c26093c
Merge pull request #4269 from hashicorp/f-tls-remove-weak-standards
Configurable TLS cipher suites and versions; disallow weak ciphers
2018-05-11 08:11:46 -04:00
Nick Ethier b3612824ed
Merge pull request #4279 from hashicorp/f-short-init
Add job init '-short' command docs to website
2018-05-10 23:20:59 -04:00
Nick Ethier 29ddef040d
command: add autocomplete for init -short flag 2018-05-10 23:19:08 -04:00
Preetha Appan 24115138e8
unit test for job eval should detach 2018-05-10 15:30:44 -05:00
Preetha Appan e4ea18aee7
Add support for monitoring evals, and -detach/-verbose support 2018-05-10 15:02:58 -05:00
Preetha Appan bfa0937bbb
Code review feedback 2018-05-10 14:42:24 -05:00
Nick Ethier 5881e785c5
command: remove ephemeral disk from short init jobspec 2018-05-10 13:16:45 -04:00
Chelsea Holland Komlo 44f536f18e add support for configurable TLS minimum version 2018-05-09 18:07:12 -04:00
Chelsea Holland Komlo 796bae6f1b allow configurable cipher suites
disallow 3DES and RC4 ciphers

add documentation for tls_cipher_suites
2018-05-09 17:15:31 -04:00
Preetha Appan b12df3c64b
Added CLI for evaluating job given ID, and modified client API for evaluate to take a request payload 2018-05-09 15:04:27 -05:00
Preetha Appan c1b92c284e
Work in progress - force rescheduling of failed allocs 2018-05-08 17:26:57 -05:00
Preetha e7ae6e98d9
Merge pull request #4259 from hashicorp/f-deployment-improvements 2018-05-08 16:37:10 -05:00
Chelsea Holland Komlo 136635f04d only write error log line on error 2018-05-07 16:57:07 -04:00
Chelsea Holland Komlo 30584639b5 remove log line for empty addresses which could confuse on initalization 2018-05-07 16:57:07 -04:00
Chelsea Holland Komlo 24ff40df01 retry until all options are exhausted 2018-05-07 16:57:07 -04:00
Chelsea Holland Komlo ec4be4f871 ensure provider= is always the string prefix 2018-05-07 16:57:07 -04:00
Chelsea Holland Komlo 5422b1b088 update test for more realistic IP address from go-discover 2018-05-07 16:57:07 -04:00
Chelsea Holland Komlo 7e4d4f8088 comments and other fixups 2018-05-07 16:57:06 -04:00
Chelsea Holland Komlo 8f584f6474 add go-discover 2018-05-07 16:57:06 -04:00
Chelsea Holland Komlo 25ad6eaf96 refactor to retryJoiner interface 2018-05-07 16:57:06 -04:00
Michael Schurter f1d13683e6
consul: remove services with/without canary tags
Guard against Canary being set to false at the same time as an
allocation is being stopped: this could cause RemoveTask to be called
with the wrong Canary value and leaking a service.

Deleting both Canary values is the safest route.
2018-05-07 14:55:01 -05:00
Michael Schurter 50e04c976e
consul: support canary tags for services
Also refactor Consul ServiceClient to take a struct instead of a massive
set of arguments. Meant updating a lot of code but it should be far
easier to extend in the future as you will only need to update a single
struct instead of every single call site.

Adds an e2e test for canary tags.
2018-05-07 14:55:01 -05:00
Alex Dadgar f4af30fbb5
Canary tags structs 2018-05-07 14:50:01 -05:00
Alex Dadgar f95ab4ade8
Mark canaries on creation, and unmark on promotion 2018-05-07 14:50:01 -05:00
Alex Dadgar 6f92e0711c
CLI 2018-05-07 14:50:01 -05:00
Alex Dadgar ee50789c22
Initial implementation 2018-05-07 14:50:01 -05:00
Michael Schurter 0d534d30d6
Merge pull request #4251 from hashicorp/f-grpc-checks
Support Consul gRPC Health Checks
2018-05-04 14:55:16 -07:00
Michael Schurter 70b02875b7
Merge pull request #4234 from hashicorp/b-4159
Fix race in StreamFramer and truncation in api/AllocFS.Logs
2018-05-04 14:24:07 -07:00
Michael Schurter f6a4713141 consul: make grpc checks more like http checks 2018-05-04 11:08:11 -07:00
Michael Schurter 382caec1e1 consul: initial grpc implementation
Needs to be more like http.
2018-05-04 11:08:11 -07:00
Michael Schurter cc13e733ac cli: expand drain info for node status 2018-05-04 10:32:25 -07:00
Nick Ethier 8d73d90cd7
command: remove all comments from short init jobspec 2018-05-03 15:34:56 -04:00
Michael Schurter 526af6a246 framer: fix early exit/truncation in framer 2018-05-02 10:46:16 -07:00
Michael Schurter 5d6bf2938f client: use a bytes.Reader for reading a []byte 2018-05-02 10:46:16 -07:00
Michael Schurter e00e265da5 client: ensure cancel is always called when func exits 2018-05-02 10:46:16 -07:00
Michael Schurter d21b749c68 client: give pipe conns meaningful names 2018-05-02 10:46:16 -07:00
Michael Schurter 5ef0a82e6e client: reset encoders between uses
According to go/codec's docs, Reset(...) should be called on
Decoders/Encoders before reuse:

https://godoc.org/github.com/ugorji/go/codec

I could find no evidence that *not* calling Reset() caused bugs, but
might as well do what the docs say?
2018-05-02 10:46:16 -07:00
Nick Ethier 842ed14e3f
command: add '-short' flag to init command that emits minimal jobspec 2018-05-01 12:51:13 -04:00
Alex Dadgar 43192cefae generated files 2018-04-26 16:28:58 -07:00
Alex Dadgar 15ad3f94af Fix command line 2018-04-26 15:46:22 -07:00
James Rasell b7c2ce2991
Update node-drain logging message to clearer for operators.
This change updates the console log message when performing a node
drain and particulary when a node has marked all allocs for
migration. Previously it logged 'drain complete' which was a little
confusing to operators as the node is not drained at this point.

Closes #4183
2018-04-24 07:50:01 +01:00
Michael Schurter ec201d5f19
Merge pull request #4178 from hashicorp/f-sort-drivers
cli: sort drivers in `node status <node_id>`
2018-04-20 14:03:26 -07:00
Michael Schurter cfcbb9fa21 consul: periodically reconcile services/checks
Periodically sync services and checks from Nomad to Consul. This is
mostly useful when testing with the Consul dev agent which does not
persist state across restarts. However, this is a reasonable safety
measure to prevent skew between Consul's state and Nomad's
services+checks.

Also modernized the test suite a bit.
2018-04-19 15:45:42 -07:00
Nick Ethier 2e6c95f511
Merge pull request #4138 from hashicorp/i-hcl-json-endpoint
HCL to JSON api endpoint
2018-04-19 14:18:34 -04:00
Nick Ethier c67c4f2d6e
Merge pull request #4176 from hashicorp/i-cli-error-help
Improve help text when invalid arguments are given
2018-04-19 12:45:22 -04:00
Nick Ethier 50bab2dd83
command: fix job dispatch arg check 2018-04-18 21:21:43 -04:00
Michael Schurter d3650fb2cd test: build with mock_driver by default
`make release` and `make prerelease` set a `release` tag to disable
enabling the `mock_driver`
2018-04-18 14:45:33 -07:00
Michael Schurter 8ece6b7163 cli: sort drivers in node status <node_id> 2018-04-18 14:30:44 -07:00
Nick Ethier d0326503ac
command: use ':' instead of ',' in error msg 2018-04-18 13:55:51 -04:00
Nick Ethier bde560a863
command: update tests to check for new error message 2018-04-18 13:51:17 -04:00
Nick Ethier 71fd07d4d0
command: fix misspelling in comment 2018-04-18 12:19:53 -04:00
Nick Ethier 182e3bec1b
command: improve help text when invalid arguments are given 2018-04-18 12:02:11 -04:00
Nick Ethier 555b036369
command/agent: fix url in jobs parse ep test 2018-04-16 19:21:09 -04:00
Nick Ethier 0baf9ba385
command/agent: fix ptr ref in job endpoint test 2018-04-16 19:21:09 -04:00
Nick Ethier de4176606d
command/agent: add Canonicalize option to parse args 2018-04-16 19:21:09 -04:00
Nick Ethier 31da01856a
command/agent: add HCL mock for parse endpoint 2018-04-16 19:21:09 -04:00
Nick Ethier f2db03e56c
command/agent: add /v1/jobs/parse endpoint
The parse endpoint accepts a hcl jobspec body within a json object
and returns the parsed json object for the job. This allows users to
register jobs with the nomad json api without specifically needing
a nomad binary to parse their hcl encoded jobspec file.
2018-04-16 19:21:06 -04:00
Alex Dadgar 663c4d0433 Version bump and generated files 2018-04-12 16:21:50 -07:00
Alex Dadgar 73bf52f203 Non-verbose driver formatting and don't display non-detected 2018-04-12 11:26:00 -07:00
Preetha Appan 4f9a52e4a4
Add link to autopilot guide in operator autopilot CLI help text 2018-04-12 09:02:32 -05:00
Preetha eeb065cb21
Merge pull request #4136 from hashicorp/f-eval-status-show-waituntil
Show wait until value in eval status if it is set
2018-04-11 14:49:29 -05:00
Preetha Appan affaa12260
Show wait until value in eval status if it is set 2018-04-11 14:01:42 -05:00
Preetha c88fef4c4b
Merge pull request #4127 from hashicorp/b-autopilot-removepeer-fixes
Add node id persistence
2018-04-10 16:05:00 -05:00
Preetha Appan f04c9d2b19
minor code review fix 2018-04-10 15:33:01 -05:00
Preetha Appan 6b6e97e50c
Dev mode should never persist nodeid 2018-04-10 12:34:14 -05:00
Preetha Appan e1c3c7ab6c
Lint fixes 2018-04-10 11:22:16 -05:00
Preetha Appan 6d0e1c9fea
Use preconfigured nodeID if there isn't a persisted node ID, and persist it if its not persisted. 2018-04-10 08:47:33 -05:00
Alex Dadgar d179a09b83 WIP: Not setting node id properlperly 2018-04-09 18:01:28 -07:00
Michael Schurter efb9a02430 cli: remove outdated hints
Hints no longer apply in the 0.8 Node RPC world.
2018-04-09 12:09:47 -07:00
Chelsea Holland Komlo 7043595b67 correctly initialize array length 2018-04-09 10:51:08 -04:00
Alex Dadgar 9ab32ab7e6 Fix eval status output for alloc failure 2018-04-06 17:16:43 -07:00
Chelsea Komlo d0361ee428
Merge pull request #4098 from hashicorp/f-driver-health-ux-improvements
Driver health UX improvements
2018-04-05 18:51:00 -04:00
Chelsea Holland Komlo fda8fd81bb remove stray comma 2018-04-05 18:20:52 -04:00
Chelsea Holland Komlo 59416f6416 Display truncated driver list in non-verbose mode
separate drivers by commas
2018-04-05 14:25:59 -04:00
Alex Dadgar 235a61cdd3 fix flaky pretty time test 2018-04-03 18:15:12 -07:00
Chelsea Holland Komlo ad4a11f3b4 display driver status only in verbose mode or if the driver is unhealthy 2018-04-03 11:23:12 -04:00
Michael Schurter 3495df7da9
Merge pull request #4085 from hashicorp/docs-node-drain
Initial Node drain docs
2018-03-30 16:34:49 -07:00
Alex Dadgar 6871a068cb Fix prefix output of agent metadata 2018-03-30 16:27:18 -07:00
Michael Schurter ac8969ea38 docs: add migrate to example.nomad 2018-03-30 16:18:46 -07:00
Alex Dadgar af81349dbe Generated files 2018-03-30 16:14:40 -07:00
Michael Schurter 8e7870b590 cli: remove info color on monitored output 2018-03-30 15:51:48 -07:00
Michael Schurter 6b94029740
Merge pull request #4084 from hashicorp/f-drain-timestamps
drain: add timestamps to cli output
2018-03-30 15:07:43 -07:00
Michael Schurter b5aa71a517 cli: fix drain monitoring formating 2018-03-30 14:20:30 -07:00
Michael Schurter 7199a2b960 cli: differentiate normal output vs info 2018-03-30 11:42:11 -07:00
Michael Schurter 0260bda046 cli: add color to drain output 2018-03-30 11:15:12 -07:00
Alex Dadgar 886f7feaef Generic handling of server region errors 2018-03-30 10:57:32 -07:00
Michael Schurter 88d0cdbb72 drain: add timestamps to cli output
Also toggle eligibility output based on value
2018-03-30 10:26:43 -07:00
Chelsea Komlo 2e3cb28cc5
Merge pull request #4013 from hashicorp/node-events-documentation
Add node events to documentation
2018-03-29 17:16:26 -04:00
Michael Schurter ee3eddbac3 drain: block cli until all allocs stop
Before the drain CLI would block until the node was marked as completing
drain operations. While technically correct, it could lead operators (or
more likely: scripts) to shutdown drained nodes before all of its
allocations had *actually* terminated.

This change makes the CLI block until all allocations have terminated
(unless ignoring system jobs).
2018-03-29 10:56:09 -07:00
Chelsea Holland Komlo 48765d334c add further driver info; driver name with subsystem 2018-03-29 13:34:30 -04:00
Chelsea Holland Komlo baeadae990 Revert "omit driver key/val from details"
This reverts commit 74fc5543babc85e3fa16c048bce92e71826e53cc.
2018-03-29 12:49:46 -04:00
Chelsea Holland Komlo a17f0d8875 omit driver key/val from details 2018-03-29 12:30:12 -04:00
Chelsea Komlo 607e631714
Merge pull request #4046 from hashicorp/tls-same-file-reload
Check file contents when determining if agent should reload TLS confi…
2018-03-29 10:51:32 -04:00
Chelsea Holland Komlo be15c07110 make check fix 2018-03-28 19:11:51 -04:00
Chelsea Holland Komlo 58ada9bc42 return error when setting checksum; don't reload 2018-03-28 18:15:50 -04:00
Chelsea Holland Komlo 38c7a38549 output warning for error in creating TLS checksum 2018-03-28 17:44:23 -04:00
Chelsea Komlo 7812ac5abf
Merge pull request #4057 from hashicorp/specify-docker-msg
Specify docker name in driver health messages
2018-03-28 13:32:36 -04:00
Chelsea Holland Komlo 770ac13bc4 check for nil, remove unnecessary set checksum call 2018-03-28 13:29:53 -04:00
Preetha 177d2d6010
Merge pull request #4052 from hashicorp/f-specify-total-memory
Allow to specify total memory on agent configuration
2018-03-28 12:28:41 -05:00
Chelsea Holland Komlo d3d686b58a fix up test for file content changes 2018-03-28 13:18:13 -04:00
Preetha Appan 5c2fc6d46b
Fix compilation fail in test after rebase 2018-03-28 11:54:34 -05:00
Chelsea Holland Komlo 263544bf52 reformat message 2018-03-28 12:28:16 -04:00
Chelsea Holland Komlo efc03e252c specify driver health messages 2018-03-28 11:35:21 -04:00
Chelsea Holland Komlo 2d5af7ff4d set TLS checksum when parsing config
Refactor checksum comparison, always set checksum if it is empty
2018-03-28 09:56:11 -04:00
Michael Schurter 79a2781585
Merge pull request #4053 from hashicorp/b-drain-sys-jobs-2
drain: fix draining of system jobs
2018-03-27 16:26:45 -07:00
Alex Dadgar de4b3772f1 Create evals for system jobs when drain is unset
This PR creates evals for system jobs when:

* Drain is unset and mark eligible is true
* Eligibility is restored to the node
2018-03-27 15:53:24 -07:00
Michael Schurter 559dae984f drain: give cli more time before exiting
May want to bump this a bit more as the drain often completes amidst a
flurry of alloc updates.
2018-03-27 14:24:02 -07:00
Mildred Ki'Lya 1017cbe8ab
Allow to specify total memory on agent configuration
Allow to set the total memory of an agent in its configuration file. This
can be used in case the automatic detection doesn't work or in specific
environments when memory overcommit (using swap for example) can be
desirable.
2018-03-27 15:46:18 -05:00
Chelsea Holland Komlo 003bc209b9 use time.Time for node events for compatibility 2018-03-27 15:43:57 -04:00
Chelsea Holland Komlo ee96bbbe1d properly formatting unix timestamps 2018-03-27 15:43:57 -04:00
Chelsea Holland Komlo 6e6d6b7e33 check file contents when determining if agent should reload TLS configuration 2018-03-27 15:42:20 -04:00
Alex Dadgar 432784dae3 Fix alloc watcher snapshot streaming 2018-03-27 11:14:53 -07:00
Chelsea Komlo 57e2cd04bd
Merge pull request #4025 from hashicorp/reload-http-tls
Allow TLS configurations for HTTP and RPC connections to be reloaded …
2018-03-26 18:00:30 -04:00
Preetha Appan 33e170c15d
s/linear/constant/g 2018-03-26 14:45:09 -05:00
Chelsea Holland Komlo 96df419fff code review feedback 2018-03-26 10:55:22 -04:00
Alex Dadgar 98d3e8a1a7 sort driver output 2018-03-23 10:59:18 -07:00
Alex Dadgar da27fc3880 Driver Info output 2018-03-22 17:18:32 -07:00
Alex Dadgar 65b3548219 Fix spelling error 2018-03-22 15:18:09 -07:00
Alex Dadgar a0195d8d41 remove comment 2018-03-22 14:49:08 -07:00
Alex Dadgar 7192498ea9
Merge pull request #4027 from hashicorp/f-common
New Command Line Groupings
2018-03-22 13:57:37 -07:00
Alex Dadgar 34211f00a7 Allow separate enterprise config overlay 2018-03-22 13:53:08 -07:00
Alex Dadgar 17fc9a6283 Fix old references 2018-03-22 13:39:18 -07:00
Alex Dadgar ee463f1a28 Grouped commands help description 2018-03-21 20:27:32 -07:00
Alex Dadgar 742382146e color 2018-03-21 20:27:32 -07:00
Alex Dadgar aa85b61fc8 Deprecated commands 2018-03-21 20:27:32 -07:00
Alex Dadgar de08c1891a aliases 2018-03-21 20:27:32 -07:00
Alex Dadgar d2af5d22e4 rename files 2018-03-21 20:27:32 -07:00
Alex Dadgar 6faa136e03 common commands 2018-03-21 20:27:32 -07:00
Alex Dadgar b5bee9e06b use subcommands 2018-03-21 20:27:32 -07:00
Michael Schurter 0e0b04afec test: fix by using mock.BatchJob 2018-03-21 16:51:45 -07:00
Michael Schurter 39cef16c73 test: don't call t.Fatal from within a goroutine 2018-03-21 16:51:45 -07:00
Michael Schurter cb61a4bdc7 Fix linting errors 2018-03-21 16:51:45 -07:00
Michael Schurter be7c759867 Improve drain log messages
Also delay "node complete" after the node has been marked complete to
capture a few more alloc events. There are other ways to implement this
that could trade off correctness for responsiveness as technically a
node is considered drained when all of its allocs have been marked to
stop and not when they've actually stopped (which may not happen for a
long time).
2018-03-21 16:51:45 -07:00
Michael Schurter 5eebd53223 Monitor node drains until completion in CLI
allow -detach like other commands
2018-03-21 16:51:45 -07:00
Michael Schurter 3907766b6d Fix node eligibility test 2018-03-21 16:51:45 -07:00
Alex Dadgar 7b2bad8c5e Toggle Drain allows resetting eligibility
This PR allows marking a node as eligible for scheduling while toggling
drain. By default the `nomad node drain -disable` commmand will mark it
as eligible but the drainer will maintain in-eligibility.
2018-03-21 16:51:44 -07:00
Alex Dadgar 02019f216a Correct defaulting 2018-03-21 16:51:44 -07:00
Alex Dadgar 78c7c36e65 code review 2018-03-21 16:51:44 -07:00
Alex Dadgar ac4975cef4 Small refactor and cleanups 2018-03-21 16:51:44 -07:00
Alex Dadgar d47c68f764 Add eligibility to node view 2018-03-21 16:51:44 -07:00
Alex Dadgar c0f808205e node eligibility command 2018-03-21 16:51:44 -07:00
Alex Dadgar 8289cc3c6f HTTP and API 2018-03-21 16:51:44 -07:00
Alex Dadgar f44fbac66a flag comment 2018-03-21 16:51:44 -07:00
Alex Dadgar b3d2346419 Upgrade path 2018-03-21 16:51:43 -07:00
Alex Dadgar 010228577e Drain cli, api, http 2018-03-21 16:51:43 -07:00
Chelsea Holland Komlo 66e44cdb73 Allow TLS configurations for HTTP and RPC connections to be reloaded separately 2018-03-21 17:51:08 -04:00
Preetha 01898b2c25
Merge pull request #4007 from hashicorp/f-show-rescheduling-cli-job-status
Show a section on upcoming delayed evaluations when applicable
2018-03-21 14:37:38 -05:00
Preetha Appan ba217e5224
More code review feedback 2018-03-21 14:28:14 -05:00
Preetha Appan 496ff26d32
Fix test 2018-03-21 09:24:28 -05:00
Preetha Appan 26b6e70d13
Addressed code review feedback 2018-03-21 09:15:29 -05:00
Michael Schurter 70c370c6fe
Merge pull request #4003 from jrasell/f_gh_3988
Allow Nomads Consul health check names to be configurable.
2018-03-20 16:44:08 -07:00
Preetha Appan 30fff15de7
Show reschedule policy in addition to when the eval will get rescheduled 2018-03-20 13:34:29 -05:00
James Rasell 121c3bc997 Update Consul check params from using health-check to check. 2018-03-20 16:03:58 +01:00
Preetha Appan ec13089be9
Unit test and linting fix 2018-03-19 22:11:07 -05:00
Preetha Appan 31a3c81c3b
Show a section on upcoming delayed evaluations when applicable 2018-03-19 21:42:37 -05:00
Preetha 5cbcd6f8a2
Merge pull request #4006 from hashicorp/b-alloc-status-cli-hide-unlimited-attempts
Hide the number of reschedule attempts for unlimited rescheduling
2018-03-19 18:25:20 -05:00
Preetha Appan 96b49cb491
Hide the number of reschedule attempts for unlimited rescheduling 2018-03-19 18:06:15 -05:00
Michael Schurter 86ccdb9115 Fix generating static assets
Broke due to a change in go-bindata-assetfs
2018-03-19 15:52:38 -07:00
James Rasell 15afef9b77 Allow Nomads Consul health checks to be configurable.
This change allows the client HTTP and the server HTTP, Serf and
RPC health check names within Consul to be configurable with the
defaults as previous. The configuration can be done via either a
config file or using CLI flags.

Closes #3988
2018-03-19 19:37:56 +01:00
Preetha 6df57c177c
Merge pull request #4002 from hashicorp/b-reschedule-systemjob-panic
Fix incorrect initialization of reschedule policy for system jobs.
2018-03-19 13:06:55 -05:00
Preetha Appan 161bc66355
Fix incorrect initialization of reschedule policy for system jobs. 2018-03-19 12:16:13 -05:00
Alex Dadgar 9e05c9a50e
Merge pull request #3997 from hashicorp/b-serf-addr
RPC Advertise used exclusively for Clients
2018-03-19 09:30:20 -07:00
Alex Dadgar 9ef23ff277 enable server in test 2018-03-16 16:52:37 -07:00
Alex Dadgar b8607ad6d6 Heartbeat uses client rpc advertise and server defaults server rpc advertise addr 2018-03-16 16:47:08 -07:00
Alex Dadgar 52b7fb5361 Separate client and server rpc advertise addresses 2018-03-16 16:47:08 -07:00
Michael Schurter 86f562be3a Remove unnecessary conversions 2018-03-16 16:32:59 -07:00
Michael Schurter c3e8f6319c gofmt -s (simplify) files 2018-03-16 16:31:16 -07:00
Michael Schurter 1044bc0feb
Merge pull request #3984 from hashicorp/f-loosen-consul-skipverify
Replace Consul TLSSkipVerify handling
2018-03-16 11:21:28 -07:00
Michael Schurter 0971114f0c Replace Consul TLSSkipVerify handling
Instead of checking Consul's version on startup to see if it supports
TLSSkipVerify, assume that it does and only log in the job service
handler if we discover Consul does not support TLSSkipVerify.

The old code would break TLSSkipVerify support if Nomad started before
Consul (such as on system boot) as TLSSkipVerify would default to false
if Consul wasn't running. Since TLSSkipVerify has been supported since
Consul 0.7.2, it's safe to relax our handling.
2018-03-14 17:43:06 -07:00
Alex Dadgar 3537c73289
Merge pull request #3978 from hashicorp/b-core-sched
Always add core scheduler
2018-03-14 16:13:15 -07:00
Preetha Appan fa8863b51a
fix method comment 2018-03-14 16:10:32 -05:00
Preetha Appan ffa0b7b1ed
Show reschedule eligibility time in alloc-status when followup evalid is available 2018-03-14 16:10:32 -05:00
Preetha Appan e75630f8e8
Fix formatting 2018-03-14 16:10:32 -05:00
Preetha Appan 9a5e6edf1f
Rename DelayCeiling to MaxDelay 2018-03-14 16:10:32 -05:00
Preetha Appan 5f50c3d618
Add new reschedule options to API layer and unit tests 2018-03-14 16:10:32 -05:00
Alex Dadgar 92cb552ff6 Always add core scheduler and detect invalid schedulers 2018-03-14 10:53:27 -07:00
Alex Dadgar 63e14b7d63 nodeevents -> events 2018-03-13 18:08:22 -07:00
Alex Dadgar d3c3deffad fixes 2018-03-13 18:08:22 -07:00
Chelsea Holland Komlo b41501e442 code review feedback 2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo 1488b076d1 code review feedback 2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo 00d9923454 Ensure node updates don't strip node events
Add node events to CLI
2018-03-13 18:05:40 -07:00
Michael Schurter 7dd7fbcda2 non-Existent -> nonexistent
Reverting from #3963

https://www.merriam-webster.com/dictionary/existent
2018-03-12 11:59:33 -07:00
Michael Schurter ec381ee705 Revert spelling corrections in generated code 2018-03-12 11:19:29 -07:00
Josh Soref 1359fd2c3d spelling: unexpected 2018-03-11 19:08:07 +00:00
Josh Soref eda5cbb2a6 spelling: triggered 2018-03-11 19:06:15 +00:00
Josh Soref 42d7f19861 spelling: supports 2018-03-11 19:00:11 +00:00
Josh Soref c808dc3095 spelling: submitted 2018-03-11 18:59:27 +00:00
Josh Soref 6e1244b6c1 spelling: significantly 2018-03-11 18:56:45 +00:00
Josh Soref 8978caea28 spelling: shutdown 2018-03-11 18:55:49 +00:00
Josh Soref 05305afcd9 spelling: services 2018-03-11 18:53:58 +00:00
Josh Soref f28efbbc79 spelling: sanitize 2018-03-11 18:52:59 +00:00
Josh Soref ad55e85e73 spelling: registrations 2018-03-11 18:40:53 +00:00
Josh Soref 7b74c9b36f spelling: queryable 2018-03-11 18:40:07 +00:00
Josh Soref 6fa892a463 spelling: propagated 2018-03-11 18:39:26 +00:00
Josh Soref d208d26b6e spelling: preemptively 2018-03-11 17:58:48 +00:00
Josh Soref 8abf038f4d spelling: output 2018-03-11 18:35:30 +00:00
Josh Soref 3c1ce6d16d spelling: otherwise 2018-03-11 18:34:27 +00:00
Josh Soref 3e2f500cf9 spelling: largely 2018-03-11 18:21:52 +00:00
Josh Soref 85fabc63c8 spelling: expected 2018-03-11 17:57:01 +00:00
Josh Soref 7f6e4012a0 spelling: existent 2018-03-11 18:30:37 +00:00
Josh Soref 35238fe779 spelling: enforcement 2018-03-11 17:59:12 +00:00
Josh Soref 680bbd6d4f spelling: encountered 2018-03-11 17:58:59 +00:00
Josh Soref 1f927cd343 spelling: each other 2018-03-11 17:56:50 +00:00
Josh Soref 89af36e6dc spelling: deferring 2018-03-11 17:53:08 +00:00
Josh Soref 7a6dfa4b1a spelling: current 2018-03-11 17:52:32 +00:00
Josh Soref 6f48e31c00 spelling: convenience 2018-03-11 17:50:48 +00:00
Josh Soref c15c0f519f spelling: concatenated 2018-03-11 17:47:54 +00:00
Josh Soref 4ba6307b84 spelling: command 2018-03-11 17:47:37 +00:00
Josh Soref 76cf178933 spelling: cleanup 2018-03-11 17:47:09 +00:00
Josh Soref 3ec4ebc7b1 spelling: canonical 2018-03-11 17:46:01 +00:00
Josh Soref 1d58ae8899 spelling: bootstrap 2018-03-11 17:43:19 +00:00
Josh Soref 455eb1aeb3 spelling: authoritative 2018-03-11 17:42:05 +00:00
Josh Soref 5f87691df1 spelling: asynchronously 2018-03-11 17:41:50 +00:00
Josh Soref 58b794875f spelling: artifact 2018-03-11 17:41:02 +00:00
Josh Soref c74245279b spelling: arbitrary 2018-03-11 17:40:03 +00:00
Josh Soref d468bdf5a2 spelling: annotated 2018-03-11 17:39:31 +00:00
Preetha Appan ab946858bb
Ignore left members when querying regions for leader in server-members command. 2018-03-05 16:56:36 -06:00
Alex Dadgar 483e011720
Merge pull request #3892 from hashicorp/f-tunnel
Client RPC Endpoints, Server Routing and Streaming RPCs
2018-02-20 16:35:42 -08:00
Alex Dadgar 8b86e64fd8 Show HTTP request method 2018-02-16 15:55:26 -08:00
Alex Dadgar aa98f8ba7b Enhance API pkg to utilize Server's Client Tunnel
This PR enhances the API package by having client only RPCs route
through the server when they are low cost and for filesystem access to
first attempt a direct connection to the node and then falling back to
a server routed request.
2018-02-15 13:59:03 -08:00
Alex Dadgar 38b695b69c feedback and rebasing 2018-02-15 13:59:03 -08:00
Alex Dadgar c5e9ebb656 Use helper for forwarding 2018-02-15 13:59:03 -08:00
Alex Dadgar 9117ef4650 HTTP agent 2018-02-15 13:59:03 -08:00
Alex Dadgar d7029965ca Server side impl + touch ups 2018-02-15 13:59:02 -08:00
Alex Dadgar 14845bb918 Refactor determining the handler for a node id call 2018-02-15 13:59:02 -08:00
Alex Dadgar e685211892 Code review feedback 2018-02-15 13:59:02 -08:00
Alex Dadgar f5f43218f5 HTTP and tests 2018-02-15 13:59:02 -08:00
Alex Dadgar 9a5569678c Client Stat/List impl 2018-02-15 13:59:02 -08:00
Alex Dadgar 8854b35b34 Agent logs 2018-02-15 13:59:02 -08:00
Alex Dadgar 69def2ff22 Server tests of logs 2018-02-15 13:59:02 -08:00
Alex Dadgar ddd67f5f11 Server streaming 2018-02-15 13:59:01 -08:00
Alex Dadgar ca9379be09 Logs over RPC w/ lots to touch up 2018-02-15 13:59:01 -08:00
Alex Dadgar 993727c28f Use in-mem rpc 2018-02-15 13:59:01 -08:00
Alex Dadgar d5a834b801 fix lint 2018-02-15 13:59:01 -08:00
Alex Dadgar 9bc75f0ad4 Fix manager tests and make testagent recover from port conflicts 2018-02-15 13:59:01 -08:00
Alex Dadgar 8dcda29c21 Use nomad UUID 2018-02-15 13:59:00 -08:00
Alex Dadgar 71029b6329 Test http 2018-02-15 13:59:00 -08:00
Alex Dadgar 6dd1c9f49d Refactor 2018-02-15 13:59:00 -08:00
Alex Dadgar ad7bc0c6bd Server can forward ClientStats.Stats 2018-02-15 13:59:00 -08:00
Alex Dadgar 1472b943d6 Stats Endpoint 2018-02-15 13:59:00 -08:00
Kyle Havlovitz 54b691f538
Merge pull request #3852 from hashicorp/autopilot-cleanup
Clean up some leftover autopilot differences from Consul
2018-02-14 10:42:32 -08:00
Preetha df6400222b
Merge pull request #3868 from hashicorp/f-server-side-restarts
server side rescheduling
2018-02-13 20:09:51 -06:00
Kyle Havlovitz 709b693d39 Clean up some leftover autopilot differences from Consul 2018-02-08 10:27:26 -08:00