Commit graph

2016 commits

Author SHA1 Message Date
Michael Schurter 3495df7da9
Merge pull request #4085 from hashicorp/docs-node-drain
Initial Node drain docs
2018-03-30 16:34:49 -07:00
Alex Dadgar 6871a068cb Fix prefix output of agent metadata 2018-03-30 16:27:18 -07:00
Michael Schurter ac8969ea38 docs: add migrate to example.nomad 2018-03-30 16:18:46 -07:00
Alex Dadgar af81349dbe Generated files 2018-03-30 16:14:40 -07:00
Michael Schurter 8e7870b590 cli: remove info color on monitored output 2018-03-30 15:51:48 -07:00
Michael Schurter 6b94029740
Merge pull request #4084 from hashicorp/f-drain-timestamps
drain: add timestamps to cli output
2018-03-30 15:07:43 -07:00
Michael Schurter b5aa71a517 cli: fix drain monitoring formatting 2018-03-30 14:20:30 -07:00
Michael Schurter 7199a2b960 cli: differentiate normal output vs info 2018-03-30 11:42:11 -07:00
Michael Schurter 0260bda046 cli: add color to drain output 2018-03-30 11:15:12 -07:00
Alex Dadgar 886f7feaef Generic handling of server region errors 2018-03-30 10:57:32 -07:00
Michael Schurter 88d0cdbb72 drain: add timestamps to cli output
Also toggle eligibility output based on value
2018-03-30 10:26:43 -07:00
Chelsea Komlo 2e3cb28cc5
Merge pull request #4013 from hashicorp/node-events-documentation
Add node events to documentation
2018-03-29 17:16:26 -04:00
Michael Schurter ee3eddbac3 drain: block cli until all allocs stop
Previously, the drain CLI would block until the node was marked as completing
drain operations. While technically correct, it could lead operators (or
more likely: scripts) to shut down a drained node before all of its
allocations had *actually* terminated.

This change makes the CLI block until all allocations have terminated
(unless ignoring system jobs).
2018-03-29 10:56:09 -07:00
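The blocking behaviour described in the commit above can be pictured as a polling loop that only returns once no relevant allocation is still running. The following is a minimal Python sketch of that idea, not Nomad's Go implementation; `fetch_allocs`, the status names in `TERMINAL`, and the `job_type` field are assumptions made for illustration.

```python
import time

# Hypothetical terminal states; the real client statuses may differ.
TERMINAL = {"complete", "failed", "lost"}


def wait_for_drain(fetch_allocs, ignore_system=False, interval=0.0, max_polls=100):
    """Poll until every relevant allocation is terminal.

    fetch_allocs is a hypothetical callable returning a list of dicts with
    "status" and "job_type" keys. Returns True once nothing is pending and
    False if max_polls is exhausted first.
    """
    for _ in range(max_polls):
        pending = [
            alloc
            for alloc in fetch_allocs()
            if alloc["status"] not in TERMINAL
            and not (ignore_system and alloc["job_type"] == "system")
        ]
        if not pending:
            return True
        time.sleep(interval)
    return False
```

A script that shuts a node down would call `wait_for_drain` first and proceed only when it returns True.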
Chelsea Holland Komlo 48765d334c add further driver info; driver name with subsystem 2018-03-29 13:34:30 -04:00
Chelsea Holland Komlo baeadae990 Revert "omit driver key/val from details"
This reverts commit 74fc5543babc85e3fa16c048bce92e71826e53cc.
2018-03-29 12:49:46 -04:00
Chelsea Holland Komlo a17f0d8875 omit driver key/val from details 2018-03-29 12:30:12 -04:00
Chelsea Komlo 607e631714
Merge pull request #4046 from hashicorp/tls-same-file-reload
Check file contents when determining if agent should reload TLS confi…
2018-03-29 10:51:32 -04:00
Chelsea Holland Komlo be15c07110 make check fix 2018-03-28 19:11:51 -04:00
Chelsea Holland Komlo 58ada9bc42 return error when setting checksum; don't reload 2018-03-28 18:15:50 -04:00
Chelsea Holland Komlo 38c7a38549 output warning for error in creating TLS checksum 2018-03-28 17:44:23 -04:00
Chelsea Komlo 7812ac5abf
Merge pull request #4057 from hashicorp/specify-docker-msg
Specify docker name in driver health messages
2018-03-28 13:32:36 -04:00
Chelsea Holland Komlo 770ac13bc4 check for nil, remove unnecessary set checksum call 2018-03-28 13:29:53 -04:00
Preetha 177d2d6010
Merge pull request #4052 from hashicorp/f-specify-total-memory
Allow to specify total memory on agent configuration
2018-03-28 12:28:41 -05:00
Chelsea Holland Komlo d3d686b58a fix up test for file content changes 2018-03-28 13:18:13 -04:00
Preetha Appan 5c2fc6d46b
Fix compilation fail in test after rebase 2018-03-28 11:54:34 -05:00
Chelsea Holland Komlo 263544bf52 reformat message 2018-03-28 12:28:16 -04:00
Chelsea Holland Komlo efc03e252c specify driver health messages 2018-03-28 11:35:21 -04:00
Chelsea Holland Komlo 2d5af7ff4d set TLS checksum when parsing config
Refactor checksum comparison, always set checksum if it is empty
2018-03-28 09:56:11 -04:00
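The checksum comparison mentioned in the commit above can be illustrated with a content hash: a reload is only needed when the bytes of the TLS files change, even if the configured paths stay the same. This is a hedged Python sketch rather than the project's Go code; the function names and the choice of SHA-256 are assumptions.

```python
import hashlib


def tls_checksum(paths):
    """Return a hex digest covering the contents of the given files, in order."""
    digest = hashlib.sha256()
    for path in paths:
        with open(path, "rb") as handle:
            # Hash each file separately so file boundaries affect the result.
            digest.update(hashlib.sha256(handle.read()).digest())
    return digest.hexdigest()


def should_reload(previous_checksum, paths):
    """Report whether the file contents differ from the recorded checksum."""
    return previous_checksum != tls_checksum(paths)
```

Comparing contents rather than paths covers the case where a certificate is rotated in place under the same file name.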
Michael Schurter 79a2781585
Merge pull request #4053 from hashicorp/b-drain-sys-jobs-2
drain: fix draining of system jobs
2018-03-27 16:26:45 -07:00
Alex Dadgar de4b3772f1 Create evals for system jobs when drain is unset
This PR creates evals for system jobs when:

* Drain is unset and mark eligible is true
* Eligibility is restored to the node
2018-03-27 15:53:24 -07:00
Michael Schurter 559dae984f drain: give cli more time before exiting
May want to bump this a bit more as the drain often completes amidst a
flurry of alloc updates.
2018-03-27 14:24:02 -07:00
Mildred Ki'Lya 1017cbe8ab
Allow to specify total memory on agent configuration
Allow to set the total memory of an agent in its configuration file. This
can be used in case the automatic detection doesn't work or in specific
environments when memory overcommit (using swap for example) can be
desirable.
2018-03-27 15:46:18 -05:00
Chelsea Holland Komlo 003bc209b9 use time.Time for node events for compatibility 2018-03-27 15:43:57 -04:00
Chelsea Holland Komlo ee96bbbe1d properly formatting unix timestamps 2018-03-27 15:43:57 -04:00
Chelsea Holland Komlo 6e6d6b7e33 check file contents when determining if agent should reload TLS configuration 2018-03-27 15:42:20 -04:00
Alex Dadgar 432784dae3 Fix alloc watcher snapshot streaming 2018-03-27 11:14:53 -07:00
Chelsea Komlo 57e2cd04bd
Merge pull request #4025 from hashicorp/reload-http-tls
Allow TLS configurations for HTTP and RPC connections to be reloaded …
2018-03-26 18:00:30 -04:00
Preetha Appan 33e170c15d
s/linear/constant/g 2018-03-26 14:45:09 -05:00
Chelsea Holland Komlo 96df419fff code review feedback 2018-03-26 10:55:22 -04:00
Alex Dadgar 98d3e8a1a7 sort driver output 2018-03-23 10:59:18 -07:00
Alex Dadgar da27fc3880 Driver Info output 2018-03-22 17:18:32 -07:00
Alex Dadgar 65b3548219 Fix spelling error 2018-03-22 15:18:09 -07:00
Alex Dadgar a0195d8d41 remove comment 2018-03-22 14:49:08 -07:00
Alex Dadgar 7192498ea9
Merge pull request #4027 from hashicorp/f-common
New Command Line Groupings
2018-03-22 13:57:37 -07:00
Alex Dadgar 34211f00a7 Allow separate enterprise config overlay 2018-03-22 13:53:08 -07:00
Alex Dadgar 17fc9a6283 Fix old references 2018-03-22 13:39:18 -07:00
Alex Dadgar ee463f1a28 Grouped commands help description 2018-03-21 20:27:32 -07:00
Alex Dadgar 742382146e color 2018-03-21 20:27:32 -07:00
Alex Dadgar aa85b61fc8 Deprecated commands 2018-03-21 20:27:32 -07:00
Alex Dadgar de08c1891a aliases 2018-03-21 20:27:32 -07:00
Alex Dadgar d2af5d22e4 rename files 2018-03-21 20:27:32 -07:00
Alex Dadgar 6faa136e03 common commands 2018-03-21 20:27:32 -07:00
Alex Dadgar b5bee9e06b use subcommands 2018-03-21 20:27:32 -07:00
Michael Schurter 0e0b04afec test: fix by using mock.BatchJob 2018-03-21 16:51:45 -07:00
Michael Schurter 39cef16c73 test: don't call t.Fatal from within a goroutine 2018-03-21 16:51:45 -07:00
Michael Schurter cb61a4bdc7 Fix linting errors 2018-03-21 16:51:45 -07:00
Michael Schurter be7c759867 Improve drain log messages
Also delay "node complete" after the node has been marked complete to
capture a few more alloc events. There are other ways to implement this
that could trade off correctness for responsiveness as technically a
node is considered drained when all of its allocs have been marked to
stop and not when they've actually stopped (which may not happen for a
long time).
2018-03-21 16:51:45 -07:00
Michael Schurter 5eebd53223 Monitor node drains until completion in CLI
allow -detach like other commands
2018-03-21 16:51:45 -07:00
Michael Schurter 3907766b6d Fix node eligibility test 2018-03-21 16:51:45 -07:00
Alex Dadgar 7b2bad8c5e Toggle Drain allows resetting eligibility
This PR allows marking a node as eligible for scheduling while toggling
drain. By default the `nomad node drain -disable` command will mark it
as eligible but the drainer will maintain in-eligibility.
2018-03-21 16:51:44 -07:00
Alex Dadgar 02019f216a Correct defaulting 2018-03-21 16:51:44 -07:00
Alex Dadgar 78c7c36e65 code review 2018-03-21 16:51:44 -07:00
Alex Dadgar ac4975cef4 Small refactor and cleanups 2018-03-21 16:51:44 -07:00
Alex Dadgar d47c68f764 Add eligibility to node view 2018-03-21 16:51:44 -07:00
Alex Dadgar c0f808205e node eligibility command 2018-03-21 16:51:44 -07:00
Alex Dadgar 8289cc3c6f HTTP and API 2018-03-21 16:51:44 -07:00
Alex Dadgar f44fbac66a flag comment 2018-03-21 16:51:44 -07:00
Alex Dadgar b3d2346419 Upgrade path 2018-03-21 16:51:43 -07:00
Alex Dadgar 010228577e Drain cli, api, http 2018-03-21 16:51:43 -07:00
Chelsea Holland Komlo 66e44cdb73 Allow TLS configurations for HTTP and RPC connections to be reloaded separately 2018-03-21 17:51:08 -04:00
Preetha 01898b2c25
Merge pull request #4007 from hashicorp/f-show-rescheduling-cli-job-status
Show a section on upcoming delayed evaluations when applicable
2018-03-21 14:37:38 -05:00
Preetha Appan ba217e5224
More code review feedback 2018-03-21 14:28:14 -05:00
Preetha Appan 496ff26d32
Fix test 2018-03-21 09:24:28 -05:00
Preetha Appan 26b6e70d13
Addressed code review feedback 2018-03-21 09:15:29 -05:00
Michael Schurter 70c370c6fe
Merge pull request #4003 from jrasell/f_gh_3988
Allow Nomad's Consul health check names to be configurable.
2018-03-20 16:44:08 -07:00
Preetha Appan 30fff15de7
Show reschedule policy in addition to when the eval will get rescheduled 2018-03-20 13:34:29 -05:00
James Rasell 121c3bc997 Update Consul check params from using health-check to check. 2018-03-20 16:03:58 +01:00
Preetha Appan ec13089be9
Unit test and linting fix 2018-03-19 22:11:07 -05:00
Preetha Appan 31a3c81c3b
Show a section on upcoming delayed evaluations when applicable 2018-03-19 21:42:37 -05:00
Preetha 5cbcd6f8a2
Merge pull request #4006 from hashicorp/b-alloc-status-cli-hide-unlimited-attempts
Hide the number of reschedule attempts for unlimited rescheduling
2018-03-19 18:25:20 -05:00
Preetha Appan 96b49cb491
Hide the number of reschedule attempts for unlimited rescheduling 2018-03-19 18:06:15 -05:00
Michael Schurter 86ccdb9115 Fix generating static assets
Broke due to a change in go-bindata-assetfs
2018-03-19 15:52:38 -07:00
James Rasell 15afef9b77 Allow Nomad's Consul health checks to be configurable.
This change allows the client HTTP and the server HTTP, Serf and
RPC health check names within Consul to be configurable with the
defaults as previous. The configuration can be done via either a
config file or using CLI flags.

Closes #3988
2018-03-19 19:37:56 +01:00
Preetha 6df57c177c
Merge pull request #4002 from hashicorp/b-reschedule-systemjob-panic
Fix incorrect initialization of reschedule policy for system jobs.
2018-03-19 13:06:55 -05:00
Preetha Appan 161bc66355
Fix incorrect initialization of reschedule policy for system jobs. 2018-03-19 12:16:13 -05:00
Alex Dadgar 9e05c9a50e
Merge pull request #3997 from hashicorp/b-serf-addr
RPC Advertise used exclusively for Clients
2018-03-19 09:30:20 -07:00
Alex Dadgar 9ef23ff277 enable server in test 2018-03-16 16:52:37 -07:00
Alex Dadgar b8607ad6d6 Heartbeat uses client rpc advertise and server defaults server rpc advertise addr 2018-03-16 16:47:08 -07:00
Alex Dadgar 52b7fb5361 Separate client and server rpc advertise addresses 2018-03-16 16:47:08 -07:00
Michael Schurter 86f562be3a Remove unnecessary conversions 2018-03-16 16:32:59 -07:00
Michael Schurter c3e8f6319c gofmt -s (simplify) files 2018-03-16 16:31:16 -07:00
Michael Schurter 1044bc0feb
Merge pull request #3984 from hashicorp/f-loosen-consul-skipverify
Replace Consul TLSSkipVerify handling
2018-03-16 11:21:28 -07:00
Michael Schurter 0971114f0c Replace Consul TLSSkipVerify handling
Instead of checking Consul's version on startup to see if it supports
TLSSkipVerify, assume that it does and only log in the job service
handler if we discover Consul does not support TLSSkipVerify.

The old code would break TLSSkipVerify support if Nomad started before
Consul (such as on system boot) as TLSSkipVerify would default to false
if Consul wasn't running. Since TLSSkipVerify has been supported since
Consul 0.7.2, it's safe to relax our handling.
2018-03-14 17:43:06 -07:00
Alex Dadgar 3537c73289
Merge pull request #3978 from hashicorp/b-core-sched
Always add core scheduler
2018-03-14 16:13:15 -07:00
Preetha Appan fa8863b51a
fix method comment 2018-03-14 16:10:32 -05:00
Preetha Appan ffa0b7b1ed
Show reschedule eligibility time in alloc-status when followup evalid is available 2018-03-14 16:10:32 -05:00
Preetha Appan e75630f8e8
Fix formatting 2018-03-14 16:10:32 -05:00
Preetha Appan 9a5e6edf1f
Rename DelayCeiling to MaxDelay 2018-03-14 16:10:32 -05:00
Preetha Appan 5f50c3d618
Add new reschedule options to API layer and unit tests 2018-03-14 16:10:32 -05:00
Alex Dadgar 92cb552ff6 Always add core scheduler and detect invalid schedulers 2018-03-14 10:53:27 -07:00
Alex Dadgar 63e14b7d63 nodeevents -> events 2018-03-13 18:08:22 -07:00
Alex Dadgar d3c3deffad fixes 2018-03-13 18:08:22 -07:00
Chelsea Holland Komlo b41501e442 code review feedback 2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo 1488b076d1 code review feedback 2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo 00d9923454 Ensure node updates don't strip node events
Add node events to CLI
2018-03-13 18:05:40 -07:00
Michael Schurter 7dd7fbcda2 non-Existent -> nonexistent
Reverting from #3963

https://www.merriam-webster.com/dictionary/existent
2018-03-12 11:59:33 -07:00
Michael Schurter ec381ee705 Revert spelling corrections in generated code 2018-03-12 11:19:29 -07:00
Josh Soref 1359fd2c3d spelling: unexpected 2018-03-11 19:08:07 +00:00
Josh Soref eda5cbb2a6 spelling: triggered 2018-03-11 19:06:15 +00:00
Josh Soref 42d7f19861 spelling: supports 2018-03-11 19:00:11 +00:00
Josh Soref c808dc3095 spelling: submitted 2018-03-11 18:59:27 +00:00
Josh Soref 6e1244b6c1 spelling: significantly 2018-03-11 18:56:45 +00:00
Josh Soref 8978caea28 spelling: shutdown 2018-03-11 18:55:49 +00:00
Josh Soref 05305afcd9 spelling: services 2018-03-11 18:53:58 +00:00
Josh Soref f28efbbc79 spelling: sanitize 2018-03-11 18:52:59 +00:00
Josh Soref ad55e85e73 spelling: registrations 2018-03-11 18:40:53 +00:00
Josh Soref 7b74c9b36f spelling: queryable 2018-03-11 18:40:07 +00:00
Josh Soref 6fa892a463 spelling: propagated 2018-03-11 18:39:26 +00:00
Josh Soref d208d26b6e spelling: preemptively 2018-03-11 17:58:48 +00:00
Josh Soref 8abf038f4d spelling: output 2018-03-11 18:35:30 +00:00
Josh Soref 3c1ce6d16d spelling: otherwise 2018-03-11 18:34:27 +00:00
Josh Soref 3e2f500cf9 spelling: largely 2018-03-11 18:21:52 +00:00
Josh Soref 85fabc63c8 spelling: expected 2018-03-11 17:57:01 +00:00
Josh Soref 7f6e4012a0 spelling: existent 2018-03-11 18:30:37 +00:00
Josh Soref 35238fe779 spelling: enforcement 2018-03-11 17:59:12 +00:00
Josh Soref 680bbd6d4f spelling: encountered 2018-03-11 17:58:59 +00:00
Josh Soref 1f927cd343 spelling: each other 2018-03-11 17:56:50 +00:00
Josh Soref 89af36e6dc spelling: deferring 2018-03-11 17:53:08 +00:00
Josh Soref 7a6dfa4b1a spelling: current 2018-03-11 17:52:32 +00:00
Josh Soref 6f48e31c00 spelling: convenience 2018-03-11 17:50:48 +00:00
Josh Soref c15c0f519f spelling: concatenated 2018-03-11 17:47:54 +00:00
Josh Soref 4ba6307b84 spelling: command 2018-03-11 17:47:37 +00:00
Josh Soref 76cf178933 spelling: cleanup 2018-03-11 17:47:09 +00:00
Josh Soref 3ec4ebc7b1 spelling: canonical 2018-03-11 17:46:01 +00:00
Josh Soref 1d58ae8899 spelling: bootstrap 2018-03-11 17:43:19 +00:00
Josh Soref 455eb1aeb3 spelling: authoritative 2018-03-11 17:42:05 +00:00
Josh Soref 5f87691df1 spelling: asynchronously 2018-03-11 17:41:50 +00:00
Josh Soref 58b794875f spelling: artifact 2018-03-11 17:41:02 +00:00
Josh Soref c74245279b spelling: arbitrary 2018-03-11 17:40:03 +00:00
Josh Soref d468bdf5a2 spelling: annotated 2018-03-11 17:39:31 +00:00
Preetha Appan ab946858bb
Ignore left members when querying regions for leader in server-members command. 2018-03-05 16:56:36 -06:00
Alex Dadgar 483e011720
Merge pull request #3892 from hashicorp/f-tunnel
Client RPC Endpoints, Server Routing and Streaming RPCs
2018-02-20 16:35:42 -08:00
Alex Dadgar 8b86e64fd8 Show HTTP request method 2018-02-16 15:55:26 -08:00
Alex Dadgar aa98f8ba7b Enhance API pkg to utilize Server's Client Tunnel
This PR enhances the API package by having client only RPCs route
through the server when they are low cost and for filesystem access to
first attempt a direct connection to the node and then falling back to
a server routed request.
2018-02-15 13:59:03 -08:00
Alex Dadgar 38b695b69c feedback and rebasing 2018-02-15 13:59:03 -08:00
Alex Dadgar c5e9ebb656 Use helper for forwarding 2018-02-15 13:59:03 -08:00
Alex Dadgar 9117ef4650 HTTP agent 2018-02-15 13:59:03 -08:00
Alex Dadgar d7029965ca Server side impl + touch ups 2018-02-15 13:59:02 -08:00
Alex Dadgar 14845bb918 Refactor determining the handler for a node id call 2018-02-15 13:59:02 -08:00
Alex Dadgar e685211892 Code review feedback 2018-02-15 13:59:02 -08:00
Alex Dadgar f5f43218f5 HTTP and tests 2018-02-15 13:59:02 -08:00
Alex Dadgar 9a5569678c Client Stat/List impl 2018-02-15 13:59:02 -08:00
Alex Dadgar 8854b35b34 Agent logs 2018-02-15 13:59:02 -08:00
Alex Dadgar 69def2ff22 Server tests of logs 2018-02-15 13:59:02 -08:00
Alex Dadgar ddd67f5f11 Server streaming 2018-02-15 13:59:01 -08:00
Alex Dadgar ca9379be09 Logs over RPC w/ lots to touch up 2018-02-15 13:59:01 -08:00
Alex Dadgar 993727c28f Use in-mem rpc 2018-02-15 13:59:01 -08:00
Alex Dadgar d5a834b801 fix lint 2018-02-15 13:59:01 -08:00
Alex Dadgar 9bc75f0ad4 Fix manager tests and make testagent recover from port conflicts 2018-02-15 13:59:01 -08:00
Alex Dadgar 8dcda29c21 Use nomad UUID 2018-02-15 13:59:00 -08:00
Alex Dadgar 71029b6329 Test http 2018-02-15 13:59:00 -08:00
Alex Dadgar 6dd1c9f49d Refactor 2018-02-15 13:59:00 -08:00
Alex Dadgar ad7bc0c6bd Server can forward ClientStats.Stats 2018-02-15 13:59:00 -08:00
Alex Dadgar 1472b943d6 Stats Endpoint 2018-02-15 13:59:00 -08:00
Kyle Havlovitz 54b691f538
Merge pull request #3852 from hashicorp/autopilot-cleanup
Clean up some leftover autopilot differences from Consul
2018-02-14 10:42:32 -08:00
Preetha df6400222b
Merge pull request #3868 from hashicorp/f-server-side-restarts
server side rescheduling
2018-02-13 20:09:51 -06:00
Kyle Havlovitz 709b693d39 Clean up some leftover autopilot differences from Consul 2018-02-08 10:27:26 -08:00
Mahmood Ali bebafb5234 Add tags option to datadog telemetry
Expose a global tags option in telemetry config for dogstatsd, for
purposes of distinguishing between multiple nomad cluster metrics.
2018-02-06 12:08:37 -05:00
Preetha Appan 63c1d35928
Fixes nomad validate that broke when changing default values in example.nomad 2018-02-05 17:22:13 -06:00
Preetha Appan f91126a5cb
Fix nomad init to use default values for restart stanza 2018-01-31 16:43:29 -06:00
Preetha Appan 9d15e0c05b
Code review feedback 2018-01-31 09:58:05 -06:00
Preetha Appan 646cef4cef
Flaky contains check replaced with regex 2018-01-31 09:58:05 -06:00
Preetha Appan 95dba191ac
Always show reschedule tracking and next alloc id in alloc status 2018-01-31 09:58:05 -06:00
Preetha Appan 36b8e91625
Show info about remaining reschedule attempts relative to alloc modify time 2018-01-31 09:58:05 -06:00
Preetha Appan 537f410e75
Show some information about rescheduling in alloc-status cli 2018-01-31 09:58:05 -06:00
Preetha Appan 4a78c5bc84
Fix unit test 2018-01-31 09:56:53 -06:00
Preetha Appan 1f834d1a31
Add reschedule policy to API, and HCL parsing support. 2018-01-31 09:56:53 -06:00
Kyle Havlovitz cb2321353c
Merge pull request #3812 from hashicorp/autopilot-config-change
Refactor redundancy_zone/upgrade_version out of client meta
2018-01-30 16:14:26 -08:00
Kyle Havlovitz 0eb0acacdc Fix remaining issues with autopilot change 2018-01-30 15:21:28 -08:00
James Rasell 06bf8e932a
Update CLI to use ISO_8601 time format.
This change updates the formatTime CLI helper function to return
an ISO_8601 time format which will make CLI time usage more
consistent and easier. Previously the time format was in US style
format which was somewhat confusing to non-US users.

Closes #3806
2018-01-30 08:57:07 +00:00
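To make the difference described in the commit above concrete, the sketch below formats the same instant in ISO 8601 and in a US-style layout. It is a Python illustration only; the helper names mirror the commit's `formatTime` loosely, and the exact US-style layout shown is an assumption, not the layout the CLI previously used.

```python
from datetime import datetime, timezone


def format_time(moment):
    """Format an aware datetime as ISO 8601 with second precision and UTC offset."""
    return moment.isoformat(timespec="seconds")


def format_time_us(moment):
    """Format the same instant month-first, shown only for contrast (assumed layout)."""
    return moment.strftime("%m/%d/%y %H:%M:%S")


commit_time = datetime(2018, 1, 30, 8, 57, 7, tzinfo=timezone.utc)
print(format_time(commit_time))     # 2018-01-30T08:57:07+00:00
print(format_time_us(commit_time))  # 01/30/18 08:57:07
```

The ISO 8601 form sorts lexicographically in chronological order and cannot be misread as day/month by readers outside the US.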
Kyle Havlovitz 2ccf565bf6 Refactor redundancy_zone/upgrade_version out of client meta 2018-01-29 20:03:38 -08:00
Alex Dadgar 31e212f467
Merge pull request #3684 from jrasell/f_gh_3678
Update `node-status` verbose command to include node address.
2018-01-29 15:32:14 -08:00
Chelsea Komlo d09cc2a69f
Merge pull request #3492 from hashicorp/f-client-tls-reload
Client/Server TLS dynamic reload
2018-01-23 05:51:32 -05:00
Michael Schurter 694b547a6b
Merge pull request #3682 from hashicorp/b-3681-always-set-driver-ip
Always advertise driver IP when in driver mode
2018-01-22 16:41:34 -08:00
Chelsea Holland Komlo 7d3c240871 swap raft layer tls wrapper 2018-01-19 17:00:15 -05:00
Kyle Havlovitz 901c8791f4
Merge pull request #3670 from hashicorp/autopilot
Add Autopilot feature from Consul
2018-01-19 12:52:56 -08:00
Michael Schurter 8a0cf66822 Improve invalid port error message for services
Related to #3681

If a user specifies an invalid port *label* when using
address_mode=driver they'll get an error message about the label being
an invalid number which is very confusing.

I also added a bunch of testing around Service.AddressMode validation
since I was concerned by the linked issue that there were cases I was
missing. Unfortunately when address_mode=driver is used there's only so
much validation that can be done as structs/structs.go validation never
peeks into the driver config which would be needed to verify the port
labels/map.
2018-01-18 15:35:24 -08:00
Michael Schurter 447dc5bbd3 Fix test 2018-01-18 15:35:24 -08:00
Michael Schurter 583e17fad5 Always advertise driver IP when in driver mode
Fixes #3681

When in driver address mode Nomad should always advertise the driver's IP
in Consul even when no network exists. This matches the 0.6 behavior.

When in host address mode Nomad advertises the alloc's network's IP if
one exists. Otherwise it lets Consul determine the IP.

I also added some much needed logging around Docker's network discovery.
2018-01-18 15:35:24 -08:00
Kyle Havlovitz 8d41f4ad40 Formatting/test adjustments 2018-01-18 15:03:35 -08:00
Kyle Havlovitz 12ff22ea70 Merge branch 'master' into autopilot 2018-01-18 13:29:25 -08:00
Michael Schurter 9fcda9c5aa Make example service name make sense
Switch from global-redis-check for the example job's service name to
redis-cache. The former name is really confusing and someone finally
called us out on it:

https://groups.google.com/d/msg/nomad-tool/3RTh6CyYkWk/vEe_Sj7lAAAJ

Also specifically mention that the `service.name` parameter is what is
advertised in Consul.
2018-01-17 14:08:30 -08:00
Chelsea Holland Komlo 35466a331a fixing up raft reload tests
close second goroutine in raft-net
2018-01-17 10:29:15 -05:00
Kyle Havlovitz bc385bcc93 Fix comments/text referring to consul 2018-01-17 00:20:13 -08:00
Michael Schurter 57eb128dcf
Merge pull request #3718 from hashicorp/b-3713-fix-check-restart
Fix service.check_restart stanza propagation
2018-01-16 16:39:42 -08:00
Kyle Havlovitz 7b980c42d8 Add raft remove by id endpoint/command 2018-01-16 13:35:32 -08:00
Chelsea Holland Komlo 6c9f9c8ac3 adding additional test assertions; differentiate reloading agent and http server 2018-01-16 07:34:39 -05:00
Alex Dadgar 54124a8478 Test listener uses freeport instead of static ports 2018-01-12 15:10:26 -08:00
Michael Schurter 9f179e9fab Fix HTTP code for permission denied errors
Fixes #3697

The existing code and test case only covered the leader behavior. When
querying against non-leaders the error has an "rpc error: " prefix.

To provide consistency in HTTP error response I also strip the "rpc
error: " prefix for 403 responses as they offer no beneficial additional
information (and in theory disclose a tiny bit of data to unauthorized
users, but it would be a pretty weird bit of data to use in a malicious
way).
2018-01-09 15:25:53 -08:00
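The prefix handling described in the commit above can be sketched as follows. This is a simplified Python illustration, not the agent's actual handler; the `"Permission denied"` message text and the fallback status of 500 are assumptions.

```python
RPC_PREFIX = "rpc error: "
PERMISSION_DENIED = "Permission denied"  # assumed error text


def http_error(message):
    """Map an error message to a (status code, response body) pair.

    Errors forwarded from another server carry an "rpc error: " prefix, so
    the prefix is removed before checking for a permission failure. The
    prefix is only dropped from the body for 403 responses.
    """
    stripped = message
    if message.startswith(RPC_PREFIX):
        stripped = message[len(RPC_PREFIX):]
    if stripped == PERMISSION_DENIED:
        return 403, stripped
    return 500, message
```

With this shape, a leader and a non-leader produce the same 403 body for the same denied request.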
Michael Schurter 7c282f174b Fix service.check_restart stanza propagation
There was a bug in jobspec parsing, a bug in CheckRestart merging, and a
bug in CheckRestart canonicalization. All are now tested.
2018-01-09 15:15:36 -08:00
Chelsea Holland Komlo 214d128eb9 reload raft transport layer
fix up linting
2018-01-08 14:52:28 -05:00
Chelsea Holland Komlo 0708d34135 call reload on agent, client, and server separately 2018-01-08 09:56:31 -05:00
Chelsea Holland Komlo d9ec538d6a don't ignore error in http reloading
code review feedback
2018-01-08 09:21:06 -05:00
Chelsea Holland Komlo 6a2432659a code review fixups 2018-01-08 09:21:06 -05:00
Chelsea Holland Komlo 4e0dbd23cf prevent races when reloading, fully shut down raft 2018-01-08 09:21:06 -05:00
Chelsea Holland Komlo ae7fc4695e fixups from code review
Revert "close raft long-lived connections"

This reverts commit 3ffda28206fcb3d63ad117fd1d27ae6f832b6625.

reload raft connections on changing tls
2018-01-08 09:21:06 -05:00
Chelsea Holland Komlo c0ad9a4627 add ability to upgrade/downgrade nomad agents tls configurations via sighup 2018-01-08 09:21:06 -05:00
Preetha 1712b03705
Merge branch 'master' into 0.8 2018-01-03 16:06:38 -06:00
James Rasell 45e8f977f7
Update node-status verbose command to include node address.
This change updates the `nomad node-status -verbose` command to
also include the address of the node. This is helpful for cluster
administrators to quickly discover information and access nodes
when required.
2017-12-21 08:58:35 +00:00
Alex Dadgar bfc62ae41c bump version and remove generated structs 2017-12-19 17:10:52 -08:00
Alex Dadgar f0127afd93 generated files 2017-12-19 16:57:34 -08:00
Michael Schurter 714eb0b266 Services should not require a port
Fixes #3673
2017-12-19 15:50:23 -08:00
Kyle Havlovitz 1c07066064 Add autopilot functionality based on Consul's autopilot 2017-12-18 14:29:41 -08:00
Kyle Havlovitz b775fc7b33
Added support for v2 raft APIs and -raft-protocol option 2017-12-12 10:17:16 -06:00
Alex Dadgar d61ade8f02 remove generated structs 2017-12-11 17:51:41 -08:00
Alex Dadgar 8e63d545c4 generated assets 2017-12-11 17:30:37 -08:00
Michael Schurter b6152cf436
Merge pull request #3650 from hashicorp/b-lowercase-agent-service-id
Use the Service.Hash() method in agent service ids
2017-12-11 17:12:52 -08:00
Michael Schurter cdcefd0908 Use the Service.Hash() method in agent service ids
The allocID and taskName parameters are useless for agents, but it's
still nice to reuse the same hash method for agent and task services.
This brings in the lowercase mode for the agent hash as well.
2017-12-11 16:50:15 -08:00
Alex Dadgar 45f5f4c521
Update job_status.go 2017-12-11 16:14:03 -08:00
Alex Dadgar 9a8a89c3ab Node Status output includes new modify time 2017-12-11 16:08:25 -08:00
Alex Dadgar 189ff0dc22 Fix upgrade path of modify time 2017-12-11 15:58:24 -08:00
Michael Schurter 4f1002c1a8 Be more defensive in port checks 2017-12-08 12:27:57 -08:00
Michael Schurter d613e0aaf5 Move service hash logic to Service.Hash method 2017-12-08 12:03:43 -08:00
Michael Schurter b71edf846f Hash fields used in task service IDs
Fixes #3620

Previously we concatenated tags into task service IDs. This could break
deregistration of tag names that contained double //s like some Fabio
tags.

This change breaks service ID backward compatibility so on upgrade all
users services and checks will be removed and re-added with new IDs.

This change has the side effect of including all service fields in the
ID's hash, so we no longer have to track PortLabel and AddressMode
changes independently.
2017-12-08 12:03:43 -08:00
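The idea in the commit above, replacing concatenated tags with a hash of the service fields, can be sketched like this. It is an illustrative Python version, not the Go `Service.Hash` logic; the `task-` prefix, the field order, and the use of SHA-1 are assumptions.

```python
import hashlib


def service_id(alloc_id, task_name, name, port_label, address_mode, tags):
    """Build an ID from a hash of every field instead of joining raw tag text."""
    digest = hashlib.sha1()
    for field in (alloc_id, task_name, name, port_label, address_mode, *tags):
        data = field.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
        digest.update(len(data).to_bytes(4, "big"))
        digest.update(data)
    return "task-" + digest.hexdigest()
```

Because the ID contains only the prefix and hex digits, tag contents such as `//` can no longer leak into it, and any change to a hashed field (for example the port label) yields a new ID.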
Michael Schurter 91282315d1 Prevent using port 0 with address_mode=driver 2017-12-08 12:03:43 -08:00
Michael Schurter 4b20441eef Validate port label for host address mode
Also skip getting an address for script checks which don't use them.

Fixed a weird invalid reserved port in a TaskRunner test helper as well
as a problem with our mock Alloc/Job. Hopefully the latter doesn't cause
other tests to fail, but we were referencing an invalid PortLabel and
just not catching it before.
2017-12-08 12:03:43 -08:00
Michael Schurter 4347026f83 Test Consul from TaskRunner thoroughly
Rely less on the mockConsulServiceClient because the real
consul.ServiceClient needs all the testing it can get!
2017-12-08 12:03:00 -08:00
Michael Schurter 4ae115dc59 Allow custom ports for services and checks
Fixes #3380

Adds address_mode to checks (but no auto) and allows services and checks
to set literal port numbers when using address_mode=driver.

This allows SDNs, overlays, etc to advertise internal and host addresses
as well as do checks against either.
2017-12-08 12:03:00 -08:00
Michael Schurter 1dd5b3822c
Merge pull request #3608 from hashicorp/b-3342-windows-log-leak
Fix bug in log framer only affecting Windows
2017-12-08 10:59:26 -08:00
Chelsea Holland Komlo 2ea8e43214 code review fixups 2017-12-06 16:37:47 -05:00
Chelsea Holland Komlo a010db084b fix up basic test
add conversion for KillSignal for api/struct representation of task
2017-12-06 14:36:45 -05:00
Preetha 4e6204777e
Merge pull request #3618 from hashicorp/b_fix_ns_inspect_test
Fix Test: Inspect returns json, not cli formatted output
2017-12-05 12:41:04 -06:00
Michael Schurter b66aa5b7f6
Merge pull request #3563 from hashicorp/b-snapshot-atomic
Atomic Snapshotting / Sticky Volume Migration
2017-12-05 09:16:33 -08:00
Charlie Voiselle f7c4156f95 Inspect returns json, not cli formatted output 2017-12-05 12:04:11 -05:00
Alex Dadgar 0bec137561
Merge pull request #3555 from PagerDuty/fix-loop-on-sigpipe
Do not emit logs on SIGPIPE since logging service could be unavailable
2017-12-04 14:11:05 -08:00
Alex Dadgar ab67a98c13 Emit hostname as a label 2017-12-04 10:42:31 -08:00
Jens Herrmann 5680fcccc2 Fix typos in metric names. #3610 2017-12-01 15:24:14 +01:00
Michael Schurter 2cbde16b9b Add check for Windows ECONNRESET 2017-11-30 21:30:20 -08:00
Michael Schurter 3e8e3aac70 Add defensive check to safeguard from future #3342s
I hate adding "this should never happen" checks, but causing a tight
loop that OOMs Nomad is just too easy in this code otherwise.
2017-11-30 20:37:13 -08:00
Michael Schurter 29d86eb348 Fix race in framer and improperly returned err
Fixes #3342

Two bugs were fixed:

* Closing the StreamFramer's exitCh before setting the error means other
  goroutines blocked on exitCh closing could see the error as nil. This
  was *not* observed.
* parseFramerError on Windows would fall through and return an
  improperly captured nil err variable. There's no need for
  parseFramerError to be a closure which fixes the confusion.
2017-11-30 17:42:53 -08:00
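The first bug in the commit above is an ordering problem: if the exit signal fires before the error is stored, a waiter can wake up and read no error. The Python sketch below shows the safe ordering with a `threading.Event` standing in for the closed channel; the `Framer` class here is hypothetical and much simpler than the real StreamFramer.

```python
import threading


class Framer:
    """Toy framer that records an error and then signals waiters."""

    def __init__(self):
        self._exit = threading.Event()
        self._lock = threading.Lock()
        self._err = None

    def fail(self, err):
        # Store the error first, then signal exit, so no waiter sees None.
        with self._lock:
            self._err = err
        self._exit.set()

    def wait(self, timeout=None):
        """Block until exit is signalled, then return the recorded error."""
        self._exit.wait(timeout)
        with self._lock:
            return self._err
```

Swapping the two steps in `fail` would reintroduce the window in which `wait` returns before the error is visible.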
Michael Schurter 5e975bbd0f Add comment and normalize err check ordering
as per PR comments
2017-11-29 17:26:11 -08:00
Michael Schurter d996c3a231 Check for error file when receiving snapshots 2017-11-29 17:26:11 -08:00
Michael Schurter ca946679f6 Destroy partially migrated alloc dirs
Test that snapshot errors don't return a valid tar currently fails.
2017-11-29 17:26:11 -08:00
Charlie Voiselle c563aa7bd9 gofmt changes 2017-11-29 19:43:19 -05:00
Charlie Voiselle 8ef6964817 Handle exact namespace match properly when also matched as prefix
This fixes an issue with the nomad namespace inspect and nomad namespace status commands failing to run for namespaces whose names are the start of another namespace's name: for example, foo and foobar. If the argument matches the first possibility returned by prefix exactly, that namespace will be targeted. If no exact match occurs, the behavior is unchanged.
2017-11-29 19:40:08 -05:00
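The lookup rule described in the commit above can be sketched as a small resolver. This is an illustrative Python version under assumptions: the function name, the use of `LookupError`, and the ambiguity handling for non-exact matches are hypothetical rather than taken from the CLI code.

```python
def resolve_namespace(argument, namespaces):
    """Resolve a namespace name or prefix to a single namespace.

    Prefix matches are sorted, so an exact match, being the shortest string
    sharing the prefix, is always the first candidate.
    """
    matches = sorted(name for name in namespaces if name.startswith(argument))
    if not matches:
        raise LookupError("no namespace matches " + repr(argument))
    if matches[0] == argument:
        return argument
    if len(matches) > 1:
        raise LookupError("ambiguous prefix " + repr(argument))
    return matches[0]
```

With namespaces `foo` and `foobar`, the argument `foo` now resolves to `foo` instead of being rejected as an ambiguous prefix.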
Michael Lange 96403746b1 Add CORS headers to client fs endpoints 2017-11-21 11:22:42 -08:00
Preetha Appan 3592635ede Populate DisplayMessage in various http endpoints that return allocations, plus unit tests. 2017-11-17 14:53:26 -06:00
Alex Dadgar 69d3bf7392
Merge pull request #3559 from hashicorp/b-metrics
Don't emit metrics for non-running tasks
2017-11-17 10:33:23 -08:00
Preetha Appan 88b7c36357 Fix flaky test TestPrettyTimeDiff 2017-11-16 09:48:14 -06:00
Alex Dadgar 05b1588cea Only publish metric when the task is running and dev mode publishes metrics 2017-11-15 13:21:06 -08:00