Alex Dadgar
cdcd3c929c
loader and singleton
2019-01-22 15:11:57 -08:00
Alex Dadgar
6c2782f037
move catalog + grpcutils
2019-01-22 15:11:57 -08:00
Mahmood Ali
6bdb9864de
api: remove MockJob from exported functions
...
`api.MockJob` is a test utility, that's only used by `command/agent`
package. This moves it to the package and removes it from the public
API.
2019-01-18 14:51:31 -05:00
Michael Schurter
48afda786b
Merge pull request #5187 from hashicorp/test-consul
...
Port a bunch of pre-0.9 Consul tests to 0.9
2019-01-15 07:41:50 -08:00
Alex Dadgar
471fdb3ccf
Merge pull request #5173 from hashicorp/b-log-levels
...
Plugins use parent loggers
2019-01-14 16:14:30 -08:00
Michael Schurter
fc1bb95ef8
Remove old comment; it's been fixed!
2019-01-14 09:56:53 -08:00
Preetha Appan
7bd1440710
REfactor statedb factory config to set it directly in client config
2019-01-12 10:38:20 -06:00
Preetha Appan
f059ef8a47
Modified destroy failure handling to rely on allocrunner's destroy method
...
Added a unit test with custom statedb implementation that errors, to
use to verify destroy errors
2019-01-12 10:37:12 -06:00
Alex Dadgar
5621086f50
Enable json logs
2019-01-11 11:36:37 -08:00
Preetha Appan
b46728a88b
Make spread weight a pointer with default value if unset
2019-01-11 10:31:21 -06:00
Nick Wales
7a7b5da0df
Adds optional Consul service tags to nomad server and agent services, gh#4297
2019-01-09 22:02:46 +00:00
Chris Baker
e9db2ae822
Merge branch 'master' of github.com:hashicorp/nomad into f-1157-validate-node-meta-variables
2019-01-09 18:56:49 +00:00
Chris Baker
d5b1a56f3b
increased config validation coverage for dev mode
2019-01-09 18:56:40 +00:00
Chris Baker
d8a3a74c43
move `if dev` check into config validation, to support dev-mod
...
validation in the future
2019-01-08 22:21:48 +00:00
Chris Baker
220e9e838f
refactored config validation into a new method, modified Meta.Client
...
tests appropriately
2019-01-08 15:07:36 +00:00
Mahmood Ali
916a40bb9e
move cstructs.DeviceNetwork to drivers pkg
2019-01-08 09:11:47 -05:00
Chris Baker
bf00f93d87
moved interp key regex out to a helper function
2019-01-08 00:11:47 +00:00
Chris Baker
f99e18aaf4
gofmt to make check happy
2019-01-07 18:01:59 +00:00
Chris Baker
a61afad5bb
added validation on client metadata keys
2019-01-07 17:16:38 +00:00
Nick Ethier
82175d1328
client/drivermananger: add driver manager
...
The driver manager is modeled after the device manager and is started by the client.
It's responsible for handling driver lifecycle and reattachment state, as well as
processing the incomming fingerprint and task events from each driver. The mananger
exposes a method for registering event handlers for task events that is used by the
task runner to update the server when a task has been updated with an event.
Since driver fingerprinting has been implemented by the driver manager, it is no
longer needed in the fingerprint mananger and has been removed.
2018-12-18 22:55:18 -05:00
Alex Dadgar
4c57d2ec4d
Add plugin API versioning to plugin loader and plugins
2018-12-18 16:48:00 -08:00
Alex Dadgar
86d9ad4397
fix iops bug and increase test matrix coverage
2018-12-11 15:28:21 -08:00
Mahmood Ali
69b2355274
Merge pull request #4975 from hashicorp/fix-master-20181209
...
Some test fixes and remedies
2018-12-11 18:00:21 -05:00
Alex Dadgar
1531b6d534
Merge pull request #4970 from hashicorp/f-no-iops
...
Deprecate IOPS
2018-12-11 12:51:22 -08:00
Mahmood Ali
5a487ac884
tests: prevent indefinite blocking in some tests
...
Noticed few places where tests seem to block indefinitely and panic
after the test run reaches the test package timeout.
I intend to follow up with the proper fix later, but timing out is much
better than indefinitely blocking.
2018-12-11 09:35:26 -05:00
Alex Dadgar
c918a96490
Warn if IOPS is being used
2018-12-06 16:17:09 -08:00
Alex Dadgar
1e3c3cb287
Deprecate IOPS
...
IOPS have been modelled as a resource since Nomad 0.1 but has never
actually been detected and there is no plan in the short term to add
detection. This is because IOPS is a bit simplistic of a unit to define
the performance requirements from the underlying storage system. In its
current state it adds unnecessary confusion and can be removed without
impacting any users. This PR leaves IOPS defined at the jobspec parsing
level and in the api/ resources since these are the two public uses of
the field. These should be considered deprecated and only exist to allow
users to stop using them during the Nomad 0.9.x release. In the future,
there should be no expectation that the field will exist.
2018-12-06 15:09:26 -08:00
Michael Schurter
8fa5e90095
consul: add ScriptExecutor context wrapper
...
Since d335a82859ca2177bc6deda0c2c85b559daf2db3 ScriptExecutors now take
a timeout duration instead of a context. This broke the script check
removal code which used context cancelation propagation to remove
script checks while they were executing.
This commit adds a wrapper around ScriptExecutors that obeys context
cancelation again. The only downside is that it leaks a goroutine until
the underlying Exec call completes or timeouts.
Since check removal is relatively rare, check timeouts usually low, and
scripts usually fast, the risk of leaking a goroutine seems very small.
2018-12-03 20:26:31 -08:00
Michael Schurter
6459c19ffc
consul: fix script checks exiting after 1 run
...
Fixes a regression caused in d335a82859ca2177bc6deda0c2c85b559daf2db3
The removal of the inner context made the remaining cancels cancel the
outer context and cause script checks to exit prematurely.
2018-12-03 18:50:02 -08:00
Nick Ethier
80ae7e34f4
Merge pull request #4906 from hashicorp/f-metric-prefix-master
...
Port metric prefix filtering to master
2018-11-29 22:27:47 -05:00
Nick Ethier
b1484aec33
nomad: fix hclog usage
2018-11-29 22:27:39 -05:00
Alex Dadgar
4ee603c382
Device hook and devices affect computed node class
...
This PR introduces a device hook that retrieves the device mount
information for an allocation. It also updates the computed node class
computation to take into account devices.
TODO Fix the task runner unit test. The environment variable is being
lost even though it is being properly set in the prestart hook.
2018-11-27 17:25:33 -08:00
Nick Ethier
ed65610ec6
command/agent: additional tests for telemetry config parsing
2018-11-19 23:22:33 -05:00
Nick Ethier
b81e4e18f0
agent: suppose filter_default telemetry option
2018-11-19 23:21:48 -05:00
Nick Ethier
85b221a1d6
nomad: add flag to disable publishing of job_summary metrics for dispatched jobs
2018-11-19 23:21:19 -05:00
Preetha Appan
18708d3f0b
Pass service metadata "external-source" for consul UI integration
2018-11-16 11:28:56 -06:00
Mahmood Ali
c62ec124c0
Set clean config for mock driver
...
The default job here contains some exec task config (for setting
command and args) that aren't used for mock driver. Now, the alloc
runner seems stricter about validating fields and errors on unexpected
fields.
Updating configs in tests so we can have an explicit task config
whenever driver is set explicitly.
2018-11-13 10:21:40 -05:00
Mahmood Ali
c7610d8c22
mark and skip failing consul failing tests
2018-11-13 10:21:40 -05:00
Preetha Appan
fd0ba320da
change path to v1/scheduler/configuration
2018-11-12 15:57:45 -06:00
Preetha Appan
3a10a589d7
Fix failing test
2018-11-10 19:53:47 -06:00
Preetha Appan
7ef126a027
Smaller methods, and added tests for RPC layer
2018-11-10 17:37:33 -06:00
Preetha Appan
75662b50d1
Use response object/querymeta/writemeta in scheduler config API
2018-11-10 10:31:10 -06:00
Alex Dadgar
98398a8a44
Merge pull request #4842 from hashicorp/b-deployment-progress-deadline
...
Fix multiple bugs with progress deadline handling
2018-11-08 13:31:54 -08:00
Alex Dadgar
204ca8230c
Device manager
...
Introduce a device manager that manages the lifecycle of device plugins
on the client. It fingerprints, collects stats, and forwards Reserve
requests to the correct plugin. The manager, also handles device plugins
failing and validates their output.
2018-11-07 10:43:15 -08:00
Michael Schurter
392d548b85
Merge pull request #4828 from hashicorp/b-restore
...
Implement client agent restarting
2018-11-05 18:50:15 -06:00
Alex Dadgar
1c31970464
Fix multiple tgs with progress deadline handling
...
Fix an issue in which the deployment watcher would fail the deployment
based on the earliest progress deadline of the deployment regardless of
if the task group has finished.
Further fix an issue where the blocked eval optimization would make it
so no evals were created to progress the deployment. To reproduce this
issue, prior to this commit, you can create a job with two task groups.
The first group has count 1 and resources such that it can not be
placed. The second group has count 3, max_parallel=1, and can be placed.
Run this first and then update the second group to do a deployment. It
will place the first of three, but never progress since there exists a
blocked eval. However, that doesn't capture the fact that there are two
groups being deployed.
2018-11-05 16:06:17 -08:00
Michael Schurter
6bdbfb8129
tests: get consul integration tests building
2018-11-05 12:32:05 -08:00
Preetha Appan
9d316cbbef
Fix return type in tests after refactor
2018-10-30 11:10:46 -05:00
Preetha Appan
8f7eb61823
Introduce a response object for scheduler configuration
2018-10-30 11:06:32 -05:00
Preetha Appan
c1c1c230e4
Make preemption config a struct to allow for enabling based on scheduler type
2018-10-30 11:06:32 -05:00
Preetha Appan
bd34cbb1f7
Support for new scheduler config API, first use case is to disable preemption
2018-10-30 11:06:32 -05:00
Michael Schurter
d71a1b4547
tests: more fixes due to api changes
2018-10-29 15:25:22 -07:00
Michael Schurter
2b1b3d7e1e
tests: get tests building if not yet passing
2018-10-16 16:56:57 -07:00
Michael Schurter
1a29337e48
register drivers by default
...
Do not register mock_driver on release builds.
2018-10-16 16:56:56 -07:00
Nick Ethier
3183b33d24
client: review comments and fixup/skip tests
2018-10-16 16:56:56 -07:00
Nick Ethier
f192c3752a
client: refactor post allocrunnerv2 finalization
2018-10-16 16:56:56 -07:00
Nick Ethier
4a4c7dbbfc
client: begin driver plugin integration
...
client: fingerprint driver plugins
2018-10-16 16:56:56 -07:00
Alex Dadgar
7946a14aa8
Fix lints
2018-10-16 16:56:56 -07:00
Michael Schurter
f279b1d1b1
tests: test logs endpoint against pending task
...
Although the really exciting change is making WaitForRunning return the
allocations that it started. This should cut down test boilerplate
significantly.
2018-10-16 16:56:55 -07:00
Michael Schurter
6bcf772f3c
tests: test via ServeMux so http codes are set
2018-10-16 16:56:55 -07:00
Michael Schurter
960f3be76c
client: expose task state to client
...
The interesting decision in this commit was to expose AR's state and not
a fully materialized Allocation struct. AR.clientAlloc builds an Alloc
that contains the task state, so I considered simply memoizing and
exposing that method.
However, that would lead to AR having two awkwardly similar methods:
- Alloc() - which returns the server-sent alloc
- ClientAlloc() - which returns the fully materialized client alloc
Since ClientAlloc() could be memoized it would be just as cheap to call
as Alloc(), so why not replace Alloc() entirely?
Replacing Alloc() entirely would require Update() to immediately
materialize the task states on server-sent Allocs as there may have been
local task state changes since the server received an Alloc update.
This quickly becomes difficult to reason about: should Update hooks use
the TaskStates? Are state changes caused by TR Update hooks immediately
reflected in the Alloc? Should AR persist its copy of the Alloc? If so,
are its TaskStates canonical or the TaskStates on TR?
So! Forget that. Let's separate the static Allocation from the dynamic
AR & TR state!
- AR.Alloc() is for static Allocation access (often for the Job)
- AR.AllocState() is for the dynamic AR & TR runtime state (deployment
status, task states, etc).
If code needs to know the status of a task: AllocState()
If code needs to know the names of tasks: Alloc()
It should be very easy for a developer to reason about which method they
should call and what they can do with the return values.
2018-10-16 16:56:55 -07:00
Michael Schurter
1c9ccdeab5
tests: fix races caused by sharing a buffer
...
httptest.ResponseRecorder exposes a bytes.Buffer which we were reading
and writing concurrently to test streaming log APIs. This is a race, so
I wrapped the struct in a lock with some helpers.
2018-10-16 16:56:55 -07:00
Alex Dadgar
84ce8c3487
extra logging
2018-10-16 16:56:55 -07:00
Alex Dadgar
6f0ed6184b
Fix client reloading and pass the plugin loaders to server and client
2018-10-16 16:56:55 -07:00
Alex Dadgar
183561cf82
Plugin loader initialization
2018-10-16 16:54:12 -07:00
Michael Schurter
a4b4d7b266
consul service hook
...
Deregistration works but difficult to test due to terminal updates not
being fully implemented in the new client/ar/tr.
2018-10-16 16:53:29 -07:00
Alex Dadgar
a78cefec18
use int64
2018-10-16 15:34:32 -07:00
Preetha Appan
7c0d8c646c
Change CPU/Disk/MemoryMB to int everywhere in new resource structs
2018-10-16 16:21:42 -05:00
Alex Dadgar
5a07f9f96e
parse affinities and constraints on devices
2018-10-11 14:05:19 -07:00
Alex Dadgar
87cacb427f
parse devices
2018-10-08 16:09:41 -07:00
Alex Dadgar
6b08b9d6b6
Define device request structs
2018-10-08 15:38:03 -07:00
Alex Dadgar
01f8e5b95f
renames
2018-10-04 14:57:25 -07:00
Alex Dadgar
52f9cd7637
fixing tests
2018-10-04 14:26:19 -07:00
Alex Dadgar
bac5cb1e8b
Scheduler uses allocated resources
2018-10-02 17:08:25 -07:00
Alex Dadgar
5c8697667e
Node reserved resources
2018-09-29 18:44:55 -07:00
Alex Dadgar
ca28afa3b2
small fixes
2018-09-15 16:42:38 -07:00
Alex Dadgar
3c19d01d7a
server
2018-09-15 16:23:13 -07:00
Alex Dadgar
7739ef51ce
agent + consul
2018-09-13 10:43:40 -07:00
Alex Dadgar
4f89cabd34
Merge pull request #4631 from hashicorp/f-plugin-config
...
Parse plugin configs
2018-09-04 17:04:13 -07:00
Alex Dadgar
cc92cd92cd
Merge pull request #4642 from hashicorp/b-vet
...
Fix vet errors and use newer go version in travis
2018-09-04 17:04:02 -07:00
Alex Dadgar
c6576ddac1
Fix make check errors
2018-09-04 16:03:52 -07:00
Preetha Appan
659cfa3f64
Parsing and API layer for spread stanza
2018-09-04 16:10:11 -05:00
Preetha Appan
f3c4eead91
Refactor method to return affinity struct, and add extra test at task level
2018-09-04 16:10:11 -05:00
Preetha Appan
9f0caa9c3d
Affinity parsing, api and structs
2018-09-04 16:10:11 -05:00
Alex Dadgar
c0de218747
plugin dir parsing
2018-08-30 13:43:09 -07:00
Alex Dadgar
bff1669ee4
Plugin config parsing
2018-08-29 17:06:01 -07:00
Wyatt Anderson
9dccb62489
Add documentation for eligibility toggle endpoint
2018-08-24 10:50:12 -04:00
Chelsea Komlo
0a69cdb304
Merge pull request #4565 from hashicorp/b-compare-cert-alg
...
Error if TLS Certificate signature algorithm isn't supported in cipher suites
2018-08-15 16:09:46 -04:00
Chelsea Holland Komlo
71a4ced04c
fix up test failure due to keyloader instantiated on tls config during parsing
2018-08-15 00:59:29 -04:00
Chelsea Holland Komlo
e8379c9059
skip update checking if DisableUpdateCheck is set to true
2018-08-10 13:08:13 -04:00
Chelsea Holland Komlo
b92098fd08
change function signature to take entire tls config object
2018-08-10 12:37:21 -04:00
Chelsea Holland Komlo
75d631a1c8
fix reload issue for tls certificates in dev mode
2018-07-05 17:08:31 -04:00
Alex Dadgar
b61051b3cd
Merge pull request #4409 from hashicorp/r-client-packages
...
Refactor client packages
2018-06-13 17:32:25 -07:00
Alex Dadgar
300b1a7a15
Tests only use testlog package logger
2018-06-13 15:40:56 -07:00
Chelsea Komlo
03075b603a
Merge pull request #4399 from hashicorp/r-reload-refactor
...
Refactor logic for dynamic reloading
2018-06-13 13:35:12 -04:00
Alex Dadgar
90c2108bfb
Fix gc tests + parallel destroy + small test fixes
2018-06-12 10:23:45 -07:00
Alex Dadgar
af5753d2cd
bump version + generated files
2018-06-11 13:39:42 -07:00
Chelsea Holland Komlo
3b5d5c7be8
remove logic to reload RPC connections from agent
2018-06-08 13:14:40 -04:00
Alex Dadgar
8efe9696ad
move log line
2018-06-07 15:12:51 -07:00
Chelsea Komlo
d738976234
Merge pull request #4395 from hashicorp/b-vault-second
...
Fix for dynamically reloading vault
2018-06-07 18:03:00 -04:00
Chelsea Holland Komlo
dcc9cdfeb7
fixup! comment and move to always log server reload operation
2018-06-07 17:12:36 -04:00
Chelsea Holland Komlo
9f6bd7bf3a
move logic for testing equality for vault config
2018-06-07 16:23:50 -04:00
Chelsea Holland Komlo
282f37b1ee
fix for dynamically reloading vault
2018-06-07 15:34:18 -04:00
Michael Schurter
f8e12e6ee7
agent: global logger should use the same flags
...
Prior to this change logs from the global logger only used seconds:
```
2018/06/06 18:25:58 http: TLS handshake error from ...
```
After this change they properly use the microseconds flag:
```
2018/06/06 18:39:50.702447 http: TLS handshake error ...
```
They still lack a log level unfortunately.
2018-06-06 11:40:08 -07:00
Preetha Appan
82837839eb
Fix bug with determining when agent is a client
...
This fixes a bug introduced in commit e27caadca6 that sets a boolean flag
when the agent is a client. It incorrectly checked state before initializing
the client. This leads to Nomad clients not deregistering any services registered
in Consul after allocs are destroyed
2018-06-05 19:19:52 -05:00
Alex Dadgar
c0386819b3
bump version/lint/generated files
2018-06-01 15:23:10 -07:00
Alex Dadgar
247f1edb11
spelling
2018-06-01 14:53:08 -07:00
Preetha Appan
ce6d4a8d7a
Fix tests and move isClient to constructor
2018-06-01 15:59:53 -05:00
Preetha Appan
a5bfaa098c
Fix unnecessary deregistration in consul sync
...
This commit fixes an issue where if a nomad client and server shared the same consul instance, the server would deregister any services and checks registered by clients for running tasks.
2018-06-01 14:48:25 -05:00
Alex Dadgar
40fec81315
Merge pull request #4277 from hashicorp/f-retry-join-clients
...
Add go-discover support to Nomad clients
2018-06-01 16:57:40 +00:00
Alex Dadgar
aca8d5cece
Actually disable the schedulers
2018-05-31 13:11:11 -07:00
Alex Dadgar
d098885b79
Disable schedulers for TestHTTP_AllocSnapshot_Atomic
2018-05-31 12:05:44 -07:00
Alex Dadgar
4765b62284
Improve validation/defaulting, handle start-join
...
This commit:
* Improves how we combine the old retry-* fields and the new stanza and
how it is validated
* Handles the new stanza setting start_join
* Fixes integration test to not bind to the standard port and instead be
randomized.
* Simplifies parsing of the old retry_interval
* Fixes the errors from retry join being masked
* Flags get parsed into new server_join stanza
2018-05-31 10:53:26 -07:00
Alex Dadgar
e1bf8780b5
validation errors
2018-05-31 10:53:26 -07:00
Alex Dadgar
a02fbe3e0f
indentation
2018-05-31 10:53:26 -07:00
Chelsea Holland Komlo
2bf2af4378
ensure default value of 30s is set for server_join stanza
2018-05-31 10:50:04 -07:00
Chelsea Holland Komlo
307458d4a3
ignore default values for retry interval
...
add additional validation case
2018-05-31 10:50:04 -07:00
Chelsea Holland Komlo
ebc758aa0e
add stronger protections for nil pointers in server join merge
2018-05-31 10:50:04 -07:00
Chelsea Holland Komlo
10aff14509
update config parse test
...
documentation fixes
2018-05-31 10:50:04 -07:00
Chelsea Holland Komlo
ac1411ce95
RetryInterval should be a time.Duration
2018-05-31 10:50:04 -07:00
Chelsea Holland Komlo
e79bc29e1a
set retryInterval and other code feedback
2018-05-31 10:50:04 -07:00
Chelsea Holland Komlo
de03c884bc
add further configuration validation for server_join
2018-05-31 10:50:04 -07:00
Chelsea Holland Komlo
df7539b9d0
update documentation for server_join
2018-05-31 10:50:04 -07:00
Chelsea Holland Komlo
a4e514e07f
update server_join naming and improve logging
2018-05-31 10:50:03 -07:00
Chelsea Holland Komlo
064b5481e0
add server join info to server and client
2018-05-31 10:50:03 -07:00
Preetha Appan
7414395daa
Use constant in test
2018-05-30 17:27:04 -05:00
Preetha Appan
6cbd25945c
Add unit test to verify compatibility code for node drains
2018-05-30 17:14:53 -05:00
Preetha Appan
4f835790d7
Set node eligibility to true when old client calls disable
2018-05-30 16:54:07 -05:00
Preetha Appan
34db410b74
Fix failing test TestClientStatusRequest
2018-05-30 15:11:54 -05:00
Preetha Appan
2752204f26
Fix failing test TestHTTP_AllocAllGC
2018-05-30 15:11:54 -05:00
Chelsea Holland Komlo
19e4a5489b
add support for tls PreferServerCipherSuites
...
add further tests for tls configuration
2018-05-25 13:20:00 -04:00
Chelsea Komlo
af15dda45a
Merge pull request #4328 from hashicorp/r-single-tls-config-constructor
...
Refactor to prefer using NewTLSConfiguration constructor
2018-05-24 13:46:29 -04:00
Alex Dadgar
b1de61e012
Merge pull request #4321 from hashicorp/f-network-info
...
Display bind/advertise addresses on agent startup
2018-05-24 17:30:56 +00:00
Chelsea Holland Komlo
38f611a7f2
refactor NewTLSConfiguration to pass in verifyIncoming/verifyOutgoing
...
add missing fields to TLS merge method
2018-05-23 18:35:30 -04:00
Alex Dadgar
51e67daf69
Use Tags when CanaryTags isn't specified
...
This PR fixes a bug where we weren't defaulting to `tags` when
`canary_tags` was empty and adds documentation.
2018-05-23 13:07:47 -07:00
Alex Dadgar
dd52ec402c
Display bind/advertise addresses on agent startup
...
Sample outputs from demo/vagrant/(server/client1).hcl and `nomad agent -dev` mode
Server:
```
==> Nomad agent configuration:
Advertise Addrs: HTTP: 192.168.1.75:4646; RPC: 192.168.1.75:4647; Serf: 192.168.1.75:4648
Bind Addrs: HTTP: 0.0.0.0:4646; RPC: 0.0.0.0:4647; Serf: 0.0.0.0:4648
Client: false
Log Level: DEBUG
Region: global (DC: dc1)
Server: true
Version: 0.8.4-dev
```
Client:
```
==> Nomad agent configuration:
Advertise Addrs: HTTP: 192.168.1.75:5656
Bind Addrs: HTTP: 0.0.0.0:5656
Client: true
Log Level: DEBUG
Region: global (DC: dc1)
Server: false
Version: 0.8.4-dev
```
Dev:
```
==> Nomad agent configuration:
Advertise Addrs: HTTP: 127.0.0.1:4646; RPC: 127.0.0.1:4647; Serf: 127.0.0.1:4648
Bind Addrs: HTTP: 127.0.0.1:4646; RPC: 127.0.0.1:4647; Serf: 127.0.0.1:4648
Client: true
Log Level: DEBUG
Region: global (DC: dc1)
Server: true
Version: 0.8.4-dev
```
2018-05-22 15:14:33 -07:00
Alex Dadgar
44697efd9a
safety guard
2018-05-22 14:45:34 -07:00
Alex Dadgar
586895965c
Unit test for dev agent
2018-05-22 14:45:34 -07:00
Alex Dadgar
58d2a4c7c2
Do not bypass normal RPC codepath when running both client and server at once
2018-05-22 14:45:34 -07:00
Alex Dadgar
21c5ed850d
Register events
2018-05-22 14:06:33 -07:00
Preetha
159888a856
Merge pull request #4274 from hashicorp/f-force-rescheduling
...
Add CLI and API support for forcing rescheduling of failed allocs
2018-05-21 16:24:22 -07:00
Preetha Appan
bfa0937bbb
Code review feedback
2018-05-10 14:42:24 -05:00
Chelsea Holland Komlo
44f536f18e
add support for configurable TLS minimum version
2018-05-09 18:07:12 -04:00
Chelsea Holland Komlo
796bae6f1b
allow configurable cipher suites
...
disallow 3DES and RC4 ciphers
add documentation for tls_cipher_suites
2018-05-09 17:15:31 -04:00
Preetha Appan
b12df3c64b
Added CLI for evaluating job given ID, and modified client API for evaluate to take a request payload
2018-05-09 15:04:27 -05:00
Preetha Appan
c1b92c284e
Work in progress - force rescheduling of failed allocs
2018-05-08 17:26:57 -05:00
Preetha
e7ae6e98d9
Merge pull request #4259 from hashicorp/f-deployment-improvements
2018-05-08 16:37:10 -05:00
Chelsea Holland Komlo
136635f04d
only write error log line on error
2018-05-07 16:57:07 -04:00
Chelsea Holland Komlo
30584639b5
remove log line for empty addresses which could confuse on initalization
2018-05-07 16:57:07 -04:00
Chelsea Holland Komlo
24ff40df01
retry until all options are exhausted
2018-05-07 16:57:07 -04:00
Chelsea Holland Komlo
ec4be4f871
ensure provider= is always the string prefix
2018-05-07 16:57:07 -04:00
Chelsea Holland Komlo
5422b1b088
update test for more realistic IP address from go-discover
2018-05-07 16:57:07 -04:00
Chelsea Holland Komlo
7e4d4f8088
comments and other fixups
2018-05-07 16:57:06 -04:00
Chelsea Holland Komlo
8f584f6474
add go-discover
2018-05-07 16:57:06 -04:00
Chelsea Holland Komlo
25ad6eaf96
refactor to retryJoiner interface
2018-05-07 16:57:06 -04:00
Michael Schurter
f1d13683e6
consul: remove services with/without canary tags
...
Guard against Canary being set to false at the same time as an
allocation is being stopped: this could cause RemoveTask to be called
with the wrong Canary value and leaking a service.
Deleting both Canary values is the safest route.
2018-05-07 14:55:01 -05:00
Michael Schurter
50e04c976e
consul: support canary tags for services
...
Also refactor Consul ServiceClient to take a struct instead of a massive
set of arguments. Meant updating a lot of code but it should be far
easier to extend in the future as you will only need to update a single
struct instead of every single call site.
Adds an e2e test for canary tags.
2018-05-07 14:55:01 -05:00
Alex Dadgar
f4af30fbb5
Canary tags structs
2018-05-07 14:50:01 -05:00
Alex Dadgar
ee50789c22
Initial implementation
2018-05-07 14:50:01 -05:00
Michael Schurter
0d534d30d6
Merge pull request #4251 from hashicorp/f-grpc-checks
...
Support Consul gRPC Health Checks
2018-05-04 14:55:16 -07:00
Michael Schurter
f6a4713141
consul: make grpc checks more like http checks
2018-05-04 11:08:11 -07:00
Michael Schurter
382caec1e1
consul: initial grpc implementation
...
Needs to be more like http.
2018-05-04 11:08:11 -07:00
Michael Schurter
526af6a246
framer: fix early exit/truncation in framer
2018-05-02 10:46:16 -07:00
Michael Schurter
5d6bf2938f
client: use a bytes.Reader for reading a []byte
2018-05-02 10:46:16 -07:00
Michael Schurter
e00e265da5
client: ensure cancel is always called when func exits
2018-05-02 10:46:16 -07:00
Michael Schurter
d21b749c68
client: give pipe conns meaningful names
2018-05-02 10:46:16 -07:00
Michael Schurter
5ef0a82e6e
client: reset encoders between uses
...
According to go/codec's docs, Reset(...) should be called on
Decoders/Encoders before reuse:
https://godoc.org/github.com/ugorji/go/codec
I could find no evidence that *not* calling Reset() caused bugs, but
might as well do what the docs say?
2018-05-02 10:46:16 -07:00
Alex Dadgar
43192cefae
generated files
2018-04-26 16:28:58 -07:00
Michael Schurter
cfcbb9fa21
consul: periodically reconcile services/checks
...
Periodically sync services and checks from Nomad to Consul. This is
mostly useful when testing with the Consul dev agent which does not
persist state across restarts. However, this is a reasonable safety
measure to prevent skew between Consul's state and Nomad's
services+checks.
Also modernized the test suite a bit.
2018-04-19 15:45:42 -07:00
Nick Ethier
2e6c95f511
Merge pull request #4138 from hashicorp/i-hcl-json-endpoint
...
HCL to JSON api endpoint
2018-04-19 14:18:34 -04:00
Michael Schurter
d3650fb2cd
test: build with mock_driver by default
...
`make release` and `make prerelease` set a `release` tag to disable
enabling the `mock_driver`
2018-04-18 14:45:33 -07:00
Nick Ethier
555b036369
command/agent: fix url in jobs parse ep test
2018-04-16 19:21:09 -04:00
Nick Ethier
0baf9ba385
command/agent: fix ptr ref in job endpoint test
2018-04-16 19:21:09 -04:00
Nick Ethier
de4176606d
command/agent: add Canonicalize option to parse args
2018-04-16 19:21:09 -04:00
Nick Ethier
31da01856a
command/agent: add HCL mock for parse endpoint
2018-04-16 19:21:09 -04:00
Nick Ethier
f2db03e56c
command/agent: add /v1/jobs/parse endpoint
...
The parse endpoint accepts a hcl jobspec body within a json object
and returns the parsed json object for the job. This allows users to
register jobs with the nomad json api without specifically needing
a nomad binary to parse their hcl encoded jobspec file.
2018-04-16 19:21:06 -04:00
Alex Dadgar
663c4d0433
Version bump and generated files
2018-04-12 16:21:50 -07:00
Preetha Appan
f04c9d2b19
minor code review fix
2018-04-10 15:33:01 -05:00
Preetha Appan
6b6e97e50c
Dev mode should never persist nodeid
2018-04-10 12:34:14 -05:00
Preetha Appan
e1c3c7ab6c
Lint fixes
2018-04-10 11:22:16 -05:00
Preetha Appan
6d0e1c9fea
Use preconfigured nodeID if there isn't a persisted node ID, and persist it if its not persisted.
2018-04-10 08:47:33 -05:00
Alex Dadgar
d179a09b83
WIP: Not setting node id properlperly
2018-04-09 18:01:28 -07:00
Alex Dadgar
6871a068cb
Fix prefix output of agent metadata
2018-03-30 16:27:18 -07:00
Alex Dadgar
af81349dbe
Generated files
2018-03-30 16:14:40 -07:00
Michael Schurter
7199a2b960
cli: differentiate normal output vs info
2018-03-30 11:42:11 -07:00
Chelsea Komlo
607e631714
Merge pull request #4046 from hashicorp/tls-same-file-reload
...
Check file contents when determining if agent should reload TLS confi…
2018-03-29 10:51:32 -04:00
Chelsea Holland Komlo
be15c07110
make check fix
2018-03-28 19:11:51 -04:00
Chelsea Holland Komlo
58ada9bc42
return error when setting checksum; don't reload
2018-03-28 18:15:50 -04:00
Chelsea Holland Komlo
38c7a38549
output warning for error in creating TLS checksum
2018-03-28 17:44:23 -04:00
Chelsea Holland Komlo
770ac13bc4
check for nil, remove unnecessary set checksum call
2018-03-28 13:29:53 -04:00
Preetha
177d2d6010
Merge pull request #4052 from hashicorp/f-specify-total-memory
...
Allow to specify total memory on agent configuration
2018-03-28 12:28:41 -05:00
Chelsea Holland Komlo
d3d686b58a
fix up test for file content changes
2018-03-28 13:18:13 -04:00
Preetha Appan
5c2fc6d46b
Fix compilation fail in test after rebase
2018-03-28 11:54:34 -05:00
Chelsea Holland Komlo
2d5af7ff4d
set TLS checksum when parsing config
...
Refactor checksum comparison, always set checksum if it is empty
2018-03-28 09:56:11 -04:00
Alex Dadgar
de4b3772f1
Create evals for system jobs when drain is unset
...
This PR creates evals for system jobs when:
* Drain is unset and mark eligible is true
* Eligibility is restored to the node
2018-03-27 15:53:24 -07:00
Mildred Ki'Lya
1017cbe8ab
Allow to specify total memory on agent configuration
...
Allow to set the total memory of an agent in its configuration file. This
can be used in case the automatic detection doesn't work or in specific
environments when memory overcommit (using swap for example) can be
desirable.
2018-03-27 15:46:18 -05:00
Chelsea Holland Komlo
6e6d6b7e33
check file contents when determining if agent should reload TLS configuration
2018-03-27 15:42:20 -04:00
Alex Dadgar
432784dae3
Fix alloc watcher snapshot streaming
2018-03-27 11:14:53 -07:00
Chelsea Komlo
57e2cd04bd
Merge pull request #4025 from hashicorp/reload-http-tls
...
Allow TLS configurations for HTTP and RPC connections to be reloaded …
2018-03-26 18:00:30 -04:00
Preetha Appan
33e170c15d
s/linear/constant/g
2018-03-26 14:45:09 -05:00
Chelsea Holland Komlo
96df419fff
code review feedback
2018-03-26 10:55:22 -04:00
Alex Dadgar
34211f00a7
Allow separate enterprise config overlay
2018-03-22 13:53:08 -07:00
Michael Schurter
0e0b04afec
test: fix by using mock.BatchJob
2018-03-21 16:51:45 -07:00
Michael Schurter
39cef16c73
test: don't call t.Fatal from within a goroutine
2018-03-21 16:51:45 -07:00
Michael Schurter
cb61a4bdc7
Fix linting errors
2018-03-21 16:51:45 -07:00
Alex Dadgar
7b2bad8c5e
Toggle Drain allows resetting eligibility
...
This PR allows marking a node as eligible for scheduling while toggling
drain. By default the `nomad node drain -disable` commmand will mark it
as eligible but the drainer will maintain in-eligibility.
2018-03-21 16:51:44 -07:00
Alex Dadgar
02019f216a
Correct defaulting
2018-03-21 16:51:44 -07:00
Alex Dadgar
78c7c36e65
code review
2018-03-21 16:51:44 -07:00
Alex Dadgar
8289cc3c6f
HTTP and API
2018-03-21 16:51:44 -07:00
Alex Dadgar
b3d2346419
Upgrade path
2018-03-21 16:51:43 -07:00
Alex Dadgar
010228577e
Drain cli, api, http
2018-03-21 16:51:43 -07:00
Chelsea Holland Komlo
66e44cdb73
Allow TLS configurations for HTTP and RPC connections to be reloaded separately
2018-03-21 17:51:08 -04:00
Michael Schurter
70c370c6fe
Merge pull request #4003 from jrasell/f_gh_3988
...
Allow Nomads Consul health check names to be configurable.
2018-03-20 16:44:08 -07:00
James Rasell
121c3bc997
Update Consul check params from using health-check to check.
2018-03-20 16:03:58 +01:00
Michael Schurter
86ccdb9115
Fix generating static assets
...
Broke due to a change in go-bindata-assetfs
2018-03-19 15:52:38 -07:00
James Rasell
15afef9b77
Allow Nomads Consul health checks to be configurable.
...
This change allows the client HTTP and the server HTTP, Serf and
RPC health check names within Consul to be configurable with the
defaults as previous. The configuration can be done via either a
config file or using CLI flags.
Closes #3988
2018-03-19 19:37:56 +01:00
Preetha
6df57c177c
Merge pull request #4002 from hashicorp/b-reschedule-systemjob-panic
...
Fix incorrect initialization of reschedule policy for system jobs.
2018-03-19 13:06:55 -05:00
Preetha Appan
161bc66355
Fix incorrect initialization of reschedule policy for system jobs.
2018-03-19 12:16:13 -05:00
Alex Dadgar
9e05c9a50e
Merge pull request #3997 from hashicorp/b-serf-addr
...
RPC Advertise used exclusively for Clients
2018-03-19 09:30:20 -07:00
Alex Dadgar
9ef23ff277
enable server in test
2018-03-16 16:52:37 -07:00
Alex Dadgar
b8607ad6d6
Heartbeat uses client rpc advertise and server defaults server rpc advertise addr
2018-03-16 16:47:08 -07:00
Alex Dadgar
52b7fb5361
Separate client and server rpc advertise addresses
2018-03-16 16:47:08 -07:00
Michael Schurter
86f562be3a
Remove unnecessary conversions
2018-03-16 16:32:59 -07:00
Michael Schurter
c3e8f6319c
gofmt -s (simplify) files
2018-03-16 16:31:16 -07:00
Michael Schurter
1044bc0feb
Merge pull request #3984 from hashicorp/f-loosen-consul-skipverify
...
Replace Consul TLSSkipVerify handling
2018-03-16 11:21:28 -07:00
Michael Schurter
0971114f0c
Replace Consul TLSSkipVerify handling
...
Instead of checking Consul's version on startup to see if it supports
TLSSkipVerify, assume that it does and only log in the job service
handler if we discover Consul does not support TLSSkipVerify.
The old code would break TLSSkipVerify support if Nomad started before
Consul (such as on system boot) as TLSSkipVerify would default to false
if Consul wasn't running. Since TLSSkipVerify has been supported since
Consul 0.7.2, it's safe to relax our handling.
2018-03-14 17:43:06 -07:00
Alex Dadgar
3537c73289
Merge pull request #3978 from hashicorp/b-core-sched
...
Always add core scheduler
2018-03-14 16:13:15 -07:00
Preetha Appan
e75630f8e8
Fix formatting
2018-03-14 16:10:32 -05:00
Preetha Appan
9a5e6edf1f
Rename DelayCeiling to MaxDelay
2018-03-14 16:10:32 -05:00
Preetha Appan
5f50c3d618
Add new reschedule options to API layer and unit tests
2018-03-14 16:10:32 -05:00
Alex Dadgar
92cb552ff6
Always add core scheduler and detect invalid schedulers
2018-03-14 10:53:27 -07:00
Alex Dadgar
63e14b7d63
nodeevents -> events
2018-03-13 18:08:22 -07:00
Chelsea Holland Komlo
1488b076d1
code review feedback
2018-03-13 18:08:21 -07:00
Chelsea Holland Komlo
00d9923454
Ensure node updates don't strip node events
...
Add node events to CLI
2018-03-13 18:05:40 -07:00
Michael Schurter
ec381ee705
Revert spelling corrections in generated code
2018-03-12 11:19:29 -07:00
Josh Soref
1359fd2c3d
spelling: unexpected
2018-03-11 19:08:07 +00:00
Josh Soref
42d7f19861
spelling: supports
2018-03-11 19:00:11 +00:00
Josh Soref
c808dc3095
spelling: submitted
2018-03-11 18:59:27 +00:00
Josh Soref
6e1244b6c1
spelling: significantly
2018-03-11 18:56:45 +00:00
Josh Soref
8978caea28
spelling: shutdown
2018-03-11 18:55:49 +00:00
Josh Soref
05305afcd9
spelling: services
2018-03-11 18:53:58 +00:00
Josh Soref
ad55e85e73
spelling: registrations
2018-03-11 18:40:53 +00:00
Josh Soref
6fa892a463
spelling: propagated
2018-03-11 18:39:26 +00:00
Josh Soref
d208d26b6e
spelling: preemptively
2018-03-11 17:58:48 +00:00
Josh Soref
8abf038f4d
spelling: output
2018-03-11 18:35:30 +00:00
Josh Soref
3c1ce6d16d
spelling: otherwise
2018-03-11 18:34:27 +00:00
Josh Soref
3e2f500cf9
spelling: largely
2018-03-11 18:21:52 +00:00
Josh Soref
85fabc63c8
spelling: expected
2018-03-11 17:57:01 +00:00
Josh Soref
680bbd6d4f
spelling: encountered
2018-03-11 17:58:59 +00:00
Josh Soref
1f927cd343
spelling: each other
2018-03-11 17:56:50 +00:00
Josh Soref
7a6dfa4b1a
spelling: current
2018-03-11 17:52:32 +00:00
Josh Soref
6f48e31c00
spelling: convenience
2018-03-11 17:50:48 +00:00
Josh Soref
76cf178933
spelling: cleanup
2018-03-11 17:47:09 +00:00
Josh Soref
3ec4ebc7b1
spelling: canonical
2018-03-11 17:46:01 +00:00
Josh Soref
1d58ae8899
spelling: bootstrap
2018-03-11 17:43:19 +00:00
Josh Soref
455eb1aeb3
spelling: authoritative
2018-03-11 17:42:05 +00:00
Josh Soref
5f87691df1
spelling: asynchronously
2018-03-11 17:41:50 +00:00
Josh Soref
58b794875f
spelling: artifact
2018-03-11 17:41:02 +00:00
Josh Soref
c74245279b
spelling: arbitrary
2018-03-11 17:40:03 +00:00
Alex Dadgar
483e011720
Merge pull request #3892 from hashicorp/f-tunnel
...
Client RPC Endpoints, Server Routing and Streaming RPCs
2018-02-20 16:35:42 -08:00
Alex Dadgar
8b86e64fd8
Show HTTP request method
2018-02-16 15:55:26 -08:00
Alex Dadgar
aa98f8ba7b
Enhance API pkg to utilize Server's Client Tunnel
...
This PR enhances the API package by having client only RPCs route
through the server when they are low cost and for filesystem access to
first attempt a direct connection to the node and then falling back to
a server routed request.
2018-02-15 13:59:03 -08:00
Alex Dadgar
38b695b69c
feedback and rebasing
2018-02-15 13:59:03 -08:00
Alex Dadgar
c5e9ebb656
Use helper for forwarding
2018-02-15 13:59:03 -08:00
Alex Dadgar
9117ef4650
HTTP agent
2018-02-15 13:59:03 -08:00
Alex Dadgar
d7029965ca
Server side impl + touch ups
2018-02-15 13:59:02 -08:00
Alex Dadgar
14845bb918
Refactor determining the handler for a node id call
2018-02-15 13:59:02 -08:00
Alex Dadgar
e685211892
Code review feedback
2018-02-15 13:59:02 -08:00
Alex Dadgar
f5f43218f5
HTTP and tests
2018-02-15 13:59:02 -08:00
Alex Dadgar
9a5569678c
Client Stat/List impl
2018-02-15 13:59:02 -08:00
Alex Dadgar
8854b35b34
Agent logs
2018-02-15 13:59:02 -08:00
Alex Dadgar
69def2ff22
Server tests of logs
2018-02-15 13:59:02 -08:00
Alex Dadgar
ddd67f5f11
Server streaming
2018-02-15 13:59:01 -08:00
Alex Dadgar
ca9379be09
Logs over RPC w/ lots to touch up
2018-02-15 13:59:01 -08:00
Alex Dadgar
993727c28f
Use in-mem rpc
2018-02-15 13:59:01 -08:00
Alex Dadgar
d5a834b801
fix lint
2018-02-15 13:59:01 -08:00
Alex Dadgar
9bc75f0ad4
Fix manager tests and make testagent recover from port conflicts
2018-02-15 13:59:01 -08:00
Alex Dadgar
8dcda29c21
Use nomad UUID
2018-02-15 13:59:00 -08:00
Alex Dadgar
71029b6329
Test http
2018-02-15 13:59:00 -08:00
Alex Dadgar
6dd1c9f49d
Refactor
2018-02-15 13:59:00 -08:00
Alex Dadgar
ad7bc0c6bd
Server can forward ClientStats.Stats
2018-02-15 13:59:00 -08:00
Alex Dadgar
1472b943d6
Stats Endpoint
2018-02-15 13:59:00 -08:00
Kyle Havlovitz
54b691f538
Merge pull request #3852 from hashicorp/autopilot-cleanup
...
Clean up some leftover autopilot differences from Consul
2018-02-14 10:42:32 -08:00
Preetha
df6400222b
Merge pull request #3868 from hashicorp/f-server-side-restarts
...
server side rescheduling
2018-02-13 20:09:51 -06:00
Kyle Havlovitz
709b693d39
Clean up some leftover autopilot differences from Consul
2018-02-08 10:27:26 -08:00
Mahmood Ali
bebafb5234
Add tags option to datadog telemetry
...
Expose an global tags option in telemetry config for dogstatsd, for
purposes of distinguishing between multiple nomad cluster metrics.
2018-02-06 12:08:37 -05:00
Preetha Appan
4a78c5bc84
Fix unit test
2018-01-31 09:56:53 -06:00
Preetha Appan
1f834d1a31
Add reschedule policy to API, and HCL parsing support.
2018-01-31 09:56:53 -06:00
Kyle Havlovitz
0eb0acacdc
Fix remaining issues with autopilot change
2018-01-30 15:21:28 -08:00
Kyle Havlovitz
2ccf565bf6
Refactor redundancy_zone/upgrade_version out of client meta
2018-01-29 20:03:38 -08:00
Chelsea Komlo
d09cc2a69f
Merge pull request #3492 from hashicorp/f-client-tls-reload
...
Client/Server TLS dynamic reload
2018-01-23 05:51:32 -05:00
Michael Schurter
694b547a6b
Merge pull request #3682 from hashicorp/b-3681-always-set-driver-ip
...
Always advertise driver IP when in driver mode
2018-01-22 16:41:34 -08:00
Chelsea Holland Komlo
7d3c240871
swap raft layer tls wrapper
2018-01-19 17:00:15 -05:00
Michael Schurter
8a0cf66822
Improve invalid port error message for services
...
Related to #3681
If a user specifies an invalid port *label* when using
address_mode=driver they'll get an error message about the label being
an invalid number which is very confusing.
I also added a bunch of testing around Service.AddressMode validation
since I was concerned by the linked issue that there were cases I was
missing. Unfortunately when address_mode=driver is used there's only so
much validation that can be done as structs/structs.go validation never
peeks into the driver config which would be needed to verify the port
labels/map.
2018-01-18 15:35:24 -08:00
Michael Schurter
447dc5bbd3
Fix test
2018-01-18 15:35:24 -08:00
Michael Schurter
583e17fad5
Always advertise driver IP when in driver mode
...
Fixes #3681
When in drive address mode Nomad should always advertise the driver's IP
in Consul even when no network exists. This matches the 0.6 behavior.
When in host address mode Nomad advertises the alloc's network's IP if
one exists. Otherwise it lets Consul determine the IP.
I also added some much needed logging around Docker's network discovery.
2018-01-18 15:35:24 -08:00
Kyle Havlovitz
8d41f4ad40
Formatting/test adjustments
2018-01-18 15:03:35 -08:00
Kyle Havlovitz
12ff22ea70
Merge branch 'master' into autopilot
2018-01-18 13:29:25 -08:00
Chelsea Holland Komlo
35466a331a
fixing up raft reload tests
...
close second goroutine in raft-net
2018-01-17 10:29:15 -05:00
Michael Schurter
57eb128dcf
Merge pull request #3718 from hashicorp/b-3713-fix-check-restart
...
Fix service.check_restart stanza propagation
2018-01-16 16:39:42 -08:00