Commit graph

350 commits

Author SHA1 Message Date
James Rasell dab8282be5
Merge pull request #8589 from hashicorp/f-gh-5718
driver/docker: allow configurable pull context timeout setting.
2020-08-14 16:07:59 +02:00
James Rasell bc42cd2e5e
driver/docker: allow configurable pull context timeout setting.
Pulling large docker containers can take longer than the default
context timeout. Without a way to change this it is very hard for
users to utilise Nomad properly without hacky work arounds.

This change adds an optional pull_timeout config parameter which
gives operators the possibility to account for increase pull times
where needed. The infra docker image also has the option to set a
custom timeout to keep consistency.
2020-08-12 08:58:07 +01:00
Tim Gross fb27082e5c
RPC errors must be wrapped in order to wrap internal errors (#8632)
The CSI client RPC uses error wrapping to detect the type of error bubbling up
from plugins, but if the errors we get aren't wrapped at each layer, we can't
unwrap the inner error.

Also eliminates some unused args.
2020-08-11 09:13:52 -04:00
Seth Hoenig 6ab3d21d2c consul: validate script type when ussing check thresholds 2020-08-10 14:08:09 -05:00
Drew Bailey b296558b8e
oss compoments for multi-vault namespaces
adds in oss components to support enterprise multi-vault namespace feature

upgrade specific doc on vault multi-namespaces

vault docs

update test to reflect new error
2020-07-24 10:14:59 -04:00
Mahmood Ali ab25616add
Merge pull request #7234 from derekmarcotte/dm-freebsd
Fix undefined: getEphemeralPortRange error on FreeBSD.
2020-07-24 10:01:41 -04:00
Mahmood Ali e226ca9c4f revert changes from earlier change 2020-06-12 14:02:43 -04:00
Mahmood Ali 69bb42acf8 tests: prefix agent logs to identify agent sources 2020-06-07 16:38:11 -04:00
Mahmood Ali 9eb13ae144 basic snapshot restore 2020-06-07 15:46:23 -04:00
Mahmood Ali de44d9641b
Merge pull request #8047 from hashicorp/f-snapshot-save
API for atomic snapshot backups
2020-06-01 07:55:16 -04:00
Mahmood Ali 19cc84ec05
Apply suggestions from code review
Co-authored-by: Drew Bailey <2614075+drewbailey@users.noreply.github.com>
2020-05-31 21:29:17 -04:00
Mahmood Ali fc6c3620c4 tests: log to stderr directly
Go 1.14 now streams t.Log output as it happens [1], so we no longer need
to maintain our log STDOUT helper.

I preserved the interface, so `testlog` still takes in a `*testing.T`
though unused. Changing it requires so too many changes that I didn't
want to make quite yet.

[1] https://golang.org/doc/go1.14#go-test
2020-05-27 08:42:29 -04:00
Mahmood Ali 2588b3bc98 cleanup driver eventor goroutines
This fixes few cases where driver eventor goroutines are leaked during
normal operations, but especially so in tests.

This change makes few modifications:

First, it switches drivers to use `Context`s to manage shutdown events.
Previously, it relied on callers invoking `.Shutdown()` function that is
specific to internal drivers only and require casting.  Using `Contexts`
provide a consistent idiomatic way to manage lifecycle for both internal
and external drivers.

Also, I discovered few places where we don't clean up a temporary driver
instance in the plugin catalog code, where we dispense a driver to
inspect and validate the schema config without properly cleaning it up.
2020-05-26 11:04:04 -04:00
Mahmood Ali 5585cb347e Add snapshot helper
Effectively Copied from https://github.com/hashicorp/consul/tree/v1.8.0-beta1/snapshot

With addition of overall snapshot checksum file
2020-05-21 20:04:38 -04:00
Mahmood Ali 751f337f1c Update hcl2 vendoring
The hcl2 library has moved from http://github.com/hashicorp/hcl2 to https://github.com/hashicorp/hcl/tree/hcl2.

This updates Nomad's vendoring to start using hcl2 library.  Also
updates some related libraries (e.g. `github.com/zclconf/go-cty/cty` and
`github.com/apparentlymart/go-textseg`).
2020-05-19 15:00:03 -04:00
Mahmood Ali b4fa8e9588 codec: we use hashicorp/go-msgpack exclusively
No need to maintain two msgpack handles!
2020-05-11 14:05:29 -04:00
James Rasell f18d8cab23
Merge pull request #7558 from hashicorp/b-ensure-correct-plugin-version-mapping
plugin: ensure plugin loader maps correct API version to type.
2020-04-01 12:34:24 +02:00
Yoan Blanc 225c9c1215 fixup! vendor: explicit use of hashicorp/go-msgpack
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-03-31 09:48:07 -04:00
Yoan Blanc 761d014071 vendor: explicit use of hashicorp/go-msgpack
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-03-31 09:45:21 -04:00
James Rasell 5dd4130a47
plugin: ensure plugin loader maps correct API version to type.
The plugin loader supplies a version map to ensure the Nomad agent
can support the plugins at the version they specify. The map was
incorrectly mapping the driver type to the device API supported
version identifier. This currently does not cause a bug as both
device and driver versions are the same string. This could cause
problems in the future, however, if either plugin interface were
to change and require version updates.
2020-03-31 12:31:56 +02:00
Michael Schurter 5ff458e840 agent: prevent XSS by controlling Content-Type 2020-03-25 09:45:43 -04:00
Lang Martin 887e1f28c9 csi: CLI for volume status, registration/deregistration and plugin status (#7193)
* command/csi: csi, csi_plugin, csi_volume

* helper/funcs: move ExtraKeys from parse_config to UnusedKeys

* command/agent/config_parse: use helper.UnusedKeys

* api/csi: annotate CSIVolumes with hcl fields

* command/csi_plugin: add Synopsis

* command/csi_volume_register: use hcl.Decode style parsing

* command/csi_volume_list

* command/csi_volume_status: list format, cleanup

* command/csi_plugin_list

* command/csi_plugin_status

* command/csi_volume_deregister

* command/csi_volume: add Synopsis

* api/contexts/contexts: add csi search contexts to the constants

* command/commands: register csi commands

* api/csi: fix struct tag for linter

* command/csi_plugin_list: unused struct vars

* command/csi_plugin_status: unused struct vars

* command/csi_volume_list: unused struct vars

* api/csi: add allocs to CSIPlugin

* command/csi_plugin_status: format the allocs

* api/allocations: copy Allocation.Stub in from structs

* nomad/client_rpc: add some error context with Errorf

* api/csi: collapse read & write alloc maps to a stub list

* command/csi_volume_status: cleanup allocation display

* command/csi_volume_list: use Schedulable instead of Healthy

* command/csi_volume_status: use Schedulable instead of Healthy

* command/csi_volume_list: sprintf string

* command/csi: delete csi.go, csi_plugin.go

* command/plugin: refactor csi components to sub-command plugin status

* command/plugin: remove csi

* command/plugin_status: remove csi

* command/volume: remove csi

* command/volume_status: split out csi specific

* helper/funcs: add RemoveEqualFold

* command/agent/config_parse: use helper.RemoveEqualFold

* api/csi: do ,unusedKeys right

* command/volume: refactor csi components to `nomad volume`

* command/volume_register: split out csi specific

* command/commands: use the new top level commands

* command/volume_deregister: hardwired type csi for now

* command/volume_status: csiFormatVolumes rescued from volume_list

* command/plugin_status: avoid a panic on no args

* command/volume_status: avoid a panic on no args

* command/plugin_status: predictVolumeType

* command/volume_status: predictVolumeType

* nomad/csi_endpoint_test: move CreateTestPlugin to testing

* command/plugin_status_test: use CreateTestCSIPlugin

* nomad/structs/structs: add CSIPlugins and CSIVolumes search consts

* nomad/state/state_store: add CSIPlugins and CSIVolumesByIDPrefix

* nomad/search_endpoint: add CSIPlugins and CSIVolumes

* command/plugin_status: move the header to the csi specific

* command/volume_status: move the header to the csi specific

* nomad/state/state_store: CSIPluginByID prefix

* command/status: rename the search context to just Plugins/Volumes

* command/plugin,volume_status: test return ids now

* command/status: rename the search context to just Plugins/Volumes

* command/plugin_status: support -json and -t

* command/volume_status: support -json and -t

* command/plugin_status_csi: comments

* command/*_status: clean up text

* api/csi: fix stale comments

* command/volume: make deregister sound less fearsome

* command/plugin_status: set the id length

* command/plugin_status_csi: more compact plugin health

* command/volume: better error message, comment
2020-03-23 13:58:30 -04:00
Danielle Lancashire 6ee038d515 helper/mount: Add mount helper package
This package introduces some basic abstractions around mount utilties
for various platforms. Initially it only supports linux, but the plan is
to expand this as CSI expands across to other platforms.
2020-03-23 13:58:29 -04:00
Danielle Lancashire 973a61935e helper: Add initial grpc logging middleware 2020-03-23 13:58:29 -04:00
Derek Marcotte aad870c9d8 Fix undefined: getEphemeralPortRange error on FreeBSD. 2020-02-27 14:54:55 -05:00
Mahmood Ali a8d6950007 Remove rkt as a built-in driver
Rkt has been archived and is no longer an active project:
* https://github.com/rkt/rkt
* https://github.com/rkt/rkt/issues/4024

The rkt driver will continue to live as an external plugin.
2020-02-26 22:16:41 -05:00
Mahmood Ali 98ad59b1de update rest of consul packages 2020-02-16 16:25:04 -06:00
Mahmood Ali f9ac62bb78
Update helper/pool/pool.go
Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
2020-02-06 19:24:58 -05:00
Mahmood Ali e106d373b2 rpc: Use MultiplexV2 for connections
MultiplexV2 is a new connection multiplex header that supports multiplex both
RPC and streaming requests over the same Yamux connection.

MultiplexV2 was added in 0.8.0 as part of
https://github.com/hashicorp/nomad/pull/3892 .  So Nomad 0.11 can expect it to
be supported.  Though, some more rigorous testing is required before merging
this.

I want to call out some implementation details:

First, the current connection pool reuses the Yamux stream for multiple RPC calls,
and doesn't close them until an error is encountered.  This commit doesn't
change it, and sets the `RpcNomad` byte only at stream creation.

Second, the StreamingRPC session gets closed by callers and cannot be reused.
Every StreamingRPC opens a new Yamux session.
2020-02-03 19:31:39 -05:00
Mahmood Ali a4d58c3178 pool: Clear connection before releasing
This to be consistent with other connection clean up handler as well as consul's https://github.com/hashicorp/consul/blob/v1.6.3/agent/pool/pool.go#L468-L479 .
2020-02-03 12:41:11 -05:00
Mahmood Ali 5983226cb9 Some fixes to connection pooling
Pick up some fixes from Consul:

* If a stream returns an EOF error, clear session from cache/pool and start a
new one.
* Close the codec when closing StreamClient
2020-01-31 15:31:16 -05:00
Drew Bailey 1b8af920f3
address pr feedback 2020-01-09 15:15:09 -05:00
Drew Bailey 7bbba613a5
prevent doubly wrapping with rpc error 2020-01-09 15:15:07 -05:00
Seth Hoenig f0c3dca49c tests: swap lib/freeport for tweaked helper/freeport
Copy the updated version of freeport (sdk/freeport), and tweak it for use
in Nomad tests. This means staying below port 10000 to avoid conflicts with
the lib/freeport that is still transitively used by the old version of
consul that we vendor. Also provide implementations to find ephemeral ports
of macOS and Windows environments.

Ports acquired through freeport are supposed to be returned to freeport,
which this change now also introduces. Many tests are modified to include
calls to a cleanup function for Server objects.

This should help quite a bit with some flakey tests, but not all of them.
Our port problems will not go away completely until we upgrade our vendor
version of consul. With Go modules, we'll probably do a 'replace' to swap
out other copies of freeport with the one now in 'nomad/helper/freeport'.
2019-12-09 08:37:32 -06:00
Nick Ethier 729dd9018c
docker: set default cpu cfs period (#6737)
* docker: set default cpu cfs period

Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
2019-11-19 19:05:15 -05:00
Michael Schurter be00d0cc41 test: assert semvers are *not* compared lexically 2019-11-19 10:59:40 -08:00
Michael Schurter 796758b8a5 core: add semver constraint
The existing version constraint uses logic optimized for package
managers, not schedulers, when checking prereleases:

- 1.3.0-beta1 will *not* satisfy ">= 0.6.1"
- 1.7.0-rc1 will *not* satisfy ">= 1.6.0-beta1"

This is due to package managers wishing to favor final releases over
prereleases.

In a scheduler versions more often represent the earliest release all
required features/APIs are available in a system. Whether the constraint
or the version being evaluated are prereleases has no impact on
ordering.

This commit adds a new constraint - `semver` - which will use Semver
v2.0 ordering when evaluating constraints. Given the above examples:

- 1.3.0-beta1 satisfies ">= 0.6.1" using `semver`
- 1.7.0-rc1 satisfies ">= 1.6.0-beta1" using `semver`

Since existing jobspecs may rely on the old behavior, a new constraint
was added and the implicit Consul Connect and Vault constraints were
updated to use it.
2019-11-19 08:40:19 -08:00
Charlie Voiselle 835831a3d8 Added service wrapper code (#6220)
This is the basic code to add the Windows Service Manager hooks to Nomad.

Includes vendoring golang.org/x/sys/windows/svc and added Docs:
* guide for installing as a windows service.
* configuration for logging to file from PR #6429
2019-11-11 15:16:07 -05:00
Drew Bailey 786989dbe3
New monitor pkg for shared monitor functionality
Adds new package that can be used by client and server RPC endpoints to
facilitate monitoring based off of a logger

clean up old code

small comment about write

rm old comment about minsize

rename to Monitor

Removes connection logic from monitor command

Keep connection logic in endpoints, use a channel to send results from
monitoring

use new multisink logger and interfaces

small test for dropped messages

update go-hclogger and update sink/intercept logger interfaces
2019-11-05 09:51:49 -05:00
Drew Bailey 0de94466b2
Display error when remote side ended monitor
multisink logger

remove usage of logwriter
2019-11-05 09:51:48 -05:00
Tim Gross 3ac3ceb2cc test: add NOMAD_TEST_LOG_LEVEL env var to tune log levels 2019-08-30 13:25:36 -04:00
Mahmood Ali f98d4ee3f1 tests: enable raw_exec driver 2019-08-29 20:26:50 -04:00
Tim Gross 2a592a2e0c
agent: add optional param to -dev flag for connect (#6126)
Consul Connect must route traffic between network namespaces through a
public interface (i.e. not localhost). In order to support testing in
dev mode, users needed to manually set the interface which doesn't
make for a smooth experience.

This commit adds a facility for adding optional parameters to the
`nomad agent -dev` flag and uses it to add a `-dev=connect` flag that
binds to a public interface on the host.
2019-08-14 15:29:37 -04:00
Michael Schurter fb487358fb
connect: add group.service stanza support 2019-07-31 01:04:05 -04:00
Jasmine Dahilig 2157f6ddf1
add formatting for hcl parsing error messages (#5972) 2019-07-19 10:04:39 -07:00
Mahmood Ali cb92b5d162 Add a test for unknown variables 2019-06-17 12:25:43 -04:00
Mahmood Ali f38c59baa0 tests: handle unicode matches
naive implementation should focus on ascii characters only
2019-05-21 09:41:23 -04:00
Mahmood Ali 4013847ada escapingio: handle stalled readers
Handle stalled readers (e.g. network write got stalled), by having
escaping io have a buffer so it looks for escaped characters in the
stream.

This simplifies the implementation considerably, as we can look
for new lines followed by escaped characters directly.

Also, we add a test to ensure that any partial results are flushed to
readers.
2019-05-17 11:58:31 -04:00
Mahmood Ali 5bd946d790 escapingio: thread-safe struct for escaped chars
Use a helper struct for capturing escaped characters that's thread safe.
2019-05-17 10:22:24 -04:00
Mahmood Ali b6d68e19fa avoid printing counts in tests 2019-05-16 17:07:32 -04:00
Mahmood Ali 1293a8511c
Fix typos and comments
Co-Authored-By: Michael Schurter <michael.schurter@gmail.com>
2019-05-16 17:06:03 -04:00
Mahmood Ali b02852ef62 Add a escaping reader that mimics ssh behavior
Adds an escaping reading that mimics ssh handling of input escape
sequences.

The reader parses chunks to look for \n~
2019-05-16 16:22:52 -04:00
Mahmood Ali b4d84fd6a9 Allow compiling without nvidia integration
nvidia library use of dynamic library seems to conflict with alpine and
musl based OSes.  This adds a `nonvidia` tag to allow compiling nomad
for alpine images.

The nomad releases currently only support glibc based OS environments,
so we default to compiling with nvidia.
2019-04-10 09:19:12 -04:00
Michael Schurter cd87afd15f e2e: add NomadAgent and basic client state test
The e2e test code is absolutely hideous and leaks processes and files
on disk. NomadAgent seems useful, but the clientstate e2e tests are very
messy and slow. The last test "Corrupt" is probably the most useful as
it explicitly corrupts the state file whereas the other tests attempt to
reproduce steps thought to cause corruption in earlier releases of
Nomad.
2019-03-21 07:14:34 -07:00
Mahmood Ali bb32ba8784
Support driver config fields being set to nil (#5391)
To pick up https://github.com/hashicorp/hcl2/pull/90
2019-03-05 21:47:06 -05:00
Preetha 911c93f7bd
Merge pull request #5350 from hashicorp/b-json-logging-meta
Support json logging for CLI output for agent
2019-02-22 13:40:56 -06:00
Preetha Appan 0149bbc608
cli Ui implementation that logs to a hclogger
This makes it so any messages output to the UI *after* the agent has started
will be logged in json format correctly
2019-02-19 17:53:14 -06:00
Mahmood Ali 46cd3c3f55 drivers: restore port_map old json support
This ensures that `port_map` along with other block like attribute
declarations (e.g. ulimit, labels, etc) can handle various hcl and json
syntax that was supported in 0.8.

In 0.8.7, the following declarations are effectively equivalent:

```
// hcl block
port_map {
  http = 80
  https = 443
}

// hcl assignment
port_map = {
  http  = 80
  https = 443
}

// json single element array of map (default in API response)
{"port_map": [{"http": 80, "https": 443}]}

// json array of individual maps (supported accidentally iiuc)
{"port_map: [{"http": 80}, {"https": 443}]}
```

We achieve compatbility by using `NewAttr("...", "list(map(string))",
false)` to be serialized to a `map[string]string` wrapper, instead of using
`BlockAttrs` declaration.  The wrapper merges the list of maps
automatically, to ease driver development.

This approach is closer to how v0.8.7 implemented the fields [1][2], and
despite its verbosity, seems to perserve 0.8.7 behavior in hcl2.

This is only required for built-in types that have backward
compatibility constraints.  External drivers should use `BlockAttrs`
instead, as they see fit.

[1] https://github.com/hashicorp/nomad/blob/v0.8.7/client/driver/docker.go#L216
[2] https://github.com/hashicorp/nomad/blob/v0.8.7/client/driver/docker.go#L698-L700
2019-02-16 11:37:33 -05:00
Mahmood Ali f7102cd01d
tests: add hcl task driver config parsing tests (#5314)
* drivers: add config parsing tests

Add basic tests for parsing and encoding task config.

* drivers/docker: fix some config declarations

* refactor and document config parse helpers
2019-02-12 14:46:37 -05:00
Michael Schurter 3b84e08fa4
Merge pull request #5297 from hashicorp/b-docker-logging
Docker: Fix logging config parsing
2019-02-11 06:57:52 -08:00
mnachury aef18aa4ca Fix executable check on windows 2019-02-08 16:05:14 -05:00
Michael Schurter e1e4b10884 docker: fix logging config parsing
Fixes
https://groups.google.com/d/topic/nomad-tool/B3Uo6Kns2BI/discussion
2019-02-04 11:07:57 -08:00
Michael Schurter 9bf4b38ab3 plugins: update hclutils test
The test used old local copies of Docker structs and appeared to be
testing an outdated approach to task config decoding.

Updated to use real Docker structs so we can do end-to-end unit testing
of real Docker task configs.
2019-02-04 11:07:57 -08:00
Michael Schurter 6c0cc65b2e simplify hcl2 parsing helper
No need to pass in the entire eval context
2019-02-04 11:07:57 -08:00
Alex Dadgar b2af34141e
Update comment on int8ToPtr 2019-01-30 12:23:43 -08:00
Alex Dadgar 41265d4d61 Change types of weights on spread/affinity 2019-01-30 12:20:38 -08:00
Mahmood Ali 389e043129 drivers: pass logger through driver plugin client
This fixes a panic whenever driver plugin attempts to log a message.
2019-01-25 09:38:41 -05:00
Alex Dadgar 003aa2a69c nvidia device plugin docs 2019-01-23 10:58:46 -08:00
Michael Schurter 32daa7b47b goimports until make check is happy 2019-01-23 06:27:14 -08:00
Michael Schurter be0bab7c3f move pluginutils -> helper/pluginutils
I wanted a different color bikeshed, so I get to paint it
2019-01-22 15:50:08 -08:00
Alex Dadgar 4bdccab550 goimports 2019-01-22 15:44:31 -08:00
Mahmood Ali 7897dec9d9 api: move formatFloat function
`helpers.FormatFloat` function is only used in `api`.  Moving it and
marking it as private.  We can re-export it if we find value later.
2019-01-18 15:31:31 -05:00
Michael Schurter 40975633ae Update helper/testtask/testtask_windows.go
Co-Authored-By: dantoml <dani@tomlinson.io>
2019-01-17 18:43:14 +01:00
Danielle Tomlinson e6c0738b65 Expand unix build definition 2019-01-17 18:43:13 +01:00
Danielle Tomlinson 96b1d5d20c testtask: Build on windows 2019-01-17 18:43:13 +01:00
Chris Baker e9db2ae822 Merge branch 'master' of github.com:hashicorp/nomad into f-1157-validate-node-meta-variables 2019-01-09 18:56:49 +00:00
Mahmood Ali 0dfa93a3c1 appease linter 2019-01-08 10:58:49 -05:00
Chris Baker bf00f93d87 moved interp key regex out to a helper function 2019-01-08 00:11:47 +00:00
Michael Schurter 337d07fdd8 client/state: improve upgradeTaskBucket error handling
And add a test
2018-12-19 10:39:27 -08:00
Mahmood Ali fa9b9028a5 Use max 3 precision in displaying floats
When formating floats in `nomad node status`, use a maximum precision of
3.
2018-12-10 12:18:24 -05:00
Michael Schurter 4d92603340 boltdd: return error on use-after-Close
Return the same error as boltdb instead of panic'ing.
2018-11-15 14:15:37 -08:00
Mahmood Ali 9da19c6450 address review comments 2018-10-30 13:58:52 -04:00
Mahmood Ali 4937095389 Allow artifacts checksum interpolation
Fixes https://github.com/hashicorp/nomad/issues/4814
2018-10-30 13:24:30 -04:00
Michael Schurter e060174130 ar: fix leader handling, state restoring, and destroying unrun ARs
* Migrated all of the old leader task tests and got them passing
* Refactor and consolidate task killing code in AR to always kill leader
  tasks first
* Fixed lots of issues with state restoring
* Fixed deadlock in AR.Destroy if AR.Run had never been called
* Added a new in memory statedb for testing
2018-10-19 09:45:45 -07:00
Nick Ethier 8b876e1cce fix package references after drivers/base subpackage removed 2018-10-16 16:53:31 -07:00
Nick Ethier 0e3f85222a driver/raw_exec: port existing raw_exec tests and add some testing utilities 2018-10-16 16:53:31 -07:00
Michael Schurter 4236255686 lots of comment/log fixes 2018-10-16 16:53:30 -07:00
Michael Schurter 820af27171 wrap boltdb in a write deduplicator
Saves a tiny bit of cpu and some IO. Sadly doesn't prevent all IO on
duplicate writes as the transactions are still created and committed.

$ go test -bench=. -benchmem
goos: linux
goarch: amd64
pkg: github.com/hashicorp/nomad/helper/boltdd
BenchmarkWriteDeduplication_On-4             500           4059591 ns/op           23736 B/op         56 allocs/op
BenchmarkWriteDeduplication_Off-4            300           4115319 ns/op           25942 B/op         55 allocs/op
2018-10-16 16:53:30 -07:00
Michael Schurter ae89b7da95 reimplement success state for tr hooks and state persistence
splits apart local and remote persistence

removes some locking *for now*
2018-10-16 16:53:29 -07:00
Alex Dadgar cbb5f21112 New parser and comparison 2018-10-12 15:25:34 -07:00
oleksii.shyman b4a4b395e3 Introduce nvidia-plugin fingerprinting
- created go-nvml wrapper for fingerprinting
  - added fingerprinting feature to nvidia-plugin
2018-10-03 15:11:56 -07:00
Alex Dadgar 9971b3393f yamux 2018-09-17 14:22:40 -07:00
Alex Dadgar 7739ef51ce agent + consul 2018-09-13 10:43:40 -07:00
Michael Schurter 401ed92847 config: accept CA PEM files with extra whitespace
Previously we did a validation pass over CA PEM files before calling
Go's CertPool.AppendCertsFromPEM to provide more detailed error messages
than the stdlib provides.

Unfortunately our validation was overly strict and rejected valid CA
files. This is actually the reason the stdlib PEM parser doesn't return
meaningful errors: PEM files are extremely permissive and it's difficult
to tell the difference between invalid data and valid metadata.

This PR removes our custom validation as it would reject valid data and
the extra error messages were not useful in diagnosing the error
encountered.
2018-09-06 11:38:56 -07:00
Michael Schurter 6def5bc4f9 client: set host name when migrating over tls
Not setting the host name led the Go HTTP client to expect a certificate
with a DNS-resolvable name. Since Nomad uses `${role}.${region}.nomad`
names ephemeral dir migrations were broken when TLS was enabled.

Added an e2e test to ensure this doesn't break again as it's very
difficult to test and the TLS configuration is very easy to get wrong.
2018-09-05 17:24:17 -07:00
Alex Dadgar c6576ddac1 Fix make check errors 2018-09-04 16:03:52 -07:00
Chelsea Holland Komlo f5e631886f add signature algorithm to error message 2018-08-13 16:21:18 -04:00
Chelsea Holland Komlo ed21481ca1 rename signature algorithm type per code review feedback 2018-08-13 16:11:49 -04:00
Chelsea Holland Komlo 16ffb2e412 extract functionality for determining signature algorithm per code review feedback 2018-08-13 16:08:23 -04:00
Chelsea Holland Komlo 91edec5bf4 change string repr of signature algorithms to constants 2018-08-10 12:37:21 -04:00
Chelsea Holland Komlo 4b228b1919 remove redundant nil check 2018-08-10 12:37:21 -04:00
Chelsea Holland Komlo 3f1d54f628 add default case for empty TLS structs 2018-08-10 12:37:21 -04:00
Chelsea Holland Komlo 4755a65978 add comments 2018-08-10 12:37:21 -04:00
Chelsea Holland Komlo 86103d41d4 type safety for string keys 2018-08-10 12:37:21 -04:00
Chelsea Holland Komlo 31d6d00381 add simple getter for certificate 2018-08-10 12:37:21 -04:00
Chelsea Holland Komlo 568564f63f refactor to use golang built in api for certs 2018-08-10 12:37:21 -04:00
Chelsea Holland Komlo bb6c30ee3c add functionality to check if signature algorithm is supported in cipher suites 2018-08-10 12:37:21 -04:00
Chelsea Holland Komlo b92098fd08 change function signature to take entire tls config object 2018-08-10 12:37:21 -04:00
Nick Ethier a3be46b5ee
vendor: remove unused github.com/kardianos/osext 2018-07-05 11:04:12 -04:00
Charlie Voiselle 1560d0b893
Extend timeout based on user feedback
Closes https://github.com/hashicorp/nomad/issues/4439.
2018-06-21 15:27:56 -04:00
Chelsea Holland Komlo da712f4f47 fixup! more specific test assertion 2018-06-13 09:58:40 -04:00
Chelsea Holland Komlo dca7235ca5 add tests and improve should reload logic 2018-06-08 15:10:10 -04:00
Chelsea Holland Komlo de03ce8070 move logic to determine whether to reload tls configuration to tlsutil helper 2018-06-08 14:33:58 -04:00
Chelsea Holland Komlo 914d2257ef enable more tls 1.2 ciphers 2018-06-07 17:49:57 -04:00
Alex Dadgar de98774f2c Add test and docs 2018-05-31 18:05:03 -07:00
Alex Dadgar 446fc64850
Merge branch 'master' into f-tls-parse-certs 2018-05-30 17:25:50 +00:00
Chelsea Holland Komlo 3edf309096 fixup! clearify docs and group similar TLS fields 2018-05-29 21:30:49 -04:00
Chelsea Holland Komlo 498b57036d refactor to remove duplication 2018-05-29 18:47:25 -04:00
Chelsea Holland Komlo 1dc14d8e0d handle parsing multiple certificates in a pem file 2018-05-29 18:25:43 -04:00
Chelsea Holland Komlo 9156556555 remove unnecessary type conversation 2018-05-29 17:07:38 -04:00
Chelsea Holland Komlo 521f8d3fb4 parse CA certificate to catch more specific errors 2018-05-25 18:14:32 -04:00
Chelsea Holland Komlo 19e4a5489b add support for tls PreferServerCipherSuites
add further tests for tls configuration
2018-05-25 13:20:00 -04:00
Chelsea Holland Komlo 38f611a7f2 refactor NewTLSConfiguration to pass in verifyIncoming/verifyOutgoing
add missing fields to TLS merge method
2018-05-23 18:35:30 -04:00
Chelsea Komlo 687c26093c
Merge pull request #4269 from hashicorp/f-tls-remove-weak-standards
Configurable TLS cipher suites and versions; disallow weak ciphers
2018-05-11 08:11:46 -04:00
Charlie Voiselle fd952eefbc Added deferred cancel to prevent context leaks 2018-05-10 18:52:54 -04:00
Chelsea Holland Komlo 44f536f18e add support for configurable TLS minimum version 2018-05-09 18:07:12 -04:00
Chelsea Holland Komlo 796bae6f1b allow configurable cipher suites
disallow 3DES and RC4 ciphers

add documentation for tls_cipher_suites
2018-05-09 17:15:31 -04:00
Charlie Voiselle 6e58e1ff4b
Merge branch 'master' into b-extend-win-cpu-fingerprint-timeout 2018-05-09 16:23:14 -04:00
Charlie Voiselle 62f99cc629 Addressed review comments 2018-05-09 13:21:35 -04:00
Charlie Voiselle d64b02f07d Override 3 sec. WMI timeout in gopsutil
The default timeout is too short for some overburdened or resource
constrained machines to complete the WMI query before the context
deadline expires.  This causes them to be unable to fingerprint the CPU
properly.
2018-05-08 17:00:31 -04:00
Charlie Voiselle 893b01158c
Fix the CPU Information error message
The new version of gopsutil introduces a 3 second timeout that could come up as an error here; however, we are outputting the wrong variable and eating the error.
2018-05-08 14:11:29 -04:00
Seth Vargo df4fe7e76c
Set user-agent when talking to GCE metadata 2018-04-10 10:36:46 -04:00
Michael Schurter 187716944f testlog: override testlogger with envvar 2018-03-21 16:49:48 -07:00
Josh Soref 0790a58fb7 spelling: unknown 2018-03-11 19:07:31 +00:00
Alex Dadgar f9cf642436 Client tls 2018-02-15 15:22:57 -08:00
Alex Dadgar aa98f8ba7b Enhance API pkg to utilize Server's Client Tunnel
This PR enhances the API package by having client only RPCs route
through the server when they are low cost and for filesystem access to
first attempt a direct connection to the node and then falling back to
a server routed request.
2018-02-15 13:59:03 -08:00
Alex Dadgar 2c0ad26374 New RPC Modes and basic setup for streaming RPC handlers 2018-02-15 13:59:01 -08:00
Alex Dadgar 6dd1c9f49d Refactor 2018-02-15 13:59:00 -08:00
Alex Dadgar 940a2df8a1 Pull inmem codec to helper 2018-02-15 13:59:00 -08:00
Chelsea Komlo d09cc2a69f
Merge pull request #3492 from hashicorp/f-client-tls-reload
Client/Server TLS dynamic reload
2018-01-23 05:51:32 -05:00
Charlie Voiselle 0f782acfda Allow . in Environment Variable Names
From [https://github.com/appc/spec/blob/master/spec/aci.md](https://github.com/appc/spec/blob/master/spec/aci.md):

>environment (list of objects, optional) represents the app's environment variables (ACE can append). The listed objects must have two key-value pairs: name and value. The name must consist solely of letters, digits, and underscores '_' as outlined in IEEE Std 1003.1-2008, 2016 Edition, with practical considerations dictating that the name may also include periods '.' and hyphens '-'. The value is an arbitrary string. These values are not evaluated in any way, and no substitutions are made.

Dotted environment variables are frequently used as a part of the Spring Boot pattern. (re: ZD-6116)

This PR specifically doesn't address the conversion of hyphens (`-`) due to an issue with rkt [[Nomad GH # 2358]](https://github.com/hashicorp/nomad/issues/2358).
2018-01-22 13:59:37 -08:00
Chelsea Holland Komlo 649f86f094 refactor creating a new tls configuration 2018-01-16 08:02:39 -05:00
Michael Schurter 0baf168ed0 Improve naming and docs 2018-01-08 13:36:07 -08:00
Michael Schurter bc10061aa2 Logger backed by *testing.T
For capturing log output in tests and only displaying them on failure.

Pulled out of #3241
2018-01-08 12:53:58 -08:00
Preetha Appan d3110f21bd Changes service name validation logic to ignore any environment variables first. 2017-11-15 15:35:43 -06:00
Chelsea Komlo 2dfda33703 Nomad agent reload TLS configuration on SIGHUP (#3479)
* Allow server TLS configuration to be reloaded via SIGHUP

* dynamic tls reloading for nomad agents

* code cleanup and refactoring

* ensure keyloader is initialized, add comments

* allow downgrading from TLS

* initalize keyloader if necessary

* integration test for tls reload

* fix up test to assert success on reloaded TLS configuration

* failure in loading a new TLS config should remain at current

Reload only the config if agent is already using TLS

* reload agent configuration before specific server/client

lock keyloader before loading/caching a new certificate

* introduce a get-or-set method for keyloader

* fixups from code review

* fix up linting errors

* fixups from code review

* add lock for config updates; improve copy of tls config

* GetCertificate only reloads certificates dynamically for the server

* config updates/copies should be on agent

* improve http integration test

* simplify agent reloading storing a local copy of config

* reuse the same keyloader when reloading

* Test that server and client get reloaded but keep keyloader

* Keyloader exposes GetClientCertificate as well for outgoing connections

* Fix spelling

* correct changelog style
2017-11-14 17:53:23 -08:00
Alex Dadgar 7522c56014 skip running test executables 2017-10-19 16:49:57 -07:00
Alex Dadgar c1cc51dbee sync 2017-10-13 14:36:02 -07:00
Michael Schurter a66c53d45a Remove structs import from api
Goes a step further and removes structs import from api's tests as well
by moving GenerateUUID to its own package.
2017-09-29 10:36:08 -07:00
Alex Dadgar 4173834231 Enable more linters 2017-09-26 15:26:33 -07:00
Michael Schurter bb8d5689d8 Add Header and Method support for HTTP checks 2017-08-17 16:44:21 -07:00
Alex Dadgar d86b3977b9 Fix alloc health with checks using interpolation
Fixes an issue in which the allocation health watcher was checking for
allocations health based on un-interpolated services and checks. Change
the interface for retrieving check information from Consul to retrieving
all registered services and checks by allocation. In the future this
will allow us to output nicer messages.

Fixes https://github.com/hashicorp/nomad/issues/2969
2017-08-07 16:27:08 -07:00
Alex Dadgar 2650bb1d12 Distinct Property supports arbitrary limit
This PR enhances the distinct_property constraint such that a limit can
be specified in the RTarget/value parameter. This allows constraints
such as:

```
constraint {
  distinct_property = "${meta.rack}"
  value = "2"
}
```

This restricts any given rack from running more than 2 allocations from
the task group.

Fixes https://github.com/hashicorp/nomad/issues/1146
2017-07-31 16:52:13 -07:00
Alex Dadgar 2471b86dec Show submit time 2017-07-07 12:07:07 -07:00
Alex Dadgar 0d42b5d421 initial reconciler 2017-07-07 12:01:17 -07:00
Michael Schurter 89abaf5ef4 Don't fail on first error detecting cpu stats
Since cpu.Counts() never returns an error this doesn't functionally
change anything today.
2017-07-03 14:51:02 -07:00
Michael Schurter fd9bef768f Move task env into execcontext
Also inject PATH into rkt commands since we're no longer appending host
env vars for it.
2017-05-23 13:53:34 -07:00
Alex Dadgar 9b9d69f577 Add a comment 2017-04-10 12:07:57 -07:00
Alex Dadgar 2321e8a4a0 Hash host ID so its stable and well distributed
This PR takes the host ID and runs it through a hash so that it is well
distributed. This makes it so that machines that report similar host IDs
are easily distinguished.

Instances of similar IDs occur on EC2 where the ID is prefixed and on
motherboards created in the same batch.

Fixes https://github.com/hashicorp/nomad/issues/2534
2017-04-10 11:44:51 -07:00
Alex Dadgar abce18749d Fix tests that exec nomad 2017-03-14 16:04:33 -07:00
Alex Dadgar a1a7941dec Various fixes
This PR:
* Uses Go 1.8 executable lookup
* Stores any err message from stats init method
* Allows overriding of Cpu Compute for hosts where it can't be detected
2017-03-14 12:56:31 -07:00
Michael Schurter 16adc44358 Round two of env var cleaning
Should bring us into conformance with IEEE Std 1003.1, 2004 Edition:
http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap08.html

1 alloc/op and ~80ns/op on my machine.
2017-03-08 16:46:13 -08:00
Alex Dadgar 5be806a3df Fix vet script and fix vet problems
This PR fixes our vet script and fixes all the missed vet changes.

It also fixes pointers being printed in `nomad stop <job>` and `nomad
node-status <node>`.
2017-02-27 16:00:19 -08:00
Diptanu Choudhury 7567209857 Making the job spec return api.Job 2017-02-16 13:52:39 -08:00
Sean Chittenden adb5be23ef
Add better verification of a host's HostID. 2017-02-02 16:24:32 -08:00
Diptanu Choudhury e927de02d2 Moved functions to helper from structs 2017-01-18 15:55:14 -08:00
Michael Schurter d36d716bf9 Add docs for generating example certificates 2016-11-15 17:22:54 -08:00
Michael Schurter 345a2640dc Fix tlsutil tests 2016-11-10 12:18:13 -08:00
Alex Dadgar 5fba85c092 get tlsutil tests to compile - need to regenerate the certificates 2016-11-09 14:41:08 -08:00
Michael Schurter ae680c9c81 Remove incorrectly committed line and wrong comment 2016-11-01 15:57:21 -07:00
Michael Schurter 536c2921e9 Remove ServerName because we verify based on region 2016-11-01 14:17:31 -07:00
Diptanu Choudhury 1a8fa8c8d5 Making Nomad TLS configs region aware 2016-11-01 11:55:29 -07:00
Diptanu Choudhury 7c61e115bd Moved tlsutil into helpers 2016-10-25 16:05:37 -07:00
Alex Dadgar 751aa114bf Fix Vault parsing of booleans 2016-10-10 18:04:39 -07:00
Alex Dadgar 4ff8edd2da Floor CPU MHz and total compute and mark hostname as unique 2016-06-22 15:01:36 -07:00
Sean Chittenden 21b84fc3e6
Memoize the CPU stats. Error if CPU fingerprinting fails. 2016-06-17 12:13:53 -07:00
Diptanu Choudhury 7fb507e810 Moving the clkspeed code to helper 2016-06-11 17:31:49 +02:00
Diptanu Choudhury ea65fd7925 Checking in the stats helper package 2016-06-10 23:46:33 +02:00
Sean Chittenden dc28ab0cb5
Speling police 2016-05-15 09:41:34 -07:00
Alex Dadgar b4bb28c425 Job diff using generic structures 2016-05-10 22:23:34 -07:00
Ivo Verberk cd301327ed Add comments and fix a typo 2016-04-11 23:09:09 +02:00
Ivo Verberk 542603dec6 Add helper to validate raw configuration data 2016-04-10 00:42:43 +02:00
Diptanu Choudhury 5439d4c23c Interpolating service tags 2016-03-28 15:02:00 -07:00
Aleksejs Sinicins 0ce9ea3bab Allow dashes in var names 2016-02-27 18:42:33 +02:00
Alex Dadgar 5018f5dd1e Only interpret vars wrapped in braces 2016-02-04 17:26:46 -08:00
Alex Dadgar 1ceb6f012a Fix a bunch of tests
Up timeouts

trusty travis beta

Increase timeouts
2016-01-20 16:03:53 -08:00
Alex Dadgar 3ba1c9b76b merge 2016-01-11 09:58:26 -08:00
Diptanu Choudhury f6fb42835e Using cgo dependencies to look up users 2015-12-15 11:12:13 -08:00
Chris Bednarski 1e3e120646 Added build flag to user-lookup so it does not build on windows 2015-12-01 14:28:12 -08:00
Chris Bednarski 9292a97062 Merge branch 'user-lookup-nocgo' of https://github.com/carlosdp/nomad into b-user-lookup 2015-12-01 13:44:56 -08:00
Carlos Diaz-Padron 55e49506f0 Refactor out userLookup to helper package
Also replaces user.Lookup in exec driver
2015-12-01 11:59:55 -08:00
Diptanu Choudhury 29915ddd16 Moving the args to helper 2015-11-26 14:13:19 -08:00
Chris Hines 5b2168bb12 Use package testtask and httptest.Server to make client/driver tests OS independent. 2015-11-25 15:56:20 -05:00
Chris Hines d7ebe099c1 Factor portable test task out of client/driver/executor. 2015-11-24 20:59:42 -05:00
Alex Dadgar 88391eedfb Search path 2015-11-03 13:26:09 -08:00
Alex Dadgar 5562fc7672 Create Spawn pkg that handles IPC with the spawn-daemon and update exec_linux to use that 2015-11-02 20:28:37 -08:00
Ivo Verberk c6e1b13b51 Fix vet warnings 2015-10-07 12:26:58 +02:00
Alex Dadgar 8230008114 Correctly scan the CWD for the nomad executable 2015-09-30 14:30:09 -07:00
Alex Dadgar 0e3f21b34f Linux executor with cgroup isolation support 2015-09-21 09:08:57 -07:00
Armon Dadgar adb771ed1d agent: moving functions into helpers 2015-08-23 16:57:54 -07:00