This PR replaces use of time.After with a safe helper function
that creates a time.Timer to use instead. The new function returns
both a time.Timer and a Stop function that the caller must handle.
Unlike time.NewTimer, the helper function does not panic if the duration
set is <= 0.
Previously we copied this library by hand to avoid vendor-ing a bunch of
files related to minimock. Now that we no longer vendor, just import the
library normally.
Also we might use more of the library for handling `time.After` uses,
for which this library provides a Context-based solution.
* driver: fix integer conversion error
The shared executor incorrectly parsed the user's group into int32 and
then cast to uint32 without bounds checking. This is harmless because
an out-of-bounds gid will throw an error later, but it triggers
security and code quality scans. Parse directly to uint32 so that we
get correct error handling.
* helper: fix integer conversion error
The autopilot flags helper incorrectly parses a uint64 to a uint which
is machine specific size. Although we don't have 32-bit builds, this
sets off security and code quality scaans. Parse to the machine sized
uint.
* driver: restrict bounds of port map
The plugin server doesn't constrain the maximum integer for port
maps. This could result in a user-visible misconfiguration, but it
also triggers security and code quality scans. Restrict the bounds
before casting to int32 and return an error.
* cpuset: restrict upper bounds of cpuset values
Our cpuset configuration expects values in the range of uint16 to
match the expectations set by the kernel, but we don't constrain the
values before downcasting. An underflow could lead to allocations
failing on the client rather than being caught earlier. This also make
security and code quality scanners happy.
* http: fix integer downcast for per_page parameter
The parser for the `per_page` query parameter downcasts to int32
without bounds checking. This could result in underflow and
nonsensical paging, but there's no server-side consequences for
this. Fixing this will silence some security and code quality scanners
though.
Before this change trying to run `nomad operator raft {info,logs}` on an
inuse raft.db would cause the command to block until the agent using
raft.db is closed.
After this change the command will block for 1s before returning a
(hopefully) helpful error message.
This change also sets the ReadOnly mode on the underlying BoltDb to
ensure diagnostics make no changes to the underlying store. We have no
evidence this has ever occurred, but it seems like a useful safety
measure.
No changelog added since this is a minor tweak in a "new" feature (it
was hidden in previous relases).
The `nomad operator raft logs` command uses a raft helper that reads
in the logs from raft and serializes them to JSON. The previous
implementation returned the slice of all logs and then serializes the
entire object. Update the helper to stream the log entries and then
serialize them as newline-delimited JSON.
* Include region and namespace in CLI output
* Add region and prefix matching for server members
* Add namespace and region API outputs to cluster metadata folder
* Add region awareness to WaitForClient helper function
* Add helper functions for SliceStringHasPrefix and StringHasPrefixInSlice
* Refactor test client agent generation
* Add tests for region
* Add changelog
When mTLS is enabled, only nomad servers of the region should access the
Raft RPC layer. Clients and servers in other regions should only use the
Nomad RPC endpoints.
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
Co-authored-by: Seth Hoenig <shoenig@hashicorp.com>
Track usage of incoming streams on a connection. Connections without
reference counts get marked as unused and reaped in a periodic job.
This fixes a bug where `alloc exec` and `alloc fs` sessions get terminated
unexpectedly. Previously, when a client heartbeats switches between
servers, the pool connection reaper eventually identifies the connection
as unused and closes it even if it has an active exec/fs sessions.
Fixes#10579
This PR implements first-class support for Nomad running Consul
Connect Mesh Gateways. Mesh gateways enable services in the Connect
mesh to make cross-DC connections via gateways, where each datacenter
may not have full node interconnectivity.
Consul docs with more information:
https://www.consul.io/docs/connect/gateways/mesh-gateway
The following group level service block can be used to establish
a Connect mesh gateway.
service {
connect {
gateway {
mesh {
// no configuration
}
}
}
}
Services can make use of a mesh gateway by configuring so in their
upstream blocks, e.g.
service {
connect {
sidecar_service {
proxy {
upstreams {
destination_name = "<service>"
local_bind_port = <port>
datacenter = "<datacenter>"
mesh_gateway {
mode = "<mode>"
}
}
}
}
}
}
Typical use of a mesh gateway is to create a bridge between datacenters.
A mesh gateway should then be configured with a service port that is
mapped from a host_network configured on a WAN interface in Nomad agent
config, e.g.
client {
host_network "public" {
interface = "eth1"
}
}
Create a port mapping in the group.network block for use by the mesh
gateway service from the public host_network, e.g.
network {
mode = "bridge"
port "mesh_wan" {
host_network = "public"
}
}
Use this port label for the service.port of the mesh gateway, e.g.
service {
name = "mesh-gateway"
port = "mesh_wan"
connect {
gateway {
mesh {}
}
}
}
Currently Envoy is the only supported gateway implementation in Consul.
By default Nomad client will run the latest official Envoy docker image
supported by the local Consul agent. The Envoy task can be customized
by setting `meta.connect.gateway_image` in agent config or by setting
the `connect.sidecar_task` block.
Gateways require Consul 1.8.0+, enforced by the Nomad scheduler.
Closes#9446
This PR wraps the use of the consul envoy bootstrap command in
an expoenential backoff closure, configured to timeout after 60
seconds. This is an increase over the current behavior of making
3 attempts over 6 seconds.
Should help with #10451
* Fixup uses of `sanity`
* Remove unnecessary comments.
These checks are better explained by earlier comments about
the context of the test. Per @tgross, moved the tests together
to better reinforce the overall shared context.
* Update nomad/fsm_test.go
RPC endpoints for the user-driven APIs (`UpsertOneTimeToken` and
`ExchangeOneTimeToken`) and token expiration (`ExpireOneTimeTokens`).
Includes adding expiration to the periodic core GC job.
* Throw away result of multierror.Append
When given a *multierror.Error, it is mutated, therefore the return
value is not needed.
* Simplify MergeMultierrorWarnings, use StringBuilder
* Hash.Write() never returns an error
* Remove error that was always nil
* Remove error from Resources.Add signature
When this was originally written it could return an error, but that was
refactored away, and callers of it as of today never handle the error.
* Throw away results of io.Copy during Bridge
* Handle errors when computing node class in test
If one thread calls `Flush()` on a gatedwriter while another thread attempts to
`Write()` new data to it, strange things will happen.
The test I wrote shows that at the very least you can write _while_ flushing,
and the call to `Write()` will happen during the internal writes of the
buffered data, which is maybe not what is expected. (i.e. the `Write()`'d data
will be inserted somewhere in the middle of the data being `Flush()'d`)
It's also the case that, because `Write()` only has a read lock, if you had
multiple threads trying to write ("read") at the same time you might have data
loss because the `w.buf` that was read would not necessarily be up-to-date by
the time `p2` was appended to it and it was re-assigned to `w.buf`. You can see
this if you run the new gatedwriter tests with `-race` against the old implementation:
```
WARNING: DATA RACE
Read at 0x00c0000c0420 by goroutine 11:
runtime.growslice()
/usr/lib/go/src/runtime/slice.go:125 +0x0
github.com/hashicorp/nomad/helper/gated-writer.(*Writer).Write()
/home/hicks/workspace/nomad/helper/gated-writer/writer.go:41 +0x2b6
github.com/hashicorp/nomad/helper/gated-writer.TestWriter_WithMultipleWriters.func1()
/home/hicks/workspace/nomad/helper/gated-writer/writer_test.go:90 +0xea
```
This race condition is fixed in this change.
Introduce a new more-block friendly syntax for specifying mounts with a new `mount` block type with the target as label:
```hcl
config {
image = "..."
mount {
type = "..."
target = "target-path"
volume_options { ... }
}
}
```
The main benefit here is that by `mount` being a block, it can nest blocks and avoids the compatibility problems noted in https://github.com/hashicorp/nomad/pull/9634/files#diff-2161d829655a3a36ba2d916023e4eec125b9bd22873493c1c2e5e3f7ba92c691R128-R155 .
The intention is for us to promote this `mount` blocks and quietly deprecate the `mounts` type, while still honoring to preserve compatibility as much as we could.
This addresses the issue in https://github.com/hashicorp/nomad/issues/9604 .
Nomad v1.0.0 introduced a regression where the client configurations
for `connect.sidecar_image` and `connect.gateway_image` would be
ignored despite being set. This PR restores that functionality.
There was a missing layer of interpolation that needs to occur for
these parameters. Since Nomad 1.0 now supports dynamic envoy versioning
through the ${NOMAD_envoy_version} psuedo variable, we basically need
to first interpolate
${connect.sidecar_image} => envoyproxy/envoy:v${NOMAD_envoy_version}
then use Consul at runtime to resolve to a real image, e.g.
envoyproxy/envoy:v${NOMAD_envoy_version} => envoyproxy/envoy:v1.16.0
Of course, if the version of Consul is too old to provide an envoy
version preference, we then need to know to fallback to the old
version of envoy that we used before.
envoyproxy/envoy:v${NOMAD_envoy_version} => envoyproxy/envoy:v1.11.2@sha256:a7769160c9c1a55bb8d07a3b71ce5d64f72b1f665f10d81aa1581bc3cf850d09
Beyond that, we also need to continue to support jobs that set the
sidecar task themselves, e.g.
sidecar_task { config { image: "custom/envoy" } }
which itself could include teh pseudo envoy version variable.
In a few places Nomad was using flag implementations directly
from Consul, lending to Nomad's need to import consul. Replace
those uses with helpers already in Nomad, and copy over the bare
minimum needed to make the autopilot flags behave as they have.
* debug: add pprof duration CLI argument
* debug: add CSI plugin details
* update help text with ACL requirements
* debug: provide ACL hints upon permission failures
* debug: only write file when pprof retrieve is successful
* debug: add helper function to clean bad characters from dynamic filenames
* debug: ensure files are unable to escape the capture directory
* Process to send events to configured sinks
This PR adds a SinkManager to a server which is responsible for managing
managed sinks. Managed sinks subscribe to the event broker and send
events to a sink writer (webhook). When changes to the eventstore are
made the sinkmanager and managed sink are responsible for reloading or
starting a new managed sink.
* periodically check in sink progress to raft
Save progress on the last successfully sent index to raft. This allows a
managed sink to resume close to where it left off in the event of a lost
server or leadership change
dereference eventsink so we can accurately use the watchch
When using a pointer to eventsink struct it was updated immediately and our reload logic would not trigger
* network sink rpc/api plumbing
state store methods and restore
upsert sink test
get sink
delete sink
event sink list and tests
go generate new msg types
validate sink on upsert
* go generate
Prior to Nomad 0.12.5, you could use `${NOMAD_SECRETS_DIR}/mysecret.txt` as
the `artifact.destination` and `template.destination` because we would always
append the destination to the task working directory. In the recent security
patch we treated the `destination` absolute path as valid if it didn't escape
the working directory, but this breaks backwards compatibility and
interpolation of `destination` fields.
This changeset partially reverts the behavior so that we always append the
destination, but we also perform the escape check on that new destination
after interpolation so the security hole is closed.
Also, ConsulTemplate test should exercise interpolation
Ensure that the client honors the client configuration for the
`template.disable_file_sandbox` field when validating the jobspec's
`template.source` parameter, and not just with consul-template's own `file`
function.
Prevent interpolated `template.source`, `template.destination`, and
`artifact.destination` fields from escaping file sandbox.
This PR adds initial support for running Consul Connect Ingress Gateways (CIGs) in Nomad. These gateways are declared as part of a task group level service definition within the connect stanza.
```hcl
service {
connect {
gateway {
proxy {
// envoy proxy configuration
}
ingress {
// ingress-gateway configuration entry
}
}
}
}
```
A gateway can be run in `bridge` or `host` networking mode, with the caveat that host networking necessitates manually specifying the Envoy admin listener (which cannot be disabled) via the service port value.
Currently Envoy is the only supported gateway implementation in Consul, and Nomad only supports running Envoy as a gateway using the docker driver.
Aims to address #8294 and tangentially #8647
Pulling large docker containers can take longer than the default
context timeout. Without a way to change this it is very hard for
users to utilise Nomad properly without hacky work arounds.
This change adds an optional pull_timeout config parameter which
gives operators the possibility to account for increase pull times
where needed. The infra docker image also has the option to set a
custom timeout to keep consistency.
The CSI client RPC uses error wrapping to detect the type of error bubbling up
from plugins, but if the errors we get aren't wrapped at each layer, we can't
unwrap the inner error.
Also eliminates some unused args.
adds in oss components to support enterprise multi-vault namespace feature
upgrade specific doc on vault multi-namespaces
vault docs
update test to reflect new error
Go 1.14 now streams t.Log output as it happens [1], so we no longer need
to maintain our log STDOUT helper.
I preserved the interface, so `testlog` still takes in a `*testing.T`
though unused. Changing it requires so too many changes that I didn't
want to make quite yet.
[1] https://golang.org/doc/go1.14#go-test
This fixes few cases where driver eventor goroutines are leaked during
normal operations, but especially so in tests.
This change makes few modifications:
First, it switches drivers to use `Context`s to manage shutdown events.
Previously, it relied on callers invoking `.Shutdown()` function that is
specific to internal drivers only and require casting. Using `Contexts`
provide a consistent idiomatic way to manage lifecycle for both internal
and external drivers.
Also, I discovered few places where we don't clean up a temporary driver
instance in the plugin catalog code, where we dispense a driver to
inspect and validate the schema config without properly cleaning it up.
The plugin loader supplies a version map to ensure the Nomad agent
can support the plugins at the version they specify. The map was
incorrectly mapping the driver type to the device API supported
version identifier. This currently does not cause a bug as both
device and driver versions are the same string. This could cause
problems in the future, however, if either plugin interface were
to change and require version updates.
* command/csi: csi, csi_plugin, csi_volume
* helper/funcs: move ExtraKeys from parse_config to UnusedKeys
* command/agent/config_parse: use helper.UnusedKeys
* api/csi: annotate CSIVolumes with hcl fields
* command/csi_plugin: add Synopsis
* command/csi_volume_register: use hcl.Decode style parsing
* command/csi_volume_list
* command/csi_volume_status: list format, cleanup
* command/csi_plugin_list
* command/csi_plugin_status
* command/csi_volume_deregister
* command/csi_volume: add Synopsis
* api/contexts/contexts: add csi search contexts to the constants
* command/commands: register csi commands
* api/csi: fix struct tag for linter
* command/csi_plugin_list: unused struct vars
* command/csi_plugin_status: unused struct vars
* command/csi_volume_list: unused struct vars
* api/csi: add allocs to CSIPlugin
* command/csi_plugin_status: format the allocs
* api/allocations: copy Allocation.Stub in from structs
* nomad/client_rpc: add some error context with Errorf
* api/csi: collapse read & write alloc maps to a stub list
* command/csi_volume_status: cleanup allocation display
* command/csi_volume_list: use Schedulable instead of Healthy
* command/csi_volume_status: use Schedulable instead of Healthy
* command/csi_volume_list: sprintf string
* command/csi: delete csi.go, csi_plugin.go
* command/plugin: refactor csi components to sub-command plugin status
* command/plugin: remove csi
* command/plugin_status: remove csi
* command/volume: remove csi
* command/volume_status: split out csi specific
* helper/funcs: add RemoveEqualFold
* command/agent/config_parse: use helper.RemoveEqualFold
* api/csi: do ,unusedKeys right
* command/volume: refactor csi components to `nomad volume`
* command/volume_register: split out csi specific
* command/commands: use the new top level commands
* command/volume_deregister: hardwired type csi for now
* command/volume_status: csiFormatVolumes rescued from volume_list
* command/plugin_status: avoid a panic on no args
* command/volume_status: avoid a panic on no args
* command/plugin_status: predictVolumeType
* command/volume_status: predictVolumeType
* nomad/csi_endpoint_test: move CreateTestPlugin to testing
* command/plugin_status_test: use CreateTestCSIPlugin
* nomad/structs/structs: add CSIPlugins and CSIVolumes search consts
* nomad/state/state_store: add CSIPlugins and CSIVolumesByIDPrefix
* nomad/search_endpoint: add CSIPlugins and CSIVolumes
* command/plugin_status: move the header to the csi specific
* command/volume_status: move the header to the csi specific
* nomad/state/state_store: CSIPluginByID prefix
* command/status: rename the search context to just Plugins/Volumes
* command/plugin,volume_status: test return ids now
* command/status: rename the search context to just Plugins/Volumes
* command/plugin_status: support -json and -t
* command/volume_status: support -json and -t
* command/plugin_status_csi: comments
* command/*_status: clean up text
* api/csi: fix stale comments
* command/volume: make deregister sound less fearsome
* command/plugin_status: set the id length
* command/plugin_status_csi: more compact plugin health
* command/volume: better error message, comment
This package introduces some basic abstractions around mount utilties
for various platforms. Initially it only supports linux, but the plan is
to expand this as CSI expands across to other platforms.
MultiplexV2 is a new connection multiplex header that supports multiplex both
RPC and streaming requests over the same Yamux connection.
MultiplexV2 was added in 0.8.0 as part of
https://github.com/hashicorp/nomad/pull/3892 . So Nomad 0.11 can expect it to
be supported. Though, some more rigorous testing is required before merging
this.
I want to call out some implementation details:
First, the current connection pool reuses the Yamux stream for multiple RPC calls,
and doesn't close them until an error is encountered. This commit doesn't
change it, and sets the `RpcNomad` byte only at stream creation.
Second, the StreamingRPC session gets closed by callers and cannot be reused.
Every StreamingRPC opens a new Yamux session.
Pick up some fixes from Consul:
* If a stream returns an EOF error, clear session from cache/pool and start a
new one.
* Close the codec when closing StreamClient
Copy the updated version of freeport (sdk/freeport), and tweak it for use
in Nomad tests. This means staying below port 10000 to avoid conflicts with
the lib/freeport that is still transitively used by the old version of
consul that we vendor. Also provide implementations to find ephemeral ports
of macOS and Windows environments.
Ports acquired through freeport are supposed to be returned to freeport,
which this change now also introduces. Many tests are modified to include
calls to a cleanup function for Server objects.
This should help quite a bit with some flakey tests, but not all of them.
Our port problems will not go away completely until we upgrade our vendor
version of consul. With Go modules, we'll probably do a 'replace' to swap
out other copies of freeport with the one now in 'nomad/helper/freeport'.
The existing version constraint uses logic optimized for package
managers, not schedulers, when checking prereleases:
- 1.3.0-beta1 will *not* satisfy ">= 0.6.1"
- 1.7.0-rc1 will *not* satisfy ">= 1.6.0-beta1"
This is due to package managers wishing to favor final releases over
prereleases.
In a scheduler versions more often represent the earliest release all
required features/APIs are available in a system. Whether the constraint
or the version being evaluated are prereleases has no impact on
ordering.
This commit adds a new constraint - `semver` - which will use Semver
v2.0 ordering when evaluating constraints. Given the above examples:
- 1.3.0-beta1 satisfies ">= 0.6.1" using `semver`
- 1.7.0-rc1 satisfies ">= 1.6.0-beta1" using `semver`
Since existing jobspecs may rely on the old behavior, a new constraint
was added and the implicit Consul Connect and Vault constraints were
updated to use it.
This is the basic code to add the Windows Service Manager hooks to Nomad.
Includes vendoring golang.org/x/sys/windows/svc and added Docs:
* guide for installing as a windows service.
* configuration for logging to file from PR #6429
Adds new package that can be used by client and server RPC endpoints to
facilitate monitoring based off of a logger
clean up old code
small comment about write
rm old comment about minsize
rename to Monitor
Removes connection logic from monitor command
Keep connection logic in endpoints, use a channel to send results from
monitoring
use new multisink logger and interfaces
small test for dropped messages
update go-hclogger and update sink/intercept logger interfaces
Consul Connect must route traffic between network namespaces through a
public interface (i.e. not localhost). In order to support testing in
dev mode, users needed to manually set the interface which doesn't
make for a smooth experience.
This commit adds a facility for adding optional parameters to the
`nomad agent -dev` flag and uses it to add a `-dev=connect` flag that
binds to a public interface on the host.
Handle stalled readers (e.g. network write got stalled), by having
escaping io have a buffer so it looks for escaped characters in the
stream.
This simplifies the implementation considerably, as we can look
for new lines followed by escaped characters directly.
Also, we add a test to ensure that any partial results are flushed to
readers.
nvidia library use of dynamic library seems to conflict with alpine and
musl based OSes. This adds a `nonvidia` tag to allow compiling nomad
for alpine images.
The nomad releases currently only support glibc based OS environments,
so we default to compiling with nvidia.
The e2e test code is absolutely hideous and leaks processes and files
on disk. NomadAgent seems useful, but the clientstate e2e tests are very
messy and slow. The last test "Corrupt" is probably the most useful as
it explicitly corrupts the state file whereas the other tests attempt to
reproduce steps thought to cause corruption in earlier releases of
Nomad.
This ensures that `port_map` along with other block like attribute
declarations (e.g. ulimit, labels, etc) can handle various hcl and json
syntax that was supported in 0.8.
In 0.8.7, the following declarations are effectively equivalent:
```
// hcl block
port_map {
http = 80
https = 443
}
// hcl assignment
port_map = {
http = 80
https = 443
}
// json single element array of map (default in API response)
{"port_map": [{"http": 80, "https": 443}]}
// json array of individual maps (supported accidentally iiuc)
{"port_map: [{"http": 80}, {"https": 443}]}
```
We achieve compatbility by using `NewAttr("...", "list(map(string))",
false)` to be serialized to a `map[string]string` wrapper, instead of using
`BlockAttrs` declaration. The wrapper merges the list of maps
automatically, to ease driver development.
This approach is closer to how v0.8.7 implemented the fields [1][2], and
despite its verbosity, seems to perserve 0.8.7 behavior in hcl2.
This is only required for built-in types that have backward
compatibility constraints. External drivers should use `BlockAttrs`
instead, as they see fit.
[1] https://github.com/hashicorp/nomad/blob/v0.8.7/client/driver/docker.go#L216
[2] https://github.com/hashicorp/nomad/blob/v0.8.7/client/driver/docker.go#L698-L700
The test used old local copies of Docker structs and appeared to be
testing an outdated approach to task config decoding.
Updated to use real Docker structs so we can do end-to-end unit testing
of real Docker task configs.
* Migrated all of the old leader task tests and got them passing
* Refactor and consolidate task killing code in AR to always kill leader
tasks first
* Fixed lots of issues with state restoring
* Fixed deadlock in AR.Destroy if AR.Run had never been called
* Added a new in memory statedb for testing
Saves a tiny bit of cpu and some IO. Sadly doesn't prevent all IO on
duplicate writes as the transactions are still created and committed.
$ go test -bench=. -benchmem
goos: linux
goarch: amd64
pkg: github.com/hashicorp/nomad/helper/boltdd
BenchmarkWriteDeduplication_On-4 500 4059591 ns/op 23736 B/op 56 allocs/op
BenchmarkWriteDeduplication_Off-4 300 4115319 ns/op 25942 B/op 55 allocs/op