This PR creates a pointer.Compare helper for comparing equality of
two pointers. Strictly only works with primitive types we know are
safe to derefence and compare using '=='.
Before this change, Client had 2 copies of the config object: config and configCopy. There was no guidance around which to use where (other than configCopy's comment to pass it to alloc runners), both are shared among goroutines and mutated in data racy ways. At least at one point I think the idea was to have `config` be mutable and then grab a lock to overwrite `configCopy`'s pointer atomically. This would have allowed alloc runners to read their config copies in data race safe ways, but this isn't how the current implementation worked.
This change takes the following approach to safely handling configs in the client:
1. `Client.config` is the only copy of the config and all access must go through the `Client.configLock` mutex
2. Since the mutex *only protects the config pointer itself and not fields inside the Config struct:* all config mutation must be done on a *copy* of the config, and then Client's config pointer is overwritten while the mutex is acquired. Alloc runners and other goroutines with the old config pointer will not see config updates.
3. Deep copying is implemented on the Config struct to satisfy the previous approach. The TLS Keyloader is an exception because it has its own internal locking to support mutating in place. An unfortunate complication but one I couldn't find a way to untangle in a timely fashion.
4. To facilitate deep copying I made an *internally backward incompatible API change:* our `helper/funcs` used to turn containers (slices and maps) with 0 elements into nils. This probably saves a few memory allocations but makes it very easy to cause panics. Since my new config handling approach uses more copying, it became very difficult to ensure all code that used containers on configs could handle nils properly. Since this code has caused panics in the past, I fixed it: nil containers are copied as nil, but 0-element containers properly return a new 0-element container. No more "downgrading to nil!"
Move conflict resolution implementation into the state store with a new Apply RPC.
This also makes the RPC for secure variables much more similar to Consul's KV,
which will help us support soft deletes in a post-1.4.0 version of Nomad.
Reimplement quotas in the state store functions.
Co-authored-by: Charlie Voiselle <464492+angrycub@users.noreply.github.com>
This commit includes the new state schema for ACL roles along with
state interaction functions for CRUD actions.
The change also includes snapshot persist and restore
functionality and the addition of FSM messages for Raft updates
which will come via RPC endpoints.
The ACL token state schema has been updated to utilise two new
indexes which track expiration of tokens that are configured with
an expiration TTL or time. A new state function allows listing
ACL expired tokens which will be used by internal garbage
collection.
The ACL endpoint has been modified so that all validation happens
within a single function call. This is easier to understand and
see at a glance. The ACL token validation now also includes logic
for expiry TTL and times. The ACL endpoint upsert tests have been
condensed into a single, table driven test.
There is a new token canonicalize which provides a single place
for token canonicalization, rather than logic spread in the RPC
handler.
This PR adds support for specifying checks in services registered to
the built-in nomad service provider.
Currently only HTTP and TCP checks are supported, though more types
could be added later.
Stream snapshot to FSM when restoring from archive
The `RestoreFromArchive` helper decompresses the snapshot archive to a
temporary file before reading it into the FSM. For large snapshots
this performs a lot of disk IO. Stream decompress the snapshot as we
read it, without first writing to a temporary file.
Add bexpr filters to the `RestoreFromArchive` helper.
The operator can pass these as `-filter` arguments to `nomad operator
snapshot state` (and other commands in the future) to include only
desired data when reading the snapshot.
* core: allow pause/un-pause of eval broker on region leader.
* agent: add ability to pause eval broker via scheduler config.
* cli: add operator scheduler commands to interact with config.
* api: add ability to pause eval broker via scheduler config
* e2e: add operator scheduler test for eval broker pause.
* docs: include new opertor scheduler CLI and pause eval API info.
This PR adds 2 helper functions to the helpers/bbolt package
- Iterate: iterate every key in a bucket. Automatically decodes the
msg pack value into the provided value argument.
- DeletePrefix: deletes every key in a bucket starting with a
given prefix. Manages the wrapper's hash values accordingly. Uses
a cursor & sync to operate efficiently.
This PR deprecates some functions in favor of generic alternatives.
The new functions are compatible only with Nomad v1.4+.
The old functions (nor their use) should not be removed until Nomad v1.6+.
Closes#12927Closes#12958
This PR updates the version of redis used in our examples from 3.2 to 7.
The old version is very not supported anymore, and we should be setting
a good example by using a supported version.
The long-form example job is now fixed so that the service stanza uses
nomad as the service discovery provider, and so now the job runs without
a requirement of having Consul running and configured.
* test: use `T.TempDir` to create temporary test directory
This commit replaces `ioutil.TempDir` with `t.TempDir` in tests. The
directory created by `t.TempDir` is automatically removed when the test
and all its subtests complete.
Prior to this commit, temporary directory created using `ioutil.TempDir`
needs to be removed manually by calling `os.RemoveAll`, which is omitted
in some tests. The error handling boilerplate e.g.
defer func() {
if err := os.RemoveAll(dir); err != nil {
t.Fatal(err)
}
}
is also tedious, but `t.TempDir` handles this for us nicely.
Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
* test: fix TestLogmon_Start_restart on Windows
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
* test: fix failing TestConsul_Integration
t.TempDir fails to perform the cleanup properly because the folder is
still in use
testing.go:967: TempDir RemoveAll cleanup: unlinkat /tmp/TestConsul_Integration2837567823/002/191a6f1a-5371-cf7c-da38-220fe85d10e5/web/secrets: device or resource busy
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Nomad Service Discovery uses an in-memory buffered pipe implementation
to connect consul-template to the Nomad API.
This adds a basic test for that helper functionality.
This change modifies the template task runner to utilise the
new consul-template which includes Nomad service lookup template
funcs.
In order to provide security and auth to consul-template, we use
a custom HTTP dialer which is passed to consul-template when
setting up the runner. This method follows Vault implementation.
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
Move some common Vault API data struct decoding out of the Vault client
so it can be reused in other situations.
Make Vault job validation its own function so it's easier to expand it.
Rename the `Job.VaultPolicies` method to just `Job.Vault` since it
returns the full Vault block, not just their policies.
Set `ChangeMode` on `Vault.Canonicalize`.
Add some missing tests.
Allows specifying an entity alias that will be used by Nomad when
deriving the task Vault token.
An entity alias assigns an indentity to a token, allowing better control
and management of Vault clients since all tokens with the same indentity
alias will now be considered the same client. This helps track Nomad
activity in Vault's audit logs and better control over Vault billing.
Add support for a new Nomad server configuration to define a default
entity alias to be used when deriving Vault tokens. This default value
will be used if the task doesn't have an entity alias defined.
Part 2 of breaking up https://github.com/hashicorp/nomad/pull/12255
This PR makes it so gotestsum is invoked only in CircleCI. Also the
HCLogger(t) is plumbed more correctly in TestServer and TestAgent so
that they respect NOMAD_TEST_LOG_LEVEL.
The reason for these is we'll want to disable logging in GHA,
where spamming the disk with logs really drags performance.
Nomad inherited protocol version numbering configuration from Consul and
Serf, but unlike those projects Nomad has never used it. Nomad's
`protocol_version` has always been `1`.
While the code is effectively unused and therefore poses no runtime
risks to leave, I felt like removing it was best because:
1. Nomad's RPC subsystem has been able to evolve extensively without
needing to increment the version number.
2. Nomad's HTTP API has evolved extensively without increment
`API{Major,Minor}Version`. If we want to version the HTTP API in the
future, I doubt this is the mechanism we would choose.
3. The presence of the `server.protocol_version` configuration
parameter is confusing since `server.raft_protocol` *is* an important
parameter for operators to consider. Even more confusing is that
there is a distinct Serf protocol version which is included in `nomad
server members` output under the heading `Protocol`. `raft_protocol`
is the *only* protocol version relevant to Nomad developers and
operators. The other protocol versions are either deadcode or have
never changed (Serf).
4. If we were to need to version the RPC, HTTP API, or Serf protocols, I
don't think these configuration parameters and variables are the best
choice. If we come to that point we should choose a versioning scheme
based on the use case and modern best practices -- not this 6+ year
old dead code.
These API endpoints now return results in chronological order. They
can return results in reverse chronological order by setting the
query parameter ascending=true.
- Eval.List
- Deployment.List
This PR replaces use of time.After with a safe helper function
that creates a time.Timer to use instead. The new function returns
both a time.Timer and a Stop function that the caller must handle.
Unlike time.NewTimer, the helper function does not panic if the duration
set is <= 0.
Previously we copied this library by hand to avoid vendor-ing a bunch of
files related to minimock. Now that we no longer vendor, just import the
library normally.
Also we might use more of the library for handling `time.After` uses,
for which this library provides a Context-based solution.
* driver: fix integer conversion error
The shared executor incorrectly parsed the user's group into int32 and
then cast to uint32 without bounds checking. This is harmless because
an out-of-bounds gid will throw an error later, but it triggers
security and code quality scans. Parse directly to uint32 so that we
get correct error handling.
* helper: fix integer conversion error
The autopilot flags helper incorrectly parses a uint64 to a uint which
is machine specific size. Although we don't have 32-bit builds, this
sets off security and code quality scaans. Parse to the machine sized
uint.
* driver: restrict bounds of port map
The plugin server doesn't constrain the maximum integer for port
maps. This could result in a user-visible misconfiguration, but it
also triggers security and code quality scans. Restrict the bounds
before casting to int32 and return an error.
* cpuset: restrict upper bounds of cpuset values
Our cpuset configuration expects values in the range of uint16 to
match the expectations set by the kernel, but we don't constrain the
values before downcasting. An underflow could lead to allocations
failing on the client rather than being caught earlier. This also make
security and code quality scanners happy.
* http: fix integer downcast for per_page parameter
The parser for the `per_page` query parameter downcasts to int32
without bounds checking. This could result in underflow and
nonsensical paging, but there's no server-side consequences for
this. Fixing this will silence some security and code quality scanners
though.
Before this change trying to run `nomad operator raft {info,logs}` on an
inuse raft.db would cause the command to block until the agent using
raft.db is closed.
After this change the command will block for 1s before returning a
(hopefully) helpful error message.
This change also sets the ReadOnly mode on the underlying BoltDb to
ensure diagnostics make no changes to the underlying store. We have no
evidence this has ever occurred, but it seems like a useful safety
measure.
No changelog added since this is a minor tweak in a "new" feature (it
was hidden in previous relases).
The `nomad operator raft logs` command uses a raft helper that reads
in the logs from raft and serializes them to JSON. The previous
implementation returned the slice of all logs and then serializes the
entire object. Update the helper to stream the log entries and then
serialize them as newline-delimited JSON.
* Include region and namespace in CLI output
* Add region and prefix matching for server members
* Add namespace and region API outputs to cluster metadata folder
* Add region awareness to WaitForClient helper function
* Add helper functions for SliceStringHasPrefix and StringHasPrefixInSlice
* Refactor test client agent generation
* Add tests for region
* Add changelog
When mTLS is enabled, only nomad servers of the region should access the
Raft RPC layer. Clients and servers in other regions should only use the
Nomad RPC endpoints.
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
Co-authored-by: Seth Hoenig <shoenig@hashicorp.com>
Track usage of incoming streams on a connection. Connections without
reference counts get marked as unused and reaped in a periodic job.
This fixes a bug where `alloc exec` and `alloc fs` sessions get terminated
unexpectedly. Previously, when a client heartbeats switches between
servers, the pool connection reaper eventually identifies the connection
as unused and closes it even if it has an active exec/fs sessions.
Fixes#10579
This PR implements first-class support for Nomad running Consul
Connect Mesh Gateways. Mesh gateways enable services in the Connect
mesh to make cross-DC connections via gateways, where each datacenter
may not have full node interconnectivity.
Consul docs with more information:
https://www.consul.io/docs/connect/gateways/mesh-gateway
The following group level service block can be used to establish
a Connect mesh gateway.
service {
connect {
gateway {
mesh {
// no configuration
}
}
}
}
Services can make use of a mesh gateway by configuring so in their
upstream blocks, e.g.
service {
connect {
sidecar_service {
proxy {
upstreams {
destination_name = "<service>"
local_bind_port = <port>
datacenter = "<datacenter>"
mesh_gateway {
mode = "<mode>"
}
}
}
}
}
}
Typical use of a mesh gateway is to create a bridge between datacenters.
A mesh gateway should then be configured with a service port that is
mapped from a host_network configured on a WAN interface in Nomad agent
config, e.g.
client {
host_network "public" {
interface = "eth1"
}
}
Create a port mapping in the group.network block for use by the mesh
gateway service from the public host_network, e.g.
network {
mode = "bridge"
port "mesh_wan" {
host_network = "public"
}
}
Use this port label for the service.port of the mesh gateway, e.g.
service {
name = "mesh-gateway"
port = "mesh_wan"
connect {
gateway {
mesh {}
}
}
}
Currently Envoy is the only supported gateway implementation in Consul.
By default Nomad client will run the latest official Envoy docker image
supported by the local Consul agent. The Envoy task can be customized
by setting `meta.connect.gateway_image` in agent config or by setting
the `connect.sidecar_task` block.
Gateways require Consul 1.8.0+, enforced by the Nomad scheduler.
Closes#9446
This PR wraps the use of the consul envoy bootstrap command in
an expoenential backoff closure, configured to timeout after 60
seconds. This is an increase over the current behavior of making
3 attempts over 6 seconds.
Should help with #10451
* Fixup uses of `sanity`
* Remove unnecessary comments.
These checks are better explained by earlier comments about
the context of the test. Per @tgross, moved the tests together
to better reinforce the overall shared context.
* Update nomad/fsm_test.go
RPC endpoints for the user-driven APIs (`UpsertOneTimeToken` and
`ExchangeOneTimeToken`) and token expiration (`ExpireOneTimeTokens`).
Includes adding expiration to the periodic core GC job.
* Throw away result of multierror.Append
When given a *multierror.Error, it is mutated, therefore the return
value is not needed.
* Simplify MergeMultierrorWarnings, use StringBuilder
* Hash.Write() never returns an error
* Remove error that was always nil
* Remove error from Resources.Add signature
When this was originally written it could return an error, but that was
refactored away, and callers of it as of today never handle the error.
* Throw away results of io.Copy during Bridge
* Handle errors when computing node class in test
If one thread calls `Flush()` on a gatedwriter while another thread attempts to
`Write()` new data to it, strange things will happen.
The test I wrote shows that at the very least you can write _while_ flushing,
and the call to `Write()` will happen during the internal writes of the
buffered data, which is maybe not what is expected. (i.e. the `Write()`'d data
will be inserted somewhere in the middle of the data being `Flush()'d`)
It's also the case that, because `Write()` only has a read lock, if you had
multiple threads trying to write ("read") at the same time you might have data
loss because the `w.buf` that was read would not necessarily be up-to-date by
the time `p2` was appended to it and it was re-assigned to `w.buf`. You can see
this if you run the new gatedwriter tests with `-race` against the old implementation:
```
WARNING: DATA RACE
Read at 0x00c0000c0420 by goroutine 11:
runtime.growslice()
/usr/lib/go/src/runtime/slice.go:125 +0x0
github.com/hashicorp/nomad/helper/gated-writer.(*Writer).Write()
/home/hicks/workspace/nomad/helper/gated-writer/writer.go:41 +0x2b6
github.com/hashicorp/nomad/helper/gated-writer.TestWriter_WithMultipleWriters.func1()
/home/hicks/workspace/nomad/helper/gated-writer/writer_test.go:90 +0xea
```
This race condition is fixed in this change.
Introduce a new more-block friendly syntax for specifying mounts with a new `mount` block type with the target as label:
```hcl
config {
image = "..."
mount {
type = "..."
target = "target-path"
volume_options { ... }
}
}
```
The main benefit here is that by `mount` being a block, it can nest blocks and avoids the compatibility problems noted in https://github.com/hashicorp/nomad/pull/9634/files#diff-2161d829655a3a36ba2d916023e4eec125b9bd22873493c1c2e5e3f7ba92c691R128-R155 .
The intention is for us to promote this `mount` blocks and quietly deprecate the `mounts` type, while still honoring to preserve compatibility as much as we could.
This addresses the issue in https://github.com/hashicorp/nomad/issues/9604 .
Nomad v1.0.0 introduced a regression where the client configurations
for `connect.sidecar_image` and `connect.gateway_image` would be
ignored despite being set. This PR restores that functionality.
There was a missing layer of interpolation that needs to occur for
these parameters. Since Nomad 1.0 now supports dynamic envoy versioning
through the ${NOMAD_envoy_version} psuedo variable, we basically need
to first interpolate
${connect.sidecar_image} => envoyproxy/envoy:v${NOMAD_envoy_version}
then use Consul at runtime to resolve to a real image, e.g.
envoyproxy/envoy:v${NOMAD_envoy_version} => envoyproxy/envoy:v1.16.0
Of course, if the version of Consul is too old to provide an envoy
version preference, we then need to know to fallback to the old
version of envoy that we used before.
envoyproxy/envoy:v${NOMAD_envoy_version} => envoyproxy/envoy:v1.11.2@sha256:a7769160c9c1a55bb8d07a3b71ce5d64f72b1f665f10d81aa1581bc3cf850d09
Beyond that, we also need to continue to support jobs that set the
sidecar task themselves, e.g.
sidecar_task { config { image: "custom/envoy" } }
which itself could include teh pseudo envoy version variable.
In a few places Nomad was using flag implementations directly
from Consul, lending to Nomad's need to import consul. Replace
those uses with helpers already in Nomad, and copy over the bare
minimum needed to make the autopilot flags behave as they have.
* debug: add pprof duration CLI argument
* debug: add CSI plugin details
* update help text with ACL requirements
* debug: provide ACL hints upon permission failures
* debug: only write file when pprof retrieve is successful
* debug: add helper function to clean bad characters from dynamic filenames
* debug: ensure files are unable to escape the capture directory
* Process to send events to configured sinks
This PR adds a SinkManager to a server which is responsible for managing
managed sinks. Managed sinks subscribe to the event broker and send
events to a sink writer (webhook). When changes to the eventstore are
made the sinkmanager and managed sink are responsible for reloading or
starting a new managed sink.
* periodically check in sink progress to raft
Save progress on the last successfully sent index to raft. This allows a
managed sink to resume close to where it left off in the event of a lost
server or leadership change
dereference eventsink so we can accurately use the watchch
When using a pointer to eventsink struct it was updated immediately and our reload logic would not trigger
* network sink rpc/api plumbing
state store methods and restore
upsert sink test
get sink
delete sink
event sink list and tests
go generate new msg types
validate sink on upsert
* go generate
Prior to Nomad 0.12.5, you could use `${NOMAD_SECRETS_DIR}/mysecret.txt` as
the `artifact.destination` and `template.destination` because we would always
append the destination to the task working directory. In the recent security
patch we treated the `destination` absolute path as valid if it didn't escape
the working directory, but this breaks backwards compatibility and
interpolation of `destination` fields.
This changeset partially reverts the behavior so that we always append the
destination, but we also perform the escape check on that new destination
after interpolation so the security hole is closed.
Also, ConsulTemplate test should exercise interpolation
Ensure that the client honors the client configuration for the
`template.disable_file_sandbox` field when validating the jobspec's
`template.source` parameter, and not just with consul-template's own `file`
function.
Prevent interpolated `template.source`, `template.destination`, and
`artifact.destination` fields from escaping file sandbox.