The default Linux Capabilities set enabled by the docker, exec, and
java task drivers includes CAP_NET_RAW (for making ping just work),
which has the side affect of opening an ARP DoS/MiTM attack between
tasks using bridge networking on the same host network.
https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities
This PR disables CAP_NET_RAW for the docker, exec, and java task
drivers. The previous behavior can be restored for docker using the
allow_caps docker plugin configuration option.
A future version of nomad will enable similar configurability for the
exec and java task drivers.
This PR uses the checksum of the check for which a dynamic exposed
port is being generated (instead of a UUID prefix) so that the
generated port label is deterministic.
This fixes 2 bugs:
- 'job plan' output is now idempotent for jobs making use of injected ports
- tasks will no longer be destructively updated when jobs making use of
injected ports are re-run without changing any user specified part of
job config.
Closes: https://github.com/hashicorp/nomad/issues/10099
This is the first step in #10268. If a maximum is not specified, the
task group sum uses the memory number instead. The maximum is only
shown when it’s higher than the memory sum.
This updates the UI to use the new fuzzy search API. It’s a drop-in
replacement so the / shortcut to jump to search is preserved, and
results can be cycled through and chosen via arrow keys and the
enter key.
It doesn’t use everything returned by the API:
* deployments and evaluations: these match by id, doesn’t seem like
people would know those or benefit from quick navigation to them
* namespaces: doesn’t seem useful as they currently function
* scaling policies
* tasks: the response doesn’t include an allocation id, which means they
can’t be navigated to in the UI without an additional query
* CSI volumes: aren’t actually returned by the API
Since there’s no API to check the server configuration and know whether
the feature has been disabled, this adds another query in
route:application#beforeModel that acts as feature detection: if the
attempt to query fails (500), the global search field is hidden.
Upon having added another query on load, I realised that beforeModel was
being triggered any time service:router#transitionTo was being called,
which happens upon navigating to a search result, for instance, because
of refreshModel being present on the region query parameter. This PR
adds a check for transition.queryParamsOnly and skips rerunning the
onload queries (token permissions check, license check, fuzzy search
feature detection).
Implementation notes:
* there are changes to unrelated tests to ignore the on-load feature
detection query
* some lifecycle-related guards against undefined were required to
address failures when navigating to an allocation
* the minimum search length of 2 characters is hard-coded as there’s
currently no way to determine min_term_length in the UI
This PR fixes a bug where Nomad was more restrictive on Ingress Gateway Configuration
Entry definitions than Consul. Before, Nomad would not allow for declaring IGCEs with
http listeners with service name "*", which is a special feature allowable by Consul.
Note: to make http protocol work, a service-default must be defined setting the
protocol to http for each service.
Fixes: #9729
This PR adds job-submission validation that checks for the use of uppercase characters
in group and service names for services that make use of Consul Connect. This prevents
attempting to launch services that Consul will not validate correctly, which in turn
causes tasks to fail to launch in Nomad.
Underlying Consul issue: https://github.com/hashicorp/consul/issues/6765Closes#7581#10450
This PR wraps the use of the consul envoy bootstrap command in
an expoenential backoff closure, configured to timeout after 60
seconds. This is an increase over the current behavior of making
3 attempts over 6 seconds.
Should help with #10451
This adds a Revert two-step button to the JobVersions component for
not-current versions, which redirects to the overview on success. It
checks the job version before and after reversion to mitigate the edge
case where reverting to an otherwise-identical version has no effect, as
discussed in #10337.
It uses existing facilities for handling other errors and disabling the
button when permissions are lacking.
This PR introduces the /v1/search/fuzzy API endpoint, used for fuzzy
searching objects in Nomad. The fuzzy search endpoint routes requests
to the Nomad Server leader, which implements the Search.FuzzySearch RPC
method.
Requests to the fuzzy search API are based on the api.FuzzySearchRequest
object, e.g.
{
"Text": "ed",
"Context": "all"
}
Responses from the fuzzy search API are based on the api.FuzzySearchResponse
object, e.g.
{
"Index": 27,
"KnownLeader": true,
"LastContact": 0,
"Matches": {
"tasks": [
{
"ID": "redis",
"Scope": [
"default",
"example",
"cache"
]
}
],
"evals": [],
"deployment": [],
"volumes": [],
"scaling_policy": [],
"images": [
{
"ID": "redis:3.2",
"Scope": [
"default",
"example",
"cache",
"redis"
]
}
]
},
"Truncations": {
"volumes": false,
"scaling_policy": false,
"evals": false,
"deployment": false
}
}
The API is tunable using the new server.search stanza, e.g.
server {
search {
fuzzy_enabled = true
limit_query = 200
limit_results = 1000
min_term_length = 5
}
}
These values can be increased or decreased, so as to provide more
search results or to reduce load on the Nomad Server. The fuzzy search
API can be disabled entirely by setting `fuzzy_enabled` to `false`.
This PR adds the common OSS changes for adding support for Consul Namespaces,
which is going to be a Nomad Enterprise feature. There is no new functionality
provided by this changeset and hopefully no new bugs.
Add a `PerAlloc` field to volume requests that directs the scheduler to test
feasibility for volumes with a source ID that includes the allocation index
suffix (ex. `[0]`), rather than the exact source ID.
Read the `PerAlloc` field when making the volume claim at the client to
determine if the allocation index suffix (ex. `[0]`) should be added to the
volume source ID.
Callers of `CSIVolumeByID` are generally assuming they should receive a single
volume. This potentially results in feasibility checking being performed
against the wrong volume if a volume's ID is a prefix substring of other
volume (for example: "test" and "testing").
Removing the incorrect prefix matching from `CSIVolumeByID` breaks prefix
matching in the command line client. Add the required elements for prefix
matching to the commands and API.
When multiple CSI volumes are requested, the feasibility check could return
early for read/write volumes with free claims, even if a later volume in the
request was not feasible for any other reason (including not existing at
all). This can result in random failure to fail feasibility checking,
depending on how the map of volumes was being ordered at runtime.
Remove the early return from the feasibility check. Add a test to verify that
missing volumes in the map will cause a failure; this test will not catch a
regression every test run because of the random map ordering, but any failure
will be caught over the course of several CI runs.
With the Ember update, when the will-destroy action is called
to check the element height, its height is already zero. That
seems strange but I didn’t look into it any further, as
using did-insert to store the element lets us check its height
before any other actions when a processing button is pressed.
This PR fixes a bug where tasks with Connect services could be
triggered to destructively update (i.e. placed in a new alloc)
when no update should be necessary.
Fixes#10077
In a deployment with two groups (ex. A and B), if group A's canary becomes
healthy before group B's, the deadline for the overall deployment will be set
to that of group A. When the deployment is promoted, if group A is done it
will not contribute to the next deadline cutoff. Group B's old deadline will
be used instead, which will be in the past and immediately trigger a
deployment progress failure. Reset the progress deadline when the job is
promotion to avoid this bug, and to better conform with implicit user
expectations around how the progress deadline should interact with promotions.
This PR fixes a bug where sidecar services would be re-registered into Consul every ~30
seconds, caused by the parent service having its tags field set and the sidecar_service
tags unset. Nomad would directly compare the tags between its copy of the sidecar service
definition and the tags of the sidecar service reported by Consul. This does not work,
because Consul will under-the-hood set the sidecar service tags to inherit the parent
service tags if the sidecar service tags are unset. The comparison then done by Nomad
would not match, if the parent sidecar tags are set.
Fixes#10025
This PR enables jobs configured with a custom sidecar_task to make
use of the `service.expose` feature for creating checks on services
in the service mesh. Before we would check that sidecar_task had not
been set (indicating that something other than envoy may be in use,
which would not support envoy's expose feature). However Consul has
not added support for anything other than envoy and probably never
will, so having the restriction in place seems like an unnecessary
hindrance. If Consul ever does support something other than Envoy,
they will likely find a way to provide the expose feature anyway.
Fixes#9854
This PR adds pid_mode and ipc_mode options to the exec and java task
driver config options. By default these will defer to the default_pid_mode
and default_ipc_mode agent plugin options created in #9969. Setting
these values to "host" mode disables isolation for the task. Doing so
is not recommended, but may be necessary to support legacy job configurations.
Closes#9970
This PR adds default_pid_mode and default_ipc_mode options to the exec and java
task drivers. By default these will default to "private" mode, enabling PID and
IPC isolation for tasks. Setting them to "host" mode disables isolation. Doing
so is not recommended, but may be necessary to support legacy job configurations.
Closes#9969
The details of host volume and CSI volume requests do not show up in `nomad
plan` outputs, although the updates are detected by the scheduler and result
in an update as expected.
This PR implements Nomad built-in support for running Consul Connect
terminating gateways. Such a gateway can be used by services running
inside the service mesh to access "legacy" services running outside
the service mesh while still making use of Consul's service identity
based networking and ACL policies.
https://www.consul.io/docs/connect/gateways/terminating-gateway
These gateways are declared as part of a task group level service
definition within the connect stanza.
service {
connect {
gateway {
proxy {
// envoy proxy configuration
}
terminating {
// terminating-gateway configuration entry
}
}
}
}
Currently Envoy is the only supported gateway implementation in
Consul. The gateay task can be customized by configuring the
connect.sidecar_task block.
When the gateway.terminating field is set, Nomad will write/update
the Configuration Entry into Consul on job submission. Because CEs
are global in scope and there may be more than one Nomad cluster
communicating with Consul, there is an assumption that any terminating
gateway defined in Nomad for a particular service will be the same
among Nomad clusters.
Gateways require Consul 1.8.0+, checked by a node constraint.
Closes#9445
Most allocation hooks don't need to know when a single task within the
allocation is restarted. The check watcher for group services triggers the
alloc runner to restart all tasks, but the alloc runner's `Restart` method
doesn't trigger any of the alloc hooks, including the group service hook. The
result is that after the first time a check triggers a restart, we'll never
restart the tasks of an allocation again.
This commit adds a `RunnerTaskRestartHook` interface so that alloc runner
hooks can act if a task within the alloc is restarted. The only implementation
is in the group service hook, which will force a re-registration of the
alloc's services and fix check restarts.
Connect ingress gateway services were being registered into Consul without
an explicit deterministic service ID. Consul would generate one automatically,
but then Nomad would have no way to register a second gateway on the same agent
as it would not supply 'proxy-id' during envoy bootstrap.
Set the ServiceID for gateways, and supply 'proxy-id' when doing envoy bootstrap.
Fixes#9834
If the connect.proxy stanza is left unset, the connection timeout
value is not set but is assumed to be, and may cause a non-fatal NPE
on job submission.
* Persist shared allocated ports for inplace update
Ports were not copied over when performing inplace updates in the
generic scheduler
* changelog
* drop spew
When a client restarts, the network_hook's prerun will call
`CreateNetwork`. Drivers that don't implement their own network manager will
fall back to the default network manager, which doesn't handle the case where
the network namespace is being recreated safely. This results in an error and
the task being restarted for `exec` tasks with `network` blocks (this also
impacts the community `containerd` and probably other community task drivers).
If we get an error when attempting to create the namespace and that error is
because the file already exists and is locked by its process, then we'll
return a `nil` error with the `created` flag set to false, just as we do with
the `docker` driver.
AllocatedSharedResources were not being copied over to the new
allocation struct the scheduler makes during inplace updates. This
caused downstream issues after the plan was applied, namely the shared
ports were dropped causing issues with service
registration/deregistration.
test that shared ports are preserved
change log, also carry over shared network
copy networks
When upgrading from Nomad v0.12.x to v1.0.x, Nomad client will panic on
startup if the node is running Connect enabled jobs. This is caused by
a missing piece of plumbing of the Consul Proxies API interface during the
client restore process.
Fixes#9738
When a task is restored after a client restart, the template runner will
create a new lease for any dynamic secret (ex. Consul or PKI secrets
engines). But because this lease is being created in the prestart hook, we
don't trigger the `change_mode`.
This changeset uses the the existence of the task handle to detect a
previously running task that's been restored, so that we can trigger the
template `change_mode` if the template is changed, as it will be only with
dynamic secrets.
When we iterate over the interfaces returned from CNI setup, we filter for one
with the `Sandbox` field set. Ensure that if none of the interfaces has that
field set that we still return an available interface.
Introduce a new more-block friendly syntax for specifying mounts with a new `mount` block type with the target as label:
```hcl
config {
image = "..."
mount {
type = "..."
target = "target-path"
volume_options { ... }
}
}
```
The main benefit here is that by `mount` being a block, it can nest blocks and avoids the compatibility problems noted in https://github.com/hashicorp/nomad/pull/9634/files#diff-2161d829655a3a36ba2d916023e4eec125b9bd22873493c1c2e5e3f7ba92c691R128-R155 .
The intention is for us to promote this `mount` blocks and quietly deprecate the `mounts` type, while still honoring to preserve compatibility as much as we could.
This addresses the issue in https://github.com/hashicorp/nomad/issues/9604 .
* fix acl event creation
* allow way to access secretID without exposing it to stream
test that values are omitted
test event creation
test acl events
payloads are pointers
fix failing tests, do all security steps inside constructor
* increase time
* ignore empty tokens
* uncomment line
* changelog
This PR enables job submitters to use interpolation in the connect
block of jobs making use of consul connect. Before, only the name of
the connect service would be interpolated, and only for a few select
identifiers related to the job itself (#6853). Now, all connect fields
can be interpolated using the full spectrum of runtime parameters.
Note that the service name is interpolated at job-submission time,
and cannot make use of values known only at runtime.
Fixes#7221
* use full name for events
use evaluation and allocation instead of short name
* update api event stream package and shortnames
* update docs
* make sync; fix typo
* backwards compat not from 1.0.0-beta event stream api changes
* use api types instead of string
* rm backwards compat note that only changed between prereleases
* remove backwards incompat that only existed in prereleases
When the Docker driver kills as task, we send a request via the Docker API for
dockerd to fire the signal. We send that signal and then block for the
`kill_timeout` waiting for the container to exit. But if the Docker API
blocks, we will block indefinitely because we haven't configured the API call
with the same timeout.
This changeset is a minimal intervention to add the timeout to the Docker API
call _only_ when we have the `kill_timeout` set. Future work should examine
whether we should be threading contexts through other `go-dockerclient` API
calls.
Previously, every Envoy Connect sidecar would spawn as many worker
threads as logical CPU cores. That is Envoy's default behavior when
`--concurrency` is not explicitly set. Nomad now sets the concurrency
flag to 1, which is sensible for the default cpu = 250 Mhz resources
allocated for sidecar proxies. The concurrency value can be configured
in Client configuration by setting `meta.connect.proxy_concurrency`.
Closes#9341