This PR changes Nomad's wrapper around the Consul NamespaceAPI so that
it will detect if the Consul Namespaces feature is enabled before making
a request to the Namespaces API. Namespaces are not enabled in Consul OSS,
and require a suitable license to be used with Consul ENT.
Previously Nomad would check for a 404 status code when makeing a request
to the Namespaces API to "detect" if Consul OSS was being used. This does
not work for Consul ENT with Namespaces disabled, which returns a 500.
Now we avoid requesting the namespace API altogether if Consul is detected
to be the OSS sku, or if the Namespaces feature is not licensed. Since
Consul can be upgraded from OSS to ENT, or a new license applied, we cache
the value for 1 minute, refreshing on demand if expired.
Fixes https://github.com/hashicorp/nomad-enterprise/issues/575
Note that the ticket originally describes using attributes from https://github.com/hashicorp/nomad/issues/10688.
This turns out not to be possible due to a chicken-egg situation between
bootstrapping the agent and setting up the consul client. Also fun: the
Consul fingerprinter creates its own Consul client, because there is no
[currently] no way to pass the agent's client through the fingerprint factory.
When `nomad volume create` was introduced in Nomad 1.1.0, we changed the
volume spec to take a list of capabilities rather than a single capability, to
meet the requirements of the CSI spec. When a volume is registered via `nomad
volume register`, we should be using the same fields to validate the volume
with the controller plugin.
This PR adds two additional constraints on Connect sidecar and gateway tasks,
making sure Nomad schedules them only onto nodes where Connect is actually
enabled on the Consul agent.
Consul requires `connect.enabled = true` and `ports.grpc = <number>` to be
explicitly set on agent configuration before Connect APIs will work. Until
now, Nomad would only validate a minimum version of Consul, which would cause
confusion for users who try to run Connect tasks on nodes where Consul is not
yet sufficiently configured. These contstraints prevent job scheduling on nodes
where Connect is not actually use-able.
Closes#10700
When a wildcard namespace is used for `nomad job` commands that support prefix
matching, avoid asking the user for input if a prefix is an unambiguous exact
match so that the behavior is similar to the commands using a specific or
unset namespace.
Adds clarification to `nomad volume create` commands around how the `volume`
block in the jobspec overrides this behavior. Adds missing section to `nomad
volume register` and to example volume spec for both commands.
The default Linux Capabilities set enabled by the docker, exec, and
java task drivers includes CAP_NET_RAW (for making ping just work),
which has the side affect of opening an ARP DoS/MiTM attack between
tasks using bridge networking on the same host network.
https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities
This PR disables CAP_NET_RAW for the docker, exec, and java task
drivers. The previous behavior can be restored for docker using the
allow_caps docker plugin configuration option.
A future version of nomad will enable similar configurability for the
exec and java task drivers.
This PR uses the checksum of the check for which a dynamic exposed
port is being generated (instead of a UUID prefix) so that the
generated port label is deterministic.
This fixes 2 bugs:
- 'job plan' output is now idempotent for jobs making use of injected ports
- tasks will no longer be destructively updated when jobs making use of
injected ports are re-run without changing any user specified part of
job config.
Closes: https://github.com/hashicorp/nomad/issues/10099
This is the first step in #10268. If a maximum is not specified, the
task group sum uses the memory number instead. The maximum is only
shown when it’s higher than the memory sum.
This updates the UI to use the new fuzzy search API. It’s a drop-in
replacement so the / shortcut to jump to search is preserved, and
results can be cycled through and chosen via arrow keys and the
enter key.
It doesn’t use everything returned by the API:
* deployments and evaluations: these match by id, doesn’t seem like
people would know those or benefit from quick navigation to them
* namespaces: doesn’t seem useful as they currently function
* scaling policies
* tasks: the response doesn’t include an allocation id, which means they
can’t be navigated to in the UI without an additional query
* CSI volumes: aren’t actually returned by the API
Since there’s no API to check the server configuration and know whether
the feature has been disabled, this adds another query in
route:application#beforeModel that acts as feature detection: if the
attempt to query fails (500), the global search field is hidden.
Upon having added another query on load, I realised that beforeModel was
being triggered any time service:router#transitionTo was being called,
which happens upon navigating to a search result, for instance, because
of refreshModel being present on the region query parameter. This PR
adds a check for transition.queryParamsOnly and skips rerunning the
onload queries (token permissions check, license check, fuzzy search
feature detection).
Implementation notes:
* there are changes to unrelated tests to ignore the on-load feature
detection query
* some lifecycle-related guards against undefined were required to
address failures when navigating to an allocation
* the minimum search length of 2 characters is hard-coded as there’s
currently no way to determine min_term_length in the UI
This PR fixes a bug where Nomad was more restrictive on Ingress Gateway Configuration
Entry definitions than Consul. Before, Nomad would not allow for declaring IGCEs with
http listeners with service name "*", which is a special feature allowable by Consul.
Note: to make http protocol work, a service-default must be defined setting the
protocol to http for each service.
Fixes: #9729
This PR adds job-submission validation that checks for the use of uppercase characters
in group and service names for services that make use of Consul Connect. This prevents
attempting to launch services that Consul will not validate correctly, which in turn
causes tasks to fail to launch in Nomad.
Underlying Consul issue: https://github.com/hashicorp/consul/issues/6765Closes#7581#10450
This PR wraps the use of the consul envoy bootstrap command in
an expoenential backoff closure, configured to timeout after 60
seconds. This is an increase over the current behavior of making
3 attempts over 6 seconds.
Should help with #10451