This PR adds e2e tests for Consul Namespaces for Nomad Enterprise
with Consul ACLs enabled.
Needed to add support for Consul ACL tokens with `namespace` and
`namespace_prefix` blocks, which Nomad parses and validates before
tossing the token. These bits will need to be picked back to OSS.
ParentID is an internal field that Nomad sets for dispatched or parameterized jobs. Job submitters should not be able to set it directly, as that messes up children tracking.
Fixes#10422 . It specifically stops the scheduler from honoring the ParentID. The reason failure and why the scheduler didn't schedule that job once it was created is very interesting and requires follow up with a more technical issue.
Add templating to `network-interface` option.
This PR also adds a fast-fail to in the case where an invalid interface is set or produced by the template
* add tests and check for valid interface
* Add documentation
* Incorporate suggestions from code review
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
(cherry-picked from ent without _ent things)
This is part 2/4 of e2e tests for Consul Namespaces. Took a
first pass at what the parameterized tests can look like, but
only on the ENT side for this PR. Will continue to refactor
in the next PRs.
Also fixes 2 bugs:
- Config Entries registered by Nomad Server on job registration
were not getting Namespace set
- Group level script checks were not getting Namespace set
Those changes will need to be copied back to Nomad OSS.
Nomad OSS + no ACLs (previously, needs refactor)
Nomad ENT + no ACLs (this)
Nomad OSS + ACLs (todo)
Nomad ENT + ALCs (todo)
This PR introduces the /v1/search/fuzzy API endpoint, used for fuzzy
searching objects in Nomad. The fuzzy search endpoint routes requests
to the Nomad Server leader, which implements the Search.FuzzySearch RPC
method.
Requests to the fuzzy search API are based on the api.FuzzySearchRequest
object, e.g.
{
"Text": "ed",
"Context": "all"
}
Responses from the fuzzy search API are based on the api.FuzzySearchResponse
object, e.g.
{
"Index": 27,
"KnownLeader": true,
"LastContact": 0,
"Matches": {
"tasks": [
{
"ID": "redis",
"Scope": [
"default",
"example",
"cache"
]
}
],
"evals": [],
"deployment": [],
"volumes": [],
"scaling_policy": [],
"images": [
{
"ID": "redis:3.2",
"Scope": [
"default",
"example",
"cache",
"redis"
]
}
]
},
"Truncations": {
"volumes": false,
"scaling_policy": false,
"evals": false,
"deployment": false
}
}
The API is tunable using the new server.search stanza, e.g.
server {
search {
fuzzy_enabled = true
limit_query = 200
limit_results = 1000
min_term_length = 5
}
}
These values can be increased or decreased, so as to provide more
search results or to reduce load on the Nomad Server. The fuzzy search
API can be disabled entirely by setting `fuzzy_enabled` to `false`.
on Linux systems this is derived from the configure cpuset cgroup parent (defaults to /nomad)
for non Linux systems and Linux systems where cgroups are not enabled, the client defaults to using all cores
Consul allows specifying the HTTP body to send in a health check. Nomad
uses Consul for health checking so this just plumbs the value through to
where the Consul API is called.
There is no validation that `body` is not used with an incompatible
check method like GET.
Our API client `delete` method doesn't include a request body, but accepts an
interface for the response. We were accidentally putting the request body into
the response, which doesn't get picked up in unit tests because we're not
reading the (always empty) response body anyways.
The HTTP router did not correctly route `/v1/volumes/external` without being
explicitly added to the top-level router. Break this out into its own request
handler.
Registration of Nomad volumes previously allowed for a single volume
capability (access mode + attachment mode pair). The recent `volume create`
command requires that we pass a list of requested capabilities, but the
existing workflow for claiming volumes and attaching them on the client
assumed that the volume's single capability was correct and unchanging.
Add `AccessMode` and `AttachmentMode` to `CSIVolumeClaim`, use these fields to
set the initial claim value, and add backwards compatibility logic to handle
the existing volumes that already have claims without these fields.
This PR adds the common OSS changes for adding support for Consul Namespaces,
which is going to be a Nomad Enterprise feature. There is no new functionality
provided by this changeset and hopefully no new bugs.
The HTTP test to create CSI volumes depends on having a controller plugin to
talk to, but the test was using a node-only plugin, which allows it to
silently ignore the missing controller.
This commit includes a new test client that allows overriding the RPC
protocols. Only the RPCs that are passed in are registered, which lets you
implement a mock RPC in the server tests. This commit includes an example of
this for the ClientCSI RPC server.
This commit updates the API to pass the MemoryMaxMB field, and the CLI to show
the max set for the task.
Also, start parsing the MemoryMaxMB in hcl2, as it's set by tags.
A sample CLI output; note the additional `Max: ` for "task":
```
$ nomad alloc status 96fbeb0b
ID = 96fbeb0b-a0b3-aa95-62bf-b8a39492fd5c
[...]
Task "cgroup-fetcher" is "running"
Task Resources
CPU Memory Disk Addresses
0/500 MHz 32 MiB/20 MiB 300 MiB
Task Events:
[...]
Task "task" is "running"
Task Resources
CPU Memory Disk Addresses
0/500 MHz 176 KiB/20 MiB 300 MiB
Max: 30 MiB
Task Events:
[...]
```
The OTT feature relies on having a query parameter for a one-time token which
gets handled by the UI. We need to make sure that query param is preserved
when redirecting from the root URL to the `/ui/` URI.
node drain: use msgtype on txn so that events are emitted
wip: encoding extension to add Node.Drain field back to API responses
new approach for hiding Node.SecretID in the API, using `json` tag
documented this approach in the contributing guide
refactored the JSON handlers with extensions
modified event stream encoding to use the go-msgpack encoders with the extensions
Add a `PerAlloc` field to volume requests that directs the scheduler to test
feasibility for volumes with a source ID that includes the allocation index
suffix (ex. `[0]`), rather than the exact source ID.
Read the `PerAlloc` field when making the volume claim at the client to
determine if the allocation index suffix (ex. `[0]`) should be added to the
volume source ID.
Callers of `CSIVolumeByID` are generally assuming they should receive a single
volume. This potentially results in feasibility checking being performed
against the wrong volume if a volume's ID is a prefix substring of other
volume (for example: "test" and "testing").
Removing the incorrect prefix matching from `CSIVolumeByID` breaks prefix
matching in the command line client. Add the required elements for prefix
matching to the commands and API.
* Fixup uses of `sanity`
* Remove unnecessary comments.
These checks are better explained by earlier comments about
the context of the test. Per @tgross, moved the tests together
to better reinforce the overall shared context.
* Update nomad/fsm_test.go
If the user has disabled Prometheus metrics and a request is
sent to the metrics endpoint requesting Prometheus formatted
metrics, then the request should fail.