Follow up to memory oversubscription - expose an env-var to indicate when memory oversubscription is enabled and what the limit is.
This will be helpful for setting hints to app for memory management.
Co-authored-by: Seth Hoenig <shoenig@hashicorp.com>
This PR introduces the /v1/search/fuzzy API endpoint, used for fuzzy
searching objects in Nomad. The fuzzy search endpoint routes requests
to the Nomad Server leader, which implements the Search.FuzzySearch RPC
method.
Requests to the fuzzy search API are based on the api.FuzzySearchRequest
object, e.g.
{
"Text": "ed",
"Context": "all"
}
Responses from the fuzzy search API are based on the api.FuzzySearchResponse
object, e.g.
{
"Index": 27,
"KnownLeader": true,
"LastContact": 0,
"Matches": {
"tasks": [
{
"ID": "redis",
"Scope": [
"default",
"example",
"cache"
]
}
],
"evals": [],
"deployment": [],
"volumes": [],
"scaling_policy": [],
"images": [
{
"ID": "redis:3.2",
"Scope": [
"default",
"example",
"cache",
"redis"
]
}
]
},
"Truncations": {
"volumes": false,
"scaling_policy": false,
"evals": false,
"deployment": false
}
}
The API is tunable using the new server.search stanza, e.g.
server {
search {
fuzzy_enabled = true
limit_query = 200
limit_results = 1000
min_term_length = 5
}
}
These values can be increased or decreased, so as to provide more
search results or to reduce load on the Nomad Server. The fuzzy search
API can be disabled entirely by setting `fuzzy_enabled` to `false`.
This PR enables job submitters to use interpolation in the connect
block of jobs making use of consul connect. Before, only the name of
the connect service would be interpolated, and only for a few select
identifiers related to the job itself (#6853). Now, all connect fields
can be interpolated using the full spectrum of runtime parameters.
Note that the service name is interpolated at job-submission time,
and cannot make use of values known only at runtime.
Fixes#7221
Beforehand tasks and field replacements did not have access to the
unique ID of their job or its parent. This adds this information as
new environment variables.
Now that alloc.Canonicalize() is called in all alloc sources in the
client (i.e. on state restore and RPC fetching), we no longer need to
check alloc.TaskResources.
alloc.AllocatedResources is always non-nil through alloc runner.
Though, early on, we check for alloc validity, so NewTaskRunner and
TaskEnv must still check. `TestClient_AddAllocError` test validates
that behavior.
* client: improve group service stanza interpolation and check_restart support
Interpolation can now be done on group service stanzas. Note that some task runtime specific information
that was previously available when the service was registered poststart of a task is no longer available.
The check_restart stanza for checks defined on group services will now properly restart the allocation upon
check failures if configured.
* taskenv: add connect upstream env vars + test
* set taskenv upstreams instead of appending
* Update client/taskenv/env.go
Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
Fixes https://github.com/hashicorp/nomad/issues/5587
When a nomad 0.9 client is handling an alloc generated by a nomad 0.8
server, we should check the alloc.TaskResources for networking details
rather than task.Resources.
We check alloc.TaskResources for networking for other tasks in the task
group [1], so it's a bit odd that we used the task.Resources struct
here. TaskRunner also uses `alloc.TaskResources`[2].
The task.Resources struct in 0.8 was sparsly populated, resulting to
storing of 0 in port mapping env vars:
```
vagrant@nomad-server-01:~$ nomad version
Nomad v0.8.7 (21a2d93eecf018ad2209a5eab6aae6c359267933+CHANGES)
vagrant@nomad-server-01:~$ nomad server members
Name Address Port Status Leader Protocol Build Datacenter Region
nomad-server-01.global 10.199.0.11 4648 alive true 2 0.8.7 dc1 global
vagrant@nomad-server-01:~$ nomad alloc status -json 5b34649b | jq '.Job.TaskGroups[0].Tasks[0].Resources.Networks'
[
{
"CIDR": "",
"Device": "",
"DynamicPorts": [
{
"Label": "db",
"Value": 0
}
],
"IP": "",
"MBits": 10,
"ReservedPorts": null
}
]
vagrant@nomad-server-01:~$ nomad alloc status -json 5b34649b | jq '.TaskResources'
{
"redis": {
"CPU": 500,
"DiskMB": 0,
"IOPS": 0,
"MemoryMB": 256,
"Networks": [
{
"CIDR": "",
"Device": "eth1",
"DynamicPorts": [
{
"Label": "db",
"Value": 21722
}
],
"IP": "10.199.0.21",
"MBits": 10,
"ReservedPorts": null
}
]
}
}
```
Also, updated the test values to mimic how Nomad 0.8 structs are
represented, and made its result match the non compact values in
`TestEnvironment_AsList`.
[1] 24e9040b18/client/taskenv/env.go (L624-L639)
[2] https://github.com/hashicorp/nomad/blob/master/client/allocrunner/taskrunner/task_runner.go#L287-L303
**The Bug:**
You may have seen log lines like this when running 0.9.0-dev:
```
... client.alloc_runner.task_runner: some environment variables not available for rendering: ... keys="attr.driver.docker.volumes.enabled, attr.driver.docker.version, attr.driver.docker.bridge_ip, attr.driver.qemu.version"
```
Not only should we not be erroring on builtin driver attributes, but the
results were nondeterministic due to map iteration order!
The root cause is that we have an old root attribute for all drivers
like:
```
attr.driver.docker = "1"
```
When attributes were opaque variable names it was fine to also have
"nested" attributes like:
```
attr.driver.docker.version = "1.2.3"
```
However in the HCLv2 world the variable names are no longer opaque: they
form an object tree. The `docker` object can no longer both hold a value
(`"1"`) *and* nested attributes (`version = "1.2.3"`).
**The Fix:**
Since the old `attr.driver.<name> = "1"` attribues are useless for task
config interpolation, create a new precedence rule for creating the task
config evaluation context:
*Maps take precedence over primitives.*
This means `attr.driver.docker.version` will always take precedence over
`attr.driver.docker`. The results are determinstic and give users access
to the more useful metadata.
I made this a general precedence rule instead of special-casing driver
attrs because it seemed like better default behavior than spamming
WARNings to logs that were likely unactionable by users.