- updated region in job metadata that gets persisted to nomad datastore
- fixed many unrelated unit tests that used an invalid region value
(they previously passed because hcl wasn't getting picked up and
the job would default to global region)
It is possible to provide multiple identically named services with
different port assignments in a Nomad configuration.
We introduced a regression when migrating to stable service identifiers where
multiple services with the same name would conflict, and the last definition
would take precedence.
This commit includes the port label in the stable service identifier to
allow the previous behaviour where this was supported, for example
providing:
```hcl
service {
name = "redis-cache"
tags = ["global", "cache"]
port = "db"
check {
name = "alive"
type = "tcp"
interval = "10s"
timeout = "2s"
}
}
service {
name = "redis-cache"
tags = ["global", "foo"]
port = "foo"
check {
name = "alive"
type = "tcp"
port = "db"
interval = "10s"
timeout = "2s"
}
}
service {
name = "redis-cache"
tags = ["global", "bar"]
port = "bar"
check {
name = "alive"
type = "tcp"
port = "db"
interval = "10s"
timeout = "2s"
}
}
```
in a nomad task definition is now completely valid. Each service
definition with the same name must still have a unique port label however.
Currently when you submit a manual request to the alloc lifecycle API
with a version of Curl that will submit empty bodies, the alloc restart
api will fail with an EOF error.
This behaviour is undesired, as it is reasonable to not submit a body at
all when restarting an entire allocation rather than an individual task.
This fixes it by ignoring EOF (not unexpected EOF) errors and treating
them as entire task restarts.
Enterprise only.
Disable preemption for service and batch jobs by default.
Maintain backward compatibility in a x.y.Z release. Consider switching
the default for new clusters in the future.
This exposes a client flag to disable nomad remote exec support in
environments where access to tasks ought to be restricted.
I used `disable_remote_exec` client flag that defaults to allowing
remote exec. Opted for a client config that can be used to disable
remote exec globally, or to a subset of the cluster if necessary.
* master: (912 commits)
Update redirects.txt
Added redirect for Spark guide link
client: log when server list changes
docs: mention regression in task config validation
fix update to changelog
update CHANGELOG with datacenter config validation https://github.com/hashicorp/nomad/pull/5665
typo: "atleast" -> "at least"
implement nomad exec for rkt
docs: fixed typo
use pty/tty terminology similar to github.com/kr/pty
vendor github.com/kr/pty
drivers: implement streaming exec for executor based drivers
executors: implement streaming exec
executor: scaffolding for executor grpc handling
client: expose allocated memory per task
client improve a comment in updateNetworks
stalebot: Add 'thinking' as an exempt label (#5684)
Added Sparrow link
update links to use new canonical location
Add redirects for restructing done in GH-5667
...
This adds a websocket endpoint for handling `nomad exec`.
The endpoint is a websocket interface, as we require a bi-directional
streaming (to handle both input and output), which is not very appropriate for
plain HTTP 1.0. Using websocket makes implementing the web ui a bit simpler. I
considered using golang http hijack capability to treat http request as a plain
connection, but the web interface would be too complicated potentially.
Furthermore, the API endpoint operates against the raw core nomad exec streaming
datastructures, defined in protobuf, with json serializer. Our APIs use json
interfaces in general, and protobuf generates json friendly golang structs.
Reusing the structs here simplify interface and reduce conversion overhead.
This commit causes sync to skip deregistering checks that are not
managed by nomad, such as service maintenance mode checks. This is
handled in the same way as service registrations - by doing a Nomad
specific prefix match.
The current implementation of Service Registration uses a hash of the
nomad-internal state of a service to register it with Consul, this means that
any update to the service invalidates this name and we then deregister, and
recreate the service in Consul.
While this behaviour slightly simplifies reasoning about service registration,
this becomes problematic when we add consul health checks to a service. When
the service is re-registered, so are the checks, which default to failing for
at least one check period.
This commit migrates us to using a stable identifier based on the
allocation, task, and service identifiers, and uses the difference
between the remote and local state to decide when to push updates.
It uses the existing hashing mechanic to decide when UpdateTask should
regenerate service registrations for providing to Sync, but this should
be removable as part of a future refactor.
It additionally introduces the _nomad-check- prefix for check
definitions, to allow for future allowing of consul features like
maintenance mode.
* client/metrics: modified metrics to use (updated) client copy of allocation instead of (unupdated) server copy
* updated armon/go-metrics to address race condition in DisplayMetrics