Fixes#2891
Previously the agent services and checks were being asynchrously
deregistered on shutdown, so it was a race between the sync goroutine
deregistering them and Nomad shutting down.
This switches to synchronously deregister agent serivces and checks
which doesn't really have a downside since the sync goroutines retry
behavior doesn't help on shutdown anyway.
* alloc_runner
* Random tests
* parallel task_runner and no exec compatible check
* Parallel client
* Fail fast and use random ports
* Fix docker port mapping
* Make concurrent pull less timing dependant
* up parallel
* Fixes
* don't build chroots in parallel on travis
* Reduce parallelism on travis with lxc/rkt
* make java test app not run forever
* drop parallelism a little
* use docker ports that are out of the os's ephemeral port range
* Limit even more on travis
* rkt deadline
Fixes#2827
This is a tradeoff. The pro is that you can run separate client and
server agents on the same node and advertise both. The con is that if a
Nomad agent crashes and isn't restarted on that node in the same mode
its entry will not be cleaned up.
That con scenario seems far less likely to occur than the scenario on
the pro side, and even if we do leak an agent entry the checks will be
failing so nothing should attempt to use it.
This PR adds watching of allocation health at the client. The client can
watch for health based on the tasks running on time and also based on
the consul checks passing.
Ideally DriverNetwork would be fully populated in Driver.Prestart, but
Docker doesn't assign the container's IP until you start the container.
However, it's important to setup the port env vars before calling
Driver.Start, so Prestart should populate that.
Previously was interpolating the original task's services again.
Fixes#2180
Also fixes a slight memory leak in the new consul agent. Script check
handles weren't being deleted after cancellation.
Previous implementation assumed all struct fields were included in
service and check IDs. Service IDs never include port labels and check
IDs *optionally* include port labels, so lots of things had to change.
Added a really big test to exercise this.
Fixes#2478#2474#1995#2294
The new client only handles agent and task service advertisement. Server
discovery is mostly unchanged.
The Nomad client agent now handles all Consul operations instead of the
executor handling task related operations. When upgrading from an
earlier version of Nomad existing executors will be told to deregister
from Consul so that the Nomad agent can re-register the task's services
and checks.
Drivers - other than qemu - now support an Exec method for executing
abritrary commands in a task's environment. This is used to implement
script checks.
Interfaces are used extensively to avoid interacting with Consul in
tests that don't assert any Consul related behavior.
trackedServices, delegateChecks, trackedChecks, and checkRunners
should all be covered. This lock needs to be reasonably narrow and
can't use defer due to possible recursive locking concerns further
downstream from the call sites.
The source of truth is the local Nomad Agent. Any checks are not local that
have a matching prefix are removed. Changed checks are re-registered
and missing checks are re-added.
The source of truth is the local Nomad Agent. Any services not local that
have a matching prefix are removed. Changed services are re-registered
and missing services are re-added.
Now the right way to register services with the Syncer is to call
`SetServices(groupName, []*services)`. This was required to allow
the Syncer to sync either the Client, Server, or Both using a
single Syncer.
This call is obsolete by a future commit that changes the canonical
source of truth to be consul.AgentServiceRegistration structs, which
means it is not necessary to construct AgentServiceRegistration
objects every time a registration is made, we just reuse the existing
object.
This has been done to allow the Server and Client to reuse the same
Syncer because the Agent may be running Client, Server, or both
simultaneously and we only want one Syncer object alive in the agent.