This PR fixes a bug where the Docker network pause container would not be
stopped and removed when a node is restarted, the alloc is moved to another
node, and the node comes back up. See the issue below for the full repro
conditions.
Basically, in the DestroyNetwork PostRun hook we depended on the
NetworkIsolationSpec field not being nil, which is only the case
if the Client stays alive all the way from network creation to network
teardown. If the node is rebooted we lose that state and previously
could not find the pause container to remove. Now we find the pause
container manually, by scanning all containers and looking for the
associated allocID.
Fixes #17299
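For illustration, a minimal sketch of that scan using the Docker Engine API
client (signatures vary across Docker SDK versions); the helper name and the
assumption that the pause container's name embeds the alloc ID are
illustrative, not the exact Nomad implementation:

```go
package network

import (
	"context"
	"fmt"
	"strings"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
)

// findPauseContainer lists all containers and matches the pause container
// by the alloc ID embedded in its name, instead of relying on in-memory
// NetworkIsolationSpec state that is lost when the node reboots.
func findPauseContainer(ctx context.Context, docker *client.Client, allocID string) (string, error) {
	containers, err := docker.ContainerList(ctx, types.ContainerListOptions{All: true})
	if err != nil {
		return "", err
	}
	for _, c := range containers {
		for _, name := range c.Names {
			if strings.Contains(name, allocID) {
				return c.ID, nil
			}
		}
	}
	return "", fmt.Errorf("no pause container found for alloc %s", allocID)
}
```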
* Experiment to parallelize tests
* Logging to try to solve test flakiness
* Logging in another failure
* Hardening for one test and snapshot for another
* Explicitly set the first one as the servicedAlloc instead of randomly picking
* A wild CircleCI test failure appears
* de-log
During shutdown of a client with drain_on_shutdown there is a race between
the Client ending the cgroup and the task's cpuset manager cleaning up
the cgroup. During the path traversal, skip anything we cannot read, which
avoids dereferencing the nil DirEntry we currently hit.
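A minimal sketch of the traversal fix using `filepath.WalkDir` (the helper
name is illustrative):

```go
package cgutil

import (
	"io/fs"
	"path/filepath"
)

// visitCgroups walks the cgroup hierarchy. If WalkDir reports an error for
// an entry (e.g. the cgroup was removed concurrently during shutdown), the
// DirEntry can be nil, so we skip the entry rather than dereference it.
func visitCgroups(root string, visit func(path string)) error {
	return filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return nil // unreadable or vanished; skip and keep walking
		}
		if d.IsDir() {
			visit(path)
		}
		return nil
	})
}
```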
Go released a security update to fix build-time code injection and execution via
CGO. This doesn't impact already-released versions of Nomad, just the build
toolchain, so we won't be releasing a Nomad security update to go with it.
Implement scheduler support for node pools:
* When a scheduler is invoked, we get the set of ready nodes in the DCs that
  are allowed for that job. Extend that filter to include the node pool (see
  the sketch after this list).
* Ensure that changes to a job's node pool are picked up as destructive
allocation updates.
* Add `NodesInPool` as a metric to all reporting done by the scheduler.
* Add the node-in-pool filter to the `Node.Register` RPC so that we don't
  generate spurious evals for nodes in the wrong pool.
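A sketch of the extended filter, with simplified stand-in types rather than
Nomad's internal structs (the real filter also handles details like wildcard
datacenters, omitted here):

```go
package main

import "fmt"

// Node and Job are simplified stand-ins for the scheduler's real types.
type Node struct {
	Name, Datacenter, NodePool, Status string
}

type Job struct {
	Datacenters []string
	NodePool    string
}

// filterReadyNodes sketches the extended feasibility pre-filter: a node
// must be ready, in one of the job's datacenters, and in the job's pool.
func filterReadyNodes(nodes []*Node, job *Job) []*Node {
	dcs := make(map[string]bool, len(job.Datacenters))
	for _, dc := range job.Datacenters {
		dcs[dc] = true
	}
	var out []*Node
	for _, n := range nodes {
		if n.Status == "ready" && dcs[n.Datacenter] && n.NodePool == job.NodePool {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	nodes := []*Node{
		{Name: "a", Datacenter: "dc1", NodePool: "default", Status: "ready"},
		{Name: "b", Datacenter: "dc1", NodePool: "gpu", Status: "ready"},
	}
	job := &Job{Datacenters: []string{"dc1"}, NodePool: "gpu"}
	for _, n := range filterReadyNodes(nodes, job) {
		fmt.Println(n.Name) // prints: b
	}
}
```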
Implements the HTTP API associated with the `NodePool.ListJobs` RPC, including
the `api` package for the public API and documentation.
Update the `NodePool.ListJobs` RPC to add the previously missing handling of
the special "all" pool.
Namely, these workflows: test-e2e, test-ui, and test-windows.
Extra-curricularly, as part of the overall company-wide migration effort,
this also includes some standardization such as:
* explicit permissions:read on various workflows
* pinned action version shas (per https://github.com/hashicorp/security-public-tsccr)
* actionlint, which among other things runs
shellcheck on GHA run steps
Co-authored-by: emilymianeil <eneil@hashicorp.com>
Co-authored-by: Daniel Kimsey <daniel.kimsey@hashicorp.com>
* "allow" (don't try to drop) linux capabilities
in the docker test driver harness (see #15181)
* refactor to allow different busybox images, since Windows containers
  need to be the same version as the underlying OS and we're moving from
  2016 to 2019
* one docker test was flaky, apparently from being a bit slower on Windows,
  so add a Wait()
When checking whether a token is allowed to query the search endpoints we
need to return an error if the search context includes `node_pool` and the
token doesn't have access to _any_ pool. This ensures we return a permission
denied error rather than an empty list.
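A minimal sketch of the fail-closed check; the interface and helper names
are hypothetical:

```go
package search

import "errors"

var errPermissionDenied = errors.New("Permission denied")

// ACLChecker is a stand-in for the token's resolved ACL object.
type ACLChecker interface {
	// HasAnyNodePoolRead reports whether the token can read at least
	// one node pool. The name is hypothetical.
	HasAnyNodePoolRead() bool
}

// checkSearchContexts fails closed: if the requested contexts include
// node_pool and the token can read no pool at all, return permission
// denied instead of silently returning an empty result.
func checkSearchContexts(acl ACLChecker, contexts []string) error {
	for _, ctx := range contexts {
		if ctx == "node_pool" && !acl.HasAnyNodePoolRead() {
			return errPermissionDenied
		}
	}
	return nil
}
```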
Add a new job admission hook for node pools that enforces the pool exists on
registration. Also provide the skeleton function we need for Enterprise
enforcement functions we'll implement later.
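A sketch of the hook's shape, with simplified stand-in types; the real hook
plugs into Nomad's job admission chain:

```go
package jobhooks

import "fmt"

// Job and StateLookup are simplified stand-ins for Nomad's internal types.
type Job struct {
	Name     string
	NodePool string
}

type StateLookup interface {
	NodePoolExists(name string) (bool, error)
}

// nodePoolValidator rejects registration of a job whose node pool does
// not exist. An Enterprise enforcement hook would slot in alongside it
// with the same shape.
type nodePoolValidator struct {
	state StateLookup
}

func (v *nodePoolValidator) Validate(job *Job) (warnings []error, err error) {
	exists, err := v.state.NodePoolExists(job.NodePool)
	if err != nil {
		return nil, err
	}
	if !exists {
		return nil, fmt.Errorf("job %q uses nonexistent node pool %q", job.Name, job.NodePool)
	}
	return nil, nil
}
```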
Move all validations related to task fields to Task.Validate(). Prior to
this, some task validations were being done inside TaskGroup.Validate()
because they required access to some group values.
But similarly to how TaskGroup.Validate() takes the job as a parameter, it's
fair to expect the task to receive its group.
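A sketch of the new shape, with simplified types:

```go
package structs

import "fmt"

// Simplified stand-ins showing only the validation shape.
type Job struct{ Name string }

type TaskGroup struct {
	Name  string
	Tasks []*Task
}

type Task struct{ Name string }

// TaskGroup.Validate still takes the job; it now delegates task-level
// checks by handing each task its enclosing group.
func (tg *TaskGroup) Validate(job *Job) error {
	for _, t := range tg.Tasks {
		if err := t.Validate(tg); err != nil {
			return fmt.Errorf("task %s invalid: %w", t.Name, err)
		}
	}
	return nil
}

// Task.Validate receives the group, so validations that need group
// values (previously stranded in TaskGroup.Validate) can live here.
func (t *Task) Validate(tg *TaskGroup) error {
	if t.Name == "" {
		return fmt.Errorf("missing task name in group %s", tg.Name)
	}
	return nil
}
```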
This changeset only adds the `node_pool` field to the jobspec, and ensures that
it gets picked up correctly as a change. Without the rest of the implementation
landed yet, the field will be ignored.
The allocrunner sends several updates to the server during the early lifecycle
of an allocation and its tasks. Clients batch-up allocation updates every 200ms,
but experiments like the C2M challenge have shown that even with this batching,
servers can be overwhelmed with client updates during high volume
deployments. Benchmarking done in #9451 has shown that client updates can easily
represent ~70% of all Nomad Raft traffic.
Each allocation sends many updates during its lifetime, but only those that
change the `ClientStatus` field are critical for progressing a deployment or
kicking off a reschedule to recover from failures.
Add a priority to the client allocation sync and update the `syncTicker`
receiver so that we only send an update if there's a high priority update
waiting, or on every 5th tick. This means when there are no high priority
updates, the client will send updates at most every 1s instead of
200ms. Benchmarks have shown this can reduce overall Raft traffic by 10%, as
well as reduce client-to-server RPC traffic.
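A sketch of that cadence, with hypothetical names for the buffer interface:

```go
package client

import "time"

// updatePool is a stand-in for the shared update buffer described below.
type updatePool interface {
	hasUrgent() bool // a pending update changed ClientStatus
	sendBatch()      // drain the buffer and RPC the batch to servers
}

// allocSyncLoop sketches the new cadence: the ticker still fires every
// 200ms, but we only send when a high priority (status-changing) update
// is queued, or on every 5th tick, i.e. at most once per second otherwise.
func allocSyncLoop(updates updatePool, stop <-chan struct{}) {
	ticker := time.NewTicker(200 * time.Millisecond)
	defer ticker.Stop()

	ticks := 0
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			ticks++
			if updates.hasUrgent() || ticks%5 == 0 {
				updates.sendBatch()
			}
		}
	}
}
```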
This changeset also switches from a channel-based collection of updates to a
shared buffer, so as to split batching from sending and prevent backpressure
onto the allocrunner when the RPC is slow. This doesn't have a major performance
benefit in the benchmarks but makes the implementation of the prioritized update
simpler.
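A sketch of the shared-buffer pattern (simplified types; "urgent" stands in
for the high-priority flag):

```go
package client

import "sync"

// allocUpdate is a simplified stand-in for an allocation state update.
type allocUpdate struct {
	allocID       string
	statusChanged bool // ClientStatus changed: high priority
}

// updateBuffer replaces the channel: allocrunners write the latest update
// per alloc under a lock and never block on a slow RPC; the sync loop
// drains the whole map when it decides to send.
type updateBuffer struct {
	mu      sync.Mutex
	pending map[string]*allocUpdate
	urgent  bool
}

func (b *updateBuffer) put(u *allocUpdate) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.pending[u.allocID] = u // later updates supersede earlier ones
	b.urgent = b.urgent || u.statusChanged
}

func (b *updateBuffer) drain() []*allocUpdate {
	b.mu.Lock()
	defer b.mu.Unlock()
	batch := make([]*allocUpdate, 0, len(b.pending))
	for _, u := range b.pending {
		batch = append(batch, u)
	}
	b.pending = make(map[string]*allocUpdate)
	b.urgent = false
	return batch
}
```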
Fixes: #9451