* taskenv: add connect upstream env vars + test
* set taskenv upstreams instead of appending
* Update client/taskenv/env.go
Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
* adds meta object to service in job spec, sends it to consul
* adds tests for service meta
* fix tests
* adds docs
* better hashing for service meta, use helper for copying meta when registering service
* tried to be DRY, but looks like it would be more work to use the
helper function
Fixes#6041
Unlike all other Consul operations, boostrapping requires Consul be
available. This PR tries Consul 3 times with a backoff to account for
the group services being asynchronously registered with Consul.
Adds a check for differences in `job.Diff` so that task group networks
and services, including new Consul connect stanzas, show up in the job
plan outputs.
* nomad: add admission controller framework
* nomad: add admission controller framework and Consul Connect hooks
* run admission controllers before checking permissions
* client: add default node meta for connect configurables
* nomad: remove validateJob func since it has been moved to admission controller
* nomad: use new TaskKind type
* client: use consts for connect sidecar image and log level
* Apply suggestions from code review
Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
* nomad: add job register test with connect sidecar
* Update nomad/job_endpoint_hooks.go
Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
* jobspec: breakup parse.go into smaller files
* add sidecar_task parsing to jobspec and api
* jobspec: combine service parsing logic for task and group service stanzas
* api: use slice of ConsulUpstream values instead of pointers
Also includes unit tests for binpacker and preemption.
The tests verify that network resources specified at the
task group level are properly accounted for
Here, we ensure that when leader only responds to RPC calls when state
store is up to date. At leadership transition or launch with restored
state, the server local store might not be caught up with latest raft
logs and may return a stale read.
The solution here is to have an RPC consistency read gate, enabled when
`establishLeadership` completes before we respond to RPC calls.
`establishLeadership` is gated by a `raft.Barrier` which ensures that
all prior raft logs have been applied.
Conversely, the gate is disabled when leadership is lost.
This is very much inspired by https://github.com/hashicorp/consul/pull/3154/files
Plan application should use a state snapshot at or after the Raft index
at which the plan was created otherwise it risks being rejected based on
stale data.
This commit adds a Plan.SnapshotIndex which is set by workers when
submitting plan. SnapshotIndex is set to the Raft index of the snapshot
the worker used to generate the plan.
Plan.SnapshotIndex plays a similar role to PlanResult.RefreshIndex.
While RefreshIndex informs workers their StateStore is behind the
leader's, SnapshotIndex is a way to prevent the leader from using a
StateStore behind the worker's.
Plan.SnapshotIndex should be considered the *lower bound* index for
consistently handling plan application.
Plans must also be committed serially, so Plan N+1 should use a state
snapshot containing Plan N. This is guaranteed for plans *after* the
first plan after a leader election.
The Raft barrier on leader election ensures the leader's statestore has
caught up to the log index at which it was elected. This guarantees its
StateStore is at an index > lastPlanIndex.
Fix a case where `node.StatusUpdatedAt` was manipulated directly in
memory.
This ensures that StatusUpdatedAt is set in raft layer, and ensures that
the field is updated when node drain/eligibility is updated too.
* master: (912 commits)
Update redirects.txt
Added redirect for Spark guide link
client: log when server list changes
docs: mention regression in task config validation
fix update to changelog
update CHANGELOG with datacenter config validation https://github.com/hashicorp/nomad/pull/5665
typo: "atleast" -> "at least"
implement nomad exec for rkt
docs: fixed typo
use pty/tty terminology similar to github.com/kr/pty
vendor github.com/kr/pty
drivers: implement streaming exec for executor based drivers
executors: implement streaming exec
executor: scaffolding for executor grpc handling
client: expose allocated memory per task
client improve a comment in updateNetworks
stalebot: Add 'thinking' as an exempt label (#5684)
Added Sparrow link
update links to use new canonical location
Add redirects for restructing done in GH-5667
...
Fixes#1795
Running restored allocations and pulling what allocations to run from
the server happen concurrently. This means that if a client is rebooted,
and has its allocations rescheduled, it may restart the dead allocations
before it contacts the server and determines they should be dead.
This commit makes tasks that fail to reattach on restore wait until the
server is contacted before restarting.
Our testing so far indicates that ugorji/go/codec maintains backward
compatiblity with the version we are using now, for purposes of Nomad
serialization.
Using latest ugorji/go allows us to get back to using upstream library,
get get the optimizations benefits in RPC paths (including code
generation optimizations).
ugorji/go introduced two significant changes:
* time binary format in debb8e2d2e. Setting `h.BasicHandle.TimeNotBuiltin = true` restores old behavior
* ugorji/go started honoring `json` tag as well:
v1.1.4 is the latest but has a bug in handling RawString that's fixed in
d09a80c1e0
.
`Index` is already included as part of `WriteMeta` embedding.
This is a backward compatible change: Clients never read the field; and
Server refernces to `EmitNodeEventsResponse.Index` would be using the
value in `WriteMeta`, which is consistent with other response structs.
This command will be used to send a signal to either a single task within an
allocation, or all of the tasks if <task-name> is omitted. If the sent signal
terminates the allocation, it will be treated as if the allocation has crashed,
rather than as if it was operator-terminated.
Signal validation is currently handled by the driver itself and nomad
does not attempt to restrict or validate them.