* Add UnexpectedResultError to nomad/api
This allows users to perform additional status-based behavior by rehydrating the error using `errors.As` inside of consumers.
* build(deps): bump github.com/shoenig/test from 0.6.4 to 0.6.5 in /api
* deps: update shoenig/test to v0.6.5
* deps: update again to v0.6.6
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Seth Hoenig <shoenig@duck.com>
nomad#15861 describes intermitent panics caused by go-metrics Prometheus
client. We have not been able to further debug this problem due to the
lack of information when the panic happens.
go-metrics#146 prevents the panic from happening and also logs
additional information that can help us understand the root cause of the
problem.
This commits pins the go-metric dependency to this branch until we can
better debug the issue.
Their release notes are here: https://github.com/golang-jwt/jwt/releases
Seemed wise to upgrade before we do even more with JWTs. For example
this upgrade *would* have mattered if we already implemented common JWT
claims such as expiration. Since we didn't rely on any claim
verification this upgrade is a noop...
...except for 1 test that called `Claims.Valid()`! Removing that
assertion *seems* scary, but it didn't actually do anything because we
didn't implement any of the standard claims it validated:
https://github.com/golang-jwt/jwt/blob/v4.5.0/map_claims.go#L120-L151
So functionally this major upgrade is a noop.
* Upgrade from hashicorp/go-msgpack v1.1.5 to v2.1.0
Fixes#16808
* Update hashicorp/net-rpc-msgpackrpc to v2 to match go-msgpack
* deps: use go-msgpack v2.0.0
go-msgpack v2.1.0 includes some code changes that we will need to
investigate furthere to assess its impact on Nomad, so keeping this
dependency on v2.0.0 for now since it's no-op.
---------
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
* [no ci] deps: update docker to 23.0.3
This PR brings our docker/docker dependency (which is hosted at github.com/moby/moby)
up to 23.0.3 (forward about 2 years). Refactored our use of docker/libnetwork to
reference the package in its new home, which is docker/docker/libnetwork (it is
no longer an independent repository). Some minor nearby test case cleanup as well.
* add cl
This update changes the behaviour when following logs from an
allocation, so that both stdout and stderr files streamed when the
operator supplies the follow flag. The previous behaviour is held
when all other flags and situations are provided.
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
* client/fingerprint: correctly fingerprint E/P cores of Apple Silicon chips
This PR adds detection of asymetric core types (Power & Efficiency) (P/E)
when running on M1/M2 Apple Silicon CPUs. This functionality is provided
by shoenig/go-m1cpu which makes use of the Apple IOKit framework to read
undocumented registers containing CPU performance data. Currently working
on getting that functionality merged upstream into gopsutil, but gopsutil
would still not support detecting P vs E cores like this PR does.
Also refactors the CPUFingerprinter code to handle the mixed core
types, now setting power vs efficiency cpu attributes.
For now the scheduler is still unaware of mixed core types - on Apple
platforms tasks cannot reserve cores anyway so it doesn't matter, but
at least now the total CPU shares available will be correct.
Future work should include adding support for detecting P/E cores on
the latest and upcoming Intel chips, where computation of total cpu shares
is currently incorrect. For that, we should also include updating the
scheduler to be core-type aware, so that tasks of resources.cores on Linux
platforms can be assigned the correct number of CPU shares for the core
type(s) they have been assigned.
node attributes before
cpu.arch = arm64
cpu.modelname = Apple M2 Pro
cpu.numcores = 12
cpu.reservablecores = 0
cpu.totalcompute = 1000
node attributes after
cpu.arch = arm64
cpu.frequency.efficiency = 2424
cpu.frequency.power = 3504
cpu.modelname = Apple M2 Pro
cpu.numcores.efficiency = 4
cpu.numcores.power = 8
cpu.reservablecores = 0
cpu.totalcompute = 37728
* fingerprint/cpu: follow up cr items
Fixes#16517
Given a 3 Server cluster with at least 1 Client connected to Follower 1:
If a NodeMeta.{Apply,Read} for the Client request is received by
Follower 1 with `AllowStale = false` the Follower will forward the
request to the Leader.
The Leader, not being connected to the target Client, will forward the
RPC to Follower 1.
Follower 1, seeing AllowStale=false, will forward the request to the
Leader.
The Leader, not being connected to... well hoppefully you get the
picture: an infinite loop occurs.
* landlock: git needs more files for private repositories
This PR fixes artifact downloading so that git may work when cloning from
private repositories. It needs
- file read on /etc/passwd
- dir read on /root/.ssh
- file write on /root/.ssh/known_hosts
Add these rules to the landlock rules for the artifact sandbox.
* cr: use nonexistent instead of devnull
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
* cr: use go-homdir for looking up home directory
* pr: pull go-homedir into explicit require
* cr: fixup homedir tests in homeless root cases
* cl: fix root test for real
---------
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
Fixes#16288. An earlier version of `go-plugin` introduced a warning log if
`SecureConfig` is unset. For Nomad and other applications that have "internal"
`go-plugin` consumers where the application runs itself as a plugin, this causes
spurious warn-level logs. For Nomad in particular this means every task driver
and logmon invocation emits the log, which is our primary operation.
The change was reverted upstream, so this changeset picks up the reverted
version.
* artifact: protect against unbounded artifact decompression
Starting with 1.5.0, set defaut values for artifact decompression limits.
artifact.decompression_size_limit (default "100GB") - the maximum amount of
data that will be decompressed before triggering an error and cancelling
the operation
artifact.decompression_file_count_limit (default 4096) - the maximum number
of files that will be decompressed before triggering an error and
cancelling the operation.
* artifact: assert limits cannot be nil in validation
* main: remove deprecated uses of rand.Seed
go1.20 deprecates rand.Seed, and seeds the rand package
automatically. Remove cases where we seed the random package,
and cleanup the one case where we intentionally create a
known random source.
* cl: update cl
* mod: update go mod