* client: protect user lookups with global lock
This PR updates Nomad client to always do user lookups while holding
a global process lock. This is to prevent concurrency unsafe implementations
of NSS, but still enabling NSS lookups of users (i.e. cannot not use osusergo).
* cl: add cl
http.NewRequestWithContext will only set the right value for
Content-Length if the input is *bytes.Buffer, *bytes.Reader, or
*strings.Reader [0].
Since os.Stdin is an os.File, POST requests made with the `nomad
operator api` command would always have Content-Length set to -1, which
is interpreted as an unknown length by web servers.
[0]: https://pkg.go.dev/net/http#NewRequestWithContext
In Nomad 1.2.6 we shipped `eval list`, which accepts a `-json` flag, and
deprecated the usage of `eval status` without an evaluation ID with an upgrade
note that it would be removed in Nomad 1.4.0. This changeset completes that
work.
The `operator debug` command writes JSON files from API responses as a single
line containing an array of JSON objects. But some of these files can be
extremely large (GB's) for large production clusters, which makes it difficult
to parse them using typical line-oriented Unix command line tools that can
stream their inputs without consuming a lot of memory.
For collections that are typically large, instead emit newline-delimited JSON.
This changeset includes some first-pass refactoring of this command. It breaks
up monolithic methods that validate a path, create a file, serialize objects,
and write them to disk into smaller functions, some of which can now be
standalone to take advantage of generics.
Operators may have a setup whereby the TLS config comes from a
source other than setting Nomad specific env vars. In this case,
we should attempt to identify the scheme using the config setting
as a fallback.
* cleanup: refactor MapStringStringSliceValueSet to be cleaner
* cleanup: replace SliceStringToSet with actual set
* cleanup: replace SliceStringSubset with real set
* cleanup: replace SliceStringContains with slices.Contains
* cleanup: remove unused function SliceStringHasPrefix
* cleanup: fixup StringHasPrefixInSlice doc string
* cleanup: refactor SliceSetDisjoint to use real set
* cleanup: replace CompareSliceSetString with SliceSetEq
* cleanup: replace CompareMapStringString with maps.Equal
* cleanup: replace CopyMapStringString with CopyMap
* cleanup: replace CopyMapStringInterface with CopyMap
* cleanup: fixup more CopyMapStringString and CopyMapStringInt
* cleanup: replace CopySliceString with slices.Clone
* cleanup: remove unused CopySliceInt
* cleanup: refactor CopyMapStringSliceString to be generic as CopyMapOfSlice
* cleanup: replace CopyMap with maps.Clone
* cleanup: run go mod tidy
The `TestVarGetCommand` test uses the wrong namespace in the autocomplete
test. The namespace only gets validated against if we have quota enforcement (or
more typically by ACL checks), so the test only fails in the ENT repo test runs.
This PR implements support for check_restart for checks registered
in the Nomad service provider.
Unlike Consul, Nomad service checks never report a "warning" status,
and so the check_restart.ignore_warnings configuration is not valid
for Nomad service checks.
This PR refactors agent/consul/check_watcher into client/serviceregistration,
and abstracts away the Consul-specific check lookups.
In doing so we should be able to reuse the existing check watcher logic for
also watching NSD checks in a followup PR.
A chunk of consul/unit_test.go is removed - we'll cover that in e2e tests
in a follow PR if needed. In the long run I'd like to remove this whole file.
When querying the checks for an allocation, the request must be
forwarded to the agent that is running the allocation. If the
initial request is made to a server agent, the request can be made
directly to the client agent running the allocation. If the
request is made to a client agent not running the alloc, the
request needs to be forwarded to a server and then the correct
client.
Nomad's original autopilot was importing from a private package in Consul. It
has been moved out to a shared library. Switch Nomad to use this library so that
we can eliminate the import of Consul, which is necessary to build Nomad ENT
with the current version of the Consul SDK. This also will let us pick up
autopilot improvements shared with Consul more easily.
These options are mutually exclusive but, since `-hcl2-strict` defaults
to `true` users had to explicitily set it to `false` when using `-hcl1`.
Also return `255` when job plan fails validation as this is the expected
code in this situation.
* Update Consul Template dep to support Nomad vars
* Remove `Peering` config for Consul Testservers
Upgrading to the 1.14 Consul SDK introduces and additional default
configuration—`Peering`—that is not compatible with versions of Consul
before v1.13.0. because Nomad tests against Consul v1.11.1, this
configuration has to be nil'ed out before passing it to the Consul
binary.
Neither the `os.Setenv` nor `t.Setenv` helper are safe to use in parallel tests
because environment variables are process-global. The stdlib panics if you try
to do this. Remove the `ci.Parallel()` call from all tests where we're setting
environment variables.
Nomad reconciles services it expects to be registered in Consul with
what is actually registered in the local Consul agent. This is necessary
to prevent leaking service registrations if Nomad crashes at certain
points (or if there are bugs).
When Consul has namespaces enabled, we must iterate over each available
namespace to be sure no services were leaked into non-default
namespaces.
Since this reconciliation happens often, there's no need to require
results from the Consul leader server. In large clusters this creates
far more load than the "freshness" of the response is worth.
Therefore this patch switches the request to AllowStale=true