The signature of the `raftApply` function requires that the caller unwrap the
first returned value (the response from `FSM.Apply`) to see if it's an
error. This puts the burden on the caller to remember to check two different
places for errors, and we've done so inconsistently.
Update `raftApply` to do the unwrapping for us and return any `FSM.Apply` error
as the error value. Similar work was done in Consul in
https://github.com/hashicorp/consul/pull/9991. This eliminates some boilerplate
and surfaces a few minor bugs in the process:
* job deregistrations of already-GC'd jobs were still emitting evals
* reconcile job summaries does not return scheduler errors
* node updates did not report errors associated with inconsistent service
discovery or CSI plugin states
Note that although _most_ of the `FSM.Apply` functions return only errors (which
makes it tempting to remove the first return value entirely), there are few that
return `bool` for some reason and Variables relies on the response value for
proper CAS checking.
This changeset configures the RPC rate metrics that were added in #15515 to all
the RPCs that support authenticated HTTP API requests. These endpoints already
configured with pre-forwarding authentication in #15870, and a handful of others
were done already as part of the proof-of-concept work. So this changeset is
entirely copy-and-pasting one method call into a whole mess of handlers.
Upcoming PRs will wire up pre-forwarding auth and rate metrics for the remaining
set of RPCs that have no API consumers or aren't authenticated, in smaller
chunks that can be more thoughtfully reviewed.
This changeset allows Workload Identities to authenticate to all the RPCs that
support HTTP API endpoints, for use with PR #15864.
* Extends the work done for pre-forwarding authentication to all RPCs that
support a HTTP API endpoint.
* Consolidates the auth helpers used by the CSI, Service Registration, and Node
endpoints that are currently used to support both tokens and client secrets.
Intentionally excluded from this changeset:
* The Variables endpoint still has custom handling because of the implicit
policies. Ideally we'll figure out an efficient way to resolve those into real
policies and then we can get rid of that custom handling.
* The RPCs that don't currently support auth tokens (i.e. those that don't
support HTTP endpoints) have not been updated with the new pre-forwarding auth
We'll be doing this under a separate PR to support RPC rate metrics.
In #15417 we added a new `Authenticate` method to the server that returns an
`AuthenticatedIdentity` struct. This changeset implements this method for a
small number of RPC endpoints that together represent all the various ways in
which RPCs are sent, so that we can validate that we're happy with this
approach.
This changeset covers a sidebar discussion that @schmichael and I had around the
design for pre-forwarding auth. This includes some changes extracted out of
#15513 to make it easier to review both and leave a clean history.
* Remove fast path for NodeID. Previously-connected clients will have a NodeID
set on the context, and because this is a large portion of the RPCs sent we
fast-pathed it at the top of the `Authenticate` method. But the context is
shared for all yamux streams over the same yamux session (and TCP
connection). This lets an authenticated HTTP request to a client use the
NodeID for authentication, which is a privilege escalation. Remove the fast
path and annotate it so that we don't break it again.
* Add context to decisions around AuthenticatedIdentity. The `Authenticate`
method taken on its own looks like it wants to return an `acl.ACL` that folds
over all the various identity types (creating an ephemeral ACL on the fly if
neccessary). But keeping these fields idependent allows RPC handlers to
differentiate between internal and external origins so we most likely want to
avoid this. Leave some docstrings as a warning as to why this is built the way
it is.
* Mutate the request rather than returning. When reviewing #15513 we decided
that forcing the request handler to call `SetIdentity` was repetitive and
error prone. Instead, the `Authenticate` method mutates the request by setting
its `AuthenticatedIdentity`.
Upcoming work to instrument the rate of RPC requests by consumer (and eventually
rate limit) require that we authenticate a RPC request before forwarding. Add a
new top-level `Authenticate` method to the server and have it return an
`AuthenticatedIdentity` struct. RPC handlers will use the relevant fields of
this identity for performing authorization.
This changeset includes:
* The main implementation of `Authenticate`
* Provide a new RPC `ACL.WhoAmI` for debugging authentication. This endpoint
returns the same `AuthenticatedIdentity` that will be used by RPC handlers. At
some point we might want to give this an equivalent HTTP endpoint but I didn't
want to add that to our public API until some of the other Workload Identity
work is solidified, especially if we don't need it yet.
* A full coverage test of the `Authenticate` method. This sets up two server
nodes with mTLS and ACLs, some tokens, and some allocations with workload
identities.
* Wire up an example of using `Authenticate` in the `Namespace.Upsert` RPC and
see how authorization happens after forwarding.
* A new semgrep rule for `Authenticate`, which we'll need to update once we're
ready to wire up more RPC endpoints with authorization steps.
Upcoming work to instrument the rate of RPC requests by consumer (and eventually
rate limit) requires that we thread the `RPCContext` through all RPC
handlers so that we can access the underlying connection. This changeset adds
the context to everywhere we intend to initially support it and intentionally
excludes streaming RPCs and client RPCs.
To improve the ergonomics of adding the context everywhere its needed and to
clarify the requirements of dynamic vs static handlers, I've also done a good
bit of refactoring here:
* canonicalized the RPC handler fields so they're as close to identical as
possible without introducing unused fields (i.e. I didn't add loggers if the
handler doesn't use them already).
* canonicalized the imports in the handler files.
* added a `NewExampleEndpoint` function for each handler that ensures we're
constructing the handlers with the required arguments.
* reordered the registration in server.go to match the order of the files (to
make it easier to see if we've missed one), and added a bunch of commentary
there as to what the difference between static and dynamic handlers is.
* Throw away result of multierror.Append
When given a *multierror.Error, it is mutated, therefore the return
value is not needed.
* Simplify MergeMultierrorWarnings, use StringBuilder
* Hash.Write() never returns an error
* Remove error that was always nil
* Remove error from Resources.Add signature
When this was originally written it could return an error, but that was
refactored away, and callers of it as of today never handle the error.
* Throw away results of io.Copy during Bridge
* Handle errors when computing node class in test