open-nomad

Commit Graph

Author	SHA1	Message	Date
Luiz Aoqui	1d3a38aae0	Revert "deps: update go-metrics to v0.5.3 in 1.6.x" (#19375 ) * Revert "deps: update go-metrics to v0.5.3 (#19190) (#19208)" This reverts commit 1112a282d76e67e26b3973a1e4cfc85b22678072. * changelog add entry for #19375	2023-12-08 08:47:02 -05:00
Luiz Aoqui	e552e1726f	deps: update go-metrics to v0.5.3 (#19190 ) (#19208 ) Update `go-metrics` to v0.5.3 to pick https://github.com/hashicorp/go-metrics/pull/146.	2023-11-28 13:52:25 -05:00
hashicorp-copywrite[bot]	005636afa0	[COMPLIANCE] Add Copyright and License Headers	2023-04-10 15:36:59 +00:00
Tim Gross	8747059b86	service: fix regression in task access to list/read endpoint (#16316 ) When native service discovery was added, we used the node secret as the auth token. Once Workload Identity was added in Nomad 1.4.x we needed to use the claim token for `template` blocks, and so we allowed valid claims to bypass the ACL policy check to preserve the existing behavior. (Invalid claims are still rejected, so this didn't widen any security boundary.) In reworking authentication for 1.5.0, we unintentionally removed this bypass. For WIs without a policy attached to their job, everything works as expected because the resulting `acl.ACL` is nil. But once a policy is attached to the job the `acl.ACL` is no longer nil and this causes permissions errors. Fix the regression by adding back the bypass for valid claims. In future work, we should strongly consider getting turning the implicit policies into real `ACLPolicy` objects (even if not stored in state) so that we don't have these kind of brittle exceptions to the auth code.	2023-03-03 11:41:19 -05:00
Tim Gross	0e1b554299	handle `FSM.Apply` errors in `raftApply` (#16287 ) The signature of the `raftApply` function requires that the caller unwrap the first returned value (the response from `FSM.Apply`) to see if it's an error. This puts the burden on the caller to remember to check two different places for errors, and we've done so inconsistently. Update `raftApply` to do the unwrapping for us and return any `FSM.Apply` error as the error value. Similar work was done in Consul in https://github.com/hashicorp/consul/pull/9991. This eliminates some boilerplate and surfaces a few minor bugs in the process: * job deregistrations of already-GC'd jobs were still emitting evals * reconcile job summaries does not return scheduler errors * node updates did not report errors associated with inconsistent service discovery or CSI plugin states Note that although _most_ of the `FSM.Apply` functions return only errors (which makes it tempting to remove the first return value entirely), there are few that return `bool` for some reason and Variables relies on the response value for proper CAS checking.	2023-03-02 13:51:09 -05:00
Tim Gross	5e75ea9fb3	metrics: Add RPC rate metrics to endpoints that validate TLS names (#15900 )	2023-01-26 15:04:25 -05:00
Tim Gross	6677a103c2	metrics: measure rate of RPC requests that serve API (#15876 ) This changeset configures the RPC rate metrics that were added in #15515 to all the RPCs that support authenticated HTTP API requests. These endpoints already configured with pre-forwarding authentication in #15870, and a handful of others were done already as part of the proof-of-concept work. So this changeset is entirely copy-and-pasting one method call into a whole mess of handlers. Upcoming PRs will wire up pre-forwarding auth and rate metrics for the remaining set of RPCs that have no API consumers or aren't authenticated, in smaller chunks that can be more thoughtfully reviewed.	2023-01-25 16:37:24 -05:00
Tim Gross	f3f64af821	WI: allow workloads to use RPCs associated with HTTP API (#15870 ) This changeset allows Workload Identities to authenticate to all the RPCs that support HTTP API endpoints, for use with PR #15864. * Extends the work done for pre-forwarding authentication to all RPCs that support a HTTP API endpoint. * Consolidates the auth helpers used by the CSI, Service Registration, and Node endpoints that are currently used to support both tokens and client secrets. Intentionally excluded from this changeset: * The Variables endpoint still has custom handling because of the implicit policies. Ideally we'll figure out an efficient way to resolve those into real policies and then we can get rid of that custom handling. * The RPCs that don't currently support auth tokens (i.e. those that don't support HTTP endpoints) have not been updated with the new pre-forwarding auth We'll be doing this under a separate PR to support RPC rate metrics.	2023-01-25 14:33:06 -05:00
Tim Gross	f61f801e77	provide `RPCContext` to all RPC handlers (#15430 ) Upcoming work to instrument the rate of RPC requests by consumer (and eventually rate limit) requires that we thread the `RPCContext` through all RPC handlers so that we can access the underlying connection. This changeset adds the context to everywhere we intend to initially support it and intentionally excludes streaming RPCs and client RPCs. To improve the ergonomics of adding the context everywhere its needed and to clarify the requirements of dynamic vs static handlers, I've also done a good bit of refactoring here: * canonicalized the RPC handler fields so they're as close to identical as possible without introducing unused fields (i.e. I didn't add loggers if the handler doesn't use them already). * canonicalized the imports in the handler files. * added a `NewExampleEndpoint` function for each handler that ensures we're constructing the handlers with the required arguments. * reordered the registration in server.go to match the order of the files (to make it easier to see if we've missed one), and added a bunch of commentary there as to what the difference between static and dynamic handlers is.	2022-12-01 10:05:15 -05:00
James Rasell	9923f9e6f3	nnsd: gate registration write & delete RPC use on v1.3.0 or greater. (#14924 )	2022-10-18 15:30:28 +02:00
Seth Hoenig	bd2935ee54	cleanup: tweaks from cr feedback	2022-07-20 10:42:35 -05:00
Seth Hoenig	93cfeb177b	cleanup: example refactoring out map[string]struct{} using set.Set This PR is a little demo of using github.com/hashicorp/go-set to replace the use of map[T]struct{} as a make-shift set.	2022-07-19 22:50:49 -05:00
Tim Gross	83dc3ec758	secure variables ACL policies (#13294 ) Adds a new policy block inside namespaces to control access to secure variables on the basis of path, with support for globbing. Splits out VerifyClaim from ResolveClaim. The ServiceRegistration RPC only needs to be able to verify that a claim is valid for some allocation in the store; it doesn't care about implicit policies or capabilities. Split this out to its own method on the server so that the SecureVariables RPC can reuse it as a separate step from resolving policies (see next commit). Support implicit policies based on workload identity	2022-07-11 13:34:05 -04:00
Tim Gross	bfcbc00f4e	workload identity (#13223 ) In order to support implicit ACL policies for tasks to get their own secrets, each task would need to have its own ACL token. This would add extra raft overhead as well as new garbage collection jobs for cleaning up task-specific ACL tokens. Instead, Nomad will create a workload Identity Claim for each task. An Identity Claim is a JSON Web Token (JWT) signed by the server’s private key and attached to an Allocation at the time a plan is applied. The encoded JWT can be submitted as the X-Nomad-Token header to replace ACL token secret IDs for the RPCs that support identity claims. Whenever a key is is added to a server’s keyring, it will use the key as the seed for a Ed25519 public-private private keypair. That keypair will be used for signing the JWT and for verifying the JWT. This implementation is a ruthlessly minimal approach to support the secure variables feature. When a JWT is verified, the allocation ID will be checked against the Nomad state store, and non-existent or terminal allocation IDs will cause the validation to be rejected. This is sufficient to support the secure variables feature at launch without requiring implementation of a background process to renew soon-to-expire tokens.	2022-07-11 13:34:05 -04:00
Seth Hoenig	9467bc9eb3	api: enable selecting subset of services using rendezvous hashing This PR adds the 'choose' query parameter to the '/v1/service/<service>' endpoint. The value of 'choose' is in the form '<number>\|<key>', number is the number of desired services and key is a value unique but consistent to the requester (e.g. allocID). Folks aren't really expected to use this API directly, but rather through consul-template which will soon be getting a new helper function making use of this query parameter. Example, curl 'localhost:4646/v1/service/redis?choose=2\|abc123' Note: consul-templte v0.29.1 includes the necessary nomadServices functionality.	2022-06-25 10:37:37 -05:00
James Rasell	4cdc46ae75	service discovery: add pagination and filtering support to info requests (#12552 ) * services: add pagination and filter support to info RPC. * cli: add filter flag to service info command. * docs: add pagination and filter details to services info API. * paginator: minor updates to comment and func signature.	2022-04-13 07:41:44 +02:00
James Rasell	ea4d7366fc	service-disco: add mixed auth to list and read RPC endpoints. In the same manner as the delete RPC, the list and read service registration endpoints can be called either by external operators or Nomad nodes. The latter occurs when a template is being rendered which includes Nomad API template funcs. In this case, the auth token is looked up as the node secret ID for auth.	2022-04-04 13:45:43 +01:00
James Rasell	1ad8ea558a	rpc: add service registration RPC endpoints.	2022-03-03 11:25:29 +01:00

18 Commits