Protobuf Refactoring for Multi-Module Cleanliness
This commit includes the following:
Moves all packages that were within proto/ to proto/private
Rewrites imports to account for the packages being moved
Adds in buf.work.yaml to enable buf workspaces
Names the proto-public buf module so that we can override the Go package imports within proto/buf.yaml
Bumps the buf version dependency to 1.14.0 (I was trying out the version to see if it would get around an issue - it didn't but it also doesn't break things and it seemed best to keep up with the toolchain changes)
Why:
In the future we will need to consume other protobuf dependencies such as the Google HTTP annotations for openapi generation or grpc-gateway usage.
There were some recent changes to have our own ratelimiting annotations.
The two combined were not working when I was trying to use them together (attempting to rebase another branch)
Buf workspaces should be the solution to the problem
Buf workspaces means that each module will have generated Go code that embeds proto file names relative to the proto dir and not the top level repo root.
This resulted in proto file name conflicts in the Go global protobuf type registry.
The solution to that was to add in a private/ directory into the path within the proto/ directory.
That then required rewriting all the imports.
Is this safe?
AFAICT yes
The gRPC wire protocol doesn't seem to care about the proto file names (although the Go grpc code does tack on the proto file name as Metadata in the ServiceDesc)
Other than imports, there were no changes to any generated code as a result of this.
* remove legacy tokens
* Update test comment
Co-authored-by: Paul Glass <pglass@hashicorp.com>
* fix imports
* update docs for additional CLI changes
* add test case for anonymous token
* set deprecated api fields to json ignore and fix patch errors
* update changelog to breaking-change
* fix import
* update api docs to remove legacy reference
* fix docs nav data
---------
Co-authored-by: Paul Glass <pglass@hashicorp.com>
We have many indexer functions in Consul which take interface{} and type assert before building the index. We can use generics to get rid of the initial plumbing and pass around functions with better defined signatures. This has two benefits: 1) Less verbosity; 2) Developers can parse the argument types to memdb schemas without having to introspect the function for the type assertion.
* Port consul-enterprise #1123 to OSS
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* Fixup missing query field
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* change to re-trigger ci system
Signed-off-by: Mark Anderson <manderson@hashicorp.com>
* move intFromBool to be available for oss
* add expiry indexes
* remove dead code: `TokenExpirationIndex`
* fix remove indexer `TokenExpirationIndex`
* fix rebase issue
Previously the tokens would fail to insert into the secondary's state
store because the AuthMethod field of the ACLToken did not point to a
known auth method from the primary.
Making these functions allows them to be used without introducing
an artificial dependency on the struct. Many of these will be called
from streaming Event processors, which do not have a store.
This change is being made ahead of the streaming work to get to reduce
the size of the streaming diff.
* Fixes#5606: Tokens converted from legacy ACLs get their Hash computed
This allows new style token replication to work for legacy tokens as well when they change.
* tests: fix timestamp comparison
Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>
Main Changes:
• method signature updates everywhere to account for passing around enterprise meta.
• populate the EnterpriseAuthorizerContext for all ACL related authorizations.
• ACL resource listings now operate like the catalog or kv listings in that the returned entries are filtered down to what the token is allowed to see. With Namespaces its no longer all or nothing.
• Modified the acl.Policy parsing to abstract away basic decoding so that enterprise can do it slightly differently. Also updated method signatures so that when parsing a policy it can take extra ent metadata to use during rules validation and policy creation.
Secondary Changes:
• Moved protobuf encoding functions out of the agentpb package to eliminate circular dependencies.
• Added custom JSON unmarshalers for a few ACL resource types (to support snake case and to get rid of mapstructure)
• AuthMethod validator cache is now an interface as these will be cached per-namespace for Consul Enterprise.
• Added checks for policy/role link existence at the RPC API so we don’t push the request through raft to have it fail internally.
• Forward ACL token delete request to the primary datacenter when the secondary DC doesn’t have the token.
• Added a bunch of ACL test helpers for inserting ACL resource test data.
* Add build system support for protobuf generation
This is done generically so that we don’t have to keep updating the makefile to add another proto generation.
Note: anything not in the vendor directory and with a .proto extension will be run through protoc if the corresponding namespace.pb.go file is not up to date.
If you want to rebuild just a single proto file you can do so with: make proto-rebuild PROTOFILES=<list of proto files to rebuild>
Providing the PROTOFILES var will override the default behavior of finding all the .proto files.
* Start adding types to the agent/proto package
These will be needed for some other work and are by no means comprehensive.
* Add ability to resolve/fixup the agentpb.ACLLinks structure in the state store.
* Use protobuf marshalling of raft requests instead of msgpack for protoc generated types.
This does not change any encoding of existing types.
* Removed structs package automatically encoding with protobuf marshalling
Instead the caller of raftApply that wants to opt-in to protobuf encoding will have to call `raftApplyProtobuf`
* Run update-vendor to fixup modules.txt
Nothing changed as far as dependencies go but the ordering of modules in that file depends on the time they are first seen and its not alphabetical.
* Rename some things and implement the structs.RPCInfo interface bits
agentpb.QueryOptions and agentpb.WriteRequest implement 3 of the 4 RPCInfo funcs and the new TargetDatacenter message type implements the fourth.
* Use the right encoding function.
* Renamed agent/proto package to agent/agentpb to prevent package name conflicts
* Update modules.txt to fix ordering
* Change blockingQuery to take in interfaces for the query options and meta
* Add %T to error output.
* Add/Update some comments
All these changes should have no side-effects or change behavior:
- Use bytes.Buffer's String() instead of a conversion
- Use time.Since and time.Until where fitting
- Drop unnecessary returns and assignment
Roles are named and can express the same bundle of permissions that can
currently be assigned to a Token (lists of Policies and Service
Identities). The difference with a Role is that it not itself a bearer
token, but just another entity that can be tied to a Token.
This lets an operator potentially curate a set of smaller reusable
Policies and compose them together into reusable Roles, rather than
always exploding that same list of Policies on any Token that needs
similar permissions.
This also refactors the acl replication code to be semi-generic to avoid
3x copypasta.
This is mainly to avoid having the API return "0001-01-01T00:00:00Z" as
a value for the ExpirationTime field when it is not set. Unfortunately
time.Time doesn't respect the json marshalling "omitempty" directive.
These act like a special cased version of a Policy Template for granting
a token the privileges necessary to register a service and its connect
proxy, and read upstreams from the catalog.
Previously we were fixing up the token links directly on the *ACLToken returned by memdb. This invalidated some assumptions that a snapshot is immutable as well as potentially being able to cause a crash.
The fix here is to give the policy link fixing function copy on write semantics. When no fixes are necessary we can return the memdb object directly, otherwise we copy it and create a new list of links.
Eventually we might find a better way to keep those policy links in sync but for now this fixes the issue.
* Add leader token upgrade test and fix various ACL enablement bugs
* Update the leader ACL initialization tests.
* Add a StateStore ACL tests for ACLTokenSet and ACLTokenGetBy* functions
* Advertise the agents acl support status with the agent/self endpoint.
* Make batch token upsert CAS’able to prevent consistency issues with token auto-upgrade
* Finish up the ACL state store token tests
* Finish the ACL state store unit tests
Also rename some things to make them more consistent.
* Do as much ACL replication testing as I can.
This PR is almost a complete rewrite of the ACL system within Consul. It brings the features more in line with other HashiCorp products. Obviously there is quite a bit left to do here but most of it is related docs, testing and finishing the last few commands in the CLI. I will update the PR description and check off the todos as I finish them over the next few days/week.
Description
At a high level this PR is mainly to split ACL tokens from Policies and to split the concepts of Authorization from Identities. A lot of this PR is mostly just to support CRUD operations on ACLTokens and ACLPolicies. These in and of themselves are not particularly interesting. The bigger conceptual changes are in how tokens get resolved, how backwards compatibility is handled and the separation of policy from identity which could lead the way to allowing for alternative identity providers.
On the surface and with a new cluster the ACL system will look very similar to that of Nomads. Both have tokens and policies. Both have local tokens. The ACL management APIs for both are very similar. I even ripped off Nomad's ACL bootstrap resetting procedure. There are a few key differences though.
Nomad requires token and policy replication where Consul only requires policy replication with token replication being opt-in. In Consul local tokens only work with token replication being enabled though.
All policies in Nomad are globally applicable. In Consul all policies are stored and replicated globally but can be scoped to a subset of the datacenters. This allows for more granular access management.
Unlike Nomad, Consul has legacy baggage in the form of the original ACL system. The ramifications of this are:
A server running the new system must still support other clients using the legacy system.
A client running the new system must be able to use the legacy RPCs when the servers in its datacenter are running the legacy system.
The primary ACL DC's servers running in legacy mode needs to be a gate that keeps everything else in the entire multi-DC cluster running in legacy mode.
So not only does this PR implement the new ACL system but has a legacy mode built in for when the cluster isn't ready for new ACLs. Also detecting that new ACLs can be used is automatic and requires no configuration on the part of administrators. This process is detailed more in the "Transitioning from Legacy to New ACL Mode" section below.