Protobuf Refactoring for Multi-Module Cleanliness
This commit includes the following:
Moves all packages that were within proto/ to proto/private
Rewrites imports to account for the packages being moved
Adds in buf.work.yaml to enable buf workspaces
Names the proto-public buf module so that we can override the Go package imports within proto/buf.yaml
Bumps the buf version dependency to 1.14.0 (I was trying out the version to see if it would get around an issue - it didn't but it also doesn't break things and it seemed best to keep up with the toolchain changes)
Why:
In the future we will need to consume other protobuf dependencies such as the Google HTTP annotations for openapi generation or grpc-gateway usage.
There were some recent changes to have our own ratelimiting annotations.
The two combined were not working when I was trying to use them together (attempting to rebase another branch)
Buf workspaces should be the solution to the problem
Buf workspaces means that each module will have generated Go code that embeds proto file names relative to the proto dir and not the top level repo root.
This resulted in proto file name conflicts in the Go global protobuf type registry.
The solution to that was to add in a private/ directory into the path within the proto/ directory.
That then required rewriting all the imports.
Is this safe?
AFAICT yes
The gRPC wire protocol doesn't seem to care about the proto file names (although the Go grpc code does tack on the proto file name as Metadata in the ServiceDesc)
Other than imports, there were no changes to any generated code as a result of this.
The consul container tests orchestrate running containers from various
versions of consul to test things like upgrades. Having the test
framework itself depend on the consul codebase inherently links it to a
specific version of consul which may make some test approaches in the
future difficult.
This change prohibits any such relationship via a custom linting rule.
Unfortunately because the api, sdk, and
test/integration/consul-container packages are submodules of
github.com/hashicorp/consul the gomodguard linter is incapable of
handling those separately hence the need for some custom bash instead.
Previously, we'd begin a session with the xDS concurrency limiter
regardless of whether the proxy was registered in the catalog or in
the server's local agent state.
This caused problems for users who run `consul connect envoy` directly
against a server rather than a client agent, as the server's locally
registered proxies wouldn't be included in the limiter's capacity.
Now, the `ConfigSource` is responsible for beginning the session and we
only do so for services in the catalog.
Fixes: https://github.com/hashicorp/consul/issues/15753
- remove dep on consul main module
- use 'consul tls' subcommands instead of tlsutil
- use direct json config construction instead of agent/config structs
- merge libcluster and libagent packages together
- more widely use BuildContext
- get the OSS/ENT runner stuff working properly
- reduce some flakiness
- fix some correctness related to http/https API
* Protobuf Modernization
Remove direct usage of golang/protobuf in favor of google.golang.org/protobuf
Marshallers (protobuf and json) needed some changes to account for different APIs.
Moved to using the google.golang.org/protobuf/types/known/* for the well known types including replacing some custom Struct manipulation with whats available in the structpb well known type package.
This also updates our devtools script to install protoc-gen-go from the right location so that files it generates conform to the correct interfaces.
* Fix go-mod-tidy make target to work on all modules
Previously, I would sometimes get Killed errors when executing the
binary created by `make dev`. This is because we were using `cp`
and OSX has a bug where it caches the old signature and so doesn't
allow the new binary to run (see https://apple.stackexchange.com/a/428388)
Fixes a `go vet` warning caused by the pragma.DoNotCopy on the protobuf
message type.
Originally I'd hoped we wouldn't need any reflection in the proxycfg hot
path, but it seems proto.Clone is the only supported way to copy a message.
Replaces the reflection-based implementation of proxycfg's
ConfigSnapshot.Clone with code generated by deep-copy.
While load testing server-based xDS (for consul-dataplane) we discovered
this method is extremely expensive. The ConfigSnapshot struct, directly
or indirectly, contains a copy of many of the structs in the agent/structs
package, which creates a large graph for copystructure.Copy to traverse
at runtime, on every proxy reconfiguration.
Locally, always run integration tests using amd64, even if running
on an arm mac. This ensures the architecture locally always matches
the CI/CD environment.
In addition:
* Use consul:local for envoy integration and upgrade tests. Previously,
consul:local was used for upgrade tests and consul-dev for integration
tests. I didn't see a reason to use separate images as it's more
confusing.
* By default, disable the requirement that aws credentials are set.
These are only needed for the lambda tests and make it so you
can't run any tests locally, even if you're not running the
lambda tests. Now they'll only run if the LAMBDA_TESTS_ENABLED
env var is set.
* Split out the building of the Docker image for integration
tests into its own target from `dev-docker`. This allows us to always
use an amd64 image without messing up the `dev-docker` target.
* Add support for passing GO_TEST_FLAGs to `test-envoy-integ` target.
* Add a wait_for_leader function because tests were failing locally
without it.
We cannot do this for "subscribe" and "partition" this easily without
breakage so those are omitted.
Any protobuf message passed around via an Any construct will have the
fully qualified package name embedded in the protobuf as a string. Also
RPC method dispatch will include the package of the service during
serialization.
- We will be passing pbservice and pbpeering through an Any as part of
peer stream replication.
- We will be exposing two new gRPC services via pbpeering and
pbpeerstream.
Replace bindata packages with stdlib go:embed.
Modernize some uiserver code with newer interfaces introduced in go 1.16 (mainly working with fs.File instead of http.File.
Remove steps that are no longer used from our build files.
Add Github Action to detect differences in agent/uiserver/dist and verify that the files are correct (by compiling UI assets and comparing contents).
I noticed that the JSON api endpoints for peerings json encodes protobufs directly, rather than converting them into their `api` package equivalents before marshal/unmarshaling them.
I updated this and used `mog` to do the annoying part in the middle.
Other changes:
- the status enum was converted into the friendlier string form of the enum for readability with tools like `curl`
- some of the `api` library functions were slightly modified to match other similar endpoints in UX (cc: @ndhanushkodi )
- peeringRead returns `nil` if not found
- partitions are NOT inferred from the agent's partition (matching 1.11-style logic)
* Install `buf` instead of `protoc`
* Created `buf.yaml` and `buf.gen.yaml` files in the two proto directories to control how `buf` generates/lints proto code.
* Invoke `buf` instead of `protoc`
* Added a `proto-format` make target.
* Committed the reformatted proto files.
* Added a `proto-lint` make target.
* Integrated proto linting with CI
* Fixed tons of proto linter warnings.
* Got rid of deprecated builtin protoc-gen-go grpc plugin usage. Moved to direct usage of protoc-gen-go-grpc.
* Unified all proto directories / go packages around using pb prefixes but ensuring all proto packages do not have the prefix.
* add a sample
* Consul cluster test
* add build dockerfile
* add tests to cover mixed versions tests
* use flag to pass docker image name
* remove default config and rely on flags to inject the right image to test
* add cluster abstraction
* fix imports and remove old files
* fix imports and remove old files
* fix dockerIgnore
* make a `Node interface` and encapsulate ConsulContainer
* fix a test bug where we only check the leader against a single node.
* add upgrade tests to CI
* fix yaml alignment
* fix alignment take 2
* fix flag naming
* fix image to build
* fix test run and go mod tidy
* add a debug command
* run without RYUK
* fix parallel run
* add skip reaper code
* make tempdir in local dir
* chmod the temp dir to 0777
* chmod the right dir name
* change executor to use machine instead of docker
* add docker layer caching
* remove setup docker
* add gotestsum
* install go version
* use variable for GO installed version
* add environment
* add environment in the right place
* do not disable RYUK in CI
* add service check to tests
* assertions outside routines
* add queryBackend to the api query meta.
* check if we are using the right backend for those tests (streaming)
* change the tested endpoint to use one that have streaming.
* refactor to test multiple scenarios for streaming
* Fix dockerfile
Co-authored-by: FFMMM <FFMMM@users.noreply.github.com>
* rename Clients to clients
Co-authored-by: FFMMM <FFMMM@users.noreply.github.com>
* check if cluster have 0 node
* tidy code and add some doc strings
* use uuid instead of random string
* add doc strings to tests
* add queryBackend to the api query meta.
* add a changelog
* fix for api backend query
* add missing require
* fix q.QueryBackend
* Revert "fix q.QueryBackend"
This reverts commit cd0e5f7b1a1730e191673d624f8e89b591871c05.
* fix circle ci config
* tidy go mod after merging main
* rename package and fix test scenario
* update go download url
* address review comments
* rename flag in CI
* add readme to the upgrade tests
* fix golang download url
* fix golang arch downloaded
* fix AddNodes to handle an empty cluster case
* use `parseBool`
* rename circle job and add comment
* update testcontainer to 0.13
* fix circle ci config
* remove build docker file and use `make dev-docker` instead
* Apply suggestions from code review
Co-authored-by: Dan Upton <daniel@floppy.co>
* fix a typo
Co-authored-by: FFMMM <FFMMM@users.noreply.github.com>
Co-authored-by: Dan Upton <daniel@floppy.co>
* mogify needed pbcommon structs
* mogify needed pbconnect structs
* fix compilation errors and make config_translate_test pass
* add missing file
* remove redundant oss func declaration
* fix EnterpriseMeta to copy the right data for enterprise
* rename pbcommon package to pbcommongogo
* regenerate proto and mog files
* add missing mog files
* add pbcommon package
* pbcommon no mog
* fix enterprise meta code generation
* fix enterprise meta code generation (pbcommongogo)
* fix mog generation for gogo
* use `protoc-go-inject-tag` to inject tags
* rename proto package
* pbcommon no mog
* use `protoc-go-inject-tag` to inject tags
* add non gogo proto to make file
* fix proto get
Previously we were using two different criteria to decide where to run a
test. The main `go-test` job would skip Vault tests based on the
presence of the `vault` binary, but the `test-connect-ca-providers` job
would run tests based on the name.
This led to a scenario where a test may never run in CI.
To fix this problem I added a name check to the function we use to skip
the test. This should ensure that any test that requires vault is named
correctly to be run as part of the `test-connect-ca-providers` job.
At the same time I relaxed the regex we use. I verified this runs the
same tests using `go test --list Vault`. I made this change because a
bunch of tests in `agent/connect/ca` used `Vault` in the name, without
the underscores. Instead of changing a bunch of test names, this seemed
easier.
With this approach, the worst case is that we run a few extra tests in
the `test-connect-ca-providers` job, which doesn't seem like a problem.
This restores the prior behavior of make dev and ensures that tests
using the sdk package (like the api package) will correctly locate the
consul binary under test.
Also ensure the constructed consul binary is present on the path for sdk-based tests.