This is the OSS portion of enterprise PR 2056.
This commit provides server-local implementations of the proxycfg.ConfigEntry
and proxycfg.ConfigEntryList interfaces, that source data from streaming events.
It makes use of the LocalMaterializer type introduced for peering replication,
adding the necessary support for authorization.
It also adds support for "wildcard" subscriptions (within a topic) to the event
publisher, as this is needed to fetch service-resolvers for all services when
configuring mesh gateways.
Currently, events will be emitted for just the ingress-gateway, service-resolver,
and mesh config entry types, as these are the only entries required by proxycfg
— the events will be emitted on topics named IngressGateway, ServiceResolver,
and MeshConfig topics respectively.
Though these events will only be consumed "locally" for now, they can also be
consumed via the gRPC endpoint (confirmed using grpcurl) so using them from
client agents should be a case of swapping the LocalMaterializer for an
RPCMaterializer.
Having this type live in the agent/consul package makes it difficult to
put anything that relies on token resolution (e.g. the new gRPC services)
in separate packages without introducing import cycles.
For example, if package foo imports agent/consul for the ACLResolveResult
type it means that agent/consul cannot import foo to register its service.
We've previously worked around this by wrapping the ACLResolver to
"downgrade" its return type to an acl.Authorizer - aside from the
added complexity, this also loses the resolved identity information.
In the future, we may want to move the whole ACLResolver into the
acl/resolver package. For now, putting the result type there at least,
fixes the immediate import cycle issues.
Require use of mesh gateways in order for service mesh data plane
traffic to flow between peers.
This also adds plumbing for envoy integration tests involving peers, and
one starter peering test.
* Install `buf` instead of `protoc`
* Created `buf.yaml` and `buf.gen.yaml` files in the two proto directories to control how `buf` generates/lints proto code.
* Invoke `buf` instead of `protoc`
* Added a `proto-format` make target.
* Committed the reformatted proto files.
* Added a `proto-lint` make target.
* Integrated proto linting with CI
* Fixed tons of proto linter warnings.
* Got rid of deprecated builtin protoc-gen-go grpc plugin usage. Moved to direct usage of protoc-gen-go-grpc.
* Unified all proto directories / go packages around using pb prefixes but ensuring all proto packages do not have the prefix.
Introduces two new public gRPC endpoints (`Login` and `Logout`) and
includes refactoring of the equivalent net/rpc endpoints to enable the
majority of logic to be reused (i.e. by extracting the `Binder` and
`TokenWriter` types).
This contains the OSS portions of the following enterprise commits:
- 75fcdbfcfa6af21d7128cb2544829ead0b1df603
- bce14b714151af74a7f0110843d640204082630a
- cc508b70fbf58eda144d9af3d71bd0f483985893
* Implement the ServerDiscovery.WatchServers gRPC endpoint
* Fix the ConnectCA.Sign gRPC endpoints metadata forwarding.
* Unify public gRPC endpoints around the public.TraceID function for request_id logging
Adds a new gRPC endpoint to get envoy bootstrap params. The new consul-dataplane service will use this
endpoint to generate an envoy bootstrap configuration.
Introduces a gRPC endpoint for signing Connect leaf certificates. It's also
the first of the public gRPC endpoints to perform leader-forwarding, so
establishes the pattern of forwarding over the multiplexed internal RPC port.
Previously we had 1 EventPublisher per state.Store. When a state store was closed/abandoned such as during a consul snapshot restore, this had the behavior of force closing subscriptions for that topic and evicting event snapshots from the cache.
The intention of this commit is to keep all that behavior. To that end, the shared EventPublisher now supports the ability to refresh a topic. That will perform the force close + eviction. The FSM upon abandoning the previous state.Store will call RefreshTopic for all the topics with events generated by the state.Store.
Adds a new gRPC service and endpoint to return the list of supported
consul dataplane features. The Consul Dataplane will use this API to
customize its interaction with that particular server.
Adds a new gRPC streaming endpoint (WatchRoots) that dataplane clients will
use to fetch the current list of active Connect CA roots and receive new
lists whenever the roots are rotated.
* mogify needed pbcommon structs
* mogify needed pbconnect structs
* fix compilation errors and make config_translate_test pass
* add missing file
* remove redundant oss func declaration
* fix EnterpriseMeta to copy the right data for enterprise
* rename pbcommon package to pbcommongogo
* regenerate proto and mog files
* add missing mog files
* add pbcommon package
* pbcommon no mog
* fix enterprise meta code generation
* fix enterprise meta code generation (pbcommongogo)
* fix mog generation for gogo
* use `protoc-go-inject-tag` to inject tags
* rename proto package
* pbcommon no mog
* use `protoc-go-inject-tag` to inject tags
* add non gogo proto to make file
* fix proto get
Introduces the capability to configure TLS differently for Consul's
listeners/ports (i.e. HTTPS, gRPC, and the internal multiplexed RPC
port) which is useful in scenarios where you may want the HTTPS or
gRPC interfaces to present a certificate signed by a well-known/public
CA, rather than the certificate used for internal communication which
must have a SAN in the form `server.<dc>.consul`.
Fixes a bug whereby servers present in multiple network areas would be
properly segmented in the Router, but not in the gRPC mirror. This would
lead servers in the current datacenter leaving from a network area
(possibly during the network area's removal) from deleting their own
records that still exist in the standard WAN area.
The gRPC client stack uses the gRPC server tracker to execute all RPCs,
even those targeting members of the current datacenter (which is unlike
the net/rpc stack which has a bypass mechanism).
This would manifest as a gRPC method call never opening a socket because
it would block forever waiting for the current datacenter's pool of
servers to be non-empty.
Previously we believe it was necessary for all code that required ports
to use freeport to prevent conflicts.
https://github.com/dnephin/freeport-test shows that it is actually save
to use port 0 (`127.0.0.1:0`) as long as it is passed directly to
`net.Listen`, and the listener holds the port for as long as it is
needed.
This works because freeport explicitly avoids the ephemeral port range,
and port 0 always uses that range. As you can see from the test output
of https://github.com/dnephin/freeport-test, the two systems never use
overlapping ports.
This commit converts all uses of freeport that were being passed
directly to a net.Listen to use port 0 instead. This allows us to remove
a bit of wrapping we had around httptest, in a couple places.
1) xds and grpc servers:
1.1) to use recovery middleware with callback that prints stack trace to log
1.2) callback turn the panic into a core.Internal error
2) added unit test for grpc server
* deps: upgrade gogo-protobuf to v1.3.2
* go mod tidy using go 1.16
* proto: regen protobufs after upgrading gogo/protobuf
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
sync/atomic must be used with 64-bit aligned fields, and that alignment is difficult to
ensure unless the field is the first one in the struct.
https://golang.org/pkg/sync/atomic/#pkg-note-BUG.
We have seen test flakes caused by 'concurrent map read and map write', and the race detector
reports the problem as well (prevent us from running some tests with -race).
The root of the problem is the grpc expects resolvers to be registered at init time
before any requests are made, but we were using a separate resolver for each test.
This commit introduces a resolver registry. The registry is registered as the single
resolver for the consul scheme. Each test uses the Authority section of the target
(instead of the scheme) to identify the resolver that should be used for the test.
The scheme is used for lookup, which is why it can no longer be used as the unique
key.
This allows us to use a lock around the map of resolvers, preventing the data race.
defaultMetrics was being set at package import time, which meant that it received an instance of
the original default. But lib/telemetry.InitTelemetry sets a new global when it is called.
This resulted in the metrics being sent nowhere.
This commit changes defaultMetrics to be a function, so it will return the global instance when
called. Since it is called after InitTelemetry it will return the correct metrics instance.