* backport of commit afa1f42cc719b13074f2f286202d8f21b8000753
* backport of commit e0970025d4c2e2702af30e642b37dd5e32561756
* backport of commit 2f2aad545b1ebcae22bb481b57115a679eb539e5
* backport of commit 4a5c9c181f50343911cd30fbb0f0475e473a2c7b
---------
Co-authored-by: Ranjandas <thejranjan@gmail.com>
Co-authored-by: Chris S. Kim <kisunji92@gmail.com>
* backport of commit d77048f1ea7136af9f627182bc79126f9472a060
* backport of commit 301de5980e7fcc8d68d16e6cdcbd5b2afea73711
* backport of commit 436bcda5ec58abbc5a65f6d244a1e1ec26beb46c
---------
Co-authored-by: Ronald Ekambi <ronekambi@gmail.com>
* backport of commit ad6652a47b5c5f9e1b8aed673922b918f56dbfcc
* backport of commit 7a1e99b6b774257ef6e293c2d47375e3b2332b9a
---------
Co-authored-by: Matt Keeler <mjkeeler7@gmail.com>
* backport of commit fb2f3b61004d9ef2296b51306ddbf5b6d72679ed
* backport of commit 178abb8495ba4bb35c29a835965e3f244a385865
* backport of commit 77b399877413c6e65669659deb8962c2cc5b52f6
* backport of commit a245b326ac030f7ef3292e7bbdb58e0b2850a12f
---------
Co-authored-by: Andrew Stucki <andrew.stucki@hashicorp.com>
TLDR with many modules the versions included in each diverged quite a bit. Attempting to use Go Workspaces produces a bunch of errors.
This commit:
1. Fixes envoy-library-references.sh to work again
2. Ensures we are pulling in go-control-plane@v0.11.0 everywhere (previously it was at that version in some modules and others were much older)
3. Remove one usage of golang/protobuf that caused us to have a direct dependency on it.
4. Remove deprecated usage of the Endpoint field in the grpc resolver.Target struct. The current version of grpc (v1.55.0) has removed that field and recommended replacement with URL.Opaque and calls to the Endpoint() func when needing to consume the previous field.
4. `go work init <all the paths to go.mod files>` && `go work sync`. This syncrhonized versions of dependencies from the main workspace/root module to all submodules
5. Updated .gitignore to ignore the go.work and go.work.sum files. This seems to be standard practice at the moment.
6. Update doc comments in protoc-gen-consul-rate-limit to be go fmt compatible
7. Upgraded makefile infra to perform linting, testing and go mod tidy on all modules in a flexible manner.
8. Updated linter rules to prevent usage of golang/protobuf
9. Updated a leader peering test to account for an extra colon in a grpc error message.
Sameness groups with default-for-failover enabled did not function properly with
tproxy whenever all instances of the service disappeared from the local cluster.
This occured, because there were no corresponding resolvers (due to the implicit
failover policy) which caused VIPs to be deallocated.
This ticket expands upon the VIP allocations so that both service-defaults and
service-intentions (without destination wildcards) will ensure that the virtual
IP exists.
* Move hcp client to subpackage hcpclient (#16800)
* [HCP Observability] New MetricsClient (#17100)
* Client configured with TLS using HCP config and retry/throttle
* Add tests and godoc for metrics client
* close body after request
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* remove clone
* Extract CloudConfig and mock for future PR
* Switch to hclog.FromContext
* [HCP Observability] OTELExporter (#17128)
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Create new OTELExporter which uses the MetricsClient
Add transform because the conversion is in an /internal package
* Fix lint error
* early return when there are no metrics
* Add NewOTELExporter() function
* Downgrade to metrics SDK version: v1.15.0-rc.1
* Fix imports
* fix small nits with comments and url.URL
* Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile
* Cleanup error handling and clarify empty metrics case
* Fix input/expected naming in otel_transform_test.go
* add comment for metric tracking
* Add a general isEmpty method
* Add clear error types
* update to latest version 1.15.0 of OTEL
* [HCP Observability] OTELSink (#17159)
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Create new OTELExporter which uses the MetricsClient
Add transform because the conversion is in an /internal package
* Fix lint error
* early return when there are no metrics
* Add NewOTELExporter() function
* Downgrade to metrics SDK version: v1.15.0-rc.1
* Fix imports
* fix small nits with comments and url.URL
* Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile
* Cleanup error handling and clarify empty metrics case
* Fix input/expected naming in otel_transform_test.go
* add comment for metric tracking
* Add a general isEmpty method
* Add clear error types
* update to latest version 1.15.0 of OTEL
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* Initialize OTELSink with sync.Map for all the instrument stores.
* Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests.
* Switch to mutex instead of sync.Map to avoid type assertion
* Add gauge store
* Clarify comments
* return concrete sink type
* Fix lint errors
* Move gauge store to be within sink
* Use context.TODO,rebase and clenaup opts handling
* Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1
* Fix imports
* Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx
* Add lots of documentation to the OTELSink
* Fix gauge store comment and check ok
* Add select and ctx.Done() check to gauge callback
* use require.Equal for attributes
* Fixed import naming
* Remove float64 calls and add a NewGaugeStore method
* Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store
* Generate 100 gauge operations
* Seperate the labels into goroutines in sink test
* Generate kv store for the test case keys to avoid using uuid
* Added a race test with 300 samples for OTELSink
* Do not pass in waitgroup and use error channel instead.
* Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel
* Fix nits
* [HCP Observability] Init OTELSink in Telemetry (#17162)
* Move hcp client to subpackage hcpclient (#16800)
* [HCP Observability] New MetricsClient (#17100)
* Client configured with TLS using HCP config and retry/throttle
* Add tests and godoc for metrics client
* close body after request
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* remove clone
* Extract CloudConfig and mock for future PR
* Switch to hclog.FromContext
* [HCP Observability] New MetricsClient (#17100)
* Client configured with TLS using HCP config and retry/throttle
* Add tests and godoc for metrics client
* close body after request
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* remove clone
* Extract CloudConfig and mock for future PR
* Switch to hclog.FromContext
* [HCP Observability] New MetricsClient (#17100)
* Client configured with TLS using HCP config and retry/throttle
* Add tests and godoc for metrics client
* close body after request
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* remove clone
* Extract CloudConfig and mock for future PR
* Switch to hclog.FromContext
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Create new OTELExporter which uses the MetricsClient
Add transform because the conversion is in an /internal package
* Fix lint error
* early return when there are no metrics
* Add NewOTELExporter() function
* Downgrade to metrics SDK version: v1.15.0-rc.1
* Fix imports
* fix small nits with comments and url.URL
* Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile
* Cleanup error handling and clarify empty metrics case
* Fix input/expected naming in otel_transform_test.go
* add comment for metric tracking
* Add a general isEmpty method
* Add clear error types
* update to latest version 1.15.0 of OTEL
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* Initialize OTELSink with sync.Map for all the instrument stores.
* Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests.
* Switch to mutex instead of sync.Map to avoid type assertion
* Add gauge store
* Clarify comments
* return concrete sink type
* Fix lint errors
* Move gauge store to be within sink
* Use context.TODO,rebase and clenaup opts handling
* Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1
* Fix imports
* Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx
* Add lots of documentation to the OTELSink
* Fix gauge store comment and check ok
* Add select and ctx.Done() check to gauge callback
* use require.Equal for attributes
* Fixed import naming
* Remove float64 calls and add a NewGaugeStore method
* Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store
* Generate 100 gauge operations
* Seperate the labels into goroutines in sink test
* Generate kv store for the test case keys to avoid using uuid
* Added a race test with 300 samples for OTELSink
* [HCP Observability] OTELExporter (#17128)
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Create new OTELExporter which uses the MetricsClient
Add transform because the conversion is in an /internal package
* Fix lint error
* early return when there are no metrics
* Add NewOTELExporter() function
* Downgrade to metrics SDK version: v1.15.0-rc.1
* Fix imports
* fix small nits with comments and url.URL
* Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile
* Cleanup error handling and clarify empty metrics case
* Fix input/expected naming in otel_transform_test.go
* add comment for metric tracking
* Add a general isEmpty method
* Add clear error types
* update to latest version 1.15.0 of OTEL
* Do not pass in waitgroup and use error channel instead.
* Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel
* Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1
* Initialize OTELSink with sync.Map for all the instrument stores.
* Added telemetry agent to client and init sink in deps
* Fixed client
* Initalize sink in deps
* init sink in telemetry library
* Init deps before telemetry
* Use concrete telemetry.OtelSink type
* add /v1/metrics
* Avoid returning err for telemetry init
* move sink init within the IsCloudEnabled()
* Use HCPSinkOpts in deps instead
* update golden test for configuration file
* Switch to using extra sinks in the telemetry library
* keep name MetricsConfig
* fix log in verifyCCMRegistration
* Set logger in context
* pass around MetricSink in deps
* Fix imports
* Rebased onto otel sink pr
* Fix URL in test
* [HCP Observability] OTELSink (#17159)
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Create new OTELExporter which uses the MetricsClient
Add transform because the conversion is in an /internal package
* Fix lint error
* early return when there are no metrics
* Add NewOTELExporter() function
* Downgrade to metrics SDK version: v1.15.0-rc.1
* Fix imports
* fix small nits with comments and url.URL
* Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile
* Cleanup error handling and clarify empty metrics case
* Fix input/expected naming in otel_transform_test.go
* add comment for metric tracking
* Add a general isEmpty method
* Add clear error types
* update to latest version 1.15.0 of OTEL
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* Initialize OTELSink with sync.Map for all the instrument stores.
* Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests.
* Switch to mutex instead of sync.Map to avoid type assertion
* Add gauge store
* Clarify comments
* return concrete sink type
* Fix lint errors
* Move gauge store to be within sink
* Use context.TODO,rebase and clenaup opts handling
* Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1
* Fix imports
* Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx
* Add lots of documentation to the OTELSink
* Fix gauge store comment and check ok
* Add select and ctx.Done() check to gauge callback
* use require.Equal for attributes
* Fixed import naming
* Remove float64 calls and add a NewGaugeStore method
* Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store
* Generate 100 gauge operations
* Seperate the labels into goroutines in sink test
* Generate kv store for the test case keys to avoid using uuid
* Added a race test with 300 samples for OTELSink
* Do not pass in waitgroup and use error channel instead.
* Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel
* Fix nits
* pass extraSinks as function param instead
* Add default interval as package export
* remove verifyCCM func
* Add clusterID
* Fix import and add t.Parallel() for missing tests
* Kick Vercel CI
* Remove scheme from endpoint path, and fix error logging
* return metrics.MetricSink for sink method
* Update SDK
* [HCP Observability] Metrics filtering and Labels in Go Metrics sink (#17184)
* Move hcp client to subpackage hcpclient (#16800)
* [HCP Observability] New MetricsClient (#17100)
* Client configured with TLS using HCP config and retry/throttle
* Add tests and godoc for metrics client
* close body after request
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* remove clone
* Extract CloudConfig and mock for future PR
* Switch to hclog.FromContext
* [HCP Observability] New MetricsClient (#17100)
* Client configured with TLS using HCP config and retry/throttle
* Add tests and godoc for metrics client
* close body after request
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* remove clone
* Extract CloudConfig and mock for future PR
* Switch to hclog.FromContext
* [HCP Observability] New MetricsClient (#17100)
* Client configured with TLS using HCP config and retry/throttle
* Add tests and godoc for metrics client
* close body after request
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* remove clone
* Extract CloudConfig and mock for future PR
* Switch to hclog.FromContext
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Create new OTELExporter which uses the MetricsClient
Add transform because the conversion is in an /internal package
* Fix lint error
* early return when there are no metrics
* Add NewOTELExporter() function
* Downgrade to metrics SDK version: v1.15.0-rc.1
* Fix imports
* fix small nits with comments and url.URL
* Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile
* Cleanup error handling and clarify empty metrics case
* Fix input/expected naming in otel_transform_test.go
* add comment for metric tracking
* Add a general isEmpty method
* Add clear error types
* update to latest version 1.15.0 of OTEL
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* Initialize OTELSink with sync.Map for all the instrument stores.
* Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests.
* Switch to mutex instead of sync.Map to avoid type assertion
* Add gauge store
* Clarify comments
* return concrete sink type
* Fix lint errors
* Move gauge store to be within sink
* Use context.TODO,rebase and clenaup opts handling
* Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1
* Fix imports
* Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx
* Add lots of documentation to the OTELSink
* Fix gauge store comment and check ok
* Add select and ctx.Done() check to gauge callback
* use require.Equal for attributes
* Fixed import naming
* Remove float64 calls and add a NewGaugeStore method
* Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store
* Generate 100 gauge operations
* Seperate the labels into goroutines in sink test
* Generate kv store for the test case keys to avoid using uuid
* Added a race test with 300 samples for OTELSink
* [HCP Observability] OTELExporter (#17128)
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Create new OTELExporter which uses the MetricsClient
Add transform because the conversion is in an /internal package
* Fix lint error
* early return when there are no metrics
* Add NewOTELExporter() function
* Downgrade to metrics SDK version: v1.15.0-rc.1
* Fix imports
* fix small nits with comments and url.URL
* Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile
* Cleanup error handling and clarify empty metrics case
* Fix input/expected naming in otel_transform_test.go
* add comment for metric tracking
* Add a general isEmpty method
* Add clear error types
* update to latest version 1.15.0 of OTEL
* Do not pass in waitgroup and use error channel instead.
* Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel
* Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1
* Initialize OTELSink with sync.Map for all the instrument stores.
* Added telemetry agent to client and init sink in deps
* Fixed client
* Initalize sink in deps
* init sink in telemetry library
* Init deps before telemetry
* Use concrete telemetry.OtelSink type
* add /v1/metrics
* Avoid returning err for telemetry init
* move sink init within the IsCloudEnabled()
* Use HCPSinkOpts in deps instead
* update golden test for configuration file
* Switch to using extra sinks in the telemetry library
* keep name MetricsConfig
* fix log in verifyCCMRegistration
* Set logger in context
* pass around MetricSink in deps
* Fix imports
* Rebased onto otel sink pr
* Fix URL in test
* [HCP Observability] OTELSink (#17159)
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Create new OTELExporter which uses the MetricsClient
Add transform because the conversion is in an /internal package
* Fix lint error
* early return when there are no metrics
* Add NewOTELExporter() function
* Downgrade to metrics SDK version: v1.15.0-rc.1
* Fix imports
* fix small nits with comments and url.URL
* Fix tests by asserting actual error for context cancellation, fix parallel, and make mock more versatile
* Cleanup error handling and clarify empty metrics case
* Fix input/expected naming in otel_transform_test.go
* add comment for metric tracking
* Add a general isEmpty method
* Add clear error types
* update to latest version 1.15.0 of OTEL
* Client configured with TLS using HCP config and retry/throttle
* run go mod tidy
* Remove one abstraction to use the config from deps
* Address PR feedback
* Initialize OTELSink with sync.Map for all the instrument stores.
* Moved PeriodicReader init to NewOtelReader function. This allows us to use a ManualReader for tests.
* Switch to mutex instead of sync.Map to avoid type assertion
* Add gauge store
* Clarify comments
* return concrete sink type
* Fix lint errors
* Move gauge store to be within sink
* Use context.TODO,rebase and clenaup opts handling
* Rebase onto otl exporter to downgrade metrics API to v1.15.0-rc.1
* Fix imports
* Update to latest stable version by rebasing on cc-4933, fix import, remove mutex init, fix opts error messages and use logger from ctx
* Add lots of documentation to the OTELSink
* Fix gauge store comment and check ok
* Add select and ctx.Done() check to gauge callback
* use require.Equal for attributes
* Fixed import naming
* Remove float64 calls and add a NewGaugeStore method
* Change name Store to Set in gaugeStore, add concurrency tests in both OTELSink and gauge store
* Generate 100 gauge operations
* Seperate the labels into goroutines in sink test
* Generate kv store for the test case keys to avoid using uuid
* Added a race test with 300 samples for OTELSink
* Do not pass in waitgroup and use error channel instead.
* Using SHA 7dea2225a218872e86d2f580e82c089b321617b0 to avoid build failures in otel
* Fix nits
* pass extraSinks as function param instead
* Add default interval as package export
* remove verifyCCM func
* Add clusterID
* Fix import and add t.Parallel() for missing tests
* Kick Vercel CI
* Remove scheme from endpoint path, and fix error logging
* return metrics.MetricSink for sink method
* Update SDK
* Added telemetry agent to client and init sink in deps
* Add node_id and __replica__ default labels
* add function for default labels and set x-hcp-resource-id
* Fix labels tests
* Commit suggestion for getDefaultLabels
Co-authored-by: Joshua Timmons <joshua.timmons1@gmail.com>
* Fixed server.id, and t.Parallel()
* Make defaultLabels a method on the TelemetryConfig object
* Rename FilterList to lowercase filterList
* Cleanup filter implemetation by combining regex into a single one, and making the type lowercase
* Fix append
* use regex directly for filters
* Fix x-resource-id test to use mocked value
* Fix log.Error formats
* Forgot the len(opts.Label) optimization)
* Use cfg.NodeID instead
---------
Co-authored-by: Joshua Timmons <joshua.timmons1@gmail.com>
* remove replic tag (#17484)
* [HCP Observability] Add custom metrics for OTEL sink, improve logging, upgrade modules and cleanup metrics client (#17455)
* Add custom metrics for Exporter and transform operations
* Improve deps logging
Run go mod tidy
* Upgrade SDK and OTEL
* Remove the partial success implemetation and check for HTTP status code in metrics client
* Add x-channel
* cleanup logs in deps.go based on PR feedback
* Change to debug log and lowercase
* address test operation feedback
* use GetHumanVersion on version
* Fix error wrapping
* Fix metric names
* [HCP Observability] Turn off retries for now until dynamically configurable (#17496)
* Remove retries for now until dynamic configuration is possible
* Clarify comment
* Update changelog
* improve changelog
---------
Co-authored-by: Joshua Timmons <joshua.timmons1@gmail.com>
* perf: Remove expensive reflection from raft/mesh hot path
Replaces a reflection-based copy of a struct in the mesh topology with a
deep-copy generated implementation.
This is in the hot-path of raft FSM updates, and the reflection overhead was a
substantial part of mesh registration times (~90%). This could manifest as raft
thread saturation, and resulting instability.
Co-authored-by: Joel Brandhorst <joel.brandhorst@gmail.com>
* add changelog
---------
Co-authored-by: Joel Brandhorst <joel.brandhorst@gmail.com>
Co-authored-by: John Murret <john.murret@hashicorp.com>
This will likely happen frequently with sameness groups. Relaxing this
constraint is harmless for failover because xds/endpoints exludes cross
partition and peer endpoints.
* xds generation for routes api gateway
* Update gateway.go
* move buildHttpRoute into xds package
* Update agent/consul/discoverychain/gateway.go
* remove unneeded function
* convert http route code to only run for http protocol to future proof code path
* Update agent/consul/discoverychain/gateway.go
Co-authored-by: Mike Morris <mikemorris@users.noreply.github.com>
* fix tests, clean up http check logic
* clean up todo
* Fix casing in docstring
* Fix import block, adjust docstrings
* update name and comment
* use constant value
* use constant
---------
Co-authored-by: Mike Morris <mikemorris@users.noreply.github.com>
Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
Fix ACL check on health endpoint
Prior to this change, the service health API would not explicitly return an
error whenever a token with invalid permissions was given, and it would instead
return empty results. With this change, a "Permission denied" error is returned
whenever data is queried. This is done to better support the agent cache, which
performs a fetch backoff sleep whenever ACL errors are encountered. Affected
endpoints are: `/v1/health/connect/` and `/v1/health/ingress/`.
* API Gateway XDS Primitives, endpoints and clusters (#17002)
* XDS primitive generation for endpoints and clusters
Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
* server_test
* deleted extra file
* add missing parents to test
---------
Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
* Routes for API Gateway (#17158)
* XDS primitive generation for endpoints and clusters
Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
* server_test
* deleted extra file
* add missing parents to test
* checkpoint
* delete extra file
* httproute flattening code
* linting issue
* so close on this, calling for tonight
* unit test passing
* add in header manip to virtual host
* upstream rebuild commented out
* Use consistent upstream name whether or not we're rebuilding
* Start working through route naming logic
* Fix typos in test descriptions
* Simplify route naming logic
* Simplify RebuildHTTPRouteUpstream
* Merge additional compiled discovery chains instead of overwriting
* Use correct chain for flattened route, clean up + add TODOs
* Remove empty conditional branch
* Restore previous variable declaration
Limit the scope of this PR
* Clean up, improve TODO
* add logging, clean up todos
* clean up function
---------
Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
* checkpoint, skeleton, tests not passing
* checkpoint
* endpoints xds cluster configuration
* resources test fix
* fix reversion in resources_test
* checkpoint
* Update agent/proxycfg/api_gateway.go
Co-authored-by: John Maguire <john.maguire@hashicorp.com>
* unit tests passing
* gofmt
* add deterministic sorting to appease the unit test gods
* remove panic
* Find ready upstream matching listener instead of first in list
* Clean up, improve TODO
* Modify getReadyUpstreams to filter upstreams by listener (#17410)
Each listener would previously have all upstreams from any route that bound to the listener. This is problematic when a route bound to one listener also binds to other listeners and so includes upstreams for multiple listeners. The list for a given listener would then wind up including upstreams for other listeners.
* clean up todos, references to api gateway in listeners_ingress
* merge in Nathan's fix
* Update agent/consul/discoverychain/gateway.go
* cleanup current todos, remove snapshot manipulation from generation code
* Update agent/structs/config_entry_gateways.go
Co-authored-by: Thomas Eckert <teckert@hashicorp.com>
* Update agent/consul/discoverychain/gateway.go
Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
* Update agent/consul/discoverychain/gateway.go
Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
* Update agent/proxycfg/snapshot.go
Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
* clarified header comment for FlattenHTTPRoute, changed RebuildHTTPRouteUpstream to BuildHTTPRouteUpstream
* simplify cert logic
* Delete scratch
* revert route related changes in listener PR
* Update agent/consul/discoverychain/gateway.go
* Update agent/proxycfg/snapshot.go
* clean up uneeded extra lines in endpoints
---------
Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com>
Co-authored-by: John Maguire <john.maguire@hashicorp.com>
Co-authored-by: Thomas Eckert <teckert@hashicorp.com>
* agent: configure server lastseen timestamp
Signed-off-by: Dan Bond <danbond@protonmail.com>
* use correct config
Signed-off-by: Dan Bond <danbond@protonmail.com>
* add comments
Signed-off-by: Dan Bond <danbond@protonmail.com>
* use default age in test golden data
Signed-off-by: Dan Bond <danbond@protonmail.com>
* add changelog
Signed-off-by: Dan Bond <danbond@protonmail.com>
* fix runtime test
Signed-off-by: Dan Bond <danbond@protonmail.com>
* agent: add server_metadata
Signed-off-by: Dan Bond <danbond@protonmail.com>
* update comments
Signed-off-by: Dan Bond <danbond@protonmail.com>
* correctly check if metadata file does not exist
Signed-off-by: Dan Bond <danbond@protonmail.com>
* follow instructions for adding new config
Signed-off-by: Dan Bond <danbond@protonmail.com>
* add comments
Signed-off-by: Dan Bond <danbond@protonmail.com>
* update comments
Signed-off-by: Dan Bond <danbond@protonmail.com>
* Update agent/agent.go
Co-authored-by: Dan Upton <daniel@floppy.co>
* agent/config: add validation for duration with min
Signed-off-by: Dan Bond <danbond@protonmail.com>
* docs: add new server_rejoin_age_max config definition
Signed-off-by: Dan Bond <danbond@protonmail.com>
* agent: add unit test for checking server last seen
Signed-off-by: Dan Bond <danbond@protonmail.com>
* agent: log continually for 60s before erroring
Signed-off-by: Dan Bond <danbond@protonmail.com>
* pr comments
Signed-off-by: Dan Bond <danbond@protonmail.com>
* remove unneeded todo
* agent: fix error message
Signed-off-by: Dan Bond <danbond@protonmail.com>
---------
Signed-off-by: Dan Bond <danbond@protonmail.com>
Co-authored-by: Dan Upton <daniel@floppy.co>
* Add v1/internal/service-virtual-ip for manually setting service VIPs
* Attach service virtual IP info to compiled discovery chain
* Separate auto-assigned and manual VIPs in response
The grpc resolver implementation is fed from changes to the
router.Router. Within the router there is a map of various areas storing
the addressing information for servers in those areas. All map entries
are of the WAN variety except a single special entry for the LAN.
Addressing information in the LAN "area" are local addresses intended
for use when making a client-to-server or server-to-server request.
The client agent correctly updates this LAN area when receiving lan serf
events, so by extension the grpc resolver works fine in that scenario.
The server agent only initially populates a single entry in the LAN area
(for itself) on startup, and then never mutates that area map again.
For normal RPCs a different structure is used for LAN routing.
Additionally when selecting a server to contact in the local datacenter
it will randomly select addresses from either the LAN or WAN addressed
entries in the map.
Unfortunately this means that the grpc resolver stack as it exists on
server agents is either broken or only accidentally functions by having
servers dial each other over the WAN-accessible address. If the operator
disables the serf wan port completely likely this incidental functioning
would break.
This PR enforces that local requests for servers (both for stale reads
or leader forwarded requests) exclusively use the LAN "area" information
and also fixes it so that servers keep that area up to date in the
router.
A test for the grpc resolver logic was added, as well as a higher level
full-stack test to ensure the externally perceived bug does not return.
* snapshot: some improvments to the snapshot process
Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Fix multiple issues related to proxycfg health queries.
1. The datacenter was not being provided to a proxycfg query, which resulted in
bypassing agentless query optimizations and using the normal API instead.
2. The health rpc endpoint would return a zero index when insufficient ACLs were
detected. This would result in the agent cache performing an infinite loop of
queries in rapid succession without backoff.