Commit Graph

5059 Commits

Author SHA1 Message Date
Dan Upton f7c4f04060
Controller Supervision (#17016) 2023-04-25 12:52:35 +01:00
John Maguire c5b7164b16
APIGW Normalize Status Conditions (#16994)
* normalize status conditions for gateways and routes

* Added tests for checking condition status and panic conditions for
validating combinations, added dummy code for fsm store

* get rid of unneeded gateway condition generator struct

* Remove unused file

* run go mod tidy

* Update tests, add conflicted gateway status

* put back removed status for test

* Fix linting violation, remove custom conflicted status

* Update fsm commands oss

* Fix incorrect combination of type/condition/status

* cleaning up from PR review

* Change "invalidCertificate" to be of accepted status

* Move status condition enums into api package

* Update gateways controller and generated code

* Update conditions in fsm oss tests

* run go mod tidy on consul-container module to fix linting

* Fix type for gateway endpoint test

* go mod tidy from changes to api

* go mod tidy on troubleshoot

* Fix route conflicted reason

* fix route conflict reason rename

* Fix text for gateway conflicted status

* Add valid certificate ref condition setting

* Revert change to resolved refs to be handled in future PR
2023-04-24 16:22:55 -04:00
Michael Wilkerson 40dd8ce65b
Add sameness group field to prepared queries (#17089)
* added method for converting SamenessGroupConfigEntry
- added new method `ToQueryFailoverTargets` for converting a SamenessGroupConfigEntry's members to a list of QueryFailoverTargets
- renamed `ToFailoverTargets` ToServiceResolverFailoverTargets to distinguish it from `ToQueryFailoverTargets`

* Added SamenessGroup to PreparedQuery
- exposed Service.Partition to API when defining a prepared query
- added a method for determining if a QueryFailoverOptions is empty
- This will be useful for validation
- added unit tests

* added method for retrieving a SamenessGroup to state store

* added logic for using PQ with SamenessGroup
- added branching path for SamenessGroup handling in execute. It will be handled separate from the normal PQ case
- added a new interface so that the `GetSamenessGroupFailoverTargets` can be properly tested
- separated the execute logic into a `targetSelector` function so that it can be used for both failover and sameness group PQs
- split OSS only methods into new PQ OSS files
- added validation that `samenessGroup` is an enterprise only feature

* added documentation for PQ SamenessGroup
2023-04-24 13:21:28 -07:00
Derek Menteer 136adf52da
Fix virtual services being included in intention topology as downstreams. (#17099) 2023-04-24 12:03:26 -05:00
Semir Patel 2409c32e20
De-scope tenenacy requirements to OSS only for now. (#17087)
Partition and namespace must be "default"
Peername must be "local"
2023-04-24 08:14:51 -05:00
Kyle Havlovitz dee8609793
Include virtual services from discovery chain in intention topology (#16862) 2023-04-21 16:58:13 +00:00
Kyle Havlovitz 29daa2f054
Add manual virtual IP support to state store (#16815) 2023-04-21 09:19:02 -07:00
Eric Haberkorn e61ba07fa1
Fix a bug with disco chain config entry fetching (#17078)
Before this change, we were not fetching service resolvers (and therefore
service defaults) configuration entries for services on members of sameness
groups.
2023-04-21 09:18:32 -04:00
Semir Patel b12d638046
Enforce operator:write acl on `WriteStatus` endpoint (#17019) 2023-04-20 16:25:33 +00:00
Eric Haberkorn 87994e4c5f
Add sameness groups to service intentions. (#17064) 2023-04-20 12:16:04 -04:00
hashicorp-copywrite[bot] 87aee8308b
[COMPLIANCE] Add Copyright and License Headers (#16854)
Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com>
Co-authored-by: Ronald <roncodingenthusiast@users.noreply.github.com>
2023-04-20 12:40:22 +00:00
Paul Glass 972e81bd67
[NET-3091] Update service intentions to support jwt provider references (#17037)
* [NET-3090] Add new JWT provider config entry

* Add initial test cases

* update validations for jwt-provider config entry fields

* more validation

* start improving tests

* more tests

* Normalize

* Improve tests and move validate fns

* usage test update

* Add split between ent and oss for partitions

* fix lint issues

* Added retry backoff, fixed tests, removed unused defaults

* take into account default partitions

* use countTrue and add aliases

* omit audiences if empty

* fix failing tests

* add omit-entry

* Add JWT intentions

* generate proto

* fix deep copy issues

* remove extra field

* added some tests

* more tests

* add validation for creating existing jwt

* fix nil issue

* More tests, fix conflicts and improve memdb call

* fix namespace

* add aliases

* consolidate errors, skip duplicate memdb calls

* reworked iteration over config entries

* logic improvements from review

---------

Co-authored-by: Ronald Ekambi <ronekambi@gmail.com>
2023-04-19 18:16:39 -04:00
Paul Glass 91ca3b012c
[NET-3090] Add new JWT provider config entry (#17036)
* [NET-3090] Add new JWT provider config entry

* Add initial test cases

* update validations for jwt-provider config entry fields

* more validation

* start improving tests

* more tests

* Normalize

* Improve tests and move validate fns

* usage test update

* Add split between ent and oss for partitions

* fix lint issues

* Added retry backoff, fixed tests, removed unused defaults

* take into account default partitions

* use countTrue and add aliases

* omit audiences if empty

* fix failing tests

* add omit-entry

* update copyright headers ids

---------

Co-authored-by: Ronald Ekambi <ronekambi@gmail.com>
Co-authored-by: Ronald <roncodingenthusiast@users.noreply.github.com>
2023-04-19 17:54:14 -04:00
Paul Glass d8d89d4b59
Permissive mTLS (#17035)
This implements permissive mTLS , which allows toggling services into "permissive" mTLS mode.
Permissive mTLS mode allows incoming "non Consul-mTLS" traffic to be forward unmodified to the application.

* Update service-defaults and proxy-defaults config entries with a MutualTLSMode field
* Update the mesh config entry with an AllowEnablingPermissiveMutualTLS field and implement the necessary validation. AllowEnablingPermissiveMutualTLS must be true to allow changing to MutualTLSMode=permissive, but this does not require that all proxy-defaults and service-defaults are currently in strict mode.
* Update xDS listener config to add a "permissive filter chain" when MutualTLSMode=permissive for a particular service. The permissive filter chain matches incoming traffic by the destination port. If the destination port matches the service port from the catalog, then no mTLS is required and the traffic sent is forwarded unmodified to the application.
2023-04-19 14:45:00 -05:00
R.B. Boyer 5e019393d3
Revert "cache: refactor agent cache fetching to prevent unnecessary f… (#16818) (#17046)
Revert "cache: refactor agent cache fetching to prevent unnecessary fetches on error (#14956)"

Co-authored-by: Derek Menteer <105233703+hashi-derek@users.noreply.github.com>
2023-04-19 13:17:21 -05:00
John Murret d7c488762e
ci: remove test-integrations CircleCI workflow (#16928)
* remove all CircleCI files

* remove references to CircleCI

* remove more references to CircleCI

* pin golangci-lint to v1.51.1 instead of v1.51
2023-04-19 16:19:29 +00:00
Luke Kysow 9345ce89d4
Don't send updates twice (#16999) 2023-04-18 10:41:58 -07:00
Poonam Jadhav 4f7d1b4700
feat: set up reporting agent (#16991) 2023-04-18 11:03:05 -04:00
Dan Upton 3466c85cc4
server: wire up in-process Resource Service (#16978) 2023-04-18 10:03:23 +01:00
Semir Patel 0674f30fc1
Tenancy wildcard validaton for `Write`, `Read`, and `Delete` endpoints (#17004) 2023-04-17 16:33:20 -05:00
Derek Menteer 7ce928a42e
Add PrioritizeByLocality to config entries. (#17007)
This commit adds the PrioritizeByLocality field to both proxy-config
and service-resolver config entries for locality-aware routing. The
field is currently intended for enterprise only, and will be used to
enable prioritization of service-mesh connections to services based
on geographical region / zone.
2023-04-14 15:42:54 -05:00
Michael Wilkerson 4edb1b553d
* added Sameness Group to proto files (#16998)
- added Sameness Group to config entries
- added Sameness Group to subscriptions

* generated proto files

* added Sameness Group events to the state store
- added test cases

* Refactored health RPC Client
- moved code that is common to rpcclient under rpcclient common.go. This will help set us up to support future RPC clients

* Refactored proxycfg glue views
- Moved views to rpcclient config entry. This will allow us to reuse this code for a config entry client

* added config entry RPC Client
- Copied most of the testing code from rpcclient/health

* hooked up new rpcclient in agent

* fixed documentation and comments for clarity
2023-04-14 09:24:46 -07:00
Dhia Ayachi b628355cc4
add IP rate limiting config update (#16997)
* add IP rate limiting config update

* fix review comments
2023-04-14 09:26:38 -04:00
Semir Patel fc3d024d4d
Enforce Owner rules in `Write` endpoint (#16983) 2023-04-14 08:19:46 -05:00
Semir Patel 1f860b99d2
Fix delete when uid not provided (#16996) 2023-04-14 08:18:24 -05:00
Eric Haberkorn ece9b58e97
move enterprise test cases out of open source (#16985) 2023-04-13 09:07:06 -04:00
Semir Patel f9311318e1
Add mutate hook to `Write` endpoint (#16958) 2023-04-12 16:50:07 -05:00
Semir Patel 53a0755f03
Enforce ACLs on resource `Write` and `Delete` endpoints (#16956) 2023-04-12 16:22:44 -05:00
Dhia Ayachi 825663b38a
Memdb Txn Commit race condition fix (#16871)
* Add a test to reproduce the race condition

* Fix race condition by publishing the event after the commit and adding a lock to prevent out of order events.

* split publish to generate the list of events before committing the transaction.

* add changelog

* remove extra func

* Apply suggestions from code review

Co-authored-by: Dan Upton <daniel@floppy.co>

* add comment to explain test

---------

Co-authored-by: Dan Upton <daniel@floppy.co>
2023-04-12 13:18:01 -04:00
Poonam Jadhav c8d21de074
feat: add reporting config with reload (#16890) 2023-04-11 15:04:02 -04:00
Dan Upton d46543631c
resource: `WriteStatus` endpoint (#16886) 2023-04-11 19:23:14 +01:00
Derek Menteer 2a13c9af1f
Remove deprecated service-defaults upstream behavior. (#16957)
Prior to this change, peer services would be targeted by service-default
overrides as long as the new `peer` field was not found in the config entry.
This commit removes that deprecated backwards-compatibility behavior. Now
it is necessary to specify the `peer` field in order for upstream overrides
to apply to a peer upstream.
2023-04-11 10:20:33 -05:00
Semir Patel 8d0d600ea3
Resource validation hook for `Write` endpoint (#16950) 2023-04-11 06:55:32 -05:00
Semir Patel ca19954c08
Check acls on resource `Read`, `List`, and `WatchList` (#16842) 2023-04-11 06:10:14 -05:00
John Maguire 3d11e9b26a
APIGW: Routes with duplicate parents should be invalid (#16926)
* ensure route parents are unique when creating an http route

* Ensure tcp route parents are unique

* Added unit tests
2023-04-10 13:20:32 -04:00
John Eikenberry 30d3a087dc
log warning about certificate expiring sooner and with more details
The old setting of 24 hours was not enough time to deal with an expiring certificates. This change ups it to 28 days OR 40% of the full cert duration, whichever is shorter. It also adds details to the log message to indicate which certificate it is logging about and a suggested action.
2023-04-07 20:38:07 +00:00
Chris Thain f9126b6c3a
Wasm Envoy HTTP extension (#16877) 2023-04-06 14:12:07 -07:00
Semir Patel 2b0a5b52c2
Resource `Delete` endpoint (#16756) 2023-04-06 08:58:54 -05:00
Dan Upton 4e8ab7a390
Resource `Write` endpoint (#16786) 2023-04-06 10:40:04 +01:00
Dan Upton 52ce151221
Raft storage backend (#16619) 2023-04-04 17:30:06 +01:00
cskh 97c4424743
docs: add envoy to the proxycfg diagram (#16834)
* docs: add envoy to the proxycfg diagram
2023-04-04 09:42:42 -04:00
Freddy 04e6e79b09
Allow dialer to re-establish terminated peering (#16776)
Currently, if an acceptor peer deletes a peering the dialer's peering
will eventually get to a "terminated" state. If the two clusters need to
be re-peered the acceptor will re-generate the token but the dialer will
encounter this error on the call to establish:

"failed to get addresses to dial peer: failed to refresh peer server
addresses, will continue to use initial addresses: there is no active
peering for "<<<ID>>>""

This is because in `exchangeSecret().GetDialAddresses()` we will get an
error if fetching addresses for an inactive peering. The peering shows
up as inactive at this point because of the existing terminated state.

Rather than checking whether a peering is active we can instead check
whether it was deleted. This way users do not need to delete terminated
peerings in the dialing cluster before re-establishing them.
2023-04-03 12:07:45 -06:00
Chris S. Kim 8671448b73
Connect CA Primary Provider refactor (#16749)
* Rename Intermediate cert references to LeafSigningCert

Within the Consul CA subsystem, the term "Intermediate"
is confusing because the meaning changes depending on
provider and datacenter (primary vs secondary). For
example, when using the Consul CA the "ActiveIntermediate"
may return the root certificate in a primary datacenter.

At a high level, we are interested in knowing which
CA is responsible for signing leaf certs, regardless of
its position in a certificate chain. This rename makes
the intent clearer.

* Move provider state check earlier

* Remove calls to GenerateLeafSigningCert

GenerateLeafSigningCert (formerly known
as GenerateIntermediate) is vestigial in
non-Vault providers, as it simply returns
the root certificate in primary
datacenters.

By folding Vault's intermediate cert logic
into `GenerateRoot` we can encapsulate
the intermediate cert handling within
`newCARoot`.

* Move GenerateLeafSigningCert out of PrimaryProvidder

Now that the Vault Provider calls
GenerateLeafSigningCert within
GenerateRoot, we can remove the method
from all other providers that never
used it in a meaningful way.

* Add test for IntermediatePEM

* Rename GenerateRoot to GenerateCAChain

"Root" was being overloaded in the Consul CA
context, as different providers and configs
resulted in a single root certificate or
a chain originating from an external trusted
CA. Since the Vault provider also generates
intermediates, it seems more accurate to
call this a CAChain.
2023-04-03 11:40:33 -04:00
Eric Haberkorn beb9f17337
Add default resolvers to disco chains based on the default sameness group (#16837) 2023-03-31 14:35:56 -04:00
Derek Menteer ee59a81dc9
Add sameness-group to exported-services config entries (#16836)
This PR adds the sameness-group field to exported-service
config entries, which allows for services to be exported
to multiple destination partitions / peers easily.
2023-03-31 12:36:44 -05:00
Dan Upton 37207b4e4c
storage: fix resource leak in Watch (#16817) 2023-03-31 13:24:19 +01:00
Eric Haberkorn b97a3a17d8
add order by locality failover to Consul enterprise (#16791) 2023-03-30 10:08:38 -04:00
Ronald 71fb0a723e
Copyright headers for missing files/folders (#16708)
* copyright headers for agent folder
2023-03-28 18:48:58 -04:00
Ronald dd0e8eec14
copyright headers for agent folder (#16704)
* copyright headers for agent folder

* Ignore test data files

* fix proto files and remove headers in agent/uiserver folder

* ignore deep-copy files
2023-03-28 14:39:22 -04:00
John Maguire 09512ae32d
Update normalization of route refs (#16789)
* Use merge of enterprise meta's rather than new custom method

* Add merge logic for tcp routes

* Add changelog

* Normalize certificate refs on gateways

* Fix infinite call loop

* Explicitly call enterprise meta
2023-03-28 11:23:49 -04:00
Michael Wilkerson baa1fd3cd6
changes to support new PQ enterprise fields (#16793) 2023-03-27 15:40:49 -07:00
Semir Patel bd4a01f38f
Resource service List(..) endpoint (#16753) 2023-03-27 16:25:27 -05:00
Dhia Ayachi 6da620159e
add ip rate limiter controller OSS parts (#16790) 2023-03-27 17:00:25 -04:00
Kyle Havlovitz 045f39ddd9
Allocate virtual ip for resolver/router/splitter config entries (#16760) 2023-03-27 13:04:24 -07:00
Semir Patel 0b441e07cc
WatchList(..) endpoint for the resource service (#16726) 2023-03-27 14:37:54 -05:00
John Maguire 74dfee9359
Fix struct tags for TCPService enterprise meta (#16781)
* Fix struct tags for TCPService enterprise meta

* Add changelog
2023-03-27 16:17:04 +00:00
Semir Patel 9f607d4970
Read(...) endpoint for the resource service (#16655) 2023-03-27 10:35:39 -05:00
Derek Menteer 5be6469506
Change partition for peers in discovery chain targets (#16769)
This commit swaps the partition field to the local partition for
discovery chains targeting peers. Prior to this change, peer upstreams
would always use a value of default regardless of which partition they
exist in. This caused several issues in xds / proxycfg because of id
mismatches.

Some prior fixes were made to deal with one-off id mismatches that this
PR also cleans up, since they are no longer needed.
2023-03-24 15:40:19 -05:00
John Eikenberry 56b93388bc
tests instantiating clients w/o shutting down (#16755)
noticed via their port still in use messages.
2023-03-24 16:54:11 +00:00
Poonam Jadhav 82a81dc486
fix: remove unused tenancy category from rate limit spec (#16740) 2023-03-23 12:14:59 -04:00
Dhia Ayachi 8fb85d6b7a
delete config when nil (#16690)
* delete config when nil

* fix mock interface implementation

* fix handler test to use the right assertion

* extract DeleteConfig as a separate API.

* fix mock limiter implementation to satisfy the new interface

* fix failing tests

* add test comments
2023-03-22 15:19:54 -04:00
Eric Haberkorn 672c1eea41
add enterprise xds tests (#16738) 2023-03-22 14:56:18 -04:00
Eric Haberkorn d7c81a3b1d
fix bug where pqs that failover to a cluster peer dont un-fail over (#16729) 2023-03-22 09:24:13 -04:00
cskh 3a4835e258
fix: gracefully fail on invalid port number (#16721) 2023-03-21 22:29:21 -04:00
John Maguire b97ff15b4c
Remove unused are hosts set check (#16691)
* Remove unused are hosts set check

* Remove all traces of unused 'AreHostsSet' parameter

* Remove unused Hosts attribute

* Remove commented out use of snap.APIGateway.Hosts
2023-03-21 16:23:23 +00:00
Nitya Dhanushkodi 69bd62f9c3
peering: peering partition failover fixes (#16673)
add local source partition for peered upstreams
2023-03-20 10:00:29 -07:00
John Maguire 2e07180662
Fix route subscription when using namespaces (#16677)
* Fix route subscription when using namespaces

* Update changelog

* Fix changelog entry to reference that the bug was enterprise only
2023-03-20 12:42:30 -04:00
Melisa Griffin fa1b6e7450
Adds check to verify that the API Gateway is being created with at least one listener 2023-03-20 12:37:30 -04:00
Poonam Jadhav 3be683fcc1
feat: add category annotation to RPC and gRPC methods (#16646) 2023-03-20 11:24:29 -04:00
Eric Haberkorn 2bf2e81a6b
add sameness groups to discovery chains (#16671) 2023-03-20 09:12:37 -04:00
Andrew Stucki a597cb3d57
[API Gateway] Fix invalid cluster causing gateway programming delay (#16661)
* Add test for http routes

* Add fix

* Fix tests

* Add changelog entry

* Refactor and fix flaky tests
2023-03-17 13:31:04 -04:00
Eric Haberkorn 68046060ea
add sameness group support to service resolver failover and redirects (#16664) 2023-03-17 10:48:06 -04:00
Eric Haberkorn 65bfef6df1
fix confusing spiffe ids in golden tests (#16643) 2023-03-15 14:30:36 -04:00
wangxinyi7 4aa1b5ee01
net 2731 ip config entry OSS version (#16642)
* ip config entry

* name changing

* move to ent

* ent version

* renaming

* change format

* renaming

* refactor

* add default values
2023-03-15 11:21:24 -07:00
John Maguire 7fca314579
Update e2e tests for namespaces (#16627)
* Refactored "NewGatewayService" to handle namespaces, fixed
TestHTTPRouteFlattening test

* Fixed existing http_route tests for namespacing

* Squash aclEnterpriseMeta for ResourceRefs and HTTPServices, accept
namespace for creating connect services and regular services

* Use require instead of assert after creating namespaces in
http_route_tests

* Refactor NewConnectService and NewGatewayService functions to use cfg
objects to reduce number of method args

* Rename field on SidecarConfig in tests from `SidecarServiceName` to
`Name` to avoid stutter
2023-03-15 17:51:36 +00:00
Freddy a5bd98ae3e
Backport ENT-4704 (#16612) 2023-03-14 14:55:11 -06:00
Derek Menteer f3be5d9b80
Fix issue with trust bundle read ACL check. (#16630)
This commit fixes an issue where trust bundles could not be read
by services in a non-default namespace, unless they had excessive
ACL permissions given to them.

Prior to this change, `service:write` was required in the default
namespace in order to read the trust bundle. Now, `service:write`
to a service in any namespace is sufficient.
2023-03-14 12:24:33 -05:00
Chris S. Kim bb4baeba95
Preserve CARoots when updating Vault CA configuration (#16592)
If a CA config update did not cause a root change, the codepath would return early and skip some steps which preserve its intermediate certificates and signing key ID. This commit re-orders some code and prevents updates from generating new intermediate certificates.
2023-03-13 17:32:59 -04:00
Derek Menteer 5d17b2c90b
Add sameness-group configuration entry. (#16608)
This commit adds a sameness-group config entry to the API and structs packages. It includes some validation logic and a new memdb index that tracks the default sameness-group for each partition. Sameness groups will simplify the effort of managing failovers / intentions / exports for peers and partitions.

Note that this change purely to introduce the configuration entry and does not include the full functionality of sameness-groups.
2023-03-13 16:19:11 -05:00
Ashvitha f514182f3e
Allow HCP metrics collection for Envoy proxies
Co-authored-by: Ashvitha Sridharan <ashvitha.sridharan@hashicorp.com>
Co-authored-by: Freddy <freddygv@users.noreply.github.com>

Add a new envoy flag: "envoy_hcp_metrics_bind_socket_dir", a directory
where a unix socket will be created with the name
`<namespace>_<proxy_id>.sock` to forward Envoy metrics.

If set, this will configure:
- In bootstrap configuration a local stats_sink and static cluster.
  These will forward metrics to a loopback listener sent over xDS.

- A dynamic listener listening at the socket path that the previously
  defined static cluster is sending metrics to.

- A dynamic cluster that will forward traffic received at this listener
  to the hcp-metrics-collector service.


Reasons for having a static cluster pointing at a dynamic listener:
- We want to secure the metrics stream using TLS, but the stats sink can
  only be defined in bootstrap config. With dynamic listeners/clusters
  we can use the proxy's leaf certificate issued by the Connect CA,
  which isn't available at bootstrap time.

- We want to intelligently route to the HCP collector. Configuring its
  addreess at bootstrap time limits our flexibility routing-wise. More
  on this below.

Reasons for defining the collector as an upstream in `proxycfg`:
- The HCP collector will be deployed as a mesh service.

- Certificate management is taken care of, as mentioned above.

- Service discovery and routing logic is automatically taken care of,
  meaning that no code changes are required in the xds package.

- Custom routing rules can be added for the collector using discovery
  chain config entries. Initially the collector is expected to be
  deployed to each admin partition, but in the future could be deployed
  centrally in the default partition. These config entries could even be
  managed by HCP itself.
2023-03-10 13:52:54 -07:00
Eric Haberkorn d449096190
Add Peer Locality to Discovery Chains (#16588)
Add peer locality to discovery chains
2023-03-10 12:59:47 -05:00
Eric Haberkorn 0351f48bfd
allow setting locality on services and nodes (#16581) 2023-03-10 09:36:15 -05:00
Semir Patel a4780c60b8
GRPC stub for the ResourceService (#16528) 2023-03-09 13:40:23 -06:00
Andrew Stucki 28869784e1
auto-updated agent/uiserver/dist/ from commit 63204b518 (#16587)
Co-authored-by: hc-github-team-consul-core <github-team-consul-core@hashicorp.com>
2023-03-09 13:56:53 -05:00
Eric Haberkorn a66433e527
fix bug that can lead to peering service deletes impacting the state of local services (#16570) 2023-03-08 11:24:03 -05:00
Eric Haberkorn 1d9a09f276
add agent locality and replicate it across peer streams (#16522) 2023-03-07 14:05:23 -05:00
John Eikenberry 25ed13261b
support vault auth config for alicloud ca provider
Add support for using existing vault auto-auth configurations as the
provider configuration when using Vault's CA provider with AliCloud.

AliCloud requires 2 extra fields to enable it to use STS (it's preferred
auth setup). Our vault-plugin-auth-alicloud package contained a method
to help generate them as they require you to make an http call to
a faked endpoint proxy to get them (url and headers base64 encoded).
2023-03-07 03:02:05 +00:00
Melisa Griffin dac0cc90ed
NET-2904 Fixes API Gateway Route Service Weight Division Error 2023-03-06 08:41:57 -05:00
Melisa Griffin 71379b9621
NET-2903 Normalize weight for http routes (#16512)
* NET-2903 Normalize weight for http routes

* Update website/content/docs/connect/gateways/api-gateway/configuration/http-route.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>
2023-03-03 16:39:59 -05:00
R.B. Boyer b089f93292
proxycfg: ensure that an irrecoverable error in proxycfg closes the xds session and triggers a replacement proxycfg watcher (#16497)
Receiving an "acl not found" error from an RPC in the agent cache and the
streaming/event components will cause any request loops to cease under the
assumption that they will never work again if the token was destroyed. This
prevents log spam (#14144, #9738).

Unfortunately due to things like:

- authz requests going to stale servers that may not have witnessed the token
  creation yet

- authz requests in a secondary datacenter happening before the tokens get
  replicated to that datacenter

- authz requests from a primary TO a secondary datacenter happening before the
  tokens get replicated to that datacenter

The caller will get an "acl not found" *before* the token exists, rather than
just after. The machinery added above in the linked PRs will kick in and
prevent the request loop from looping around again once the tokens actually
exist.

For `consul-dataplane` usages, where xDS is served by the Consul servers
rather than the clients ultimately this is not a problem because in that
scenario the `agent/proxycfg` machinery is on-demand and launched by a new xDS
stream needing data for a specific service in the catalog. If the watching
goroutines are terminated it ripples down and terminates the xDS stream, which
CDP will eventually re-establish and restart everything.

For Consul client usages, the `agent/proxycfg` machinery is ahead-of-time
launched at service registration time (called "local" in some of the proxycfg
machinery) so when the xDS stream comes in the data is already ready to go. If
the watching goroutines terminate it should terminate the xDS stream, but
there's no mechanism to re-spawn the watching goroutines. If the xDS stream
reconnects it will see no `ConfigSnapshot` and will not get one again until
the client agent is restarted, or the service is re-registered with something
changed in it.

This PR fixes a few things in the machinery:

- there was an inadvertent deadlock in fetching snapshot from the proxycfg
  machinery by xDS, such that when the watching goroutine terminated the
  snapshots would never be fetched. This caused some of the xDS machinery to
  get indefinitely paused and not finish the teardown properly.

- Every 30s we now attempt to re-insert all locally registered services into
  the proxycfg machinery.

- When services are re-inserted into the proxycfg machinery we special case
  "dead" ones such that we unilaterally replace them rather that doing that
  conditionally.
2023-03-03 14:27:53 -06:00
John Eikenberry 95288615fa
add provider ca support for approle auth-method
Adds support for the approle auth-method. Only handles using the approle
role/secret to auth and it doesn't support the agent's extra management
configuration options (wrap and delete after read) as they are not
required as part of the auth (ie. they are vault agent things).
2023-03-03 19:29:53 +00:00
Andrew Stucki ba667221a5
Fix resolution of service resolvers with subsets for external upstreams (#16499)
* Fix resolution of service resolvers with subsets for external upstreams

* Add tests

* Add changelog entry

* Update view filter logic
2023-03-03 14:17:11 -05:00
Eric Haberkorn 5c8414e772
Add support for failover policies (#16505) 2023-03-03 11:12:38 -05:00
Andrew Stucki 6ca1c9f15c
Fix issue where terminating gateway service resolvers weren't properly cleaned up (#16498)
* Fix issue where terminating gateway service resolvers weren't properly cleaned up

* Add integration test for cleaning up resolvers

* Add changelog entry

* Use state test and drop integration test
2023-03-03 09:56:57 -05:00
Andrew Stucki 2916821b55
Add ServiceResolver RequestTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable (#16495)
* Leverage ServiceResolver ConnectTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable

* Regenerate golden files

* Add RequestTimeout field

* Add changelog entry
2023-03-03 09:37:12 -05:00
John Eikenberry 1a065c08bc
add provider ca auth support for kubernetes
Adds support for Kubernetes jwt/token file based auth. Only needs to
read the file and save the contents as the jwt/token.
2023-03-02 22:05:40 +00:00
John Eikenberry a75800a988
add provider ca support for jwt file base auth
Adds support for a jwt token in a file. Simply reads the file and sends
the read in jwt along to the vault login.

It also supports a legacy mode with the jwt string being passed
directly. In which case the path is made optional.
2023-03-02 20:33:06 +00:00
Chris S. Kim fea543993d
Speed up test by registering services concurrently (#16509) 2023-03-02 14:36:44 -05:00
John Eikenberry 5ac637f07d
add provider ca auth-method support for azure
Does the required dance with the local HTTP endpoint to get the required
data for the jwt based auth setup in Azure. Keeps support for 'legacy'
mode where all login data is passed on via the auth methods parameters.
Refactored check for hardcoded /login fields.
2023-03-01 00:07:33 +00:00
Dan Upton 118ffb1e95
grpc: fix data race in balancer registration (#16229)
Registering gRPC balancers is thread-unsafe because they are stored in a
global map variable that is accessed without holding a lock. Therefore,
it's expected that balancers are registered _once_ at the beginning of
your program (e.g. in a package `init` function) and certainly not after
you've started dialing connections, etc.

> NOTE: this function must only be called during initialization time
> (i.e. in an init() function), and is not thread-safe.

While this is fine for us in production, it's challenging for tests that
spin up multiple agents in-memory. We currently register a balancer per-
agent which holds agent-specific state that cannot safely be shared.

This commit introduces our own registry that _is_ thread-safe, and
implements the Builder interface such that we can call gRPC's `Register`
method once, on start-up. It uses the same pattern as our resolver
registry where we use the dial target's host (aka "authority"), which is
unique per-agent, to determine which builder to use.
2023-02-28 10:18:38 +00:00