Commit Graph

11940 Commits

Author SHA1 Message Date
R.B. Boyer 64fc002e03
connect: fix failover through a mesh gateway to a remote datacenter (#6259)
Failover is pushed entirely down to the data plane by creating envoy
clusters and putting each successive destination in a different load
assignment priority band. For example this shows that normally requests
go to 1.2.3.4:8080 but when that fails they go to 6.7.8.9:8080:

- name: foo
  load_assignment:
    cluster_name: foo
    policy:
      overprovisioning_factor: 100000
    endpoints:
    - priority: 0
      lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: 1.2.3.4
              port_value: 8080
    - priority: 1
      lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: 6.7.8.9
              port_value: 8080

Mesh gateways route requests based solely on the SNI header tacked onto
the TLS layer. Envoy currently only lets you configure the outbound SNI
header at the cluster layer.

If you try to failover through a mesh gateway you ideally would
configure the SNI value per endpoint, but that's not possible in envoy
today.

This PR introduces a simpler way around the problem for now:

1. We identify any target of failover that will use mesh gateway mode local or
   remote and then further isolate any resolver node in the compiled discovery
   chain that has a failover destination set to one of those targets.

2. For each of these resolvers we will perform a small measurement of
   comparative healths of the endpoints that come back from the health API for the
   set of primary target and serial failover targets. We walk the list of targets
   in order and if any endpoint is healthy we return that target, otherwise we
   move on to the next target.

3. The CDS and EDS endpoints both perform the measurements in (2) for the
   affected resolver nodes.

4. For CDS this measurement selects which TLS SNI field to use for the cluster
   (note the cluster is always going to be named for the primary target)

5. For EDS this measurement selects which set of endpoints will populate the
   cluster. Priority tiered failover is ignored.

One of the big downsides to this approach to failover is that the failover
detection and correction is going to be controlled by consul rather than
deferring that entirely to the data plane as with the prior version. This also
means that we are bound to only failover using official health signals and
cannot make use of data plane signals like outlier detection to affect
failover.

In this specific scenario the lack of data plane signals is ok because the
effectiveness is already muted by the fact that the ultimate destination
endpoints will have their data plane signals scrambled when they pass through
the mesh gateway wrapper anyway so we're not losing much.

Another related fix is that we now use the endpoint health from the
underlying service, not the health of the gateway (regardless of
failover mode).
2019-08-05 13:30:35 -05:00
Alvin Huang 4f6523b2d7
Merge pull request #6274 from hashicorp/merge-master-de01a1e
Merge master at de01a1e279230624fcc2d7e692b7e773d570204b
2019-08-02 19:13:54 -04:00
Alvin Huang a9dc90b001 fix grpc-addr-config hosts template 2019-08-02 19:00:39 -04:00
Alvin Huang ae898a4a33 Merge remote-tracking branch 'origin/master' into release/1-6 2019-08-02 18:09:32 -04:00
R.B. Boyer eaeb9998f2 update changelog 2019-08-02 15:36:13 -05:00
R.B. Boyer 0165e93517
connect: expose an API endpoint to compile the discovery chain (#6248)
In addition to exposing compilation over the API cleaned up the structures that would be exchanged to be cleaner and easier to support and understand.

Also removed ability to configure the envoy OverprovisioningFactor.
2019-08-02 15:34:54 -05:00
Matt Keeler 510b1271bc
Update CHANGELOG.md 2019-08-02 16:23:00 -04:00
Matt Keeler 72b8149333
Add license management functions to API client (#6268)
* Add license management functions to API client

* Get rid of jsonapi struct tags
2019-08-02 16:20:38 -04:00
Alvin Huang f834854f8a
add generic master merge into release/* branches (#6249) 2019-08-02 16:11:41 -04:00
Todd Radel 295abd82c3
connect: generate intermediate at same time as root (#6272)
Generate intermediate at same time as root
Co-Authored-By: Freddy <freddygv@users.noreply.github.com>
2019-08-02 15:36:03 -04:00
Alvin Huang f5471c7f8d
Add arm builds (#6263)
* try arm builds

* Update .circleci/config.yml

Co-Authored-By: Matt Keeler <mkeeler@users.noreply.github.com>

* Update .circleci/config.yml

Co-Authored-By: Matt Keeler <mkeeler@users.noreply.github.com>

* Update .circleci/config.yml

Co-Authored-By: Matt Keeler <mkeeler@users.noreply.github.com>
2019-08-02 15:15:59 -04:00
R.B. Boyer 871ebb9dc2 update changelog 2019-08-02 09:19:37 -05:00
R.B. Boyer 4e2fb5730c
connect: detect and prevent circular discovery chain references (#6246) 2019-08-02 09:18:45 -05:00
John Cowen d3930d55aa
ui: Adds readonly meta data to the serviceInstance and node detail pages (#6196) 2019-08-02 13:53:52 +02:00
R.B. Boyer 3177392194 update changelog 2019-08-01 23:07:54 -05:00
R.B. Boyer 6c9edb17c2
server: if inserting bootstrap config entries fails don't silence the errors (#6256) 2019-08-01 23:07:11 -05:00
R.B. Boyer 95d08b8ea8 update changelog 2019-08-01 22:45:01 -05:00
R.B. Boyer 782c647bf4
connect: simplify the compiled discovery chain data structures (#6242)
This should make them better for sending over RPC or the API.

Instead of a chain implemented explicitly like a linked list (nodes
holding pointers to other nodes) instead switch to a flat map of named
nodes with nodes linking other other nodes by name. The shipped
structure is just a map and a string to indicate which key to start
from.

Other changes:

* inline the compiler option InferDefaults as true

* introduce compiled target config to avoid needing to send back
  additional maps of Resolvers; future target-specific compiled state
  can go here

* move compiled MeshGateway out of the Resolver and into the
  TargetConfig where it makes more sense.
2019-08-01 22:44:05 -05:00
R.B. Boyer 142a983c05 update changelog 2019-08-01 22:05:02 -05:00
R.B. Boyer 4666599e18
connect: reconcile how upstream configuration works with discovery chains (#6225)
* connect: reconcile how upstream configuration works with discovery chains

The following upstream config fields for connect sidecars sanely
integrate into discovery chain resolution:

- Destination Namespace/Datacenter: Compilation occurs locally but using
different default values for namespaces and datacenters. The xDS
clusters that are created are named as they normally would be.

- Mesh Gateway Mode (single upstream): If set this value overrides any
value computed for any resolver for the entire discovery chain. The xDS
clusters that are created may be named differently (see below).

- Mesh Gateway Mode (whole sidecar): If set this value overrides any
value computed for any resolver for the entire discovery chain. If this
is specifically overridden for a single upstream this value is ignored
in that case. The xDS clusters that are created may be named differently
(see below).

- Protocol (in opaque config): If set this value overrides the value
computed when evaluating the entire discovery chain. If the normal chain
would be TCP or if this override is set to TCP then the result is that
we explicitly disable L7 Routing and Splitting. The xDS clusters that
are created may be named differently (see below).

- Connect Timeout (in opaque config): If set this value overrides the
value for any resolver in the entire discovery chain. The xDS clusters
that are created may be named differently (see below).

If any of the above overrides affect the actual result of compiling the
discovery chain (i.e. "tcp" becomes "grpc" instead of being a no-op
override to "tcp") then the relevant parameters are hashed and provided
to the xDS layer as a prefix for use in naming the Clusters. This is to
ensure that if one Upstream discovery chain has no overrides and
tangentially needs a cluster named "api.default.XXX", and another
Upstream does have overrides for "api.default.XXX" that they won't
cross-pollinate against the operator's wishes.

Fixes #6159
2019-08-01 22:03:34 -05:00
R.B. Boyer cf793b3313 update changelog 2019-08-01 13:27:14 -05:00
R.B. Boyer 6bbbfde88b
connect: validate upstreams and prevent duplicates (#6224)
* connect: validate upstreams and prevent duplicates

* Actually run Upstream.Validate() instead of ignoring it as dead code.

* Prevent two upstreams from declaring the same bind address and port.
  It wouldn't work anyway.

* Prevent two upstreams from being declared that use the same
  type+name+namespace+datacenter. Due to how the Upstream.Identity()
  function worked this ended up mostly being enforced in xDS at use-time,
  but it should be enforced more clearly at register-time.
2019-08-01 13:26:02 -05:00
Omer Zach 1e80fc9c0f Fix typo in architecture.html.md (#6261) 2019-08-01 12:21:37 -06:00
Venkata Krishna Annam 5011f305e0 docs: Fix minor mistakes in index.html.md (#6239) 2019-08-01 12:57:26 -05:00
Sarah Adams df036f06a7
fix 'consul connect envoy' to try to use previously-configured grpc port (#6245)
fix 'consul connect envoy' to try to use previously-configured grpc port on running agent before defaulting to 8502

Fixes #5011
2019-08-01 09:53:34 -07:00
Paul Banks a5c70d79d0 Revert "connect: support AWS PCA as a CA provider" (#6251)
This reverts commit 3497b7c00d49c4acbbf951d84f2bba93f3da7510.
2019-07-31 09:08:10 -04:00
Todd Radel d3b7fd83fe
connect: support AWS PCA as a CA provider (#6189)
Port AWS PCA provider from consul-ent
2019-07-30 22:57:51 -04:00
Todd Radel 1b14d6595e
connect: Support RSA keys in addition to ECDSA (#6055)
Support RSA keys in addition to ECDSA
2019-07-30 17:47:39 -04:00
Freddy c691b75b75
Update CHANGELOG.md 2019-07-30 11:03:16 -06:00
freddygv 00157a2c1f Update default gossip encryption key size to 32 bytes 2019-07-30 09:45:41 -06:00
Matt Keeler 4407ec5faf
Update CHANGELOG.md 2019-07-30 09:58:38 -04:00
Matt Keeler 4bdd27ef31
Fix envoy canBind (#6238)
* Fix envoy cli canBind function

The string form of an Addr was including the CIDR causing the str equals to not match.

* Remove debug prints
2019-07-30 09:56:56 -04:00
hashicorp-ci 20f477cf13 Merge Consul OSS branch 'master' at commit a1725e6b5299c6ce12e8273205f90fba31403686 2019-07-30 02:00:29 +00:00
Matt Keeler e9f6805adc Fix flaky tests (#6229) 2019-07-29 15:07:25 -04:00
Matt Keeler e0aa9fccbb
Update CHANGELOG.md 2019-07-29 11:19:39 -04:00
Matt Keeler 5058faf653
Update CHANGELOG.md 2019-07-29 11:17:58 -04:00
Matt Keeler 7e69646a77
Fix prepared query upstream endpoint generation (#6236)
Use the correct SNI value for prepared query upstreams
2019-07-29 11:15:55 -04:00
hashicorp-ci da93c2ce79
Release v1.6.0-beta3 2019-07-26 23:15:20 +00:00
hashicorp-ci 252900d3e5
update bindata_assetfs.go 2019-07-26 23:15:20 +00:00
Alvin Huang 7972514b82 Merge remote-tracking branch 'origin/master' into release/1-6 2019-07-26 16:22:53 -04:00
Matt Keeler f3d0bdd8f3
Update CHANGELOG.md 2019-07-26 15:59:20 -04:00
Matt Keeler a7c4b7af7c
Fix CA Replication when ACLs are enabled (#6201)
Secondary CA initialization steps are:

• Wait until the primary will be capable of signing intermediate certs. We use serf metadata to check the versions of servers in the primary which avoids needing a token like the previous implementation that used RPCs. We require at least one alive server in the primary and the all alive servers meet the version requirement.
• Initialize the secondary CA by getting the primary to sign an intermediate

When a primary dc is configured, if no existing CA is initialized and for whatever reason we cannot initialize a secondary CA the secondary DC will remain without a CA. As soon as it can it will initialize the secondary CA by pulling the primaries roots and getting the primary to sign an intermediate.

This also fixes a segfault that can happen during leadership revocation. There was a spot in the secondaryCARootsWatch that was getting the CA Provider and executing methods on it without nil checking. Under normal circumstances it wont be nil but during leadership revocation it gets nil'ed out. Therefore there is a period of time between closing the stop chan and when the go routine is actually stopped where it could read a nil provider and cause a segfault.
2019-07-26 15:57:57 -04:00
Matt Keeler 9dd72121e1
Set --max-obj-name-len 256 when execing Envoy (#6202)
* Pass -max-obj-name-len 256 to envoy

* Update test expectations.

* Add a note about requireing the max-obj-name-len option to be set
2019-07-26 15:43:15 -04:00
Todd Radel c253a23630
Merge pull request #6210 from hashicorp/docs/fix-ambassador-link
Fix links to ambassador website
2019-07-26 14:29:03 -04:00
R.B. Boyer 200d470f7b
Merge pull request #6223 from hashicorp/master-merge-b3541c4f3
Master merge b3541c4f3
2019-07-26 11:44:01 -05:00
R.B. Boyer 1b95d2e5e3 Merge Consul OSS branch master at commit b3541c4f34d43ab92fe52256420759f17ea0ed73 2019-07-26 10:34:24 -05:00
Jack Pearkes ed9365cfd4 Putting source back into Dev Mode 2019-07-25 17:58:56 -07:00
hashicorp-ci 601703497f
Release v1.5.3 2019-07-25 23:41:17 +00:00
hashicorp-ci 86ff9e9dc9
update bindata_assetfs.go 2019-07-25 23:41:16 +00:00
Jack Pearkes 4670f16f85
Update CHANGELOG.md 2019-07-25 14:20:11 -07:00