Commit Graph

225 Commits

Author SHA1 Message Date
R.B. Boyer 086ff42b56
partitions: various refactors to support partitioning the serf LAN pool (#11568) 2021-11-15 09:51:14 -06:00
Giulio Micheloni 10cdc0a5c8
Merge branch 'main' into serve-panic-recovery 2021-11-06 16:12:06 +01:00
Connor b3af482e09
Support Vault Namespaces explicitly in CA config (#11477)
* Support Vault Namespaces explicitly in CA config

If there is a Namespace entry included in the Vault CA configuration,
set it as the Vault Namespace on the Vault client

Currently the only way to support Vault namespaces in the Consul CA
config is by doing one of the following:
1) Set the VAULT_NAMESPACE environment variable which will be picked up
by the Vault API client
2) Prefix all Vault paths with the namespace

Neither of these are super pleasant. The first requires direct access
and modification to the Consul runtime environment. It's possible and
expected, not super pleasant.

The second requires more indepth knowledge of Vault and how it uses
Namespaces and could be confusing for anyone without that context. It
also infers that it is not supported

* Add changelog

* Remove fmt.Fprint calls

* Make comment clearer

* Add next consul version to website docs

* Add new test for default configuration

* go mod tidy

* Add skip if vault not present

* Tweak changelog text
2021-11-05 11:42:28 -05:00
FFMMM 67a624a49f
use *telemetry.MetricsPrefix as prometheus.PrometheusOpts.Name (#11290)
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2021-10-21 13:33:01 -07:00
Dhia Ayachi 7ef431c823
update raft to v1.3.2 (#11375)
* update raft to v1.3.2

* add changelog

* fix changelog
2021-10-21 13:21:22 -04:00
Mike Morris 6aa6b66268
deps: update consul-api and consul-sdk (#11361)
* deps: update consul/sdk in api/go.mod

* deps: update consul/api to v1.11.0
2021-10-20 14:58:53 -04:00
Giulio Micheloni a5a4eb9cae Separete test file and no stack trace in ret error 2021-10-16 18:02:03 +01:00
Giulio Micheloni 10814d934e Merge branch 'main' of https://github.com/hashicorp/consul into hashicorp-main 2021-10-16 16:59:32 +01:00
Jeff Widman a8f396c55f
Bump `go-discover` to fix broken dep tree (#10898) 2021-09-16 15:31:22 -04:00
freddygv 8772e2fbce Update yamux 2021-08-25 19:46:12 -06:00
Giulio Micheloni 10b03c3f4e
Merge branch 'main' into serve-panic-recovery 2021-08-22 20:31:11 +02:00
Daniel Nephin d2f5b4d335 debug: improve a couple of the test cases
Use gotest.tools/v3/fs to make better assertions about the files

Remove the TestAgent from TestDebugCommand_Prepare_ValidateTiming, since we can test that validation
without making any API calls.
2021-08-18 12:29:34 -04:00
Roopak Venkatakrishnan d4dacd0e2e Update x/sys to support go 1.17 2021-08-18 03:00:22 +00:00
Mike Morris 86d76cb099
deps: upgrade gogo-protobuf to v1.3.2 (#10813)
* deps: upgrade gogo-protobuf to v1.3.2

* go mod tidy using go 1.16

* proto: regen protobufs after upgrading gogo/protobuf

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-08-12 14:05:46 -04:00
Giulio Micheloni 0bf124502e grpc Server: turn panic into error through middleware 2021-08-07 13:21:12 +01:00
Daniel Nephin 5da6c51ae4 Update armon/go-metrics
To pickup new InMemSink.Stream method
2021-07-26 15:58:17 -04:00
Daniel Nephin 291315e39f Update serf
To pick up data race fixes
2021-07-14 18:58:16 -04:00
Dhia Ayachi c3eacac764
upgrade golang crypto from 0.0.0-20200930160638-afb6bcd081ae => v0.0.0-20210513164829-c07d793c2f9a (#10390) 2021-06-14 12:38:42 -04:00
Dhia Ayachi e3dd0f9a44
generate a single debug file for a long duration capture (#10279)
* debug: remove the CLI check for debug_enabled

The API allows collecting profiles even debug_enabled=false as long as
ACLs are enabled. Remove this check from the CLI so that users do not
need to set debug_enabled=true for no reason.

Also:
- fix the API client to return errors on non-200 status codes for debug
  endpoints
- improve the failure messages when pprof data can not be collected

Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com>

* remove parallel test runs

parallel runs create a race condition that fail the debug tests

* snapshot the timestamp at the beginning of the capture

- timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures
- capture append to the file if it already exist

* Revert "snapshot the timestamp at the beginning of the capture"

This reverts commit c2d03346

* Refactor captureDynamic to extract capture logic for each item in a different func

* snapshot the timestamp at the beginning of the capture

- timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures
- capture append to the file if it already exist

* Revert "snapshot the timestamp at the beginning of the capture"

This reverts commit c2d03346

* Refactor captureDynamic to extract capture logic for each item in a different func

* extract wait group outside the go routine to avoid a race condition

* capture pprof in a separate go routine

* perform a single capture for pprof data for the whole duration

* add missing vendor dependency

* add a change log and fix documentation to reflect the change

* create function for timestamp dir creation and simplify error handling

* use error groups and ticker to simplify interval capture loop

* Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval

* refactor Logs capture routine and add log capture specific test

* improve error reporting when log test fail

* change test duration to 1s

* make time parsing in log line more robust

* refactor log time format in a const

* test on log line empty the earliest possible and return

Co-authored-by: Freddy <freddygv@users.noreply.github.com>

* rename function to captureShortLived

* more specific changelog

Co-authored-by: Paul Banks <banks@banksco.de>

* update documentation to reflect current implementation

* add test for behavior when invalid param is passed to the command

* fix argument line in test

* a more detailed description of the new behaviour

Co-authored-by: Paul Banks <banks@banksco.de>

* print success right after the capture is done

* remove an unnecessary error check

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
Co-authored-by: Paul Banks <banks@banksco.de>
2021-06-07 13:00:51 -04:00
Matt Keeler b45dd03b8f
Bump raft-autopilot version to the latest. (#10306) 2021-05-27 12:59:14 -04:00
Daniel Nephin 71d6a2bf4b Fix some test flakes
- return errors in TestAgent.Start so that the retry works correctly
- remove duplicate logging, the error is returned already
- add a missing t.Helper() to retry.Run
- properly set a.Agent to nil so that subsequent retry attempts will actually try to start
2021-05-10 13:20:45 -04:00
Daniel Nephin 203c752ee8 Update a couple dependencies
To pickup bug fixes
2021-05-04 14:09:10 -04:00
Paul Banks d47eea3a3f
Make Raft trailing logs and snapshot timing reloadable (#10129)
* WIP reloadable raft config

* Pre-define new raft gauges

* Update go-metrics to change gauge reset behaviour

* Update raft to pull in new metric and reloadable config

* Add snapshot persistance timing and installSnapshot to our 'protected' list as they can be infrequent but are important

* Update telemetry docs

* Update config and telemetry docs

* Add note to oldestLogAge on when it is visible

* Add changelog entry

* Update website/content/docs/agent/options.mdx

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
2021-05-04 15:36:53 +01:00
R.B. Boyer 91bee6246f
Support Incremental xDS mode (#9855)
This adds support for the Incremental xDS protocol when using xDS v3. This is best reviewed commit-by-commit and will not be squashed when merged.

Union of all commit messages follows to give an overarching summary:

xds: exclusively support incremental xDS when using xDS v3

Attempts to use SoTW via v3 will fail, much like attempts to use incremental via v2 will fail.
Work around a strange older envoy behavior involving empty CDS responses over incremental xDS.
xds: various cleanups and refactors that don't strictly concern the addition of incremental xDS support

Dissolve the connectionInfo struct in favor of per-connection ResourceGenerators instead.
Do a better job of ensuring the xds code uses a well configured logger that accurately describes the connected client.
xds: pull out checkStreamACLs method in advance of a later commit

xds: rewrite SoTW xDS protocol tests to use protobufs rather than hand-rolled json strings

In the test we very lightly reuse some of the more boring protobuf construction helper code that is also technically under test. The important thing of the protocol tests is testing the protocol. The actual inputs and outputs are largely already handled by the xds golden output tests now so these protocol tests don't have to do double-duty.

This also updates the SoTW protocol test to exclusively use xDS v2 which is the only variant of SoTW that will be supported in Consul 1.10.

xds: default xds.Server.AuthCheckFrequency at use-time instead of construction-time
2021-04-29 13:54:05 -05:00
R.B. Boyer 1ae772ff99
mod: bump to github.com/hashicorp/mdns v1.0.4 (#10018) 2021-04-14 14:17:52 -05:00
Daniel Nephin 78cb867c8e Update memberlist to v0.2.3
To pickup data race fixes
2021-03-24 18:20:19 -04:00
Daniel Nephin 66c3c76aa6 Update go-memdb
To use a version that will not panic when an iterator is used with modifications.
2021-01-28 17:19:55 -05:00
Daniel Nephin 2eea58bcc4
Merge pull request #9302 from hashicorp/dnephin/add-service-3
agent: remove ServiceManager.Start goroutine
2021-01-28 16:59:41 -05:00
Matt Keeler 1379b5f7d6
Upgrade raft-autopilot and wait for autopilot it to stop when revoking leadership (#9644)
Fixes: 9626
2021-01-27 11:14:52 -05:00
Daniel Nephin 3685f39970 lib/mutex: add mutex with TryLock and update vendor 2021-01-25 18:01:47 -05:00
Daniel Nephin 90bf8460a1 Update mapstructure 2021-01-12 12:24:56 -05:00
Pierre Souchay 4f8b0b307c [bugfix] Prometheus metrics without warnings
go-metrics is updated to 0.3.6 to properly handle help in prometheus metrics

This fixes https://github.com/hashicorp/consul/issues/9303 and
https://github.com/hashicorp/consul/issues/9471
2021-01-06 13:54:05 +01:00
Mike Morris 67a11e4d16 Merge pull request #9270 from hashicorp/release/1.9.0
merge: release/1.9.0 back into 1.9.x
2020-11-24 17:36:47 -05:00
Matt Keeler 755fb72994
Switch to using the external autopilot module 2020-11-09 09:22:11 -05:00
Mike Morris 9ccb340893
chore: upgrade to gopsutil/v3 (#9118)
* deps: update golang.org/x/sys

* deps: update imports to gopsutil/v3

* chore: make update-vendor
2020-11-06 20:48:38 -05:00
Kit Patella b668592326 rollback golang.org/x/sys version to fix distro-build 2020-11-05 12:09:07 -08:00
Kit Patella fbe61ad16c upgrade go-metrics to latest 2020-11-04 14:02:13 -08:00
Kyle Havlovitz 95f7b354c2 vendor: Update github.com/hashicorp/yamux 2020-10-09 05:05:46 -07:00
Kyle Havlovitz 8e0ea86754 vendor: Update github.com/hashicorp/mdns 2020-10-09 04:43:27 -07:00
Kyle Havlovitz 3cd60e1d72 vendor: Update github.com/hashicorp/hil 2020-10-09 04:43:27 -07:00
Kyle Havlovitz 02e282a7ab vendor: Update github.com/hashicorp/go-version 2020-10-09 04:43:27 -07:00
Kyle Havlovitz bc6ffb59b8 vendor: Update github.com/hashicorp/go-memdb 2020-10-09 04:43:27 -07:00
Kyle Havlovitz b5bb29f938 vendor: Update github.com/hashicorp/go-checkpoint 2020-10-09 04:43:27 -07:00
Mike Morris 4ae98cde2b
chore: update raft to v1.2.0 (#8822) 2020-10-08 15:07:10 -04:00
Matt Keeler 141eb60f06
Add per-agent reconnect timeouts (#8781)
This allows for client agent to be run in a more stateless manner where they may be abruptly terminated and not expected to come back. If advertising a per-agent reconnect timeout using the advertise_reconnect_timeout configuration when that agent leaves, other agents will wait only that amount of time for the agent to come back before reaping it.

This has the advantageous side effect of causing servers to deregister the node/services/checks for that agent sooner than if the global reconnect_timeout was used.
2020-10-08 15:02:19 -04:00
Mike Morris 1d4f3166fb
chore(deps): update gopsutil to v2.20.9 (#8843)
* core(deps): bump golang.org/x/sys

To resolve /go/pkg/mod/github.com/shirou/gopsutil@v2.20.9+incompatible/host/host_bsd.go:20:13: undefined: unix.SysctlTimeval

* chore(deps): make update-vendor
2020-10-07 12:57:18 -04:00
Daniel Nephin b9bf0b527c Vendor gofuzz and google/go-cmp 2020-09-28 18:28:37 -04:00
Kyle Havlovitz c8fd61abc7 Merge branch 'master' into vault-ca-renew-token 2020-09-15 14:39:04 -07:00
Kyle Havlovitz 316600a685 Update vault CA for latest api client 2020-09-15 13:33:55 -07:00
Kyle Havlovitz c3bd917650 vendor: Update vault api package 2020-09-15 12:45:29 -07:00
Daniel Nephin beb125f053 Update go-metrics dependencies, to use metrics.Default() 2020-09-14 19:05:22 -04:00
Mike Morris e08272ce8b vendor: bump consul/api to v1.7.0 2020-09-10 21:40:41 -04:00
R.B. Boyer f2b8bf109c
xds: use envoy's rbac filter to handle intentions entirely within envoy (#8569) 2020-08-27 12:20:58 -05:00
Hans Hasselberg 02de4c8b76
add primary keys to list keyring (#8522)
During gossip encryption key rotation it would be nice to be able to see if all nodes are using the same key. This PR adds another field to the json response from `GET v1/operator/keyring` which lists the primary keys in use per dc. That way an operator can tell when a key was successfully setup as primary key.

Based on https://github.com/hashicorp/serf/pull/611 to add primary key to list keyring output:

```json
[
  {
    "WAN": true,
    "Datacenter": "dc2",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 6,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
    },
    "NumNodes": 6
  },
  {
    "WAN": false,
    "Datacenter": "dc2",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 8,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "NumNodes": 8
  },
  {
    "WAN": false,
    "Datacenter": "dc1",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 3,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "NumNodes": 8
  }
]
```

I intentionally did not change the CLI output because I didn't find a good way of displaying this information. There are a couple of options that we could implement later:
* add a flag to show the primary keys
* add a flag to show json output

Fixes #3393.
2020-08-18 09:50:24 +02:00
s-christoff efcda70b85
Update Go-Metrics 0.3.4 (#8478) 2020-08-11 11:17:43 -05:00
Mike Morris 68389410d6
api: bump consul/api to v1.6.0 and consul/sdk to v0.6.0 (#8460)
* api: bump consul/sdk dependency to v0.6.0

* api: bump dependency to v1.6.0
2020-08-07 17:26:05 -04:00
Kyle Havlovitz 22721d56c8 vendor: Update github.com/armon/go-metrics to v0.3.3 2020-07-23 11:37:33 -07:00
Matt Keeler 2f68d5972a
Update mapstructure to v1.3.3 (#8361)
This was done in preparation for another PR where I was running into https://github.com/mitchellh/mapstructure/issues/202 and implemented a fix for the library.
2020-07-22 15:13:21 -04:00
R.B. Boyer 33f3436e94
gossip: Avoid issue where two unique leave events for the same node could lead to infinite rebroadcast storms (#8343)
bump serf to v0.9.3 to include fix for https://github.com/hashicorp/serf/pull/606
2020-07-21 15:48:10 -05:00
Pierre Souchay f77182aa51
Upgrade go-connlimit to v0.3.0 / return http 429 on too many connections (#8221)
Fixes #7527

I want to highlight this and explain what I think the implications are and make sure we are aware:

* `HTTPConnStateFunc` closes the connection when it is beyond the limit. `Close` does not block.
* `HTTPConnStateFuncWithDefault429Handler(10 * time.Millisecond)` blocks until the following is done (worst case):
  1) `conn.SetDeadline(10*time.Millisecond)` so that
  2) `conn.Write(429error)` is guaranteed to timeout after 10ms, so that the http 429 can be written and 
  3) `conn.Close` can happen

The implication of this change is that accepting any new connection is worst case delayed by 10ms. But only after a client reached the limit already.
2020-07-03 09:25:07 +02:00
Hans Hasselberg 9a38e4f766
Update gopsutil (#8208)
https://github.com/shirou/gopsutil/pull/895 is merged and fixes our
problem. Time to update. Since there is no new version just yet,
updating to the sha.
2020-07-01 14:47:56 +02:00
Matt Keeler 2ab8af4093
Add a test for go routine leaks
This is in its own separate package so that it will be a separate test binary that runs thus isolating the go runtime from other tests and allowing accurate go routine leak checking.

This test would ideally use goleak.VerifyTestMain but that will fail 100% of the time due to some architectural things (blocking queries and net/rpc uncancellability).

This test is not comprehensive. We should enable/exercise more features and more cluster configurations. However its a start.
2020-06-24 17:09:50 -04:00
R.B. Boyer ba83b52b32
connect: upgrade github.com/envoyproxy/go-control-plane to v0.9.5 (#8165) 2020-06-23 15:19:56 -05:00
Paul Banks f9a6386c4a state: track changes so that they may be used to produce change events 2020-06-16 13:04:29 -04:00
Daniel Nephin f9a4bb5c2d Update google.golang.org/api and stretchr/testify
To match the versions used in enterprise, should slightly reduce the
chances of getting a merge conflict when using `go.mod`.
2020-06-09 16:03:05 -04:00
Daniel Nephin 221a49f430 Update protobuf and golang.org/x/... vendor
Partially extracted from #7547

Updates protobuf to the most recent in the 1.3.x series, and updates
golang.org/x/sys to a7d97aace0b0 because of https://github.com/shirou/gopsutil/issues/853
prevents updating to a more recent version.

This breaking change in x/sys also prevents us from getting a newer
version of x/net. In the future, if gopsutil is not patched,  we may want to run a fork version of
gopsutil so that we can update both x/net and x/sys.
2020-06-09 14:46:41 -04:00
Daniel Nephin e8a883e829
Replace goe/verify.Values with testify/require.Equal (#7993)
* testing: replace most goe/verify.Values with require.Equal

One difference between these two comparisons is that go/verify considers
nil slices/maps to be equal to empty slices/maps, where as testify/require
does not, and does not appear to provide any way to enable that behaviour.

Because of this difference some expected values were changed from empty
slices to nil slices, and some calls to verify.Values were left.

* Remove github.com/pascaldekloe/goe/verify

Reduce the number of assertion packages we use from 2 to 1
2020-06-02 12:41:25 -04:00
R.B. Boyer 940e5ad160
acl: add auth method for JWTs (#7846) 2020-05-11 20:59:29 -05:00
Mike Morris f39bd26a7c
vendor: revert golang.org/x/sys bump to avoid FreeBSD regression (#7780) 2020-05-05 09:26:17 +02:00
Hans Hasselberg 1be90e0fa1
agent: don't let left nodes hold onto their node-id (#7747) 2020-05-04 18:39:08 +02:00
Matt Keeler 669d22933e
Merge pull request #7714 from hashicorp/oss-sync/msp-agent-token 2020-05-04 11:33:50 -04:00
Matt Keeler 31c5bcc53e
Update go-discover dependency (#7731) 2020-05-04 10:59:48 -04:00
Matt Keeler 49c90833ff
Update mapstructure to v1.2.3
This release contains a fix to prevent duplicate keys in the Metadata after decoding where the output value contains pointer fields.
2020-04-28 09:33:16 -04:00
R.B. Boyer f1d8ea7018
cli: ensure that 'snapshot save' is fsync safe and also only writes to the requested file on success (#7698) 2020-04-24 17:34:47 -05:00
R.B. Boyer 032e0ae901
cli: fix usage of gzip.Reader to better detect corrupt snapshots during save/restore (#7697) 2020-04-24 17:18:56 -05:00
Daniel Nephin 7237eb67ed Update github.com/joyent/triton-go to latest
There was an RSA private key used for testing included in the old
version. This commit updates it to a version that does not include the
key so that the key is not detected by tools which scan the Consul
binary for private keys.

Commands run:

go get github.com/joyent/triton-go@6801d15b779f042cfd821c8a41ef80fc33af9d47
make update-vendor
2020-04-16 12:34:29 -04:00
Daniel Nephin ab19e43b61
Merge pull request #7519 from hashicorp/dnephin/help-to-stdout
cli: send requested help text to stdout
2020-04-01 11:26:12 -04:00
Daniel Nephin 96c4a35de7 Remove t.Name() from TestAgent.Name
And re-add the name to the logger so that log messages from different agents
in a single can be identified.
2020-03-30 16:47:24 -04:00
Daniel Nephin b7ff6424f5 cli: send requested help text to stdout
This behaviour matches the GNU CLI standard:
http://www.gnu.org/prep/standards/html_node/_002d_002dhelp.html
2020-03-26 15:27:34 -04:00
R.B. Boyer bfd7424e0d
bump the expected go language version of the main module to 1.13 (#7429) 2020-03-10 14:46:09 -05:00
R.B. Boyer a7fb26f50f
wan federation via mesh gateways (#6884)
This is like a Möbius strip of code due to the fact that low-level components (serf/memberlist) are connected to high-level components (the catalog and mesh-gateways) in a twisty maze of references which make it hard to dive into. With that in mind here's a high level summary of what you'll find in the patch:

There are several distinct chunks of code that are affected:

* new flags and config options for the server

* retry join WAN is slightly different

* retry join code is shared to discover primary mesh gateways from secondary datacenters

* because retry join logic runs in the *agent* and the results of that
  operation for primary mesh gateways are needed in the *server* there are
  some methods like `RefreshPrimaryGatewayFallbackAddresses` that must occur
  at multiple layers of abstraction just to pass the data down to the right
  layer.

* new cache type `FederationStateListMeshGatewaysName` for use in `proxycfg/xds` layers

* the function signature for RPC dialing picked up a new required field (the
  node name of the destination)

* several new RPCs for manipulating a FederationState object:
  `FederationState:{Apply,Get,List,ListMeshGateways}`

* 3 read-only internal APIs for debugging use to invoke those RPCs from curl

* raft and fsm changes to persist these FederationStates

* replication for FederationStates as they are canonically stored in the
  Primary and replicated to the Secondaries.

* a special derivative of anti-entropy that runs in secondaries to snapshot
  their local mesh gateway `CheckServiceNodes` and sync them into their upstream
  FederationState in the primary (this works in conjunction with the
  replication to distribute addresses for all mesh gateways in all DCs to all
  other DCs)

* a "gateway locator" convenience object to make use of this data to choose
  the addresses of gateways to use for any given RPC or gossip operation to a
  remote DC. This gets data from the "retry join" logic in the agent and also
  directly calls into the FSM.

* RPC (`:8300`) on the server sniffs the first byte of a new connection to
  determine if it's actually doing native TLS. If so it checks the ALPN header
  for protocol determination (just like how the existing system uses the
  type-byte marker).

* 2 new kinds of protocols are exclusively decoded via this native TLS
  mechanism: one for ferrying "packet" operations (udp-like) from the gossip
  layer and one for "stream" operations (tcp-like). The packet operations
  re-use sockets (using length-prefixing) to cut down on TLS re-negotiation
  overhead.

* the server instances specially wrap the `memberlist.NetTransport` when running
  with gateway federation enabled (in a `wanfed.Transport`). The general gist is
  that if it tries to dial a node in the SAME datacenter (deduced by looking
  at the suffix of the node name) there is no change. If dialing a DIFFERENT
  datacenter it is wrapped up in a TLS+ALPN blob and sent through some mesh
  gateways to eventually end up in a server's :8300 port.

* a new flag when launching a mesh gateway via `consul connect envoy` to
  indicate that the servers are to be exposed. This sets a special service
  meta when registering the gateway into the catalog.

* `proxycfg/xds` notice this metadata blob to activate additional watches for
  the FederationState objects as well as the location of all of the consul
  servers in that datacenter.

* `xds:` if the extra metadata is in place additional clusters are defined in a
  DC to bulk sink all traffic to another DC's gateways. For the current
  datacenter we listen on a wildcard name (`server.<dc>.consul`) that load
  balances all servers as well as one mini-cluster per node
  (`<node>.server.<dc>.consul`)

* the `consul tls cert create` command got a new flag (`-node`) to help create
  an additional SAN in certs that can be used with this flavor of federation.
2020-03-09 15:59:02 -05:00
Matt Keeler 17591327e4
Bump `api` and `sdk` module versions
api -> v1.4.0
sdk -> v0.4.0
2020-02-10 20:08:47 -05:00
R.B. Boyer c37d00791c
make the TestRPC_RPCMaxConnsPerClient test less flaky (#7255) 2020-02-10 15:13:53 -06:00
Matt Keeler 005d6d4e24
Update to miekg/dns v1.1.26 (#7252) 2020-02-10 15:14:27 -05:00
Hans Hasselberg 8fd814d64c
memberlist: vendor v0.1.6 to pull in new state: stateLeft (#7184) 2020-02-03 11:02:13 +01:00
Matt Keeler 53a76519d5
Add replace directive to prevent contacting istio.io during the… (#7194)
They keep having TLS handshake timeouts. Its pointed at github instead.
2020-01-31 13:57:54 -05:00
Hans Hasselberg 50281032e0
Security fixes (#7182)
* Mitigate HTTP/RPC Services Allow Unbounded Resource Usage

Fixes #7159.

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
Co-authored-by: Paul Banks <banks@banksco.de>
2020-01-31 11:19:37 -05:00
Chris Piraino 3dd0b59793
Allow users to configure either unstructured or JSON logging (#7130)
* hclog Allow users to choose between unstructured and JSON logging
2020-01-28 17:50:41 -06:00
Hans Hasselberg 43392d5db3
raft: update raft to v1.1.2 (#7079)
* update raft
* use hclogger for raft.
2020-01-20 13:58:02 +01:00
Wim e3e56ff3c2 dns: fix memoryleak by upgrading outdated miekg/dns (#6748)
* Add updated github.com/miekg/dns to go modules
* Add updated github.com/miekg/dns to vendor
* Fix github.com/miekg/dns api breakage
* Decrease size when trimming UDP packets
Need more room for the header(?), if we don't decrease the size we get an
"overflow unpacking uint32" from the dns library
* Fix dns truncate tests with api changes
* Make windows build working again. Upgrade x/sys and x/crypto and vendor
This upgrade is needed because of API breakage in x/sys introduced
by the minimal x/sys dependency of miekg/dns

This API breakage has been fixed in commit
855e68c859
2019-12-16 22:31:27 +01:00
Worming ce62eacd3a accept go get command with go 1.13 (#6938) 2019-12-12 10:15:05 -06:00
Mike Morris 1fe6da2ad6
Bump go-discover to support EC2 Metadata Service v2 (#6865)
Refs https://github.com/hashicorp/go-discover/pull/128

* deps: add replace directive for gocheck

Transitive dep, source at https://launchpad.net/gocheck indicates
project moved. This also avoids a dependency on bzr when fetching
modules. Refs https://github.com/hashicorp/consul/pull/6818

* deps: make update-vendor

* test: update retry-join expected names from go-discover
2019-12-04 11:59:16 -05:00
Paul Banks a84b82b3df
connect: Add AWS PCA provider (#6795)
* Update AWS SDK to use PCA features.

* Add AWS PCA provider

* Add plumbing for config, config validation tests, add test for inheriting existing CA resources created by user

* Unparallel the tests so we don't exhaust PCA limits

* Merge updates

* More aggressive polling; rate limit pass through on sign; Timeout on Sign and CA create

* Add AWS PCA docs

* Fix Vault doc typo too

* Doc typo

* Apply suggestions from code review

Co-Authored-By: R.B. Boyer <rb@hashicorp.com>
Co-Authored-By: kaitlincarter-hc <43049322+kaitlincarter-hc@users.noreply.github.com>

* Doc fixes; tests for erroring if State is modified via API

* More review cleanup

* Uncomment tests!

* Minor suggested clean ups
2019-11-21 17:40:29 +00:00
Paul Banks 9e17aa3b41
Change CA Configure struct to pass Datacenter through (#6775)
* Change CA Configure struct to pass Datacenter through

* Remove connect/ca/plugin as we don't have immediate plans to use it.

We still intend to one day but there are likely to be several changes to the CA provider interface before we do so it's better to rebuild from history when we do that work properly.

* Rename PrimaryDC; fix endpoint in secondary DCs
2019-11-18 14:22:19 +00:00
Matt Keeler e8ee7c42a3
Dependency: Update github.com/hashicorp/go-memdb to v1.0.3 (#6626) 2019-10-16 12:10:12 -04:00
Sarah Christoff 9b93dd93c9
Prune Unhealthy Agents (#6571)
* Add -prune flag to ForceLeave
2019-10-04 16:10:02 -05:00
R.B. Boyer 1d54909333
connect: intermediate CA certs generated with the vault provider lack URI SANs (#6491)
This only affects vault versions >=1.1.1 because the prior code
accidentally relied upon a bug that was fixed in
https://github.com/hashicorp/vault/pull/6505

The existing tests should have caught this, but they were using a
vendored copy of vault version 0.10.3. This fixes the tests by running
an actual copy of vault instead of an in-process copy. This has the
added benefit of changing the dependency on vault to just vault/api.

Also update VaultProvider to use similar SetIntermediate validation code
as the ConsulProvider implementation.
2019-09-23 12:04:40 -05:00
Matt Keeler 01c3445a53
Update memberlist to v0.1.5 (#6479) 2019-09-12 10:28:57 -04:00
Aaron Bennett 9988cf1200 update dependencies for enterprise change (#6395) 2019-08-26 15:22:28 -04:00
R.B. Boyer e45484c62f mod: bump main module to api v1.2.0 and sdk v0.2.0 2019-08-23 16:45:30 -05:00
Matt Keeler 0cf3271a68
Update to google.golang.org/grpc v1.23.0 (#6320) 2019-08-14 10:41:27 -04:00
Paul Banks a5c70d79d0 Revert "connect: support AWS PCA as a CA provider" (#6251)
This reverts commit 3497b7c00d49c4acbbf951d84f2bba93f3da7510.
2019-07-31 09:08:10 -04:00
Todd Radel d3b7fd83fe
connect: support AWS PCA as a CA provider (#6189)
Port AWS PCA provider from consul-ent
2019-07-30 22:57:51 -04:00
R.B. Boyer 1b95d2e5e3 Merge Consul OSS branch master at commit b3541c4f34d43ab92fe52256420759f17ea0ed73 2019-07-26 10:34:24 -05:00
Jeff Mitchell e0068431f5 Chunking support (#6172)
* Initial chunk support

This uses the go-raft-middleware library to allow for chunked commits to the KV
2019-07-24 17:06:39 -04:00
Matt Keeler c51b7aa676
Update go-bexpr (#6190)
* Update go-bexpr to v0.1.1

This brings in:

• `in`/`not in` operators to do substring matching
• `matches` / `not matches` operators to perform regex string matching.

* Add the capability to auto-generate the filtering selector ops tables for our docs
2019-07-23 14:45:20 -04:00
Hans Hasselberg 0d8d7ae052
agent: transfer leadership when establishLeadership fails (#5247) 2019-06-19 14:50:48 +02:00
Paul Banks 737be347eb
Upgrade xDS (go-control-plane) API to support Envoy 1.10. (#5872)
* Upgrade xDS (go-control-plane) API to support Envoy 1.10.

This includes backwards compatibility shim to work around the ext_authz package rename in 1.10.

It also adds integration test support in CI for 1.10.0.

* Fix go vet complaints

* go mod vendor

* Update Envoy version info in docs

* Update website/source/docs/connect/proxies/envoy.md
2019-06-07 07:10:43 -05:00
Kyle Havlovitz 5fde284cbc
vendor: update memberlist 2019-05-15 11:10:40 -07:00
R.B. Boyer 5ef7b80b9d bump versions of api and sdk 2019-05-08 13:32:07 -05:00
Alvin Huang d1dd6716fd
bump hashicorp/yamux (#5776)
* bump hashicorp/yamux

* go mod tidy to remove older yamux
2019-05-02 15:11:39 -04:00
Matt Keeler 8c386277ef
Fix up the MapWalk function so that it properly handles nested map[interface{}]interface{} (#5774) 2019-05-02 14:43:54 -04:00
Paul Banks 078f4cf5bb Add integration test for central config; fix central config WIP (#5752)
* Add integration test for central config; fix central config WIP

* Add integration test for central config; fix central config WIP

* Set proxy protocol correctly and begin adding upstream support

* Add upstreams to service config cache key and start new notify watcher if they change.

This doesn't update the tests to pass though.

* Fix some merging logic get things working manually with a hack (TODO fix properly)

* Simplification to not allow enabling sidecars centrally - it makes no sense without upstreams anyway

* Test compile again and obvious ones pass. Lots of failures locally not debugged yet but may be flakes. Pushing up to see what CI does

* Fix up service manageer and API test failures

* Remove the enable command since it no longer makes much sense without being able to turn on sidecar proxies centrally

* Remove version.go hack - will make integration test fail until release

* Remove unused code from commands and upstream merge

* Re-bump version to 1.5.0
2019-05-01 16:39:31 -07:00
Paul Banks d6c0557e86
Connect: allow configuring Envoy for L7 Observability (#5558)
* Add support for HTTP proxy listeners

* Add customizable bootstrap configuration options

* Debug logging for xDS AuthZ

* Add Envoy Integration test suite with basic test coverage

* Add envoy command tests to cover new cases

* Add tracing integration test

* Add gRPC support WIP

* Merged changes from master Docker. get CI integration to work with same Dockerfile now

* Make docker build optional for integration

* Enable integration tests again!

* http2 and grpc integration tests and fixes

* Fix up command config tests

* Store all container logs as artifacts in circle on fail

* Add retries to outer part of stats measurements as we keep missing them in CI

* Only dump logs on failing cases

* Fix typos from code review

* Review tidying and make tests pass again

* Add debug logs to exec test.

* Fix legit test failure caused by upstream rename in envoy config

* Attempt to reduce cases of bad TLS handshake in CI integration tests

* bring up the right service

* Add prometheus integration test

* Add test for denied AuthZ both HTTP and TCP

* Try ANSI term for Circle
2019-04-29 17:27:57 +01:00
R.B. Boyer 5a505c5b3a acl: adding support for kubernetes auth provider login (#5600)
* auth providers
* binding rules
* auth provider for kubernetes
* login/logout
2019-04-26 14:49:25 -05:00
Matt Keeler 913c82ec9f
Update go-msgpack version (#5683)
Fixes #4673
Supercedes: #5677 

There was an error decoding `map[string]string` values due to Go strings being immutable. This was fixes in our go-msgpack fork.
2019-04-18 15:10:34 -04:00
Matt Keeler ac78c23021
Implement data filtering of some endpoints (#5579)
Fixes: #4222 

# Data Filtering

This PR will implement filtering for the following endpoints:

## Supported HTTP Endpoints

- `/agent/checks`
- `/agent/services`
- `/catalog/nodes`
- `/catalog/service/:service`
- `/catalog/connect/:service`
- `/catalog/node/:node`
- `/health/node/:node`
- `/health/checks/:service`
- `/health/service/:service`
- `/health/connect/:service`
- `/health/state/:state`
- `/internal/ui/nodes`
- `/internal/ui/services`

More can be added going forward and any endpoint which is used to list some data is a good candidate.

## Usage

When using the HTTP API a `filter` query parameter can be used to pass a filter expression to Consul. Filter Expressions take the general form of:

```
<selector> == <value>
<selector> != <value>
<value> in <selector>
<value> not in <selector>
<selector> contains <value>
<selector> not contains <value>
<selector> is empty
<selector> is not empty
not <other expression>
<expression 1> and <expression 2>
<expression 1> or <expression 2>
```

Normal boolean logic and precedence is supported. All of the actual filtering and evaluation logic is coming from the [go-bexpr](https://github.com/hashicorp/go-bexpr) library

## Other changes

Adding the `Internal.ServiceDump` RPC endpoint. This will allow the UI to filter services better.
2019-04-16 12:00:15 -04:00
Freddy 4fa4cffd41
Add additional raft metrics (#5628)
* Add documentation for new raft metrics
* Revendor raft from master
2019-04-09 16:09:22 -06:00
Paul Banks 869387323f
Pull go-discover to fix Sirupsen/logrus (#5598)
* Pull go-discover to fix Sirupsen/logrus

* Actually rename Sirupsen -> sirupsen in vendor (despite macOS)

* Actually _actually_ rename Sirupsen -> sirupsen in vendor (despite macOS)
2019-04-03 20:07:00 +01:00
Hans Hasselberg cf4eb2474a
fix remaining CI failures after Go 1.12.1 Upgrade (#5576) 2019-03-29 16:29:27 +01:00
Jeff Mitchell e8e8aa1fe5
Update api ver to 1.0.1 (#5573) 2019-03-27 09:01:06 -04:00
Jeff Mitchell 118f5c93c4
Update SDK version (#5572) 2019-03-27 08:58:50 -04:00
Jeff Mitchell d3c7d57209
Move internal/ to sdk/ (#5568)
* Move internal/ to sdk/

* Add a readme to the SDK folder
2019-03-27 08:54:56 -04:00
Jeff Mitchell bfb42c34a8
Update api/internal tags (#5565) 2019-03-26 17:40:53 -04:00
Jeff Mitchell a41c865059
Convert to Go Modules (#5517)
* First conversion

* Use serf 0.8.2 tag and associated updated deps

* * Move freeport and testutil into internal/

* Make internal/ its own module

* Update imports

* Add replace statements so API and normal Consul code are
self-referencing for ease of development

* Adapt to newer goe/values

* Bump to new cleanhttp

* Fix ban nonprintable chars test

* Update lock bad args test

The error message when the duration cannot be parsed changed in Go 1.12
(ae0c435877d3aacb9af5e706c40f9dddde5d3e67). This updates that test.

* Update another test as well

* Bump travis

* Bump circleci

* Bump go-discover and godo to get rid of launchpad dep

* Bump dockerfile go version

* fix tar command

* Bump go-cleanhttp
2019-03-26 17:04:58 -04:00