Commit Graph

191 Commits

Author SHA1 Message Date
R.B. Boyer 809344a6f5
peering: initial sync (#12842)
- Add endpoints related to peering: read, list, generate token, initiate peering
- Update node/service/check table indexing to account for peers
- Foundational changes for pushing service updates to a peer
- Plumb peer name through Health.ServiceNodes path

see: ENT-1765, ENT-1280, ENT-1283, ENT-1283, ENT-1756, ENT-1739, ENT-1750, ENT-1679,
     ENT-1709, ENT-1704, ENT-1690, ENT-1689, ENT-1702, ENT-1701, ENT-1683, ENT-1663,
     ENT-1650, ENT-1678, ENT-1628, ENT-1658, ENT-1640, ENT-1637, ENT-1597, ENT-1634,
     ENT-1613, ENT-1616, ENT-1617, ENT-1591, ENT-1588, ENT-1596, ENT-1572, ENT-1555

Co-authored-by: R.B. Boyer <rb@hashicorp.com>
Co-authored-by: freddygv <freddy@hashicorp.com>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Co-authored-by: Evan Culver <eculver@hashicorp.com>
Co-authored-by: Nitya Dhanushkodi <nitya@hashicorp.com>
2022-04-21 17:34:40 -05:00
DanStough a050aa39b9 Update go version to 1.18.1 2022-04-18 11:41:10 -04:00
R.B. Boyer 005e5f12d8
deps: update to latest go-discover (#12739)
Fixes #11253

    $ go mod why -m github.com/dgrijalva/jwt-go
    # github.com/dgrijalva/jwt-go
    (main module does not need module github.com/dgrijalva/jwt-go)

    $ go mod why -m github.com/form3tech-oss/jwt-go
    # github.com/form3tech-oss/jwt-go
    github.com/hashicorp/consul/agent
    github.com/hashicorp/go-discover
    github.com/hashicorp/go-discover/provider/azure
    github.com/Azure/go-autorest/autorest
    github.com/Azure/go-autorest/autorest/adal
    github.com/form3tech-oss/jwt-go
2022-04-12 13:41:12 -05:00
Matt Keeler 3447880091
Enable running autopilot state updates on all servers (#12617)
* Fixes a lint warning about t.Errorf not supporting %w

* Enable running autopilot on all servers

On the non-leader servers all they do is update the state and do not attempt any modifications.

* Fix the RPC conn limiting tests

Technically they were relying on racey behavior before. Now they should be reliable.
2022-04-07 10:48:48 -04:00
Eric 91a493efe9 Bump go-control-plane
* `go get cloud.google.com/go@v0.59.0`
* `go get github.com/envoyproxy/go-control-plane@v0.9.9`
* `make envoy-library`
* Bumpprotoc to 3.15.8
2022-03-30 13:11:27 -04:00
Matt Keeler 8d51e22d26
Update raft-boltdb to pull in new writeCapacity metric (#12646) 2022-03-30 11:38:44 -04:00
Eric 98b733e41a remove gogo from pbservice 2022-03-23 12:18:01 -04:00
FFMMM 3c08843847
[sync oss] add net/rpc interceptor implementation (#12573)
* sync ent changes from 866dcb0667

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

* update oss go.mod

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2022-03-17 16:02:26 -07:00
R.B. Boyer 679cea7171
raft: upgrade to v1.3.6 (#12496)
Add additional protections on the Consul side to prevent NonVoters from bootstrapping raft.

This should un-flake TestServer_Expect_NonVoters
2022-03-02 17:00:02 -06:00
R.B. Boyer 3804677570
server: suppress spurious blocking query returns where multiple config entries are involved (#12362)
Starting from and extending the mechanism introduced in #12110 we can specially handle the 3 main special Consul RPC endpoints that react to many config entries in a single blocking query in Connect:

- `DiscoveryChain.Get`
- `ConfigEntry.ResolveServiceConfig`
- `Intentions.Match`

All of these will internally watch for many config entries, and at least one of those will likely be not found in any given query. Because these are blends of multiple reads the exact solution from #12110 isn't perfectly aligned, but we can tweak the approach slightly and regain the utility of that mechanism.

### No Config Entries Found

In this case, despite looking for many config entries none may be found at all. Unlike #12110 in this scenario we do not return an empty reply to the caller, but instead synthesize a struct from default values to return. This can be handled nearly identically to #12110 with the first 1-2 replies being non-empty payloads followed by the standard spurious wakeup suppression mechanism from #12110.

### No Change Since Last Wakeup

Once a blocking query loop on the server has completed and slept at least once, there is a further optimization we can make here to detect if any of the config entries that were present at specific versions for the prior execution of the loop are identical for the loop we just woke up for. In that scenario we can return a slightly different internal sentinel error and basically externally handle it similar to #12110.

This would mean that even if 20 discovery chain read RPC handling goroutines wakeup due to the creation of an unrelated config entry, the only ones that will terminate and reply with a blob of data are those that genuinely have new data to report.

### Extra Endpoints

Since this pattern is pretty reusable, other key config-entry-adjacent endpoints used by `agent/proxycfg` also were updated:

- `ConfigEntry.List`
- `Internal.IntentionUpstreams` (tproxy)
2022-02-25 15:46:34 -06:00
Dhia Ayachi 378f688a6a
file watcher to be used for configuration auto-reload feature (#12301)
* add config watcher to the config package

* add logging to watcher

* add test and refactor to add WatcherEvent.

* add all API calls and fix a bug with recreated files

* add tests for watcher

* remove the unnecessary use of context

* Add debug log and a test for file rename

* use inode to detect if the file is recreated/replaced and only listen to create events.

* tidy ups (#1535)

* tidy ups

* Add tests for inode reconcile

* fix linux vs windows syscall

* fix linux vs windows syscall

* fix windows compile error

* increase timeout

* use ctime ID

* remove remove/creation test as it's a use case that fail in linux

* fix linux/windows to use Ino/CreationTime

* fix the watcher to only overwrite current file id

* fix linter error

* fix remove/create test

* set reconcile loop to 200 Milliseconds

* fix watcher to not trigger event on remove, add more tests

* on a remove event try to add the file back to the watcher and trigger the handler if success

* fix race condition

* fix flaky test

* fix race conditions

* set level to info

* fix when file is removed and get an event for it after

* fix to trigger handler when we get a remove but re-add fail

* fix error message

* add tests for directory watch and fixes

* detect if a file is a symlink and return an error on Add

* rename Watcher to FileWatcher and remove symlink deref

* add fsnotify@v1.5.1

* fix go mod

* fix flaky test

* Apply suggestions from code review

Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com>

* fix a possible stack overflow

* do not reset timer on errors, rename OS specific files

* start the watcher when creating it

* fix data race in tests

* rename New func

* do not call handler when a remove event happen

* events trigger on write and rename

* fix watcher tests

* make handler async

* remove recursive call

* do not produce events for sub directories

* trim "/" at the end of a directory when adding

* add missing test

* fix logging

* add todo

* fix failing test

* fix flaking tests

* fix flaky test

* add logs

* fix log text

* increase timeout

* reconcile when remove

* check reconcile when removed

* fix reconcile move test

* fix logging

* delete invalid file

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* fix review comments

* fix is watched to properly catch a remove

* change test timeout

* fix test and rename id

* fix test to create files with different mod time.

* fix deadlock when stopping watcher

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* fix a deadlock when calling stop while emitting event is blocked

* make sure to close the event channel after the event loop is done

* add go doc

* back date file instead of sleeping

* Apply suggestions from code review

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* check error

Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com>
Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
2022-02-21 11:36:52 -05:00
R.B. Boyer ef8cc33949
raft: update to v1.3.5 (#12325)
This includes closing some leadership transfer gaps and adding snapshot
restore progress logging.
2022-02-14 13:48:52 -06:00
FFMMM 1f8fb17be7
Vendor in rpc mono repo for net/rpc fork, go-msgpack, msgpackrpc. (#12311)
This commit syncs ENT changes to the OSS repo.

Original commit details in ENT:

```
commit 569d25f7f4578981c3801e6e067295668210f748
Author: FFMMM <FFMMM@users.noreply.github.com>
Date:   Thu Feb 10 10:23:33 2022 -0800

    Vendor fork net rpc (#1538)

    * replace net/rpc w consul-net-rpc/net/rpc

    Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

    * replace msgpackrpc and go-msgpack with fork from mono repo

    Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

    * gofmt all files touched

    Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
```

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2022-02-14 09:45:45 -08:00
Dhia Ayachi 7e57aa8d36
update serf to v0.9.7 (#12057)
* update serf to v0.9.7

* add change log

* update changelog
2022-01-18 13:03:22 -05:00
dependabot[bot] 0d0b55bd1b
build(deps): bump github.com/ryanuber/columnize (#12062)
Bumps [github.com/ryanuber/columnize](https://github.com/ryanuber/columnize) from 2.1.0+incompatible to 2.1.2+incompatible.
- [Release notes](https://github.com/ryanuber/columnize/releases)
- [Commits](https://github.com/ryanuber/columnize/compare/v2.1.0...v2.1.2)

---
updated-dependencies:
- dependency-name: github.com/ryanuber/columnize
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-14 12:39:13 -05:00
dependabot[bot] 761c286d62
build(deps): bump github.com/kr/text from 0.1.0 to 0.2.0 (#12063)
Bumps [github.com/kr/text](https://github.com/kr/text) from 0.1.0 to 0.2.0.
- [Release notes](https://github.com/kr/text/releases)
- [Commits](https://github.com/kr/text/compare/v0.1.0...v0.2.0)

---
updated-dependencies:
- dependency-name: github.com/kr/text
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-14 12:35:00 -05:00
dependabot[bot] a6ae87f1a7
build(deps): bump github.com/mitchellh/pointerstructure (#12072)
Bumps [github.com/mitchellh/pointerstructure](https://github.com/mitchellh/pointerstructure) from 1.0.0 to 1.2.1.
- [Release notes](https://github.com/mitchellh/pointerstructure/releases)
- [Commits](https://github.com/mitchellh/pointerstructure/compare/v1.0.0...v1.2.1)

---
updated-dependencies:
- dependency-name: github.com/mitchellh/pointerstructure
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-14 11:56:26 -05:00
dependabot[bot] 6c7379d58e
Bump github.com/aws/aws-sdk-go from 1.25.41 to 1.42.34 (#12083)
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.25.41 to 1.42.34.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go/compare/v1.25.41...v1.42.34)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-14 11:54:41 -05:00
dependabot[bot] 5c1a491cd4
Bump github.com/hashicorp/go-memdb from 1.3.1 to 1.3.2 (#11066)
Bumps [github.com/hashicorp/go-memdb](https://github.com/hashicorp/go-memdb) from 1.3.1 to 1.3.2.
- [Release notes](https://github.com/hashicorp/go-memdb/releases)
- [Changelog](https://github.com/hashicorp/go-memdb/blob/master/changes.go)
- [Commits](https://github.com/hashicorp/go-memdb/compare/v1.3.1...v1.3.2)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/go-memdb
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-13 10:54:36 -08:00
dependabot[bot] 1ea48ec523
Bump github.com/hashicorp/go-raftchunking from 0.6.1 to 0.6.2 (#11065)
Bumps [github.com/hashicorp/go-raftchunking](https://github.com/hashicorp/go-raftchunking) from 0.6.1 to 0.6.2.
- [Release notes](https://github.com/hashicorp/go-raftchunking/releases)
- [Commits](https://github.com/hashicorp/go-raftchunking/compare/v0.6.1...v0.6.2)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/go-raftchunking
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-13 10:28:49 -08:00
dependabot[bot] 12a148f0d3
build(deps): bump github.com/hashicorp/go-multierror from 1.1.0 to 1.1.1
Bumps [github.com/hashicorp/go-multierror](https://github.com/hashicorp/go-multierror) from 1.1.0 to 1.1.1.
- [Release notes](https://github.com/hashicorp/go-multierror/releases)
- [Commits](https://github.com/hashicorp/go-multierror/compare/v1.1.0...v1.1.1)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/go-multierror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-13 17:41:15 +00:00
Chris S. Kim 5a0f3e994e
Update memberlist to 0.3.1 (#12042) 2022-01-12 12:00:18 -05:00
Dhia Ayachi 09688bdc38
upgrade raft to v1.3.3 (#11958)
* upgrade raft to v1.3.3

* add change log

* reword the changelog

Co-authored-by: FFMMM <FFMMM@users.noreply.github.com>

Co-authored-by: FFMMM <FFMMM@users.noreply.github.com>
2022-01-06 14:09:13 -05:00
kisunji 2927327529 Update golang.org/x/net to address CVE-2021-44716 2021-12-15 11:54:47 -05:00
Daniel Nephin 52c8b4994b Merge remote-tracking branch 'origin/main' into serve-panic-recovery 2021-12-07 16:30:41 -05:00
Matt Keeler 1f49738167 Use raft-boltdb/v2 2021-12-02 16:56:15 -05:00
Mike Morris b02253acbf
deps: update gopsutil to fix Windows ARM and macOS non-Apple LLVM builds (#11586)
Bumps transitive dep go-ole to v1.2.6 with fixes
2021-11-16 15:40:11 -05:00
R.B. Boyer 086ff42b56
partitions: various refactors to support partitioning the serf LAN pool (#11568) 2021-11-15 09:51:14 -06:00
Giulio Micheloni 10cdc0a5c8
Merge branch 'main' into serve-panic-recovery 2021-11-06 16:12:06 +01:00
FFMMM 67a624a49f
use *telemetry.MetricsPrefix as prometheus.PrometheusOpts.Name (#11290)
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
2021-10-21 13:33:01 -07:00
Dhia Ayachi 7ef431c823
update raft to v1.3.2 (#11375)
* update raft to v1.3.2

* add changelog

* fix changelog
2021-10-21 13:21:22 -04:00
Giulio Micheloni a5a4eb9cae Separete test file and no stack trace in ret error 2021-10-16 18:02:03 +01:00
Giulio Micheloni 10814d934e Merge branch 'main' of https://github.com/hashicorp/consul into hashicorp-main 2021-10-16 16:59:32 +01:00
Jeff Widman a8f396c55f
Bump `go-discover` to fix broken dep tree (#10898) 2021-09-16 15:31:22 -04:00
freddygv 8772e2fbce Update yamux 2021-08-25 19:46:12 -06:00
Giulio Micheloni c4a63d4665 Fix go.sum with go mod tidy 2021-08-22 19:50:10 +01:00
Giulio Micheloni 10b03c3f4e
Merge branch 'main' into serve-panic-recovery 2021-08-22 20:31:11 +02:00
Daniel Nephin d2f5b4d335 debug: improve a couple of the test cases
Use gotest.tools/v3/fs to make better assertions about the files

Remove the TestAgent from TestDebugCommand_Prepare_ValidateTiming, since we can test that validation
without making any API calls.
2021-08-18 12:29:34 -04:00
Roopak Venkatakrishnan d4dacd0e2e Update x/sys to support go 1.17 2021-08-18 03:00:22 +00:00
Mike Morris 86d76cb099
deps: upgrade gogo-protobuf to v1.3.2 (#10813)
* deps: upgrade gogo-protobuf to v1.3.2

* go mod tidy using go 1.16

* proto: regen protobufs after upgrading gogo/protobuf

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-08-12 14:05:46 -04:00
Giulio Micheloni 0bf124502e grpc Server: turn panic into error through middleware 2021-08-07 13:21:12 +01:00
Daniel Nephin 5da6c51ae4 Update armon/go-metrics
To pickup new InMemSink.Stream method
2021-07-26 15:58:17 -04:00
Daniel Nephin 291315e39f Update serf
To pick up data race fixes
2021-07-14 18:58:16 -04:00
Dhia Ayachi c3eacac764
upgrade golang crypto from 0.0.0-20200930160638-afb6bcd081ae => v0.0.0-20210513164829-c07d793c2f9a (#10390) 2021-06-14 12:38:42 -04:00
Dhia Ayachi e3dd0f9a44
generate a single debug file for a long duration capture (#10279)
* debug: remove the CLI check for debug_enabled

The API allows collecting profiles even debug_enabled=false as long as
ACLs are enabled. Remove this check from the CLI so that users do not
need to set debug_enabled=true for no reason.

Also:
- fix the API client to return errors on non-200 status codes for debug
  endpoints
- improve the failure messages when pprof data can not be collected

Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com>

* remove parallel test runs

parallel runs create a race condition that fail the debug tests

* snapshot the timestamp at the beginning of the capture

- timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures
- capture append to the file if it already exist

* Revert "snapshot the timestamp at the beginning of the capture"

This reverts commit c2d03346

* Refactor captureDynamic to extract capture logic for each item in a different func

* snapshot the timestamp at the beginning of the capture

- timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures
- capture append to the file if it already exist

* Revert "snapshot the timestamp at the beginning of the capture"

This reverts commit c2d03346

* Refactor captureDynamic to extract capture logic for each item in a different func

* extract wait group outside the go routine to avoid a race condition

* capture pprof in a separate go routine

* perform a single capture for pprof data for the whole duration

* add missing vendor dependency

* add a change log and fix documentation to reflect the change

* create function for timestamp dir creation and simplify error handling

* use error groups and ticker to simplify interval capture loop

* Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval

* refactor Logs capture routine and add log capture specific test

* improve error reporting when log test fail

* change test duration to 1s

* make time parsing in log line more robust

* refactor log time format in a const

* test on log line empty the earliest possible and return

Co-authored-by: Freddy <freddygv@users.noreply.github.com>

* rename function to captureShortLived

* more specific changelog

Co-authored-by: Paul Banks <banks@banksco.de>

* update documentation to reflect current implementation

* add test for behavior when invalid param is passed to the command

* fix argument line in test

* a more detailed description of the new behaviour

Co-authored-by: Paul Banks <banks@banksco.de>

* print success right after the capture is done

* remove an unnecessary error check

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
Co-authored-by: Paul Banks <banks@banksco.de>
2021-06-07 13:00:51 -04:00
Matt Keeler b45dd03b8f
Bump raft-autopilot version to the latest. (#10306) 2021-05-27 12:59:14 -04:00
Daniel Nephin 203c752ee8 Update a couple dependencies
To pickup bug fixes
2021-05-04 14:09:10 -04:00
Paul Banks d47eea3a3f
Make Raft trailing logs and snapshot timing reloadable (#10129)
* WIP reloadable raft config

* Pre-define new raft gauges

* Update go-metrics to change gauge reset behaviour

* Update raft to pull in new metric and reloadable config

* Add snapshot persistance timing and installSnapshot to our 'protected' list as they can be infrequent but are important

* Update telemetry docs

* Update config and telemetry docs

* Add note to oldestLogAge on when it is visible

* Add changelog entry

* Update website/content/docs/agent/options.mdx

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>

Co-authored-by: Matt Keeler <mkeeler@users.noreply.github.com>
2021-05-04 15:36:53 +01:00
Daniel Nephin bb67ab6257 vendor: commit changes from update-vendor
I guess a couple PRs crossed streams and somehow that resulted in this line not being
needed anymore in go.sum
2021-04-16 14:31:56 -04:00
R.B. Boyer 1ae772ff99
mod: bump to github.com/hashicorp/mdns v1.0.4 (#10018) 2021-04-14 14:17:52 -05:00
Daniel Nephin 78cb867c8e Update memberlist to v0.2.3
To pickup data race fixes
2021-03-24 18:20:19 -04:00
Daniel Nephin 66c3c76aa6 Update go-memdb
To use a version that will not panic when an iterator is used with modifications.
2021-01-28 17:19:55 -05:00
Daniel Nephin 2eea58bcc4
Merge pull request #9302 from hashicorp/dnephin/add-service-3
agent: remove ServiceManager.Start goroutine
2021-01-28 16:59:41 -05:00
Matt Keeler 1379b5f7d6
Upgrade raft-autopilot and wait for autopilot it to stop when revoking leadership (#9644)
Fixes: 9626
2021-01-27 11:14:52 -05:00
Daniel Nephin 3685f39970 lib/mutex: add mutex with TryLock and update vendor 2021-01-25 18:01:47 -05:00
Daniel Nephin 90bf8460a1 Update mapstructure 2021-01-12 12:24:56 -05:00
Pierre Souchay 4f8b0b307c [bugfix] Prometheus metrics without warnings
go-metrics is updated to 0.3.6 to properly handle help in prometheus metrics

This fixes https://github.com/hashicorp/consul/issues/9303 and
https://github.com/hashicorp/consul/issues/9471
2021-01-06 13:54:05 +01:00
Matt Keeler 755fb72994
Switch to using the external autopilot module 2020-11-09 09:22:11 -05:00
Mike Morris 9ccb340893
chore: upgrade to gopsutil/v3 (#9118)
* deps: update golang.org/x/sys

* deps: update imports to gopsutil/v3

* chore: make update-vendor
2020-11-06 20:48:38 -05:00
Kit Patella b668592326 rollback golang.org/x/sys version to fix distro-build 2020-11-05 12:09:07 -08:00
Kit Patella a0f64616f7 apply make update-vendor 2020-11-05 11:51:58 -08:00
Kit Patella fbe61ad16c upgrade go-metrics to latest 2020-11-04 14:02:13 -08:00
Kyle Havlovitz 95f7b354c2 vendor: Update github.com/hashicorp/yamux 2020-10-09 05:05:46 -07:00
Kyle Havlovitz 8e0ea86754 vendor: Update github.com/hashicorp/mdns 2020-10-09 04:43:27 -07:00
Kyle Havlovitz 3cd60e1d72 vendor: Update github.com/hashicorp/hil 2020-10-09 04:43:27 -07:00
Kyle Havlovitz 02e282a7ab vendor: Update github.com/hashicorp/go-version 2020-10-09 04:43:27 -07:00
Kyle Havlovitz bc6ffb59b8 vendor: Update github.com/hashicorp/go-memdb 2020-10-09 04:43:27 -07:00
Kyle Havlovitz b5bb29f938 vendor: Update github.com/hashicorp/go-checkpoint 2020-10-09 04:43:27 -07:00
Mike Morris 4ae98cde2b
chore: update raft to v1.2.0 (#8822) 2020-10-08 15:07:10 -04:00
Matt Keeler 141eb60f06
Add per-agent reconnect timeouts (#8781)
This allows for client agent to be run in a more stateless manner where they may be abruptly terminated and not expected to come back. If advertising a per-agent reconnect timeout using the advertise_reconnect_timeout configuration when that agent leaves, other agents will wait only that amount of time for the agent to come back before reaping it.

This has the advantageous side effect of causing servers to deregister the node/services/checks for that agent sooner than if the global reconnect_timeout was used.
2020-10-08 15:02:19 -04:00
Mike Morris 1d4f3166fb
chore(deps): update gopsutil to v2.20.9 (#8843)
* core(deps): bump golang.org/x/sys

To resolve /go/pkg/mod/github.com/shirou/gopsutil@v2.20.9+incompatible/host/host_bsd.go:20:13: undefined: unix.SysctlTimeval

* chore(deps): make update-vendor
2020-10-07 12:57:18 -04:00
Daniel Nephin b9bf0b527c Vendor gofuzz and google/go-cmp 2020-09-28 18:28:37 -04:00
Kyle Havlovitz c8fd61abc7 Merge branch 'master' into vault-ca-renew-token 2020-09-15 14:39:04 -07:00
Kyle Havlovitz 316600a685 Update vault CA for latest api client 2020-09-15 13:33:55 -07:00
Kyle Havlovitz c3bd917650 vendor: Update vault api package 2020-09-15 12:45:29 -07:00
Daniel Nephin beb125f053 Update go-metrics dependencies, to use metrics.Default() 2020-09-14 19:05:22 -04:00
Hans Hasselberg 02de4c8b76
add primary keys to list keyring (#8522)
During gossip encryption key rotation it would be nice to be able to see if all nodes are using the same key. This PR adds another field to the json response from `GET v1/operator/keyring` which lists the primary keys in use per dc. That way an operator can tell when a key was successfully setup as primary key.

Based on https://github.com/hashicorp/serf/pull/611 to add primary key to list keyring output:

```json
[
  {
    "WAN": true,
    "Datacenter": "dc2",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 6,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
    },
    "NumNodes": 6
  },
  {
    "WAN": false,
    "Datacenter": "dc2",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 8,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "NumNodes": 8
  },
  {
    "WAN": false,
    "Datacenter": "dc1",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 3,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "NumNodes": 8
  }
]
```

I intentionally did not change the CLI output because I didn't find a good way of displaying this information. There are a couple of options that we could implement later:
* add a flag to show the primary keys
* add a flag to show json output

Fixes #3393.
2020-08-18 09:50:24 +02:00
s-christoff efcda70b85
Update Go-Metrics 0.3.4 (#8478) 2020-08-11 11:17:43 -05:00
Kyle Havlovitz 22721d56c8 vendor: Update github.com/armon/go-metrics to v0.3.3 2020-07-23 11:37:33 -07:00
Matt Keeler 2f68d5972a
Update mapstructure to v1.3.3 (#8361)
This was done in preparation for another PR where I was running into https://github.com/mitchellh/mapstructure/issues/202 and implemented a fix for the library.
2020-07-22 15:13:21 -04:00
R.B. Boyer 33f3436e94
gossip: Avoid issue where two unique leave events for the same node could lead to infinite rebroadcast storms (#8343)
bump serf to v0.9.3 to include fix for https://github.com/hashicorp/serf/pull/606
2020-07-21 15:48:10 -05:00
Pierre Souchay f77182aa51
Upgrade go-connlimit to v0.3.0 / return http 429 on too many connections (#8221)
Fixes #7527

I want to highlight this and explain what I think the implications are and make sure we are aware:

* `HTTPConnStateFunc` closes the connection when it is beyond the limit. `Close` does not block.
* `HTTPConnStateFuncWithDefault429Handler(10 * time.Millisecond)` blocks until the following is done (worst case):
  1) `conn.SetDeadline(10*time.Millisecond)` so that
  2) `conn.Write(429error)` is guaranteed to timeout after 10ms, so that the http 429 can be written and 
  3) `conn.Close` can happen

The implication of this change is that accepting any new connection is worst case delayed by 10ms. But only after a client reached the limit already.
2020-07-03 09:25:07 +02:00
Hans Hasselberg 9a38e4f766
Update gopsutil (#8208)
https://github.com/shirou/gopsutil/pull/895 is merged and fixes our
problem. Time to update. Since there is no new version just yet,
updating to the sha.
2020-07-01 14:47:56 +02:00
Matt Keeler 2ab8af4093
Add a test for go routine leaks
This is in its own separate package so that it will be a separate test binary that runs thus isolating the go runtime from other tests and allowing accurate go routine leak checking.

This test would ideally use goleak.VerifyTestMain but that will fail 100% of the time due to some architectural things (blocking queries and net/rpc uncancellability).

This test is not comprehensive. We should enable/exercise more features and more cluster configurations. However its a start.
2020-06-24 17:09:50 -04:00
R.B. Boyer ba83b52b32
connect: upgrade github.com/envoyproxy/go-control-plane to v0.9.5 (#8165) 2020-06-23 15:19:56 -05:00
Paul Banks f9a6386c4a state: track changes so that they may be used to produce change events 2020-06-16 13:04:29 -04:00
Daniel Nephin f9a4bb5c2d Update google.golang.org/api and stretchr/testify
To match the versions used in enterprise, should slightly reduce the
chances of getting a merge conflict when using `go.mod`.
2020-06-09 16:03:05 -04:00
Daniel Nephin 221a49f430 Update protobuf and golang.org/x/... vendor
Partially extracted from #7547

Updates protobuf to the most recent in the 1.3.x series, and updates
golang.org/x/sys to a7d97aace0b0 because of https://github.com/shirou/gopsutil/issues/853
prevents updating to a more recent version.

This breaking change in x/sys also prevents us from getting a newer
version of x/net. In the future, if gopsutil is not patched,  we may want to run a fork version of
gopsutil so that we can update both x/net and x/sys.
2020-06-09 14:46:41 -04:00
R.B. Boyer 940e5ad160
acl: add auth method for JWTs (#7846) 2020-05-11 20:59:29 -05:00
Mike Morris f39bd26a7c
vendor: revert golang.org/x/sys bump to avoid FreeBSD regression (#7780) 2020-05-05 09:26:17 +02:00
Hans Hasselberg 1be90e0fa1
agent: don't let left nodes hold onto their node-id (#7747) 2020-05-04 18:39:08 +02:00
Matt Keeler 669d22933e
Merge pull request #7714 from hashicorp/oss-sync/msp-agent-token 2020-05-04 11:33:50 -04:00
Matt Keeler 31c5bcc53e
Update go-discover dependency (#7731) 2020-05-04 10:59:48 -04:00
Matt Keeler 49c90833ff
Update mapstructure to v1.2.3
This release contains a fix to prevent duplicate keys in the Metadata after decoding where the output value contains pointer fields.
2020-04-28 09:33:16 -04:00
R.B. Boyer f1d8ea7018
cli: ensure that 'snapshot save' is fsync safe and also only writes to the requested file on success (#7698) 2020-04-24 17:34:47 -05:00
R.B. Boyer 032e0ae901
cli: fix usage of gzip.Reader to better detect corrupt snapshots during save/restore (#7697) 2020-04-24 17:18:56 -05:00
Daniel Nephin 7237eb67ed Update github.com/joyent/triton-go to latest
There was an RSA private key used for testing included in the old
version. This commit updates it to a version that does not include the
key so that the key is not detected by tools which scan the Consul
binary for private keys.

Commands run:

go get github.com/joyent/triton-go@6801d15b779f042cfd821c8a41ef80fc33af9d47
make update-vendor
2020-04-16 12:34:29 -04:00
Daniel Nephin b7ff6424f5 cli: send requested help text to stdout
This behaviour matches the GNU CLI standard:
http://www.gnu.org/prep/standards/html_node/_002d_002dhelp.html
2020-03-26 15:27:34 -04:00
R.B. Boyer a7fb26f50f
wan federation via mesh gateways (#6884)
This is like a Möbius strip of code due to the fact that low-level components (serf/memberlist) are connected to high-level components (the catalog and mesh-gateways) in a twisty maze of references which make it hard to dive into. With that in mind here's a high level summary of what you'll find in the patch:

There are several distinct chunks of code that are affected:

* new flags and config options for the server

* retry join WAN is slightly different

* retry join code is shared to discover primary mesh gateways from secondary datacenters

* because retry join logic runs in the *agent* and the results of that
  operation for primary mesh gateways are needed in the *server* there are
  some methods like `RefreshPrimaryGatewayFallbackAddresses` that must occur
  at multiple layers of abstraction just to pass the data down to the right
  layer.

* new cache type `FederationStateListMeshGatewaysName` for use in `proxycfg/xds` layers

* the function signature for RPC dialing picked up a new required field (the
  node name of the destination)

* several new RPCs for manipulating a FederationState object:
  `FederationState:{Apply,Get,List,ListMeshGateways}`

* 3 read-only internal APIs for debugging use to invoke those RPCs from curl

* raft and fsm changes to persist these FederationStates

* replication for FederationStates as they are canonically stored in the
  Primary and replicated to the Secondaries.

* a special derivative of anti-entropy that runs in secondaries to snapshot
  their local mesh gateway `CheckServiceNodes` and sync them into their upstream
  FederationState in the primary (this works in conjunction with the
  replication to distribute addresses for all mesh gateways in all DCs to all
  other DCs)

* a "gateway locator" convenience object to make use of this data to choose
  the addresses of gateways to use for any given RPC or gossip operation to a
  remote DC. This gets data from the "retry join" logic in the agent and also
  directly calls into the FSM.

* RPC (`:8300`) on the server sniffs the first byte of a new connection to
  determine if it's actually doing native TLS. If so it checks the ALPN header
  for protocol determination (just like how the existing system uses the
  type-byte marker).

* 2 new kinds of protocols are exclusively decoded via this native TLS
  mechanism: one for ferrying "packet" operations (udp-like) from the gossip
  layer and one for "stream" operations (tcp-like). The packet operations
  re-use sockets (using length-prefixing) to cut down on TLS re-negotiation
  overhead.

* the server instances specially wrap the `memberlist.NetTransport` when running
  with gateway federation enabled (in a `wanfed.Transport`). The general gist is
  that if it tries to dial a node in the SAME datacenter (deduced by looking
  at the suffix of the node name) there is no change. If dialing a DIFFERENT
  datacenter it is wrapped up in a TLS+ALPN blob and sent through some mesh
  gateways to eventually end up in a server's :8300 port.

* a new flag when launching a mesh gateway via `consul connect envoy` to
  indicate that the servers are to be exposed. This sets a special service
  meta when registering the gateway into the catalog.

* `proxycfg/xds` notice this metadata blob to activate additional watches for
  the FederationState objects as well as the location of all of the consul
  servers in that datacenter.

* `xds:` if the extra metadata is in place additional clusters are defined in a
  DC to bulk sink all traffic to another DC's gateways. For the current
  datacenter we listen on a wildcard name (`server.<dc>.consul`) that load
  balances all servers as well as one mini-cluster per node
  (`<node>.server.<dc>.consul`)

* the `consul tls cert create` command got a new flag (`-node`) to help create
  an additional SAN in certs that can be used with this flavor of federation.
2020-03-09 15:59:02 -05:00
Matt Keeler 78a6c5674d
Run make update-vendor and fixup various go.sum files
go mod tidy removes these lines because we have a replace directive
2020-02-11 09:20:49 -05:00