233 Commits

Author SHA1 Message Date
R.B. Boyer 7f30950060
update github.com/hashicorp/{serf,memberlist,go-sockaddr} (#5189)
This activates large-cluster improvements in the gossip layer from
https://github.com/hashicorp/memberlist/pull/167
2019-01-07 15:00:47 -06:00
Jack Pearkes 5951f842d3 vendor: upgrade to latest version of gopsutil 2018-10-19 11:33:23 -07:00
Matt Keeler 99e0a124cb
New ACLs (#4791)
This PR is an almost complete rewrite of the ACL system within Consul. It brings the features more in line with other HashiCorp products. There is still quite a bit left to do here, but most of it is related to docs, testing, and finishing the last few commands in the CLI. I will update the PR description and check off the todos as I finish them over the next few days/week.
Description

At a high level this PR is mainly to split ACL tokens from Policies and to split the concepts of Authorization from Identities. A lot of this PR is mostly just to support CRUD operations on ACLTokens and ACLPolicies. These in and of themselves are not particularly interesting. The bigger conceptual changes are in how tokens get resolved, how backwards compatibility is handled and the separation of policy from identity which could lead the way to allowing for alternative identity providers.

On the surface, and with a new cluster, the ACL system will look very similar to Nomad's. Both have tokens and policies. Both have local tokens. The ACL management APIs for both are very similar. I even ripped off Nomad's ACL bootstrap resetting procedure. There are a few key differences though.

    Nomad requires token and policy replication, whereas Consul only requires policy replication, with token replication being opt-in. In Consul, however, local tokens only work when token replication is enabled.
    All policies in Nomad are globally applicable. In Consul all policies are stored and replicated globally but can be scoped to a subset of the datacenters. This allows for more granular access management.
    Unlike Nomad, Consul has legacy baggage in the form of the original ACL system. The ramifications of this are:
        A server running the new system must still support other clients using the legacy system.
        A client running the new system must be able to use the legacy RPCs when the servers in its datacenter are running the legacy system.
        The primary ACL DC's servers, while running in legacy mode, need to act as a gate that keeps everything else in the entire multi-DC cluster running in legacy mode.

So not only does this PR implement the new ACL system, it also has a legacy mode built in for when the cluster isn't ready for new ACLs. Detecting that new ACLs can be used is automatic and requires no configuration on the part of administrators. This process is detailed more in the "Transitioning from Legacy to New ACL Mode" section below.
2018-10-19 12:04:07 -04:00
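The split between tokens and policies described in the commit above can be illustrated with a minimal Go sketch. The `Token` and `Policy` types, their fields, and `effectivePolicies` are hypothetical names invented for this illustration; they are not Consul's actual structs or resolution logic. The sketch only shows the idea that a token links to policies by ID, and that a globally replicated policy may be scoped to a subset of datacenters.

```go
package main

import "fmt"

// Policy is a hypothetical stand-in for an ACL policy: a named set of rules
// that is stored globally but may be limited to some datacenters.
type Policy struct {
	ID          string
	Name        string
	Rules       string
	Datacenters []string // empty means the policy applies in every datacenter
}

// Token is a hypothetical stand-in for an ACL token: an identity that links
// to policies by ID instead of carrying rules itself.
type Token struct {
	AccessorID string
	Local      bool
	PolicyIDs  []string
}

// appliesIn reports whether the policy is in scope for datacenter dc.
func (p Policy) appliesIn(dc string) bool {
	if len(p.Datacenters) == 0 {
		return true
	}
	for _, d := range p.Datacenters {
		if d == dc {
			return true
		}
	}
	return false
}

// effectivePolicies resolves a token's policy links against a policy store
// and keeps only the policies in scope for dc. Unknown IDs are skipped.
func effectivePolicies(t Token, store map[string]Policy, dc string) []Policy {
	var out []Policy
	for _, id := range t.PolicyIDs {
		if p, ok := store[id]; ok && p.appliesIn(dc) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	store := map[string]Policy{
		"p1": {ID: "p1", Name: "global-read"},
		"p2": {ID: "p2", Name: "dc1-write", Datacenters: []string{"dc1"}},
	}
	tok := Token{AccessorID: "t1", PolicyIDs: []string{"p1", "p2"}}
	for _, dc := range []string{"dc1", "dc2"} {
		fmt.Println(dc, len(effectivePolicies(tok, store, dc)))
	}
}
```

Keeping rules out of the token is what would let a different identity source be attached to the same policies later, which is the direction the commit message hints at.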
Jack Pearkes 197d62c6ca New command: consul debug (#4754)
* agent/debug: add package for debugging, host info

* api: add v1/agent/host endpoint

* agent: add v1/agent/host endpoint

* command/debug: implementation of static capture

* command/debug: tests and only configured targets

* agent/debug: add basic test for host metrics

* command/debug: add methods for dynamic data capture

* api: add debug/pprof endpoints

* command/debug: add pprof

* command/debug: timing, wg, logs to disk

* vendor: add gopsutil/disk

* command/debug: add a usage section

* website: add docs for consul debug

* agent/host: require operator:read

* api/host: improve docs and no retry timing

* command/debug: fail on extra arguments

* command/debug: fixup file permissions to 0644

* command/debug: remove server flags

* command/debug: improve clarity of usage section

* api/debug: add Trace for profiling, fix profile

* command/debug: capture profile and trace at the same time

* command/debug: add index document

* command/debug: use "clusters" in place of members

* command/debug: remove address in output

* command/debug: improve comment on metrics sleep

* command/debug: clarify usage

* agent: always register pprof handlers and protect

This will allow us to avoid a restart of a target agent
for profiling by always registering the pprof handlers.

Given this is a potentially sensitive path, it is protected
with an operator:read ACL and enable_debug being
set to true on the target agent. enable_debug still requires
a restart.

If ACLs are disabled, enable_debug is sufficient.

* command/debug: use trace.out instead of .prof

More in line with golang docs.

* agent: fix comment wording

* agent: wrap table driven tests in t.Run()
2018-10-19 08:41:03 -07:00
Paul Banks 251da1077f xDS Server Implementation (#4731)
* Vendor updates for gRPC and xDS server

* xDS server implementation for serving Envoy as a Connect proxy

* Address initial review comments

* consistent envoy package aliases; typos fixed; override TLS and authz for custom listeners

* Moar Typos

* Moar typos
2018-10-10 16:55:34 +01:00
Mitchell Hashimoto 9846999505
vendor: update mapstructure to v1.1.0
We require this change to support struct to struct decoding.
2018-09-30 19:15:40 -07:00
Matt Keeler ba4f912b25
Update Raft Vendoring (#4539)
Pulls in a fix for a potential memory leak regarding consistent reads that invoke VerifyLeader.
2018-09-06 15:07:42 -04:00
Mitchell Hashimoto f7a95e1a28
vendor k8s client lib 2018-09-05 14:59:02 -07:00
Mitchell Hashimoto 144b7efa51
Update go-discover vendor 2018-09-05 13:31:10 -07:00
Shubheksha 1afcabb0a2 replace old fork of text package (#4501) 2018-08-14 12:23:18 -07:00
Paul Banks 3adfe86f03 Update Serf and memberlist (#4511)
This includes fixes that improve gossip scalability on very large (> 10k node) clusters.

The Serf changes:
 - take snapshot disk IO out of the critical path for handling messages hashicorp/serf#524
 - make snapshot compaction much less aggressive - the old fixed threshold caused snapshots to be constantly compacted (synchronously with request handling) on clusters larger than about 2000 nodes! hashicorp/serf#525

Memberlist changes:
 - prioritize handling alive messages over suspect/dead to improve stability, and handle the queue in LIFO order to avoid acting on info that's already stale in the queue by the time we handle it. hashicorp/memberlist#159
 - limit the number of concurrent pushPull requests being handled at once to 128. In one test scenario with tens of thousands of servers we saw channel and lock blocking cause over 3000 pushPulls at once, which ballooned the memory of the server: each pushPull contained a de-serialised list of all known 10k+ nodes and their tags, for a total of about 60 million objects and 7GB of memory stuck. While the rest of the fixes here should prevent the same root cause from blocking in the same way, this prevents any other bug or source of contention from allowing pushPull messages to stack up and eat resources. hashicorp/memberlist#158
2018-08-09 13:16:13 -04:00
Siva Prasad a5ebab63e7 Vendoring update for go-discover. (#4412)
* New Providers added and updated vendoring for go-discover

* Vendor.json formatted using make vendorfmt

* Docs/Agent/auto-join: Added documentation for the new providers introduced in this PR

* Updated the golang.org/x/sys/unix in the vendor directory

* Agent: TestGoDiscoverRegistration updated to reflect the addition of new providers

* Deleted terraform.tfstate from vendor.

* Deleted terraform.tfstate.backup

Deleted terraform state file artifacts from unknown runs.

* Updated x/sys/windows vendor for Windows binary compilation
2018-07-25 16:21:04 -07:00
mkeeler 1da3c42867 Merge remote-tracking branch 'connect/f-connect' 2018-06-25 19:42:51 +00:00
Matt Keeler bc7e9b6fd4 Remove build tags from vendored vault file to allow for this to merge properly into enterprise 2018-06-25 12:26:10 -07:00
Matt Keeler 2f90768662 Vendor the vault api 2018-06-25 12:26:10 -07:00
Paul Banks 86a55892fd Remove go-diff vendor as assert.JSONEq output is way better for our case 2018-06-25 12:25:39 -07:00
Leo Zhang b498816e80
Fix invalid vendor.json syntax for go-discover 2018-06-15 02:02:12 -07:00
Kyle Havlovitz 80b6d0a6cf
Add missing vendor dep github.com/stretchr/objx 2018-06-14 09:42:13 -07:00
Matt Keeler 1e485ed727 Update yamux vendoring
Pulls in logging fixes.
2018-06-04 16:02:50 -04:00
Jack Pearkes c4112f2b9a
Merge pull request #4013 from sethvargo/sethvargo/user_agent
Add a helper for generating Consul's user-agent string
2018-06-01 09:13:38 -07:00
Matt Keeler 1c577b2012
Merge pull request #4131 from pierresouchay/enable_full_dns_compression
Enable full dns compression
2018-06-01 10:42:03 -04:00
Seth Vargo 5911fd5344
Update vendor for go-discover 2018-05-25 15:52:05 -04:00
Wim e8d0474a8e Add github.com/coredns/coredns/plugin/pkg/dnsutil files 2018-05-21 22:25:16 +02:00
Pierre Souchay 61e7d06174 Bump DNS lib to 1.0.7 with 14bits Len() fix 2018-05-16 10:52:51 +02:00
Kyle Havlovitz 7cd7f4acd7
vendor: pull in latest version of go-discover 2018-05-10 15:40:16 -07:00
Preetha Appan 98a04a0af9
Update serf to pick up clean leave fix 2018-05-04 15:51:55 -05:00
Paul Banks 06e1a62653
Merge pull request #4016 from pierresouchay/support_for_prometheus
Support for prometheus for metrics endpoint
2018-04-24 16:14:43 +01:00
Mitchell Hashimoto 17af5e6a24
vendor: add hashstructure and mock 2018-04-19 08:10:05 -07:00
Pierre Souchay e1c64f70df Added dependency github.com/prometheus/client_golang/prometheus/promhttp 2018-04-06 08:54:37 +02:00
Pierre Souchay 09a7546b12 Bump github.com/armon/go-metrics to allow having prometheus support 2018-04-05 18:21:32 +02:00
Yoann 23a6ad9356 Add support for compression in http api
The need was identified in issue https://github.com/hashicorp/consul/issues/3687.
Using "NYTimes/gziphandler", the HTTP API responses can now be compressed if required.
The Go API client requests a compressed response when possible and handles decompressing it.
This changes only the HTTP API (not the UI, for instance).
2018-04-03 22:33:13 +02:00
Paul Banks e8372918a6
Actually add the `require` vendored files I intended to add in 0d5600ff60
Note that the vendor.json is already correct, but the actual files were never checked in, so they are reported as missing:

```
$ govendor list | grep testify
 v  github.com/stretchr/testify/assert
  m github.com/stretchr/testify/require
```
2018-03-29 17:05:11 +01:00
Pierre Souchay ab9dd18bec
Bump version of miekg/dns to 1.0.4
See https://github.com/hashicorp/consul/issues/3977

While trying to further improve #3948 (that pull request is still valid, since we are not using compression to compute the result anyway), I saw a strange behaviour in the DNS library.
Basically, msg.Len() and len(msg.Pack()) disagree on the message length.

Thus, the computed DNS response size is wrong, because Consul relies on msg.Len() instead of the result of Pack().

This is linked to miekg/dns#453 and a fix has been provided with miekg/dns#454

Would it be possible to upgrade miekg/dns to a more recent version?

Consul might for instance upgrade to a post 1.0 release such as https://github.com/miekg/dns/releases/tag/v1.0.4
2018-03-28 10:23:57 -05:00
Paul Banks eed1ef851b
Add vendored `testify/require` subpackage; upgrade `assert` to match. (#3986) 2018-03-27 15:19:15 +01:00
Preetha Appan 16962cccfa
Update yamux to pick up performance improvements 2018-03-26 08:56:40 -05:00
Mitchell Hashimoto fbac58280e
agent/consul/fsm: begin using testify/assert 2018-03-06 09:48:15 -08:00
Alvin Huang 6bc9f6844f remove old pkgs and put deps of missing packages in vendor.json 2018-02-23 17:08:24 -05:00
James Phillips 37cf6583db
Fixes a panic on TCP-based DNS lookups.
This came in via the monkey patch in #3861.

Fixes #3877
2018-02-08 17:57:41 -08:00
Preetha 1665908bed
Fix panic in azure go discover provider (#3876) 2018-02-08 16:46:33 -06:00
Preetha c460537b2a
Patch dns vendor code for picking up a TCP DOS attack bugfix (#3861) 2018-02-05 17:27:45 -06:00
James Phillips 0aa05cc5f0
Merge pull request #3855 from hashicorp/pr-3782-slackpad
Adds support for gRPC health checks.
2018-02-02 17:57:27 -08:00
James Phillips 2e660c7172
Updates hashicorp/go-discover to pull in support for Azure Virtual Machine Scale Sets. 2018-01-19 16:24:08 -08:00
James Phillips 503e01774b
Updates Serf to pick up fix for spammy zero RTT log messages.
Fixes #3789.
2018-01-19 14:47:12 -08:00
Dmytro Kostiuchenko a45f6ad740 Add gRPC health-check #3073 2018-01-04 16:42:30 -05:00
James Phillips ffb81aaba5
Updates go-discover to get monkey patch for golang.org/x/net/trace. 2018-01-03 13:22:42 -08:00
James Phillips 63274429ff
Updates hashicorp/go-cleanhttp to pick up new sanitizer. 2017-12-20 13:48:49 -08:00
James Phillips 7cc034dc13
Merge pull request #3646 from posener/update-complete
vendor: update complete version to v1.1
2017-12-14 17:51:33 -08:00
James Phillips 3ac47e3b1c
Updates Serf to pull in new queue depth controls. 2017-12-06 17:06:08 -08:00
James Phillips 68b698993f
Updated memberlist to fix negative RTT measurements.
Fixes #3704
2017-11-21 01:37:49 -08:00
James Phillips b166ce6969
Updates memberlist to pick up https://github.com/hashicorp/memberlist/pull/69.
Fixes #3671
2017-11-10 09:31:02 -08:00