Commit graph

497 commits

Author SHA1 Message Date
R.B. Boyer 0675e0606e
connect: generate the full SNI names for discovery targets in the compiler rather than in the xds package (#6340) 2019-08-19 13:03:03 -05:00
R.B. Boyer d6456fddeb
connect: introduce ExternalSNI field on service-defaults (#6324)
Compiling this will set an optional SNI field on each DiscoveryTarget.
When set this value should be used for TLS connections to the instances
of the target. If not set the default should be used.

Setting ExternalSNI will disable mesh gateway use for that target. It also 
disables several service-resolver features that do not make sense for an 
external service.
2019-08-19 12:19:44 -05:00
hashicorp-ci 29767157ed Merge Consul OSS branch 'master' at commit 8f7586b339dbb518eff3a2eec27d7b8eae7a3fbb 2019-08-13 02:00:43 +00:00
Matt Keeler 59f2272517
Add missing LicenseReset API function (#6311) 2019-08-12 15:24:02 -04:00
Sarah Adams 2f7a90bc52
add flag to allow /operator/keyring requests to only hit local servers (#6279)
Add parameter local-only to operator keyring list requests to force queries to only hit local servers (no WAN traffic).

HTTP API: GET /operator/keyring?local-only=true
CLI: consul keyring -list --local-only

Sending the local-only flag with any non-GET/list request will result in an error.
2019-08-12 11:11:11 -07:00
Matt Keeler 0e1b7038cc
Update the v1/agent/service/:service endpoint to output tagged… (#6304) 2019-08-10 09:15:19 -04:00
Mike Morris 88df658243
connect: remove managed proxies (#6220)
* connect: remove managed proxies implementation and all supporting config options and structs

* connect: remove deprecated ProxyDestination

* command: remove CONNECT_PROXY_TOKEN env var

* agent: remove entire proxyprocess proxy manager

* test: remove all managed proxy tests

* test: remove irrelevant managed proxy note from TestService_ServerTLSConfig

* test: update ContentHash to reflect managed proxy removal

* test: remove deprecated ProxyDestination test

* telemetry: remove managed proxy note

* http: remove /v1/agent/connect/proxy endpoint

* ci: remove deprecated test exclusion

* website: update managed proxies deprecation page to note removal

* website: remove managed proxy configuration API docs

* website: remove managed proxy note from built-in proxy config

* website: add note on removing proxy subdirectory of data_dir
2019-08-09 15:19:30 -04:00
R.B. Boyer 48f23a6fb6
api: un-deprecate api.DecodeConfigEntry (#6278)
Add clarifying commentary about when it is not safe to use it. Also add
tests.
2019-08-05 17:15:22 -05:00
R.B. Boyer 0165e93517
connect: expose an API endpoint to compile the discovery chain (#6248)
In addition to exposing compilation over the API cleaned up the structures that would be exchanged to be cleaner and easier to support and understand.

Also removed ability to configure the envoy OverprovisioningFactor.
2019-08-02 15:34:54 -05:00
Matt Keeler 72b8149333
Add license management functions to API client (#6268)
* Add license management functions to API client

* Get rid of jsonapi struct tags
2019-08-02 16:20:38 -04:00
R.B. Boyer bd4a2d7be2
connect: allow L7 routers to match on http methods (#6164)
Fixes #6158
2019-07-23 20:56:39 -05:00
R.B. Boyer 67f3da61af
connect: change router syntax for matching query parameters to resemble the syntax for matching paths and headers for consistency. (#6163)
This is a breaking change, but only in the context of the beta series.
2019-07-23 20:55:26 -05:00
Alvin Huang 17654c6292 Merge branch 'master' into release/1-6 2019-07-17 15:43:30 -04:00
Freddy 476a4b95a5
More flaky test fixes (#6151)
* Add retry to TestAPI_ClientTxn

* Add retry to TestLeader_RegisterMember

* Account for empty watch result in ConnectRootsWatch
2019-07-17 09:33:38 -06:00
hashicorp-ci 022483aff0 Merge Consul OSS branch 'master' at commit 95dbb7f2f1b9fc3528a16335201e2324f1b388bd 2019-07-17 02:00:21 +00:00
Freddy 99601aa3a7
Update retries that weren't using retry.R (#6146) 2019-07-16 14:47:45 -06:00
Jack Pearkes fa15914813 Merge branch 'master' into release/1-6 2019-07-12 14:51:25 -07:00
Freddy 4381d42bcd
Fix some retries in api pkg (#6124) 2019-07-12 12:57:41 -06:00
Freddy a295d9e5db
Flaky test overhaul (#6100) 2019-07-12 09:52:26 -06:00
Freddy 74b7bcb612
Update TestServer creation in sdk/testutil (#6084)
* Retry the creation of the test server three times.
* Reduce the retry timeout for the API wait to 2 seconds, opting to fail faster and start over.
* Remove wait for leader from server creation. This wait can be added on a test by test basis now that the function is being exported.
* Remove wait for anti-entropy sync. This is built into the existing WaitForSerfCheck func, so that can be used if the anti-entropy wait is needed
2019-07-12 09:37:29 -06:00
Matt Keeler bcb3439c4c
Fix some tests that I broke when refactoring the ConfigSnapshot (#6051)
* Fix some tests that I broke when refactoring the ConfigSnapshot

* Make sure the MeshGateway config is added to all the right api structs

* Fix some more tests
2019-07-01 19:47:58 -04:00
Matt Keeler 39bb0e3e77 Implement Mesh Gateways
This includes both ingress and egress functionality.
2019-07-01 16:28:30 -04:00
R.B. Boyer 686e4606c6
do some initial config entry graph validation during writes (#6047) 2019-07-01 15:23:36 -05:00
R.B. Boyer 6ce008c022
Allow for both snake_case and CamelCase for config entries written with 'consul config write'. (#6044)
This also has the added benefit of fixing an issue with passing
time.Duration fields through config entries.
2019-06-28 11:35:35 -05:00
R.B. Boyer 8850656580
adding new config entries for L7 discovery chain (unused) (#5987) 2019-06-27 12:37:43 -05:00
hashicorp-ci d237e86d83 Merge Consul OSS branch 'master' at commit 88b15d84f9fdb58ceed3dc971eb0390be85e3c15
skip-checks: true
2019-06-25 02:00:26 +00:00
Matt Keeler f1a6919b2f
Update some tests to fix ContentHash broken by the tagged service addresses (#5996) 2019-06-20 11:50:18 -04:00
Aestek 24c29e195b kv: do not trigger watches when setting the same value (#5885)
If a KVSet is performed but does not update the entry, do not trigger
watches for this key.
This avoids releasing blocking queries for KV values that did not
actually changed.
2019-06-18 15:06:29 +02:00
Matt Keeler b6688a6b5b
Add tagged addresses for services (#5965)
This allows addresses to be tagged at the service level similar to what we allow for nodes already. The address translation that can be enabled with the `translate_wan_addrs` config was updated to take these new addresses into account as well.
2019-06-17 10:51:50 -04:00
Pavel Drankov 3ac3011e6b api: update link to agent caching in comments (#5935) 2019-06-13 23:56:19 +02:00
R.B. Boyer eb8ed27500 api: go mod tidy 2019-05-08 13:26:07 -05:00
R.B. Boyer 0fefec2b63 api: bump sdk version to v0.1.1 2019-05-08 13:18:12 -05:00
Paul Banks 078f4cf5bb Add integration test for central config; fix central config WIP (#5752)
* Add integration test for central config; fix central config WIP

* Add integration test for central config; fix central config WIP

* Set proxy protocol correctly and begin adding upstream support

* Add upstreams to service config cache key and start new notify watcher if they change.

This doesn't update the tests to pass though.

* Fix some merging logic get things working manually with a hack (TODO fix properly)

* Simplification to not allow enabling sidecars centrally - it makes no sense without upstreams anyway

* Test compile again and obvious ones pass. Lots of failures locally not debugged yet but may be flakes. Pushing up to see what CI does

* Fix up service manageer and API test failures

* Remove the enable command since it no longer makes much sense without being able to turn on sidecar proxies centrally

* Remove version.go hack - will make integration test fail until release

* Remove unused code from commands and upstream merge

* Re-bump version to 1.5.0
2019-05-01 16:39:31 -07:00
Matt Keeler ea6cbf01a5 Centralized Config CLI (#5731)
* Add HTTP endpoints for config entry management

* Finish implementing decoding in the HTTP Config entry apply endpoint

* Add CAS operation to the config entry apply endpoint

Also use this for the bootstrapping and move the config entry decoding function into the structs package.

* First pass at the API client for the config entries

* Fixup some of the ConfigEntry APIs

Return a singular response object instead of a list for the ConfigEntry.Get RPC. This gets plumbed through the HTTP API as well.

Dont return QueryMeta in the JSON response for the config entry listing HTTP API. Instead just return a list of config entries.

* Minor API client fixes

* Attempt at some ConfigEntry api client tests

These don’t currently work due to weak typing in JSON

* Get some of the api client tests passing

* Implement reflectwalk magic to correct JSON encoding a ProxyConfigEntry

Also added a test for the HTTP endpoint that exposes the problem. However, since the test doesn’t actually do the JSON encode/decode its still failing.

* Move MapWalk magic into a binary marshaller instead of JSON.

* Add a MapWalk test

* Get rid of unused func

* Get rid of unused imports

* Fixup some tests now that the decoding from msgpack coerces things into json compat types

* Stub out most of the central config cli

Fully implement the config read command.

* Basic config delete command implementation

* Implement config write command

* Implement config list subcommand

Not entirely sure about the output here. Its basically the read output indented with a line specifying the kind/name of each type which is also duplicated in the indented output.

* Update command usage

* Update some help usage formatting

* Add the connect enable helper cli command

* Update list command output

* Rename the config entry API client methods.

* Use renamed apis

* Implement config write tests

Stub the others with the noTabs tests.

* Change list output format

Now just simply output 1 line per named config

* Add config read tests

* Add invalid args write test.

* Add config delete tests

* Add config list tests

* Add connect enable tests

* Update some CLI commands to use CAS ops

This also modifies the HTTP API for a write op to return a boolean indicating whether the value was written or not.

* Fix up the HTTP API CAS tests as I realized they weren’t testing what they should.

* Update config entry rpc tests to properly test CAS

* Fix up a few more tests

* Fix some tests that using ConfigEntries.Apply

* Update config_write_test.go

* Get rid of unused import
2019-04-30 16:27:16 -07:00
Matt Keeler 8beb5c6082
ACL Token ID Initialization (#5307) 2019-04-30 11:45:36 -04:00
Kyle Havlovitz 64174f13d6 Add HTTP endpoints for config entry management (#5718) 2019-04-29 18:08:09 -04:00
Matt Keeler 32e821eda2
Enabling "service" watch handler to accept a slice of tags
Originally from PR #5347
2019-04-29 15:28:01 -04:00
R.B. Boyer 5a505c5b3a acl: adding support for kubernetes auth provider login (#5600)
* auth providers
* binding rules
* auth provider for kubernetes
* login/logout
2019-04-26 14:49:25 -05:00
R.B. Boyer 9542fdc9bc acl: adding Roles to Tokens (#5514)
Roles are named and can express the same bundle of permissions that can
currently be assigned to a Token (lists of Policies and Service
Identities). The difference with a Role is that it not itself a bearer
token, but just another entity that can be tied to a Token.

This lets an operator potentially curate a set of smaller reusable
Policies and compose them together into reusable Roles, rather than
always exploding that same list of Policies on any Token that needs
similar permissions.

This also refactors the acl replication code to be semi-generic to avoid
3x copypasta.
2019-04-26 14:49:12 -05:00
R.B. Boyer f43bc981e9 making ACLToken.ExpirationTime a *time.Time value instead of time.Time (#5663)
This is mainly to avoid having the API return "0001-01-01T00:00:00Z" as
a value for the ExpirationTime field when it is not set. Unfortunately
time.Time doesn't respect the json marshalling "omitempty" directive.
2019-04-26 14:48:16 -05:00
R.B. Boyer b3956e511c acl: ACL Tokens can now be assigned an optional set of service identities (#5390)
These act like a special cased version of a Policy Template for granting
a token the privileges necessary to register a service and its connect
proxy, and read upstreams from the catalog.
2019-04-26 14:48:04 -05:00
R.B. Boyer 76321aa952 acl: tokens can be created with an optional expiration time (#5353) 2019-04-26 14:47:51 -05:00
Matt Keeler 2831c8993d
Move the watch package into the api module (#5664)
* Move the watch package into the api module

It was already just a thin wrapper around the API anyways. The biggest change was to the testing. Instead of using a test agent directly from the agent package it now uses the binary on the PATH just like the other API tests.

The other big changes were to fix up the connect based watch tests so that we didn’t need to pull in the connect package (and therefore all of Consul)
2019-04-26 12:33:01 -04:00
Matt Keeler ac78c23021
Implement data filtering of some endpoints (#5579)
Fixes: #4222 

# Data Filtering

This PR will implement filtering for the following endpoints:

## Supported HTTP Endpoints

- `/agent/checks`
- `/agent/services`
- `/catalog/nodes`
- `/catalog/service/:service`
- `/catalog/connect/:service`
- `/catalog/node/:node`
- `/health/node/:node`
- `/health/checks/:service`
- `/health/service/:service`
- `/health/connect/:service`
- `/health/state/:state`
- `/internal/ui/nodes`
- `/internal/ui/services`

More can be added going forward and any endpoint which is used to list some data is a good candidate.

## Usage

When using the HTTP API a `filter` query parameter can be used to pass a filter expression to Consul. Filter Expressions take the general form of:

```
<selector> == <value>
<selector> != <value>
<value> in <selector>
<value> not in <selector>
<selector> contains <value>
<selector> not contains <value>
<selector> is empty
<selector> is not empty
not <other expression>
<expression 1> and <expression 2>
<expression 1> or <expression 2>
```

Normal boolean logic and precedence is supported. All of the actual filtering and evaluation logic is coming from the [go-bexpr](https://github.com/hashicorp/go-bexpr) library

## Other changes

Adding the `Internal.ServiceDump` RPC endpoint. This will allow the UI to filter services better.
2019-04-16 12:00:15 -04:00
Alvin Huang aacb81a566
Merge pull request #5376 from hashicorp/fix-tests
Fix tests in prep for CircleCI Migration
2019-04-04 17:09:32 -04:00
Hans Hasselberg cf4eb2474a
fix remaining CI failures after Go 1.12.1 Upgrade (#5576) 2019-03-29 16:29:27 +01:00
Jeff Mitchell 118f5c93c4
Update SDK version (#5572) 2019-03-27 08:58:50 -04:00
Jeff Mitchell d3c7d57209
Move internal/ to sdk/ (#5568)
* Move internal/ to sdk/

* Add a readme to the SDK folder
2019-03-27 08:54:56 -04:00
Jeff Mitchell 1d8ad9aa92
Update internal dep (#5564) 2019-03-26 17:30:23 -04:00
Jeff Mitchell a41c865059
Convert to Go Modules (#5517)
* First conversion

* Use serf 0.8.2 tag and associated updated deps

* * Move freeport and testutil into internal/

* Make internal/ its own module

* Update imports

* Add replace statements so API and normal Consul code are
self-referencing for ease of development

* Adapt to newer goe/values

* Bump to new cleanhttp

* Fix ban nonprintable chars test

* Update lock bad args test

The error message when the duration cannot be parsed changed in Go 1.12
(ae0c435877d3aacb9af5e706c40f9dddde5d3e67). This updates that test.

* Update another test as well

* Bump travis

* Bump circleci

* Bump go-discover and godo to get rid of launchpad dep

* Bump dockerfile go version

* fix tar command

* Bump go-cleanhttp
2019-03-26 17:04:58 -04:00
R.B. Boyer caadf208bd
api: fix panic in 'consul acl set-agent-token' (#5533)
api: fix panic in 'consul acl set-agent-token'

Fixes #5531
2019-03-21 15:04:40 -05:00
R.B. Boyer 91e78e00c7
fix typos reported by golangci-lint:misspell (#5434) 2019-03-06 11:13:28 -06:00
Matt Keeler 0c76a4389f
ACL Token Persistence and Reloading (#5328)
This PR adds two features which will be useful for operators when ACLs are in use.

1. Tokens set in configuration files are now reloadable.
2. If `acl.enable_token_persistence` is set to `true` in the configuration, tokens set via the `v1/agent/token` endpoint are now persisted to disk and loaded when the agent starts (or during configuration reload)

Note that token persistence is opt-in so our users who do not want tokens on the local disk will see no change.

Some other secondary changes:

* Refactored a bunch of places where the replication token is retrieved from the token store. This token isn't just for replicating ACLs and now it is named accordingly.
* Allowed better paths in the `v1/agent/token/` API. Instead of paths like: `v1/agent/token/acl_replication_token` the path can now be just `v1/agent/token/replication`. The old paths remain to be valid. 
* Added a couple new API functions to set tokens via the new paths. Deprecated the old ones and pointed to the new names. The names are also generally better and don't imply that what you are setting is for ACLs but rather are setting ACL tokens. There is a minor semantic difference there especially for the replication token as again, its no longer used only for ACL token/policy replication. The new functions will detect 404s and fallback to using the older token paths when talking to pre-1.4.3 agents.
* Docs updated to reflect the API additions and to show using the new endpoints.
* Updated the ACL CLI set-agent-tokens command to use the non-deprecated APIs.
2019-02-27 14:28:31 -05:00
Alvin Huang e9919a5d00 add serf check to TestAPI_ACLToken_List 2019-02-22 17:34:45 -05:00
Alvin Huang 23417780dc add serf check to various api/agent tests 2019-02-22 17:34:45 -05:00
Alvin Huang 897c5ff746 add serf check to TestAPI_CatalogNodes 2019-02-22 17:34:45 -05:00
Alvin Huang 9953306901 add serf check to TestAPI_ConnectCARoots_empty, TestAPI_ConnectCAConfig_get_set 2019-02-22 17:34:45 -05:00
Alvin Huang c45a193e3d add serf check to TestAPI_CoordinateUpdate 2019-02-22 17:34:45 -05:00
Alvin Huang 6f27454ea0 add serf check to TestAPI_HealthChecks_NodeMetaFilter, TestAPI_HealthConnect 2019-02-22 17:34:45 -05:00
Alvin Huang 313408a35f add serf check to TestAPI_ClientPutGetDelete, TestAPI_ClientWatchGet 2019-02-22 17:34:45 -05:00
Alvin Huang 6617785b33 add serf check to TestAPI_SemaphoreConflict 2019-02-22 17:34:45 -05:00
Alvin Huang 9ccaa7f0ca add serf check to TestAPI_Snapshot 2019-02-22 17:34:45 -05:00
Alvin Huang 0bff689281 add serf check to TestAPI_AgentServices_ManagedConnectProxyDeprecatedUpstreams 2019-02-22 17:34:45 -05:00
Alvin Huang d3b56af72f remove serf check 2019-02-22 17:34:45 -05:00
Alvin Huang 8ab8bdda53 add serf check to TestAPI_SemaphoreConflict 2019-02-22 17:34:45 -05:00
Matt Keeler 210c3a56b0
Improve Connect with Prepared Queries (#5291)
Given a query like:

```
{
   "Name": "tagged-connect-query",
   "Service": {
      "Service": "foo",
      "Tags": ["tag"],
      "Connect": true
   }
}
```

And a Consul configuration like:

```
{
   "services": [
      "name": "foo",
      "port": 8080,
      "connect": { "sidecar_service": {} },
      "tags": ["tag"]
   ]
}
```

If you executed the query it would always turn up with 0 results. This was because the sidecar service was being created without any tags. You could instead make your config look like:

```
{
   "services": [
      "name": "foo",
      "port": 8080,
      "connect": { "sidecar_service": {
         "tags": ["tag"]
      } },
      "tags": ["tag"]
   ]
}
```

However that is a bit redundant for most cases. This PR ensures that the tags and service meta of the parent service get copied to the sidecar service. If there are any tags or service meta set in the sidecar service definition then this copying does not take place. After the changes, the query will now return the expected results.

A second change was made to prepared queries in this PR which is to allow filtering on ServiceMeta just like we allow for filtering on NodeMeta.
2019-02-04 09:36:51 -05:00
Matt Keeler 470f6969de
Fall back to using old fields when marshaling
Co-Authored-By: kyhavlov <kylehav@gmail.com>
2019-01-28 08:47:26 -08:00
Kyle Havlovitz 8544ee9d71
Re-add ReadableDuration types to health check definition
This is to fix the backwards-incompatible change made in 1.4.1 by
changing these fields to time.Duration.
2019-01-25 14:47:35 -08:00
Hans Hasselberg 8356f6246f
agent: display messages from serf in cli (#5236)
* display messages from serf in cli
2019-01-22 21:08:50 +01:00
Kyle Havlovitz b0f07d9b5e
Merge pull request #4869 from hashicorp/txn-checks
Add node/service/check operations to transaction api
2019-01-22 11:16:09 -08:00
Paul Banks 1c4dfbcd2e
connect: tame thundering herd of CSRs on CA rotation (#5228)
* Support rate limiting and concurrency limiting CSR requests on servers; handle CA rotations gracefully with jitter and backoff-on-rate-limit in client

* Add CSR rate limiting docs

* Fix config naming and add tests for new CA configs
2019-01-22 17:19:36 +00:00
Kyle Havlovitz 70a6f5b2c0 txn: update existing txn api docs with new operations 2019-01-15 16:54:07 -08:00
Paul Banks fcdb5b3494
agent: add default weights to service in local state to prevent AE churn (#5126)
* Add default weights when adding a service with no weights to local state to prevent constant AE re-sync.

This fix was contributed by @42wim in https://github.com/hashicorp/consul/pull/5096 but was merged against the wrong base. This adds it to master and adds a test to cover the behaviour.

* Fix tests that broke due to comparing internal state which now has default weights
2019-01-08 10:13:49 +00:00
Grégoire Seux 6a57c7fec5 Implement /v1/agent/health/service/<service name> endpoint (#3551)
This endpoint aggregates all checks related to <service id> on the agent
and return an appropriate http code + the string describing the worst
check.

This allows to cleanly expose service status to other component, hiding
complexity of multiple checks.
This is especially useful to use consul to feed a load balancer which
would delegate health checking to consul agent.

Exposing this endpoint on the agent is necessary to avoid a hit on
consul servers and avoid decreasing resiliency (this endpoint will work
even if there is no consul leader in the cluster).
2019-01-07 09:39:23 -05:00
Pierre Souchay 61870be137 [Travis][UnstableTests] Fixed unstable tests in travis (#5013)
* [Travis][UnstableTests] Fixed unstable tests in travis as seen in https://travis-ci.org/hashicorp/consul/jobs/460824602

* Fixed unstable tests in https://travis-ci.org/hashicorp/consul/jobs/460857687
2018-12-12 12:09:42 -08:00
Kyle Havlovitz efcdc85e1a api: add support for new txn operations 2018-12-12 10:54:09 -08:00
Kyle Havlovitz b1aeb3b943 txn: add node operations 2018-12-12 10:04:10 -08:00
Kyle Havlovitz 8a0d7b65d6 Add check operations to transaction api 2018-12-12 10:04:10 -08:00
Philipp Gillé f23abddbc0 Remove wrong space character (#4910)
There should be no space before a comma
2018-11-06 22:34:36 +01:00
Matt Keeler 26b1873b3b Adds documentation for the new ACL APIs (#4851)
* Update the ACL API docs

* Add a CreateTime to the anon token

Also require acl:read permissions at least to perform rule translation. Don’t want someone DoSing the system with an open endpoint that actually does a bit of work.

* Fix one place where I was referring to id instead of AccessorID

* Add godocs for the API package additions.

* Minor updates: removed some extra commas and updated the acl intro paragraph

* minor tweaks

* Updated the language to be clearer

* Updated the language to be clearer for policy page

* I was also confused by that! Your updates are much clearer.

Co-Authored-By: kaitlincarter-hc <43049322+kaitlincarter-hc@users.noreply.github.com>

* Sounds much better.

Co-Authored-By: kaitlincarter-hc <43049322+kaitlincarter-hc@users.noreply.github.com>

* Updated sidebar layout and deprecated warning
2018-10-31 15:11:51 -07:00
Rémi Lapeyre 70a051363f Add support for multiple checks in Consul Client (#4849) 2018-10-29 09:39:25 -07:00
Matt Keeler 2f42298565
New ACL API Tests (#4848)
* A few API mods and unit tests.

* Update the unit tests to verify query/write metadata and to fix the rules endpoint tests.

* Make sure the full information for the replication status is in the api packge
2018-10-25 11:09:46 -04:00
Matt Keeler 8fa3d61d25
Implement CLI token cloning & special ID handling (#4827)
* Implement CLI token cloning & special ID handling

* Update a couple CLI commands to take some alternative options.

* Document the CLI.

* Update the policy list and set-agent-token synopsis
2018-10-24 10:24:29 -04:00
Matt Keeler 99e0a124cb
New ACLs (#4791)
This PR is almost a complete rewrite of the ACL system within Consul. It brings the features more in line with other HashiCorp products. Obviously there is quite a bit left to do here but most of it is related docs, testing and finishing the last few commands in the CLI. I will update the PR description and check off the todos as I finish them over the next few days/week.
Description

At a high level this PR is mainly to split ACL tokens from Policies and to split the concepts of Authorization from Identities. A lot of this PR is mostly just to support CRUD operations on ACLTokens and ACLPolicies. These in and of themselves are not particularly interesting. The bigger conceptual changes are in how tokens get resolved, how backwards compatibility is handled and the separation of policy from identity which could lead the way to allowing for alternative identity providers.

On the surface and with a new cluster the ACL system will look very similar to that of Nomads. Both have tokens and policies. Both have local tokens. The ACL management APIs for both are very similar. I even ripped off Nomad's ACL bootstrap resetting procedure. There are a few key differences though.

    Nomad requires token and policy replication where Consul only requires policy replication with token replication being opt-in. In Consul local tokens only work with token replication being enabled though.
    All policies in Nomad are globally applicable. In Consul all policies are stored and replicated globally but can be scoped to a subset of the datacenters. This allows for more granular access management.
    Unlike Nomad, Consul has legacy baggage in the form of the original ACL system. The ramifications of this are:
        A server running the new system must still support other clients using the legacy system.
        A client running the new system must be able to use the legacy RPCs when the servers in its datacenter are running the legacy system.
        The primary ACL DC's servers running in legacy mode needs to be a gate that keeps everything else in the entire multi-DC cluster running in legacy mode.

So not only does this PR implement the new ACL system but has a legacy mode built in for when the cluster isn't ready for new ACLs. Also detecting that new ACLs can be used is automatic and requires no configuration on the part of administrators. This process is detailed more in the "Transitioning from Legacy to New ACL Mode" section below.
2018-10-19 12:04:07 -04:00
Jack Pearkes 197d62c6ca New command: consul debug (#4754)
* agent/debug: add package for debugging, host info

* api: add v1/agent/host endpoint

* agent: add v1/agent/host endpoint

* command/debug: implementation of static capture

* command/debug: tests and only configured targets

* agent/debug: add basic test for host metrics

* command/debug: add methods for dynamic data capture

* api: add debug/pprof endpoints

* command/debug: add pprof

* command/debug: timing, wg, logs to disk

* vendor: add gopsutil/disk

* command/debug: add a usage section

* website: add docs for consul debug

* agent/host: require operator:read

* api/host: improve docs and no retry timing

* command/debug: fail on extra arguments

* command/debug: fixup file permissions to 0644

* command/debug: remove server flags

* command/debug: improve clarity of usage section

* api/debug: add Trace for profiling, fix profile

* command/debug: capture profile and trace at the same time

* command/debug: add index document

* command/debug: use "clusters" in place of members

* command/debug: remove address in output

* command/debug: improve comment on metrics sleep

* command/debug: clarify usage

* agent: always register pprof handlers and protect

This will allow us to avoid a restart of a target agent
for profiling by always registering the pprof handlers.

Given this is a potentially sensitive path, it is protected
with an operator:read ACL and enable debug being
set to true on the target agent. enable_debug still requires
a restart.

If ACLs are disabled, enable_debug is sufficient.

* command/debug: use trace.out instead of .prof

More in line with golang docs.

* agent: fix comment wording

* agent: wrap table driven tests in t.run()
2018-10-19 08:41:03 -07:00
Rebecca Zanzig 0ec6d880f5 Support multiple tags for health and catalog http api endpoints (#4717)
* Support multiple tags for health and catalog api endpoints

Fixes #1781.

Adds a `ServiceTags` field to the ServiceSpecificRequest to support
multiple tags, updates the filter logic in the catalog store, and
propagates these change through to the health and catalog endpoints.

Note: Leaves `ServiceTag` in the struct, since it is being used as
part of the DNS lookup, which in turn uses the health check.

* Update the api package to support multiple tags

Includes additional tests.

* Update new tests to use the `require` library

* Update HealthConnect check after a bad merge
2018-10-11 12:50:05 +01:00
Pierre Souchay 42f250fa53 Added SOA configuration for DNS settings. (#4714)
This will allow to fine TUNE SOA settings sent by Consul in DNS responses,
for instance to be able to control negative ttl.

Will fix: https://github.com/hashicorp/consul/issues/4713

# Example

Override all settings:

* min_ttl: 0 => 60s
* retry: 600 (10m) => 300s (5 minutes),
* expire: 86400 (24h) => 43200 (12h)
* refresh: 3600 (1h) => 1800 (30 minutes)

```
consul agent -dev -hcl 'dns_config={soa={min_ttl=60,retry=300,expire=43200,refresh=1800}}'
```

Result:
```
dig +multiline @localhost -p 8600 service.consul

; <<>> DiG 9.12.1 <<>> +multiline @localhost -p 8600 service.consul
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 36557
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;service.consul.		IN A

;; AUTHORITY SECTION:
consul.			0 IN SOA ns.consul. hostmaster.consul. (
				1537959133 ; serial
				1800       ; refresh (30 minutes)
				300        ; retry (5 minutes)
				43200      ; expire (12 hours)
				60         ; minimum (1 minute)
				)

;; Query time: 4 msec
;; SERVER: 127.0.0.1#8600(127.0.0.1)
;; WHEN: Wed Sep 26 12:52:13 CEST 2018
;; MSG SIZE  rcvd: 93
```
2018-10-10 15:50:56 -04:00
R.B. Boyer 4427417140 cli: avoid passing envoy bootstrap configuration as arguments (#4747)
Play a trick with CLOEXEC to pass the envoy bootstrap configuration as
an open file descriptor to the exec'd envoy process. The file only
briefly touches disk before being unlinked.

We convince envoy to read from this open file descriptor by using the
/dev/fd/$FDNUMBER mechanism to read the open file descriptor as a file.

Because the filename no longer has an extension envoy's sniffing logic
falls back on JSON instead of YAML, so the bootstrap configuration must
be generated as JSON instead.
2018-10-10 16:55:34 +01:00
Paul Banks 7a8023a57f Fix up tests broken by master merge; add proxy tests to services command (and fix it!); actually run the proxycfg.Manager 2018-10-10 16:55:34 +01:00
Paul Banks 1e4c5a1811 Connect Envoy Command (#4735)
* Plumb xDS server and proxyxfg into the agent startup

* Add `consul connect envoy` command to allow running Envoy as a connect sidecar.

* Add test for help tabs; typos and style fixups from review
2018-10-10 16:55:34 +01:00
Paul Banks 979e1c9c94 Add -sidecar-for and new /agent/service/:service_id endpoint (#4691)
- A new endpoint `/v1/agent/service/:service_id` which is a generic way to look up the service for a single instance. The primary value here is that it:
   - **supports hash-based blocking** and so;
   - **replaces `/agent/connect/proxy/:proxy_id`** as the mechanism the built-in proxy uses to read its config.
   - It's not proxy specific and so works for any service.
   - It has a temporary shim to call through to the existing endpoint to preserve current managed proxy config defaulting behaviour until that is removed entirely (tested).
 - The built-in proxy now uses the new endpoint exclusively for it's config
 - The built-in proxy now has a `-sidecar-for` flag that allows the service ID of the _target_ service to be specified, on the condition that there is exactly one "sidecar" proxy (that is one that has `Proxy.DestinationServiceID` set) for the service registered.
 - Several fixes for edge cases for SidecarService
 - A fix for `Alias` checks - when running locally they didn't update their state until some external thing updated the target. If the target service has no checks registered as below, then the alias never made it past critical.
2018-10-10 16:55:34 +01:00
Paul Banks 7038fe6b71 Add SidecarService Syntax sugar to Service Definition (#4686)
* Added new Config for SidecarService in ServiceDefinitions.

* WIP: all the code needed for SidecarService is written... none of it is tested other than config :). Need API updates too.

* Test coverage for the new sidecarServiceFromNodeService method.

* Test API registratrion with SidecarService

* Recursive Key Translation 🤦

* Add tests for nested sidecar defintion arrays to ensure they are translated correctly

* Use dedicated internal state rather than Service Meta for tracking sidecars for deregistration.

Add tests for deregistration.

* API struct for agent register. No other endpoint should be affected yet.

* Additional test cases to cover updates to API registrations
2018-10-10 16:55:34 +01:00
Paul Banks 92fe8c8e89 Add Proxy Upstreams to Service Definition (#4639)
* Refactor Service Definition ProxyDestination.

This includes:
 - Refactoring all internal structs used
 - Updated tests for both deprecated and new input for:
   - Agent Services endpoint response
   - Agent Service endpoint response
   - Agent Register endpoint
     - Unmanaged deprecated field
     - Unmanaged new fields
     - Managed deprecated upstreams
     - Managed new
   - Catalog Register
     - Unmanaged deprecated field
     - Unmanaged new fields
     - Managed deprecated upstreams
     - Managed new
   - Catalog Services endpoint response
   - Catalog Node endpoint response
   - Catalog Service endpoint response
 - Updated API tests for all of the above too (both deprecated and new forms of register)

TODO:
 - config package changes for on-disk service definitions
 - proxy config endpoint
 - built-in proxy support for new fields

* Agent proxy config endpoint updated with upstreams

* Config file changes for upstreams.

* Add upstream opaque config and update all tests to ensure it works everywhere.

* Built in proxy working with new Upstreams config

* Command fixes and deprecations

* Fix key translation, upstream type defaults and a spate of other subtele bugs found with ned to end test scripts...

TODO: tests still failing on one case that needs a fix. I think it's key translation for upstreams nested in Managed proxy struct.

* Fix translated keys in API registration.
≈

* Fixes from docs
 - omit some empty undocumented fields in API
 - Bring back ServiceProxyDestination in Catalog responses to not break backwards compat - this was removed assuming it was only used internally.

* Documentation updates for Upstreams in service definition

* Fixes for tests broken by many refactors.

* Enable travis on f-connect branch in this branch too.

* Add consistent Deprecation comments to ProxyDestination uses

* Update version number on deprecation notices, and correct upstream datacenter field with explanation in docs
2018-10-10 16:55:34 +01:00
Paul Banks 5b0d4db6bc Support Agent Caching for Service Discovery Results (#4541)
* Add cache types for catalog/services and health/services and basic test that caching works

* Support non-blocking cache types with Cache-Control semantics.

* Update API docs to include caching info for every endpoint.

* Comment updates per PR feedback.

* Add note on caching to the 10,000 foot view on the architecture page to make the new data path more clear.

* Document prepared query staleness quirk and force all background requests to AllowStale so we can spread service discovery load across servers.
2018-10-10 16:55:34 +01:00
Freddy 54e4cd4801 Improve resilience of api pkg tests (#4676)
* Add function to wait for serfHealth in api tests

* Disable connect when creating semaphore test clients

* Wait for serfHealth when creating sessions in their tests

* Add helper functions to create lock/semaphore sessions without checks

* Log passing tests to prevent timeout in Travis due to lack of output
2018-09-18 17:47:01 +01:00
Pierre Souchay 7a42c31330 Fix unstable tests in agent, api, and command/watch 2018-09-10 16:58:53 +01:00
Pierre Souchay 473e589d86 Implementation of Weights Data structures (#4468)
* Implementation of Weights Data structures

Adding this datastructure will allow us to resolve the
issues #1088 and #4198

This new structure defaults to values:
```
   { Passing: 1, Warning: 0 }
```

Which means, use weight of 0 for a Service in Warning State
while use Weight 1 for a Healthy Service.
Thus it remains compatible with previous Consul versions.

* Implemented weights for DNS SRV Records

* DNS properly support agents with weight support while server does not (backwards compatibility)

* Use Warning value of Weights of 1 by default

When using DNS interface with only_passing = false, all nodes
with non-Critical healthcheck used to have a weight value of 1.
While having weight.Warning = 0 as default value, this is probably
a bad idea as it breaks ascending compatibility.

Thus, we put a default value of 1 to be consistent with existing behaviour.

* Added documentation for new weight field in service description

* Better documentation about weights as suggested by @banks

* Return weight = 1 for unknown Check states as suggested by @banks

* Fixed typo (of -> or) in error message as requested by @mkeeler

* Fixed unstable unit test TestRetryJoin

* Fixed unstable tests

* Fixed wrong Fatalf format in `testrpc/wait.go`

* Added notes regarding DNS SRV lookup limitations regarding number of instances

* Documentation fixes and clarification regarding SRV records with weights as requested by @banks

* Rephrase docs
2018-09-07 15:30:47 +01:00
Freddy baa35ca503
Remove operator_area note from godoc overview (#4603) 2018-08-28 16:02:24 -04:00
Siva Prasad 33c3d1ddfa
Addresses the flakiness of CatalogNodes (#4530) 2018-08-15 11:16:05 -04:00
Pierre Souchay fd927ea110 BUGFIX: Unit test relying on WaitForLeader() did not work due to wrong test (#4472)
- Improve resilience of testrpc.WaitForLeader()

- Add additionall retry to CI

- Increase "go test" timeout to 8m

- Add wait for cluster leader to several tests in the agent package

- Add retry to some tests in the api and command packages
2018-08-06 19:46:09 -04:00