Commit Graph

9300 Commits

Author SHA1 Message Date
Pierre Souchay 8dd3476921 Allow `"disable_host_node_id": false` to work on Linux as non-root. (#4926)
Bump `shirou/gopsutil` to include https://github.com/shirou/gopsutil/pull/603

This will allow to have consistent node-id even when machine is reinstalled
when using `"disable_host_node_id": false`

It will fix https://github.com/hashicorp/consul/issues/4914 and allow having
the same node-id even when reinstalling a node from scratch. However,
it is only compatible with a single OS (installing to Windows will change
the node-id, but it seems acceptable).
2019-01-10 10:50:14 -05:00
Matt Keeler 3b8bdf0f62
Update CHANGELOG.md 2019-01-10 09:29:51 -05:00
Aestek 154c41e165 [Security] Allow blocking Write endpoints on Agent using Network Addresses (#4719)
* Add -write-allowed-nets option

* Add documentation for the new write_allowed_nets option
2019-01-10 09:27:26 -05:00
Matt Keeler 29b4512120
acl: Prevent tokens from deleting themselves (#5210)
Fixes #4897 

Also apparently token deletion could segfault in secondary DCs when attempting to delete non-existant tokens. For that reason both checks are wrapped within the non-nil check.
2019-01-10 09:22:51 -05:00
Paul Banks cc14db1c26
Update CHANGELOG.md 2019-01-10 12:50:31 +00:00
Paul Banks d4ae9d8e46
Update CHANGELOG.md 2019-01-10 12:47:59 +00:00
Paul Banks c4fa66b4c9
connect: agent leaf cert caching improvements (#5091)
* Add State storage and LastResult argument into Cache so that cache.Types can safely store additional data that is eventually expired.

* New Leaf cache type working and basic tests passing. TODO: more extensive testing for the Root change jitter across blocking requests, test concurrent fetches for different leaves interact nicely with rootsWatcher.

* Add multi-client and delayed rotation tests.

* Typos and cleanup error handling in roots watch

* Add comment about how the FetchResult can be used and change ca leaf state to use a non-pointer state.

* Plumb test override of root CA jitter through TestAgent so that tests are deterministic again!

* Fix failing config test
2019-01-10 12:46:11 +00:00
kaitlincarter-hc 2e972b46c6
Re-worked the ACL guide into two docs and an updated guide. (#5093)
* Re-worked the ACL guide into two docs and an updated guide.

Co-Authored-By: kaitlincarter-hc <43049322+kaitlincarter-hc@users.noreply.github.com>

* Updating syntax based on amayer5125's comments.

* Missed one of amayer5125's comments

* found a bad link in the acl system docs

* fixing a link in the rules docs
2019-01-09 15:07:20 -06:00
Matt Keeler 021138c923
Update CHANGELOG.md 2019-01-08 11:44:36 -05:00
dawxy f4288baafe Fix data race (#5029)
Fix #4357
2019-01-08 11:43:14 -05:00
Hans Hasselberg 092907077d
connect: add tls config for vault connect ca provider (#5125)
* add tlsconfig for vault connect ca provider.
* add options to the docs
* add tests for new configuration
2019-01-08 17:09:22 +01:00
Hans Hasselberg 08502cfa61
snapshot: read meta.json correctly. (#5193)
* snapshot: read meta.json correctly.

Fixes #4452.
2019-01-08 17:06:28 +01:00
Alejandro Guirao Rodríguez 7b1bd33add agent/config: Fix typo in comment (#5202) 2019-01-08 16:27:22 +01:00
Paul Banks 5105c46b92
Update CHANGELOG.md 2019-01-08 10:16:34 +00:00
Paul Banks fcdb5b3494
agent: add default weights to service in local state to prevent AE churn (#5126)
* Add default weights when adding a service with no weights to local state to prevent constant AE re-sync.

This fix was contributed by @42wim in https://github.com/hashicorp/consul/pull/5096 but was merged against the wrong base. This adds it to master and adds a test to cover the behaviour.

* Fix tests that broke due to comparing internal state which now has default weights
2019-01-08 10:13:49 +00:00
Paul Banks 99e20e2d9e
Update CHANGELOG.md 2019-01-08 10:09:03 +00:00
Paul Banks 1bf3a37597
agent: Don't leave old errors around in cache (#5094)
* Fixes #4480. Don't leave old errors around in cache that can be hit in specific circumstances.

* Move error reset to cover extreme edge case of nil Value, nil err Fetch
2019-01-08 10:06:38 +00:00
Jack Pearkes 5b31ec33db
website: fixed ca provider references (#5185)
Fixes https://github.com/hashicorp/consul/issues/5182.
2019-01-07 18:47:02 -08:00
Matt Keeler f2456d1b02
Update CHANGELOG.md 2019-01-07 16:55:14 -05:00
Pierre Souchay 5b8a7d7127 Avoid to have infinite recursion in DNS lookups when resolving CNAMEs (#4918)
* Avoid to have infinite recursion in DNS lookups when resolving CNAMEs

This will avoid killing Consul when a Service.Address is using CNAME
to a Consul CNAME that creates an infinite recursion.

This will fix https://github.com/hashicorp/consul/issues/4907

* Use maxRecursionLevel = 3 to allow several recursions
2019-01-07 16:53:54 -05:00
Paul Banks 339dc78456
Update CHANGELOG.md 2019-01-07 21:32:20 +00:00
Paul Banks 0962e95e85
bugfix: use ServiceTags to generate cache key hash (#4987)
* bugfix: use ServiceTags to generate cahce key hash

* update unit test

* update

* remote print log

* Update .gitignore

* Completely deprecate ServiceTag field internally for clarity

* Add explicit test for CacheInfo cases
2019-01-07 21:30:47 +00:00
R.B. Boyer f5e7709dd7 Update CHANGELOG.md 2019-01-07 15:05:06 -06:00
R.B. Boyer 7f30950060
update github.com/hashicorp/{serf,memberlist,go-sockaddr} (#5189)
This activates large-cluster improvements in the gossip layer from
https://github.com/hashicorp/memberlist/pull/167
2019-01-07 15:00:47 -06:00
Matt Keeler e90a846128
Update CHANGELOG.md 2019-01-07 13:56:08 -05:00
Aestek eba1dd7856 Prevent status flap when re-registering a check (#4904)
Fixes point `#2` of: https://github.com/hashicorp/consul/issues/4903

When registering a service each healthcheck status is saved and restored (https://github.com/hashicorp/consul/blob/master/agent/agent.go#L1914) to avoid unnecessary flaps in health state.
This change extends this feature to single check registration by moving this protection in `AddCheck()` so that both `PUT /v1/agent/service/register` and `PUT /v1/agent/check/register` behave in the same idempotent way.

#### Steps to reproduce
1. Register a check :
```
curl -X PUT \
  http://127.0.0.1:8500/v1/agent/check/register \
  -H 'Content-Type: application/json' \
  -d '{
  "Name": "my_check",
  "ServiceID": "srv",
  "Interval": "10s",
  "Args": ["true"]
}'
```
2. The check will initialize and change to `passing`
3. Run the same request again
4. The check status will quickly go from `critical` to `passing` (the delay for this transission is determined by https://github.com/hashicorp/consul/blob/master/agent/checks/check.go#L95)
2019-01-07 13:53:03 -05:00
RJ Spiker be7d28b46b website: fix carousel bugs 2019-01-07 13:39:14 -05:00
Mitchell Hashimoto 5452c32f50 CA Provider Plugins (#4751)
This adds the `agent/connect/ca/plugin` library for consuming/serving Connect CA providers as [go-plugin](https://github.com/hashicorp/go-plugin) plugins. This **does not** wire this up in any way to Consul itself, so this will not enable using these plugins yet. 

## Why?

We want to enable CA providers to be pluggable without modifying Consul so that any CA or PKI system can potentially back the Connect certificates. This CA system may also be used in the future for easier bootstrapping and internal cluster security.

### go-plugin

The benefit of `go-plugin` is that for the plugin consumer, the fact that the interface implementation is communicating over multi-process RPC is invisible. Internals of Consul will continue to just use `ca.Provider` interface implementations as if they're local. For plugin _authors_, they simply have to implement the interface. The network/transport/process management issues are handled by go-plugin itself.

The CA provider plugins support both `net/rpc` and gRPC transports. This enables easy authoring in any language. go-plugin handles the actual protocol handshake and connection. This is just a feature of go-plugin. 

`go-plugin` is already in production use for years by Packer, Terraform, Nomad, Vault, and Sentinel. We've shown stability for both desktop and server-side software. It is very mature.

## Implementation Details

### `map[string]interface{}`

The `Configure` method passes a `map[string]interface{}`. This map contains only Go primitives and containers of primitives (no funcs, chans, etc.). For `net/rpc` we encode as-is using Gob. For gRPC we marshal to JSON and transmit as a `bytes` type. This is the same approach we take with Vault and other software.

Note that this is just the transport protocol, the end software views it fully decoded.

### `x509.Certificate` and `CertificateRequest`

We transmit the raw ASN.1  bytes and decode on the other side. Unit tests are verifying we get the same cert/csrs across the wire.

### Testing

`go-plugin` exposes test helpers that enable testing the full plugin RPC over real loopback network connections. We test all endpoints for success and error for both `net/rpc` and gRPC.

### Vendoring

This PR doesn't introduce vendoring for two reasons:

  1. @banks's `f-envoy` branch introduces a lot of these and I didn't want conflict.
  2. The library isn't actually used yet so it doesn't introduce compile-time errors (it does introduce test errors).

## Next Steps

With this in place, we need to figure out the proper way to actually hook these up to Consul, load them, etc. This discussion can happen elsewhere, since regardless of approach this plugin library implementation is the exact same.
2019-01-07 12:48:44 -05:00
R.B. Boyer 801e7e1c38
website: fix stray sentinel references using the old syntax (#5191)
[skip ci]
2019-01-07 09:59:17 -06:00
Matt Keeler b62ff49b16
Update CHANGELOG.md 2019-01-07 09:56:31 -05:00
Grégoire Seux 6a57c7fec5 Implement /v1/agent/health/service/<service name> endpoint (#3551)
This endpoint aggregates all checks related to <service id> on the agent
and return an appropriate http code + the string describing the worst
check.

This allows to cleanly expose service status to other component, hiding
complexity of multiple checks.
This is especially useful to use consul to feed a load balancer which
would delegate health checking to consul agent.

Exposing this endpoint on the agent is necessary to avoid a hit on
consul servers and avoid decreasing resiliency (this endpoint will work
even if there is no consul leader in the cluster).
2019-01-07 09:39:23 -05:00
Alvin Huang bf6db3934c
Merge pull request #5187 from hashicorp/add_ui_tests
Add ui tests
2019-01-04 12:13:08 -05:00
kaitlincarter-hc 841b01c6f0
Added the new monitoring guide (#5117) 2019-01-04 10:26:07 -06:00
Matt Keeler 2b047641ee
Update CHANGELOG.md 2019-01-04 10:03:29 -05:00
Aestek 5647ca2bbb [Fix] Services sometimes not being synced with acl_enforce_version_8 = false (#4771)
Fixes: https://github.com/hashicorp/consul/issues/3676

This fixes a bug were registering an agent with a non-existent ACL token can prevent other 
services registered with a good token from being synced to the server when using 
`acl_enforce_version_8 = false`.

## Background

When `acl_enforce_version_8` is off the agent does not check the ACL token validity before 
storing the service in its state.
When syncing a service registered with a missing ACL token we fall into the default error 
handling case (https://github.com/hashicorp/consul/blob/master/agent/local/state.go#L1255)
and stop the sync (https://github.com/hashicorp/consul/blob/master/agent/local/state.go#L1082)
without setting its Synced property to true like in the permission denied case.
This means that the sync will always stop at the faulty service(s).
The order in which the services are synced is random since we iterate on a map. So eventually
all services with good ACL tokens will be synced, this can however take some time and is influenced 
by the cluster size, the bigger the slower because retries are less frequent.
Having a service in this state also prevent all further sync of checks as they are done after
the services.

## Changes 

This change modify the sync process to continue even if there is an error. 
This fixes the issue described above as well as making the sync more error tolerant: if the server repeatedly refuses
a service (the ACL token could have been deleted by the time the service is synced, the servers 
were upgraded to a newer version that has more strict checks on the service definition...). 
Then all services and check that can be synced will, and those that don't will be marked as errors in 
the logs instead of blocking the whole process.
2019-01-04 10:01:50 -05:00
Alvin Huang 2254c4000e
Merge pull request #5186 from hashicorp/add_codeowners
add codeowners for consul docs
2019-01-04 09:32:53 -05:00
Alvin Huang fb62d4471c add documentation on how to use ember-exam 2019-01-03 23:50:02 -05:00
Alvin Huang 18bfab1db6 add test-parallel yarn command 2019-01-03 23:49:45 -05:00
Alvin Huang b15996ae4f refresh cache key and fix comments 2019-01-03 22:57:40 -05:00
Alvin Huang cdde1d7f7d change env to test 2019-01-03 22:27:17 -05:00
Alvin Huang d507d9cf36 using build-ci target in circleci 2019-01-03 22:27:17 -05:00
Alvin Huang 806f1a68b0 adding build-ci yarn target 2019-01-03 22:27:17 -05:00
Alvin Huang 90e863bfdc test circleci 2019-01-03 22:27:02 -05:00
Alvin Huang 961440870b output XML test results 2019-01-03 22:21:19 -05:00
Alvin Huang 4233509e83 enable ember-exam 2019-01-03 22:21:20 -05:00
Alvin Huang 9c1e0a75d0 update testem with newer conventions 2019-01-03 22:18:55 -05:00
Alvin Huang 898c725279 adding ember-exam package 2019-01-03 16:59:30 -05:00
Alvin Huang 85141973d9 add codeowners for consul docs 2019-01-03 15:01:16 -05:00
Junpei Tsuji 6f14d3eeae website: Use secret id instead of policy id (#5049)
The document uses _the agent-token policy_ as agent token by mistake.
So I fixed it to use the secret id instead of it.
2019-01-03 10:45:01 -05:00
Matt Keeler c723ba09d5
Update CHANGELOG.md 2019-01-03 10:41:33 -05:00