Commit Graph

12352 Commits

Author SHA1 Message Date
Jeff Escalante 14545999c6
fix more redirects 2020-08-18 16:36:40 -04:00
Jeff Escalante 3ca754dd6e
remove invalid redirects 2020-08-18 16:36:21 -04:00
Jeff Escalante ca8c9484e9
upgrade to latest docs-sidenav, fix some issues with the k8s section 2020-08-18 15:11:06 -04:00
Hans Hasselberg 02de4c8b76
add primary keys to list keyring (#8522)
During gossip encryption key rotation it would be nice to be able to see if all nodes are using the same key. This PR adds another field to the json response from `GET v1/operator/keyring` which lists the primary keys in use per dc. That way an operator can tell when a key was successfully setup as primary key.

Based on https://github.com/hashicorp/serf/pull/611 to add primary key to list keyring output:

```json
[
  {
    "WAN": true,
    "Datacenter": "dc2",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 6,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 6
    },
    "NumNodes": 6
  },
  {
    "WAN": false,
    "Datacenter": "dc2",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 8,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "NumNodes": 8
  },
  {
    "WAN": false,
    "Datacenter": "dc1",
    "Segment": "",
    "Keys": {
      "0OuM4oC3Os18OblWiBbZUaHA7Hk+tNs/6nhNYtaNduM=": 3,
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "PrimaryKeys": {
      "SINm887hKTzmMWeBNKTJReaTLX3mBEJKriDyt88Ad+g=": 8
    },
    "NumNodes": 8
  }
]
```

I intentionally did not change the CLI output because I didn't find a good way of displaying this information. There are a couple of options that we could implement later:
* add a flag to show the primary keys
* add a flag to show json output

Fixes #3393.
2020-08-18 09:50:24 +02:00
Daniel Nephin 7078ca07fa config: Move remote-script-checks warning to config
Previously it was done in Agent.Start, but it can be done much earlier
2020-08-17 17:39:49 -04:00
Daniel Nephin a0dc4222b6 config: move NodeName validation to config validation
Previsouly it was done in Agent.Start, which is much later then it needs to be.

The new 'dns' package was required, because otherwise there would be an
import cycle. In the future we should move more of the dns server into
the dns package.
2020-08-17 17:25:02 -04:00
Daniel Nephin 647236bb17
Merge pull request #8515 from hashicorp/dnephin/unexport-testing-shims
config: unexport fields and resolve TODOs in config.Builder
2020-08-17 16:03:07 -04:00
Daniel Nephin 749819e235
Merge pull request #8517 from hashicorp/dnephin/remove-defer-os-removeall
testing: remove all the 'defer os.RemoveAll'
2020-08-17 12:42:34 -04:00
danielehc 4a1cc8b9f9
Refactor api-docs links to learn (#8488) 2020-08-17 18:20:02 +02:00
danielehc e5d84ad45a
Refactor docs links to learn 2 (#8491)
* Refactor docs links to learn

* Apply suggestions from code review

Co-authored-by: Derek Strickland <1111455+DerekStrickland@users.noreply.github.com>

Co-authored-by: Derek Strickland <1111455+DerekStrickland@users.noreply.github.com>
2020-08-17 18:19:04 +02:00
danielehc b2d87296ca
Refactor docs links to learn (#8490) 2020-08-17 18:17:51 +02:00
Daniel Nephin 3e0d63a6b7 testing: use t.Cleanup in testutil.TempFile
So that it has the same behaviour as TempDir.

Also remove the now unnecessary 'defer os.Remove'
2020-08-14 20:06:01 -04:00
Daniel Nephin 8d35e37b3c testing: Remove all the defer os.Removeall
Now that testutil uses t.Cleanup to remove the directory the caller no longer has to manage
the removal
2020-08-14 19:58:53 -04:00
Daniel Nephin fe8790da9e config: unexport and resolve TODOs in config.Builder
- unexport testing shims, and document their purpose
- resolve a TODO by moving validation to NewBuilder and storing the one
  field that is used instead of all of Options
- create a slice with the correct size to avoid extra allocations
2020-08-14 19:23:32 -04:00
Daniel Nephin 85655098be testing: Improve session_endpoint_test
While working on another change I caused a bunch of these tests to fail.
Unfortunately the failure messages were not super helpful at first.

One problem was that the request and response were created outside of
the retry. This meant that when the second attempt happened, the request
body was empty (because the buffer had been consumed), and so the
request was not actually being retried. This was fixed by moving more of
the request creation into the retry block.

Another problem was that these functions can return errors in two ways, and
are not consistent about which way they use. Some errors are returned to
the response writer, but the tests were not checking those errors, which
was causing a panic later on. This was fixed by adding a check for the
response code.

Also adds some missing t.Helper(), and has assertIndex use checkIndex so
that it is clear these are the same implementation.
2020-08-14 18:55:52 -04:00
Daniel Nephin 5d53e7dbde Use t.Helper in testutil/retry 2020-08-14 18:55:52 -04:00
Daniel Nephin 483140c9a6
Merge pull request #8509 from hashicorp/dnephin/use-t.cleanup-in-testagent
testing: Use t.cleanup in TestAgent , and fix two flaky tests
2020-08-14 16:33:16 -04:00
Derek Strickland e5e2e3c3e3
Added link to Secure Consul on K8S Tutorial (#8402) 2020-08-14 14:27:34 -04:00
Daniel Nephin 2725513eea testutil: Add t.Cleanup to TempDir
TempDir registers a Cleanup so that the directory is always removed. To disable to cleanup, set the TEST_NOCLEANUP
env var.
2020-08-14 13:19:10 -04:00
Mike Morris cc82741517
changelog: update for 1.8.3, 1.7.7 and 1.6.8 (#8508)
* Update CHANGELOG.md

* changelog: update for 1.8.3, 1.7.7 and 1.6.8

Co-authored-by: Sarah Christoff <schristoff@hashicorp.com>
2020-08-13 20:03:58 -04:00
Daniel Nephin 6b0ac22c1b testing: fix flaky test TestDNS_NonExistentDC_RPC
I saw this test flake locally, and it was easy to reproduce with -count=10.

The failure was: 'TestAgent.dns: rpc error: error=No known Consul servers'.

Waiting for the agent seems to fix it.
2020-08-13 18:03:04 -04:00
Daniel Nephin 512a523a3e testing: wait until monitor has started before shutdown
This commit fixes a test that I saw flake locally while running tests. The test output from the monitor
started immediately after the line the test was looking for.

To fix the problem a channel is closed when the goroutine starts. Shutdown is not called until this channel
is closed, which seems to greatly reduce the chance of a flake.
2020-08-13 17:53:29 -04:00
Daniel Nephin b6d91d59f3 testing: Remove TestAgent.Key and change TestAgent.DataDir
TestAgent.Key was only used by 3 tests. Extracting it from the common helper that is used in hundreds of
tests helps keep the shared part small and more focused.

This required a second change (which I was planning on making anyway), which was to change the behaviour of
DataDir. Now in all cases the TestAgent will use the DataDir, and clean it up once the test is complete.
2020-08-13 17:53:24 -04:00
Iryna Shustava 12ee36abd9
docs: add docs to use Connect CA providers with Helm (#8464) 2020-08-13 14:29:59 -07:00
Daniel Nephin 3abe4e43d3 testing: use t.Cleanup in TestAgent for returnPorts 2020-08-13 17:09:37 -04:00
Daniel Nephin edf6e74a14 testing: remove unused fields from TestACLAgent 2020-08-13 17:03:55 -04:00
Derek Strickland d373be125e
Learn/link updates derek (#8487)
* Updated Learn url paths.

Co-authored-by: danielehc <40759828+danielehc@users.noreply.github.com>
2020-08-13 17:02:44 -04:00
Iryna Shustava 426e8b9029
docs: consul on k8s doesn't support external servers requiring mTLS (#8484) 2020-08-13 12:04:34 -07:00
Daniel Nephin 8ee2062b1a
Merge pull request #8473 from hashicorp/dnephin/unmethod-consul-config
agent: convert consulConfig method to a function
2020-08-13 12:35:44 -04:00
Daniel Nephin 24d8db906f agent: rename vars in newConsulConfig
'base' is a bit misleading, since it is the return value. Renamed to cfg.
2020-08-13 11:58:21 -04:00
Daniel Nephin 88ddd8e0e7 agent: Move setupKeyring functions to keyring.go
There are a couple reasons for this change:

1. agent.go is way too big. Smaller files makes code eaasier to read
   because tools that show usage also include filename which can give
   a lot more context to someone trying to understand which functions
   call other functions.
2. these two functions call into a large number of functions already in
   keyring.go.
2020-08-13 11:58:21 -04:00
Daniel Nephin 119df79d7c agent: unmethod consulConfig
To allow us to move newConsulConfig out of Agent.
2020-08-13 11:58:21 -04:00
Daniel Nephin 10689729d4
Merge pull request #8482 from hashicorp/dnephin/more-state-store-unmethod
state: remove unused Store method receiver
2020-08-13 11:56:44 -04:00
Daniel Nephin 055e7e8ca3 Fix conflict in merged PRs
One PR renamed the var from config->cfg, and another used the old name config, which caused the
build to fail on master.
2020-08-13 11:28:26 -04:00
Daniel Nephin 629c34085d state: remove unused Store method receiver
And use ReadTxn interface where appropriate.
2020-08-13 11:25:22 -04:00
Daniel Nephin 20f94bf9ab
Merge pull request #8463 from hashicorp/dnephin/unmethod-make-node-id
agent: convert NodeID methods to functions
2020-08-13 11:18:11 -04:00
Daniel Nephin fc797a279a
Merge pull request #8461 from hashicorp/dnephin/remove-notify-shutdown
agent/consul: Remove NotifyShutdown
2020-08-13 11:16:48 -04:00
Daniel Nephin d8ffcd5686
Merge pull request #8365 from hashicorp/dnephin/fix-service-by-node-meta-flake
state: speed up tests that use watchLimit
2020-08-13 11:16:12 -04:00
Hans Hasselberg 658c4cad0b
Link issue in note template (#8502)
Issue and PR numbers do not overlap, they are based of the same counter.
A PR can be also linked to via issues, if it is a PR, Github will
redirect to it.
This change has the benefit that one can link to both - issues and PRs.
2020-08-13 10:22:56 +02:00
Luke Kysow 02c2f023f5
Update k8s sync docs (#8452)
* Update k8s sync docs

- remove docs that said for nodeport service we register each instance
on a node with its same node name. We instead register each instance
onto the k8s-sync node
- add docs describing which ports and ips are used
2020-08-12 16:34:55 -07:00
Luke Kysow 1095b93775
Document k8s sidecar resource annotations (#8455) 2020-08-12 16:34:17 -07:00
s-christoff b218692382
Update version.js to 1.8.3 (#8505) 2020-08-12 17:35:54 -05:00
Daniel Nephin 7d4201a09c
Merge pull request #8500 from hashicorp/dnephin/auto-config-loader
auto-config: reduce awareness of config
2020-08-12 18:14:09 -04:00
s-christoff 7be28d9150
Update CHANGELOG.md 2020-08-12 14:47:19 -05:00
Daniel Nephin 45dae87ee7 auto-config: reduce awareness of config
This is a small step to allowing Agent to accept its dependencies
instead of creating them in New.

There were two fields in autoconfig.Config that were used exclusively
to load config. These were replaced with a single function, allowing us
to move LoadConfig back to the config package.

Also removed the WithX functions for building a Config. Since these were
simple assignment, it appeared we were not getting much value from them.
2020-08-12 13:23:23 -04:00
Daniel Nephin 62e402c4f9 Remove check that hostID is a uuid.
Immediately afterward we hash the ID, so it does not need to be a uuid anymore.
2020-08-12 13:05:10 -04:00
Daniel Nephin 55b074f0eb agent: convert NodeID methods to functions
Making these functions allows us to cleanup how an agent is initialized. They only make use of a config and a logger, so they do not need to be agent methods.

Also cleanup the testing to use t.Run and require.
2020-08-12 13:05:10 -04:00
Daniel Nephin 6be568119b Extract nodeID functions to a different file
In preparation for turning them into functions.
To reduce the scope of Agent, and refactor how Agent is created and started.
2020-08-12 13:05:10 -04:00
R.B. Boyer 1593ce2948 update changelog snippet 2020-08-12 11:21:54 -05:00
R.B. Boyer 63422ca9c5
connect: use stronger validation that ingress gateways have compatible protocols defined for their upstreams (#8470)
Fixes #8466

Since Consul 1.8.0 there was a bug in how ingress gateway protocol
compatibility was enforced. At the point in time that an ingress-gateway
config entry was modified the discovery chain for each upstream was
checked to ensure the ingress gateway protocol matched. Unfortunately
future modifications of other config entries were not validated against
existing ingress-gateway definitions, such as:

1. create tcp ingress-gateway pointing to 'api' (ok)
2. create service-defaults for 'api' setting protocol=http (worked, but not ok)
3. create service-splitter or service-router for 'api' (worked, but caused an agent panic)

If you were to do these in a different order, it would fail without a
crash:

1. create service-defaults for 'api' setting protocol=http (ok)
2. create service-splitter or service-router for 'api' (ok)
3. create tcp ingress-gateway pointing to 'api' (fail with message about
   protocol mismatch)

This PR introduces the missing validation. The two new behaviors are:

1. create tcp ingress-gateway pointing to 'api' (ok)
2. (NEW) create service-defaults for 'api' setting protocol=http ("ok" for back compat)
3. (NEW) create service-splitter or service-router for 'api' (fail with
   message about protocol mismatch)

In consideration for any existing users that may be inadvertently be
falling into item (2) above, that is now officiall a valid configuration
to be in. For anyone falling into item (3) above while you cannot use
the API to manufacture that scenario anymore, anyone that has old (now
bad) data will still be able to have the agent use them just enough to
generate a new agent/proxycfg error message rather than a panic.
Unfortunately we just don't have enough information to properly fix the
config entries.
2020-08-12 11:19:20 -05:00