Commit Graph

320 Commits

Author SHA1 Message Date
Dhia Ayachi e5dbf5e55b
Add ca certificate metrics (#10504)
* add intermediate ca metric routine

* add Gauge config for intermediate cert

* Stop metrics routine when stopping leader

* add changelog entry

* updage changelog

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* use variables instead of a map

* go imports sort

* Add metrics for primary and secondary ca

* start metrics routine in the right DC

* add telemetry documentation

* update docs

* extract expiry fetching in a func

* merge metrics for primary and secondary into signing ca metric

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-07-07 09:41:01 -04:00
John Cowen d93d9d4e39
ui: Don't default to the default namespace, use the token default namespace instead (#10503)
The default namespace, and the tokens default namespace (or its origin namespace) is slightly more complicated than other things we deal with in the UI, there's plenty of info/docs on this that I've added in this PR.

Previously:

When a namespace was not specified in the URL, we used to default to the default namespace. When you logged in using a token we automatically forward you the namespace URL that your token originates from, so you are then using the namespace for your token by default. You can of course then edit the URL to remove the namespace portion, or perhaps revisit the UI at the root path with you token already set. In these latter cases we would show you information from the default namespace. So if you had no namespace segment/portion in the URL, we would assume default, perform actions against the default namespace and highlight the default namespace in the namespace selector menu. If you wanted to perform actions in your tokens origin namespace you would have to manually select it from the namespace selector menu.

This PR:

Now, when you have no namespace segment/portion in the URL, we use the token's origin namespace instead (and if you don't have a token, we then use the default namespace like it was previously)

Notes/thoughts:

I originally thought we were showing an incorrectly selected namespace in the namespace selector, but it also matched up with what we were doing with the API, so it was in fact correct. The issue was more that we weren't selecting the origin namespace of the token for the user when a namespace segment was omitted from the URL. Seeing as we automatically forward you to the tokens origin namespace when you log in, and we were correctly showing the namespace we were acting on when you had no namespace segment in the URL (in the previous case default), I'm not entirely sure how much of an issue this actually was.

This characteristic of namespace+token+namespace is a little weird and its easy to miss a subtlety or two so I tried to add some documentation in here for future me/someone else (including some in depth code comment around one of the API endpoints where this is very subtle and very hard to miss). I'm not the greatest at words, so would be great to get some edits there if it doesn't seem clear to folks.

The fact that we used to save your previous datacenter and namespace into local storage for reasons also meant the interaction here was slightly more complicated than it needed to be, so whilst we were here we rejigged things slightly to satisfy said reasons still but not use local storage (we try and grab the info from higher up). A lot of the related code here is from before we had our Routlets which I think could probably make all of this a lot less complicated, but I didn't want to do a wholesale replacement in this PR, we can save that for a separate PR on its own at some point.
2021-07-07 11:46:41 +01:00
Daniel Nephin 916a9779f1 Add changelog 2021-07-06 17:57:36 -04:00
Dhia Ayachi b57cf27e8f
Format certificates properly (rfc7468) with a trailing new line (#10411)
* trim carriage return from certificates when inserting rootCA in the inMemDB

* format rootCA properly when returning the CA on the connect CA endpoint

* Fix linter warnings

* Fix providers to trim certs before returning it

* trim newlines on write when possible

* add changelog

* make sure all provider return a trailing newline after the root and intermediate certs

* Fix endpoint to return trailing new line

* Fix failing test with vault provider

* make test more robust

* make sure all provider return a trailing newline after the leaf certs

* Check for suffix before removing newline and use function

* Add comment to consul provider

* Update change log

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* fix typo

* simplify code callflow

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>

* extract requireNewLine as shared func

* remove dependency to testify in testing file

* remove extra newline in vault provider

* Add cert newline fix to envoy xds

* remove new line from mock provider

* Remove adding a new line from provider and fix it when the cert is read

* Add a comment to explain the fix

* Add missing for leaf certs

* fix missing new line

* fix missing new line in leaf certs

* remove extra new line in test

* updage changelog

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* fix in vault provider and when reading cache (RPC call)

* fix AWS provider

* fix failing test in the provider

* remove comments and empty lines

* add check for empty cert in test

* fix linter warnings

* add new line for leaf and private key

* use string concat instead of Sprintf

* fix new lines for leaf signing

* preallocate slice and remove append

* Add new line to `SignIntermediate` and `CrossSignCA`

Co-authored-by: R.B. Boyer <4903+rboyer@users.noreply.github.com>
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-06-30 20:48:29 -04:00
Kenia 1ec131924a
Add changelog to backported bug fix for Topology Metrics (#10538) 2021-06-30 16:51:29 -04:00
Daniel Nephin 72ea979c39
Merge pull request #10515 from hashicorp/dnephin/fix-arm32-atomic-aligment
Fix panic on 32-bit platforms
2021-06-30 16:40:20 -04:00
Daniel Nephin e226733b26 fix 64-bit aligment for 32-bit platforms
sync/atomic must be used with 64-bit aligned fields, and that alignment is difficult to
ensure unless the field is the first one in the struct.

https://golang.org/pkg/sync/atomic/#pkg-note-BUG.
2021-06-29 16:10:21 -04:00
Daniel Nephin a4a390d7c5 streaming: fix enable of streaming in the client
And add checks to all the tests that explicitly use streaming.
2021-06-28 17:23:14 -04:00
R.B. Boyer 30ccd5c2d9
connect: include optional partition prefixes in SPIFFE identifiers (#10507)
NOTE: this does not include any intentions enforcement changes yet
2021-06-25 16:47:47 -05:00
R.B. Boyer c3d5a2a5ab
connect/ca: cease including the common name field in generated certs (#10424)
As part of this change, we ensure that the SAN extensions are marked as
critical when the subject is empty so that AWS PCA tolerates the loss of
common names well and continues to function as a Connect CA provider.

Parts of this currently hack around a bug in crypto/x509 and can be
removed after https://go-review.googlesource.com/c/go/+/329129 lands in
a Go release.

Note: the AWS PCA tests do not run automatically, but the following
passed locally for me:

    ENABLE_AWS_PCA_TESTS=1 go test ./agent/connect/ca -run TestAWS
2021-06-25 13:00:00 -05:00
Dhia Ayachi 8b967b3bb6
return an empty record when asked for an addr dns with type other then A, AAAA and ANY (#10401)
* return an invalid record when asked for an addr dns with type other then A and AAAA

* add changelog

* fix ANY use case and add a test for it

* update changelog type

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* return empty response if the question record type do not match for addr

* set comment in the right place

* return A\AAAA record in extra section if record type is not A\AAAA for addr

* Fix failing test

* remove commented code

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* use require for test validation

* use variable to init struct

* fix failing test

* Update agent/dns.go

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* Update .changelog/10401.txt

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* Update agent/dns.go

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* Update agent/dns.go

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* Update agent/dns.go

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* fix compilation error

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-06-24 20:44:44 -04:00
R.B. Boyer e20abcc14f
connect/proxy: fixes logic bug preventing builtin/native proxy from starting upstream listeners (#10486)
Fixes #10480

Also fixed a data race in the `connect/proxy` package that was unearthed by the tests changed for this bugfix.
2021-06-24 15:02:34 -05:00
Daniel Nephin 6de514fbd1 tlsutil: fix default server name for health checks
Don't use the agent node name or agent server name when EnableAgentTLSForChecks=false.
2021-06-24 13:49:58 -04:00
R.B. Boyer 9778bee35a
structs: prohibit config entries from referencing more than one partition at a time (#10478)
affected kinds: service-defaults, ingress-gateway, terminating-gateway, service-intentions
2021-06-23 16:44:10 -05:00
R.B. Boyer 952df8b491
structs: prevent service-defaults upstream configs from using wildcard names or namespaces (#10475) 2021-06-23 15:48:54 -05:00
John Cowen 93c5780fab
ui: Move all our icons to use CSS custom properties instead of SASS vars (#10298)
* ui: Move all our icons to use CSS custom properties

The good thing about SASS vars is, if you don't use them they get removed from the final build. Whereas with CSS we have no tree-shaking to get rid of unused CSS custom properties. We can mostly work around this and for some things like colors its no big deal if we have some hex-codes in the build that we don't use as hex-codes are relatively small.

We've been slowly but surely moving all of our colors (and other things) to use CSS custom properties instead of SASS vars now that we have them available.

This commit makes use of the 'tree-shaking' abilities of @extend to ensure that we only compile in the icons that we use.

This commit is mostly churn-less as we already use @extend for the majority of our icons, so generally there is zero change here for working on the UI, but I did spot one single place where we were using SASS vars instead of @extend. This now uses the new form (second commit)

Interestingly this reduces our CSS payload by ~2kb to ~53kb (around 25kb of that is these icons)
2021-06-21 11:54:58 +01:00
Daniel Nephin f4c1f982d1
Merge pull request #9924 from hashicorp/dnephin/cert-expiration-metric
connect: emit a metric for the seconds until root CA expiry
2021-06-18 14:18:55 -04:00
Nitya Dhanushkodi 2d27dbb711
Update .changelog/10423.txt
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
2021-06-17 12:06:26 -07:00
Nitya Dhanushkodi ffbbe9e73f proxycfg: reference to entry in map should not panic 2021-06-17 11:49:04 -07:00
Mike Morris d02051349b changelog: add notes section to changelog template 2021-06-16 16:58:11 -04:00
Mike Morris dedcaf5331 changelog: add note about packaging EULA and ToE alongside Enterprise binaries 2021-06-16 16:58:08 -04:00
Freddy 3127dac0fb
Merge pull request #10404 from hashicorp/ingress-stats 2021-06-15 14:28:07 -06:00
freddygv 62353557c0 Add changelog entry 2021-06-15 14:15:30 -06:00
Nitya Dhanushkodi 08ed3edf71
proxycfg: Ensure that endpoints for explicit upstreams in other datacenters are watched in transparent mode (#10391)
Co-authored-by: Freddy Vallenilla <freddy@hashicorp.com>
2021-06-15 11:00:26 -07:00
Dhia Ayachi 4b75f15fb7
improve monitor performance (#10368)
* remove flush for each write to http response in the agent monitor endpoint

* fix race condition when we stop and start monitor multiple times, the doneCh is closed and never recover.

* start log reading goroutine before adding the sink to avoid filling the log channel before getting a chance of reading from it

* flush every 500ms to optimize log writing in the http server side.

* add changelog file

* add issue url to changelog

* fix changelog url

* Update changelog

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* use ticker to flush and avoid race condition when flushing in a different goroutine

* stop the ticker when done

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* Revert "fix race condition when we stop and start monitor multiple times, the doneCh is closed and never recover."

This reverts commit 1eeddf7a

* wait for log consumer loop to start before registering the sink

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-06-15 12:05:52 -04:00
R.B. Boyer 8d5f81b460
xds: ensure that dependent xDS resources are reconfigured during primary type warming (#10381)
Updates to a cluster will clear the associated endpoints, and updates to
a listener will clear the associated routes. Update the incremental xDS
logic to account for this implicit cleanup so that we can finish warming
the clusters and listeners.

Fixes #10379
2021-06-14 17:20:27 -05:00
Daniel Nephin e36800cefa Update metric name
and handle the case where there is no active root CA.
2021-06-14 17:01:16 -04:00
Daniel Nephin 548796ae13 connect: emit a metric for the number of seconds until root CA expiration 2021-06-14 16:57:01 -04:00
R.B. Boyer b050232abb
grpc: move gRPC INFO logs to be emitted as TRACE logs from Consul (#10395)
Fixes #10183
2021-06-14 15:13:58 -05:00
Freddy f19b1f0058
Relax validation for expose.paths config (#10394)
Previously we would return an error if duplicate paths were specified.
This could lead to problems in cases where a user has the same path,
say /healthz, on two different ports.

This validation was added to signal a potential misconfiguration.
Instead we will only check for duplicate listener ports, since that is
what would lead to ambiguity issues when generating xDS config.

In the future we could look into using a single listener and creating
distinct filter chains for each path/port.
2021-06-14 14:04:11 -06:00
Freddy 61ae2995b7
Add flag for transparent proxies to dial individual instances (#10329) 2021-06-09 14:34:17 -06:00
Freddy 62facc1a04
Revert "Avoid adding original_dst filter when not needed" (#10365) 2021-06-08 13:18:41 -06:00
Dhia Ayachi e3dd0f9a44
generate a single debug file for a long duration capture (#10279)
* debug: remove the CLI check for debug_enabled

The API allows collecting profiles even debug_enabled=false as long as
ACLs are enabled. Remove this check from the CLI so that users do not
need to set debug_enabled=true for no reason.

Also:
- fix the API client to return errors on non-200 status codes for debug
  endpoints
- improve the failure messages when pprof data can not be collected

Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com>

* remove parallel test runs

parallel runs create a race condition that fail the debug tests

* snapshot the timestamp at the beginning of the capture

- timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures
- capture append to the file if it already exist

* Revert "snapshot the timestamp at the beginning of the capture"

This reverts commit c2d03346

* Refactor captureDynamic to extract capture logic for each item in a different func

* snapshot the timestamp at the beginning of the capture

- timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures
- capture append to the file if it already exist

* Revert "snapshot the timestamp at the beginning of the capture"

This reverts commit c2d03346

* Refactor captureDynamic to extract capture logic for each item in a different func

* extract wait group outside the go routine to avoid a race condition

* capture pprof in a separate go routine

* perform a single capture for pprof data for the whole duration

* add missing vendor dependency

* add a change log and fix documentation to reflect the change

* create function for timestamp dir creation and simplify error handling

* use error groups and ticker to simplify interval capture loop

* Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval

* refactor Logs capture routine and add log capture specific test

* improve error reporting when log test fail

* change test duration to 1s

* make time parsing in log line more robust

* refactor log time format in a const

* test on log line empty the earliest possible and return

Co-authored-by: Freddy <freddygv@users.noreply.github.com>

* rename function to captureShortLived

* more specific changelog

Co-authored-by: Paul Banks <banks@banksco.de>

* update documentation to reflect current implementation

* add test for behavior when invalid param is passed to the command

* fix argument line in test

* a more detailed description of the new behaviour

Co-authored-by: Paul Banks <banks@banksco.de>

* print success right after the capture is done

* remove an unnecessary error check

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

* upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
Co-authored-by: Paul Banks <banks@banksco.de>
2021-06-07 13:00:51 -04:00
Dhia Ayachi 60d51a50d1
fix monitor to only start the monitor in json format when requested (#10358)
* fix monitor to only start the monitor in json format when requested

* add release notes

* add test to validate json format when requested
2021-06-07 12:08:48 -04:00
Matt Keeler 42007d4a94
Add license inspect command documentation and changelog (#10351)
Also reformatted another changelog entry.
2021-06-04 14:33:13 -04:00
Daniel Nephin 58116be6ca
Merge pull request #10338 from hashicorp/dnephin/fix-logging-indent
agent: remove leading whitespace from agent log lines
2021-06-03 12:16:45 -04:00
Paul Ewing e454a9aae0
usagemetrics: add cluster members to metrics API (#10340)
This PR adds cluster members to the metrics API. The number of members per
segment are reported as well as the total number of members.

Tested by running a multi-node cluster locally and ensuring the numbers were
correct. Also added unit test coverage to add the new expected gauges to
existing test cases.
2021-06-03 08:25:53 -07:00
Daniel Nephin d4c17f29a4 Add changelog 2021-06-02 17:39:30 -04:00
Daniel Nephin 0a39ba2c54 envoy: fix bootstrap deadlock caused by a full named pipe
Normally the named pipe would buffer up to 64k, but in some cases when a
soft limit is reached, they will start only buffering up to 4k.
In either case, we should not deadlock.

This commit changes the pipe-bootstrap command to first buffer all of
stdin into the process, before trying to write it to the named pipe.
This allows the process memory to act as the buffer, instead of the
named pipe.

Also changed the order of operations in `makeBootstrapPipe`. The new
test added in this PR showed that simply buffering in the process memory
was not enough to fix the issue. We also need to ensure that the
`pipe-bootstrap` process is started before we try to write to its
stdin. Otherwise the write will still block.

Also set stdout/stderr on the subprocess, so that any errors are visible
to the user.
2021-05-31 18:53:17 -04:00
Dhia Ayachi 0c13f80d5a
RPC Timeout/Retries account for blocking requests (#8978) 2021-05-27 17:29:43 -04:00
Matt Keeler b45dd03b8f
Bump raft-autopilot version to the latest. (#10306) 2021-05-27 12:59:14 -04:00
Dhia Ayachi 00f7e0772a
debug: remove the CLI check for debug_enabled (#10273)
* debug: remove the CLI check for debug_enabled

The API allows collecting profiles even debug_enabled=false as long as
ACLs are enabled. Remove this check from the CLI so that users do not
need to set debug_enabled=true for no reason.

Also:
- fix the API client to return errors on non-200 status codes for debug
  endpoints
- improve the failure messages when pprof data can not be collected

Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com>

* remove parallel test runs

parallel runs create a race condition that fail the debug tests

* Add changelog

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
2021-05-27 09:41:53 -04:00
Mike Morris da73899a10 changelog: add entry for network areas WAN config fix 2021-05-26 17:49:19 -04:00
Freddy c61e2bbda7
Ensure passthrough clusters can be created (#10301) 2021-05-26 15:05:14 -06:00
Freddy 7cfd7e9ec1
Avoid adding original_dst filter when not needed (#10302) 2021-05-26 15:04:45 -06:00
John Cowen f77615b872
ui: Unix Domain Socket support (#10287)
This commit adds UI support for Unix Domain Sockets for upstream and downstreams (see #9981 and #10252)
2021-05-26 17:52:25 +01:00
John Cowen 7083c39b96
ui: Support Route optional parameters/segments (#10212)
Moves our URLs with 'optional namespace segment' into a separately abstracted 'optional URL segment' feature
2021-05-26 17:43:46 +01:00
Rémi Lapeyre 5e33428b22
Always set the Content-Type header when a body is present (#10204)
* Always set the Content-Type header when a body is present

Closes https://github.com/hashicorp/consul/issues/10011

* Add Changelog entry

* Add more Content-Type exceptions

* Fix tests
2021-05-25 16:03:48 +01:00
Kenia e78e9de396
ui: Create and use collapsible notices component (#10270)
* Create and use collapsible notices

* Refactor collapsible-notices

* Split up the topology acceptance tests

* Add acceptance tests for tproxy notices

* Add component file

* Adds additional TProxy notices tests

* Adds conditional to only show collapsable if more than 2 notices are present

* Adds changelog

* Refactorting the conditonal for collapsing the notices

* Renaming undefinedIntention to be notDefinedIntention

* Refactor tests
2021-05-25 11:02:38 -04:00
Matt Keeler 7e4ea16149 Move some things around to allow for license updating via config reload
The bulk of this commit is moving the LeaderRoutineManager from the agent/consul package into its own package: lib/gort. It also got a renaming and its Start method now requires a context. Requiring that context required updating a whole bunch of other places in the code.
2021-05-25 09:57:50 -04:00