## Backport
This PR is auto-generated from #17754 to be assessed for backporting due
to the inclusion of the label backport/1.16.
🚨
>**Warning** automatic cherry-pick of commits failed. If the first
commit failed,
you will see a blank no-op commit below. If at least one commit
succeeded, you
will see the cherry-picked commits up to, _not including_, the commit
where
the merge conflict occurred.
The person who merged in the original PR is:
@WenInCode
This person should manually cherry-pick the original PR into a new
backport PR,
and close this one when the manual backport PR is merged in.
> merge conflict error: unable to process merge commit:
"1c757b8a2c1160ad53421b7b8bd7f74b205c4b89", automatic backport requires
rebase workflow
The below text is copied from the body of the original PR.
---
fixes#17097 Consul version of each nodes in UI nodes section
@jkirschner-hashicorp @huikang @team @Maintainers
Updated consul version in the request to register consul.
Added this as Node MetaData.
Fetching this new metadata in UI
<img width="1512" alt="Screenshot 2023-06-15 at 4 21 33 PM"
src="https://github.com/hashicorp/consul/assets/3139634/94f7cf6b-701f-4230-b9f7-d8c4342d0737">
Also made this backward compatible and tested.
Backward compatible in this context means - If consul binary with above
PR changes is deployed to one of node, and if UI is run from this node,
then the version of not only current (upgraded) node is displayed in UI
, but also of older nodes given that they are consul servers only.
For older (non-server or client) nodes the version is not added in
NodeMeta Data and hence the version will not be displayed for them.
If a old node is consul server, the version will be displayed. As the
endpoint - "v1/internal/ui/nodes?dc=dc1" was already returning version
in service meta. This is made use of in current UI changes.
<img width="1480" alt="Screenshot 2023-06-16 at 6 58 32 PM"
src="https://github.com/hashicorp/consul/assets/3139634/257942f4-fbed-437d-a492-37849d2bec4c">
---
<details>
<summary> Overview of commits </summary>
- 931fdfc7ecdc26bb7cc20b698c5e14c1b65fcc6e -
b3e2ec1ccaca3832a088ffcac54257fa6653c6c1 -
8d0e9a54907039c09330c6cd7b9e761566af6856 -
04e5d88cca37821f6667be381c16aaa5958b5c92 -
28286a2e98f8cd66ef8593c2e2893b4db6080417 -
43e50ad38207952a9c4d04d45d08b6b8f71b31fe -
0cf1b7077cdf255596254d9dc1624a269c42b94d -
27f34ce1c2973591f75b1e38a81ccbe7cee6cee3 -
2ac76d62b8cbae76b1a903021aebb9b865e29d6e -
3d618df9ef1d10dd5056c8b1ed865839c553a0e0 -
1c757b8a2c1160ad53421b7b8bd7f74b205c4b89 -
23ce82b4cee8f74dd634dbe145313e9a56c0077d -
4dc1c9b4c5aafdb8883ef977dfa9b39da138b6cb -
85a12a92528bfa267a039a9bb258170be914abf7 -
25d30a3fa980d130a30d445d26d47ef2356cb553 -
7f1d6192dce3352e92307175848b89f91e728c24 -
5174cbff84b0795d4cb36eb8980d0d5336091ac9
</details>
---------
Co-authored-by: Vijay Srinivas <vijayraghav22@gmail.com>
Co-authored-by: John Murret <john.murret@hashicorp.com>
Co-authored-by: Jared Kirschner <85913323+jkirschner-hashicorp@users.noreply.github.com>
Use only the agent token for deregistration during anti-entropy
The previous behavior had the agent attempt to use the "service" token
(i.e. from the `token` field in a service definition file), and if that
was not set then it would use the agent token.
The previous behavior was problematic because, if the service token had
been deleted, the deregistration request would fail. The agent would
retry the deregistration during each anti-entropy sync, and the
situation would never resolve.
The new behavior is to only/always use the agent token for service and
check deregistration during anti-entropy. This approach is:
* Simpler: No fallback logic to try different tokens
* Faster (slightly): No time spent attempting the service token
* Correct: The agent token is able to deregister services on that
agent's node, because:
* node:write permissions allow deregistration of services/checks on
that node.
* The agent token must have node:write permission, or else the agent
is not be able to (de)register itself into the catalog
Co-authored-by: Vesa Hagström <weeezes@gmail.com>
Having this type live in the agent/consul package makes it difficult to
put anything that relies on token resolution (e.g. the new gRPC services)
in separate packages without introducing import cycles.
For example, if package foo imports agent/consul for the ACLResolveResult
type it means that agent/consul cannot import foo to register its service.
We've previously worked around this by wrapping the ACLResolver to
"downgrade" its return type to an acl.Authorizer - aside from the
added complexity, this also loses the resolved identity information.
In the future, we may want to move the whole ACLResolver into the
acl/resolver package. For now, putting the result type there at least,
fixes the immediate import cycle issues.
The race detector noticed this initially in `TestAgentConfigWatcherSidecarProxy` but it is not restricted to just tests.
The two main changes here were:
- ensure that before we mutate the internal `agent/local` representation of a Service (for tags or VIPs) we clone those fields
- ensure that there's no function argument joint ownership between the caller of a function and the local state when calling `AddService`, `AddCheck`, and related using `copystructure` for now.
set -euo pipefail
unset CDPATH
cd "$(dirname "$0")"
for f in $(git grep '\brequire := require\.New(' | cut -d':' -f1 | sort -u); do
echo "=== require: $f ==="
sed -i '/require := require.New(t)/d' $f
# require.XXX(blah) but not require.XXX(tblah) or require.XXX(rblah)
sed -i 's/\brequire\.\([a-zA-Z0-9_]*\)(\([^tr]\)/require.\1(t,\2/g' $f
# require.XXX(tblah) but not require.XXX(t, blah)
sed -i 's/\brequire\.\([a-zA-Z0-9_]*\)(\(t[^,]\)/require.\1(t,\2/g' $f
# require.XXX(rblah) but not require.XXX(r, blah)
sed -i 's/\brequire\.\([a-zA-Z0-9_]*\)(\(r[^,]\)/require.\1(t,\2/g' $f
gofmt -s -w $f
done
for f in $(git grep '\bassert := assert\.New(' | cut -d':' -f1 | sort -u); do
echo "=== assert: $f ==="
sed -i '/assert := assert.New(t)/d' $f
# assert.XXX(blah) but not assert.XXX(tblah) or assert.XXX(rblah)
sed -i 's/\bassert\.\([a-zA-Z0-9_]*\)(\([^tr]\)/assert.\1(t,\2/g' $f
# assert.XXX(tblah) but not assert.XXX(t, blah)
sed -i 's/\bassert\.\([a-zA-Z0-9_]*\)(\(t[^,]\)/assert.\1(t,\2/g' $f
# assert.XXX(rblah) but not assert.XXX(r, blah)
sed -i 's/\bassert\.\([a-zA-Z0-9_]*\)(\(r[^,]\)/assert.\1(t,\2/g' $f
gofmt -s -w $f
done
* clone the service under lock to avoid a data race
* add change log
* create a struct and copy the pointer to mutate it to avoid a data race
* fix failing test
* revert added space
* add comments, to clarify the data race.
Error messages related to service and check operations previously included
the following substrings:
- service %q
- check %q
From this error message, it isn't clear that the expected field is the ID for
the entity, not the name. For example, if the user has a service named test,
the error message would read 'Unknown service "test"'. This is misleading -
a service with that *name* does exist, but not with that *ID*.
The substrings above have been modified to make it clear that ID is needed,
not name:
- service with ID %q
- check with ID %q
* fix tests to use a dummy nodeName and not fail when hostname is not a valid nodeName
* remove conditional testing
* add test when node name is invalid
Only default to the user token and agent token for the sync. Change the
exported methods to only return the stored tokens associated with a
specific check or service.
Add a skip condition to all tests slower than 100ms.
This change was made using `gotestsum tool slowest` with data from the
last 3 CI runs of master.
See https://github.com/gotestyourself/gotestsum#finding-and-skipping-slow-tests
With this change:
```
$ time go test -count=1 -short ./agent
ok github.com/hashicorp/consul/agent 0.743s
real 0m4.791s
$ time go test -count=1 -short ./agent/consul
ok github.com/hashicorp/consul/agent/consul 4.229s
real 0m8.769s
```
There are a couple of things in here.
First, just like auto encrypt, any Cluster.AutoConfig RPC will implicitly use the less secure RPC mechanism.
This drastically modifies how the Consul Agent starts up and moves most of the responsibilities (other than signal handling) from the cli command and into the Agent.
In current implementation of Consul, check alias cannot determine
if a service exists or not. Because a service without any check
is semantically considered as passing, so when no healthchecks
are found for an agent, the check was considered as passing.
But this make little sense as the current implementation does not
make any difference between:
* a non-existing service (passing)
* a service without any check (passing as well)
In order to make it work, we have to ensure that when a check did
not find any healthcheck, the service does indeed exists. If it
does not, lets consider the check as failing.
* Fix segfault when removing both a service and associated check
updateSyncState creates entries in the services and checks maps for
remote services/checks that are not found locally, so that we can then
make sure to delete them in our reconciliation process. However, the
values added to the map are missing key fields that the rest of the code
expects to not be nil.
* Add comment stating Check field can be nil
* Store primaries root in secondary after intermediate signature
This ensures that the intermediate exists within the CA root stored in raft and not just in the CA provider state. This has the very nice benefit of actually outputting the intermediate cert within the ca roots HTTP/RPC endpoints.
This change means that if signing the intermediate fails it will not set the root within raft. So far I have not come up with a reason why that is bad. The secondary CA roots watch will pull the root again and go through all the motions. So as soon as getting an intermediate CA works the root will get set.
* Make TestAgentAntiEntropy_Check_DeferSync less flaky
I am not sure this is the full fix but it seems to help for me.
* First conversion
* Use serf 0.8.2 tag and associated updated deps
* * Move freeport and testutil into internal/
* Make internal/ its own module
* Update imports
* Add replace statements so API and normal Consul code are
self-referencing for ease of development
* Adapt to newer goe/values
* Bump to new cleanhttp
* Fix ban nonprintable chars test
* Update lock bad args test
The error message when the duration cannot be parsed changed in Go 1.12
(ae0c435877d3aacb9af5e706c40f9dddde5d3e67). This updates that test.
* Update another test as well
* Bump travis
* Bump circleci
* Bump go-discover and godo to get rid of launchpad dep
* Bump dockerfile go version
* fix tar command
* Bump go-cleanhttp