* integer values for some log flags
* Adjusted `log_flags` to expect `int` for max files and max bytes
* Updated `server` and `agent`
Renamed updateConfig (and updateLogConfig)
* Added int log params to test
* Adjust config/params so we can identify when they're not present
* Removed pointer confusion
* add core state lockd eadlock detection config option v2
* add changelog
* split out NewTestCluster function to maintain build flag
* replace long func with constant
* remove line
* rename file, and move where detect deadlock flag is set
* add Link config, init, and capabilities
* add node status proto
* bump protoc version to 3.21.9
* make proto
* adding link tests
* remove wrapped link
* add changelog entry
* update changelog entry
Move version out of SDK. For now it's a copy rather than move: the part not addressed by this change is sdk/helper/useragent.String, which we'll want to remove in favour of PluginString. That will have to wait until we've removed uses of useragent.String from all builtins.
Create global quotas of each type in every NewTestCluster. Also switch some key locks to use DeadlockMutex to make it easier to discover deadlocks in testing.
NewTestCluster also now starts the cluster, and the Start method becomes a no-op. Unless SkipInit is provided, we also wait for a node to become active, eliminating the need for WaitForActiveNode. This was needed because otherwise we can't safely make the quota api call. We can't do it in Start because Start doesn't return an error, and I didn't want to begin storing the testing object T instead TestCluster just so we could call t.Fatal inside Start.
The last change here was to address the problem of how to skip setting up quotas when creating a cluster with a nonstandard handler that might not even implement the quotas endpoint. The challenge is that because we were taking a func pointer to generate the real handler func, we didn't have any way to compare that func pointer to the standard handler-generating func http.Handler without creating a circular dependency between packages vault and http. The solution was to pass a method instead of an anonymous func pointer so that we can do reflection on it.
* Work to unify log-file for agent/server and add rotation
* Updates to rotation code, tried to centralise the log config setup
* logging + tests
* Move LogFile to ShareConfig in test
* Docs
* Rename integation_test.go->integration_test.go
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add ability to fetch container's network addresses
This lets us return the on-network container address, allowing us to
spawn client containers which contact server containers.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add integration tests with nginx, curl, wget, Go
We build new integration tests, spawning a test instance on nginx and
ensuring we can connect with a variety of clients against a variety of
CA and leaf certificate types. This will ultimately let us detect issues
with compatibility as we expand the matrix of supported servers and
clients.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Make runner reference unique
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Attempt to fix CI with longer wait
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Finish moving nginx tests to pkiext package
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* make fmt
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add more debugging, work on CircleCI
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Removes _builtin_ versions from mount storage where it already exists
* Stops new builtin versions being put into storage on mount creation/tuning
* Stops the plugin catalog from returning a builtin plugin that has been overridden, so it more accurately reflects the plugins that are available to actually run
* Started work on adding log-file support to Agent
* Allow log file to be picked up and appended
* Use NewLogFile everywhere
* Tried to pull out the config aggregation from Agent.Run
Co-authored-by: Nick Cabatoff <ncabatoff@hashicorp.com>
Give the default SetupLoginMFATOTP helper a more robust period/skew. 403 failures on test-go-race are likely due to TOTP code timeouts being too aggressive.
When running the test suite in CI (where requests are centralized from
relatively few IPs), we'd occasionally hit Dockerhub's rate limits.
Luckily Hashicorp runs a (limited) public mirror of the containers we
need, so we can switch to them here in the tests.
For consistency between developer and CI, we've opted to have the tests
always pull from the Hashicorp mirror, rather than updating the CI
runner to prefer the mirror.
We exclude nomad and influxdb as we don't presently mirror these repos.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Store login MFA secret with tokenhelper
* Clean up and refactor tokenhelper paths
* Refactor totp test code for re-use
* Add login MFA command tests
* Use longer sleep times and sha512 for totp test
* Add changelog
* Add tests for zlint-clean CA building
This test ensures that we can consistently pass ZLint's CA linting
tests on a root certificate generated by Vault. In particular, nominal
requirements are placed on the structure on the issuer's Subject, which
we supply, and the remaining requirements pass.
The one exception is we include both RFC and CA/BF BR lints in the
default zlint checks; this means ECDSA P-521 (which isn't accepted by
Mozilla's root store policies) is rejected, so we ignore to lints
related to that.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Add options to copy to/from container, fix stopping
Stopping the container takes a bit of time for some unknown reason so
I've instead opted to shorten the sleep in the zlint tests to avoid
consuming resources too long after the test finish.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Make zlint tests execute in parallel
This improves the overall test time of the zlint tests, making the
container build up front once (provisioning zlint), and then copying the
cert into the new container image later.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* make fmt
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Refactor Docker command execution
This refactor will allow others to interact with containers more easily,
providing two interfaces (RunCmdWithOutput and RunCmdInBackground) for
executing commands in running containers if they don't wish to do so
manually.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow building containerfiles in tests
By adding image building capabilities to testhelpers (and coupled with
the better command execution support), we can begin to build better,
more reliable integration tests on top of public base images without
needing to maintain separate forks of these images out-of-tree for any
shortcomings they might have.
In particular, rather than doing the rather messy echo hack for writing
clients.conf, it is far better to provision this via a slim
Containerfile overlay on top of the stock jumanjiman/radiusd:latest
image.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Correctly parse stdout/stderr in RunCmdWithOutput
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* ctx -> bCtx for BuildContext
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Update errors to use %w instead of %v
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Add plugin version to GRPC interface
Added a version interface in the sdk/logical so that it can be shared between all plugin types, and then wired it up to RunningVersion in the mounts, auth list, and database systems.
I've tested that this works with auth, database, and secrets plugin types, with the following logic to populate RunningVersion:
If a plugin has a PluginVersion() method implemented, then that is used
If not, and the plugin is built into the Vault binary, then the go.mod version is used
Otherwise, the it will be the empty string.
My apologies for the length of this PR.
* Placeholder backend should be external
We use a placeholder backend (previously a framework.Backend) before a
GRPC plugin is lazy-loaded. This makes us later think the plugin is a
builtin plugin.
So we added a `placeholderBackend` type that overrides the
`IsExternal()` method so that later we know that the plugin is external,
and don't give it a default builtin version.
* Allow exposing access to the underlying container
This exposes the Container response from the Docker API, allowing
consumers of the testhelper to interact with the newly started running
container instance. This will be useful for two reasons:
1. Allowing radiusd container to start its own daemon after modifying
its configuration.
2. For loading certificates into a future similar integration test
using the PKI secrets engine.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* Allow any client to connect to test radiusd daemon
This fixes test failures of the following form:
> 2022-09-07T10:46:19.332-0400 [TRACE] core: adding local paths: paths=[]
> 2022-09-07T10:46:19.333-0400 [INFO] core: enabled credential backend: path=mnt/ type=test
> 2022-09-07T10:46:19.334-0400 [WARN] Executing test step: step_number=1
> 2022-09-07T10:46:19.334-0400 [WARN] Executing test step: step_number=2
> 2022-09-07T10:46:29.334-0400 [WARN] Executing test step: step_number=3
> 2022-09-07T10:46:29.335-0400 [WARN] Executing test step: step_number=4
> 2022-09-07T10:46:39.336-0400 [WARN] Requesting RollbackOperation
> --- FAIL: TestBackend_acceptance (28.56s)
> testing.go:364: Failed step 4: erroneous response:
>
> &logical.Response{Secret:<nil>, Auth:<nil>, Data:map[string]interface {}{"error":"context deadline exceeded"}, Redirect:"", Warnings:[]string(nil), WrapInfo:(*wrapping.ResponseWrapInfo)(nil), Headers:map[string][]string(nil)}
> FAIL
> FAIL github.com/hashicorp/vault/builtin/credential/radius 29.238s
In particular, radiusd container ships with a default clients.conf which
restricts connections to ranges associated with the Docker daemon. When
creating new networks (such as in CircleCI) or when running via Podman
(which has its own set of network ranges), this initial config will no
longer be applicable. We thus need to write a new config into the image;
while we could do this by rebuilding a new image on top of the existing
layers (provisioning our config), we then need to manage these changes
and give hooks for the service setup to build it.
Thus, post-startup modification is probably easier to execute in our
case.
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
Signed-off-by: Alexander Scheel <alex.scheel@hashicorp.com>
* OSS portion of wrapper-v2
* Prefetch barrier type to avoid encountering an error in the simple BarrierType() getter
* Rename the OveriddenType to WrapperType and use it for the barrier type prefetch
* Fix unit test
* storage/raft: Fix cluster init with retry_join
Commit 8db66f4853abce3f432adcf1724b1f237b275415 introduced an error
wherein a join() would return nil (no error) with no information on its
channel if a joining node had been initialized. This was not handled
properly by the caller and resulted in a canceled `retry_join`.
Fix this by handling the `nil` channel respone by treating it as an
error and allowing the existing mechanics to work as intended.
* storage/raft: Improve retry_join go test
* storage/raft: Make VerifyRaftPeers pollable
* storage/raft: Add changelog entry for retry_join fix
* storage/raft: Add description to VerifyRaftPeers
strings.ReplaceAll(s, old, new) is a wrapper function for
strings.Replace(s, old, new, -1). But strings.ReplaceAll is more
readable and removes the hardcoded -1.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
* Update gopsutil to v3
* Adds v2 field names in host-info response to allow eventual deprecation in favor of v3 field names
* Map v3 to v2 field names to keep host-info api compat
* copy gopsutil license into source
* raft: Ensure init before setting suffrage
As reported in https://hashicorp.atlassian.net/browse/VAULT-6773:
The /sys/storage/raft/join endpoint is intended to be unauthenticated. We rely
on the seal to manage trust.
It’s possible to use multiple join requests to switch nodes from voter to
non-voter. The screenshot shows a 3 node cluster where vault_2 is the leader,
and vault_3 and vault_4 are followers with non-voters set to false. sent two
requests to the raft join endpoint to have vault_3 and vault_4 join the cluster
with non_voters:true.
This commit fixes the issue by delaying the call to SetDesiredSuffrage until after
the initialization check, preventing unauthenticated mangling of voter status.
Tested locally using
https://github.com/hashicorp/vault-tools/blob/main/users/ncabatoff/cluster/raft.sh
and the reproducer outlined in VAULT-6773.
* raft: Return join err on failure
This is necessary to correctly distinguish errors returned from the Join
workflow. Previously, errors were being masked as timeouts.
* raft: Default autopilot parameters in teststorage
Change some defaults so we don't have to pass in parameters or set them
in the originating tests. These storage types are only used in two
places:
1) Raft HA testing
2) Seal migration testing
Both consumers have been tested and pass with this change.
* changelog: Unauthn voter status change bugfix
* VAULT-6613 add DetermineRoleFromLoginRequest function to Core
* Fix body handling
* Role resolution for rate limit quotas
* VAULT-6613 update precedence test
* Add changelog
* VAULT-6614 start of changes for roles in LCQs
* Expiration changes for leases
* Add role information to RequestAuth
* VAULT-6614 Test updates
* VAULT-6614 Add expiration test with roles
* VAULT-6614 fix comment
* VAULT-6614 Protobuf on OSS
* VAULT-6614 Add rlock to determine role code
* VAULT-6614 Try lock instead of rlock
* VAULT-6614 back to rlock while I think about this more
* VAULT-6614 Additional safety for nil dereference
* VAULT-6614 Use %q over %s
* VAULT-6614 Add overloading to plugin backends
* VAULT-6614 RLocks instead
* VAULT-6614 Fix return for backend factory
Add database plugin metrics around connections
This is a replacement for #15923 that takes into account recent lock
cleanup.
I went ahead and added back in the hanging plugin test, which I meant to
add in #15944 but forgot.
I tested this by spinning up a statsd sink in the tests and verifying I
got a stream of metrics:
```
$ nc -u -l 8125 | grep backend
test.swenson-Q9Q0L72D39.secrets.database.backend.connections.count.pgx.5.:1.000000|g
test.swenson-Q9Q0L72D39.secrets.database.backend.connections.count.pgx.5.:0.000000|g
test.swenson-Q9Q0L72D39.secrets.database.backend.connections.count.pgx.5.:1.000000|g
test.swenson-Q9Q0L72D39.secrets.database.backend.connections.count.pgx.5.:0.000000|g
```
We have to rework the shared gauge code to work without a full
`ClusterMetricSink`, since we don't have access to the core metrics from
within a plugin.
This only reports metrics every 10 minutes by default, but it solves
some problems we would have had with the gauge values becoming stale and
needing to be re-sent.
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>