Commit Graph

21 Commits

Author SHA1 Message Date
Ryan Cragun b19617d955
enos: fix licensing on backported files (#24162)
Signed-off-by: Ryan Cragun <me@ryan.ec>
2023-11-16 12:59:47 -07:00
hc-github-team-secure-vault-core eb1376dc13
backport of commit a46def288f06cff8176399f239f87a2a49ba5dd9 (#23869)
Co-authored-by: Ryan Cragun <me@ryan.ec>
2023-10-26 21:36:50 +00:00
hc-github-team-secure-vault-core 7624576e39
backport of commit 9afd5e52ae31d6c3b7ab6833836647392bb318e6 (#23478)
Co-authored-by: Ryan Cragun <me@ryan.ec>
2023-10-03 19:29:40 +00:00
hc-github-team-secure-vault-core fd05101133
backport of commit 460b5de47b2b75b9cbeab06933f15774b7819d50 (#23358)
Co-authored-by: Ryan Cragun <me@ryan.ec>
2023-09-27 23:42:57 +00:00
Ryan Cragun d2db7fbcdd
Backport [QT-602] Run `proxy` and `agent` test scenarios (#23176) into release/1.14.x (#23302)
* [QT-602] Run `proxy` and `agent` test scenarios (#23176)

Update our `proxy` and `agent` scenarios to support new variants and
perform baseline verification and their scenario specific verification.
We integrate these updated scenarios into the pipeline by adding them
to artifact samples.

We've also improved the reliability of the `autopilot` and `replication`
scenarios by refactoring our IP address gathering. Previously, we'd ask
vault for the primary IP address and use some Terraform logic to determine
followers. The leader IP address gathering script was also implicitly
responsible for ensuring that a found leader was within a given group of
hosts, and thus waiting for a given cluster to have a leader, and also for
doing some arithmetic and outputting `replication` specific output data.
We've broken these responsibilities into individual modules, improved their
error messages, and fixed various races and bugs, including:
* Fix a race between creating the file audit device and installing and starting
  vault in the `replication` scenario.
* Fix how we determine our leader and follower IP addresses. We now query
  vault instead of a prior implementation that inferred the followers and sometimes
  did not allow all nodes to be an expected leader.
* Fix a bug where we'd always always fail on the first wrong condition
  in the `vault_verify_performance_replication` module.

We also performed some maintenance tasks on Enos scenarios  byupdating our
references from `oss` to `ce` to handle the naming and license changes. We
also enabled `shellcheck` linting for enos module scripts.

* Rename `oss` to `ce` for license and naming changes.
* Convert template enos scripts to scripts that take environment
  variables.
* Add `shellcheck` linting for enos module scripts.
* Add additional `backend` and `seal` support to `proxy` and `agent`
  scenarios.
* Update scenarios to include all baseline verification.
* Add `proxy` and `agent` scenarios to artifact samples.
* Remove IP address verification from the `vault_get_cluster_ips`
  modules and implement a new `vault_wait_for_leader` module.
* Determine follower IP addresses by querying vault in the
  `vault_get_cluster_ips` module.
* Move replication specific behavior out of the `vault_get_cluster_ips`
  module and into it's own `replication_data` module.
* Extend initial version support for the `upgrade` and `autopilot`
  scenarios.

We also discovered an issue with undo_logs that has been described in
the VAULT-20259. As such, we've disabled the undo_logs check until
it has been fixed.

* actions: fix actionlint error and linting logic (#23305)

Signed-off-by: Ryan Cragun <me@ryan.ec>
2023-09-27 10:53:12 -06:00
Ryan Cragun 9da2fc4b8b
test: wait for nc to be listening before enabling auditor (#23142) (#23150)
Rather than assuming a short sleep will work, we instead wait until netcat is listening of the socket. We've also configured the netcat listener to persist after the first connection, which allows Vault and us to check the connection without the process closing.

As we implemented this we also ran into AWS issues in us-east-1 and us-west-2, so we've changed our deploy regions until those issues are resolved.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2023-09-18 15:10:37 -06:00
hc-github-team-secure-vault-core f52a686b91
[QT-506] Use enos scenario samples for testing (#22641) (#22933)
Replace our prior implementation of Enos test groups with the new Enos
sampling feature. With this feature we're able to describe which
scenarios and variant combinations are valid for a given artifact and
allow enos to create a valid sample field (a matrix of all compatible
scenarios) and take an observation (select some to run) for us. This
ensures that every valid scenario and variant combination will
now be a candidate for testing in the pipeline. See QT-504[0] for further
details on the Enos sampling capabilities.

Our prior implementation only tested the amd64 and arm64 zip artifacts,
as well as the Docker container. We now include the following new artifacts
in the test matrix:
* CE Amd64 Debian package
* CE Amd64 RPM package
* CE Arm64 Debian package
* CE Arm64 RPM package

Each artifact includes a sample definition for both pre-merge/post-merge
(build) and release testing.

Changes:
* Remove the hand crafted `enos-run-matrices` ci matrix targets and replace
  them with per-artifact samples.
* Use enos sampling to generate different sample groups on all pull
  requests.
* Update the enos scenario matrices to handle HSM and FIPS packages.
* Simplify enos scenarios by using shared globals instead of
  cargo-culted locals.

Note: This will require coordination with vault-enterprise to ensure a
smooth migration to the new system. Integrating new scenarios or
modifying existing scenarios/variants should be much smoother after this
initial migration.

[0] https://github.com/hashicorp/enos/pull/102

Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
2023-09-08 13:31:09 -06:00
hc-github-team-secure-vault-core 6f2d433394
backport of commit 6ae9f8d4eddfdb134bcbabd3f58e633757a6afc9 (#22443)
Co-authored-by: Rebecca Willett <47540675+rebwill@users.noreply.github.com>
2023-08-18 16:30:36 +00:00
hc-github-team-secure-vault-core b4fa55858c
backport of commit 6654c425d2206624ff42cc7b7b92407a5e338311 (#22221)
Co-authored-by: Rebecca Willett <47540675+rebwill@users.noreply.github.com>
2023-08-08 11:11:03 -04:00
hc-github-team-secure-vault-core 04eed0b14c
backport of commit 6b21994d76b18c91397247dfd69bb01e46c5de25 (#21981)
Co-authored-by: Ryan Cragun <me@ryan.ec>
2023-07-20 20:51:07 +00:00
hc-github-team-secure-vault-core 324557f57e
enos: use on-demand targets (#21459) (#21464)
Add an updated `target_ec2_instances` module that is capable of
dynamically splitting target instances over subnet/az's that are
compatible with the AMI architecture and the associated instance type
for the architecture. Use the `target_ec2_instances` module where
necessary. Ensure that `raft` storage scenarios don't provision
unnecessary infrastructure with a new `target_ec2_shim` module.

After a lot of trial, the state of Ec2 spot instance capacity, their
associated APIs, and current support for different fleet types in AWS
Terraform provider, have proven to make using spot instances for
scenario targets too unreliable.

The current state of each method:
* `target_ec2_fleet`: unusable due to the fact that the `instant` type
  does not guarantee fulfillment of either `spot` or `on-demand`
  instance request types. The module does support both `on-demand` and
  `spot` request types and is capable of bidding across a maximum of
  four availability zones, which makes it an attractive choice if the
  `instant` type would always fulfill requests. Perhaps a `request` type
  with `wait_for_fulfillment` option like `aws_spot_fleet_request` would
  make it more viable for future consideration.
* `target_ec2_spot_fleet`: more reliable if bidding for target instances
  that have capacity in the chosen zone. Issues in the AWS provider
  prevent us from bidding across multiple zones succesfully. Over the
  last 2-3 months target capacity for the instance types we'd prefer to
  use has dropped dramatically and the price is near-or-at on-demand.
  The volatility for nearly no cost savings means we should put this
  option on the shelf for now.
* `target_ec2_instances`: the most reliable method we've got. It is now
  capable of automatically determing which subnets and availability
  zones to provision targets in and has been updated to be usable for
  both Vault and Consul targets. By default we use the cheapest medium
  instance types that we've found are reliable to test vault.

* Update .gitignore
* enos/modules/create_vpc: create a subnet for every availability zone
* enos/modules/target_ec2_fleet: bid across the maximum of four
  availability zones for targets
* enos/modules/target_ec2_spot_fleet: attempt to make the spot fleet bid
  across more availability zones for targets
* enos/modules/target_ec2_instances: create module to use
  ec2:RunInstances for scenario targets
* enos/modules/target_ec2_shim: create shim module to satisfy the
  target module interface
* enos/scenarios: use target_ec2_shim for backend targets on raft
  storage scenarios
* enos/modules/az_finder: remove unsed module

Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
2023-06-26 16:54:39 -06:00
hc-github-team-secure-vault-core 58287739ec
backport of commit 5de6af60760dbcbefd8c8e4eb923f74a5720cf13 (#21440)
Co-authored-by: Ryan Cragun <me@ryan.ec>
2023-06-23 04:48:54 +00:00
hc-github-team-secure-vault-core be67c16299
backport of commit 8d22142a3e9d13435b1a65685317fefba7e2f5b3 (#21421)
Co-authored-by: Ryan Cragun <me@ryan.ec>
2023-06-22 22:14:22 +00:00
hc-github-team-secure-vault-core 7d024b4f4e
backport of commit 2ec5a28f51fe0b5095a0554627fb3295c7f2ccb4 (#21148)
Co-authored-by: Ryan Cragun <me@ryan.ec>
2023-06-12 17:19:46 +00:00
hc-github-team-secure-vault-core 93af7c8756
backport of commit 27621e05d63ae14475e7a5ec8e8f23277d9eeb98 (#21137)
Co-authored-by: Ryan Cragun <me@ryan.ec>
2023-06-12 16:46:11 +00:00
Mike Baum d323aa33df
Backport of audit file changes to release/1.14.x (#20985) 2023-06-05 11:46:59 -04:00
Ryan Cragun a19f7dbda5
[QT-525] enos: use spot instances for Vault targets (#20037)
The previous strategy for provisioning infrastructure targets was to use
the cheapest instances that could reliably perform as Vault cluster
nodes. With this change we introduce a new model for target node
infrastructure. We've replaced on-demand instances for a spot
fleet. While the spot price fluctuates based on dynamic pricing, 
capacity, region, instance type, and platform, cost savings for our
most common combinations range between 20-70%.

This change only includes spot fleet targets for Vault clusters.
We'll be updating our Consul backend bidding in another PR.

* Create a new `vault_cluster` module that handles installation,
  configuration, initializing, and unsealing Vault clusters.
* Create a `target_ec2_instances` module that can provision a group of
  instances on-demand.
* Create a `target_ec2_spot_fleet` module that can bid on a fleet of
  spot instances.
* Extend every Enos scenario to utilize the spot fleet target acquisition
  strategy and the `vault_cluster` module.
* Update our Enos CI modules to handle both the `aws-nuke` permissions
  and also the privileges to provision spot fleets.
* Only use us-east-1 and us-west-2 in our scenario matrices as costs are
  lower than us-west-1.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2023-04-13 15:44:43 -04:00
Hamid Ghaf 27bb03bbc0
adding copyright header (#19555)
* adding copyright header

* fix fmt and a test
2023-03-15 09:00:52 -07:00
Jaymala af957e7c9c
Add Vault log level support (#19083)
Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>
2023-02-08 17:41:16 -05:00
Jaymala cdae007e30
Fix arch for backend storage in Enos replication scenario (#18741)
Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>
2023-01-17 22:28:30 +00:00
Jaymala ca18e2fffe
[QT-19] Enable Enos replication scenario (#17748)
* Add initial replication scenario config

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Add support for replication with different backend and seal types

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Update Consul versions

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Additional config for replicaiton scenario

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Update replication scenario modules

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Refactor replication modules

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Add more steps for replication

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Work in progress with unsealing followers on secondary cluster

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Add more replication scenario steps

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* More updates

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Working shamir scenario

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Update to unify get Vault IP module

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Remove duplicate module

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Fix race condition for secondary followers unseal

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Use consistent naming for module directories

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Update replication scenario with latest test matrix

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Verify replication with awskms

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Add write and retrive data support for all scenarios

* Update all scenarios to verify write and read kv data

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Fix write and read data modules

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Add comments explaining the module run

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Address review feedback and update consul version

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Address more review feedback

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Remove vault debug logging

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Exclude ent.fips1402 and ent.hsm.fips1402 packages from Enos test matrix

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Add verification for replication connection status

* Currently this verification fails on Consul due to VAULT-12332

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Add replication scenario to Enos README

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Update README as per review suggesstions

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* [QT-452] Add recovery keys to scenario outputs

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Fix replication output var

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

* Fix autopilot scenario deps and add retry for read data

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>

Signed-off-by: Jaymala Sinha <jaymala@hashicorp.com>
2023-01-13 11:43:26 -05:00