docs: Update load test documentation and minor clean ups (#9548)

s-christoff 2021-01-15 12:41:06 -06:00 committed by GitHub
parent 45c0a71e7e
commit 8fc4de0ead
11 changed files with 91 additions and 70 deletions


@ -1,2 +1,32 @@
## Terraform Consul Load Testing
Consul Load Testing is used to capture baseline performance metrics for Consul under stress. This will assist in ensuring there are no performance regressions during releases and substainal changes to Consul. Per the defaults, the test runs for 10 minutes with 25 virtual users spawned by k6. Instance sizes for test instance are `t2.small` and for the Consul cluster `m5n.large`. As long as the thresholds stay under 2 seconds, it passes. All metrics from Consul are pushed to a datadog dashboard for user review.
# Terraform Consul Load Testing
Consul Load Testing is used to capture baseline performance metrics for Consul under stress. This helps ensure there are no performance regressions during releases and substantial changes to Consul. By default, the test runs for 10 minutes with 25 virtual users spawned by k6. The default instance size is `t2.small` for the test instances and `m5n.large` for the Consul cluster. All metrics from Consul are pushed to a Datadog dashboard for user review.
This relies on [Gruntwork's Terraform AWS Consul Module](https://github.com/hashicorp/terraform-aws-consul), which *by default* creates 3 Consul servers across 3 availability zones. A load test instance, built from an image configured with the necessary scripts and [k6](https://k6.io/), is created and sends traffic to a load balancer. The load balancer distributes requests across the Consul clients, which ultimately forward the requests to the servers.
<img src="loadtestdiagram.png" width="500" height="300"/>
# Load Test Automation
This can only be run on PRs for which a Dev Build has been made. When the `pr/load-test` GitHub label is applied to a PR, it kicks off a GitHub Action. This GitHub Action triggers CircleCI to run a Terraform Apply that runs a load test against the Dev Build Consul binary. The GitHub Action posts the CircleCI load test workflow URL to the PR as a comment.
## How to use
[Terraform](https://www.terraform.io/downloads.html), [Packer](https://www.packer.io/downloads), AWS, and [Datadog](https://docs.datadoghq.com/getting_started/) are required to use this. All of these, except the AWS resources that will be utilized, are free.
This repo has the following folder structure:
* packer: This contains everything needed to build the load test AMI and the Consul AMI used by Terraform.
* terraform: This contains all the relevant Terraform files.
## Getting Started
1) Download all necessary tools listed (Terraform, Packer)
2) Set up an [AWS account](https://aws.amazon.com/premiumsupport/knowledge-center/create-and-activate-aws-account/) and a [Datadog account](https://docs.datadoghq.com/getting_started/) - downloading the Datadog client is not necessary.
3) Configure your AWS credentials using one of the [options supported by the AWS
SDK](http://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html). Usually, the easiest option is to
set the `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_DEFAULT_REGION` environment variables.
4) Follow the [Packer README](./packer/README.md) to generate your load test and Consul AMIs
5) Follow the [Terraform README](./terraform/README.md) to stand up the infrastructure in AWS
6) Watch the results in either your Datadog dashboard or the output of your Terraform Apply
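
Before starting steps 4 and 5, it can help to confirm that the tools and credentials from steps 1 to 3 are in place. The following is a minimal sketch, not part of this repo: the function names `missing_tools` and `missing_env_vars` are hypothetical, and it assumes the credentials are supplied through exported environment variables rather than another mechanism supported by the AWS SDK.

```shell
# Print the name of each required tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Print the name of each required variable that is unset, empty,
# or not exported (printenv only sees exported variables).
missing_env_vars() {
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "$name"
    fi
  done
}

# Example usage (prints nothing when everything is in place):
#   missing_tools terraform packer
#   missing_env_vars AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION
```

Both functions only report what is missing and always return successfully, so they can be run at any point without interrupting a shell session.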
## Debugging in Datadog
Consul has hundreds of metrics to choose from. We recommend reading over [Datadog's article](https://www.datadoghq.com/blog/consul-metrics/#communication-metrics) that breaks down relevant Consul metrics, especially the "communication" section. This test works solely by making requests to endpoints, which is what makes that section so relevant.

BIN
test/load/loadtestdiagram.png (Stored with Git LFS) Normal file

Binary file not shown.

3
test/load/packer/.gitignore vendored Normal file

@ -0,0 +1,3 @@
*.hwm
*.pwd
*.pwi


@ -0,0 +1,27 @@
# Terraform Consul Load Testing
Packer will output AMI IDs when it completes - save these AMI IDs as Terraform will require them later.
```
==> Builds finished. The artifacts of successful builds are:
--> amazon-ebs: AMIs were created:
us-east-1: ami-19601070
```
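
One way to avoid copying the AMI IDs by hand is to save the build output to a file and filter it. This is a rough sketch that assumes the output format shown above; `extract_ami_ids` and `packer.log` are hypothetical names, not part of this repo.

```shell
# Read Packer build output on stdin and print each distinct AMI ID found.
# Matches lowercase "ami-" followed by hex digits, so the "AMIs were created"
# header line is ignored.
extract_ami_ids() {
  grep -oE 'ami-[0-9a-f]+' | sort -u
}

# Example usage, after saving the build output to a file:
#   packer build consul.json | tee packer.log
#   extract_ami_ids < packer.log
```

With the sample output above, the function would print `ami-19601070`.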
## Consul AMI:
Within the `consul-ami/` directory
1) Retrieve your [Datadog API key](https://docs.datadoghq.com/account_management/api-app-keys/#api-keys) and set it as an environment variable, e.g. `export DD_API_KEY=$YOURDDAPIKEYHERE`
2) Set `AWS_DEFAULT_REGION` for Packer, e.g. `export AWS_DEFAULT_REGION=us-east-1`
3) Run `packer build consul.json`.
For additional customization you can add [tags](https://docs.datadoghq.com/getting_started/tagging/assigning_tags/?tab=noncontainerizedenvironments) within the `scripts/datadog.yaml` file. An example of a tag could be `"consul_version" : "consulent_175"`. These tags are searchable through the Datadog dashboard. Another form of customization is changing the datacenter tag within `scripts/telemetry.json`, which defaults to `us-east-1`.
## Load Test AMI
Within the `loadtest-ami/` directory
1) Set `AWS_DEFAULT_REGION` for Packer, e.g. `export AWS_DEFAULT_REGION=us-east-1`
2) Run the command `packer build loadtest.json`
The script that k6 runs is found within `scripts/loadtest.js`. This script can be updated to send requests to more Consul endpoints. For additional information on k6 please check out their [guides](https://k6.io/docs/getting-started/running-k6).


@ -1,16 +0,0 @@
# Consul AMI
## Quick start
To build the Consul AMI:
1. `git clone` this repo to your computer.
2. Install [Packer](https://www.packer.io/).
3. Configure your AWS credentials using one of the [options supported by the AWS
SDK](http://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html). Usually, the easiest option is to
set the `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_DEFAULT_REGION` environment variables.
4. Update the `variables` section of the `consul.json` Packer template to configure the AWS region and datadog api key you would like to use. Feel free to reference this article to find your [datadog API key](https://docs.datadoghq.com/account_management/api-app-keys/#api-keys).
5. For additional customization you can add [tags](https://docs.datadoghq.com/getting_started/tagging/assigning_tags/?tab=noncontainerizedenvironments) within the `scripts/datadog.yaml` file. One example of a tag could be `"consul_version" : "consulent_175"`. These tags are searchable through the datadog dashboard. Another form of customization is changing the datacenter tag within `scripts/telemetry.json`, however it is defaulted to `us-east-1`.
6. Run `packer build consul.json`.
When the build finishes, it will output the IDs of the new AMI. Add this AMI ID in the `consul_ami_id` variable in the `vars.tfvars` file.


@ -2,8 +2,6 @@
"min_packer_version": "1.5.4",
"variables": {
"aws_region": "{{env `AWS_DEFAULT_REGION`}}",
"consul_version": "1.5.1",
"download_url": "{{env `CONSUL_DOWNLOAD_URL`}}",
"dd_api_key": "{{env `DD_API_KEY`}}"
},
"builders": [{


@ -1,6 +0,0 @@
## Load Test AMI
This AMI will be used for all load test servers. Currently it copies the `/scripts` and installs [k6](https://k6.io), so if any additional files are desired place them in that directory.
# How to use
1) Set the AWS region in the `loadtest.json` file
2) Run the command `packer build loadtest.json`


@ -43,6 +43,6 @@ export let options = {
vus: 25,
// 10 minute
duration: "10m",
// 95% of requests must complete below 2s
// 95% of requests must complete below 2.5s
thresholds: { http_req_duration: ["p(95)<2500"] },
};


@ -1,9 +1,16 @@
## Terraform Consul Load Testing
# How to use
1. Build an image with the desired Consul version and a loadtest image in the Packer folder [here](../packer).
# Terraform Consul Load Testing
## How to use
1. Build an image with the desired Consul version and a load test image in the Packer folder [here](../packer).
2. Create your own `vars.tfvars` file in this directory.
3. Place the appropriate AMI IDs in the `consul_ami_id` and `test_server_ami` variables, here is an example of a `vars.tfvars`:
3. Place the appropriate AMI IDs in the `consul_ami_id` and `test_server_ami` variables. If no AMI ID is specified it will default
to pulling from latest.
4. Set either `consul_version` or `consul_download_url`. If neither is set, it defaults to Consul 1.9.0.
5. AWS variables are read from environment variables. Make sure to export the necessary variables [shown here](https://registry.terraform.io/providers/hashicorp/aws/latest/docs#environment-variables).
6. Run `terraform plan -var-file=vars.tfvars`, and then `terraform apply -var-file=vars.tfvars` when ready.
7. Upon completion k6 should run and push metrics to the desired Datadog dashboard.
An example of a `vars.tfvars`:
```
vpc_name = "consul-test-vpc"
vpc_cidr = "11.0.0.0/16"
@ -11,26 +18,18 @@ public_subnet_cidrs = ["11.0.1.0/24", "11.0.3.0/24"]
private_subnet_cidrs = ["11.0.2.0/24"]
vpc_az = ["us-east-2a", "us-east-2b"]
test_instance_type = "t2.micro"
## This is found from building the image in packer/loadtest-ami
test_server_ami = "ami-0ad7711e837ebe166"
cluster_name = "ctest"
test_public_ip = "true"
instance_type = "t2.micro"
ami_owners = ["******"]
## This is found from building the image in packer/consul-ami
consul_ami_id = "ami-016d80ff5472346f0"
```
If `consul_version` or `consul_download_url` is not set within the Terraform variables it will default to utilziing Consul 1.9.0
4. AWS Variables are set off of environment variables. Make sure to export nessecary variables [shown here](https://registry.terraform.io/providers/hashicorp/aws/latest/docs#environment-variables).
5. Run `terraform plan -var-file=vars.tfvars`, and then `terraform apply -var-file=vars.tfvars` when ready.
6. Upon completion k6 should run and push metrics to desired Datadog dashboard.
# Customization
````
## Customization
All customization for infrastructure that is available can be found by looking through the `variables.tf` file.
## How to SSH
After `terraform apply` is run, Terraform should create a `keys/` directory, which gives access to all instances created.
For example, `ssh -i "keys/[cluster-name]-spicy-banana.pem" ubuntu@[IPADDRESS]`
# How to SSH
After `terraform apply` is ran Terraform should create a `keys/` directory which will give access to all instances created.


@ -29,6 +29,7 @@ module "consul_servers" {
cluster_name = "${var.cluster_name}-server"
cluster_size = var.num_servers
instance_type = var.instance_type
cluster_tag_key = var.cluster_tag_key
cluster_tag_value = var.cluster_name
ami_id = var.consul_ami_id == null ? data.aws_ami.consul.id : var.consul_ami_id
@ -44,11 +45,11 @@ module "consul_servers" {
}
module "consul_clients" {
source = "git::git@github.com:hashicorp/terraform-aws-consul.git//modules/consul-cluster?ref=v0.8.0"
cluster_name = "${var.cluster_name}-client"
cluster_size = var.num_clients
instance_type = var.instance_type
source = "git::git@github.com:hashicorp/terraform-aws-consul.git//modules/consul-cluster?ref=v0.8.0"
cluster_name = "${var.cluster_name}-client"
cluster_size = var.num_clients
instance_type = var.instance_type
cluster_tag_key = var.cluster_tag_key
cluster_tag_value = var.cluster_name
ami_id = var.consul_ami_id == null ? data.aws_ami.consul.id : var.consul_ami_id


@ -42,24 +42,6 @@ variable "cluster_tag_key" {
default = "consul-servers"
}
variable "ssh_key_name" {
description = "The name of an EC2 Key Pair that can be used to SSH to the EC2 Instances in this cluster. Set to an empty string to not associate a Key Pair."
type = string
default = null
}
variable "vpc_id" {
description = "The ID of the VPC in which the nodes will be deployed. Uses default VPC if not supplied."
type = string
default = null
}
variable "spot_price" {
description = "The maximum hourly price to pay for EC2 Spot Instances."
type = number
default = null
}
variable "vpc_az" {
type = list(string)
description = "VPC Availability Zone"