open-consul/test/integration/connect/envoy/case-dogstatsd-udp/verify.bats

#!/usr/bin/env bats

load helpers

@test "s1 proxy admin is up on :19000" {
  retry_default curl -f -s localhost:19000/stats -o /dev/null
}

@test "s2 proxy admin is up on :19001" {
  retry_default curl -f -s localhost:19001/stats -o /dev/null
}

@test "s2 proxy should be healthy" {
  assert_service_has_healthy_instances s2 1
}

@test "s1 upstream should be able to connect to s2" {
  run retry_default curl -s -f -d hello localhost:5000

  echo "OUTPUT: $output"

  [ "$status" == 0 ]
  [ "$output" == "hello" ]
}

@test "s1 proxy should be sending metrics to statsd" {
  run retry_default cat /workdir/statsd/statsd.log

  echo "METRICS:"
  echo "$output"
  echo "COUNT: $(echo "$output" | grep -Ec '^envoy\.')"

  [ "$status" == 0 ]
  # Quote $output so its newlines survive and grep counts every line that starts with "envoy.".
  [ "$(echo "$output" | grep -Ec '^envoy\.')" -gt 0 ]
}

@test "s1 proxy should be sending dogstatsd tagged metrics" {
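  # DogStatsD tags follow "|#" and are comma-separated, so the pattern expects
  # the tag to be preceded by '#' or ',' and followed by ',' or end of line.
  run retry_default must_match_in_statsd_logs '[#,]local_cluster:s1(,|$)'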

  echo "OUTPUT: $output"

  [ "$status" == 0 ]
}

@test "s1 proxy should be adding cluster name as a tag" {
  run retry_default must_match_in_statsd_logs '[#,]envoy\.cluster_name:s2(,|$)'

  echo "OUTPUT: $output"

  [ "$status" == 0 ]
}

@test "s1 proxy should be sending additional configured tags" {
  run retry_default must_match_in_statsd_logs '[#,]foo:bar(,|$)'

  echo "OUTPUT: $output"

  [ "$status" == 0 ]
}

@test "s1 proxy should have custom stats flush interval" {
  INTERVAL=$(get_envoy_stats_flush_interval localhost:19000)

  echo "INTERVAL = $INTERVAL"

  [ "$INTERVAL" == "1s" ]
}