open-consul/command/debug/debug_test.go

627 lines
14 KiB
Go
Raw Normal View History

New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
package debug
import (
"archive/tar"
generate a single debug file for a long duration capture (#10279) * debug: remove the CLI check for debug_enabled The API allows collecting profiles even debug_enabled=false as long as ACLs are enabled. Remove this check from the CLI so that users do not need to set debug_enabled=true for no reason. Also: - fix the API client to return errors on non-200 status codes for debug endpoints - improve the failure messages when pprof data can not be collected Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com> * remove parallel test runs parallel runs create a race condition that fail the debug tests * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * extract wait group outside the go routine to avoid a race condition * capture pprof in a separate go routine * perform a single capture for pprof data for the whole duration * add missing vendor dependency * add a change log and fix documentation to reflect the change * create function for timestamp dir creation and simplify error handling * use error groups and ticker to simplify interval capture loop * Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval * refactor Logs capture routine and add log capture specific test * improve error reporting when log test fail * change test duration to 1s * make time parsing in log line more robust * refactor log time format in a const * test on log line empty the earliest possible and return Co-authored-by: Freddy <freddygv@users.noreply.github.com> * rename function to captureShortLived * more specific changelog Co-authored-by: Paul Banks <banks@banksco.de> * update documentation to reflect current implementation * add test for behavior when invalid param is passed to the command * fix argument line in test * a more detailed description of the new behaviour Co-authored-by: Paul Banks <banks@banksco.de> * print success right after the capture is done * remove an unnecessary error check Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> * upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22 Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com> Co-authored-by: Paul Banks <banks@banksco.de>
2021-06-07 17:00:51 +00:00
"bufio"
"compress/gzip"
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
"fmt"
"io"
"io/ioutil"
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
"os"
"path/filepath"
generate a single debug file for a long duration capture (#10279) * debug: remove the CLI check for debug_enabled The API allows collecting profiles even debug_enabled=false as long as ACLs are enabled. Remove this check from the CLI so that users do not need to set debug_enabled=true for no reason. Also: - fix the API client to return errors on non-200 status codes for debug endpoints - improve the failure messages when pprof data can not be collected Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com> * remove parallel test runs parallel runs create a race condition that fail the debug tests * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * extract wait group outside the go routine to avoid a race condition * capture pprof in a separate go routine * perform a single capture for pprof data for the whole duration * add missing vendor dependency * add a change log and fix documentation to reflect the change * create function for timestamp dir creation and simplify error handling * use error groups and ticker to simplify interval capture loop * Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval * refactor Logs capture routine and add log capture specific test * improve error reporting when log test fail * change test duration to 1s * make time parsing in log line more robust * refactor log time format in a const * test on log line empty the earliest possible and return Co-authored-by: Freddy <freddygv@users.noreply.github.com> * rename function to captureShortLived * more specific changelog Co-authored-by: Paul Banks <banks@banksco.de> * update documentation to reflect current implementation * add test for behavior when invalid param is passed to the command * fix argument line in test * a more detailed description of the new behaviour Co-authored-by: Paul Banks <banks@banksco.de> * print success right after the capture is done * remove an unnecessary error check Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> * upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22 Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com> Co-authored-by: Paul Banks <banks@banksco.de>
2021-06-07 17:00:51 +00:00
"regexp"
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
"strings"
"testing"
generate a single debug file for a long duration capture (#10279) * debug: remove the CLI check for debug_enabled The API allows collecting profiles even debug_enabled=false as long as ACLs are enabled. Remove this check from the CLI so that users do not need to set debug_enabled=true for no reason. Also: - fix the API client to return errors on non-200 status codes for debug endpoints - improve the failure messages when pprof data can not be collected Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com> * remove parallel test runs parallel runs create a race condition that fail the debug tests * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * extract wait group outside the go routine to avoid a race condition * capture pprof in a separate go routine * perform a single capture for pprof data for the whole duration * add missing vendor dependency * add a change log and fix documentation to reflect the change * create function for timestamp dir creation and simplify error handling * use error groups and ticker to simplify interval capture loop * Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval * refactor Logs capture routine and add log capture specific test * improve error reporting when log test fail * change test duration to 1s * make time parsing in log line more robust * refactor log time format in a const * test on log line empty the earliest possible and return Co-authored-by: Freddy <freddygv@users.noreply.github.com> * rename function to captureShortLived * more specific changelog Co-authored-by: Paul Banks <banks@banksco.de> * update documentation to reflect current implementation * add test for behavior when invalid param is passed to the command * fix argument line in test * a more detailed description of the new behaviour Co-authored-by: Paul Banks <banks@banksco.de> * print success right after the capture is done * remove an unnecessary error check Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> * upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22 Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com> Co-authored-by: Paul Banks <banks@banksco.de>
2021-06-07 17:00:51 +00:00
"time"
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
"github.com/mitchellh/cli"
"github.com/stretchr/testify/require"
generate a single debug file for a long duration capture (#10279) * debug: remove the CLI check for debug_enabled The API allows collecting profiles even debug_enabled=false as long as ACLs are enabled. Remove this check from the CLI so that users do not need to set debug_enabled=true for no reason. Also: - fix the API client to return errors on non-200 status codes for debug endpoints - improve the failure messages when pprof data can not be collected Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com> * remove parallel test runs parallel runs create a race condition that fail the debug tests * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * extract wait group outside the go routine to avoid a race condition * capture pprof in a separate go routine * perform a single capture for pprof data for the whole duration * add missing vendor dependency * add a change log and fix documentation to reflect the change * create function for timestamp dir creation and simplify error handling * use error groups and ticker to simplify interval capture loop * Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval * refactor Logs capture routine and add log capture specific test * improve error reporting when log test fail * change test duration to 1s * make time parsing in log line more robust * refactor log time format in a const * test on log line empty the earliest possible and return Co-authored-by: Freddy <freddygv@users.noreply.github.com> * rename function to captureShortLived * more specific changelog Co-authored-by: Paul Banks <banks@banksco.de> * update documentation to reflect current implementation * add test for behavior when invalid param is passed to the command * fix argument line in test * a more detailed description of the new behaviour Co-authored-by: Paul Banks <banks@banksco.de> * print success right after the capture is done * remove an unnecessary error check Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> * upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22 Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com> Co-authored-by: Paul Banks <banks@banksco.de>
2021-06-07 17:00:51 +00:00
"github.com/google/pprof/profile"
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
"github.com/hashicorp/consul/agent"
"github.com/hashicorp/consul/sdk/testutil"
"github.com/hashicorp/consul/testrpc"
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
)
func TestDebugCommand_noTabs(t *testing.T) {
t.Parallel()
if strings.ContainsRune(New(cli.NewMockUi(), nil).Help(), '\t') {
t.Fatal("help has tabs")
}
}
func TestDebugCommand(t *testing.T) {
if testing.Short() {
t.Skip("too slow for testing.Short")
}
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
testDir := testutil.TempDir(t, "debug")
a := agent.NewTestAgent(t, `
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
enable_debug = true
`)
defer a.Shutdown()
testrpc.WaitForLeader(t, a.RPC, "dc1")
ui := cli.NewMockUi()
cmd := New(ui, nil)
cmd.validateTiming = false
outputPath := fmt.Sprintf("%s/debug", testDir)
args := []string{
"-http-addr=" + a.HTTPAddr(),
"-output=" + outputPath,
"-duration=100ms",
"-interval=50ms",
"-archive=false",
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
}
code := cmd.Run(args)
require.Equal(t, 0, code)
require.Equal(t, "", ui.ErrorWriter.String())
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
metricsFiles, err := filepath.Glob(fmt.Sprintf("%s/*/%s", outputPath, "metrics.json"))
require.NoError(t, err)
require.Len(t, metricsFiles, 1)
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
}
func TestDebugCommand_Archive(t *testing.T) {
if testing.Short() {
t.Skip("too slow for testing.Short")
}
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
testDir := testutil.TempDir(t, "debug")
a := agent.NewTestAgent(t, `
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
enable_debug = true
`)
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
defer a.Shutdown()
testrpc.WaitForLeader(t, a.RPC, "dc1")
ui := cli.NewMockUi()
cmd := New(ui, nil)
cmd.validateTiming = false
outputPath := fmt.Sprintf("%s/debug", testDir)
args := []string{
"-http-addr=" + a.HTTPAddr(),
"-output=" + outputPath,
"-capture=agent",
}
code := cmd.Run(args)
require.Equal(t, 0, code)
require.Equal(t, "", ui.ErrorWriter.String())
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
archivePath := outputPath + debugArchiveExtension
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
file, err := os.Open(archivePath)
require.NoError(t, err)
gz, err := gzip.NewReader(file)
require.NoError(t, err)
tr := tar.NewReader(gz)
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
for {
h, err := tr.Next()
switch {
case err == io.EOF:
return
case err != nil:
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
t.Fatalf("failed to read file in archive: %s", err)
}
// ignore the outer directory
if h.Name == "debug" {
continue
}
// should only contain this one capture target
if h.Name != "debug/agent.json" && h.Name != "debug/index.json" {
t.Fatalf("archive contents do not match: %s", h.Name)
}
}
}
func TestDebugCommand_ArgsBad(t *testing.T) {
t.Parallel()
ui := cli.NewMockUi()
cmd := New(ui, nil)
args := []string{
"foo",
"bad",
}
if code := cmd.Run(args); code == 0 {
t.Fatalf("should exit non-zero, got code: %d", code)
}
errOutput := ui.ErrorWriter.String()
if !strings.Contains(errOutput, "Too many arguments") {
t.Errorf("expected error output, got %q", errOutput)
}
}
generate a single debug file for a long duration capture (#10279) * debug: remove the CLI check for debug_enabled The API allows collecting profiles even debug_enabled=false as long as ACLs are enabled. Remove this check from the CLI so that users do not need to set debug_enabled=true for no reason. Also: - fix the API client to return errors on non-200 status codes for debug endpoints - improve the failure messages when pprof data can not be collected Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com> * remove parallel test runs parallel runs create a race condition that fail the debug tests * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * extract wait group outside the go routine to avoid a race condition * capture pprof in a separate go routine * perform a single capture for pprof data for the whole duration * add missing vendor dependency * add a change log and fix documentation to reflect the change * create function for timestamp dir creation and simplify error handling * use error groups and ticker to simplify interval capture loop * Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval * refactor Logs capture routine and add log capture specific test * improve error reporting when log test fail * change test duration to 1s * make time parsing in log line more robust * refactor log time format in a const * test on log line empty the earliest possible and return Co-authored-by: Freddy <freddygv@users.noreply.github.com> * rename function to captureShortLived * more specific changelog Co-authored-by: Paul Banks <banks@banksco.de> * update documentation to reflect current implementation * add test for behavior when invalid param is passed to the command * fix argument line in test * a more detailed description of the new behaviour Co-authored-by: Paul Banks <banks@banksco.de> * print success right after the capture is done * remove an unnecessary error check Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> * upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22 Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com> Co-authored-by: Paul Banks <banks@banksco.de>
2021-06-07 17:00:51 +00:00
func TestDebugCommand_InvalidFlags(t *testing.T) {
ui := cli.NewMockUi()
cmd := New(ui, nil)
cmd.validateTiming = false
outputPath := ""
args := []string{
"-invalid=value",
"-output=" + outputPath,
"-duration=100ms",
"-interval=50ms",
}
if code := cmd.Run(args); code == 0 {
t.Fatalf("should exit non-zero, got code: %d", code)
}
errOutput := ui.ErrorWriter.String()
if !strings.Contains(errOutput, "==> Error parsing flags: flag provided but not defined:") {
t.Errorf("expected error output, got %q", errOutput)
}
}
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
func TestDebugCommand_OutputPathBad(t *testing.T) {
if testing.Short() {
t.Skip("too slow for testing.Short")
}
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
t.Parallel()
a := agent.NewTestAgent(t, "")
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
defer a.Shutdown()
testrpc.WaitForLeader(t, a.RPC, "dc1")
ui := cli.NewMockUi()
cmd := New(ui, nil)
cmd.validateTiming = false
outputPath := ""
args := []string{
"-http-addr=" + a.HTTPAddr(),
"-output=" + outputPath,
"-duration=100ms",
"-interval=50ms",
}
if code := cmd.Run(args); code == 0 {
t.Fatalf("should exit non-zero, got code: %d", code)
}
errOutput := ui.ErrorWriter.String()
if !strings.Contains(errOutput, "no such file or directory") {
t.Errorf("expected error output, got %q", errOutput)
}
}
func TestDebugCommand_OutputPathExists(t *testing.T) {
if testing.Short() {
t.Skip("too slow for testing.Short")
}
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
testDir := testutil.TempDir(t, "debug")
a := agent.NewTestAgent(t, "")
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
defer a.Shutdown()
testrpc.WaitForLeader(t, a.RPC, "dc1")
ui := cli.NewMockUi()
cmd := New(ui, nil)
cmd.validateTiming = false
outputPath := fmt.Sprintf("%s/debug", testDir)
args := []string{
"-http-addr=" + a.HTTPAddr(),
"-output=" + outputPath,
"-duration=100ms",
"-interval=50ms",
}
// Make a directory that conflicts with the output path
err := os.Mkdir(outputPath, 0755)
if err != nil {
t.Fatalf("duplicate test directory creation failed: %s", err)
}
if code := cmd.Run(args); code == 0 {
t.Fatalf("should exit non-zero, got code: %d", code)
}
errOutput := ui.ErrorWriter.String()
if !strings.Contains(errOutput, "directory already exists") {
t.Errorf("expected error output, got %q", errOutput)
}
}
func TestDebugCommand_CaptureTargets(t *testing.T) {
if testing.Short() {
t.Skip("too slow for testing.Short")
}
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
cases := map[string]struct {
// used in -target param
targets []string
// existence verified after execution
files []string
// non-existence verified after execution
excludedFiles []string
}{
"single": {
[]string{"agent"},
[]string{"agent.json"},
[]string{"host.json", "cluster.json"},
},
"static": {
[]string{"agent", "host", "cluster"},
[]string{"agent.json", "host.json", "cluster.json"},
[]string{"*/metrics.json"},
},
"metrics-only": {
[]string{"metrics"},
[]string{"*/metrics.json"},
[]string{"agent.json", "host.json", "cluster.json"},
},
"all-but-pprof": {
[]string{
"metrics",
"logs",
"host",
"agent",
"cluster",
},
[]string{
"host.json",
"agent.json",
"cluster.json",
"*/metrics.json",
"*/consul.log",
},
[]string{},
},
}
for name, tc := range cases {
testDir := testutil.TempDir(t, "debug")
a := agent.NewTestAgent(t, `
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
enable_debug = true
`)
defer a.Shutdown()
testrpc.WaitForLeader(t, a.RPC, "dc1")
ui := cli.NewMockUi()
cmd := New(ui, nil)
cmd.validateTiming = false
outputPath := fmt.Sprintf("%s/debug-%s", testDir, name)
args := []string{
"-http-addr=" + a.HTTPAddr(),
"-output=" + outputPath,
"-archive=false",
"-duration=100ms",
"-interval=50ms",
}
for _, t := range tc.targets {
args = append(args, "-capture="+t)
}
if code := cmd.Run(args); code != 0 {
t.Fatalf("should exit 0, got code: %d", code)
}
errOutput := ui.ErrorWriter.String()
if errOutput != "" {
t.Errorf("expected no error output, got %q", errOutput)
}
// Ensure the debug data was written
_, err := os.Stat(outputPath)
if err != nil {
t.Fatalf("output path should exist: %s", err)
}
// Ensure the captured static files exist
for _, f := range tc.files {
path := fmt.Sprintf("%s/%s", outputPath, f)
// Glob ignores file system errors
fs, _ := filepath.Glob(path)
if len(fs) <= 0 {
t.Fatalf("%s: output data should exist for %s", name, f)
}
}
// Ensure any excluded files do not exist
for _, f := range tc.excludedFiles {
path := fmt.Sprintf("%s/%s", outputPath, f)
// Glob ignores file system errors
fs, _ := filepath.Glob(path)
if len(fs) > 0 {
t.Fatalf("%s: output data should not exist for %s", name, f)
}
}
}
}
generate a single debug file for a long duration capture (#10279) * debug: remove the CLI check for debug_enabled The API allows collecting profiles even debug_enabled=false as long as ACLs are enabled. Remove this check from the CLI so that users do not need to set debug_enabled=true for no reason. Also: - fix the API client to return errors on non-200 status codes for debug endpoints - improve the failure messages when pprof data can not be collected Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com> * remove parallel test runs parallel runs create a race condition that fail the debug tests * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * extract wait group outside the go routine to avoid a race condition * capture pprof in a separate go routine * perform a single capture for pprof data for the whole duration * add missing vendor dependency * add a change log and fix documentation to reflect the change * create function for timestamp dir creation and simplify error handling * use error groups and ticker to simplify interval capture loop * Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval * refactor Logs capture routine and add log capture specific test * improve error reporting when log test fail * change test duration to 1s * make time parsing in log line more robust * refactor log time format in a const * test on log line empty the earliest possible and return Co-authored-by: Freddy <freddygv@users.noreply.github.com> * rename function to captureShortLived * more specific changelog Co-authored-by: Paul Banks <banks@banksco.de> * update documentation to reflect current implementation * add test for behavior when invalid param is passed to the command * fix argument line in test * a more detailed description of the new behaviour Co-authored-by: Paul Banks <banks@banksco.de> * print success right after the capture is done * remove an unnecessary error check Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> * upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22 Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com> Co-authored-by: Paul Banks <banks@banksco.de>
2021-06-07 17:00:51 +00:00
func TestDebugCommand_CaptureLogs(t *testing.T) {
if testing.Short() {
t.Skip("too slow for testing.Short")
}
cases := map[string]struct {
// used in -target param
targets []string
// existence verified after execution
files []string
// non-existence verified after execution
excludedFiles []string
}{
"logs-only": {
[]string{"logs"},
[]string{"*/consul.log"},
[]string{"agent.json", "host.json", "cluster.json", "*/metrics.json"},
},
}
for name, tc := range cases {
testDir := testutil.TempDir(t, "debug")
a := agent.NewTestAgent(t, `
enable_debug = true
`)
defer a.Shutdown()
testrpc.WaitForLeader(t, a.RPC, "dc1")
ui := cli.NewMockUi()
cmd := New(ui, nil)
cmd.validateTiming = false
outputPath := fmt.Sprintf("%s/debug-%s", testDir, name)
args := []string{
"-http-addr=" + a.HTTPAddr(),
"-output=" + outputPath,
"-archive=false",
"-duration=1000ms",
"-interval=50ms",
}
for _, t := range tc.targets {
args = append(args, "-capture="+t)
}
if code := cmd.Run(args); code != 0 {
t.Fatalf("should exit 0, got code: %d", code)
}
errOutput := ui.ErrorWriter.String()
if errOutput != "" {
t.Errorf("expected no error output, got %q", errOutput)
}
// Ensure the debug data was written
_, err := os.Stat(outputPath)
if err != nil {
t.Fatalf("output path should exist: %s", err)
}
// Ensure the captured static files exist
for _, f := range tc.files {
path := fmt.Sprintf("%s/%s", outputPath, f)
// Glob ignores file system errors
fs, _ := filepath.Glob(path)
if len(fs) <= 0 {
t.Fatalf("%s: output data should exist for %s", name, f)
}
for _, logFile := range fs {
content, err := ioutil.ReadFile(logFile)
require.NoError(t, err)
scanner := bufio.NewScanner(strings.NewReader(string(content)))
for scanner.Scan() {
logLine := scanner.Text()
if !validateLogLine([]byte(logLine)) {
t.Fatalf("%s: log line is not valid %s", name, logLine)
}
}
}
}
// Ensure any excluded files do not exist
for _, f := range tc.excludedFiles {
path := fmt.Sprintf("%s/%s", outputPath, f)
// Glob ignores file system errors
fs, _ := filepath.Glob(path)
if len(fs) > 0 {
t.Fatalf("%s: output data should not exist for %s", name, f)
}
}
}
}
func validateLogLine(content []byte) bool {
fields := strings.SplitN(string(content), " ", 2)
if len(fields) != 2 {
return false
}
const logTimeFormat = "2006-01-02T15:04:05.000"
t := content[:len(logTimeFormat)]
_, err := time.Parse(logTimeFormat, string(t))
if err != nil {
return false
}
re := regexp.MustCompile(`(\[(ERROR|WARN|INFO|DEBUG|TRACE)]) (.*?): (.*)`)
valid := re.Match([]byte(fields[1]))
return valid
}
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
func TestDebugCommand_ProfilesExist(t *testing.T) {
if testing.Short() {
t.Skip("too slow for testing.Short")
}
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
testDir := testutil.TempDir(t, "debug")
a := agent.NewTestAgent(t, `
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
enable_debug = true
`)
defer a.Shutdown()
testrpc.WaitForLeader(t, a.RPC, "dc1")
ui := cli.NewMockUi()
cmd := New(ui, nil)
cmd.validateTiming = false
outputPath := fmt.Sprintf("%s/debug", testDir)
println(outputPath)
args := []string{
"-http-addr=" + a.HTTPAddr(),
"-output=" + outputPath,
// CPU profile has a minimum of 1s
"-archive=false",
generate a single debug file for a long duration capture (#10279) * debug: remove the CLI check for debug_enabled The API allows collecting profiles even debug_enabled=false as long as ACLs are enabled. Remove this check from the CLI so that users do not need to set debug_enabled=true for no reason. Also: - fix the API client to return errors on non-200 status codes for debug endpoints - improve the failure messages when pprof data can not be collected Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com> * remove parallel test runs parallel runs create a race condition that fail the debug tests * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * extract wait group outside the go routine to avoid a race condition * capture pprof in a separate go routine * perform a single capture for pprof data for the whole duration * add missing vendor dependency * add a change log and fix documentation to reflect the change * create function for timestamp dir creation and simplify error handling * use error groups and ticker to simplify interval capture loop * Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval * refactor Logs capture routine and add log capture specific test * improve error reporting when log test fail * change test duration to 1s * make time parsing in log line more robust * refactor log time format in a const * test on log line empty the earliest possible and return Co-authored-by: Freddy <freddygv@users.noreply.github.com> * rename function to captureShortLived * more specific changelog Co-authored-by: Paul Banks <banks@banksco.de> * update documentation to reflect current implementation * add test for behavior when invalid param is passed to the command * fix argument line in test * a more detailed description of the new behaviour Co-authored-by: Paul Banks <banks@banksco.de> * print success right after the capture is done * remove an unnecessary error check Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> * upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22 Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com> Co-authored-by: Paul Banks <banks@banksco.de>
2021-06-07 17:00:51 +00:00
"-duration=2s",
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
"-interval=1s",
"-capture=pprof",
}
if code := cmd.Run(args); code != 0 {
t.Fatalf("should exit 0, got code: %d", code)
}
profiles := []string{"heap.prof", "profile.prof", "goroutine.prof", "trace.out"}
// Glob ignores file system errors
for _, v := range profiles {
fs, _ := filepath.Glob(fmt.Sprintf("%s/*/%s", outputPath, v))
generate a single debug file for a long duration capture (#10279) * debug: remove the CLI check for debug_enabled The API allows collecting profiles even debug_enabled=false as long as ACLs are enabled. Remove this check from the CLI so that users do not need to set debug_enabled=true for no reason. Also: - fix the API client to return errors on non-200 status codes for debug endpoints - improve the failure messages when pprof data can not be collected Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com> * remove parallel test runs parallel runs create a race condition that fail the debug tests * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * extract wait group outside the go routine to avoid a race condition * capture pprof in a separate go routine * perform a single capture for pprof data for the whole duration * add missing vendor dependency * add a change log and fix documentation to reflect the change * create function for timestamp dir creation and simplify error handling * use error groups and ticker to simplify interval capture loop * Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval * refactor Logs capture routine and add log capture specific test * improve error reporting when log test fail * change test duration to 1s * make time parsing in log line more robust * refactor log time format in a const * test on log line empty the earliest possible and return Co-authored-by: Freddy <freddygv@users.noreply.github.com> * rename function to captureShortLived * more specific changelog Co-authored-by: Paul Banks <banks@banksco.de> * update documentation to reflect current implementation * add test for behavior when invalid param is passed to the command * fix argument line in test * a more detailed description of the new behaviour Co-authored-by: Paul Banks <banks@banksco.de> * print success right after the capture is done * remove an unnecessary error check Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> * upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22 Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com> Co-authored-by: Paul Banks <banks@banksco.de>
2021-06-07 17:00:51 +00:00
if len(fs) < 1 {
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
t.Errorf("output data should exist for %s", v)
}
generate a single debug file for a long duration capture (#10279) * debug: remove the CLI check for debug_enabled The API allows collecting profiles even debug_enabled=false as long as ACLs are enabled. Remove this check from the CLI so that users do not need to set debug_enabled=true for no reason. Also: - fix the API client to return errors on non-200 status codes for debug endpoints - improve the failure messages when pprof data can not be collected Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com> * remove parallel test runs parallel runs create a race condition that fail the debug tests * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * extract wait group outside the go routine to avoid a race condition * capture pprof in a separate go routine * perform a single capture for pprof data for the whole duration * add missing vendor dependency * add a change log and fix documentation to reflect the change * create function for timestamp dir creation and simplify error handling * use error groups and ticker to simplify interval capture loop * Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval * refactor Logs capture routine and add log capture specific test * improve error reporting when log test fail * change test duration to 1s * make time parsing in log line more robust * refactor log time format in a const * test on log line empty the earliest possible and return Co-authored-by: Freddy <freddygv@users.noreply.github.com> * rename function to captureShortLived * more specific changelog Co-authored-by: Paul Banks <banks@banksco.de> * update documentation to reflect current implementation * add test for behavior when invalid param is passed to the command * fix argument line in test * a more detailed description of the new behaviour Co-authored-by: Paul Banks <banks@banksco.de> * print success right after the capture is done * remove an unnecessary error check Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> * upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22 Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com> Co-authored-by: Paul Banks <banks@banksco.de>
2021-06-07 17:00:51 +00:00
for _, f := range fs {
if !strings.Contains(f, "trace.out") {
content, err := ioutil.ReadFile(f)
require.NoError(t, err)
_, err = profile.ParseData(content)
require.NoError(t, err)
}
}
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
}
errOutput := ui.ErrorWriter.String()
if errOutput != "" {
t.Errorf("expected no error output, got %s", errOutput)
}
}
func TestDebugCommand_ValidateTiming(t *testing.T) {
if testing.Short() {
t.Skip("too slow for testing.Short")
}
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
cases := map[string]struct {
duration string
interval string
output string
code int
}{
"both": {
"20ms",
"10ms",
"duration must be longer",
1,
},
"short interval": {
"10s",
"10ms",
"interval must be longer",
1,
},
"lower duration": {
"20s",
"30s",
"must be longer than interval",
1,
},
}
for name, tc := range cases {
// Because we're only testng validation, we want to shut down
// the valid duration test to avoid hanging
shutdownCh := make(chan struct{})
a := agent.NewTestAgent(t, "")
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
defer a.Shutdown()
testrpc.WaitForLeader(t, a.RPC, "dc1")
ui := cli.NewMockUi()
cmd := New(ui, shutdownCh)
args := []string{
"-http-addr=" + a.HTTPAddr(),
"-duration=" + tc.duration,
"-interval=" + tc.interval,
"-capture=agent",
}
code := cmd.Run(args)
if code != tc.code {
t.Errorf("%s: should exit %d, got code: %d", name, tc.code, code)
}
errOutput := ui.ErrorWriter.String()
if !strings.Contains(errOutput, tc.output) {
t.Errorf("%s: expected error output '%s', got '%q'", name, tc.output, errOutput)
}
}
}
func TestDebugCommand_DebugDisabled(t *testing.T) {
if testing.Short() {
t.Skip("too slow for testing.Short")
}
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
t.Parallel()
testDir := testutil.TempDir(t, "debug")
a := agent.NewTestAgent(t, `
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
enable_debug = false
`)
defer a.Shutdown()
testrpc.WaitForLeader(t, a.RPC, "dc1")
ui := cli.NewMockUi()
cmd := New(ui, nil)
cmd.validateTiming = false
outputPath := fmt.Sprintf("%s/debug", testDir)
args := []string{
"-http-addr=" + a.HTTPAddr(),
"-output=" + outputPath,
"-archive=false",
// CPU profile has a minimum of 1s
"-duration=1s",
"-interval=1s",
}
if code := cmd.Run(args); code != 0 {
t.Fatalf("should exit 0, got code: %d", code)
}
profiles := []string{"heap.prof", "profile.prof", "goroutine.prof", "trace.out"}
// Glob ignores file system errors
for _, v := range profiles {
generate a single debug file for a long duration capture (#10279) * debug: remove the CLI check for debug_enabled The API allows collecting profiles even debug_enabled=false as long as ACLs are enabled. Remove this check from the CLI so that users do not need to set debug_enabled=true for no reason. Also: - fix the API client to return errors on non-200 status codes for debug endpoints - improve the failure messages when pprof data can not be collected Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com> * remove parallel test runs parallel runs create a race condition that fail the debug tests * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * extract wait group outside the go routine to avoid a race condition * capture pprof in a separate go routine * perform a single capture for pprof data for the whole duration * add missing vendor dependency * add a change log and fix documentation to reflect the change * create function for timestamp dir creation and simplify error handling * use error groups and ticker to simplify interval capture loop * Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval * refactor Logs capture routine and add log capture specific test * improve error reporting when log test fail * change test duration to 1s * make time parsing in log line more robust * refactor log time format in a const * test on log line empty the earliest possible and return Co-authored-by: Freddy <freddygv@users.noreply.github.com> * rename function to captureShortLived * more specific changelog Co-authored-by: Paul Banks <banks@banksco.de> * update documentation to reflect current implementation * add test for behavior when invalid param is passed to the command * fix argument line in test * a more detailed description of the new behaviour Co-authored-by: Paul Banks <banks@banksco.de> * print success right after the capture is done * remove an unnecessary error check Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> * upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22 Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com> Co-authored-by: Paul Banks <banks@banksco.de>
2021-06-07 17:00:51 +00:00
fs, _ := filepath.Glob(fmt.Sprintf("%s/*r/%s", outputPath, v))
require.True(t, len(fs) == 0)
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
}
errOutput := ui.ErrorWriter.String()
generate a single debug file for a long duration capture (#10279) * debug: remove the CLI check for debug_enabled The API allows collecting profiles even debug_enabled=false as long as ACLs are enabled. Remove this check from the CLI so that users do not need to set debug_enabled=true for no reason. Also: - fix the API client to return errors on non-200 status codes for debug endpoints - improve the failure messages when pprof data can not be collected Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com> * remove parallel test runs parallel runs create a race condition that fail the debug tests * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * snapshot the timestamp at the beginning of the capture - timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures - capture append to the file if it already exist * Revert "snapshot the timestamp at the beginning of the capture" This reverts commit c2d03346 * Refactor captureDynamic to extract capture logic for each item in a different func * extract wait group outside the go routine to avoid a race condition * capture pprof in a separate go routine * perform a single capture for pprof data for the whole duration * add missing vendor dependency * add a change log and fix documentation to reflect the change * create function for timestamp dir creation and simplify error handling * use error groups and ticker to simplify interval capture loop * Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval * refactor Logs capture routine and add log capture specific test * improve error reporting when log test fail * change test duration to 1s * make time parsing in log line more robust * refactor log time format in a const * test on log line empty the earliest possible and return Co-authored-by: Freddy <freddygv@users.noreply.github.com> * rename function to captureShortLived * more specific changelog Co-authored-by: Paul Banks <banks@banksco.de> * update documentation to reflect current implementation * add test for behavior when invalid param is passed to the command * fix argument line in test * a more detailed description of the new behaviour Co-authored-by: Paul Banks <banks@banksco.de> * print success right after the capture is done * remove an unnecessary error check Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> * upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22 Co-authored-by: Daniel Nephin <dnephin@hashicorp.com> Co-authored-by: Freddy <freddygv@users.noreply.github.com> Co-authored-by: Paul Banks <banks@banksco.de>
2021-06-07 17:00:51 +00:00
require.Contains(t, errOutput, "failed to collect")
New command: consul debug (#4754) * agent/debug: add package for debugging, host info * api: add v1/agent/host endpoint * agent: add v1/agent/host endpoint * command/debug: implementation of static capture * command/debug: tests and only configured targets * agent/debug: add basic test for host metrics * command/debug: add methods for dynamic data capture * api: add debug/pprof endpoints * command/debug: add pprof * command/debug: timing, wg, logs to disk * vendor: add gopsutil/disk * command/debug: add a usage section * website: add docs for consul debug * agent/host: require operator:read * api/host: improve docs and no retry timing * command/debug: fail on extra arguments * command/debug: fixup file permissions to 0644 * command/debug: remove server flags * command/debug: improve clarity of usage section * api/debug: add Trace for profiling, fix profile * command/debug: capture profile and trace at the same time * command/debug: add index document * command/debug: use "clusters" in place of members * command/debug: remove address in output * command/debug: improve comment on metrics sleep * command/debug: clarify usage * agent: always register pprof handlers and protect This will allow us to avoid a restart of a target agent for profiling by always registering the pprof handlers. Given this is a potentially sensitive path, it is protected with an operator:read ACL and enable debug being set to true on the target agent. enable_debug still requires a restart. If ACLs are disabled, enable_debug is sufficient. * command/debug: use trace.out instead of .prof More in line with golang docs. * agent: fix comment wording * agent: wrap table driven tests in t.run()
2018-10-17 20:20:35 +00:00
}