* debug: remove the CLI check for debug_enabled
The API allows collecting profiles even debug_enabled=false as long as
ACLs are enabled. Remove this check from the CLI so that users do not
need to set debug_enabled=true for no reason.
Also:
- fix the API client to return errors on non-200 status codes for debug
endpoints
- improve the failure messages when pprof data can not be collected
Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com>
* remove parallel test runs
parallel runs create a race condition that fail the debug tests
* snapshot the timestamp at the beginning of the capture
- timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures
- capture append to the file if it already exist
* Revert "snapshot the timestamp at the beginning of the capture"
This reverts commit c2d03346
* Refactor captureDynamic to extract capture logic for each item in a different func
* snapshot the timestamp at the beginning of the capture
- timestamp used to create the capture sub folder is snapshot only at the beginning of the capture and reused for subsequent captures
- capture append to the file if it already exist
* Revert "snapshot the timestamp at the beginning of the capture"
This reverts commit c2d03346
* Refactor captureDynamic to extract capture logic for each item in a different func
* extract wait group outside the go routine to avoid a race condition
* capture pprof in a separate go routine
* perform a single capture for pprof data for the whole duration
* add missing vendor dependency
* add a change log and fix documentation to reflect the change
* create function for timestamp dir creation and simplify error handling
* use error groups and ticker to simplify interval capture loop
* Logs, profile and traces are captured for the full duration. Metrics, Heap and Go routines are captured every interval
* refactor Logs capture routine and add log capture specific test
* improve error reporting when log test fail
* change test duration to 1s
* make time parsing in log line more robust
* refactor log time format in a const
* test on log line empty the earliest possible and return
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
* rename function to captureShortLived
* more specific changelog
Co-authored-by: Paul Banks <banks@banksco.de>
* update documentation to reflect current implementation
* add test for behavior when invalid param is passed to the command
* fix argument line in test
* a more detailed description of the new behaviour
Co-authored-by: Paul Banks <banks@banksco.de>
* print success right after the capture is done
* remove an unnecessary error check
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
* upgraded github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 => v0.0.0-20210601050228-01bbb1931b22
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
Co-authored-by: Freddy <freddygv@users.noreply.github.com>
Co-authored-by: Paul Banks <banks@banksco.de>
* debug: remove the CLI check for debug_enabled
The API allows collecting profiles even debug_enabled=false as long as
ACLs are enabled. Remove this check from the CLI so that users do not
need to set debug_enabled=true for no reason.
Also:
- fix the API client to return errors on non-200 status codes for debug
endpoints
- improve the failure messages when pprof data can not be collected
Co-Authored-By: Dhia Ayachi <dhia@hashicorp.com>
* remove parallel test runs
parallel runs create a race condition that fail the debug tests
* Add changelog
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
All these changes should have no side-effects or change behavior:
- Use bytes.Buffer's String() instead of a conversion
- Use time.Since and time.Until where fitting
- Drop unnecessary returns and assignment
* agent/debug: add package for debugging, host info
* api: add v1/agent/host endpoint
* agent: add v1/agent/host endpoint
* command/debug: implementation of static capture
* command/debug: tests and only configured targets
* agent/debug: add basic test for host metrics
* command/debug: add methods for dynamic data capture
* api: add debug/pprof endpoints
* command/debug: add pprof
* command/debug: timing, wg, logs to disk
* vendor: add gopsutil/disk
* command/debug: add a usage section
* website: add docs for consul debug
* agent/host: require operator:read
* api/host: improve docs and no retry timing
* command/debug: fail on extra arguments
* command/debug: fixup file permissions to 0644
* command/debug: remove server flags
* command/debug: improve clarity of usage section
* api/debug: add Trace for profiling, fix profile
* command/debug: capture profile and trace at the same time
* command/debug: add index document
* command/debug: use "clusters" in place of members
* command/debug: remove address in output
* command/debug: improve comment on metrics sleep
* command/debug: clarify usage
* agent: always register pprof handlers and protect
This will allow us to avoid a restart of a target agent
for profiling by always registering the pprof handlers.
Given this is a potentially sensitive path, it is protected
with an operator:read ACL and enable debug being
set to true on the target agent. enable_debug still requires
a restart.
If ACLs are disabled, enable_debug is sufficient.
* command/debug: use trace.out instead of .prof
More in line with golang docs.
* agent: fix comment wording
* agent: wrap table driven tests in t.run()