rocksdb/build_tools/ps_with_stack
Peter Dillinger e466173d5c Print stack traces on frozen tests in CI (#10828)
Summary:
Instead of existing calls to ps from gnu_parallel, call a new wrapper that does ps, looks for unit test like processes, and uses pstack or gdb to print thread stack traces. Also, using `ps -wwf` instead of `ps -wf` ensures output is not cut off.

For security, CircleCI runs with security restrictions on ptrace (/proc/sys/kernel/yama/ptrace_scope = 1), and this change adds a work-around to `InstallStackTraceHandler()` (only used by testing tools) to allow any process from the same user to debug it. (I've also touched >100 files to ensure all the unit tests call this function.)

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10828

Test Plan: local manual + temporary infinite loop in a unit test to observe in CircleCI

Reviewed By: hx235

Differential Revision: D40447634

Pulled By: pdillinger

fbshipit-source-id: 718a4c4a5b54fa0f9af2d01a446162b45e5e84e1
2022-10-18 00:35:35 -07:00

39 lines
1 KiB
Perl
Executable file

#!/usr/bin/env perl
use strict;
open(my $ps, "-|", "ps -wwf");
my $cols_known = 0;
my $cmd_col = 0;
my $pid_col = 0;
while (<$ps>) {
print;
my @cols = split(/\s+/);
if (!$cols_known && /CMD/) {
# Parse relevant ps column headers
for (my $i = 0; $i <= $#cols; $i++) {
if ($cols[$i] eq "CMD") {
$cmd_col = $i;
}
if ($cols[$i] eq "PID") {
$pid_col = $i;
}
}
$cols_known = 1;
} else {
my $pid = $cols[$pid_col];
my $cmd = $cols[$cmd_col];
# Match numeric PID and relative path command
# -> The intention is only to dump stack traces for hangs in code under
# test, which means we probably just built it and are executing by
# relative path (e.g. ./my_test or foo/bar_test) rather then by absolute
# path (e.g. /usr/bin/time) or PATH search (e.g. grep).
if ($pid =~ /^[0-9]+$/ && $cmd =~ /^[^\/ ]+[\/]/) {
print "Dumping stacks for $pid...\n";
system("pstack $pid || gdb -batch -p $pid -ex 'thread apply all bt'");
}
}
}
close $ps;