print stderr separately per option (#12301)

Summary:
While working on Meta's internal test triaging process, I found that `db_crashtest.py` was printing `stdout` and `stderr` together into the same stream. This change adds an option to print `stderr` separately, so that it is easy to extract only `stderr` from a test run.

`print_stderr_separately` is introduced as an optional parameter with a default value of `False`, so the existing behavior is kept as is (aside from the minor changes listed below).

Minor changes to the existing behavior:
- We no longer print `stderr has error message:` and wrap each line in `***`. When `stderr` is printed to `stdout`, we simply print a `stderr:` header followed by `stderr` as is (see the sketch after this list).
- We no longer print `#times error occurred in output is ...`, which did not appear to add any value.
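
For reference, here is a condensed sketch of the resulting helper; it matches the change shown in the diff at the end of this message (the full context is in `tools/db_crashtest.py`):

```python
import sys


def print_if_stderr_has_errors(stderr, print_as_stderr=False):
    # Nothing to do if the child process produced no stderr output.
    if len(stderr) == 0:
        return

    if print_as_stderr:
        # --print_stderr_separately: forward stderr to this script's own
        # stderr so it can be redirected, e.g. `2> /tmp/error.log`.
        print(stderr, file=sys.stderr)
    else:
        # Default behavior: keep stderr in stdout under a plain "stderr:" header.
        print("stderr:")
        print(stderr)
```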

Pull Request resolved: https://github.com/facebook/rocksdb/pull/12301

Test Plan:
**Default Behavior (blackbox)**

The run printed everything as is (stdout and stderr together):
```
$> python3 tools/db_crashtest.py blackbox --simple --max_key=25000000 --write_buffer_size=4194304 2> /tmp/error.log
Running blackbox-crash-test with
interval_between_crash=120
total-duration=6000
...
Integrated BlobDB: blob files enabled 0, min blob size 0, blob file size 268435456, blob compression type NoCompression, blob GC enabled 0, cutoff 0.250000, force threshold 1.000000, blob compaction readahead size 0, blob file starting level 0
Integrated BlobDB: blob cache disabled
DB path: [/tmp/jewoongh/rocksdb_crashtest_blackboxwh7yxpec]
(Re-)verified 0 unique IDs
2024/01/29-09:16:30  Initializing worker threads
Crash-recovery verification passed :)
2024/01/29-09:16:35  Starting database operations
2024/01/29-09:16:35  Starting verification
Stress Test : 543.600 micros/op 8802 ops/sec
            : Wrote 0.00 MB (0.27 MB/sec) (50% of 10 ops)
            : Wrote 5 times
            : Deleted 1 times
            : Single deleted 0 times
            : 4 read and 0 found the key
            : Prefix scanned 0 times
            : Iterator size sum is 0
            : Iterated 0 times
            : Deleted 0 key-ranges
            : Range deletions covered 0 keys
            : Got errors 0 times
            : 0 CompactFiles() succeed
            : 0 CompactFiles() did not succeed

stderr:
WARNING: prefix_size is non-zero but memtablerep != prefix_hash
Error : jewoongh injected test error This is not a real failure.
Verification failed :(
```

Nothing in `stderr`:
```
$> cat /tmp/error.log
```

**Default Behavior (whitebox)**

The run printed everything as is (stdout and stderr together):
```
$> python3 tools/db_crashtest.py whitebox --simple --max_key=25000000 --write_buffer_size=4194304 2> /tmp/error.log
Running whitebox-crash-test with
total-duration=10000
...
(Re-)verified 571 unique IDs
2024/01/29-09:33:53  Initializing worker threads
Crash-recovery verification passed :)
2024/01/29-09:35:16  Starting database operations
2024/01/29-09:35:16  Starting verification
Stress Test : 97248.125 micros/op 10 ops/sec
            : Wrote 0.00 MB (0.00 MB/sec) (12% of 8 ops)
            : Wrote 1 times
            : Deleted 0 times
            : Single deleted 0 times
            : 4 read and 1 found the key
            : Prefix scanned 1 times
            : Iterator size sum is 120868
            : Iterated 4 times
            : Deleted 0 key-ranges
            : Range deletions covered 0 keys
            : Got errors 0 times
            : 0 CompactFiles() succeed
            : 0 CompactFiles() did not succeed

stderr:
WARNING: prefix_size is non-zero but memtablerep != prefix_hash
Error : jewoongh injected test error This is not a real failure.
New cache capacity = 4865393
Verification failed :(

TEST FAILED. See kill option and exit code above!!!
```
Nothing in `stderr`:
```
$> cat /tmp/error.log
```

**New option added (blackbox)**

The run printed only `stdout`:
```
$> python3 tools/db_crashtest.py blackbox --simple --max_key=25000000 --write_buffer_size=4194304 --print_stderr_separately 2> /tmp/error.log
Running blackbox-crash-test with
interval_between_crash=120
total-duration=6000
...
Integrated BlobDB: blob files enabled 0, min blob size 0, blob file size 268435456, blob compression type NoCompression, blob GC enabled 0, cutoff 0.250000, force threshold 1.000000, blob compaction readahead size 0, blob file starting level 0
Integrated BlobDB: blob cache disabled
DB path: [/tmp/jewoongh/rocksdb_crashtest_blackbox7ybna32z]
(Re-)verified 0 unique IDs
Compaction filter factory: DbStressCompactionFilterFactory
2024/01/29-09:05:39  Initializing worker threads
Crash-recovery verification passed :)
2024/01/29-09:05:46  Starting database operations
2024/01/29-09:05:46  Starting verification
Stress Test : 235.917 micros/op 16000 ops/sec
            : Wrote 0.00 MB (0.16 MB/sec) (16% of 12 ops)
            : Wrote 2 times
            : Deleted 1 times
            : Single deleted 0 times
            : 9 read and 0 found the key
            : Prefix scanned 0 times
            : Iterator size sum is 0
            : Iterated 0 times
            : Deleted 0 key-ranges
            : Range deletions covered 0 keys
            : Got errors 0 times
            : 0 CompactFiles() succeed
            : 0 CompactFiles() did not succeed
```
`stderr` was printed separately:
```
$> cat /tmp/error.log
WARNING: prefix_size is non-zero but memtablerep != prefix_hash
Error : jewoongh injected test error This is not a real failure.
New cache capacity = 19461571
Verification failed :(
```

**New option added (whitebox)**

The run printed only `stdout`:
```
$> python3 tools/db_crashtest.py whitebox --simple --max_key=25000000 --write_buffer_size=4194304 --print_stderr_separately 2> /tmp/error.log

Running whitebox-crash-test with
total-duration=10000
...
Integrated BlobDB: blob files enabled 0, min blob size 0, blob file size 268435456, blob compression type NoCompression, blob GC enabled 0, cutoff 0.250000, force threshold 1.000000, blob compaction readahead size 0, blob file starting level 0
Integrated BlobDB: blob cache disabled
DB path: [/tmp/jewoongh/rocksdb_crashtest_whiteboxtwj0ihn6]
(Re-)verified 157 unique IDs
2024/01/29-09:39:59  Initializing worker threads
Crash-recovery verification passed :)
2024/01/29-09:40:16  Starting database operations
2024/01/29-09:40:16  Starting verification
Stress Test : 742.474 micros/op 11801 ops/sec
            : Wrote 0.00 MB (0.27 MB/sec) (36% of 19 ops)
            : Wrote 7 times
            : Deleted 1 times
            : Single deleted 0 times
            : 8 read and 0 found the key
            : Prefix scanned 0 times
            : Iterator size sum is 0
            : Iterated 4 times
            : Deleted 0 key-ranges
            : Range deletions covered 0 keys
            : Got errors 0 times
            : 0 CompactFiles() succeed
            : 0 CompactFiles() did not succeed

TEST FAILED. See kill option and exit code above!!!
```
`stderr` was printed separately:
```
$> cat /tmp/error.log
WARNING: prefix_size is non-zero but memtablerep != prefix_hash
Error : jewoongh injected test error This is not a real failure.
Error : jewoongh injected test error This is not a real failure.
Error : jewoongh injected test error This is not a real failure.
New cache capacity = 4865393
Verification failed :(
```
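
With the new option, a downstream triage script only needs to read the redirected file. As a purely hypothetical illustration (the helper and the hard-coded path below are illustrative, not part of this change), it can reuse the same `WARNING` filter the old inline code applied:

```python
# Hypothetical triage helper, not part of this change: read the redirected
# stderr log and keep only non-empty, non-WARNING lines (likely real errors).
def extract_errors(stderr_log_path):
    with open(stderr_log_path) as f:
        return [
            line.rstrip("\n")
            for line in f
            if line.strip() and not line.startswith("WARNING")
        ]


if __name__ == "__main__":
    for err in extract_errors("/tmp/error.log"):
        print(err)
```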

Reviewed By: akankshamahajan15

Differential Revision: D53187491

Pulled By: jaykorean

fbshipit-source-id: 76f9100d08b96d014e41b7b88b206d69f0ae932b

```diff
diff --git a/tools/db_crashtest.py b/tools/db_crashtest.py
--- a/tools/db_crashtest.py
+++ b/tools/db_crashtest.py
@@ -812,6 +812,7 @@ def gen_cmd(params, unknown_params):
                 "test_tiered_storage",
                 "cleanup_cmd",
                 "skip_tmpdir_check",
+                "print_stderr_separately"
             }
             and v is not None
         ]
@@ -837,15 +838,15 @@ def execute_cmd(cmd, timeout=None):
     return hit_timeout, child.returncode, outs.decode("utf-8"), errs.decode("utf-8")
 
 
-def print_if_stderr_has_errors(stderr):
-    for line in stderr.split("\n"):
-        if line != "" and not line.startswith("WARNING"):
-            print("stderr has error message:")
-            print("***" + line + "***")
+def print_if_stderr_has_errors(stderr, print_as_stderr=False):
+    if len(stderr) == 0:
+        return
 
-    stderrdata = stderr.lower()
-    errorcount = stderrdata.count("error") - stderrdata.count("got errors 0 times")
-    print("#times error occurred in output is " + str(errorcount) + "\n")
+    if print_as_stderr:
+        print(stderr, file=sys.stderr)
+    else:
+        print("stderr:")
+        print(stderr)
 
 
 def cleanup_after_success(dbname):
@@ -885,11 +886,10 @@ def blackbox_crash_main(args, unknown_args):
             print("Exit Before Killing")
             print("stdout:")
             print(outs)
-            print("stderr:")
-            print(errs)
+            print_if_stderr_has_errors(errs, print_as_stderr=args.print_stderr_separately)
             sys.exit(2)
 
-        print_if_stderr_has_errors(errs);
+        print_if_stderr_has_errors(errs, print_as_stderr=args.print_stderr_separately)
 
         time.sleep(1)  # time to stabilize before the next run
 
@@ -909,7 +909,8 @@ def blackbox_crash_main(args, unknown_args):
 
     # Print stats of the final run
     print("stdout:", outs)
-    print_if_stderr_has_errors(errs)
+    # Print stderr of the final run
+    print_if_stderr_has_errors(errs, print_as_stderr=args.print_stderr_separately)
 
     # we need to clean up after ourselves -- only do this on test success
     cleanup_after_success(dbname)
@@ -1054,7 +1055,7 @@ def whitebox_crash_main(args, unknown_args):
 
         print(msg)
         print(stdoutdata)
-        print(stderrdata)
+        print_if_stderr_has_errors(stderrdata, print_as_stderr=args.print_stderr_separately)
 
         if hit_timeout:
             print("Killing the run for running too long")
@@ -1118,6 +1119,7 @@ def main():
     parser.add_argument("--test_tiered_storage", action="store_true")
     parser.add_argument("--cleanup_cmd")
     parser.add_argument("--skip_tmpdir_check", action="store_true")
+    parser.add_argument("--print_stderr_separately", action="store_true", default=False)
 
     all_params = dict(
         list(default_params.items())
```
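
One note on the first hunk: `print_stderr_separately` is consumed only by the Python harness, so it is added to the set of keys that `gen_cmd` skips when building the `db_stress` argument list. A simplified sketch of that filtering (key set abbreviated, not the exact code):

```python
# Simplified sketch of the gen_cmd filtering: harness-only parameters are
# never turned into --key=value flags for the db_stress binary.
HARNESS_ONLY_PARAMS = {
    "test_tiered_storage",
    "cleanup_cmd",
    "skip_tmpdir_check",
    "print_stderr_separately",
}


def build_stress_args(params):
    return [
        "--{0}={1}".format(k, v)
        for k, v in sorted(params.items())
        if k not in HARNESS_ONLY_PARAMS and v is not None
    ]
```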