Commit graph

4831 commits

Author SHA1 Message Date
zensan 78711524b7 In all the places where log records are read, there was a check that
record.size() should not be less than 12.

This "magic number" seems to be the WriteBatch header (8 byte sequence
and 4 byte count).   Replaced all the places where "12" was used
by WriteBatchInternal::kHeader.
2016-03-30 23:05:22 +05:30
Islam AbdelRahman e3802531f1 Merge pull request #1050 from yuslepukhin/support_db_test2
Add support for db_test2 for dev and CI runs
2016-03-23 17:59:30 -07:00
Dmitri Smirnov e7cc49cbdf Add support for db_test2 for dev and CI runs 2016-03-23 16:33:57 -07:00
sdong 3996770d0b Add comments to perf_context skip counters
Summary: Document the skipped counters in perf context more clearly.

Test Plan: Comment only.

Reviewers: IslamAbdelRahman, yhchiang, MarkCallaghan

Reviewed By: MarkCallaghan

Subscribers: leveldb, andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D55833
2016-03-23 14:26:25 -07:00
Mike Kolupaev 4e85b74790 Make WritableFileWrapper not screw up preallocation
Summary:
Without this diff, this is what happens to compaction output file if it's a subclass of WritableFileWrapper:
- during compaction, all `PrepareWrite()` calls update `last_preallocated_block_` of the `WritableFileWrapper` itself, not of `target_`, since `PrepareWrite()` is not virtual,
- `PrepareWrite()` calls `Allocate()`, which is virtual; it does `fallocate()` on `target_`,
- after writing data, `target_->Close()` calls `GetPreallocationStatus()` of `target_`; it returns `last_preallocated_block_` of `target_`, which is zero because it was never touched before,
- `target_->Close()` doesn't call `ftruncate()`; file remains big.

This diff fixes it in a straightforward way, by making the methods virtual. `WritableFileWrapper` ends up having the useless fields `last_preallocated_block_` and `preallocation_block_size_`. I think ideally the preallocation logic should be outside `WritableFile`, the same way as `log_writer.h` and `file_reader_writer.h` moved some non-platform-specific logic out of Env, but that's probably not worth the effort now.

Test Plan: `make -j check`; I'm going to deploy it on our test tier and see if it fixes space reclamation problem there

Reviewers: yhchiang, sdong

Reviewed By: sdong

Subscribers: andrewkr, dhruba, sdong

Differential Revision: https://reviews.facebook.net/D54681
2016-03-23 09:14:56 -07:00
Siying Dong ec458dcdeb Merge pull request #1047 from PraveenSinghRao/wal_filter_ex
Avoid overloaded virtual function
2016-03-22 21:48:39 -07:00
Praveen Rao 583157f710 Avoid overloaded virtual function 2016-03-22 17:10:31 -07:00
sdong b9d4fa890b Options settable tests to use a different special charactor
Summary: Something changed and the special charactor seems to be conflict with an exisitng value. Change it to unblock the build.

Test Plan: Run the test and make sure it passes

Reviewers: kradhakrishnan

Reviewed By: kradhakrishnan

Subscribers: leveldb, andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D55845
2016-03-22 16:21:53 -07:00
Siying Dong 60e34baef2 Merge pull request #1044 from PraveenSinghRao/wal_filter_ex
Allow per-column-family processing in wal_filter (publish log numbers for column family to wal_filter, and provide log number in the record callback)
2016-03-22 16:03:31 -07:00
Praveen Rao 136b8e0cad Merge from master 2016-03-22 12:38:44 -07:00
Praveen Rao 2dcbb3b4f3 Addressed review comments 2016-03-22 12:07:15 -07:00
sdong b1fafcaca6 Revert "Adding pin_l0_filter_and_index_blocks_in_cache feature."
This reverts commit 522de4f59e.

It has bug of index block cleaning up.
2016-03-21 11:50:42 -07:00
sdong 5f8741a69d Revert "Fix failing Java unit test."
This reverts commit d7ae42b0f8.

This is reverted as auto buld failure. This commit itself doesn't have any problem. Reverting as it depends on the commit to revert.
2016-03-21 11:46:57 -07:00
sdong 43bbb56198 tools/check_format_compatible.sh to use consistent version when testing backward and forward compatibility
Summary: Test seems to fail if we don't use consistent version between testing forward and backward compatibility.

Test Plan: Run the script (with some version removed manually to make it shorter)

Reviewers: IslamAbdelRahman

Reviewed By: IslamAbdelRahman

Subscribers: leveldb, andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D55773
2016-03-21 11:13:26 -07:00
Marton Trencseni d7ae42b0f8 Fix failing Java unit test.
Test Plan: sent diff to sdong, passes :)

Reviewers: sdong

Reviewed By: sdong

Subscribers: leveldb, andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D55749
2016-03-19 15:53:56 +00:00
agiardullo fbbb8a6144 Add test for Snapshot 0
Summary:
I ran into this assert when stress testing transactions.  It's pretty easy to repro.

Changing VersionSet::last_sequence_ to start at 1 seems pretty straightforward.  We would just need to change the 4 callers of SetLastSequence(), including recovery code.  I'd make this change myself, but I do not have enough time to test changes to recovery code-paths this week.  But checking in this test case (disabled) for future fixing.

Test Plan: n/a

Reviewers: yhchiang, kradhakrishnan, andrewkr, anthony, IslamAbdelRahman

Reviewed By: IslamAbdelRahman

Subscribers: andrewkr, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D55311
2016-03-18 16:16:20 -07:00
Andrew Kryczka e182f03c1e Add unit tests for RepairDB
Summary:
Basic test cases:

- Manifest is lost or corrupt
- Manifest refers to too many or too few SST files
- SST file is corrupt
- Unflushed data is present when RepairDB is called

Depends on D55065 for its CreateFile() function in file_utils

Test Plan: Ran the tests.

Reviewers: IslamAbdelRahman, yhchiang, yoshinorim, sdong

Reviewed By: sdong

Subscribers: leveldb, andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D55485
2016-03-18 15:18:42 -07:00
Praveen Rao 7d371863e5 travis build fixes 2016-03-18 14:43:22 -07:00
Karthikeyan Radhakrishnan fbea4dc660 Merge pull request #1042 from SherlockNoMad/HistFix
Fix in HistogramWindowingImpl
2016-03-18 14:27:55 -07:00
sdong 780d2b04cb Update format compatible checking tool
Summary: After introducing a less forward-compatible change, update the backward compatible checking tool.

Test Plan: Run it.

Reviewers: IslamAbdelRahman

Reviewed By: IslamAbdelRahman

Subscribers: IslamAbdelRahman, leveldb, andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D55695
2016-03-18 14:07:15 -07:00
Praveen Rao 4f1c74a46e merge from master 2016-03-18 12:48:01 -07:00
Praveen Rao f8c2189307 Publish log numbers for column family to wal_filter, and provide log number in the record callback 2016-03-18 12:32:15 -07:00
Marton Trencseni 44756260ae Reset block cache in failing unit test.
Test Plan: make -j40 check OPT=-g, on both /tmp and /dev/shm

Reviewers: sdong

Reviewed By: sdong

Subscribers: andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D55701
2016-03-18 06:13:54 +00:00
Marton Trencseni 522de4f59e Adding pin_l0_filter_and_index_blocks_in_cache feature.
Summary:
When a block based table file is opened, if prefetch_index_and_filter is true, it will prefetch the index and filter blocks, putting them into the block cache.
What this feature adds: when a L0 block based table file is opened, if pin_l0_filter_and_index_blocks_in_cache is true in the options (and prefetch_index_and_filter is true), then the filter and index blocks aren't released back to the block cache at the end of BlockBasedTableReader::Open(). Instead the table reader takes ownership of them, hence pinning them, ie. the LRU cache will never push them out. Meanwhile in the table reader, further accesses will not hit the block cache, thus avoiding lock contention.
When the table reader is destroyed, it releases the pinned blocks (if there were any). This has to happen before the cache is destroyed, so I had to introduce a TableReader::Close(), to guarantee the order of destruction.

Test Plan:
Added two unit tests for this. Existing unit tests run fine (default is pin_l0_filter_and_index_blocks_in_cache=false).

DISABLE_JEMALLOC=1 OPT=-g make all valgrind_check -j32
  Mac: OK.
  Linux: with D55287 patched in it's OK.

Reviewers: sdong

Reviewed By: sdong

Subscribers: andrewkr, leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D54801
2016-03-17 22:40:01 +00:00
Islam AbdelRahman be22271265 Merge pull request #1041 from yuslepukhin/adjust_for_jemalloc
Latest versions of Jemalloc library do not require je_init()/je_unint()
2016-03-17 14:50:19 -07:00
SherlockNoMad 4ecc03c039 Fix in HistogramWindowingImpl 2016-03-17 14:28:41 -07:00
Dmitri Smirnov 2ca0994cf7 Latest versions of Jemalloc library do not require je_init()/je_unint()
calls. #ifdef in the source code and make this a default build option.
2016-03-17 11:25:20 -07:00
Gunnar Kudrjavets 90aff0c444 Update --max_write_buffer_number for compaction benchmarks
Summary: For compactions benchmarks (both level and universal) we'll use `--max_write_buffer_number=4`. For all the other benchmarks which don't customize the value of `--max_background_flushes` we'll continue using `--max_write_buffer_number=8`.

Test Plan:
To validate basic correctness and command-line options:

```
cd ~/rocksdb
NKEYS=10000000 ./tools/run_flash_bench.sh
```

Reviewers: MarkCallaghan

Reviewed By: MarkCallaghan

Subscribers: andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D55497
2016-03-17 10:14:23 -07:00
Andrew Kryczka 72224104d3 Forge current file for checkpoint
Summary:
This fixes a similar issue as D54711: "CURRENT" file can mutate between
GetLiveFiles() and copy to the tmp directory, in which case it would reference
the wrong manifest filename. To fix this, I forge the "CURRENT" file such that
it simply contains the filename for the manifest returned by GetLiveFiles().

- Changed CreateCheckpoint() to forge current file
- Added CreateFile() utility function
- Added test case that rolls manifest during checkpoint creation

Test Plan:
  $ ./checkpoint_test

Reviewers: sdong, IslamAbdelRahman

Reviewed By: IslamAbdelRahman

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D55065
2016-03-17 10:07:21 -07:00
dhruba borthakur 33d568611d Merge pull request #1040 from bureau14/master
Fixes warnings and ensure correct int behavior on 32-bit platforms.
2016-03-17 03:23:30 -07:00
Edouard A 02e62ebbc8 Fixes warnings and ensure correct int behavior on 32-bit platforms. 2016-03-16 22:57:57 +01:00
Siying Dong 9cad56861a Merge pull request #1039 from bureau14/master
Improve documentation of the allow_os_buffer parameter.
2016-03-16 10:30:01 -07:00
Edouard Alligand 3d29f91466 Improve documentation of the allow_os_buffer parameter. 2016-03-16 15:37:55 +01:00
Islam AbdelRahman 3ff98bd209 Fix no compression test
Summary:
DBBlockCacheTest.TestWithCompressedBlockCache is depending on compression using snappy, so this test fail when snappy is not available
block this test when we don't have snappy

https://ci-builds.fb.com/view/rocksdb/job/rocksdb_no_compression/833/console

Test Plan: run the test when compression libraries are not avaliable

Reviewers: sdong, yiwu

Reviewed By: yiwu

Subscribers: andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D55413
2016-03-15 12:17:40 -07:00
Igor Canadi b9cc42a72c Merge pull request #1038 from SherlockNoMad/HistFix
Fix FB internal CI build failure
2016-03-15 11:43:06 -07:00
SherlockNoMad f76b260eff Fix FB internal CI build failure 2016-03-15 11:38:15 -07:00
Siying Dong 774922c680 Merge pull request #1026 from SherlockNoMad/Hist
Histogram Concurrency Improvement and Time-Windowing Support
2016-03-15 11:27:54 -07:00
Igor Canadi 17b879b91e Merge pull request #1037 from SherlockNoMad/BuildFix
Fix AppVeyor build error
2016-03-15 11:15:56 -07:00
SherlockNoMad f11b0df121 Fix AppVeyor build error 2016-03-15 10:57:33 -07:00
Gunnar Kudrjavets 6b03f93d4f Fix the build break on Ubuntu 15.10 when gcc 5.2.1 is used
Summary:
Using gcc 5.2.1 to compile RocksDB on Ubuntu 15.10 results in a warning about unused variable. Warning is treated as an error and all of it results in a build break. Fix the issue and make sure that everything compiles with that particular configuration.

P.S. Lint complains about a non-ASCII character used in the source code ("Wagner-Fischer"). Fix this as well.

Test Plan:
  # Everything compiler cleanly on the system which exhibited the problem.
  # `make clean && make -j 16` on CentOS.
  # `make clean && USE_CLANG=1 make -j 16` on CentOS.

Reviewers: sdong, igor

Reviewed By: igor

Subscribers: andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D55503
2016-03-15 10:30:10 -07:00
Gunnar Kudrjavets 697fab820a Updates to RocksDB subcompaction benchmarking script
Summary: Set of updates to the subcompaction benchmark script which are based on our internal discussions. The intent behind the changes is to make sure that the scripts will correctly reflect how we're doing the actual benchmarking.

Test Plan: Tested by exercising the full set of compaction benchmarks and validating the execution and consistency of results.

Reviewers: MarkCallaghan, sdong, yhchiang

Reviewed By: yhchiang

Subscribers: andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D55461
2016-03-14 23:09:04 -07:00
SherlockNoMad 58379bfb55 remove division from histogramwidowing impl 2016-03-14 21:41:13 -07:00
Dhruba Borthakur 1a2cc27e01 ColumnFamilyOptions SanitizeOptions is buggy on 32-bit platforms.
Summary:
The pre-existing code is trying to clamp between 65,536 and 0,
resulting in clamping to 65,536, resulting in very small buffers,
resulting in ShouldFlushNow() being true quite easily,
resulting in assertion failing and database performance
being "not what it should be".

https://github.com/facebook/rocksdb/issues/1018

Test Plan: make check

Reviewers: sdong, andrewkr, IslamAbdelRahman, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb, andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D55455
2016-03-14 16:21:54 -07:00
Igor Canadi e778c34e78 Merge pull request #1035 from bureau14/master
Added quasardb to the USERS.md file
2016-03-14 15:50:55 -07:00
Edouard A 5bd3da1c59 Added quasardb to the USERS.md file 2016-03-14 23:48:28 +01:00
sdong b2ae5950ba Index Reader should not be reused after DB restart
Summary:
In block based table reader, wow we put index reader to block cache, which can be retrieved after DB restart. However, index reader may reference internal comparator, which can be destroyed after DB restarts, causing problems.
Fix it by making cache key identical per table reader.

Test Plan: Add a new test which failed with out the commit but now pass.

Reviewers: IslamAbdelRahman

Reviewed By: IslamAbdelRahman

Subscribers: maro, yhchiang, kradhakrishnan, leveldb, andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D55287
2016-03-14 10:04:09 -07:00
Andrew Kryczka 0267655dad Update change log for 4.6 release
Summary: as titled

Test Plan: N/A

Reviewers: sdong, kradhakrishnan, anthony, yhchiang, IslamAbdelRahman

Reviewed By: IslamAbdelRahman

Subscribers: leveldb, andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D55323
2016-03-12 13:51:57 -08:00
Andrew Kryczka 08304c0867 Expose RepairDB as ldb command
Summary: This will make it easier for admins and devs to use RepairDB.

Test Plan:
Tried deleting the manifest and verified it recovers:

  $ ldb --create_if_missing --db=/tmp/test_db put ok ok
  $ rm -f /tmp/test_db/MANIFEST-000001
  $ ./ldb --db=/tmp/test_db repair
  $ ldb --db=/tmp/test_db get ok
  ok

Reviewers: yhchiang, sdong, IslamAbdelRahman

Reviewed By: IslamAbdelRahman

Subscribers: leveldb, andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D55359
2016-03-12 13:50:20 -08:00
SherlockNoMad fd664a27b5 Fix Build Error 2016-03-11 22:56:25 -08:00
Islam AbdelRahman 580fede347 Aggregate hot Iterator counters in LocalStatistics (DBIter::Next perf regression)
Summary:
This patch bump the counters in the frequent code path DBIter::Next() / DBIter::Prev() in a local data members and send them to Statistics when the iterator is destroyed
A better solution will be to have thread_local implementation for Statistics

New performance
```
readseq      :       0.035 micros/op 28597881 ops/sec; 3163.7 MB/s
     1,851,568,819      stalled-cycles-frontend   #   31.29% frontend cycles idle    [49.86%]
       884,929,823      stalled-cycles-backend    #   14.95% backend  cycles idle    [50.21%]
readreverse  :       0.071 micros/op 14077393 ops/sec; 1557.3 MB/s
     3,239,575,993      stalled-cycles-frontend   #   27.36% frontend cycles idle    [49.96%]
     1,558,253,983      stalled-cycles-backend    #   13.16% backend  cycles idle    [50.14%]

```

Existing performance

```
readreverse  :       0.174 micros/op 5732342 ops/sec;  634.1 MB/s
    20,570,209,389      stalled-cycles-frontend   #   70.71% frontend cycles idle    [50.01%]
    18,422,816,837      stalled-cycles-backend    #   63.33% backend  cycles idle    [50.04%]

readseq      :       0.119 micros/op 8400537 ops/sec;  929.3 MB/s
    15,634,225,844      stalled-cycles-frontend   #   79.07% frontend cycles idle    [49.96%]
    14,227,427,453      stalled-cycles-backend    #   71.95% backend  cycles idle    [50.09%]
```

Test Plan: unit tests

Reviewers: yhchiang, sdong, igor

Reviewed By: sdong

Subscribers: andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D55107
2016-03-11 19:01:12 -08:00