rocksdb

mirror of https://github.com/facebook/rocksdb.git synced 2024-12-04 02:02:41 +00:00

Author	SHA1	Message	Date
sdong	a801c1fb09	db_bench --num_hot_column_families to be default off Summary: Having --num_hot_column_families default on fails some existing regression tests. By default turn it off Test Plan: Run db_bench to make sure it is default off. Reviewers: yhchiang, rven, igor Reviewed By: igor Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D30705	2014-12-24 09:00:23 -08:00
Manish Patil	2067058a60	Dump routine to BlockBasedTableReader (valgrind) Summary: Fixed valgrind issue Test Plan: valgrind check done Reviewers: rven, sdong Reviewed By: sdong Subscribers: sdong, dhruba Differential Revision: https://reviews.facebook.net/D30699	2014-12-23 18:01:29 -08:00
sdong	ddc81440d5	db_bench to add an option as number of hot column families to add to Summary: Add option --num_hot_column_families in db_bench. If it is set, write options will first write to that number of column families, and then move on to next set of hot column families. The working set of column families can be smaller than total number of CFs. It is to test how RocksDB can handle cold column families Test Plan: Run db_bench with --num_hot_column_families set and not set. Reviewers: yhchiang, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D30663	2014-12-23 16:52:05 -08:00
Yueh-Hsuan Chiang	a944afd356	Fixed a compile error in db/db_impl.cc on ROCKSDB_LITE	2014-12-23 16:19:40 -08:00
Manish Patil	7ea7bdf04d	Dump routine to BlockBasedTableReader Summary: Added necessary routines for dumping block based SST with block filter Test Plan: Added "raw" mode to utility sst_dump Reviewers: sdong, rven Reviewed By: rven Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D29679	2014-12-23 13:24:07 -08:00
Igor Canadi	ae508df90e	Clean up compile for c_simple_example	2014-12-23 17:32:30 +01:00
Igor Canadi	b623009619	Fix compile of compact_file_example	2014-12-23 17:14:44 +01:00
Igor Canadi	ded26605f4	Merge pull request #444 from adamretter/java-api-fix Fix the Java API build on Mac OS X	2014-12-23 15:26:45 +01:00
Adam Retter	98490bccf6	Fix the build on Mac OS X	2014-12-23 14:22:56 +00:00
Yueh-Hsuan Chiang	4d99729741	Merge pull request #443 from behanna/master Fix the build with -DNDEBUG.	2014-12-22 23:57:19 -08:00
Lei Jin	5045c43944	add support for nested BlockBasedTableOptions in config string Summary: Add support to allow nested config for block-based table factory. The format looks like this: "write_buffer_size=1024;block_based_table_factory={block_size=4k};max_write_buffer_num=2" Test Plan: unit test Reviewers: yhchiang, rven, igor, ljin, jonahcohen Reviewed By: jonahcohen Subscribers: jonahcohen, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D29223	2014-12-22 16:34:21 -08:00
Chris BeHanna	d232cb156b	Fix the build with -DNDEBUG. Dike out the body of VerifyCompactionResult. With assert() compiled out, the loop index variable in the inner loop was unused, breaking the build when -Werror is enabled.	2014-12-22 17:06:18 -06:00
Yueh-Hsuan Chiang	45bab305f9	Move GetThreadList() feature under Env. Summary: GetThreadList() feature depends on the thread creation and destruction, which is currently handled under Env. This patch moves GetThreadList() feature under Env to better manage the dependency of GetThreadList() feature on thread creation and destruction. Renamed ThreadStatusImpl to ThreadStatusUpdater. Add ThreadStatusUtil, which is a static class that contains utility functions for ThreadStatusUpdater. Test Plan: run db_test, thread_list_test and db_bench and verify the life cycle of Env and ThreadStatusUpdater is properly managed. Reviewers: igor, sdong Reviewed By: sdong Subscribers: ljin, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D30057	2014-12-22 12:20:17 -08:00
Igor Canadi	4fd26f287c	Only execute flush from compaction if max_background_flushes = 0 Summary: As title. We shouldn't need to execute flush from compaction if there are dedicated threads doing flushes. Test Plan: make check Reviewers: rven, yhchiang, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D30579	2014-12-22 12:05:14 +01:00
Igor Canadi	0acc738810	Speed up FindObsoleteFiles() Summary: There are two versions of FindObsoleteFiles(): * full scan, which is executed every 6 hours (and it's terribly slow) * no full scan, which is executed every time a background process finishes and iterator is deleted This diff is optimizing the second case (no full scan). Here's what we do before the diff: * Get the list of obsolete files (files with ref==0). Some files in obsolete_files set might actually be live. * Get the list of live files to avoid deleting files that are live. * Delete files that are in obsolete_files and not in live_files. After this diff: * The only files with ref==0 that are still live are files that have been part of move compaction. Don't include moved files in obsolete_files. * Get the list of obsolete files (which exclude moved files). * No need to get the list of live files, since all files in obsolete_files need to be deleted. I'll post the benchmark results, but you can get the feel of it here: https://reviews.facebook.net/D30123 This depends on D30123. P.S. We should do full scan only in failure scenarios, not every 6 hours. I'll do this in a follow-up diff. Test Plan: One new unit test. Made sure that unit test fails if we don't have a `if (!f->moved)` safeguard in ~Version. make check Big number of compactions and flushes: ./db_stress --threads=30 --ops_per_thread=20000000 --max_key=10000 --column_families=20 --clear_column_family_one_in=10000000 --verify_before_write=0 --reopen=15 --max_background_compactions=10 --max_background_flushes=10 --db=/fast-rocksdb-tmp/db_stress --prefixpercent=0 --iterpercent=0 --writepercent=75 --db_write_buffer_size=2000000 Reviewers: yhchiang, rven, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D30249	2014-12-22 12:04:45 +01:00
Igor Canadi	d8c4ce6b50	Merge pull request #442 from alabid/alabid/fix-example-typo fix really trivial typo in column families example	2014-12-22 09:07:51 +01:00
alabid	949bd71fd0	fix really trivial typo	2014-12-22 00:36:16 -05:00
Igor Canadi	f8999fcf31	Fix a SIGSEGV in BackgroundFlush Summary: This one wasn't easy to find :) What happens is we go through all cfds on flush_queue_ and find no cfds to flush, but the cfd is set to the last CF we looped through and following code assumes we want it flushed. BTW @sdong do you think we should also make BackgroundFlush() only check a single cfd for flushing instead of doing this `while (!flush_queue_.empty())`? Test Plan: regression test no longer fails Reviewers: sdong, rven, yhchiang Reviewed By: yhchiang Subscribers: sdong, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D30591	2014-12-21 00:23:28 -08:00
Igor Canadi	ade4034a9d	MultiGet for DBWithTTL Summary: This is a feature request from rocksdb's user. I didn't even realize we don't support multigets on TTL DB :) Test Plan: added a unit test Reviewers: yhchiang, rven, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D30561	2014-12-20 12:46:37 +01:00
Igor Canadi	fdb6be4e24	Rewritten system for scheduling background work Summary: When scaling to higher number of column families, the worst bottleneck was MaybeScheduleFlushOrCompaction(), which did a for loop over all column families while holding a mutex. This patch addresses the issue. The approach is similar to our earlier efforts: instead of a pull-model, where we do something for every column family, we can do a push-based model -- when we detect that column family is ready to be flushed/compacted, we add it to the flush_queue_/compaction_queue_. That way we don't need to loop over every column family in MaybeScheduleFlushOrCompaction. Here are the performance results: Command: ./db_bench --write_buffer_size=268435456 --db_write_buffer_size=268435456 --db=/fast-rocksdb-tmp/rocks_lots_of_cf --use_existing_db=0 --open_files=55000 --statistics=1 --histogram=1 --disable_data_sync=1 --max_write_buffer_number=2 --sync=0 --benchmarks=fillrandom --threads=16 --num_column_families=5000 --disable_wal=1 --max_background_flushes=16 --max_background_compactions=16 --level0_file_num_compaction_trigger=2 --level0_slowdown_writes_trigger=2 --level0_stop_writes_trigger=3 --hard_rate_limit=1 --num=33333333 --writes=33333333 Before the patch: fillrandom : 26.950 micros/op 37105 ops/sec; 4.1 MB/s After the patch: fillrandom : 17.404 micros/op 57456 ops/sec; 6.4 MB/s Next bottleneck is VersionSet::AddLiveFiles, which is painfully slow when we have a lot of files. This is coming in the next patch, but when I removed that code, here's what I got: fillrandom : 7.590 micros/op 131758 ops/sec; 14.6 MB/s Test Plan: make check two stress tests: Big number of compactions and flushes: ./db_stress --threads=30 --ops_per_thread=20000000 --max_key=10000 --column_families=20 --clear_column_family_one_in=10000000 --verify_before_write=0 --reopen=15 --max_background_compactions=10 --max_background_flushes=10 --db=/fast-rocksdb-tmp/db_stress --prefixpercent=0 --iterpercent=0 --writepercent=75 --db_write_buffer_size=2000000 max_background_flushes=0, to verify that this case also works correctly ./db_stress --threads=30 --ops_per_thread=2000000 --max_key=10000 --column_families=20 --clear_column_family_one_in=10000000 --verify_before_write=0 --reopen=3 --max_background_compactions=3 --max_background_flushes=0 --db=/fast-rocksdb-tmp/db_stress --prefixpercent=0 --iterpercent=0 --writepercent=75 --db_write_buffer_size=2000000 Reviewers: ljin, rven, yhchiang, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D30123	2014-12-19 20:38:12 +01:00
Igor Canadi	a3001b1d3d	Remove -mtune=native because it's redundant	2014-12-19 09:06:45 -08:00
Yueh-Hsuan Chiang	e27c84522f	Merge pull request #437 from fyrz/RocksJava-SliceTests-Fixes [RocksJava] Slice / DirectSlice improvements	2014-12-18 13:53:35 -08:00
fyrz	1fed1282ad	[RocksJava] Incorporated changes D30081	2014-12-18 22:27:50 +01:00
fyrz	5b9ceef01d	[RocksJava] JavaDoc correction	2014-12-18 22:19:57 +01:00
fyrz	5fbba60b6a	[RocksJava] Incorporated changes D30081	2014-12-18 22:15:00 +01:00
fyrz	b0230d7e09	[RocksJava] Incorporate additions for D30081	2014-12-18 22:05:07 +01:00
fyrz	b015ed0ca6	[RocksJava] Slice / DirectSlice improvements Summary: - AssertionError when initialized with Non-Direct Buffer - Tests + coverage for DirectSlice - Slice sigsegv fixes when initializing from String and byte arrays - Slice Tests Test Plan: Run tests without source modifications. Reviewers: yhchiang, adamretter, ankgup87 Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D30081	2014-12-18 22:05:07 +01:00
Yueh-Hsuan Chiang	4d422db010	Merge pull request #430 from adamretter/increase-parallelism Added setIncreaseParallelism() to Java API Options	2014-12-18 11:02:13 -08:00
Yueh-Hsuan Chiang	04c4e49691	Merge pull request #411 from fyrz/RocksJava-RangeCompaction [RocksJava] Range compaction	2014-12-18 11:01:23 -08:00
Igor Canadi	62d19b7b59	Merge pull request #427 from haneefmubarak/c-examples C example	2014-12-18 16:01:15 +01:00
Haneef Mubarak	28424d734b	style fixes in c example	2014-12-18 06:48:46 -08:00
Venkatesh Radhakrishnan	7198ed5a2e	Handle errors during pthread calls Summary: Release locks before calling exit. Test Plan: Force errors in debugger and verify correctness Reviewers: igor, yhchiang, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D30423	2014-12-17 16:25:09 -08:00
Haneef Mubarak	91c58752fa	error detection and memory leaks in c example	2014-12-17 02:06:36 -08:00
Yueh-Hsuan Chiang	25f70a5abb	Avoid unnecessary unlock and lock mutex when notifying events. Summary: Avoid unnecessary unlock and lock mutex when notifying events. Test Plan: ./listener_test Reviewers: igor Reviewed By: igor Subscribers: sdong, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D30267	2014-12-16 17:10:23 -08:00
Venkatesh Radhakrishnan	7661e5a76e	Move the file copy out of the mutex. Summary: We now release the mutex before copying the files in the case of the trivial move. This path does not use the compaction job. Test Plan: DBTest.LevelCompactionThirdPath Reviewers: yhchiang, igor, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D30381	2014-12-16 16:57:22 -08:00
Adam Retter	17e84f2151	Rudimentary test cases for setIncreaseParallelism	2014-12-16 14:32:42 +00:00
Adam Retter	eda0dcdd97	Exposed IncreasedParallelism option to Java API as setIncreasedParallelism	2014-12-16 14:15:41 +00:00
fyrz	efc94ceb27	[RocksJava] Incorporated changes for D29283	2014-12-16 07:21:01 +01:00
fyrz	69188ff449	[RocksJava] CompactRange support Summary: Manual range compaction support in RocksJava. Test Plan: make rocksdbjava make jtest mvn -f rocksjni.pom package Reviewers: adamretter, yhchiang, ankgup87 Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D29283	2014-12-16 07:21:01 +01:00
fyrz	48adce77cc	[RocksJava] CompactRange support - manual range compaction support in RocksJava	2014-12-16 07:21:00 +01:00
Venkatesh Radhakrishnan	153f4f0719	RocksDB: Allow Level-Style Compaction to Place Files in Different Paths Summary: Allow Level-style compaction to place files in different paths This diff provides the code for task 4854591. We now support level-compaction to place files in different paths by specifying them in db_paths along with the minimum level for files to store in that path. Test Plan: ManualLevelCompactionOutputPathId in db_test.cc Reviewers: yhchiang, MarkCallaghan, dhruba, yoshinorim, sdong Reviewed By: sdong Subscribers: yoshinorim, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D29799	2014-12-15 21:48:16 -08:00
Igor Canadi	06eed650a0	Optimize default compile to compilation platform by default Summary: This diff changes compile to optimize for native platform by default. This will automatically turn on crc32 optimizations for modern processors, which greatly improves rocksdb's performance. I also did some more changes to compilation documentation. Test Plan: compile with `make`, observe -march=native compile with `PORTABLE=1 make`, observe no -march=native Reviewers: sdong, rven, yhchiang, MarkCallaghan Reviewed By: MarkCallaghan Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D30225	2014-12-15 11:29:41 +01:00
Qiao Yang	cef6f84393	Added 'dump_live_files' command to ldb tool. Summary: Preliminary diff to solicit comments. Given DB path, dump all SST files (key/value and properties), WAL file and manifest files. What command options do we need to support for this command? Maybe output_hex for keys? Test Plan: Create additional ldb unit tests. Reviewers: sdong, rven Reviewed By: rven Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D29547	2014-12-12 17:50:36 -08:00
sdong	7ab1526c0e	Add an assert and avoid std::sort(autovector) to investigate an ASAN issue Summary: ASAN build fails once for this error: 14:04:52 ==== Test DBTest.CompactFilesOnLevelCompaction 14:04:52 db_test: db/version_set.cc:1062: void rocksdb::VersionStorageInfo::AddFile(int, rocksdb::FileMetaData): Assertion `level <= 0 || level_files->empty() || internal_comparator_->Compare( (level_files)[level_files->size() - 1]->largest, f->smallest) < 0' failed. Unable to figure out the reason. We use std::vector for sorting to be safe and add one more assert to help figure out whether it is the sorting's problem. Test Plan: make all check Reviewers: yhchiang, rven, igor Reviewed By: igor Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D30117	2014-12-12 12:44:00 -08:00
Yueh-Hsuan Chiang	74b3fb6d97	Fix Mac compile errors on util/cache_test.cc Summary: Fix Mac compile errors on util/cache_test.cc Test Plan: make dbg -j32 ./cache_test	2014-12-11 14:15:13 -08:00
sdong	d7a486668c	Improve scalability of DB::GetSnapshot() Summary: Now DB::GetSnapshot() doesn't scale to more column families, as it needs to go through all the column families to find whether snapshot is supported. This patch optimizes it. Test Plan: Add unit tests to cover negative cases. make all check Reviewers: yhchiang, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D30093	2014-12-11 13:27:57 -08:00
Alexey Maykov	ee95cae9a4	Modified the LRU cache eviction code so that it doesn't evict blocks which have external references Summary: Currently, blocks which have more than one reference (ie referenced by something other than cache itself) are evicted from cache. This doesn't make much sense: - blocks are still in RAM, so the RAM usage reported by the cache is incorrect - if the same block is needed by another iterator, it will be loaded and decompressed again This diff changes the reference counting scheme a bit. Previously, if the cache contained the block, this was accounted for in its refcount. After this change, the refcount is only used to track external references. There is a boolean flag which indicates whether or not the block is contained in the cache. This diff also changes how LRU list is used. Previously, both hashtable and the LRU list contained all blocks. After this change, the LRU list contains blocks with the refcount==0, ie those which can be evicted from the cache. Note that this change still allows for cache to grow beyond its capacity. This happens when all blocks are pinned (ie refcount>0). This is consistent with the current behavior. The cache's insert function never fails. I spent lots of time trying to make table_reader and other places work with the insert which might fail. It turned out to be pretty hard. It might really destabilize some customers, so finally, I decided against doing this. table_cache_remove_scan_count_limit option will be unneeded after this change, but I will remove it in the following diff, if this one gets approved Test Plan: Ran tests, made sure they pass Reviewers: sdong, ljin Differential Revision: https://reviews.facebook.net/D25503	2014-12-10 22:28:53 -08:00
sdong	0ab0242f37	VersionBuilder to use unordered set and map to store added and deleted files Summary: Set operations in VersionBuilder are shown to be a performance bottleneck of restarting DB when there are lots of files. Make both of added_files and deleted_files use unordered set or map. Only when adding the files, sort the added files. Test Plan: make all check Reviewers: yhchiang, rven, igor Reviewed By: igor Subscribers: hermanlee4, leveldb, dhruba, ljin Differential Revision: https://reviews.facebook.net/D30051	2014-12-10 18:53:30 -08:00
Lei Jin	e93f044d99	add range scan test to benchmark script Summary: as title Test Plan: ran it Reviewers: yhchiang, igor, sdong, MarkCallaghan Reviewed By: MarkCallaghan Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D25563	2014-12-10 13:04:58 -08:00
Igor Canadi	cb82d7b081	Fix #434 Summary: Why do we assert here? This doesn't seem like user friendly thing to do :) Test Plan: none Reviewers: sdong, yhchiang, rven Reviewed By: rven Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D30027	2014-12-09 10:22:07 -08:00
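
Illustration for commit ee95cae9a4 (LRU cache eviction): the message describes a scheme where the refcount tracks only external references, a boolean flag records whether the block is still in the cache, and the LRU list holds only blocks with refcount==0. The Python sketch below is one possible reading of that description, not the RocksDB implementation; the names PinnedLRUCache, Entry, insert, lookup and release are hypothetical, and the usage accounting on key replacement is a simplification.

```python
from collections import OrderedDict


class Entry:
    """One cached block (hypothetical structure, for illustration only)."""

    def __init__(self, key, value, charge):
        self.key = key
        self.value = value
        self.charge = charge
        self.refs = 0          # external references only
        self.in_cache = True   # True while the hash table still holds this entry


class PinnedLRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.usage = 0
        self.table = {}           # key -> Entry, every cached entry
        self.lru = OrderedDict()  # key -> Entry, only entries with refs == 0

    def insert(self, key, value, charge=1):
        """Never fails; the cache may exceed capacity if everything is pinned."""
        old = self.table.pop(key, None)
        if old is not None:
            # Simplification: a replaced entry stops counting toward usage
            # immediately, even if a caller still holds a reference to it.
            old.in_cache = False
            self.lru.pop(key, None)
            self.usage -= old.charge
        entry = Entry(key, value, charge)
        entry.refs = 1  # the returned handle counts as one external reference
        self.table[key] = entry
        self.usage += charge
        self._evict()
        return entry

    def lookup(self, key):
        entry = self.table.get(key)
        if entry is None:
            return None
        if entry.refs == 0:
            del self.lru[key]  # pinned again, so no longer evictable
        entry.refs += 1
        return entry

    def release(self, entry):
        entry.refs -= 1
        if entry.refs == 0 and entry.in_cache:
            self.lru[entry.key] = entry  # becomes the most recently used candidate
            self._evict()

    def _evict(self):
        # Only unpinned entries are on the LRU list, so pinned ones are never evicted.
        while self.usage > self.capacity and self.lru:
            key, entry = self.lru.popitem(last=False)
            del self.table[key]
            entry.in_cache = False
            self.usage -= entry.charge
```

With a capacity of 2, inserting three entries without releasing any leaves usage at 3, which matches the note in the commit message that the cache can grow beyond its capacity while all blocks are pinned; releasing one of them makes it evictable and brings usage back to 2.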

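Illustration for commit 5045c43944 (nested BlockBasedTableOptions in config string): the message gives the format "write_buffer_size=1024;block_based_table_factory={block_size=4k};max_write_buffer_num=2". The Python sketch below shows one way such a string could be split into a nested mapping by tracking brace depth. It is not RocksDB's parser; parse_options and _split_top_level are hypothetical names, and values are kept as plain strings (a suffix such as "4k" is not interpreted).

```python
def _split_top_level(text):
    """Yield the ';'-separated pieces of text, ignoring ';' inside braces."""
    depth = 0
    start = 0
    for i, ch in enumerate(text):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced '}'")
        elif ch == ";" and depth == 0:
            piece = text[start:i].strip()
            if piece:
                yield piece
            start = i + 1
    if depth != 0:
        raise ValueError("unbalanced '{'")
    tail = text[start:].strip()
    if tail:
        yield tail


def parse_options(text):
    """Parse 'k=v;k={k=v;...}' into a nested dict of strings."""
    result = {}
    for item in _split_top_level(text):
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in {item!r}")
        name, value = name.strip(), value.strip()
        if value.startswith("{") and value.endswith("}"):
            # A braced value is itself an option string: recurse into it.
            result[name] = parse_options(value[1:-1])
        else:
            result[name] = value
    return result
```

Applied to the example string from the commit message, block_based_table_factory maps to an inner dict containing block_size, while the two other options stay at the top level.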

2919 commits