rocksdb

mirror of https://github.com/facebook/rocksdb.git synced 2024-11-30 22:41:48 +00:00

Author	SHA1	Message	Date
Yueh-Hsuan Chiang	6935eb24e0	Add ColumnFamilyHandle::GetDescriptor() Summary: This patch adds ColumnFamilyHandle::GetDescriptor(), which allows developers to obtain the CF options and names of the associated column family given its handle. // Returns the up-to-date descriptor used by the current handle. Since it // returns the up-to-date information, this call might internally lock // and release the DB mutex to access the up-to-date CF options. virtual ColumnFamilyDescriptor GetDescriptor() = 0; Test Plan: augment column_family_test Reviewers: sdong, yoshinorim, IslamAbdelRahman, rven, kradhakrishnan, anthony Reviewed By: anthony Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D51543	2016-01-06 18:14:01 -08:00
Jay Edgar	7699439b7c	Prevent the user from setting block_restart_interval to less than 1 Summary: If block_restart_interval gets set to less than 1 an assert will be triggered in BlockBuilder::BlockBuilder(). This prevents the user from doing this by silently setting any value less than 1 to 1. Test Plan: Added a test (in BlockBasedTableTest in table_test) that checks invalid values to make sure that they are reset to the expected values. The block_restart_interval value is checked along with block_size_deviation which also silently sets the value if it is outside a specific range. Reviewers: yoshinorim, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D52509	2016-01-04 14:13:18 -08:00
Igor Canadi	1dec5b8f5d	Merge pull request #916 from warrenfalk/capi_huge_page_option Expose the option, memtable_prefix_bloom_huge_page_tlb_size, via C API	2016-01-04 09:24:30 -08:00
Warren Falk	0fde291abe	expose memtable_prefix_bloom_huge_page_tlb_size option to C API	2015-12-31 09:27:03 -05:00
Warren Falk	7e81dba5cf	Support creation of "full" format bloom filter from C API	2015-12-31 09:26:49 -05:00
Venkatesh Radhakrishnan	63ddb783db	Delete files in given key range Summary: This is an initial diff for providing the ability to delete files which are completely within a given range of keys. Test Plan: DBCompactionTest.DeleteRange Reviewers: IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: yoshinorim, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D52293	2015-12-29 13:22:13 -08:00
Siying Dong	7fafd52dce	Merge pull request #900 from shuzhang1989/hdfs_env_fix add a factory method for creating hdfs env	2015-12-28 09:28:04 -08:00
Nathan Bronson	7d87f02799	support for concurrent adds to memtable Summary: This diff adds support for concurrent adds to the skiplist memtable implementations. Memory allocation is made thread-safe by the addition of a spinlock, with small per-core buffers to avoid contention. Concurrent memtable writes are made via an additional method and don't impose a performance overhead on the non-concurrent case, so parallelism can be selected on a per-batch basis. Write thread synchronization is an increasing bottleneck for higher levels of concurrency, so this diff adds --enable_write_thread_adaptive_yield (default off). This feature causes threads joining a write batch group to spin for a short time (default 100 usec) using sched_yield, rather than going to sleep on a mutex. If the timing of the yield calls indicates that another thread has actually run during the yield then spinning is avoided. This option improves performance for concurrent situations even without parallel adds, although it has the potential to increase CPU usage (and the heuristic adaptation is not yet mature). Parallel writes are not currently compatible with inplace updates, update callbacks, or delete filtering. Enable it with --allow_concurrent_memtable_write (and --enable_write_thread_adaptive_yield). Parallel memtable writes are performance neutral when there is no actual parallelism, and in my experiments (SSD server-class Linux and varying contention and key sizes for fillrandom) they are always a performance win when there is more than one thread. Statistics are updated earlier in the write path, dropping the number of DB mutex acquisitions from 2 to 1 for almost all cases. This diff was motivated and inspired by Yahoo's cLSM work. 
It is more conservative than cLSM: RocksDB's write batch group leader role is preserved (along with all of the existing flush and write throttling logic) and concurrent writers are blocked until all memtable insertions have completed and the sequence number has been advanced, to preserve linearizability. My test config is "db_bench -benchmarks=fillrandom -threads=$T -batch_size=1 -memtablerep=skip_list -value_size=100 --num=1000000/$T -level0_slowdown_writes_trigger=9999 -level0_stop_writes_trigger=9999 -disable_auto_compactions --max_write_buffer_number=8 -max_background_flushes=8 --disable_wal --write_buffer_size=160000000 --block_size=16384 --allow_concurrent_memtable_write" on a two-socket Xeon E5-2660 @ 2.2Ghz with lots of memory and an SSD hard drive. With 1 thread I get ~440Kops/sec. Peak performance for 1 socket (numactl -N1) is slightly more than 1Mops/sec, at 16 threads. Peak performance across both sockets happens at 30 threads, and is ~900Kops/sec, although with fewer threads there is less performance loss when the system has background work. Test Plan: 1. concurrent stress tests for InlineSkipList and DynamicBloom 2. make clean; make check 3. make clean; DISABLE_JEMALLOC=1 make valgrind_check; valgrind db_bench 4. make clean; COMPILE_WITH_TSAN=1 make all check; db_bench 5. make clean; COMPILE_WITH_ASAN=1 make all check; db_bench 6. make clean; OPT=-DROCKSDB_LITE make check 7. verify no perf regressions when disabled Reviewers: igor, sdong Reviewed By: sdong Subscribers: MarkCallaghan, IslamAbdelRahman, anthony, yhchiang, rven, sdong, guyg8, kradhakrishnan, dhruba Differential Revision: https://reviews.facebook.net/D50589	2015-12-25 11:03:40 -08:00
Shu Zhang	4fd23fb130	add a factory method for creating hdfs env	2015-12-23 17:26:50 -08:00
sdong	15b8902264	Change default options.delayed_write_rate Summary: We now have a mechanism to further slow down writes. Double the default options.delayed_write_rate to try to keep the default behavior closer to what it used to be. Test Plan: Run all tests. Reviewers: IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: yhchiang, kradhakrishnan, rven, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D52281	2015-12-23 14:51:55 -08:00
sdong	b9f77ba12b	When slowdown is triggered, reduce the write rate Summary: It's usually hard for users to set a value of options.delayed_write_rate. With this diff, after slowdown condition triggers, we greedily reduce write rate if estimated pending compaction bytes increase. If estimated compaction pending bytes drop, we increase the write rate. Test Plan: Add a unit test Test with db_bench setting: TEST_TMPDIR=/dev/shm/ ./db_bench --benchmarks=fillrandom -num=10000000 --soft_pending_compaction_bytes_limit=1000000000 --hard_pending_compaction_bytes_limit=3000000000 --delayed_write_rate=100000000 and make sure without the commit, write stop will happen, but with the commit, it will not happen. Reviewers: igor, anthony, rven, yhchiang, kradhakrishnan, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D52131	2015-12-23 11:33:15 -08:00
Dmitri Smirnov	dbb8260f7e	Make Status moveable Status is a class which is frequently returned by value from functions. Making it movable avoids 99% of the copies automatically on return by value.	2015-12-22 16:06:20 -08:00
Islam AbdelRahman	2bf9b968ca	Fix lite_build Summary: Fix compiling under ROCKSDB_LITE Test Plan: OPT="-DROCKSDB_LITE" make -j64 check make check -j64 Reviewers: rven, yhchiang, sdong Reviewed By: sdong Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D52239	2015-12-22 11:58:13 -08:00
Islam AbdelRahman	d005c66faf	Report compaction reason in CompactionListener Summary: Add CompactionReason to CompactionJobInfo This will allow users to understand why compaction started which will help options tuning Test Plan: added new tests make check -j64 Reviewers: yhchiang, anthony, kradhakrishnan, sdong, rven Reviewed By: rven Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D51975	2015-12-22 11:37:19 -08:00
Alex Yang	33e09c0e19	add call to install superversion and schedule work in enableautocompactions Summary: This patch fixes https://github.com/facebook/mysql-5.6/issues/121 There is a recent change in rocksdb to disable auto compactions on startup: https://reviews.facebook.net/D51147. However, there is a small timing window where a column family needs to be compacted and schedules a compaction, but the scheduled compaction fails when it checks the disable_auto_compactions setting. The expectation is once the application is ready, it will call EnableAutoCompactions() to allow new compactions to go through. However, if the column family is stalled because L0 is full, and no writes can go through, it is possible the column family may never have a new compaction request get scheduled. EnableAutoCompactions() should probably schedule a new flush and compaction event when it resets disable_auto_compactions. Using InstallSuperVersionAndScheduleWork, we call SchedulePendingFlush, SchedulePendingCompaction, as well as MaybeScheduleFlushOrCompaction on all the column families to avoid the situation above. This is still a first pass for feedback. Could also just call SchedulePendingFlush and SchedulePendingCompaction directly. Test Plan: Run on Asan build cd _build-5.6-ASan/ && ./mysql-test/mtr --mem --big --testcase-timeout=36000 --suite-timeout=12000 --parallel=16 --suite=rocksdb,rocksdb_rpl,rocksdb_sys_vars --mysqld=--default-storage-engine=rocksdb --mysqld=--skip-innodb --mysqld=--default-tmp-storage-engine=MyISAM --mysqld=--rocksdb rocksdb_rpl.rpl_rocksdb_stress_crash --repeat=1000 Ensure that it no longer hangs during the test. Reviewers: hermanlee4, yhchiang, anthony Reviewed By: anthony Subscribers: leveldb, yhchiang, dhruba Differential Revision: https://reviews.facebook.net/D51747	2015-12-21 10:06:49 -08:00
agiardullo	eff309867e	Do not use timed_mutex in TransactionDB Summary: Stopped using std::timed_mutex as it has known issues in older versions of gcc. Ran into these problems when testing MongoRocks. Test Plan: unit tests. Manual mongo testing on gcc 4.8. Reviewers: igor, yhchiang, rven, IslamAbdelRahman, kradhakrishnan, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D52197	2015-12-18 17:26:02 -08:00
sdong	d72b31774e	Slowdown when writing to the last write buffer Summary: Now if inserting to mem table is much faster than writing to files, there is no mechanism users can rely on to avoid stopping for reaching options.max_write_buffer_number. With the commit, if there are more than four maximum write buffers configured, we slow down to the rate of options.delayed_write_rate while we reach the last one. Test Plan: 1. Add a new unit test. 2. Run db_bench with ./db_bench --benchmarks=fillrandom --num=10000000 --max_background_flushes=6 --batch_size=32 -max_write_buffer_number=4 --delayed_write_rate=500000 --statistics based on hard drive and see stopping is avoided with the commit. Reviewers: yhchiang, IslamAbdelRahman, anthony, rven, kradhakrishnan, igor Reviewed By: igor Subscribers: MarkCallaghan, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D52047	2015-12-17 10:49:08 -08:00
Venkatesh Radhakrishnan	6b2a3ac92c	Add documentation for unschedFunction Summary: Documenting the unschedFunction parameter to Schedule as requested by Michael Kolupaev. Test Plan: build, unit test Reviewers: sdong, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: kolmike, dhruba Differential Revision: https://reviews.facebook.net/D52089	2015-12-17 10:41:39 -08:00
Islam AbdelRahman	32ff05e971	Bump version to 4.4 Summary: Bump version to 4.4 Test Plan: none Reviewers: sdong, rven, yhchiang, anthony, kradhakrishnan Reviewed By: kradhakrishnan Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D52035	2015-12-16 14:32:58 -08:00
Islam AbdelRahman	aececc209e	Introduce ReadOptions::pin_data (support zero copy for keys) Summary: This patch updates the Iterator API to introduce new functions that allow users to keep the Slices returned by key() valid as long as the Iterator is not deleted. ReadOptions::pin_data: If true, keep loaded blocks in memory as long as the iterator is not deleted. Iterator::IsKeyPinned(): If true, this means that the Slice returned by key() is valid as long as the iterator is not deleted. Also add a new option BlockBasedTableOptions::use_delta_encoding to allow users to disable delta_encoding if needed. Benchmark results (using https://phabricator.fb.com/P20083553) ``` // $ du -h /home/tec/local/normal.4K.Snappy/db10077 // 6.1G /home/tec/local/normal.4K.Snappy/db10077 // $ du -h /home/tec/local/zero.8K.LZ4/db10077 // 6.4G /home/tec/local/zero.8K.LZ4/db10077 // Benchmarks for shard db10077 // _build/opt/rocks/benchmark/rocks_copy_benchmark \ // --normal_db_path="/home/tec/local/normal.4K.Snappy/db10077" \ // --zero_db_path="/home/tec/local/zero.8K.LZ4/db10077" // First run // ============================================================================ // rocks/benchmark/RocksCopyBenchmark.cpp relative time/iter iters/s // ============================================================================ // BM_StringCopy 1.73s 576.97m // BM_StringPiece 103.74% 1.67s 598.55m // ============================================================================ // Match rate : 1000000 / 1000000 // Second run // ============================================================================ // rocks/benchmark/RocksCopyBenchmark.cpp relative time/iter iters/s // ============================================================================ // BM_StringCopy 611.99ms 1.63 // BM_StringPiece 203.76% 300.35ms 3.33 // ============================================================================ // Match rate : 1000000 / 1000000 ``` Test Plan: Unit tests Reviewers: sdong, igor, anthony, yhchiang, rven 
Reviewed By: rven Subscribers: dhruba, lovro, adsharma Differential Revision: https://reviews.facebook.net/D48999	2015-12-16 12:08:30 -08:00
Venkatesh Radhakrishnan	030215bf01	Running manual compactions in parallel with other automatic or manual compactions in restricted cases Summary: This diff provides a framework for doing manual compactions in parallel with other compactions. We now have a deque of manual compactions. We also pass manual compactions as an argument from RunManualCompactions down to BackgroundCompactions, so that RunManualCompactions can be reentrant. Parallelism is controlled by the two routines ConflictingManualCompaction to allow/disallow new parallel/manual compactions based on already existing ManualCompactions. In this diff, by default manual compactions still have to run exclusive of other compactions. However, by setting the compaction option, exclusive_manual_compaction to false, it is possible to run other compactions in parallel with a manual compaction. However, we are still restricted to one manual compaction per column family at a time. All of these restrictions will be relaxed in future diffs. I will be adding more tests later. Test Plan: Rocksdb regression + new tests + valgrind Reviewers: igor, anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, sdong Reviewed By: sdong Subscribers: yoshinorim, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D47973	2015-12-14 11:20:34 -08:00
SherlockNoMad	768a61486c	Fix appVeyor Build problem	2015-12-11 21:10:49 -08:00
agiardullo	84f98792d6	Transaction::SetWriteOptions() Summary: Add support to change write options after creating a transaction. This is needed for MongoRocks. Test Plan: added test Reviewers: sdong, rven, kradhakrishnan, IslamAbdelRahman, yhchiang Reviewed By: yhchiang Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D51867	2015-12-11 16:08:25 -08:00
agiardullo	3bfd3d39a3	Use SST files for Transaction conflict detection Summary: Currently, transactions can fail even if there is no actual write conflict. This is due to relying on only the memtables to check for write-conflicts. Users have to tune memtable settings to try to avoid this, but it's hard to figure out exactly how to tune these settings. With this diff, TransactionDB will use both memtables and SST files to determine if there are any write conflicts. This relies on the fact that BlockBasedTable stores sequence numbers for all writes that happen after any open snapshot. Also, D50295 is needed to prevent SingleDelete from disappearing writes (the TODOs in this test code will be fixed once the other diff is approved and merged). Note that Optimistic transactions will still rely on tuning memtable settings as we do not want to read from SST while on the write thread. Also, memtable settings can still be used to reduce how often TransactionDB needs to read SST files. Test Plan: unit tests, db bench Reviewers: rven, yhchiang, kradhakrishnan, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: dhruba, leveldb, yoshinorim Differential Revision: https://reviews.facebook.net/D50475	2015-12-11 12:34:11 -08:00
Igor Canadi	64fa43843b	Merge pull request #862 from ceph/wip-env implement EnvMirror	2015-12-10 18:45:07 -08:00
Sage Weil	2074ddd625	env: add EnvMirror This is an Env implementation that mirrors all storage-related methods on two different backend Env's and verifies that they return the same results (return status and read results). This is useful for implementing a new Env and verifying its correctness. Signed-off-by: Sage Weil <sage@redhat.com>	2015-12-10 21:32:45 -05:00
Yueh-Hsuan Chiang	a3ba5915c8	Correct a comment in include/rocksdb/cache.h Summary: Correct a comment in include/rocksdb/cache.h Test Plan: No code change. Reviewers: igor, sdong, IslamAbdelRahman, rven, kradhakrishnan, anthony Reviewed By: anthony Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D51831	2015-12-10 16:39:10 -08:00
charsyam	c30b499541	fix typos in comments	2015-12-11 01:54:48 +09:00
sdong	56e77f0967	Deprecate options.soft_rate_limit and add options.soft_pending_compaction_bytes_limit Summary: Deprecate options.soft_rate_limit, which is hard to tune, in favor of options.soft_pending_compaction_bytes_limit, which triggers the slowdown if estimated pending compaction bytes exceed the threshold. The hope is to make it more straightforward to tune. Test Plan: Modify DBTest.SoftLimit to cover options.soft_pending_compaction_bytes_limit instead; run all unit tests. Reviewers: IslamAbdelRahman, yhchiang, rven, kradhakrishnan, igor, anthony Reviewed By: anthony Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D51117	2015-12-09 18:22:45 -08:00
sdong	d6e1035a1f	A new compaction picking priority that optimizes for write amplification for random updates. Summary: Introduce a compaction picking priority that picks the files that contain the oldest rows to compact. This is a mode that slightly improves write amplification for random update cases. Test Plan: Add a unit test and run it in valgrind too. Reviewers: yhchiang, anthony, IslamAbdelRahman, rven, kradhakrishnan, MarkCallaghan, igor Reviewed By: igor Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D51459	2015-12-09 18:13:03 -08:00
agiardullo	e5c5f23814	Support marking snapshots for write-conflict checking - Take 2 Summary: D51183 was reverted due to breaking the LITE build. This diff is the same as D51183 but with a fix for the LITE BUILD(D51693) Test Plan: run all unit tests Reviewers: sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D51711	2015-12-08 16:47:31 -08:00
sdong	1d63c3d610	Revert "Support marking snapshots for write-conflict checking" This reverts commit `ec704aafdc` for it broke RocksDB LITE build.	2015-12-08 09:27:17 -08:00
agiardullo	ec704aafdc	Support marking snapshots for write-conflict checking Summary: D50475 enables using SST files for transaction write-conflict checking. In order for this to work, we need to make sure not to compact out SingleDeletes when there is an earlier transaction snapshot(D50295). If there is a long-held snapshot, this could reduce the benefit of the SingleDelete optimization. This diff allows Transactions to mark snapshots as being used for write-conflict checking. Then, during compaction, we will be able to optimize SingleDeletes better in the future. This diff adds a flag to SnapshotImpl which is used by Transactions. This diff also passes the earliest write-conflict snapshot's sequence number to CompactionIterator. This diff does not actually change Compaction (after this diff is pushed, D50295 will be able to use this information). Test Plan: no behavior change, ran existing tests Reviewers: rven, kradhakrishnan, yhchiang, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D51183	2015-12-07 19:40:51 -08:00
Jay Edgar	b28b7c6dd9	Added callback notification when a snapshot is created Summary: When SetSnapshot() is used the caller immediately knows a snapshot has been created, but when SetSnapshotOnNextOperation() is used the caller needs a way to get notified when that snapshot has been generated. This creates an interface that the client can implement that will be called at the time the snapshot is created. Test Plan: Added a new SetSnapshotOnNextOperationWithNotification test into the transaction_test. Reviewers: sdong, anthony Reviewed By: anthony Subscribers: yoshinorim, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D51177	2015-12-04 10:20:36 -08:00
Alex Yang	e8180f9901	added public api to schedule flush/compaction, code to prevent race with db::open Summary: Fixes T8781168. Added a new function EnableAutoCompactions in db.h to be publicly available. This allows compaction to be re-enabled after disabling it via SetOptions Refactored code to set the dbptr earlier on in TransactionDB::Open and DB::Open Temporarily disable auto_compaction in TransactionDB::Open until dbptr is set to prevent race condition. Test Plan: Ran make all check verified fix on myrocks side: was able to reproduce the seg fault with ../tools/mysqltest.sh --mem --force rocksdb.drop_table method was to manually sleep the thread after DB::Open but before TransactionDB ptr was assigned in transaction_db_impl.cc: DB::Open(db_options, dbname, column_families_copy, handles, &db); clock_t goal = (60000 * 10) + clock(); while (goal > clock()); ...dbptr(aka rdb) gets assigned below verified my changes fixed the issue. Also added unit test 'ToggleAutoCompaction' in transaction_test.cc Reviewers: hermanlee4, anthony Reviewed By: anthony Subscribers: alex, dhruba Differential Revision: https://reviews.facebook.net/D51147	2015-12-03 22:59:44 -08:00
Islam AbdelRahman	72930485b5	Fix clang build Summary: Fix clang Test Plan: make check Reviewers: sdong, yhchiang, rven Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D51417	2015-11-30 10:03:07 -08:00
sdong	6bbfa1874b	BackupDB to have a mode to use file size in file name Summary: Getting file size from all the backup files can take a long time. In some cases, the sizes are available in file names. We allow a mode to get those sizes from file name. Test Plan: Make some unit tests in backupable_db_test to run in such a mode. Make sure RocksDB Lite builds too. Reviewers: IslamAbdelRahman, rven, yhchiang, kradhakrishnan, anthony, igor Reviewed By: igor Subscribers: muthu, asameet, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D51243	2015-11-25 11:55:37 -08:00
Igor Canadi	f3ea00bc85	Merge pull request #856 from ceph/wip-env EnvWrapper: add ReuseWritableFile	2015-11-25 11:38:09 -08:00
Sage Weil	4cedd6b038	EnvWrapper: add ReuseWritableFile This was missed when ReuseWritableFile was added to Env in `1bcafb62f4`. Signed-off-by: Sage Weil <sage@redhat.com>	2015-11-25 14:30:05 -05:00
Venkatesh Radhakrishnan	81be49c755	Have a way for compaction filter to ignore snapshots Summary: Provide an API for compaction filter to specify that it needs to be applied even if there are snapshots. Test Plan: DBTestCompactionFilter.CompactionFilterIgnoreSnapshot Reviewers: yhchiang, IslamAbdelRahman, sdong, anthony Reviewed By: anthony Subscribers: yoshinorim, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D51087	2015-11-20 15:57:26 -08:00
SherlockNoMad	bd7be035e0	Support Memtable Factory Parse in option_helper.cc	2015-11-17 14:29:01 -08:00
sdong	9bc9c93bd4	Move to version 4.3 Summary: RocksDB 4.2 is already cut. Move to 4.3 Test Plan: Not needed Reviewers: IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D50799	2015-11-16 14:29:08 -08:00
Islam AbdelRahman	a163cc2d5a	Lint everything Summary: ``` arc2 lint --everything ``` run the linter on the whole code repo to fix existing lint issues Test Plan: make check -j64 Reviewers: sdong, rven, anthony, kradhakrishnan, yhchiang Reviewed By: yhchiang Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D50769	2015-11-16 12:56:21 -08:00
Yueh-Hsuan Chiang	d781da8164	Add CheckOptionsCompatibility() API to options_util Summary: Add CheckOptionsCompatibility() API to options_util that returns Status::OK if the input DBOptions and ColumnFamilyDescriptors are compatible with the latest options stored in the specified DB path. Test Plan: Added tests in options_util_test Reviewers: igor, anthony, IslamAbdelRahman, rven, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D50649	2015-11-12 16:52:51 -08:00
Yueh-Hsuan Chiang	e11f676e34	Add OptionsUtil::LoadOptionsFromFile() API Summary: This patch adds OptionsUtil::LoadOptionsFromFile() and OptionsUtil::LoadLatestOptionsFromDB(), which allow developers to construct DBOptions and ColumnFamilyOptions from a RocksDB options file. Note that most pointer-typed options such as merge_operator will not be constructed. With this API, developers no longer need to remember all the options in order to reopen an existing rocksdb instance like the following: DBOptions db_options; std::vector<std::string> cf_names; std::vector<ColumnFamilyOptions> cf_opts; // Load primitive-typed options from an existing DB OptionsUtil::LoadLatestOptionsFromDB( dbname, &db_options, &cf_names, &cf_opts); // Initialize necessary pointer-typed options cf_opts[0].merge_operator.reset(new MyMergeOperator()); ... // Construct the vector of ColumnFamilyDescriptor std::vector<ColumnFamilyDescriptor> cf_descs; for (size_t i = 0; i < cf_opts.size(); ++i) { cf_descs.emplace_back(cf_names[i], cf_opts[i]); } // Open the DB DB* db = nullptr; std::vector<ColumnFamilyHandle*> cf_handles; auto s = DB::Open(db_options, dbname, cf_descs, &cf_handles, &db); Test Plan: Augment existing tests in column_family_test options_test db_test Reviewers: igor, IslamAbdelRahman, sdong, anthony Reviewed By: anthony Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D49095	2015-11-12 06:52:43 -08:00
Yueh-Hsuan Chiang	e114f0abb8	Enable RocksDB to persist Options file. Summary: This patch allows rocksdb to persist options into a file on DB::Open, SetOptions, and Create / Drop ColumnFamily. Options files are created under the same directory as the rocksdb instance. In addition, this patch also adds a fail_if_missing_options_file in DBOptions that makes any function call return non-ok status when it is not able to persist options properly. // If true, then DB::Open / CreateColumnFamily / DropColumnFamily // / SetOptions will fail if options file is not detected or properly // persisted. // // DEFAULT: false bool fail_if_missing_options_file; Options file names are formatted as OPTIONS-<number>, and RocksDB will always keep the latest two options files. Test Plan: Add options_file_test. options_test column_family_test Reviewers: igor, IslamAbdelRahman, sdong, anthony Reviewed By: anthony Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D48285	2015-11-10 22:58:01 -08:00
Siying Dong	7ed2c3e45b	Merge pull request #823 from yuslepukhin/fix_off_t_type Make use of portable `uint64_t` type to make possible 64-bit file access	2015-11-10 18:57:17 -08:00
Dmitri Smirnov	5270b33bd3	Make use of portable `uint64_t` type to make possible file access in 64-bit. Currently, a signed off_t type is being used for the following interfaces for both the offset and the length in bytes: * `Allocate` * `RangeSync` On Linux `off_t` is automatically either 32 or 64-bit depending on the platform. On Windows it is always a 32-bit signed long, which limits file access, and in particular space pre-allocation, to effectively 2 GB. The proposal is to replace off_t with uint64_t as a portable type so that files are always accessed through 64-bit interfaces. May need to modify posix code but lack resources to test it.	2015-11-10 17:03:42 -08:00
Nathan Bronson	631863c63b	track WriteBatch contents Summary: Parallel writes will only be possible for certain combinations of flags and WriteBatch contents. Traversing the WriteBatch at write time to check these conditions would be expensive, but it is very cheap to keep track of when building WriteBatch-es. When loading WriteBatch-es during recovery, a deferred computation state is used so that the flags never need to be computed. Test Plan: 1. add asserts and EXPECT_EQ-s 2. make check Reviewers: sdong, igor Reviewed By: igor Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D50337	2015-11-10 16:56:06 -08:00
Yueh-Hsuan Chiang	f3ca28ab03	Correct the comment of GetApproximateMemoryUsageByType Summary: Correct the comment of GetApproximateMemoryUsageByType. Test Plan: No code change. Reviewers: igor, sdong, anthony, IslamAbdelRahman Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D50409	2015-11-08 09:02:35 -08:00
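
Note on commit 7699439b7c above: its message says that any block_restart_interval below 1 is silently set to 1, so the assert in BlockBuilder::BlockBuilder() is never reached. The following is a minimal C++ sketch of that clamping step only. The names TableOptionsSketch and SanitizeSketch are hypothetical and are not RocksDB types or functions, and the default value shown is an assumption for illustration. The block_size_deviation range check mentioned in the same commit is left out because the message does not state its bounds.

```cpp
#include <algorithm>

// Hypothetical stand-in for the one option discussed in the commit message;
// this is NOT the real RocksDB options struct.
struct TableOptionsSketch {
  int block_restart_interval = 16;  // assumed default, for illustration only
};

// Returns a copy of the options in which a block_restart_interval below 1
// has been raised to 1. Values of 1 or more are left unchanged.
inline TableOptionsSketch SanitizeSketch(TableOptionsSketch opts) {
  opts.block_restart_interval = std::max(opts.block_restart_interval, 1);
  return opts;
}
```

Clamping when the options are accepted, rather than asserting later in the block builder, means a misconfigured value degrades to the smallest valid setting instead of aborting the process.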


972 commits