rocksdb/include/leveldb
Haobo Xu 778e179046 [RocksDB] Sync file to disk incrementally
Summary:
During compaction, we sync the output files only after they are fully written out. This blocks the compaction thread unnecessarily and makes the write traffic bursty.
This diff simply asks the OS to sync data incrementally, in the background, as it is written. The hope is that by the final sync most of the data is already on disk, so we block less on the sync call. Each compaction then runs faster, and we can use fewer compaction threads to saturate IO.
In addition, the write traffic will be smoothed out, hopefully reducing the IO P99 latency too.
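
As an illustration of the idea in the summary, here is a minimal sketch of a writer that hints the OS to start writeback every fixed number of bytes and only blocks on the final sync. It is not the code in this diff: the class name IncrementalSyncWriter, its methods, and the interval parameter are hypothetical, and the use of the Linux sync_file_range() call with SYNC_FILE_RANGE_WRITE is an assumption about how such a hint could be issued (on other platforms the sketch simply skips the hint).

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for sync_file_range() on Linux/glibc
#endif
#include <fcntl.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper for illustration only; not part of RocksDB's Env API.
class IncrementalSyncWriter {
 public:
  IncrementalSyncWriter(const std::string& path, size_t sync_interval_bytes)
      : fd_(open(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644)),
        interval_(sync_interval_bytes) {
    assert(interval_ > 0);
  }
  IncrementalSyncWriter(const IncrementalSyncWriter&) = delete;
  IncrementalSyncWriter& operator=(const IncrementalSyncWriter&) = delete;
  ~IncrementalSyncWriter() {
    if (fd_ >= 0) close(fd_);
  }

  bool ok() const { return fd_ >= 0; }
  uint64_t bytes_written() const { return written_; }
  int range_sync_calls() const { return range_syncs_; }

  // Appends n bytes, then asks the OS to start writing back the range that
  // has not been hinted yet once at least interval_ bytes have accumulated.
  bool Append(const char* data, size_t n) {
    while (n > 0) {
      ssize_t w = write(fd_, data, n);
      if (w < 0) {
        if (errno == EINTR) continue;  // interrupted, retry
        return false;
      }
      data += w;
      n -= static_cast<size_t>(w);
      written_ += static_cast<uint64_t>(w);
    }
    if (written_ - hinted_ >= interval_) RangeSync();
    return true;
  }

  // Final blocking sync. If the earlier hints were effective, most pages
  // should already be on disk by the time this runs.
  bool Close() {
    if (fd_ < 0) return false;
    bool synced = (fsync(fd_) == 0);
    bool closed = (close(fd_) == 0);
    fd_ = -1;
    return synced && closed;
  }

 private:
  void RangeSync() {
#ifdef __linux__
    // Starts asynchronous writeback of the range without waiting for it.
    // Only a hint: a failure here is ignored, the final fsync() still runs.
    int rc = sync_file_range(fd_, static_cast<off64_t>(hinted_),
                             static_cast<off64_t>(written_ - hinted_),
                             SYNC_FILE_RANGE_WRITE);
    (void)rc;
#endif
    hinted_ = written_;
    ++range_syncs_;
  }

  int fd_;
  size_t interval_;
  uint64_t written_ = 0;  // total bytes appended so far
  uint64_t hinted_ = 0;   // bytes already covered by a writeback hint
  int range_syncs_ = 0;   // number of hints issued
};
```

A caller would construct the writer with an output path and an interval (for example 1MB), call Append() for each block, and call Close() at the end. The interval is the tuning knob: a smaller value smooths the write traffic more but issues more system calls.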

Some quick tests show a 10-20% improvement in per-thread compaction throughput. Combined with posix advice on compaction read, just 5 threads are enough to almost saturate the udb flash bandwidth for the 800-byte write-only benchmark.
What's more promising is that, with saturated IO, iostat shows the average wait time is actually smoother and much smaller.
For the 800-byte write-only test:
Before the change: await oscillates between 3ms and 10ms
After the change: await ranges from 1ms to 3ms

Will also test against a read-modify-write workload to see whether the high P99 read latency can be resolved.

Will introduce a parameter to control the sync interval in a follow-up diff, after cleaning up EnvOptions.

Test Plan: make check; db_bench; db_stress

Reviewers: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D11115
2013-06-12 12:53:59 -07:00
c.h Fix poor error on num_levels mismatch and few other minor improvements 2013-01-25 15:37:26 -08:00
cache.h Codemod NULL to nullptr 2013-02-28 18:04:58 -08:00
compaction_filter.h [RocksDB] Cleanup compaction filter to use a class interface, instead of function pointer and additional context pointer. 2013-05-13 14:06:10 -07:00
comparator.h A number of fixes: 2011-10-31 17:22:06 +00:00
db.h Very basic Multiget and simple test cases. 2013-06-05 11:22:38 -07:00
env.h [RocksDB] Sync file to disk incrementally 2013-06-12 12:53:59 -07:00
filter_policy.h Added bloom filter support. 2012-04-17 08:36:46 -07:00
iterator.h A number of fixes: 2011-10-31 17:22:06 +00:00
ldb_tool.h [RocksDB] Expose LDB functionality as a library call - clients can build their own LDB binary with additional options 2013-04-11 20:21:49 -07:00
merge_operator.h [Rocksdb] Support Merge operation in rocksdb 2013-05-03 16:59:02 -07:00
options.h [RocksDB] cleanup EnvOptions 2013-06-12 11:17:19 -07:00
slice.h manifest_dump: Add --hex=1 option 2012-12-16 08:58:28 -08:00
statistics.h [Rocksdb] fix wrong assert 2013-06-10 13:14:14 -07:00
status.h [Rocksdb] Support Merge operation in rocksdb 2013-05-03 16:59:02 -07:00
table_builder.h Fix all the lint errors. 2012-11-28 17:18:41 -08:00
transaction_log_iterator.h Do not allow Transaction Log Iterator to fall ahead when writer is writing the same file 2013-03-06 14:05:53 -08:00
types.h GetUpdatesSince API to enable replication. 2012-12-07 11:42:13 -08:00
write_batch.h [Rocksdb] Support Merge operation in rocksdb 2013-05-03 16:59:02 -07:00