Dhruba Borthakur 95dda37858 Move filesize-based-sorting to outside the Mutex
Summary:
When a new version is created, we sort all the files at every
level based on their size. This is necessary because we want
to compact the largest file first. The sorting takes quite a
bit of CPU.

Moved the sorting code to be outside the mutex. Also, the
earlier code was sorting files at all levels but we do not
need to sort the highest-numbered level because those files
are never the cause of any compaction. To reduce sorting
costs, we sort only the first few files in each level
because it is likely that those are the only files in that
level that will be picked for compaction.
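
The idea in the paragraph above can be sketched roughly as follows. This is
only an illustration, not the code from the patch: FileInfo and
LargestFilesPerLevel are hypothetical names, and the real change operates on
the version's own file metadata. The sketch uses std::partial_sort so that
only the first top_n positions of each level are ordered by size, and it
skips the highest-numbered level entirely. The function takes no lock, so a
caller could run it before acquiring the mutex and install the result
afterwards.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for per-file metadata (not the real leveldb type).
struct FileInfo {
  uint64_t number;     // file number
  uint64_t file_size;  // size in bytes
};

// For every level except the highest-numbered one, returns the indices of
// the (at most) top_n largest files in that level, largest first.
// The entry for the highest-numbered level is left empty.
std::vector<std::vector<size_t>> LargestFilesPerLevel(
    const std::vector<std::vector<FileInfo>>& levels, size_t top_n) {
  std::vector<std::vector<size_t>> result(levels.size());
  for (size_t level = 0; level + 1 < levels.size(); ++level) {
    const std::vector<FileInfo>& files = levels[level];
    std::vector<size_t>& order = result[level];

    // Start with the identity ordering 0, 1, 2, ...
    order.resize(files.size());
    for (size_t i = 0; i < order.size(); ++i) order[i] = i;

    // Order only the first k positions by descending file size; the rest
    // of the range is left in unspecified order and then dropped.
    const size_t k = std::min(top_n, order.size());
    std::partial_sort(order.begin(), order.begin() + k, order.end(),
                      [&files](size_t a, size_t b) {
                        return files[a].file_size > files[b].file_size;
                      });
    order.resize(k);
  }
  return result;
}
```

With n files in a level, a partial sort of the first k entries costs about
O(n log k) comparisons instead of O(n log n) for a full sort, which is where
the saving would come from when k is small.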

At steady state, I have seen this patch increase throughput
from 1500 writes/sec to 1700 writes/sec at the end of a
72-hour run. The CPU saving from not sorting the last level
was not noticeable in this test run because there were only
100K files in the highest-numbered level. I expect the CPU
saving to be significant when the number of files is much
higher.

This is mostly an early preview and not ready for rigorous review.

With this patch, writes/sec is now bottlenecked not by the sorting code but by GetOverlappingInputs. I am working on a patch to optimize GetOverlappingInputs.

Test Plan: make check

Reviewers: MarkCallaghan, heyongqiang

Reviewed By: heyongqiang

Differential Revision: https://reviews.facebook.net/D6411
2012-11-07 15:39:44 -08:00
db Move filesize-based-sorting to outside the Mutex 2012-11-07 15:39:44 -08:00
doc merge 1.5 2012-08-28 11:43:33 -07:00
hdfs This is the mega-patch multi-threaded compaction 2012-10-19 14:00:53 -07:00
helpers/memenv Fix all warnings generated by -Wall option to the compiler. 2012-11-06 14:07:31 -08:00
include/leveldb Merge branch 'master' into performance 2012-11-05 16:51:55 -08:00
java Add LevelDb's JNI wrapper 2012-10-05 13:13:49 -07:00
port Make compression options configurable. These include window-bits, level and strategy for ZlibCompression 2012-11-02 11:26:39 -07:00
scribe fix db_test error with scribe logger turned on 2012-08-28 11:22:58 -07:00
snappy Build with gcc-4.7.1-glibc-2.14.1. 2012-09-17 10:56:26 -07:00
table Fix all warnings generated by -Wall option to the compiler. 2012-11-06 14:07:31 -08:00
thrift Implement RowLocks for assoc schema 2012-10-03 23:19:01 -07:00
tools Merge branch 'master' into performance 2012-11-07 15:11:37 -08:00
util Merge branch 'master' into performance 2012-11-07 15:11:37 -08:00
.arcconfig Support arcdiff. 2012-05-09 23:35:05 -07:00
.gitignore Added bloom filter support. 2012-04-17 08:36:46 -07:00
AUTHORS reverting disastrous MOE commit, returning to r21 2011-04-19 23:11:15 +00:00
build_detect_platform Keep build_detect_platform portable 2012-10-26 14:20:04 -07:00
build_detect_version Record the version of the source repository that was used to build the leveldb library. 2012-08-24 15:18:43 -07:00
fbcode.gcc471.sh Enable SSE when building with fbcode support. 2012-10-18 08:43:25 -07:00
LICENSE reverting disastrous MOE commit, returning to r21 2011-04-19 23:11:15 +00:00
Makefile Fix all warnings generated by -Wall option to the compiler. 2012-11-06 14:07:31 -08:00
NEWS sync with upstream @ 21409451 2011-05-21 02:17:43 +00:00
README @20776309 2011-04-20 22:48:11 +00:00
README.fb Enable SSE when building with fbcode support. 2012-10-18 08:43:25 -07:00
TODO A number of smaller fixes and performance improvements: 2011-06-22 02:36:45 +00:00

leveldb: A key-value store
Authors: Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)

The code under this directory implements a system for maintaining a
persistent key/value store.

See doc/index.html for more explanation.
See doc/impl.html for a brief overview of the implementation.

The public interface is in include/*.h.  Callers should not include or
rely on the details of any other header files in this package.  Those
internal APIs may be changed without warning.

Guide to header files:

include/db.h
    Main interface to the DB: Start here

include/options.h
    Control over the behavior of an entire database, and also
    control over the behavior of individual reads and writes.

include/comparator.h
    Abstraction for user-specified comparison function.  If you want
    just bytewise comparison of keys, you can use the default comparator,
    but clients can write their own comparator implementations if they
    want custom ordering (e.g. to handle different character
    encodings, etc.)

include/iterator.h
    Interface for iterating over data. You can get an iterator
    from a DB object.

include/write_batch.h
    Interface for atomically applying multiple updates to a database.

include/slice.h
    A simple module for maintaining a pointer and a length into some
    other byte array.

include/status.h
    Status is returned from many of the public interfaces and is used
    to report success and various kinds of errors.

include/env.h
    Abstraction of the OS environment.  A POSIX implementation of
    this interface is in util/env_posix.cc.

include/table.h
include/table_builder.h
    Lower-level modules that most clients probably won't use directly
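
To make the slice and bytewise-comparison ideas from the guide above more
concrete, here is a minimal stand-alone sketch. It is not the class declared
in include/slice.h: ByteView and its members are hypothetical names chosen
for illustration. It shows a non-owning view (a pointer plus a length into
bytes owned by someone else) and a three-way bytewise comparison of the kind
a default comparator might perform.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical illustration of a pointer-plus-length view. The view does
// not own the bytes, so the underlying storage must outlive the view.
class ByteView {
 public:
  ByteView(const char* data, size_t size) : data_(data), size_(size) {}
  explicit ByteView(const std::string& s)
      : data_(s.data()), size_(s.size()) {}

  const char* data() const { return data_; }
  size_t size() const { return size_; }

  // Copies the referenced bytes into an owning string.
  std::string ToString() const { return std::string(data_, size_); }

  // Three-way bytewise comparison: negative if *this < other, zero if
  // equal, positive if *this > other. A shorter view that is a prefix of
  // a longer one sorts first.
  int compare(const ByteView& other) const {
    const size_t min_len = size_ < other.size_ ? size_ : other.size_;
    int r = std::memcmp(data_, other.data_, min_len);
    if (r == 0) {
      if (size_ < other.size_) r = -1;
      else if (size_ > other.size_) r = +1;
    }
    return r;
  }

 private:
  const char* data_;
  size_t size_;
};
```

Because the view only borrows its bytes, a caller would need to keep the
source string or buffer alive for as long as the view is in use.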