Java API consistency between RocksDB.put(), .merge() and Transaction.put(), .merge() (#11019)
Summary:
### Implement new Java API get()/put()/merge() methods, and transactional variants.
The Java API methods are very inconsistent in terms of how they pass parameters (byte[], ByteBuffer), and what variants and defaulted parameters they support. We try to bring some consistency to this.
* All APIs should support calls with ByteBuffer parameters.
* Similar methods (RocksDB.get() vs Transaction.get()) should support as similar as possible sets of parameters for predictability.
* get()-like methods should provide variants where the caller supplies the target buffer, for the sake of efficiency. Allocation costs in Java can be significant when large buffers are repeatedly allocated and freed.
### API Additions
1. RocksDB.get implement indirect ByteBuffers. Added indirect ByteBuffers and supporting native methods for get().
2. RocksDB.Iterator implement missing (byte[], offset, length) variants for key() and value() parameters.
3. Transaction.get() implement missing methods, based on RocksDB.get. Added ByteBuffer.get with and without column family. Added byte[]-as-target get.
4. Transaction.iterator() implement a getIterator() which defaults ReadOptions; as per RocksDB.iterator(). Rationalize support API for this and RocksDB.iterator()
5. RocksDB.merge implement ByteBuffer methods; both direct and indirect buffers. These mirror the methods of RocksDB.put: RocksDB.put only offers a ByteBuffer API with explicit WriteOptions, and RocksDB.merge now does the same.
6. Transaction.merge implement methods as per RocksDB.merge methods. Transaction is already constructed with WriteOptions, so no explicit WriteOptions methods required.
7. Transaction.mergeUntracked implement the same API methods as Transaction.merge except the ones that use assumeTracked, because that’s not a feature of merge untracked.
### Support Changes (C++)
The current JNI code in C++ supports multiple variants of methods through a number of helper functions. There are numerous TODO suggestions in the code proposing that the helpers be re-factored/shared.
We have taken a different approach for the new methods: we have created the RAII wrapper classes `JDirectBufferSlice`, `JDirectBufferPinnableSlice`, `JByteArraySlice` and `JByteArrayPinnableSlice`, which construct slices from JNI parameters and can then be passed directly to RocksDB methods. For instance, the `Java_org_rocksdb_Transaction_getDirect` method is implemented like this:
```
try {
  ROCKSDB_NAMESPACE::JDirectBufferSlice key(env, jkey_bb, jkey_off,
                                            jkey_part_len);
  ROCKSDB_NAMESPACE::JDirectBufferPinnableSlice value(env, jval_bb, jval_off,
                                                      jval_part_len);
  ROCKSDB_NAMESPACE::KVException::ThrowOnError(
      env, txn->Get(*read_options, column_family_handle, key.slice(),
                    &value.pinnable_slice()));
  return value.Fetch();
} catch (const ROCKSDB_NAMESPACE::KVException& e) {
  return e.Code();
}
```
Notice the try/catch mechanism with the `KVException` class, which combined with RAII and the wrapper classes means that there is no ad-hoc cleanup necessary in the JNI methods.
We propose to extend this mechanism to existing JNI methods as further work.
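The control flow described above can be illustrated without JNI. The following is a minimal sketch, not the RocksDB implementation: `SketchStatus`, `SketchKVException`, `OwnedKeyCopy`, `MockGet` and `GetInto` are hypothetical names invented for this illustration, and a `std::map` stands in for the database. It assumes the same conventions as the snippet above: a negative return code signals failure, and a successful get returns the full value length even when the caller's buffer is too small to hold it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <exception>
#include <map>
#include <string>

// Hypothetical stand-in for a RocksDB status; not the real Status class.
enum class SketchStatus { kOk, kNotFound, kIOError };

// Simplified analogue of KVException: carries the negative code that is
// handed back to the caller in place of a value length.
class SketchKVException : public std::exception {
 public:
  static constexpr int kNotFound = -1;
  static constexpr int kStatusError = -2;

  explicit SketchKVException(int code) : code_(code) {}

  // Return normally on success, otherwise throw with the matching code.
  static void ThrowOnError(SketchStatus status) {
    if (status == SketchStatus::kOk) return;
    if (status == SketchStatus::kNotFound) throw SketchKVException(kNotFound);
    throw SketchKVException(kStatusError);
  }

  int Code() const { return code_; }

 private:
  int code_;
};

// RAII wrapper in the spirit of JByteArraySlice: copies the key on
// construction and frees the copy on destruction, including during unwinding.
// live_count only exists so the sketch can show that cleanup happened.
class OwnedKeyCopy {
 public:
  OwnedKeyCopy(const char* src, int len, int* live_count)
      : data_(new char[len]), len_(len), live_count_(live_count) {
    std::memcpy(data_, src, static_cast<std::size_t>(len));
    ++*live_count_;
  }
  ~OwnedKeyCopy() {
    delete[] data_;
    --*live_count_;
  }
  OwnedKeyCopy(const OwnedKeyCopy&) = delete;
  OwnedKeyCopy& operator=(const OwnedKeyCopy&) = delete;

  std::string str() const {
    return std::string(data_, static_cast<std::size_t>(len_));
  }

 private:
  char* data_;
  int len_;
  int* live_count_;
};

// Hypothetical lookup standing in for txn->Get(); the key "bad" simulates a
// failure other than not-found.
SketchStatus MockGet(const std::map<std::string, std::string>& store,
                     const std::string& key, std::string* value) {
  if (key == "bad") return SketchStatus::kIOError;
  auto it = store.find(key);
  if (it == store.end()) return SketchStatus::kNotFound;
  *value = it->second;
  return SketchStatus::kOk;
}

// Shape of a JNI-style get: there is no cleanup code on any path.
int GetInto(const std::map<std::string, std::string>& store,
            const std::string& key, char* target, int target_len,
            int* live_count) {
  assert(target_len >= 0);
  try {
    OwnedKeyCopy key_copy(key.data(), static_cast<int>(key.size()),
                          live_count);
    std::string value;
    SketchKVException::ThrowOnError(MockGet(store, key_copy.str(), &value));
    const int value_len = static_cast<int>(value.size());
    const int copy_len = std::min(target_len, value_len);
    std::memcpy(target, value.data(), static_cast<std::size_t>(copy_len));
    return value_len;  // full length, even if only copy_len bytes were copied
  } catch (const SketchKVException& e) {
    return e.Code();  // key_copy has already been destroyed by unwinding
  }
}
```

A caller can compare the returned length with its buffer size to detect truncation, and can treat any negative value as one of the two error codes.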
### Support Changes (Java)
Where there are multiple parameter-variant versions of the same method, we use fewer or just one supporting native method for all of them. This makes maintenance a bit easier and reduces the opportunity for coding errors mixing up (untyped) object handles.
In order to support this efficiently, some classes need to have default values for column families and read options added and cached so that they are not re-constructed on every method call.
This PR closes https://github.com/facebook/rocksdb/issues/9776
Pull Request resolved: https://github.com/facebook/rocksdb/pull/11019
Reviewed By: ajkr
Differential Revision: D52039446
Pulled By: jowlyzhang
fbshipit-source-id: 45d0140a4887e42134d2e56520e9b8efbd349660
2023-12-11 19:03:17 +00:00
// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
//
// This file defines helper classes for Java API key/value (get, put, merge)
// methods
//

#pragma once

#include <jni.h>

#include <algorithm>  // std::min
#include <cstring>
#include <exception>
#include <functional>
#include <string>

#include "rocksdb/rocksdb_namespace.h"
#include "rocksdb/slice.h"
#include "rocksdb/status.h"
#include "rocksjni/portal.h"

namespace ROCKSDB_NAMESPACE {

/**
 * @brief Exception class used to make the flow of key/value (Put(), Get(),
 * Merge(), ...) calls clearer.
 *
 * This class is used by Java API JNI methods in try { save/fetch } catch { ...
 * } style.
 *
 */
class KVException : public std::exception {
 public:
  // These values are expected on Java API calls to represent the result of a
  // Get() which has failed; a negative length is returned to indicate an error.
  static const int kNotFound = -1;  // the key was not found in RocksDB
  static const int kStatusError =
      -2;  // there was some other error fetching the value for the key

  /**
   * @brief Throw a KVException (and potentially a Java exception) if the
   * RocksDB status is "bad"
   *
   * @param env JNI environment needed to create a Java exception
   * @param status RocksDB status to examine
   */
  static void ThrowOnError(JNIEnv* env, const Status& status) {
    if (status.ok()) {
      return;
    }
    if (status.IsNotFound()) {
      // IsNotFound does not generate a Java exception; any other bad status
      // does.
      throw KVException(kNotFound);
    }
    ROCKSDB_NAMESPACE::RocksDBExceptionJni::ThrowNew(env, status);
    throw KVException(kStatusError);
  }

  /**
   * @brief Throw a KVException and a Java exception
   *
   * @param env JNI environment needed to create a Java exception
   * @param message content of the exception we will throw
   */
  static void ThrowNew(JNIEnv* env, const std::string& message) {
    ROCKSDB_NAMESPACE::RocksDBExceptionJni::ThrowNew(env, message);
    throw KVException(kStatusError);
  }

  /**
   * @brief Throw a KVException if there is already a Java exception in the JNI
   * environment
   *
   * @param env JNI environment to check for a pending Java exception
   */
  static void ThrowOnError(JNIEnv* env) {
    if (env->ExceptionCheck()) {
      throw KVException(kStatusError);
    }
  }

  KVException(jint code) : kCode_(code){};

  virtual const char* what() const throw() {
    return "Exception raised by JNI. There may be a Java exception in the "
           "JNIEnv. Please check!";
  }

  jint Code() const { return kCode_; }

 private:
  jint kCode_;
};

/**
 * @brief Construct a slice with the contents of a Java byte array
 *
 * The slice refers to an array into which the Java byte array's whole region is
 * copied
 */
class JByteArraySlice {
 public:
  JByteArraySlice(JNIEnv* env, const jbyteArray& jarr, const jint jarr_off,
                  const jint jarr_len)
      : arr_(new jbyte[jarr_len]),
        slice_(reinterpret_cast<char*>(arr_), jarr_len) {
    env->GetByteArrayRegion(jarr, jarr_off, jarr_len, arr_);
    KVException::ThrowOnError(env);
  };

  ~JByteArraySlice() {
    slice_.clear();
    delete[] arr_;
  };

  Slice& slice() { return slice_; }

 private:
  jbyte* arr_;
  Slice slice_;
};

/**
 * @brief Construct a slice with the contents of a direct Java ByteBuffer
 *
 * The slice refers directly to the contents of the buffer, no copy is made.
 *
 */
class JDirectBufferSlice {
 public:
  JDirectBufferSlice(JNIEnv* env, const jobject& jbuffer,
                     const jint jbuffer_off, const jint jbuffer_len)
      : slice_(static_cast<char*>(env->GetDirectBufferAddress(jbuffer)) +
                   jbuffer_off,
               jbuffer_len) {
    KVException::ThrowOnError(env);
    jlong capacity = env->GetDirectBufferCapacity(jbuffer);
    if (capacity < jbuffer_off + jbuffer_len) {
      auto message = "Direct buffer offset " + std::to_string(jbuffer_off) +
                     " + length " + std::to_string(jbuffer_len) +
                     " exceeds capacity " + std::to_string(capacity);
      KVException::ThrowNew(env, message);
      slice_.clear();
    }
  }

  ~JDirectBufferSlice() { slice_.clear(); };

  Slice& slice() { return slice_; }

 private:
  Slice slice_;
};

/**
 * @brief Wrap a pinnable slice with a method to retrieve the contents back into
 * Java
 *
 * The Java Byte Array version sets the byte array's region from the slice
 */
class JByteArrayPinnableSlice {
 public:
  /**
   * @brief Construct a new JByteArrayPinnableSlice object referring to an
   * existing Java byte array
   *
   * @param env JNI environment used to write back into the array
   * @param jbuffer target Java byte array
   * @param jbuffer_off offset in the array at which the value is written
   * @param jbuffer_len maximum number of bytes to write
   */
  JByteArrayPinnableSlice(JNIEnv* env, const jbyteArray& jbuffer,
                          const jint jbuffer_off, const jint jbuffer_len)
      : env_(env),
        jbuffer_(jbuffer),
        jbuffer_off_(jbuffer_off),
        jbuffer_len_(jbuffer_len){};

  /**
   * @brief Construct an empty new JByteArrayPinnableSlice object
   *
   */
  JByteArrayPinnableSlice(JNIEnv* env) : env_(env){};

  PinnableSlice& pinnable_slice() { return pinnable_slice_; }

  ~JByteArrayPinnableSlice() { pinnable_slice_.Reset(); };

  /**
   * @brief copy back contents of the pinnable slice into the Java byte array
   *
   * At most jbuffer_len bytes are copied; the return value is not truncated.
   *
   * @return jint the full number of bytes in the value for the requested key,
   * which may exceed the number of bytes copied into the array
   */
  jint Fetch() {
    const jint pinnable_len = static_cast<jint>(pinnable_slice_.size());
    const jint result_len = std::min(jbuffer_len_, pinnable_len);
    env_->SetByteArrayRegion(
        jbuffer_, jbuffer_off_, result_len,
        reinterpret_cast<const jbyte*>(pinnable_slice_.data()));
    KVException::ThrowOnError(
        env_);  // exception thrown: ArrayIndexOutOfBoundsException

    return pinnable_len;
  };

  /**
   * @brief create a new Java buffer and copy the result into it
   *
   * @return jbyteArray the java buffer holding the result
   */
  jbyteArray NewByteArray() {
    const jint pinnable_len = static_cast<jint>(pinnable_slice_.size());
JNI get_helper code sharing / multiGet() use efficient batch C++ support (#12344)
Summary:
Implement RAII-based helpers for JNI `Get()` and `MultiGet()`.
Replace the JNI C++ helpers `rocksdb_get_helper`, `rocksdb_get_helper_direct`, `multi_get_helper`, `multi_get_helper_direct`, `multi_get_helper_release_keys`, `txn_get_helper`, and `txn_multi_get_helper`.
The model is to do away with a single helper entirely. Instead, a number of utility methods allow each separate
JNI `Get()` and `MultiGet()` method to organise its parameters efficiently, then call the underlying C++ `db->Get()`,
`db->MultiGet()`, `txn->Get()`, or `txn->MultiGet()` method itself, and use further utilities to retrieve results.
Roughly speaking:
* get keys into C++ form
* Call C++ Get()
* get results and status into Java form
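The three steps above can be sketched in plain C++ with no JNI dependency. This is an illustration under stated assumptions rather than the code in this change: `JavaBytes`, `SketchResult`, `JavaStyleResult`, `KeysToCpp`, `MockMultiGet`, `ResultsToJava` and `MultiGetSketch` are hypothetical names, a `std::vector<std::int8_t>` stands in for a Java `byte[]`, and a `std::map` stands in for the database.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a Java byte[].
using JavaBytes = std::vector<std::int8_t>;

// Hypothetical per-key outcome of the batched C++ call.
struct SketchResult {
  bool found;
  std::string value;
};

// Hypothetical Java-side view of one result; is_null models a null entry.
struct JavaStyleResult {
  bool is_null;
  JavaBytes bytes;
};

// Step 1: get keys into C++ form.
std::vector<std::string> KeysToCpp(const std::vector<JavaBytes>& jkeys) {
  std::vector<std::string> keys;
  keys.reserve(jkeys.size());
  for (const JavaBytes& jkey : jkeys) {
    keys.emplace_back(jkey.begin(), jkey.end());
  }
  return keys;
}

// Step 2: one batched call for all keys (a map lookup stands in for the
// database here).
std::vector<SketchResult> MockMultiGet(
    const std::map<std::string, std::string>& store,
    const std::vector<std::string>& keys) {
  std::vector<SketchResult> results;
  results.reserve(keys.size());
  for (const std::string& key : keys) {
    auto it = store.find(key);
    if (it == store.end()) {
      results.push_back({false, ""});
    } else {
      results.push_back({true, it->second});
    }
  }
  return results;
}

// Step 3: get results and status into Java form.
std::vector<JavaStyleResult> ResultsToJava(
    const std::vector<SketchResult>& results) {
  std::vector<JavaStyleResult> out;
  out.reserve(results.size());
  for (const SketchResult& r : results) {
    out.push_back({!r.found, JavaBytes(r.value.begin(), r.value.end())});
  }
  return out;
}

// Each entry point composes the three utilities itself instead of calling
// one monolithic helper.
std::vector<JavaStyleResult> MultiGetSketch(
    const std::map<std::string, std::string>& store,
    const std::vector<JavaBytes>& jkeys) {
  const std::vector<std::string> keys = KeysToCpp(jkeys);
  const std::vector<SketchResult> results = MockMultiGet(store, keys);
  assert(results.size() == keys.size());
  return ResultsToJava(results);
}
```

Keeping the conversion steps as separate utilities means a transactional variant could reuse steps 1 and 3 unchanged and swap only the middle call.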
We achieve a useful performance gain as part of this work: by using the updated C++ MultiGet() we immediately pick up its batch improvements, which were implemented previously but had not been used by Java/JNI until now. multiGetBB already uses the batched C++ MultiGet(), and all other benchmarks show consistent improvement after the changes:
## Before:
```
Benchmark (columnFamilyTestType) (keyCount) (keySize) (multiGetSize) (valueSize) Mode Cnt Score Error Units
MultiGetNewBenchmarks.multiGetBB200 no_column_family 10000 1024 100 256 thrpt 25 5315.459 ± 20.465 ops/s
MultiGetNewBenchmarks.multiGetBB200 no_column_family 10000 1024 100 1024 thrpt 25 5673.115 ± 78.299 ops/s
MultiGetNewBenchmarks.multiGetBB200 no_column_family 10000 1024 100 4096 thrpt 25 2616.860 ± 46.994 ops/s
MultiGetNewBenchmarks.multiGetBB200 no_column_family 10000 1024 100 16384 thrpt 25 1700.058 ± 24.034 ops/s
MultiGetNewBenchmarks.multiGetBB200 no_column_family 10000 1024 100 65536 thrpt 25 791.171 ± 13.955 ops/s
MultiGetNewBenchmarks.multiGetList10 no_column_family 10000 1024 100 256 thrpt 25 6129.929 ± 94.200 ops/s
MultiGetNewBenchmarks.multiGetList10 no_column_family 10000 1024 100 1024 thrpt 25 7012.405 ± 97.886 ops/s
MultiGetNewBenchmarks.multiGetList10 no_column_family 10000 1024 100 4096 thrpt 25 2799.014 ± 39.352 ops/s
MultiGetNewBenchmarks.multiGetList10 no_column_family 10000 1024 100 16384 thrpt 25 1417.205 ± 22.272 ops/s
MultiGetNewBenchmarks.multiGetList10 no_column_family 10000 1024 100 65536 thrpt 25 655.594 ± 13.050 ops/s
MultiGetNewBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 256 thrpt 25 6147.247 ± 82.711 ops/s
MultiGetNewBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 1024 thrpt 25 7004.213 ± 79.251 ops/s
MultiGetNewBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 4096 thrpt 25 2715.154 ± 110.017 ops/s
MultiGetNewBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 16384 thrpt 25 1408.070 ± 31.714 ops/s
MultiGetNewBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 65536 thrpt 25 623.829 ± 57.374 ops/s
MultiGetNewBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 256 thrpt 25 6119.243 ± 116.313 ops/s
MultiGetNewBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 1024 thrpt 25 6931.873 ± 128.094 ops/s
MultiGetNewBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 4096 thrpt 25 2678.253 ± 39.113 ops/s
MultiGetNewBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 16384 thrpt 25 1337.384 ± 19.500 ops/s
MultiGetNewBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 65536 thrpt 25 625.596 ± 14.525 ops/s
```
## After:
```
Benchmark (columnFamilyTestType) (keyCount) (keySize) (multiGetSize) (valueSize) Mode Cnt Score Error Units
MultiGetBenchmarks.multiGetBB200 no_column_family 10000 1024 100 256 thrpt 25 5191.074 ± 78.250 ops/s
MultiGetBenchmarks.multiGetBB200 no_column_family 10000 1024 100 1024 thrpt 25 5378.692 ± 260.682 ops/s
MultiGetBenchmarks.multiGetBB200 no_column_family 10000 1024 100 4096 thrpt 25 2590.183 ± 34.844 ops/s
MultiGetBenchmarks.multiGetBB200 no_column_family 10000 1024 100 16384 thrpt 25 1634.793 ± 34.022 ops/s
MultiGetBenchmarks.multiGetBB200 no_column_family 10000 1024 100 65536 thrpt 25 786.455 ± 8.462 ops/s
MultiGetBenchmarks.multiGetBB200 1_column_family 10000 1024 100 256 thrpt 25 5285.055 ± 11.676 ops/s
MultiGetBenchmarks.multiGetBB200 1_column_family 10000 1024 100 1024 thrpt 25 5586.758 ± 213.008 ops/s
MultiGetBenchmarks.multiGetBB200 1_column_family 10000 1024 100 4096 thrpt 25 2527.172 ± 17.106 ops/s
MultiGetBenchmarks.multiGetBB200 1_column_family 10000 1024 100 16384 thrpt 25 1819.547 ± 12.958 ops/s
MultiGetBenchmarks.multiGetBB200 1_column_family 10000 1024 100 65536 thrpt 25 803.861 ± 9.963 ops/s
MultiGetBenchmarks.multiGetBB200 20_column_families 10000 1024 100 256 thrpt 25 5253.793 ± 28.020 ops/s
MultiGetBenchmarks.multiGetBB200 20_column_families 10000 1024 100 1024 thrpt 25 5705.591 ± 20.556 ops/s
MultiGetBenchmarks.multiGetBB200 20_column_families 10000 1024 100 4096 thrpt 25 2523.377 ± 15.415 ops/s
MultiGetBenchmarks.multiGetBB200 20_column_families 10000 1024 100 16384 thrpt 25 1815.344 ± 11.309 ops/s
MultiGetBenchmarks.multiGetBB200 20_column_families 10000 1024 100 65536 thrpt 25 820.792 ± 3.192 ops/s
MultiGetBenchmarks.multiGetBB200 100_column_families 10000 1024 100 256 thrpt 25 5262.184 ± 20.477 ops/s
MultiGetBenchmarks.multiGetBB200 100_column_families 10000 1024 100 1024 thrpt 25 5706.959 ± 23.123 ops/s
MultiGetBenchmarks.multiGetBB200 100_column_families 10000 1024 100 4096 thrpt 25 2520.362 ± 9.170 ops/s
MultiGetBenchmarks.multiGetBB200 100_column_families 10000 1024 100 16384 thrpt 25 1789.185 ± 14.239 ops/s
MultiGetBenchmarks.multiGetBB200 100_column_families 10000 1024 100 65536 thrpt 25 818.401 ± 12.132 ops/s
MultiGetBenchmarks.multiGetList10 no_column_family 10000 1024 100 256 thrpt 25 6978.310 ± 14.084 ops/s
MultiGetBenchmarks.multiGetList10 no_column_family 10000 1024 100 1024 thrpt 25 7664.242 ± 22.304 ops/s
MultiGetBenchmarks.multiGetList10 no_column_family 10000 1024 100 4096 thrpt 25 2881.778 ± 81.054 ops/s
MultiGetBenchmarks.multiGetList10 no_column_family 10000 1024 100 16384 thrpt 25 1599.826 ± 7.190 ops/s
MultiGetBenchmarks.multiGetList10 no_column_family 10000 1024 100 65536 thrpt 25 737.520 ± 6.809 ops/s
MultiGetBenchmarks.multiGetList10 1_column_family 10000 1024 100 256 thrpt 25 6974.376 ± 10.716 ops/s
MultiGetBenchmarks.multiGetList10 1_column_family 10000 1024 100 1024 thrpt 25 7637.440 ± 45.877 ops/s
MultiGetBenchmarks.multiGetList10 1_column_family 10000 1024 100 4096 thrpt 25 2820.472 ± 42.231 ops/s
MultiGetBenchmarks.multiGetList10 1_column_family 10000 1024 100 16384 thrpt 25 1716.663 ± 8.527 ops/s
MultiGetBenchmarks.multiGetList10 1_column_family 10000 1024 100 65536 thrpt 25 755.848 ± 7.514 ops/s
MultiGetBenchmarks.multiGetList10 20_column_families 10000 1024 100 256 thrpt 25 6943.651 ± 20.040 ops/s
MultiGetBenchmarks.multiGetList10 20_column_families 10000 1024 100 1024 thrpt 25 7679.415 ± 9.114 ops/s
MultiGetBenchmarks.multiGetList10 20_column_families 10000 1024 100 4096 thrpt 25 2844.564 ± 13.388 ops/s
MultiGetBenchmarks.multiGetList10 20_column_families 10000 1024 100 16384 thrpt 25 1729.545 ± 5.983 ops/s
MultiGetBenchmarks.multiGetList10 20_column_families 10000 1024 100 65536 thrpt 25 783.218 ± 1.530 ops/s
MultiGetBenchmarks.multiGetList10 100_column_families 10000 1024 100 256 thrpt 25 6944.276 ± 29.995 ops/s
MultiGetBenchmarks.multiGetList10 100_column_families 10000 1024 100 1024 thrpt 25 7670.301 ± 8.986 ops/s
MultiGetBenchmarks.multiGetList10 100_column_families 10000 1024 100 4096 thrpt 25 2839.828 ± 12.421 ops/s
MultiGetBenchmarks.multiGetList10 100_column_families 10000 1024 100 16384 thrpt 25 1730.005 ± 9.209 ops/s
MultiGetBenchmarks.multiGetList10 100_column_families 10000 1024 100 65536 thrpt 25 787.096 ± 1.977 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 256 thrpt 25 6896.944 ± 21.530 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 1024 thrpt 25 7622.407 ± 12.824 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 4096 thrpt 25 2927.538 ± 19.792 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 16384 thrpt 25 1598.041 ± 4.312 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 65536 thrpt 25 744.564 ± 9.236 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 1_column_family 10000 1024 100 256 thrpt 25 6853.760 ± 78.041 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 1_column_family 10000 1024 100 1024 thrpt 25 7360.917 ± 355.365 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 1_column_family 10000 1024 100 4096 thrpt 25 2848.774 ± 13.409 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 1_column_family 10000 1024 100 16384 thrpt 25 1727.688 ± 3.329 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 1_column_family 10000 1024 100 65536 thrpt 25 776.088 ± 7.517 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 20_column_families 10000 1024 100 256 thrpt 25 6910.339 ± 14.366 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 20_column_families 10000 1024 100 1024 thrpt 25 7633.660 ± 10.830 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 20_column_families 10000 1024 100 4096 thrpt 25 2787.799 ± 81.775 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 20_column_families 10000 1024 100 16384 thrpt 25 1726.517 ± 6.830 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 20_column_families 10000 1024 100 65536 thrpt 25 787.597 ± 3.362 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 100_column_families 10000 1024 100 256 thrpt 25 6922.445 ± 10.493 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 100_column_families 10000 1024 100 1024 thrpt 25 7604.710 ± 48.043 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 100_column_families 10000 1024 100 4096 thrpt 25 2848.788 ± 15.783 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 100_column_families 10000 1024 100 16384 thrpt 25 1730.837 ± 6.497 ops/s
MultiGetBenchmarks.multiGetListExplicitCF20 100_column_families 10000 1024 100 65536 thrpt 25 794.557 ± 1.869 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 256 thrpt 25 6918.716 ± 15.766 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 1024 thrpt 25 7626.692 ± 9.394 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 4096 thrpt 25 2871.382 ± 72.155 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 16384 thrpt 25 1598.786 ± 4.819 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 65536 thrpt 25 748.469 ± 7.234 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 1_column_family 10000 1024 100 256 thrpt 25 6922.666 ± 17.131 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 1_column_family 10000 1024 100 1024 thrpt 25 7623.890 ± 8.805 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 1_column_family 10000 1024 100 4096 thrpt 25 2850.698 ± 18.004 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 1_column_family 10000 1024 100 16384 thrpt 25 1727.623 ± 4.868 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 1_column_family 10000 1024 100 65536 thrpt 25 774.534 ± 10.025 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 20_column_families 10000 1024 100 256 thrpt 25 5486.251 ± 13.582 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 20_column_families 10000 1024 100 1024 thrpt 25 4920.656 ± 44.557 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 20_column_families 10000 1024 100 4096 thrpt 25 3922.913 ± 25.686 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 20_column_families 10000 1024 100 16384 thrpt 25 2873.106 ± 4.336 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 20_column_families 10000 1024 100 65536 thrpt 25 802.404 ± 8.967 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 100_column_families 10000 1024 100 256 thrpt 25 4817.996 ± 18.042 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 100_column_families 10000 1024 100 1024 thrpt 25 4243.922 ± 13.929 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 100_column_families 10000 1024 100 4096 thrpt 25 3175.998 ± 7.773 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 100_column_families 10000 1024 100 16384 thrpt 25 2321.990 ± 12.501 ops/s
MultiGetBenchmarks.multiGetListRandomCF30 100_column_families 10000 1024 100 65536 thrpt 25 1753.028 ± 7.130 ops/s
```
Closes https://github.com/facebook/rocksdb/issues/11518
Pull Request resolved: https://github.com/facebook/rocksdb/pull/12344
Reviewed By: cbi42
Differential Revision: D54809714
Pulled By: pdillinger
fbshipit-source-id: bee3b949720abac073bce043b59ce976a11e99eb
2024-03-12 19:42:08 +00:00
    jbyteArray jbuffer =
        ROCKSDB_NAMESPACE::JniUtil::createJavaByteArrayWithSizeCheck(
            env_, pinnable_slice_.data(), pinnable_len);
    KVException::ThrowOnError(env_);  // OutOfMemoryError

    return jbuffer;
  }

 private:
  JNIEnv* env_;
  jbyteArray jbuffer_;
  jint jbuffer_off_;
  jint jbuffer_len_;
  PinnableSlice pinnable_slice_;
};

/**
 * @brief Wrap a pinnable slice with a method to retrieve the contents back into
 * Java
 *
 * The Java Direct Buffer version copies the memory of the buffer from the slice
 */
class JDirectBufferPinnableSlice {
 public:
  JDirectBufferPinnableSlice(JNIEnv* env, const jobject& jbuffer,
                             const jint jbuffer_off, const jint jbuffer_len)
      : buffer_(static_cast<char*>(env->GetDirectBufferAddress(jbuffer)) +
                jbuffer_off),
        jbuffer_len_(jbuffer_len) {
    jlong capacity = env->GetDirectBufferCapacity(jbuffer);
    if (capacity < jbuffer_off + jbuffer_len) {
      auto message =
          "Invalid value argument. Capacity is less than requested region. "
          "offset " +
          std::to_string(jbuffer_off) + " + length " +
          std::to_string(jbuffer_len) + " exceeds capacity " +
          std::to_string(capacity);
      KVException::ThrowNew(env, message);
    }
  }

  PinnableSlice& pinnable_slice() { return pinnable_slice_; }

  ~JDirectBufferPinnableSlice() { pinnable_slice_.Reset(); };

  /**
   * @brief copy back contents of the pinnable slice into the Java DirectBuffer
   *
   * At most jbuffer_len bytes are copied; the return value is not truncated.
   *
   * @return jint the full number of bytes in the value for the requested key,
   * which may exceed the number of bytes copied into the buffer
   */
  jint Fetch() {
    const jint pinnable_len = static_cast<jint>(pinnable_slice_.size());
    const jint result_len = std::min(jbuffer_len_, pinnable_len);

    memcpy(buffer_, pinnable_slice_.data(), result_len);
    return pinnable_len;
  };

 private:
  char* buffer_;
  jint jbuffer_len_;
  PinnableSlice pinnable_slice_;
};

}  // namespace ROCKSDB_NAMESPACE