1. 07 Jun, 2019 3 commits
  2. 06 Jun, 2019 4 commits
    • Add support for timestamp in Get/Put (#5079) · 340ed4fa
      Yanqin Jin authored
      Summary:
      It's useful to be able to (optionally) associate key-value pairs with user-provided timestamps. This PR is an early effort towards this goal and continues the work of facebook#4942. A suite of new unit tests exists in DBBasicTestWithTimestampWithParam. Support for timestamps requires the user to provide the timestamp as a slice in `ReadOptions` and `WriteOptions`. All timestamps in the same database must share the same length and format, and the user is responsible for providing a comparator function (Comparator) to order the <key, timestamp> tuples. Once the database is created, the format and length of the timestamp cannot change (at least for now).
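
      A minimal sketch of how a comparator might order <key, timestamp> tuples, assuming the timestamp is a fixed-length suffix appended to the user key. `kTimestampSize`, `CompareKeyWithTimestamp`, the 8-byte length, and the newest-first ordering are all assumptions made for illustration; they are not taken from this PR or from the RocksDB API.

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <string>

      // Hypothetical fixed timestamp length (the summary only says the length
      // cannot change once the database is created).
      constexpr std::size_t kTimestampSize = 8;

      // Compares two encoded entries of the form <user key><timestamp>.
      // Returns a negative value, zero, or a positive value.
      // Assumption: keys are ordered bytewise, and for equal keys the larger
      // (newer) timestamp sorts first.
      int CompareKeyWithTimestamp(const std::string& a, const std::string& b) {
        assert(a.size() >= kTimestampSize && b.size() >= kTimestampSize);
        const std::string key_a = a.substr(0, a.size() - kTimestampSize);
        const std::string key_b = b.substr(0, b.size() - kTimestampSize);
        if (int r = key_a.compare(key_b)) return r;  // different user keys
        const std::string ts_a = a.substr(a.size() - kTimestampSize);
        const std::string ts_b = b.substr(b.size() - kTimestampSize);
        return ts_b.compare(ts_a);  // reversed: newer timestamp first
      }
      ```

      Because every timestamp has the same length, the key/timestamp boundary can be found without a separator, which is one reason a fixed length is convenient.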
      
      Test plan (on devserver):
      ```
      $ COMPILE_WITH_ASAN=1 make -j32 all
      $ ./db_basic_test --gtest_filter=Timestamp/DBBasicTestWithTimestampWithParam.PutAndGet/*
      $ make check
      ```
      All tests must pass.
      
      We also ran the following db_bench tests to check for a regression in Get/Put when timestamp is not enabled.
      ```
      $ TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=fillseq,readrandom -num=1000000
      $ TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=fillrandom -num=1000000
      ```
      Repeat 6 times for each version.
      
      Results are as follows:
      ```
      |        | readrandom | fillrandom |
      | master | 16.77 MB/s | 47.05 MB/s |
      | PR5079 | 16.44 MB/s | 47.03 MB/s |
      ```
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5079
      
      Differential Revision: D15132946
      
      Pulled By: riversand963
      
      fbshipit-source-id: 833a0d657eac21182f0f206c910a6438154c742c
    • Fix tsan error (#5414) · cb1bf09b
      Yanqin Jin authored
      Summary:
      The previous code triggers a warning when compiled with TSAN, which becomes an error because we build with -Werror.
      Compilation result:
      ```
      In file included from ./env/env_chroot.h:12,
                       from env/env_test.cc:40:
      ./include/rocksdb/env.h: In instantiation of ‘rocksdb::Status rocksdb::DynamicLibrary::LoadFunction(const string&, std::function<T>*) [with T = void*(void*, const char*); std::__cxx11::string = std::__cxx11::basic_string<char>]’:
      env/env_test.cc:260:5:   required from here
      ./include/rocksdb/env.h:1010:17: error: cast between incompatible function types from ‘rocksdb::DynamicLibrary::FunctionPtr’ {aka ‘void* (*)()’} to ‘void* (*)(void*, const char*)’ [-Werror=cast-function-type]
           *function = reinterpret_cast<T*>(ptr);
                       ^~~~~~~~~~~~~~~~~~~~~~~~~
      cc1plus: all warnings being treated as errors
      make: *** [env/env_test.o] Error 1
      ```
      Clang also reports a warning:
      ```
      env/env_posix.cc:141:11: warning: Value stored to 'err' during its initialization is never read
          char* err = dlerror();  // Clear any old error
                ^~~   ~~~~~~~~~
      1 warning generated.
      ```
      
      Test plan (on my devserver):
      ```
      $ make clean
      $ OPT=-g ROCKSDB_FBCODE_BUILD_WITH_PLATFORM007=1 COMPILE_WITH_TSAN=1 make -j32

      $ make clean
      $ USE_CLANG=1 TEST_TMPDIR=/dev/shm/rocksdb OPT=-g make -j1 analyze
      ```
      Both should pass.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5414
      
      Differential Revision: D15637315
      
      Pulled By: riversand963
      
      fbshipit-source-id: 8e307483761019a4d5998cab92d49516d7edffbf
    • Disable dynamic extension support by default for CMake (#5419) · 267b9b10
      Yanqin Jin authored
      Summary:
      Users have reported linking errors when building RocksDB with CMake, and we do not enable the dynamic extension feature for them. The fix is to add `-DROCKSDB_NO_DYNAMIC_EXTENSION` to CMake by default.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5419
      
      Differential Revision: D15676792
      
      Pulled By: riversand963
      
      fbshipit-source-id: d45aaacfc64ea61646fd7329c352cd760145baf3
    • Add a MultiRead() method to Env (#5311) · 0153e145
      anand76 authored
      Summary:
      Define the Env::MultiRead() method to allow callers to request multiple block reads in one shot. The underlying Env implementation can parallelize the reads, if it chooses to, in order to reduce the overall IO latency.
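
      A rough sketch of the batching idea, assuming a request type that carries an offset, a length, and a result slot. `BlockReadRequest` and this free-standing `MultiRead` are hypothetical stand-ins; the actual signature added by the PR is not shown in this summary.

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <string>
      #include <vector>

      // Hypothetical request type: one block to read from a file.
      struct BlockReadRequest {
        std::size_t offset = 0;
        std::size_t len = 0;
        std::string result;  // filled in by MultiRead on success
        bool ok = false;     // false if the requested range is out of bounds
      };

      // Serves every request in the batch. This sketch reads from an in-memory
      // buffer one request at a time; an implementation backed by real storage
      // could issue the reads in parallel instead.
      void MultiRead(const std::string& file,
                     std::vector<BlockReadRequest>* reqs) {
        assert(reqs != nullptr);
        for (BlockReadRequest& r : *reqs) {
          r.ok = r.offset <= file.size() && r.len <= file.size() - r.offset;
          if (r.ok) r.result = file.substr(r.offset, r.len);
        }
      }
      ```

      Each request carries its own status, so one failed read does not hide the results of the others in the same batch.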
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5311
      
      Differential Revision: D15502172
      
      Pulled By: anand1976
      
      fbshipit-source-id: 2b228269c2e11b5f54694d6b2bb3119c8a8ce2b9
  3. 05 Jun, 2019 2 commits
  4. 04 Jun, 2019 5 commits
    • Add support for loading dynamic libraries into the RocksDB environment (#5281) · c8267120
      Mark Rambacher authored
      Summary:
      This change adds a DynamicLibrary class to the RocksDB Env. Dynamic libraries are populated via the Env::LoadLibrary method.
      
      The addition of dynamic library support allows for a few different features to be developed:
      1. The compression code can be changed to use dynamic library support. This would allow RocksDB to determine at run-time what compression packages were installed. This change would eliminate the need to make sure the build-time and run-time environments had the same library set. It would also simplify some of the Java build issues (where it attempts to build and include various packages inside the RocksDB jars).
      
      2. Along with other features (to be provided in a subsequent PR), this change would allow code/configurations to be added to RocksDB at run-time. For example, the build system includes code for building a "rados" environment and adding "Cassandra" features. Instead of these extensions being built into the base RocksDB code, these extensions could be loaded at run-time as required/appropriate, either by configuration or explicitly.
      
      We intend to push out other changes in support of extending RocksDB at run-time via configurations.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5281
      
      Differential Revision: D15447613
      
      Pulled By: riversand963
      
      fbshipit-source-id: 452cd4f54511c0bceee18f6d9d919aae9fd25fef
    • Ignore shutdown error during compaction (#5400) · 5d6e8df1
      anand76 authored
      Summary:
      PR #5275 separated the column dropped and shutdown status codes. However, in a couple of places in compaction this change ended up treating a ShutdownInProgress() error as a real error and setting bg_error. This caused a MyRocks unit test to fail because WAL writes during shutdown returned this error. Fix it by ignoring the shutdown status during compaction.
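
      The rule described above can be sketched as follows. The `Code` enum and `MaybeSetBackgroundError` are simplified, hypothetical stand-ins for the real status and error-handling code, which this summary does not show.

      ```cpp
      #include <cassert>

      // Hypothetical, simplified status codes for this sketch.
      enum class Code { kOk, kShutdownInProgress, kIOError };

      // Records `s` as the background error unless it is an expected outcome.
      // A shutdown status means the work was interrupted on purpose, so it is
      // ignored rather than stored as a failure.
      void MaybeSetBackgroundError(Code s, Code* bg_error) {
        assert(bg_error != nullptr);
        if (s == Code::kOk || s == Code::kShutdownInProgress) return;
        *bg_error = s;
      }
      ```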
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5400
      
      Differential Revision: D15611680
      
      Pulled By: anand1976
      
      fbshipit-source-id: c602e97840e3ae24eb420d61e0ce95d3e6258632
    • Call ValidateOptions from SetOptions (#5368) · ae05a83e
      Maysam Yabandeh authored
      Summary:
      Currently we validate options in DB::Open. However, the validation step is missing when options are dynamically updated in ::SetOptions. This patch fixes that.
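
      A small sketch of the validate-before-commit pattern this implies. The `Options` fields, the `low_watermark <= high_watermark` invariant, and both function bodies are hypothetical; only the idea of running validation on the dynamic update path comes from the summary.

      ```cpp
      #include <cassert>
      #include <map>
      #include <string>

      // Hypothetical options with one cross-field invariant: low <= high.
      struct Options {
        int low_watermark = 2;
        int high_watermark = 8;
      };

      bool ValidateOptions(const Options& o) {
        return o.low_watermark >= 0 && o.low_watermark <= o.high_watermark;
      }

      // Applies `updates` to a copy, validates the copy, and commits only on
      // success, so a rejected update leaves `*current` untouched.
      bool SetOptions(Options* current,
                      const std::map<std::string, int>& updates) {
        assert(current != nullptr);
        Options candidate = *current;
        for (const auto& kv : updates) {
          if (kv.first == "low_watermark") {
            candidate.low_watermark = kv.second;
          } else if (kv.first == "high_watermark") {
            candidate.high_watermark = kv.second;
          } else {
            return false;  // unknown option name
          }
        }
        if (!ValidateOptions(candidate)) return false;
        *current = candidate;
        return true;
      }
      ```

      Validating a copy keeps the live options consistent even when an update is rejected halfway through.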
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5368
      
      Differential Revision: D15540101
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: d27bbffd8f0252d1b50bcf59e0a70a278ed937f4
    • Move util/trace_replay.* to trace_replay/ (#5376) · 5851cb7f
      Siying Dong authored
      Summary:
      util/ is meant for lower-level libraries. trace_replay is tightly integrated with DB and sometimes calls into DB. Move it out to a separate directory.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5376
      
      Differential Revision: D15550938
      
      Pulled By: siying
      
      fbshipit-source-id: f46dce5ceffdc05a73f26379c7bb1b79ebe6c207
    • Make GetEntryFromCache a member function. (#5394) · 349db904
      haoyuhuang authored
      Summary:
      This commit makes GetEntryFromCache a member function. It also makes all of its callers member functions.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5394
      
      Differential Revision: D15579222
      
      Pulled By: HaoyuHuang
      
      fbshipit-source-id: 07509c42ee9022dcded54950012bd3bd562aa1ae
  5. 01 Jun, 2019 8 commits
  6. 31 May, 2019 14 commits
    • Fix compilation error in LITE mode (#5391) · 83f7a8ee
      Yanqin Jin authored
      Summary:
      Add macro ROCKSDB_LITE to fix compilation.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5391
      
      Differential Revision: D15574522
      
      Pulled By: riversand963
      
      fbshipit-source-id: 95aea83c5d9b2bf98a3ba0ef9167b63c9be2988b
    • move LevelCompactionPicker to a separate file (#5369) · ab8f6c01
      Zhongyi Xie authored
      Summary:
      In order to improve code readability, this PR moves LevelCompactionBuilder and LevelCompactionPicker to compaction_picker_level.h and .cc
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5369
      
      Differential Revision: D15540172
      
      Pulled By: miasantreble
      
      fbshipit-source-id: c1a578b93f127cd63661b53f32b356e6edd349af
    • Reorder DBImpl's private section (#5385) · ff9d2868
      Sagar Vemuri authored
      Summary:
      The methods and fields in the private section of DBImpl were all intermingled, making it hard to figure out where the fields/methods start and where they end. I cleaned up the code a little so that all the type declarations are at the beginning, followed by methods, and all the data fields are at the end. This follows
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5385
      
      Differential Revision: D15566978
      
      Pulled By: sagar0
      
      fbshipit-source-id: 4618a7d819ad4e2d7cc9ae1af2c59f400140bb1b
    • Fix WAL replay by skipping old write batches (#5170) · b9f59006
      Yanqin Jin authored
      Summary:
      1. Fix a bug in WAL replay in which write batches with old sequence numbers are mistakenly inserted into memtables.
      2. Add support for benchmarking a secondary instance to db_bench_tool.
      With the changes made in this PR, we can start benchmarking a secondary instance
      using two processes. It is also possible to vary the frequency at which the
      secondary instance tries to catch up with the primary. The info log of the
      secondary can be found in a directory whose path can be specified with
      '-secondary_path'.
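
      The first fix above can be illustrated with a simplified replay loop. `WalRecord`, `ReplayWal`, and the one-sequence-number-per-batch model are assumptions of this sketch, not the actual replay code.

      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <string>
      #include <vector>

      // Hypothetical WAL record: one write batch tagged with a sequence number.
      struct WalRecord {
        std::uint64_t sequence;
        std::string payload;
      };

      // Replays `wal` into `memtable`, skipping batches whose sequence number
      // is not newer than `*last_applied`, and advances `*last_applied`.
      void ReplayWal(const std::vector<WalRecord>& wal,
                     std::uint64_t* last_applied,
                     std::vector<std::string>* memtable) {
        assert(last_applied != nullptr && memtable != nullptr);
        for (const WalRecord& rec : wal) {
          if (rec.sequence <= *last_applied) continue;  // old batch: skip it
          memtable->push_back(rec.payload);
          *last_applied = rec.sequence;
        }
      }
      ```

      Without the sequence check, replaying the same log twice would insert every batch a second time.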
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5170
      
      Differential Revision: D15564608
      
      Pulled By: riversand963
      
      fbshipit-source-id: ce97688ed3d33f69d3a0b9266ebbbbf887aa0ec8
    • Move some memory related files from util/ to memory/ (#5382) · 8843129e
      Siying Dong authored
      Summary:
      Move arena, allocator, and memory tools under util to a separate memory/ directory.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5382
      
      Differential Revision: D15564655
      
      Pulled By: siying
      
      fbshipit-source-id: 9cd6b5d0d3d52b39606e19221fa154596e5852a5
    • Add class-level comments to version-related classes (#5348) · f1302eba
      Yanqin Jin authored
      Summary:
      As title.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5348
      
      Differential Revision: D15564595
      
      Pulled By: riversand963
      
      fbshipit-source-id: dd45aa86a70e0343c2e9ef702fad165163f548e6
    • Fix flaky DBTest2.PresetCompressionDict test (#5378) · 1b59a490
      Sagar Vemuri authored
      Summary:
      Fix flaky DBTest2.PresetCompressionDict test.
      
      This PR fixes two issues with the test:
      1. Replaces `GetSstFiles` with `TotalSize`, which is based on `DB::GetColumnFamilyMetaData`, so that only the size of the live SST files is taken into consideration when computing the total size of all SST files. Earlier, with `GetSstFiles`, even obsolete files were getting picked up.
      2. In ZSTD compression, it is sometimes possible that using a trained dictionary is not better than using an untrained one. Using a trained dictionary performs well in 99% of the cases, but in the remaining ~1% of the cases (out of 10000 runs) using an untrained dictionary gets better compression results.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5378
      
      Differential Revision: D15559100
      
      Pulled By: sagar0
      
      fbshipit-source-id: c35adbf13871f520a2cec48f8bad9ff27ff7a0b4
    • Organizing rocksdb/table directory by format · 50e47079
      Vijay Nadimpalli authored
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/5373
      
      Differential Revision: D15559425
      
      Pulled By: vjnadimpalli
      
      fbshipit-source-id: 5d6d6d615582bedd96a4b879bb25d429a6de8b55
    • Fix env_options_for_read spelling in CompactionJob · e6298626
      Sagar Vemuri authored
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/5380
      
      Differential Revision: D15563386
      
      Pulled By: sagar0
      
      fbshipit-source-id: 8b26aef47cfc40ff8016daf815582f21cdd40df2
    • Move the index readers out of the block cache (#5298) · 1e355842
      Levi Tamasi authored
      Summary:
      Currently, when the block cache is used for index blocks as well, it is
      not really the index block that is stored in the cache but an
      IndexReader object. Since this object is not pure data (it has, for
      instance, pointers that might dangle), it's not really sharable. To
      avoid the issues around this, the current code uses a dummy unique cache
      key for each TableReader to store the IndexReader, and erases the
      IndexReader entry when the TableReader is closed. Instead of doing this,
      the new code moves the IndexReader out of the cache altogether. In
      particular, instead of the TableReader owning, or caching/pinning the
      IndexReader based on the customer's settings, the TableReader
      unconditionally owns the IndexReader, which in turn owns/caches/pins
      the index block (which is itself sharable and thus can be safely put in
      the cache without any hacks).
      
      Note: the change has two side effects:
      1) Partitions of partitioned indexes no longer affect the read
      amplification statistics.
      2) Eviction statistics for index blocks are temporarily broken. We plan to fix
      this in a separate phase.
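
      The ownership arrangement described in this summary can be sketched roughly as below. All class bodies here are hypothetical and heavily simplified; the sketch only shows the shape: the table reader always owns its index reader, and the index reader holds a shared reference to a pure-data index block that lives in the cache.

      ```cpp
      #include <cassert>
      #include <map>
      #include <memory>
      #include <string>
      #include <utility>

      // Pure data, so it is safe to share between readers through the cache.
      struct IndexBlock {
        std::string data;
      };

      // Hypothetical cache mapping a key to a shared, immutable block.
      class BlockCache {
       public:
        std::shared_ptr<const IndexBlock> GetOrInsert(const std::string& key,
                                                      const std::string& data) {
          auto it = blocks_.find(key);
          if (it == blocks_.end()) {
            it = blocks_.emplace(key, std::make_shared<IndexBlock>(
                                          IndexBlock{data})).first;
          }
          return it->second;
        }

       private:
        std::map<std::string, std::shared_ptr<const IndexBlock>> blocks_;
      };

      // Not pure data in the real code, so it is never placed in the cache.
      class IndexReader {
       public:
        explicit IndexReader(std::shared_ptr<const IndexBlock> block)
            : block_(std::move(block)) {}
        const std::string& data() const { return block_->data; }

       private:
        std::shared_ptr<const IndexBlock> block_;
      };

      // Unconditionally owns its IndexReader.
      class TableReader {
       public:
        TableReader(BlockCache* cache, const std::string& file) {
          assert(cache != nullptr);
          index_.reset(
              new IndexReader(cache->GetOrInsert(file, "index of " + file)));
        }
        const IndexReader& index() const { return *index_; }

       private:
        std::unique_ptr<IndexReader> index_;
      };
      ```

      Two table readers opened on the same file get distinct index readers but share a single cached block, so no per-reader dummy cache key is needed.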
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5298
      
      Differential Revision: D15303203
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 935a69ba59d87d5e44f42e2310619b790c366e47
    • Fix reopen voting logic in db_stress when using MultiGet (#5374) · bd44ec20
      anand76 authored
      Summary:
      When the --reopen option is non-zero, the DB is reopened after every ops_per_thread/(reopen+1) ops, with the check being done after every op. With MultiGet, we might do multiple ops in one iteration, which broke the logic that checked when to synchronize among the threads and reopen the DB. This PR fixes that logic.
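
      To illustrate why a per-op check can break when several ops are done per iteration, here is a small sketch. Both functions are hypothetical and are not the db_stress code; the second one shows one possible robust check, not necessarily the one this PR adopted.

      ```cpp
      #include <cassert>
      #include <cstdint>

      // Naive check: fires only if the counter lands exactly on a multiple of
      // `interval`, so it can miss reopen points when the counter jumps.
      bool NaiveShouldReopen(std::uint64_t ops_done, std::uint64_t interval) {
        assert(interval > 0);
        return ops_done % interval == 0;
      }

      // Robust check: counts the reopen points crossed while the counter
      // advanced from `before` to `after`, regardless of the step size.
      std::uint64_t ReopenPointsCrossed(std::uint64_t before,
                                        std::uint64_t after,
                                        std::uint64_t interval) {
        assert(interval > 0 && after >= before);
        return after / interval - before / interval;
      }
      ```

      For example, with ops_per_thread = 100 and reopen = 3 the interval is 25. If each iteration performs 10 ops, the naive check fires only at 50 and 100, while the crossing count adds up to 4 over the whole run.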
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5374
      
      Differential Revision: D15559780
      
      Pulled By: anand1976
      
      fbshipit-source-id: ee6563a68045df7f367eca3cbc2500d3e26359ef
    • Move test related files under util/ to test_util/ (#5377) · e9e0101c
      Siying Dong authored
      Summary:
      There are too many types of files under util/. Some test-related files don't belong there or are only loosely related. Move them to a new directory test_util/, so that util/ is cleaner.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5377
      
      Differential Revision: D15551366
      
      Pulled By: siying
      
      fbshipit-source-id: 0f5c8653832354ef8caa31749c0143815d719e2c
    • Increase Trash/DB size ratio in DBSSTTest.RateLimitedWALDelete (#5366) · a984040f
      anand76 authored
      Summary:
      By increasing the ratio, we ensure that all files go through background deletion and eliminate flakiness due to timing of deletions.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5366
      
      Differential Revision: D15549992
      
      Pulled By: anand1976
      
      fbshipit-source-id: d137375cd791fc1a802841412755d6e2b8fd7688
    • Fix FIFO dynamic options sanitization (#5367) · 87fe4bca
      Zhongyi Xie authored
      Summary:
      When dynamically setting options, we check the option type info and skip options that are marked deprecated. However, this check is only done at the top level, which results in bugs where SetOptions corrupts option values and causes unexpected system behavior if a deprecated second-level option is set dynamically.
      For example, the following call:
      ```
      dbfull()->SetOptions(
          {{"compaction_options_fifo",
              "{allow_compaction=true;max_table_files_size=1024;ttl=731;}"}});
      ```
      dates from before the 6.0 release, when `ttl` was part of `compaction_options_fifo`. Now that `ttl` has been moved out of `compaction_options_fifo`, this call incorrectly sets `compaction_options_fifo.max_table_files_size` to 731 (as `max_table_files_size` is the first field in the `OptionsHelper::fifo_compaction_options_type_info` struct) and causes files to get evicted much faster than expected.
      
      This PR adds verification for second-level options like `compaction_options_fifo.ttl` or `compaction_options_fifo.max_table_files_size` when they are set dynamically, and filters out those marked as deprecated.
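
      A simplified sketch of applying second-level options by name while skipping deprecated ones. `FifoOptions` and `ApplyFifoOptions` are hypothetical; the field names come from the example call above, and the input is the text between the braces.

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <cstdint>
      #include <sstream>
      #include <string>

      // Hypothetical stand-in for the nested FIFO options.
      struct FifoOptions {
        std::uint64_t max_table_files_size = 0;
        bool allow_compaction = false;
      };

      // Parses "name=value;name=value;" and applies each field by name.
      // The deprecated name `ttl` is accepted but skipped, so it can never
      // overwrite another field. Returns false on malformed or unknown input.
      bool ApplyFifoOptions(const std::string& text, FifoOptions* out) {
        assert(out != nullptr);
        std::istringstream in(text);
        std::string item;
        while (std::getline(in, item, ';')) {
          if (item.empty()) continue;
          const std::size_t eq = item.find('=');
          if (eq == std::string::npos) return false;
          const std::string name = item.substr(0, eq);
          const std::string value = item.substr(eq + 1);
          if (name == "ttl") continue;  // deprecated at this level: ignore
          if (name == "max_table_files_size") {
            out->max_table_files_size = std::stoull(value);
          } else if (name == "allow_compaction") {
            out->allow_compaction = (value == "true");
          } else {
            return false;
          }
        }
        return true;
      }
      ```

      With the input from the example call, `max_table_files_size` stays at 1024 instead of being overwritten with 731.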
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5367
      
      Differential Revision: D15530998
      
      Pulled By: miasantreble
      
      fbshipit-source-id: 818258be5c3abe09cd82d62f3c083572d70fecdd
  7. 30 May, 2019 1 commit
  8. 29 May, 2019 3 commits
    • WritePrepared: skip_concurrency_control option (#5330) · eab4f49a
      Maysam Yabandeh authored
      Summary:
      This enables the user to set TransactionDBOptions::skip_concurrency_control so the standard `DB::Write(const WriteOptions& opts, WriteBatch* updates)` would skip the concurrency control. This would give higher throughput to the users who know their use case doesn't need concurrency control.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5330
      
      Differential Revision: D15525932
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 68421ac1ba34f549a4a8de9ce4c2dccf6fb4b06b
    • WritePrepared: disableWAL in commit without prepare (#5327) · f5576c33
      Maysam Yabandeh authored
      Summary:
      When committing a transaction without prepare, WritePrepared simply writes the batch to the DB and adds the commit entry to CommitCache. When two_write_queues=true, following the rule of committing only from the 2nd write queue, the first write writes the batch, and the only thing the 2nd write does is write the commit entry to CommitCache. Currently the write batch in the 2nd write is set to an empty LogData entry, whereas the write to the WAL could simply be disabled entirely.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5327
      
      Differential Revision: D15424546
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 3d9ea3922d5196984c584d62a3ed57e1f7ca7b9f
    • Add comments in compaction_picker.h · 4d0c3b1f
      Siying Dong authored
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/5357
      
      Differential Revision: D15522825
      
      Pulled By: siying
      
      fbshipit-source-id: d775386b9d10c7179f5d3af2c821ed213abfacdf