1. 01 May, 2020 5 commits
    • anand76's avatar
      Pass a timeout to FileSystem for random reads (#6751) · ab13d43e
      anand76 authored
      Calculate `IOOptions::timeout` using `ReadOptions::deadline` and pass it to `FileSystem::Read`/`FileSystem::MultiRead`. This allows us to impose a tighter bound on the time taken by Get/MultiGet on FileSystems/Envs that support IO timeouts. Even on those that don't support them, check the deadline in `RandomAccessFileReader::Read` and `MultiRead` and return `Status::TimedOut()` if it is exceeded.
      For now, TableReader creation, which might do file opens and reads, is not covered. It will be implemented in another PR.
      Update existing unit tests to verify that the correct timeout value is being passed.
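      The deadline-to-timeout calculation described above could look roughly like the sketch below. `Micros`, `ReadStatus` and `TimeoutFromDeadline` are hypothetical names for illustration, not RocksDB's API, and treating a zero deadline as "no deadline" is an assumption of this sketch.

```cpp
#include <cassert>
#include <chrono>

using Micros = std::chrono::microseconds;

// Hypothetical stand-in for a status code; not the RocksDB Status class.
enum class ReadStatus { kOk, kTimedOut };

// Derives a per-call timeout from an absolute deadline.
// Assumption of this sketch: a zero deadline means "no deadline", and a zero
// timeout means "no timeout".
ReadStatus TimeoutFromDeadline(Micros deadline, Micros now, Micros* timeout) {
  *timeout = Micros::zero();
  if (deadline == Micros::zero()) {
    return ReadStatus::kOk;  // no deadline requested, so no timeout either
  }
  if (now >= deadline) {
    return ReadStatus::kTimedOut;  // deadline already exceeded
  }
  *timeout = deadline - now;  // remaining budget to hand to the file system
  return ReadStatus::kOk;
}
```

      A reader that cannot rely on the file system to enforce the timeout can run the same check before and after the read and report a timeout itself.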
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6751
      Reviewed By: riversand963
      Differential Revision: D21285631
      Pulled By: anand1976
      fbshipit-source-id: d89af843e5a91ece866e87aa29438b52a65a8567
    • Peter Dillinger's avatar
      Fix assertion that can fail on sst corruption (#6780) · eecd8fba
      Peter Dillinger authored
      An assertion that a char == a CompressionType (unsigned char)
      originally cast from a char can fail if the original value is negative,
      due to numeric promotion.  The assertion should pass even if the value
      is an invalid CompressionType, because the callee
      UncompressBlockContentsForCompressionType checks for that and reports
      status appropriately.
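      The promotion problem can be reproduced in isolation. The sketch below uses `signed char` explicitly (plain `char` signedness is implementation-defined) and the value 0x88 from the test plan; `NaiveEquals` and `SafeEquals` are hypothetical helpers, not code from the patch.

```cpp
#include <cassert>

// Compares a signed char with an unsigned char. Both operands are promoted
// to int first, so a raw byte of 0x88 becomes -120 on one side and 136 on
// the other, and the comparison is false.
bool NaiveEquals(signed char raw, unsigned char type) {
  return raw == type;
}

// Converts both sides to the same unsigned type before comparing, so the
// bit pattern is what gets compared.
bool SafeEquals(signed char raw, unsigned char type) {
  return static_cast<unsigned char>(raw) == type;
}
```

      For non-negative bytes such as 0x07 both helpers agree; they only diverge once the high bit is set.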
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6780
      Test Plan:
      Temporarily change kZSTD = 0x88 and see tests fail. Make this
      change (in addition), and tests pass.
      Reviewed By: siying
      Differential Revision: D21328498
      Pulled By: pdillinger
      fbshipit-source-id: 61caf8d815581ce49261ecb7ab0f396e9ac4bb92
    • Levi Tamasi's avatar
      Keep track of obsolete blob files in VersionSet (#6755) · fe238e54
      Levi Tamasi authored
      The patch adds logic to keep track of obsolete blob files. A blob file becomes
      obsolete when the last `shared_ptr` that points to the corresponding
      `SharedBlobFileMetaData` object goes away, which, in turn, happens when the
      last `Version` that contains the blob file is destroyed. No longer needed blob
      files are added to the obsolete list in `VersionSet` using a custom deleter to
      avoid unnecessary coupling between `SharedBlobFileMetaData` and `VersionSet`.
      Obsolete blob files are returned by `VersionSet::GetObsoleteFiles` and stored
      in `JobContext`.
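      The custom-deleter idea can be sketched with simplified stand-ins. `BlobMeta`, `ObsoleteList` and `MakeTrackedMeta` are hypothetical names for this illustration; the real `SharedBlobFileMetaData` and `VersionSet` are considerably richer.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical, simplified metadata object for one blob file.
struct BlobMeta {
  std::uint64_t file_number;
};

// Hypothetical stand-in for the obsolete-file list kept by the owner.
struct ObsoleteList {
  std::vector<std::uint64_t> files;
};

// Creates shared metadata whose deleter records the file as obsolete when
// the last shared_ptr goes away. BlobMeta itself knows nothing about the
// list, which keeps the two types decoupled.
std::shared_ptr<BlobMeta> MakeTrackedMeta(std::uint64_t file_number,
                                          ObsoleteList* obsolete) {
  return std::shared_ptr<BlobMeta>(
      new BlobMeta{file_number}, [obsolete](BlobMeta* meta) {
        obsolete->files.push_back(meta->file_number);
        delete meta;
      });
}
```

      In this sketch the list must outlive every `shared_ptr` created by `MakeTrackedMeta`, since the deleter holds a raw pointer to it.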
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6755
      Test Plan: `make check`
      Reviewed By: riversand963
      Differential Revision: D21233155
      Pulled By: ltamasi
      fbshipit-source-id: 47757e06fdc0127f27ed57f51abd27893d9a7b7a
    • Adam Retter's avatar
      Add Slack forum to README (#6773) · cf342464
      Adam Retter authored
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6773
      Reviewed By: siying
      Differential Revision: D21310229
      Pulled By: pdillinger
      fbshipit-source-id: c0d52d0c51121d307d7d5c1374abc7bf78b0c4cf
    • Ziyue Yang's avatar
      Add an option for parallel compression in db_stress (#6722) · e619a20e
      Ziyue Yang authored
      This commit adds a `compression_parallel_threads` option in
      db_stress. It also fixes the naming of parallel compression
      option in db_bench to keep it aligned with others.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6722
      Reviewed By: pdillinger
      Differential Revision: D21091385
      fbshipit-source-id: c9ba8c4e5cc327ff9e6094a6dc6a15fcff70f100
  2. 30 Apr, 2020 4 commits
  3. 29 Apr, 2020 6 commits
    • Peter Dillinger's avatar
      Fix LITE build (#6770) · 8086e5e2
      Peter Dillinger authored
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6770
      Test Plan: make LITE=1 check
      Reviewed By: ajkr
      Differential Revision: D21296261
      Pulled By: pdillinger
      fbshipit-source-id: b6075cc13a6d6db48617b7e0e9ebeea9364dfd9f
    • anand76's avatar
      Fix a valgrind failure due to DBBasicTestMultiGetDeadline (#6756) · 335ea73e
      anand76 authored
      Fix a valgrind failure.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6756
      Test Plan: valgrind_test
      Reviewed By: pdillinger
      Differential Revision: D21284660
      Pulled By: anand1976
      fbshipit-source-id: 39bf1bd130b6adb585ddbf2f9aa2f53dbf666f80
    • mrambacher's avatar
      Add Functions to OptionTypeInfo (#6422) · 618bf638
      mrambacher authored
      Added functions for parsing, serializing, and comparing elements to OptionTypeInfo.  These functions allow all of the special cases that could not be handled directly in the map of OptionTypeInfo to be moved into the map.  Using these functions, every type can be handled via the map rather than special cased.
      By adding these functions, the code for handling options can become more standardized (fewer special cases) and (eventually) handled completely by common classes.
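      As a rough illustration of the approach, the sketch below keeps per-option parse, serialize, and compare functions in a map so that generic code needs no special cases. `DemoOptions`, `DemoTypeInfo` and the helper functions are hypothetical and much simpler than the real `OptionTypeInfo`.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical options struct used only for this sketch.
struct DemoOptions {
  int max_threads = 1;
  bool verbose = false;
};

// Per-option behavior stored in the map instead of being special cased.
struct DemoTypeInfo {
  std::function<bool(const std::string&, DemoOptions*)> parse;
  std::function<std::string(const DemoOptions&)> serialize;
  std::function<bool(const DemoOptions&, const DemoOptions&)> equals;
};

std::map<std::string, DemoTypeInfo> BuildDemoTypeMap() {
  std::map<std::string, DemoTypeInfo> m;
  DemoTypeInfo threads;
  threads.parse = [](const std::string& v, DemoOptions* o) -> bool {
    try {
      o->max_threads = std::stoi(v);
    } catch (...) {
      return false;  // not a number
    }
    return true;
  };
  threads.serialize = [](const DemoOptions& o) -> std::string {
    return std::to_string(o.max_threads);
  };
  threads.equals = [](const DemoOptions& a, const DemoOptions& b) -> bool {
    return a.max_threads == b.max_threads;
  };
  m["max_threads"] = threads;

  DemoTypeInfo verbose;
  verbose.parse = [](const std::string& v, DemoOptions* o) -> bool {
    if (v != "true" && v != "false") return false;
    o->verbose = (v == "true");
    return true;
  };
  verbose.serialize = [](const DemoOptions& o) -> std::string {
    return o.verbose ? "true" : "false";
  };
  verbose.equals = [](const DemoOptions& a, const DemoOptions& b) -> bool {
    return a.verbose == b.verbose;
  };
  m["verbose"] = verbose;
  return m;
}

// Generic helpers: every option is handled through the map.
bool ParseOption(const std::map<std::string, DemoTypeInfo>& m,
                 const std::string& name, const std::string& value,
                 DemoOptions* opts) {
  auto it = m.find(name);
  return it != m.end() && it->second.parse(value, opts);
}

std::string SerializeAll(const std::map<std::string, DemoTypeInfo>& m,
                         const DemoOptions& opts) {
  std::string out;
  for (const auto& entry : m) {
    out += entry.first + "=" + entry.second.serialize(opts) + ";";
  }
  return out;
}

bool AllEqual(const std::map<std::string, DemoTypeInfo>& m,
              const DemoOptions& a, const DemoOptions& b) {
  for (const auto& entry : m) {
    if (!entry.second.equals(a, b)) return false;
  }
  return true;
}
```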
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6422
      Test Plan: pass make check
      Reviewed By: siying
      Differential Revision: D21269005
      Pulled By: zhichao-cao
      fbshipit-source-id: 9ba71c721a38ebf9ee88259d60bd81b3282b9077
    • Peter Dillinger's avatar
      Clarifying comments in db.h (#6768) · b810e62b
      Peter Dillinger authored
      And fix a confusingly worded log message
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6768
      Reviewed By: anand1976
      Differential Revision: D21284527
      Pulled By: pdillinger
      fbshipit-source-id: f03c1422c229a901c3a65e524740452349626164
    • Peter Dillinger's avatar
      Basic MultiGet support for partitioned filters (#6757) · bae6f586
      Peter Dillinger authored
      In MultiGet, access each applicable filter partition only once
      per batch, rather than for each applicable key. Also,
      * Fix Bloom stats for MultiGet
      * Fix/refactor MultiGetContext::Range::KeysLeft, including
      * Add efficient BitsSetToOne implementation
      * Assert that MultiGetContext::Range does not go beyond shift range
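      A bit-counting helper of the kind mentioned above can be sketched portably as follows. `CountBitsSet` and `KeysLeftInRange` are hypothetical illustrations that assume a batch is tracked by a 64-bit mask; the actual implementation may use compiler intrinsics instead.

```cpp
#include <cassert>
#include <cstdint>

// Portable population count (Kernighan's method): each iteration clears the
// lowest set bit, so the loop runs once per set bit.
int CountBitsSet(std::uint64_t v) {
  int count = 0;
  while (v != 0) {
    v &= v - 1;
    ++count;
  }
  return count;
}

// Builds a mask with the low `n` bits set, for 0 <= n <= 64. Shifting a
// 64-bit value by 64 is undefined behavior, hence the explicit guard.
std::uint64_t LowBits(int n) {
  assert(n >= 0 && n <= 64);
  return n == 64 ? ~std::uint64_t{0} : ((std::uint64_t{1} << n) - 1);
}

// Counts keys in positions [start, end) that are not marked in skip_mask.
int KeysLeftInRange(std::uint64_t skip_mask, int start, int end) {
  assert(start <= end);
  std::uint64_t in_range = LowBits(end) & ~LowBits(start);
  return CountBitsSet(in_range & ~skip_mask);
}
```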
      Performance test: Generate db:
          $ ./db_bench --benchmarks=fillrandom --num=15000000 --cache_index_and_filter_blocks -bloom_bits=10 -partition_index_and_filters=true
      Before (middle performing run of three; note some missing Bloom stats):
          $ ./db_bench --use-existing-db --benchmarks=multireadrandom --num=15000000 --cache_index_and_filter_blocks --bloom_bits=10 --threads=16 --cache_size=20000000 -partition_index_and_filters -batch_size=32 -multiread_batched -statistics --duration=20 2>&1 | egrep 'micros/op|block.cache.filter.hit|bloom.filter.(full|use)|number.multiget'
          multireadrandom :      26.403 micros/op 597517 ops/sec; (548427 of 671968 found)
          rocksdb.block.cache.filter.hit COUNT : 83443275
          rocksdb.bloom.filter.useful COUNT : 0
          rocksdb.bloom.filter.full.positive COUNT : 0
          rocksdb.bloom.filter.full.true.positive COUNT : 7931450
          rocksdb.number.multiget.get COUNT : 385984
          rocksdb.number.multiget.keys.read COUNT : 12351488
          rocksdb.number.multiget.bytes.read COUNT : 793145000
          rocksdb.number.multiget.keys.found COUNT : 7931450
      After (middle performing run of three):
          $ ./db_bench_new --use-existing-db --benchmarks=multireadrandom --num=15000000 --cache_index_and_filter_blocks --bloom_bits=10 --threads=16 --cache_size=20000000 -partition_index_and_filters -batch_size=32 -multiread_batched -statistics --duration=20 2>&1 | egrep 'micros/op|block.cache.filter.hit|bloom.filter.(full|use)|number.multiget'
          multireadrandom :      21.024 micros/op 752963 ops/sec; (705188 of 863968 found)
          rocksdb.block.cache.filter.hit COUNT : 49856682
          rocksdb.bloom.filter.useful COUNT : 45684579
          rocksdb.bloom.filter.full.positive COUNT : 10395458
          rocksdb.bloom.filter.full.true.positive COUNT : 9908456
          rocksdb.number.multiget.get COUNT : 481984
          rocksdb.number.multiget.keys.read COUNT : 15423488
          rocksdb.number.multiget.bytes.read COUNT : 990845600
          rocksdb.number.multiget.keys.found COUNT : 9908456
      So that's about 25% higher throughput (597517 vs. 752963 ops/sec), even for random keys.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6757
      Test Plan: unit test included
      Reviewed By: anand1976
      Differential Revision: D21243256
      Pulled By: pdillinger
      fbshipit-source-id: 5644a1468d9e8c8575be02f4e04bc5d62dbbb57f
    • Peter Dillinger's avatar
      HISTORY.md update for bzip upgrade (#6767) · a7f0b27b
      Peter Dillinger authored
      See https://github.com/facebook/rocksdb/issues/6714 and https://github.com/facebook/rocksdb/issues/6703
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6767
      Reviewed By: riversand963
      Differential Revision: D21283307
      Pulled By: pdillinger
      fbshipit-source-id: 8463bec725669d13846c728ad4b5bde43f9a84f8
  4. 28 Apr, 2020 8 commits
    • Peter Dillinger's avatar
      Update HISTORY.md for block cache redundant adds (#6764) · 4574d751
      Peter Dillinger authored
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6764
      Reviewed By: ltamasi
      Differential Revision: D21267108
      Pulled By: pdillinger
      fbshipit-source-id: a3dfe2dbe4e8f6309a53eb72903ef58d52308f97
    • Yanqin Jin's avatar
      Fix timestamp support for MultiGet (#6748) · d4398e08
      Yanqin Jin authored
      1. Avoid nullptr dereference when passing timestamp to KeyContext creation.
      2. Construct LookupKey correctly with timestamp when creating MultiGetContext.
      3. Compare without timestamp when sorting KeyContexts.
      Fixes https://github.com/facebook/rocksdb/issues/6745
      Test plan (dev server):
      make check
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6748
      Reviewed By: pdillinger
      Differential Revision: D21258691
      Pulled By: riversand963
      fbshipit-source-id: 44e65b759c18b9986947783edf03be4f890bb004
    • Cheng Chang's avatar
      Fix build under LITE (#6758) · 4cd859ed
      Cheng Chang authored
      GetSupportedCompressions needs to be defined under LITE.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6758
      Test Plan: build under LITE
      Reviewed By: zhichao-cao
      Differential Revision: D21247937
      Pulled By: cheng-chang
      fbshipit-source-id: 880e59d3e107cdd736d16427a68c5641d1318fb4
    • Levi Tamasi's avatar
      Destroy any ColumnFamilyHandles in BlobDB::Open upon error (#6763) · bea91d5d
      Levi Tamasi authored
      If an error happens during BlobDBImpl::Open after the base DB has been
      opened, we need to destroy the `ColumnFamilyHandle`s returned by `DB::Open`
      to prevent an assertion in `ColumnFamilySet`'s destructor from being hit.
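      The cleanup-on-error pattern can be sketched with toy types. `DemoHandle`, `OpenBase` and `OpenWrapper` are hypothetical; the live-handle counter stands in for the destructor assertion mentioned above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical handle type that counts how many instances are still alive.
struct DemoHandle {
  static int live;
  DemoHandle() { ++live; }
  ~DemoHandle() { --live; }
};
int DemoHandle::live = 0;

// Stand-in for opening the base DB: hands out two handles to the caller.
bool OpenBase(std::vector<DemoHandle*>* handles) {
  handles->push_back(new DemoHandle());
  handles->push_back(new DemoHandle());
  return true;
}

// Stand-in for the wrapper's Open: if a later step fails after the base
// open succeeded, the handles are destroyed before returning the error.
bool OpenWrapper(bool fail_after_base_open, std::vector<DemoHandle*>* handles) {
  if (!OpenBase(handles)) {
    return false;
  }
  if (fail_after_base_open) {
    for (DemoHandle* h : *handles) {
      delete h;
    }
    handles->clear();
    return false;
  }
  return true;
}
```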
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6763
      Test Plan: Ran `make check` and tested using the BlobDB mode of `db_bench`.
      Reviewed By: riversand963
      Differential Revision: D21262643
      Pulled By: ltamasi
      fbshipit-source-id: 60ebc7ab19be66cf37fbe5f6d8957d58470f3d3b
    • Albert Hse-Lin Chen's avatar
      Fixed minor typo in comment for MergeOperator::FullMergeV2() (#6759) · cc8d16ef
      Albert Hse-Lin Chen authored
      Fixed minor typo in comment for FullMergeV2().
      Last operand up to snapshot should be +4 instead of +3.
      Signed-off-by: Albert Hse-Lin Chen <hselin@kalista.io>
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6759
      Reviewed By: cheng-chang
      Differential Revision: D21260295
      Pulled By: zhichao-cao
      fbshipit-source-id: cc942306f246c8606538feb30bfdf6df9fb6c54e
    • Peter Dillinger's avatar
      Stats for redundant insertions into block cache (#6681) · 249eff0f
      Peter Dillinger authored
      Since read threads do not coordinate on loading data into block
      cache, two threads between Lookup and Insert can end up loading and
      inserting the same data. This is particularly concerning with
      cache_index_and_filter_blocks since those are hot and more likely to
      be race targets if ejected from (or not pre-populated in) the cache.
      Particularly with moves toward disaggregated / network storage, the cost
      of redundant retrieval might be high, and we should at least have some
      hard statistics from which we can estimate impact.
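      The race can be modeled with a toy cache that counts redundant insertions. `CountingCache` is a hypothetical illustration; counting a redundant insert under both counters is an assumption of this sketch, and the interleaving is simulated sequentially rather than with real threads.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical toy cache that tracks total and redundant insertions.
class CountingCache {
 public:
  bool Lookup(const std::string& key) const { return blocks_.count(key) > 0; }

  // Inserts a block. If the key is already present, the work done to load
  // the block was redundant, which is what the new statistic would capture.
  void Insert(const std::string& key, const std::string& block) {
    ++adds_;
    bool inserted = blocks_.insert(std::make_pair(key, block)).second;
    if (!inserted) {
      ++redundant_adds_;
    }
  }

  int adds() const { return adds_; }
  int redundant_adds() const { return redundant_adds_; }

 private:
  std::map<std::string, std::string> blocks_;
  int adds_ = 0;
  int redundant_adds_ = 0;
};
```

      If two readers both miss on `Lookup` before either calls `Insert`, the second `Insert` is the redundant one.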
      Example with full filter thrashing "cliff":
          $ ./db_bench --benchmarks=fillrandom --num=15000000 --cache_index_and_filter_blocks -bloom_bits=10
          $ ./db_bench --db=/tmp/rocksdbtest-172704/dbbench --use_existing_db --benchmarks=readrandom,stats --num=200000 --cache_index_and_filter_blocks --cache_size=$((130 * 1024 * 1024)) --bloom_bits=10 --threads=16 -statistics 2>&1 | egrep '^rocksdb.block.cache.(.*add|.*redundant)' | grep -v compress | sort
          rocksdb.block.cache.add COUNT : 14181
          rocksdb.block.cache.add.failures COUNT : 0
          rocksdb.block.cache.add.redundant COUNT : 476
          rocksdb.block.cache.data.add COUNT : 12749
          rocksdb.block.cache.data.add.redundant COUNT : 18
          rocksdb.block.cache.filter.add COUNT : 1003
          rocksdb.block.cache.filter.add.redundant COUNT : 217
          rocksdb.block.cache.index.add COUNT : 429
          rocksdb.block.cache.index.add.redundant COUNT : 241
          $ ./db_bench --db=/tmp/rocksdbtest-172704/dbbench --use_existing_db --benchmarks=readrandom,stats --num=200000 --cache_index_and_filter_blocks --cache_size=$((120 * 1024 * 1024)) --bloom_bits=10 --threads=16 -statistics 2>&1 | egrep '^rocksdb.block.cache.(.*add|.*redundant)' | grep -v compress | sort
          rocksdb.block.cache.add COUNT : 1182223
          rocksdb.block.cache.add.failures COUNT : 0
          rocksdb.block.cache.add.redundant COUNT : 302728
          rocksdb.block.cache.data.add COUNT : 31425
          rocksdb.block.cache.data.add.redundant COUNT : 12
          rocksdb.block.cache.filter.add COUNT : 795455
          rocksdb.block.cache.filter.add.redundant COUNT : 130238
          rocksdb.block.cache.index.add COUNT : 355343
          rocksdb.block.cache.index.add.redundant COUNT : 172478
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6681
      Test Plan: Some manual testing (above) and unit test covering key metrics is included
      Reviewed By: ltamasi
      Differential Revision: D21134113
      Pulled By: pdillinger
      fbshipit-source-id: c11497b5f00f4ffdfe919823904e52d0a1a91d87
    • Akanksha Mahajan's avatar
      Allow sst_dump to check size of different compression levels and report time (#6634) · 75b13ea9
      Akanksha Mahajan authored
      Summary:
      1. Add two arguments, --compression_level_from and --compression_level_to, to check
         the compressed size at each compression level in the given range. Users must
         specify one compression type, otherwise sst_dump will error out. Both the from and
         to levels must also be specified together.
      2. Display the time taken to compress each file with the different compressions by default.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6634
      Test Plan: make -j64 check
      Reviewed By: anand1976
      Differential Revision: D20810282
      Pulled By: akankshamahajan15
      fbshipit-source-id: ac9098d3c079a1fad098f6678dbedb4d888a791b
    • Peter Dillinger's avatar
      Understand common build variables passed as make variables (#6740) · 791e5714
      Peter Dillinger authored
      Some common build variables like USE_CLANG and
      COMPILE_WITH_UBSAN did not work if specified as make variables, as in
      `make USE_CLANG=1 check` etc. rather than (in theory less hygienic)
      `USE_CLANG=1 make check`. This patches Makefile to export some commonly
      used ones to build_detect_platform so that they work. (I'm skeptical of
      a broad `export` in Makefile because it's hard to predict how random
      make variables might affect various invoked tools.)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6740
      Test Plan: manual / CI
      Reviewed By: siying
      Differential Revision: D21229011
      Pulled By: pdillinger
      fbshipit-source-id: b00c69b23eb2a13105bc8d860ce2d1e61ac5a355
  5. 27 Apr, 2020 1 commit
    • Yanqin Jin's avatar
      Update buckifier to unblock future internal release (#6726) · 3b2f2719
      Yanqin Jin authored
      Some recent PRs added new source files or modified the TARGETS file manually.
      During the next internal release, executing the following command would revert the
      manual changes:
          python buckifier/buckify_rocksdb.py
      Update buckifier so that this command does not change the TARGETS file.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6726
      Test Plan:
      python buckifier/buckify_rocksdb.py
      Reviewed By: siying
      Differential Revision: D21098930
      Pulled By: riversand963
      fbshipit-source-id: e884f507fefef88163363c9097a460c98f1ed850
  6. 25 Apr, 2020 4 commits
    • Cheng Chang's avatar
      Disable O_DIRECT in stress test when db directory does not support direct IO (#6727) · 0a776178
      Cheng Chang authored
      In crash test, the db directory might be set to /dev/shm or /tmp. In certain environments, such as internal testing infrastructure, neither of these directories supports direct IO, so direct IO is never enabled in crash test.
      This PR sets up SyncPoints in direct IO related code paths to disable the O_DIRECT flag in calls to `open`, so that the direct IO code paths are executed and all direct IO related assertions are checked, but no real direct IO request is issued to the file system.
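      The idea of a test hook that strips the direct IO flag just before the file is opened can be sketched as below. `kDemoDirectFlag`, `FlagsForOpen` and the callback are hypothetical stand-ins; this is not the RocksDB SyncPoint API, and the constant is not the real `O_DIRECT` value.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the platform's O_DIRECT bit.
const int kDemoDirectFlag = 0x4000;

// Computes the flags that would be passed to open(). The optional hook runs
// last and may rewrite the flags, the way a test callback installed at a
// sync point could.
int FlagsForOpen(bool use_direct_io, const std::function<void(int*)>& hook) {
  int flags = 0;
  if (use_direct_io) {
    flags |= kDemoDirectFlag;  // direct IO code path is taken
  }
  if (hook) {
    hook(&flags);  // test-only adjustment
  }
  return flags;
}

// Hook that clears the direct IO bit so the file system never sees it.
void StripDirectFlag(int* flags) { *flags &= ~kDemoDirectFlag; }
```

      The caller still believes it is in direct IO mode, so alignment checks and related assertions keep running even though the flag never reaches the file system.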
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6727
      Test Plan:
      export CRASH_TEST_EXT_ARGS="--use_direct_reads=1 --mmap_read=0"
      make -j24 crash_test
      Reviewed By: zhichao-cao
      Differential Revision: D21139250
      Pulled By: cheng-chang
      fbshipit-source-id: db9adfe78d91aa4759835b1af91c5db7b27b62ee
    • Cheng Chang's avatar
      Reduce memory copies when fetching and uncompressing blocks from SST files (#6689) · 40497a87
      Cheng Chang authored
      In https://github.com/facebook/rocksdb/pull/6455, we modified the interface of `RandomAccessFileReader::Read` to be able to get rid of memcpy in direct IO mode.
      This PR applies the new interface to `BlockFetcher` when reading blocks from SST files in direct IO mode.
      Without this PR, in direct IO mode, when fetching and uncompressing compressed blocks, `BlockFetcher` will first copy the raw compressed block into `BlockFetcher::compressed_buf_` or `BlockFetcher::stack_buf_` inside `RandomAccessFileReader::Read`, depending on the block size. Then, during uncompression, it will copy the uncompressed block into `BlockFetcher::heap_buf_`.
      In this PR, we get rid of the first memcpy and directly uncompress the block from `direct_io_buf_` to `heap_buf_`.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6689
      Test Plan: A new unit test `block_fetcher_test` is added.
      Reviewed By: anand1976
      Differential Revision: D21006729
      Pulled By: cheng-chang
      fbshipit-source-id: 2370b92c24075692423b81277415feb2aed5d980
    • Cheng Chang's avatar
      Fix unused variable of r in release mode (#6750) · 1758f76f
      Cheng Chang authored
      In release mode, asserts are not compiled, so `r` is not used, causing compiler warnings.
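      One common way to silence this kind of warning is to mark the variable as intentionally unused. The sketch below shows that pattern with a hypothetical `InsertUnique` helper; it is not necessarily the exact change made in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Inserts a value that the caller expects to be new and returns the new size.
std::size_t InsertUnique(std::set<int>* s, int value) {
  bool r = s->insert(value).second;
  assert(r);  // compiled out when NDEBUG is defined
  (void)r;    // keeps `r` "used" so release builds do not warn
  return s->size();
}
```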
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6750
      Test Plan: make check under release mode
      Reviewed By: anand1976
      Differential Revision: D21220365
      Pulled By: cheng-chang
      fbshipit-source-id: fd4afa9843d54af68c4da8660ec61549803e1167
    • anand76's avatar
      Silence false alarms in db_stress fault injection (#6741) · 9e7b7e2c
      anand76 authored
      False alarms are caused by codepaths that intentionally swallow IO errors.
      Test Plan: make crash_test
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6741
      Reviewed By: ltamasi
      Differential Revision: D21181138
      Pulled By: anand1976
      fbshipit-source-id: 5ccfbc68eb192033488de6269e59c00f2c65ce00
  7. 24 Apr, 2020 4 commits
  8. 22 Apr, 2020 3 commits
  9. 21 Apr, 2020 5 commits
    • Andrew Kryczka's avatar
      Prevent uninitialized load in `IndexBlockIter` (#6736) · f9155a34
      Andrew Kryczka authored
      When index block is empty or an error happens while reading it,
      `Invalidate()` is called rather than `Initialize()`. So `Seek()` must
      not refer to member variables that are only initialized in
      `Initialize()` until it is sure `Initialize()` has been called.
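      The rule can be illustrated with a toy iterator whose `Seek()` checks that `Initialize()` has run before touching state that only `Initialize()` sets up. `DemoIter` is a hypothetical simplification, not `IndexBlockIter`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical iterator over a sorted key list.
class DemoIter {
 public:
  // Sets up the state that Seek() relies on.
  void Initialize(const std::vector<std::string>* keys) {
    keys_ = keys;
    pos_ = keys->size();
    initialized_ = true;
  }

  // Called instead of Initialize() when the block is empty or unreadable.
  void Invalidate() { initialized_ = false; }

  // Positions at the first key >= target.
  void Seek(const std::string& target) {
    if (!initialized_) {
      return;  // do not read keys_ or pos_ before Initialize() has run
    }
    pos_ = 0;
    while (pos_ < keys_->size() && (*keys_)[pos_] < target) {
      ++pos_;
    }
  }

  bool Valid() const { return initialized_ && pos_ < keys_->size(); }
  const std::string& key() const { return (*keys_)[pos_]; }

 private:
  bool initialized_ = false;
  const std::vector<std::string>* keys_ = nullptr;
  std::size_t pos_ = 0;
};
```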
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6736
      Reviewed By: siying
      Differential Revision: D21139641
      Pulled By: ajkr
      fbshipit-source-id: 71c58cc1adbd795dc3729dd5023bf7df1515ff32
    • Akanksha Mahajan's avatar
      Set max_background_flushes dynamically (#6701) · 03a1d95d
      Akanksha Mahajan authored
      1. Add changes so that max_background_flushes can be set dynamically.
      2. Add a test case, DBOptionsTest.SetBackgroundFlushThreads, which sets
         max_background_flushes dynamically using SetDBOptions.
      Test Plan:
      1. make -j64 check
      2. Run the new test case DBOptionsTest.SetBackgroundFlushThreads
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6701
      Reviewed By: ajkr
      Differential Revision: D21028010
      Pulled By: akankshamahajan15
      fbshipit-source-id: 5f949e4a8fd3c32537b637947b7ee09a69cfc7c1
    • Peter Dillinger's avatar
      C++20 compatibility (#6697) · 31da5e34
      Peter Dillinger authored
      Based on https://github.com/facebook/rocksdb/issues/6648 (CLA Signed), but heavily modified / extended:
      * Implicit capture of this via [=] deprecated in C++20, and [=,this] not standard before C++20 -> now using explicit capture lists
      * Implicit copy operator deprecated in gcc 9 -> add explicit '= default' definition
      * std::random_shuffle deprecated in C++17 and removed in C++20 -> migrated to a replacement in RocksDB random.h API
      * Add the ability to build with different std version through -DCMAKE_CXX_STANDARD=11/14/17/20 on the cmake command line
      * Minimal rebuild flag of MSVC is deprecated and is forbidden with /std:c++latest (C++20)
      * Added MSVC 2019 C++11 & MSVC 2019 C++20 in AppVeyor
      * Added GCC 9 C++11 & GCC 9 C++20 in Travis
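      Two of the items above can be shown in a small sketch that compiles from C++11 through C++20. `Counter` and `ShuffledCopy` are hypothetical, and `std::shuffle` is used here only as a standard stand-in for the RocksDB random.h replacement mentioned in the list.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <random>
#include <vector>

class Counter {
 public:
  // Explicit capture list: [=] would capture `this` implicitly (deprecated
  // in C++20), while [=, this] is not valid before C++20.
  std::function<int()> MakeIncrementer(int step) {
    return [this, step]() -> int {
      total_ += step;
      return total_;
    };
  }

  int total() const { return total_; }

 private:
  int total_ = 0;
};

// std::random_shuffle was removed in C++20; std::shuffle takes an explicit
// random number generator instead.
std::vector<int> ShuffledCopy(std::vector<int> values, unsigned seed) {
  std::mt19937 rng(seed);
  std::shuffle(values.begin(), values.end(), rng);
  return values;
}
```

      The returned incrementer must not outlive the `Counter` it was created from, since it captures `this` by pointer.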
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6697
      Test Plan: make check and CI
      Reviewed By: cheng-chang
      Differential Revision: D21020318
      Pulled By: pdillinger
      fbshipit-source-id: 12311be5dbd8675a0e2c817f7ec50fa11c18ab91
    • sdong's avatar
      crash_test to cover index_type kBinarySearchWithFirstKey (#6721) · fe206f4f
      sdong authored
      Recently, index_type kBinarySearchWithFirstKey was improved so that its API guarantee is exactly the same as that of the other types, and it is ready for wide production use. We should cover it in crash test.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6721
      Test Plan: Run crash_test
      Reviewed By: anand1976
      Differential Revision: D21099781
      fbshipit-source-id: fda91eba831d9eacbb140c703e9768bb1701f935
    • Peter Dillinger's avatar
      Fix tabs and lint-ignores (#6734) · 45d2b4ef
      Peter Dillinger authored
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6734
      Reviewed By: cheng-chang
      Differential Revision: D21134556
      Pulled By: pdillinger
      fbshipit-source-id: 3636cc1d1333137b70031f8277458781c21631fb