1. 29 Apr, 2020 2 commits
    • Peter Dillinger's avatar
      Basic MultiGet support for partitioned filters (#6757) · bae6f586
      Peter Dillinger authored
      In MultiGet, access each applicable filter partition only once
      per batch, rather than for each applicable key. Also,
      * Fix Bloom stats for MultiGet
      * Fix/refactor MultiGetContext::Range::KeysLeft, including
      * Add efficient BitsSetToOne implementation
      * Assert that MultiGetContext::Range does not go beyond shift range
      Performance test: Generate db:
          $ ./db_bench --benchmarks=fillrandom --num=15000000 --cache_index_and_filter_blocks -bloom_bits=10 -partition_index_and_filters=true
      Before (middle performing run of three; note some missing Bloom stats):
          $ ./db_bench --use-existing-db --benchmarks=multireadrandom --num=15000000 --cache_index_and_filter_blocks --bloom_bits=10 --threads=16 --cache_size=20000000 -partition_index_and_filters -batch_size=32 -multiread_batched -statistics --duration=20 2>&1 | egrep 'micros/op|block.cache.filter.hit|bloom.filter.(full|use)|number.multiget'
          multireadrandom :      26.403 micros/op 597517 ops/sec; (548427 of 671968 found)
          rocksdb.block.cache.filter.hit COUNT : 83443275
          rocksdb.bloom.filter.useful COUNT : 0
          rocksdb.bloom.filter.full.positive COUNT : 0
          rocksdb.bloom.filter.full.true.positive COUNT : 7931450
          rocksdb.number.multiget.get COUNT : 385984
          rocksdb.number.multiget.keys.read COUNT : 12351488
          rocksdb.number.multiget.bytes.read COUNT : 793145000
          rocksdb.number.multiget.keys.found COUNT : 7931450
      After (middle performing run of three):
          $ ./db_bench_new --use-existing-db --benchmarks=multireadrandom --num=15000000 --cache_index_and_filter_blocks --bloom_bits=10 --threads=16 --cache_size=20000000 -partition_index_and_filters -batch_size=32 -multiread_batched -statistics --duration=20 2>&1 | egrep 'micros/op|block.cache.filter.hit|bloom.filter.(full|use)|number.multiget'
          multireadrandom :      21.024 micros/op 752963 ops/sec; (705188 of 863968 found)
          rocksdb.block.cache.filter.hit COUNT : 49856682
          rocksdb.bloom.filter.useful COUNT : 45684579
          rocksdb.bloom.filter.full.positive COUNT : 10395458
          rocksdb.bloom.filter.full.true.positive COUNT : 9908456
          rocksdb.number.multiget.get COUNT : 481984
          rocksdb.number.multiget.keys.read COUNT : 15423488
          rocksdb.number.multiget.bytes.read COUNT : 990845600
          rocksdb.number.multiget.keys.found COUNT : 9908456
      So that's about 25% higher throughput even for random keys
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6757
      Test Plan: unit test included
      Reviewed By: anand1976
      Differential Revision: D21243256
      Pulled By: pdillinger
      fbshipit-source-id: 5644a1468d9e8c8575be02f4e04bc5d62dbbb57f
    • Peter Dillinger's avatar
      HISTORY.md update for bzip upgrade (#6767) · a7f0b27b
      Peter Dillinger authored
      See https://github.com/facebook/rocksdb/issues/6714 and https://github.com/facebook/rocksdb/issues/6703
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6767
      Reviewed By: riversand963
      Differential Revision: D21283307
      Pulled By: pdillinger
      fbshipit-source-id: 8463bec725669d13846c728ad4b5bde43f9a84f8
  2. 28 Apr, 2020 8 commits
    • Peter Dillinger's avatar
      Update HISTORY.md for block cache redundant adds (#6764) · 4574d751
      Peter Dillinger authored
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6764
      Reviewed By: ltamasi
      Differential Revision: D21267108
      Pulled By: pdillinger
      fbshipit-source-id: a3dfe2dbe4e8f6309a53eb72903ef58d52308f97
    • Yanqin Jin's avatar
      Fix timestamp support for MultiGet (#6748) · d4398e08
      Yanqin Jin authored
      1. Avoid nullptr dereference when passing timestamp to KeyContext creation.
      2. Construct LookupKey correctly with timestamp when creating MultiGetContext.
      3. Compare without timestamp when sorting KeyContexts.
      Fixes https://github.com/facebook/rocksdb/issues/6745
      Test plan (dev server):
      make check
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6748
      Reviewed By: pdillinger
      Differential Revision: D21258691
      Pulled By: riversand963
      fbshipit-source-id: 44e65b759c18b9986947783edf03be4f890bb004
    • Cheng Chang's avatar
      Fix build under LITE (#6758) · 4cd859ed
      Cheng Chang authored
      GetSupportedCompressions needs to be defined under LITE.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6758
      Test Plan: build under LITE
      Reviewed By: zhichao-cao
      Differential Revision: D21247937
      Pulled By: cheng-chang
      fbshipit-source-id: 880e59d3e107cdd736d16427a68c5641d1318fb4
    • Levi Tamasi's avatar
      Destroy any ColumnFamilyHandles in BlobDB::Open upon error (#6763) · bea91d5d
      Levi Tamasi authored
      If an error happens during BlobDBImpl::Open after the base DB has been
      opened, we need to destroy the `ColumnFamilyHandle`s returned by `DB::Open`
      to prevent an assertion in `ColumnFamilySet`'s destructor from being hit.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6763
      Test Plan: Ran `make check` and tested using the BlobDB mode of `db_bench`.
      Reviewed By: riversand963
      Differential Revision: D21262643
      Pulled By: ltamasi
      fbshipit-source-id: 60ebc7ab19be66cf37fbe5f6d8957d58470f3d3b
    • Albert Hse-Lin Chen's avatar
      Fixed minor typo in comment for MergeOperator::FullMergeV2() (#6759) · cc8d16ef
      Albert Hse-Lin Chen authored
      Fixed minor typo in comment for FullMergeV2().
      Last operand up to snapshot should be +4 instead of +3.
      Signed-off-by: default avatarAlbert Hse-Lin Chen <hselin@kalista.io>
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6759
      Reviewed By: cheng-chang
      Differential Revision: D21260295
      Pulled By: zhichao-cao
      fbshipit-source-id: cc942306f246c8606538feb30bfdf6df9fb6c54e
    • Peter Dillinger's avatar
      Stats for redundant insertions into block cache (#6681) · 249eff0f
      Peter Dillinger authored
      Since read threads do not coordinate on loading data into block
      cache, two threads between Lookup and Insert can end up loading and
      inserting the same data. This is particularly concerning with
      cache_index_and_filter_blocks since those are hot and more likely to
      be race targets if ejected from (or not pre-populated in) the cache.
      Particularly with moves toward disaggregated / network storage, the cost
      of redundant retrieval might be high, and we should at least have some
      hard statistics from which we can estimate impact.
      Example with full filter thrashing "cliff":
          $ ./db_bench --benchmarks=fillrandom --num=15000000 --cache_index_and_filter_blocks -bloom_bits=10
          $ ./db_bench --db=/tmp/rocksdbtest-172704/dbbench --use_existing_db --benchmarks=readrandom,stats --num=200000 --cache_index_and_filter_blocks --cache_size=$((130 * 1024 * 1024)) --bloom_bits=10 --threads=16 -statistics 2>&1 | egrep '^rocksdb.block.cache.(.*add|.*redundant)' | grep -v compress | sort
          rocksdb.block.cache.add COUNT : 14181
          rocksdb.block.cache.add.failures COUNT : 0
          rocksdb.block.cache.add.redundant COUNT : 476
          rocksdb.block.cache.data.add COUNT : 12749
          rocksdb.block.cache.data.add.redundant COUNT : 18
          rocksdb.block.cache.filter.add COUNT : 1003
          rocksdb.block.cache.filter.add.redundant COUNT : 217
          rocksdb.block.cache.index.add COUNT : 429
          rocksdb.block.cache.index.add.redundant COUNT : 241
          $ ./db_bench --db=/tmp/rocksdbtest-172704/dbbench --use_existing_db --benchmarks=readrandom,stats --num=200000 --cache_index_and_filter_blocks --cache_size=$((120 * 1024 * 1024)) --bloom_bits=10 --threads=16 -statistics 2>&1 | egrep '^rocksdb.block.cache.(.*add|.*redundant)' | grep -v compress | sort
          rocksdb.block.cache.add COUNT : 1182223
          rocksdb.block.cache.add.failures COUNT : 0
          rocksdb.block.cache.add.redundant COUNT : 302728
          rocksdb.block.cache.data.add COUNT : 31425
          rocksdb.block.cache.data.add.redundant COUNT : 12
          rocksdb.block.cache.filter.add COUNT : 795455
          rocksdb.block.cache.filter.add.redundant COUNT : 130238
          rocksdb.block.cache.index.add COUNT : 355343
          rocksdb.block.cache.index.add.redundant COUNT : 172478
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6681
      Test Plan: Some manual testing (above) and unit test covering key metrics is included
      Reviewed By: ltamasi
      Differential Revision: D21134113
      Pulled By: pdillinger
      fbshipit-source-id: c11497b5f00f4ffdfe919823904e52d0a1a91d87
    • Akanksha Mahajan's avatar
      Allow sst_dump to check size of different compression levels and report time (#6634) · 75b13ea9
      Akanksha Mahajan authored
      Summary : 1. Add two arguments --compression_level_from and --compression_level_to to check
      	  the compression size with different compression level in the given range. Users must
                specify one compression type else it will error out. Both from and to levels must
      	  also be specified together.
      	  2. Display the time taken to compress each file with different compressions by default.
      Test Plan : make -j64 check
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6634
      Test Plan: make -j64 check
      Reviewed By: anand1976
      Differential Revision: D20810282
      Pulled By: akankshamahajan15
      fbshipit-source-id: ac9098d3c079a1fad098f6678dbedb4d888a791b
    • Peter Dillinger's avatar
      Understand common build variables passed as make variables (#6740) · 791e5714
      Peter Dillinger authored
      Some common build variables like USE_CLANG and
      COMPILE_WITH_UBSAN did not work if specified as make variables, as in
      `make USE_CLANG=1 check` etc. rather than (in theory less hygienic)
      `USE_CLANG=1 make check`. This patches Makefile to export some commonly
      used ones to build_detect_platform so that they work. (I'm skeptical of
      a broad `export` in Makefile because it's hard to predict how random
      make variables might affect various invoked tools.)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6740
      Test Plan: manual / CI
      Reviewed By: siying
      Differential Revision: D21229011
      Pulled By: pdillinger
      fbshipit-source-id: b00c69b23eb2a13105bc8d860ce2d1e61ac5a355
  3. 27 Apr, 2020 1 commit
    • Yanqin Jin's avatar
      Update buckifier to unblock future internal release (#6726) · 3b2f2719
      Yanqin Jin authored
      Some recent PRs added new source files or modified TARGETS file manually.
      During next internal release, executing the following command will revert the
      manual changes.
      Update buckifier so that the following command
      python buckfier/buckify_rocksdb.py
      does not change TARGETS file.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6726
      Test Plan:
      python buckifier/buckify_rocksdb.py
      Reviewed By: siying
      Differential Revision: D21098930
      Pulled By: riversand963
      fbshipit-source-id: e884f507fefef88163363c9097a460c98f1ed850
  4. 25 Apr, 2020 4 commits
    • Cheng Chang's avatar
      Disable O_DIRECT in stress test when db directory does not support direct IO (#6727) · 0a776178
      Cheng Chang authored
      In crash test, the db directory might be set to /dev/shm or /tmp, in certain environments such as internal testing infrastructure, neither of these directories support direct IO, so direct IO is never enabled in crash test.
      This PR sets up SyncPoints in direct IO related code paths to disable O_DIRECT flag in calls to `open`, so the direct IO code paths will be executed, all direct IO related assertions will be checked, but no real direct IO request will be issued to the file system.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6727
      Test Plan:
      export CRASH_TEST_EXT_ARGS="--use_direct_reads=1 --mmap_read=0"
      make -j24 crash_test
      Reviewed By: zhichao-cao
      Differential Revision: D21139250
      Pulled By: cheng-chang
      fbshipit-source-id: db9adfe78d91aa4759835b1af91c5db7b27b62ee
    • Cheng Chang's avatar
      Reduce memory copies when fetching and uncompressing blocks from SST files (#6689) · 40497a87
      Cheng Chang authored
      In https://github.com/facebook/rocksdb/pull/6455, we modified the interface of `RandomAccessFileReader::Read` to be able to get rid of memcpy in direct IO mode.
      This PR applies the new interface to `BlockFetcher` when reading blocks from SST files in direct IO mode.
      Without this PR, in direct IO mode, when fetching and uncompressing compressed blocks, `BlockFetcher` will first copy the raw compressed block into `BlockFetcher::compressed_buf_` or `BlockFetcher::stack_buf_` inside `RandomAccessFileReader::Read` depending on the block size. then during uncompressing, it will copy the uncompressed block into `BlockFetcher::heap_buf_`.
      In this PR, we get rid of the first memcpy and directly uncompress the block from `direct_io_buf_` to `heap_buf_`.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6689
      Test Plan: A new unit test `block_fetcher_test` is added.
      Reviewed By: anand1976
      Differential Revision: D21006729
      Pulled By: cheng-chang
      fbshipit-source-id: 2370b92c24075692423b81277415feb2aed5d980
    • Cheng Chang's avatar
      Fix unused variable of r in release mode (#6750) · 1758f76f
      Cheng Chang authored
      In release mode, asserts are not compiled, so `r` is not used, causing compiler warnings.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6750
      Test Plan: make check under release mode
      Reviewed By: anand1976
      Differential Revision: D21220365
      Pulled By: cheng-chang
      fbshipit-source-id: fd4afa9843d54af68c4da8660ec61549803e1167
    • anand76's avatar
      Silence false alarms in db_stress fault injection (#6741) · 9e7b7e2c
      anand76 authored
      False alarms are caused by codepaths that intentionally swallow IO
      make crash_test
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6741
      Reviewed By: ltamasi
      Differential Revision: D21181138
      Pulled By: anand1976
      fbshipit-source-id: 5ccfbc68eb192033488de6269e59c00f2c65ce00
  5. 24 Apr, 2020 4 commits
  6. 22 Apr, 2020 3 commits
  7. 21 Apr, 2020 5 commits
    • Andrew Kryczka's avatar
      Prevent uninitialized load in `IndexBlockIter` (#6736) · f9155a34
      Andrew Kryczka authored
      When index block is empty or an error happens while reading it,
      `Invalidate()` is called rather than `Initialize()`. So `Seek()` must
      not refer to member variables that are only initialized in
      `Initialize()` until it is sure `Initialize()` has been called.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6736
      Reviewed By: siying
      Differential Revision: D21139641
      Pulled By: ajkr
      fbshipit-source-id: 71c58cc1adbd795dc3729dd5023bf7df1515ff32
    • Akanksha Mahajan's avatar
      Set max_background_flushes dynamically (#6701) · 03a1d95d
      Akanksha Mahajan authored
      1. Add changes so that max_background_flushes can be set dynamically.
                         2. Add a testcase DBOptionsTest.SetBackgroundFlushThreads which set the
                              max_background_flushes dynamically using SetDBOptions.
      TestPlan:  1. make -j64 check
                        2. Using new testcase DBOptionsTest.SetBackgroundFlushThreads
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6701
      Reviewed By: ajkr
      Differential Revision: D21028010
      Pulled By: akankshamahajan15
      fbshipit-source-id: 5f949e4a8fd3c32537b637947b7ee09a69cfc7c1
    • Peter Dillinger's avatar
      C++20 compatibility (#6697) · 31da5e34
      Peter Dillinger authored
      Based on https://github.com/facebook/rocksdb/issues/6648 (CLA Signed), but heavily modified / extended:
      * Implicit capture of this via [=] deprecated in C++20, and [=,this] not standard before C++20 -> now using explicit capture lists
      * Implicit copy operator deprecated in gcc 9 -> add explicit '= default' definition
      * std::random_shuffle deprecated in C++17 and removed in C++20 -> migrated to a replacement in RocksDB random.h API
      * Add the ability to build with different std version though -DCMAKE_CXX_STANDARD=11/14/17/20 on the cmake command line
      * Minimal rebuild flag of MSVC is deprecated and is forbidden with /std:c++latest (C++20)
      * Added MSVC 2019 C++11 & MSVC 2019 C++20 in AppVeyor
      * Added GCC 9 C++11 & GCC9 C++20 in Travis
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6697
      Test Plan: make check and CI
      Reviewed By: cheng-chang
      Differential Revision: D21020318
      Pulled By: pdillinger
      fbshipit-source-id: 12311be5dbd8675a0e2c817f7ec50fa11c18ab91
    • sdong's avatar
      crash_test to cover index_type kBinarySearchWithFirstKey (#6721) · fe206f4f
      sdong authored
      Recently index_type kBinarySearchWithFirstKey is improved so that the API guarantee is exactly the same as other types and it is ready for wide production. We should cover it in crash tst.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6721
      Test Plan: Run crash_test
      Reviewed By: anand1976
      Differential Revision: D21099781
      fbshipit-source-id: fda91eba831d9eacbb140c703e9768bb1701f935
    • Peter Dillinger's avatar
      Fix tabs and lint-ignores (#6734) · 45d2b4ef
      Peter Dillinger authored
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6734
      Reviewed By: cheng-chang
      Differential Revision: D21134556
      Pulled By: pdillinger
      fbshipit-source-id: 3636cc1d1333137b70031f8277458781c21631fb
  8. 18 Apr, 2020 2 commits
    • Yanqin Jin's avatar
      Add IsDirectory() to Env and FS (#6711) · 243852ec
      Yanqin Jin authored
      IsDirectory() is a common API to check whether a path is a regular file or
      POSIX: call stat() and use S_ISDIR(st_mode)
      Windows: PathIsDirectoryA() and PathIsDirectoryW()
      HDFS: FileSystem.IsDirectory()
      Java: File.IsDirectory()
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6711
      Test Plan: make check
      Reviewed By: anand1976
      Differential Revision: D21053520
      Pulled By: riversand963
      fbshipit-source-id: 680aadfd8ce982b63689190cf31b3145d5a89e27
    • sdong's avatar
      crash_test to cover small max_open_files (#6719) · 63d82e57
      sdong authored
      RocksDB behavior is different while max_open_files is small or large. Add the coverage to small max_open_files.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6719
      Test Plan: Run crash_test
      Reviewed By: pdillinger
      Differential Revision: D21081021
      fbshipit-source-id: e3e211761a9bd25d93d19a61c1f7b62d48cf5e3c
  9. 17 Apr, 2020 7 commits
    • Nicolas Pépin-Perreault's avatar
      Add RocksIterator::Refresh (#6573) · 9e6f3efc
      Nicolas Pépin-Perreault authored
      This PR exposes the `Iterator::Refresh` method to the Java API by adding it on the `RocksIteratorInterface` interface. There are three concrete implementations: `RocksIterator`, `SstFileReaderIterator`, and `WBWIRocksIterator`. For the first two cases, the JNI side simply delegates to the underlying `Iterator::Refresh` method; in the last case, as it doesn't share an ancestor, and per the discussion in https://github.com/facebook/rocksdb/issues/3465, a `Status::NotSupported` exception is thrown.
      As the last PR had no activity in a while, I'm opening a new one - I'm completely fine with merging the previous PR if it gets completed before this is reviewed.
      Let me know if there's anything missing or anything else I can do 👍
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6573
      Reviewed By: cheng-chang
      Differential Revision: D20604666
      Pulled By: pdillinger
      fbshipit-source-id: 4de17df1180c3b87b76cfdd77b674b81fc0563f7
    • Adam Retter's avatar
      Keep building RocksJava on all architectures (#6583) · 9ca49bd4
      Adam Retter authored
      Adding solid support for multiple architectures was initially triggered by RocksJava users. As such I would like to keep the CI for RocksJava on all architectures, to ensure we don't break backwards compatibility.
      pdillinger okay let's see how long it takes to complete Travis-CI with this one...
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6583
      Reviewed By: cheng-chang
      Differential Revision: D21036718
      Pulled By: pdillinger
      fbshipit-source-id: 97afe0db2e4c575cc0284fdc1d4cc45d5deb2272
    • Adam Retter's avatar
      Update RocksJava static version of bzip2 (#6714) · 5fef0ffd
      Adam Retter authored
      Updates the version of bzip2 used for RocksJava static builds.
      Please, can we also get this cherry-picked to:
      1. 6.7.fb
      2. 6.8.fb
      3. 6.9.fb
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6714
      Reviewed By: cheng-chang
      Differential Revision: D21067233
      Pulled By: pdillinger
      fbshipit-source-id: 8164b7eb99c5ca7b2021ab8c371ba9ded4cb4f7e
    • Andrew Kryczka's avatar
      Fix CF import with overlapping SST files (#6663) · 6717ada8
      Andrew Kryczka authored
      Invariant checking should use internal key comparator rather than
      `sstableKeyCompare()`. The latter was intended for checking whether a
      compaction input file's neighboring files need to be included in the
      same compaction. Using it for invariant checking was leading to false
      positives for files with overlapping endpoints.
      Fixes https://github.com/facebook/rocksdb/issues/6647.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6663
      Test Plan: regression test
      Reviewed By: zhichao-cao
      Differential Revision: D20910466
      Pulled By: ajkr
      fbshipit-source-id: f0b70dad7c4096fce635cab7a36f16e14f74ae3f
    • sdong's avatar
      crash_test to cover options.avoid_flush_during_recovery (#6712) · 73523bae
      sdong authored
      Options.avoid_flush_during_recovery is uncovered in crash_test. Add the coverage with a chance of 1/8, as it is a less frequently used options.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6712
      Test Plan: Run crash_test and see the option can be used or not used by chance.
      Reviewed By: ltamasi
      Differential Revision: D21056566
      fbshipit-source-id: c3b1521517cfc204786e6ef8c6acd7fffda64793
    • Yueh-Hsuan Chiang's avatar
      Add env_fault_injection argument to db_stress (#6687) · 5801af46
      Yueh-Hsuan Chiang authored
      Add env_fault_injection argument to db_stress.  When enabled,
      FaultInjectionTestEnv will be used instead.  Currently this
      option does not support running with other env setting.
      This will allow
      us to later manually produce error when running db_crashtest.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6687
      Test Plan:
      make db_stress -j32
      ./db_stress --env_fault_injection
      ./db_stress --env_fault_injection --hdfs   // expect error message
      Reviewed By: ajkr
      Differential Revision: D21014683
      Pulled By: yhchiang
      fbshipit-source-id: 0724aeac37efd57adb72a37defe6dbd3bfa8106a
    • Cheng Chang's avatar
      Fix warning when O_CLOEXEC is not defined (#6695) · 27679723
      Cheng Chang authored
      Compilation fails on systems that do not support O_CLOEXEC. Fix it.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6695
      Test Plan: compile without O_CLOEXEC support
      Reviewed By: anand1976
      Differential Revision: D21011850
      Pulled By: cheng-chang
      fbshipit-source-id: f1bf1cce2aa65c7d10b5a9613e941db30e928347
  10. 16 Apr, 2020 3 commits
    • Mike Kolupaev's avatar
      Properly report IO errors when IndexType::kBinarySearchWithFirstKey is used (#6621) · e45673de
      Mike Kolupaev authored
      Context: Index type `kBinarySearchWithFirstKey` added the ability for sst file iterator to sometimes report a key from index without reading the corresponding data block. This is useful when sst blocks are cut at some meaningful boundaries (e.g. one block per key prefix), and many seeks land between blocks (e.g. for each prefix, the ranges of keys in different sst files are nearly disjoint, so a typical seek needs to read a data block from only one file even if all files have the prefix). But this added a new error condition, which rocksdb code was really not equipped to deal with: `InternalIterator::value()` may fail with an IO error or Status::Incomplete, but it's just a method returning a Slice, with no way to report error instead. Before this PR, this type of error wasn't handled at all (an empty slice was returned), and kBinarySearchWithFirstKey implementation was considered a prototype.
      Now that we (LogDevice) have experimented with kBinarySearchWithFirstKey for a while and confirmed that it's really useful, this PR is adding the missing error handling.
      It's a pretty inconvenient situation implementation-wise. The error needs to be reported from InternalIterator when trying to access value. But there are ~700 call sites of `InternalIterator::value()`, most of which either can't hit the error condition (because the iterator is reading from memtable or from index or something) or wouldn't benefit from the deferred loading of the value (e.g. compaction iterator that reads all values anyway). Adding error handling to all these call sites would needlessly bloat the code. So instead I made the deferred value loading optional: only the call sites that may use deferred loading have to call the new method `PrepareValue()` before calling `value()`. The feature is enabled with a new bool argument `allow_unprepared_value` to a bunch of methods that create iterators (it wouldn't make sense to put it in ReadOptions because it's completely internal to iterators, with virtually no user-visible effect). Lmk if you have better ideas.
      Note that the deferred value loading only happens for *internal* iterators. The user-visible iterator (DBIter) always prepares the value before returning from Seek/Next/etc. We could go further and add an API to defer that value loading too, but that's most likely not useful for LogDevice, so it doesn't seem worth the complexity for now.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6621
      Test Plan: make -j5 check . Will also deploy to some logdevice test clusters and look at stats.
      Reviewed By: siying
      Differential Revision: D20786930
      Pulled By: al13n321
      fbshipit-source-id: 6da77d918bad3780522e918f17f4d5513d3e99ee
    • anand76's avatar
      Remove a printf from db_stress that's not useful info (#6705) · 610a09cc
      anand76 authored
      This was causing db_crashtest.py to wrongly assume an error by parsing the output. Hopefully this will stabilize the crash tests.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6705
      Test Plan: make blackbox_crash_test
      Reviewed By: ltamasi
      Differential Revision: D21043335
      Pulled By: anand1976
      fbshipit-source-id: 5cddd112b124d4e2ebd11724a17d4ef0f50c1cf8
    • sdong's avatar
      Two Improvements to tools/check_format_compatible.sh (#6702) · 165560fb
      sdong authored
      Improve it in two ways:
      1. tools/check_format_compatible.sh is not friendly to run outside FB environment. remove the hard-coded http proxy setting. Instead, move it to Legocastle configuration
      2. Always disable warning as error, so that older build is more likely to pass.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6702
      Test Plan: Run the test and make sure at least it doesn't break.
      Reviewed By: riversand963
      Differential Revision: D21033329
      fbshipit-source-id: 88b4ec1ec49547b772790050a165466bdc4a62a0
  11. 15 Apr, 2020 1 commit
    • anand76's avatar
      Fix a couple of bugs in db_stress fault injection (#6700) · 234e2ed5
      anand76 authored
      1. Fix a memory leak in FaultInjectionTestFS in the stack trace related
      2. Check status of all MultiGet keys before deciding whether an error
      was swallowed, instead of assuming an ok status for any key means an
      undetected error
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6700
      Test Plan: Run db_stress with asan and fault injection
      Reviewed By: cheng-chang
      Differential Revision: D21021498
      Pulled By: anand1976
      fbshipit-source-id: 489191efd1ab0fa834923a1e1d57253a7a315465