1. 08 7月, 2020 2 次提交
    • Peter Dillinger's avatar
      Exclude c_test from buck build opt mode · 9912ebe1
      Peter Dillinger 创作于
      Summary: Fix a Facebook internal build
      
      Test Plan: buck build @mode/opt :c_test :c_test_bin (was compilation
      failure, now "not found")
      buck build @mode/dev :c_test :c_test_bin (still passes)
      9912ebe1
    • sdong's avatar
      Fix test in buck test (#7076) · 0fdd019f
      sdong 创作于
      Summary:
      This is to fix special logic to run tests inside FB.
      Buck test is broken after moving to cpp_unittest(). Move c_test back to the previous approach.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7076
      
      Test Plan: Watch the Sandcastle run
      
      Reviewed By: ajkr
      
      Differential Revision: D22370096
      
      fbshipit-source-id: 4a464d0903f2c76ae2de3a8ad373ffc9bedec64c
      0fdd019f
  2. 10 6月, 2020 1 次提交
  3. 06 6月, 2020 2 次提交
  4. 04 6月, 2020 1 次提交
    • sdong's avatar
      Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) · afa35188
      sdong 创作于
      Summary:
      This reverts commit 8d87e9cea13b09e12d0d22d4fdebdd5bacbb5c6e.
      
      Based on offline discussions, it's too early to upgrade to gtest 1.10, as it prevents some developers from using an older version of gtest to integrate to some other systems. Revert it for now.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6923
      
      Reviewed By: pdillinger
      
      Differential Revision: D21864799
      
      fbshipit-source-id: d0726b1ff649fc911b9378f1763316200bd363fc
      afa35188
  5. 02 6月, 2020 1 次提交
  6. 30 5月, 2020 1 次提交
    • Peter Dillinger's avatar
      Allow missing "unversioned" python, as in CentOS 8 (#6883) · 0c56fc4d
      Peter Dillinger 创作于
      Summary:
      RocksDB Makefile was assuming existence of 'python' command,
      which is not present in CentOS 8. We avoid using 'python' if 'python3' is available.
      
      Also added fancy logic to format-diff.sh to make clang-format-diff.py for Python2 work even with Python3 only (as some CentOS 8 FB machines come equipped)
      
      Also, now use just 'python3' for PYTHON if not found so that an informative
      "command not found" error will result rather than something weird.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6883
      
      Test Plan: manually tried some variants, 'make check' on a fresh CentOS 8 machine without 'python' executable or Python2 but with clang-format-diff.py for Python2.
      
      Reviewed By: gg814
      
      Differential Revision: D21767029
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 54761b376b140a3922407bdc462f3572f461d0e9
      0c56fc4d
  7. 21 5月, 2020 1 次提交
  8. 17 5月, 2020 1 次提交
  9. 15 5月, 2020 1 次提交
    • Cheng Chang's avatar
      Enable IO Uring in MultiGet in direct IO mode (#6815) · 91b75532
      Cheng Chang 创作于
      Summary:
      Currently, in direct IO mode, `MultiGet` retrieves the data blocks one by one instead of in parallel, see `BlockBasedTable::RetrieveMultipleBlocks`.
      
      Since direct IO is supported in `RandomAccessFileReader::MultiRead` in https://github.com/facebook/rocksdb/pull/6446, this PR applies `MultiRead` to `MultiGet` so that the data blocks can be retrieved in parallel.
      
      Also, in direct IO mode and when data blocks are compressed and need to uncompressed, this PR only allocates one continuous aligned buffer to hold the data blocks, and then directly uncompress the blocks to insert into block cache, there is no longer intermediate copies to scratch buffers.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6815
      
      Test Plan:
      1. added a new unit test `BlockBasedTableReaderTest::MultiGet`.
      2. existing unit tests and stress tests  contain tests against `MultiGet` in direct IO mode.
      
      Reviewed By: anand1976
      
      Differential Revision: D21426347
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: b8446ae0e74152444ef9111e97f8e402ac31b24f
      91b75532
  10. 29 4月, 2020 1 次提交
    • mrambacher's avatar
      Add Functions to OptionTypeInfo (#6422) · 618bf638
      mrambacher 创作于
      Summary:
      Added functions for parsing, serializing, and comparing elements to OptionTypeInfo.  These functions allow all of the special cases that could not be handled directly in the map of OptionTypeInfo to be moved into the map.  Using these functions, every type can be handled via the map rather than special cased.
      
      By adding these functions, the code for handling options can become more standardized (fewer special cases) and (eventually) handled completely by common classes.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6422
      
      Test Plan: pass make check
      
      Reviewed By: siying
      
      Differential Revision: D21269005
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: 9ba71c721a38ebf9ee88259d60bd81b3282b9077
      618bf638
  11. 27 4月, 2020 1 次提交
    • Yanqin Jin's avatar
      Update buckifier to unblock future internal release (#6726) · 3b2f2719
      Yanqin Jin 创作于
      Summary:
      Some recent PRs added new source files or modified TARGETS file manually.
      During next internal release, executing the following command will revert the
      manual changes.
      Update buckifier so that the following command
      ```
      python buckfier/buckify_rocksdb.py
      ```
      does not change TARGETS file.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6726
      
      Test Plan:
      ```
      python buckifier/buckify_rocksdb.py
      ```
      
      Reviewed By: siying
      
      Differential Revision: D21098930
      
      Pulled By: riversand963
      
      fbshipit-source-id: e884f507fefef88163363c9097a460c98f1ed850
      3b2f2719
  12. 25 4月, 2020 1 次提交
    • Cheng Chang's avatar
      Reduce memory copies when fetching and uncompressing blocks from SST files (#6689) · 40497a87
      Cheng Chang 创作于
      Summary:
      In https://github.com/facebook/rocksdb/pull/6455, we modified the interface of `RandomAccessFileReader::Read` to be able to get rid of memcpy in direct IO mode.
      This PR applies the new interface to `BlockFetcher` when reading blocks from SST files in direct IO mode.
      
      Without this PR, in direct IO mode, when fetching and uncompressing compressed blocks, `BlockFetcher` will first copy the raw compressed block into `BlockFetcher::compressed_buf_` or `BlockFetcher::stack_buf_` inside `RandomAccessFileReader::Read` depending on the block size. then during uncompressing, it will copy the uncompressed block into `BlockFetcher::heap_buf_`.
      
      In this PR, we get rid of the first memcpy and directly uncompress the block from `direct_io_buf_` to `heap_buf_`.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6689
      
      Test Plan: A new unit test `block_fetcher_test` is added.
      
      Reviewed By: anand1976
      
      Differential Revision: D21006729
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: 2370b92c24075692423b81277415feb2aed5d980
      40497a87
  13. 11 4月, 2020 2 次提交
    • anand76's avatar
      Fault injection in db_stress (#6538) · 5c19a441
      anand76 创作于
      Summary:
      This PR implements a fault injection mechanism for injecting errors in reads in db_stress. The FaultInjectionTestFS is used for this purpose. A thread local structure is used to track the errors, so that each db_stress thread can independently enable/disable error injection and verify observed errors against expected errors. This is initially enabled only for Get and MultiGet, but can be extended to iterator as well once its proven stable.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6538
      
      Test Plan:
      crash_test
      make check
      
      Reviewed By: riversand963
      
      Differential Revision: D20714347
      
      Pulled By: anand1976
      
      fbshipit-source-id: d7598321d4a2d72bda0ced57411a337a91d87dc7
      5c19a441
    • Yanqin Jin's avatar
      Compaction with timestamp: input boundaries (#6645) · 0c05624d
      Yanqin Jin 创作于
      Summary:
      Towards making compaction logic compatible with user timestamp.
      When computing boundaries and overlapping ranges for inputs of compaction, We need to compare SSTs by user key without timestamp.
      
      Test plan (devserver):
      ```
      make check
      ```
      Several individual tests:
      ```
      ./version_set_test --gtest_filter=VersionStorageInfoTimestampTest.GetOverlappingInputs
      ./db_with_timestamp_compaction_test
      ./db_with_timestamp_basic_test
      ```
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6645
      
      Reviewed By: ltamasi
      
      Differential Revision: D20960012
      
      Pulled By: riversand963
      
      fbshipit-source-id: ad377fa9eb481bf7a8a3e1824aaade48cdc653a4
      0c05624d
  14. 09 4月, 2020 1 次提交
    • Cheng Chang's avatar
      Add unit test for TransactionLockMgr (#6599) · d648a0e1
      Cheng Chang 创作于
      Summary:
      Although there are tests related to locking in transaction_test, this new test directly tests against TransactionLockMgr.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6599
      
      Test Plan: make transaction_lock_mgr_test && ./transaction_lock_mgr_test
      
      Reviewed By: lth
      
      Differential Revision: D20673749
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: 1fa4a13218e68d785f5a99924556751a8c5c0f31
      d648a0e1
  15. 08 4月, 2020 1 次提交
  16. 02 4月, 2020 1 次提交
    • Ziyue Yang's avatar
      Add pipelined & parallel compression optimization (#6262) · 03a781a9
      Ziyue Yang 创作于
      Summary:
      This PR adds support for pipelined & parallel compression optimization for `BlockBasedTableBuilder`. This optimization makes block building, block compression and block appending a pipeline, and uses multiple threads to accelerate block compression. Users can set `CompressionOptions::parallel_threads` greater than 1 to enable compression parallelism.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6262
      
      Reviewed By: ajkr
      
      Differential Revision: D20651306
      
      fbshipit-source-id: 62125590a9c15b6d9071def9dc72589c1696a4cb
      03a781a9
  17. 27 3月, 2020 1 次提交
    • Levi Tamasi's avatar
      Add blob files to VersionStorageInfo/VersionBuilder (#6597) · 6f62322f
      Levi Tamasi 创作于
      Summary:
      The patch adds a couple of classes to represent metadata about
      blob files: `SharedBlobFileMetaData` contains the information elements
      that are immutable (once the blob file is closed), e.g. blob file number,
      total number and size of blob files, checksum method/value, while
      `BlobFileMetaData` contains attributes that can vary across versions like
      the amount of garbage in the file. There is a single `SharedBlobFileMetaData`
      for each blob file, which is jointly owned by the `BlobFileMetaData` objects
      that point to it; `BlobFileMetaData` objects, in turn, are owned by `Version`s
      and can also be shared if the (immutable _and_ mutable) state of the blob file
      is the same in two versions.
      
      In addition, the patch adds the blob file metadata to `VersionStorageInfo`, and extends
      `VersionBuilder` so that it can apply blob file related `VersionEdit`s (i.e. those
      containing `BlobFileAddition`s and/or `BlobFileGarbage`), and save blob file metadata
      to a new `VersionStorageInfo`. Consistency checks are also extended to ensure
      that table files point to blob files that are part of the `Version`, and that all blob files
      that are part of any given `Version` have at least some _non_-garbage data in them.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6597
      
      Test Plan: `make check`
      
      Reviewed By: riversand963
      
      Differential Revision: D20656803
      
      Pulled By: ltamasi
      
      fbshipit-source-id: f1f74d135045b3b42d0146f03ee576ef0a4bfd80
      6f62322f
  18. 21 3月, 2020 2 次提交
    • Yanqin Jin's avatar
      Attempt to recover from db with missing table files (#6334) · fb09ef05
      Yanqin Jin 创作于
      Summary:
      There are situations when RocksDB tries to recover, but the db is in an inconsistent state due to SST files referenced in the MANIFEST being missing. In this case, previous RocksDB will just fail the recovery and return a non-ok status.
      This PR enables another possibility. During recovery, RocksDB checks possible MANIFEST files, and try to recover to the most recent state without missing table file. `VersionSet::Recover()` applies version edits incrementally and "materializes" a version only when this version does not reference any missing table file. After processing the entire MANIFEST, the version created last will be the latest version.
      `DBImpl::Recover()` calls `VersionSet::Recover()`. Afterwards, WAL replay will *not* be performed.
      To use this capability, set `options.best_efforts_recovery = true` when opening the db. Best-efforts recovery is currently incompatible with atomic flush.
      
      Test plan (on devserver):
      ```
      $make check
      $COMPILE_WITH_ASAN=1 make all && make check
      ```
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6334
      
      Reviewed By: anand1976
      
      Differential Revision: D19778960
      
      Pulled By: riversand963
      
      fbshipit-source-id: c27ea80f29bc952e7d3311ecf5ee9c54393b40a8
      fb09ef05
    • Cheng Chang's avatar
      Support direct IO in RandomAccessFileReader::MultiRead (#6446) · 4fc21664
      Cheng Chang 创作于
      Summary:
      By supporting direct IO in RandomAccessFileReader::MultiRead, the benefits of parallel IO (IO uring) and direct IO can be combined.
      
      In direct IO mode, read requests are aligned and merged together before being issued to RandomAccessFile::MultiRead, so blocks in the original requests might share the same underlying buffer, the shared buffers are returned in `aligned_bufs`, which is a new parameter of the `MultiRead` API.
      
      For example, suppose alignment requirement for direct IO is 4KB, one request is (offset: 1KB, len: 1KB), another request is (offset: 3KB, len: 1KB), then since they all belong to page (offset: 0, len: 4KB), `MultiRead` only reads the page with direct IO into a buffer on heap, and returns 2 Slices referencing regions in that same buffer. See `random_access_file_reader_test.cc` for more examples.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6446
      
      Test Plan: Added a new test `random_access_file_reader_test.cc`.
      
      Reviewed By: anand1976
      
      Differential Revision: D20097518
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: ca48a8faf9c3af146465c102ef6b266a363e78d1
      4fc21664
  19. 17 3月, 2020 1 次提交
    • sdong's avatar
      De-template block based table iterator (#6531) · d6690809
      sdong 创作于
      Summary:
      Right now block based table iterator is used as both of iterating data for block based table, and for the index iterator for partitioend index. This was initially convenient for introducing a new iterator and block type for new index format, while reducing code change. However, these two usage doesn't go with each other very well. For example, Prev() is never called for partitioned index iterator, and some other complexity is maintained in block based iterators, which is not needed for index iterator but maintainers will always need to reason about it. Furthermore, the template usage is not following Google C++ Style which we are following, and makes a large chunk of code tangled together. This commit separate the two iterators. Right now, here is what it is done:
      1. Copy the block based iterator code into partitioned index iterator, and de-template them.
      2. Remove some code not needed for partitioned index. The upper bound check and tricks are removed. We never tested performance for those tricks when partitioned index is enabled in the first place. It's unlikelyl to generate performance regression, as creating new partitioned index block is much rarer than data blocks.
      3. Separate out the prefetch logic to a helper class and both classes call them.
      
      This commit will enable future follow-ups. One direction is that we might separate index iterator interface for data blocks and index blocks, as they are quite different.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6531
      
      Test Plan: build using make and cmake. And build release
      
      Differential Revision: D20473108
      
      fbshipit-source-id: e48011783b339a4257c204cc07507b171b834b0f
      d6690809
  20. 14 3月, 2020 1 次提交
  21. 13 3月, 2020 2 次提交
  22. 12 3月, 2020 1 次提交
    • Cheng Chang's avatar
      Cache result of GetLogicalBufferSize in Linux (#6457) · 2d9efc9a
      Cheng Chang 创作于
      Summary:
      In Linux, when reopening DB with many SST files, profiling shows that 100% system cpu time spent for a couple of seconds for `GetLogicalBufferSize`. This slows down MyRocks' recovery time when site is down.
      
      This PR introduces two new APIs:
      1. `Env::RegisterDbPaths` and `Env::UnregisterDbPaths` lets `DB` tell the env when it starts or stops using its database directories . The `PosixFileSystem` takes this opportunity to set up a cache from database directories to the corresponding logical block sizes.
      2. `LogicalBlockSizeCache` is defined only for OS_LINUX to cache the logical block sizes.
      
      Other modifications:
      1. rename `logical buffer size` to `logical block size` to be consistent with Linux terms.
      2. declare `GetLogicalBlockSize` in `PosixHelper` to expose it to `PosixFileSystem`.
      3. change the functions `IOError` and `IOStatus` in `env/io_posix.h` to have external linkage since they are used in other translation units too.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6457
      
      Test Plan:
      1. A new unit test is added for `LogicalBlockSizeCache` in `env/io_posix_test.cc`.
      2. A new integration test is added for `DB` operations related to the cache in `db/db_logical_block_size_cache_test.cc`.
      
      `make check`
      
      Differential Revision: D20131243
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: 3077c50f8065c0bffb544d8f49fb10bba9408d04
      2d9efc9a
  23. 11 3月, 2020 1 次提交
    • Levi Tamasi's avatar
      Split BlobFileState into an immutable and a mutable part (#6502) · f5bc3b99
      Levi Tamasi 创作于
      Summary:
      It's never too soon to refactor something. The patch splits the recently
      introduced (`VersionEdit` related) `BlobFileState` into two classes
      `BlobFileAddition` and `BlobFileGarbage`. The idea is that once blob files
      are closed, they are immutable, and the only thing that changes is the
      amount of garbage in them. In the new design, `BlobFileAddition` contains
      the immutable attributes (currently, the count and total size of all blobs, checksum
      method, and checksum value), while `BlobFileGarbage` contains the mutable
      GC-related information elements (count and total size of garbage blobs). This is a
      better fit for the GC logic and is more consistent with how SST files are handled.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6502
      
      Test Plan: `make check`
      
      Differential Revision: D20348352
      
      Pulled By: ltamasi
      
      fbshipit-source-id: ff93f0121e80ab15e0e0a6525ba0d6af16a0e008
      f5bc3b99
  24. 05 3月, 2020 1 次提交
    • Zhichao Cao's avatar
      Introduce FaultInjectionTestFS to test fault File system instead of Env (#6414) · e62fe506
      Zhichao Cao 创作于
      Summary:
      In the current code base, we can use FaultInjectionTestEnv to simulate the env issue such as file write/read errors, which are used in most of the test. The PR https://github.com/facebook/rocksdb/issues/5761 introduce the File System as a new Env API. This PR implement the FaultInjectionTestFS, which can be used to simulate when File System has issues such as IO error. user can specify any IOStatus error as input, such that FS corresponding actions will return certain error to the caller.
      
      A set of ErrorHandlerFSTests are introduced for testing
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6414
      
      Test Plan: pass make asan_check, pass error_handler_fs_test.
      
      Differential Revision: D20252421
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: e922038f8ce7e6d1da329fd0bba7283c4b779a21
      e62fe506
  25. 25 2月, 2020 2 次提交
    • Levi Tamasi's avatar
      Add blob file state to VersionEdit (#6416) · d87c10c6
      Levi Tamasi 创作于
      Summary:
      BlobDB currently does not keep track of blob files: no records are written to
      the manifest when a blob file is added or removed, and upon opening a database,
      the list of blob files is populated simply based on the contents of the blob directory.
      This means that lost blob files cannot be detected at the moment. We plan to solve
      this issue by making blob files a part of `Version`; as a first step, this patch makes
      it possible to store information about blob files in `VersionEdit`. Currently, this information
      includes blob file number, total number and size of all blobs, and total number and size
      of garbage blobs. However, the format is extensible: new fields can be added in
      both a forward compatible and a forward incompatible manner if needed (similarly
      to `kNewFile4`).
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6416
      
      Test Plan: `make check`
      
      Differential Revision: D19894234
      
      Pulled By: ltamasi
      
      fbshipit-source-id: f9753e1f2aedf6dadb70c09b345207cb9c58c329
      d87c10c6
    • sdong's avatar
      Buck config: Re-enable liburing under Linux (#6451) · eb367d45
      sdong 创作于
      Summary:
      The known bug of liburing has been fixed. Now we can re-enable liburing under Linux
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6451
      
      Test Plan: Watch internal CI
      
      Differential Revision: D20079009
      
      fbshipit-source-id: 04a6f53a900ff721f9a62a188cf906771b5d68d2
      eb367d45
  26. 14 2月, 2020 1 次提交
  27. 11 2月, 2020 2 次提交
    • Cheng Chang's avatar
      Add utility class Defer (#6382) · dafb5680
      Cheng Chang 创作于
      Summary:
      Add a utility class `Defer` to defer the execution of a function until the Defer object goes out of scope.
      Used in VersionSet:: ProcessManifestWrites as an example.
      The inline comments for class `Defer` have more details.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6382
      
      Test Plan: `make defer_test version_set_test && ./defer_test && ./version_set_test`
      
      Differential Revision: D19797538
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: b1a9b7306e4fd4f48ec2ab55783caa561a315f0f
      dafb5680
    • Zhichao Cao's avatar
      Checksum for each SST file and stores in MANIFEST (#6216) · 4369f2c7
      Zhichao Cao 创作于
      Summary:
      In the current code base, RocksDB generate the checksum for each block and verify the checksum at usage. Current PR enable SST file checksum. After a SST file is generated by Flush or Compaction, RocksDB generate the SST file checksum and store the checksum value and checksum method name in the vs_info and MANIFEST as part for the FileMetadata.
      
      Added the enable_sst_file_checksum to Options to enable or disable file checksum. Added sst_file_checksum to Options such that user can plugin their own SST file checksum calculate method via overriding the SstFileChecksum class. The checksum information inlcuding uint32_t checksum value and a checksum name (string).  A new tool is added to LDB such that user can dump out a list of file checksum information from MANIFEST. If user enables the file checksum but does not provide the sst_file_checksum instance, RocksDB will use the default crc32checksum implemented in table/sst_file_checksum_crc32c.h
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6216
      
      Test Plan: Added the testing case in table_test and ldb_cmd_test to verify checksum is correct in different level. Pass make asan_check.
      
      Differential Revision: D19171461
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: b2e53479eefc5bb0437189eaa1941670e5ba8b87
      4369f2c7
  28. 08 2月, 2020 2 次提交
    • Cheng Chang's avatar
      Support move semantics for PinnableSlice (#6374) · b42fa149
      Cheng Chang 创作于
      Summary:
      It's logically correct for PinnableSlice to support move semantics to transfer ownership of the pinned memory region. This PR adds both move constructor and move assignment to PinnableSlice.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6374
      
      Test Plan:
      A set of unit tests for the move semantics are added in slice_test.
      So `make slice_test && ./slice_test`.
      
      Differential Revision: D19739254
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: f898bd811bb05b2d87384ec58b645e9915e8e0b1
      b42fa149
    • Chad Austin's avatar
      Fix Buck build on macOS (#6378) · 25fbdc5a
      Chad Austin 创作于
      Summary:
      liburing is a Linux-specific dependency, so make sure it's configured in the Linux-only Buck rules.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6378
      
      Test Plan:
      ```
      ~/fbcode $ cp internal_repo_rocksdb/repo/TARGETS rocksdb/src
      ~/fbcode $ buck build mode/mac eden
      ```
      
      Reviewed By: chadaustin
      
      Differential Revision: D19760039
      
      Pulled By: riversand963
      
      fbshipit-source-id: 2abfce81c8b17965ef76012262cd117708e0294f
      25fbdc5a
  29. 14 12月, 2019 2 次提交
    • anand76's avatar
      Introduce a new storage specific Env API (#5761) · afa2420c
      anand76 创作于
      Summary:
      The current Env API encompasses both storage/file operations, as well as OS related operations. Most of the APIs return a Status, which does not have enough metadata about an error, such as whether its retry-able or not, scope (i.e fault domain) of the error etc., that may be required in order to properly handle a storage error. The file APIs also do not provide enough control over the IO SLA, such as timeout, prioritization, hinting about placement and redundancy etc.
      
      This PR separates out the file/storage APIs from Env into a new FileSystem class. The APIs are updated to return an IOStatus with metadata about the error, as well as to take an IOOptions structure as input in order to allow more control over the IO.
      
      The user can set both ```options.env``` and ```options.file_system``` to specify that RocksDB should use the former for OS related operations and the latter for storage operations. Internally, a ```CompositeEnvWrapper``` has been introduced that inherits from ```Env``` and redirects individual methods to either an ```Env``` implementation or the ```FileSystem``` as appropriate. When options are sanitized during ```DB::Open```, ```options.env``` is replaced with a newly allocated ```CompositeEnvWrapper``` instance if both env and file_system have been specified. This way, the rest of the RocksDB code can continue to function as before.
      
      This PR also ports PosixEnv to the new API by splitting it into two - PosixEnv and PosixFileSystem. PosixEnv is defined as a sub-class of CompositeEnvWrapper, and threading/time functions are overridden with Posix specific implementations in order to avoid an extra level of indirection.
      
      The ```CompositeEnvWrapper``` translates ```IOStatus``` return code to ```Status```, and sets the severity to ```kSoftError``` if the io_status is retryable. The error handling code in RocksDB can then recover the DB automatically.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5761
      
      Differential Revision: D18868376
      
      Pulled By: anand1976
      
      fbshipit-source-id: 39efe18a162ea746fabac6360ff529baba48486f
      afa2420c
    • Peter Dillinger's avatar
      Add useful idioms to Random API (OneInOpt, PercentTrue) (#6154) · 58d46d19
      Peter Dillinger 创作于
      Summary:
      And clean up related code, especially in stress test.
      
      (More clean up of db_stress_test_base.cc coming after this.)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6154
      
      Test Plan: make check, make blackbox_crash_test for a bit
      
      Differential Revision: D18938180
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 524d27621b8dbb25f6dff40f1081e7c00630357e
      58d46d19
  30. 09 12月, 2019 1 次提交
    • sdong's avatar
      Break db_stress_tool.cc to a list of source files (#6134) · 7d79b326
      sdong 创作于
      Summary:
      db_stress_tool.cc now is a giant file. In order to main it easier to improve and maintain, break it down to multiple source files.
      Most classes are turned into their own files. Separate .h and .cc files are created for gflag definiations. Another .h and .cc files are created for some common functions. Some test execution logic that is only loosely related to class StressTest is moved to db_stress_driver.h and db_stress_driver.cc. All the files are located under db_stress_tool/. The directory name is created as such because if we end it with either stress or test, .gitignore will ignore any file under it and makes it prone to issues in developements.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6134
      
      Test Plan: Build under GCC7 with and without LITE on using GNU Make. Build with GCC 4.8. Build with cmake with -DWITH_TOOL=1
      
      Differential Revision: D18876064
      
      fbshipit-source-id: b25d0a7451840f31ac0f5ebb0068785f783fdf7d
      7d79b326
  31. 08 12月, 2019 1 次提交
    • sdong's avatar
      PosixRandomAccessFile::MultiRead() to use I/O uring if supported (#5881) · e3a82bb9
      sdong 创作于
      Summary:
      Right now, PosixRandomAccessFile::MultiRead() executes read requests in parallel. In this PR, it leverages I/O Uring library to run it in parallel, even when page cache is enabled. This function will fall back if the kernel version doesn't support it.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5881
      
      Test Plan: Run the unit test on a kernel version supporting it and make sure all tests pass, and run a unit test on kernel version supporting it and see it pass. Before merging, will also run stress test and see it passes.
      
      Differential Revision: D17742266
      
      fbshipit-source-id: e05699c925ac04fdb42379456a4e23e4ebcb803a
      e3a82bb9