1. 04 Jun, 2020 5 commits
      Add OptionTypeInfo::Vector to parse/serialize vectors (#6424)
      mrambacher authored
      The OptionTypeInfo::Vector method allows a vector<T> to be converted to/from strings via the options.
      The kVectorInt and kVectorCompressionType vectors were replaced with this methodology.
      As part of this change, the NextToken method was added to the OptionTypeInfo.  This method was refactored from code within the StringToMap function.
      Future types that could use this functionality include the EventListener vectors.
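      A minimal sketch of the string/vector round trip this kind of helper performs. The function names and the ':' delimiter are assumptions made for illustration; this is not the OptionTypeInfo::Vector API itself.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not the RocksDB API): splits a delimiter-separated
// string such as "1:2:3" into integers. Empty tokens are skipped.
std::vector<int> ParseIntVector(const std::string& text, char delim = ':') {
  std::vector<int> result;
  std::istringstream stream(text);
  std::string token;
  while (std::getline(stream, token, delim)) {
    if (!token.empty()) {
      result.push_back(std::stoi(token));
    }
  }
  return result;
}

// Hypothetical inverse: joins the elements with the same delimiter.
std::string SerializeIntVector(const std::vector<int>& values,
                               char delim = ':') {
  std::string out;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (i > 0) {
      out.push_back(delim);
    }
    out += std::to_string(values[i]);
  }
  return out;
}
```

      Serializing the result of ParseIntVector("7:8") gives back "7:8", which is the round-trip property an options parser needs.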
      Posix threads (#6865)
      Lucian Petrut authored
      RocksDB uses the C++11 std::thread feature. The issue is that
      MINGW only supports it when using POSIX threads.
      This change will allow rocksdb::port::WindowsThread to be replaced
      with std::thread, which in turn will allow RocksDB to be cross
      compiled using MINGW.
      At the same time, we'll have to use GetCurrentProcessId instead of _getpid.
      Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
      Some fixes for gcc 4.8 and add to Travis (#6915)
      Peter Dillinger authored
      People keep breaking the gcc 4.8 compilation due to different
      warnings for shadowing member functions with locals. Adding to Travis
      to keep compatibility. (gcc 4.8 is default on CentOS 7.)
      Improve consistency checks in VersionBuilder (#6901)
      Levi Tamasi authored
      The patch cleans up the code and improves the consistency checks around
      adding/deleting table files in `VersionBuilder`. Namely, it makes the checks
      stricter and improves them in the following ways:
      1) A table file can now only be deleted from the LSM tree using the level it
      resides on. Earlier, there was some unnecessary wiggle room for
      trivially moved files (they could be deleted using a lower level number than
      the actual one).
      2) A table file cannot be added to the tree if it is already present in the tree
      on any level (not just the target level). The earlier code only had an assertion
      (which is a no-op in release builds) that the newly added file is not already
      present on the target level.
      3) The above consistency checks around state transitions are now mandatory,
      as opposed to the earlier `CheckConsistencyForDeletes`, which was a no-op
      in release mode unless `force_consistency_checks` was set to `true`. The rationale
      here is that assuming that the initial state is consistent, a valid transition leads to a
      next state that is also consistent; however, an *invalid* transition offers no such
      guarantee. Hence it makes sense to validate the transitions unconditionally,
      and save `force_consistency_checks` for the paranoid checks that re-validate
      the entire state.
      4) The new checks build on the mechanism introduced in https://github.com/facebook/rocksdb/pull/6862,
      which enables us to efficiently look up the location (level and position within level)
      of files in a `Version` by file number. This makes the consistency checks much more
      efficient than the earlier `CheckConsistencyForDeletes`, which essentially
      performed a linear search.
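      The transition checks in points 1) and 2) can be sketched roughly as follows. The class and method names are hypothetical and the model tracks only the level of each file; it is an illustration under those assumptions, not the `VersionBuilder` code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical model of the add/delete checks; names are illustrative only.
class FileLocationTracker {
 public:
  // Returns an empty string on success, or an error message if the file is
  // already present on any level.
  std::string AddFile(int level, uint64_t file_number) {
    auto it = level_of_file_.find(file_number);
    if (it != level_of_file_.end()) {
      return "file " + std::to_string(file_number) +
             " already present on level " + std::to_string(it->second);
    }
    level_of_file_[file_number] = level;
    return "";
  }

  // A file can only be deleted using the level it actually resides on.
  std::string DeleteFile(int level, uint64_t file_number) {
    auto it = level_of_file_.find(file_number);
    if (it == level_of_file_.end()) {
      return "file " + std::to_string(file_number) + " not found";
    }
    if (it->second != level) {
      return "file " + std::to_string(file_number) + " is on level " +
             std::to_string(it->second) + ", not level " +
             std::to_string(level);
    }
    level_of_file_.erase(it);
    return "";
  }

 private:
  // Hash lookup by file number replaces a linear search over all levels.
  std::unordered_map<uint64_t, int> level_of_file_;
};
```

      Because both checks are a single hash lookup, they are cheap enough to run unconditionally on every transition.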
      Fix handling of too-small filter partition size (#6905)
      Peter Dillinger authored
      Because ARM and some other platforms have a larger cache line
      size, they have a larger minimum filter size, which causes the recently
      added PartitionedMultiGet test in db_bloom_filter_test to fail on those
      platforms. The code would actually end up using larger partitions,
      because keys_per_partition_ would be 0 and never == number of keys.
      The code now attempts to get as close as possible to the small target
      size, while fully utilizing that filter size, if the target partition
      size is smaller than the minimum filter size.
      Also updated the test to break more uniformly across platforms.
  2. 03 Jun, 2020 6 commits
      Fix potential overflow of unsigned type in for loop (#6902)
      Zhichao Cao authored
      x.size() - 1 or y - 1 can wrap around to an extremely large value when x.size() or y is 0 and the type is unsigned. The end condition of i in the for loop then becomes extremely large, potentially causing a segmentation fault. Fix them.
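      A small illustration of the pitfall and one common way to avoid it. The function below is a made-up example, not code from the patch: it moves the subtraction to the other side of the comparison so nothing can wrap.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts adjacent pairs (v[i], v[i + 1]) that are in ascending order.
// Writing the bound as `i < v.size() - 1` would wrap around to a huge value
// for an empty vector, because size() returns an unsigned type. Comparing
// `i + 1 < v.size()` avoids the subtraction entirely.
std::size_t CountAscendingPairs(const std::vector<int>& v) {
  std::size_t count = 0;
  for (std::size_t i = 0; i + 1 < v.size(); ++i) {
    if (v[i] < v[i + 1]) {
      ++count;
    }
  }
  return count;
}
```

      With an empty vector the loop body never runs, whereas the subtracting form would index far past the end of the storage.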
      Replace Status with IOStatus in CopyFile and CreateFile (#6916)
      Zhichao Cao authored
      Replace Status with IOStatus in CopyFile and CreateFile.
      Remove gtest dependency in non-test code under utilities/cassandra (#6908)
      sdong authored
      Production code under utilities/cassandra depends on gtest.h. Remove the dependency.
      Expose rocksdb_options_copy function to the C API (#6880)
      Stanislav Tkach authored
      For ApproximateSizes, pro-rate table metadata size over data blocks (#6784)
      Peter Dillinger authored
      The implementation of GetApproximateSizes was inconsistent in
      its treatment of the size of non-data blocks of SST files, sometimes
      including and sometimes not. This was at its worst when a large portion
      of the table file is used by filters and a query covers a small range
      that crosses a table boundary: the size estimate would include the large
      filter size.
      It's conceivable that someone might want only to know the size in terms
      of data blocks, but I believe that's unlikely enough to ignore for now.
      Similarly, there's no evidence the internal function ApproximateOffsetOf
      is used for anything other than a one-sided ApproximateSize, so I intend
      to refactor to remove redundancy in a follow-up commit.
      So to fix this, GetApproximateSizes (and implementation details
      ApproximateSize and ApproximateOffsetOf) now consistently include in
      their returned sizes a portion of table file metadata (incl filters
      and indexes) based on the size portion of the data blocks in range. In
      other words, if a key range covers data blocks that are X% by size of all
      the table's data blocks, returned approximate size is X% of the total
      file size. It would technically be more accurate to attribute metadata
      based on number of keys, but that's not computationally efficient with
      data available and rarely a meaningful difference.
      Also includes miscellaneous comment improvements / clarifications.
      Also included is a new approximatesizerandom benchmark for db_bench.
      No significant performance difference seen with this change, whether ~700 ops/sec with cache_index_and_filter_blocks and small cache or ~150k ops/sec without cache_index_and_filter_blocks.
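      The pro-rating rule described in this commit (X% of the data blocks by size maps to X% of the total file size) can be sketched as below. The function name and parameters are hypothetical; the real logic lives in the table reader and works with block offsets rather than these three numbers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a RocksDB function): scales the whole file size,
// metadata included, by the fraction of data-block bytes that fall inside
// the queried key range.
uint64_t ProRatedSize(uint64_t data_bytes_in_range, uint64_t total_data_bytes,
                      uint64_t total_file_size) {
  if (total_data_bytes == 0) {
    return 0;  // no data blocks, so nothing can be attributed to the range
  }
  assert(data_bytes_in_range <= total_data_bytes);
  // Floating point avoids overflow in the intermediate product; the result
  // is an approximation anyway.
  const double fraction = static_cast<double>(data_bytes_in_range) /
                          static_cast<double>(total_data_bytes);
  return static_cast<uint64_t>(fraction *
                               static_cast<double>(total_file_size));
}
```

      For example, a range covering 25 of 100 data-block bytes in a 1000-byte file is reported as 250 bytes, so filter and index bytes are shared out in proportion instead of being charged to whichever range touches them.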
      Reduce dependency on gtest dependency in release code (#6907)
      sdong authored
      Release code now depends on gtest, indirectly through including "test_util/testharness.h". This creates multiple problems. One important reason is the definition of IGNORE_STATUS_IF_ERROR() in test_util/testharness.h. Move it to sync_point.h instead.
      Note that utilities/cassandra/format.h still depends on "test_util/testharness.h". This will be resolved in a separate diff.
  3. 02 Jun, 2020 5 commits
  4. 30 May, 2020 1 commit
      Allow missing "unversioned" python, as in CentOS 8 (#6883)
      Peter Dillinger authored
      RocksDB Makefile was assuming existence of 'python' command,
      which is not present in CentOS 8. We avoid using 'python' if 'python3' is available.
      Also added fancy logic to format-diff.sh to make clang-format-diff.py for Python2 work even with Python3 only (as some CentOS 8 FB machines come equipped)
      Also, now use just 'python3' for PYTHON if not found so that an informative
      "command not found" error will result rather than something weird.
  5. 29 May, 2020 3 commits
      avoid `IterKey::UpdateInternalKey()` in `BlockIter` (#6843)
      Andrew Kryczka authored
      `IterKey::UpdateInternalKey()` is an error-prone API as it's
      incompatible with `IterKey::TrimAppend()`, which is used for
      decoding delta-encoded internal keys. This PR stops using it in
      `BlockIter`. Instead, it assigns global seqno in a separate `IterKey`'s
      buffer when needed. The logic for safely getting a Slice with global
      seqno properly assigned is encapsulated in `GlobalSeqnoAppliedKey`.
      `BinarySeek()` is also migrated to use this API (previously it ignored
      global seqno entirely).
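      A simplified sketch of applying a global seqno in a separate buffer rather than mutating the key in place. It assumes the internal key layout of a user key followed by an 8-byte little-endian footer holding (sequence number << 8) | value type; all function names are hypothetical and this is not the `GlobalSeqnoAppliedKey` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Assumed layout: user key + 8-byte little-endian (seqno << 8 | type).
std::string MakeInternalKey(const std::string& user_key, uint64_t seqno,
                            uint8_t type) {
  std::string key = user_key;
  uint64_t packed = (seqno << 8) | type;
  for (int i = 0; i < 8; ++i) {
    key.push_back(static_cast<char>(packed & 0xff));
    packed >>= 8;
  }
  return key;
}

// Reads the packed footer back from the last 8 bytes of the key.
uint64_t DecodeFooter(const std::string& internal_key) {
  assert(internal_key.size() >= 8);
  const std::size_t start = internal_key.size() - 8;
  uint64_t packed = 0;
  for (int i = 7; i >= 0; --i) {
    packed = (packed << 8) |
             static_cast<unsigned char>(internal_key[start + i]);
  }
  return packed;
}

// Returns a copy of the key with the global seqno applied and the value type
// preserved. The original buffer is left untouched.
std::string WithGlobalSeqno(const std::string& internal_key,
                            uint64_t global_seqno) {
  const uint8_t type = static_cast<uint8_t>(DecodeFooter(internal_key) & 0xff);
  return MakeInternalKey(internal_key.substr(0, internal_key.size() - 8),
                         global_seqno, type);
}
```

      Keeping the rewritten key in its own buffer means later appends to the original key buffer cannot clobber the assigned seqno.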
      Add timestamp to delete (#6253)
      Yanqin Jin authored
      Preliminary user-timestamp support for delete.
      If ["a", ts=100] exists, you can delete it by calling `DB::Delete(write_options, key)` in which `write_options.timestamp` points to a `ts` higher than 100.
      A new ValueType, i.e. `kTypeDeletionWithTimestamp` is added for deletion marker with timestamp.
      The reason for a separate `kTypeDeletionWithTimestamp`: RocksDB may drop tombstones (keys with kTypeDeletion) when compacting them to the bottom level. This is OK and useful if timestamp is disabled. When timestamp is enabled, if we still reused `kTypeDeletion`, we might drop the tombstone with a more recent timestamp, causing deleted keys to re-appear.
      Test plan (dev server)
      make check
      Make it possible to look up files by number in VersionStorageInfo (#6862)
      Levi Tamasi authored
      Does what it says on the can: the patch adds a hash map to `VersionStorageInfo`
      that maps file numbers to file locations, i.e. (level, position in level) pairs. This
      will enable stricter consistency checks in `VersionBuilder`. The patch also fixes
      all the unit tests that used duplicate file numbers in a version (which would trigger
      an assertion with the new code).
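      A rough sketch of such an index, built from a per-level list of file numbers. The struct and function names are assumptions for illustration and do not mirror the `VersionStorageInfo` members.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical location record: (level, position in level).
struct FileLocation {
  int level;
  std::size_t position;
};

// Builds a file number -> location map from a per-level list of file
// numbers. Duplicate file numbers trip the assertion in debug builds.
std::unordered_map<uint64_t, FileLocation> BuildLocationIndex(
    const std::vector<std::vector<uint64_t>>& levels) {
  std::unordered_map<uint64_t, FileLocation> index;
  for (std::size_t level = 0; level < levels.size(); ++level) {
    for (std::size_t pos = 0; pos < levels[level].size(); ++pos) {
      const bool inserted =
          index.emplace(levels[level][pos],
                        FileLocation{static_cast<int>(level), pos})
              .second;
      assert(inserted);
      (void)inserted;  // silence unused-variable warnings in release builds
    }
  }
  return index;
}
```

      A lookup by file number is then a single hash probe instead of a scan over every level.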
  6. 28 May, 2020 2 commits
  7. 27 May, 2020 1 commit
  8. 26 May, 2020 1 commit
      cmake: link env_librados_test against rados (#6855)
      Kefu Chai authored
      otherwise we have FTBFS like:
      2020-05-18T15:12:06.400 INFO:tasks.workunit.client.0.smithi032.stdout:[100%] Linking CXX executable env_librados_test
      2020-05-18T15:12:06.620 INFO:tasks.workunit.client.0.smithi032.stderr:/usr/bin/ld: CMakeFiles/rocksdb_env_librados_test.dir/utilities/env_librados_test.cc.o: undefined reference to symbol
      2020-05-18T15:12:06.620 INFO:tasks.workunit.client.0.smithi032.stderr:/usr/bin/ld: /lib/librados.so.2: error adding symbols: DSO missing from command line
      2020-05-18T15:12:06.620 INFO:tasks.workunit.client.0.smithi032.stderr:collect2: error: ld returned 1 exit status
      this addresses the regression introduced by 07204837
      which hides the symbols exposed by `${THIRDPARTY_LIBS}` from
      consumers of librocksdb
      Signed-off-by: Kefu Chai <tchaikov@gmail.com>
  9. 25 May, 2020 1 commit
      fix transaction rollback in db_stress TestMultiGet (#6873)
      Andrew Kryczka authored
      There were further uses of `txn` after `RollbackTxn(txn)` leading to
      stress test errors. Moved the rollback to the end of the function.
  10. 24 May, 2020 1 commit
      pin image version in circle CI builds (#6876)
      Andrew Kryczka authored
      somehow the windows-server-2019-vs2019 image changed in a way that made
      VS 14 2015 the default. This caused an error when we specify VS 16 2019
      as the cmake generator. I could not figure out the right arguments/env
      vars to get the latest VS working so pinned the image to the previous
      version instead.
  11. 23 May, 2020 3 commits
      Misc things for ASSERT_STATUS_CHECKED, also gcc 4.8.5 (#6871)
      Peter Dillinger authored
      * Print stack trace on status checked failure
      * Make folly_synchronization_distributed_mutex_test a parallel test
      * Disable ldb_test.py and rocksdb_dump_test.sh with
        ASSERT_STATUS_CHECKED (broken)
      * Fix shadow warning in random_access_file_reader.h reported by gcc
        4.8.5 (ROCKSDB_NO_FBCODE), also https://github.com/facebook/rocksdb/issues/6866
      * Work around compiler bug on max_align_t for gcc < 4.9
      * Remove an apparently wrong comment in status.h
      * Use check_some in Travis config (for proper diagnostic output)
      * Fix ignored Status in loop in options_helper.cc
      Fix warning -Wextra-semi. NFC. (#6869)
      Marek Kurdej authored
      Minor fix.
      CLA signed.
      Fix/expand ASSERT_STATUS_CHECKED build, add to Travis (#6870)
      Peter Dillinger authored
      Fixed some option handling code that recently broke the
      ASSERT_STATUS_CHECKED build for options_test.
      Added all other existing tests that pass under ASSERT_STATUS_CHECKED to
      the whitelist.
      Added a Travis configuration to run all whitelisted tests with
      ASSERT_STATUS_CHECKED. (Someday we might enable this check by default in
      debug builds.)
  12. 22 May, 2020 3 commits
      Change autovector to have a reserved size in LITE mode (#6868)
      mrambacher authored
      Previously in LITE mode, an autovector did not have a reserved size. When
      elements were added to the vector, the underlying array could be reallocated.
      There was a set of code that never expands the autovector and was doing &autovector::back(). When the vector is resized, the old addresses may become invalid, leading to SEGV or other exceptions.
      By reserving space in the autovector up front, this problem is eliminated for those uses where the vector will never exceed the initial size.
      This change also allows the autovector to be fully populated before we take the address of any of its elements, thereby eliminating the potential for a resize.
      There is comparable code to this change in Version::MultiGet for dealing with the context objects.
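      The same hazard can be shown with std::vector standing in for autovector (an assumption made for illustration; autovector has its own inline storage). Reserving the final size before taking any element address keeps the pointers valid as long as the vector never grows past that size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends `count` elements and records a pointer to each one as it is added.
// Without the reserve() call, a reallocation during push_back could
// invalidate pointers taken earlier.
std::vector<const int*> FillAndCollectPointers(std::vector<int>& values,
                                               std::size_t count) {
  values.clear();
  values.reserve(count);  // guarantees no reallocation up to `count` elements
  std::vector<const int*> pointers;
  pointers.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    values.push_back(static_cast<int>(i));
    pointers.push_back(&values.back());
  }
  return pointers;
}
```

      Every recorded pointer still refers to the matching element after the loop, which is not guaranteed if the reserve() call is removed.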
      skip direct I/O tests in rocksdb lite (#6867)
      Andrew Kryczka authored
      Fix a couple places where direct I/O was used even though it is
      unsupported in lite builds.
      Add Struct Type to OptionsTypeInfo (#6425)
      mrambacher authored
      Added code for generically handling structs to OptionTypeInfo.  A struct is a collection of variables handled by their own map of OptionTypeInfos.  Examples of structs include Compaction and Cache options.
  13. 21 May, 2020 4 commits
      Clean up some code related to file checksums (#6861)
      Peter Dillinger authored
      * Add missing unit test for schema stability of FileChecksumGenCrc32c
        (previously was only comparing to itself)
      * A lot of clarifying comments
      * Add some assertions for preconditions
      * Rename WritableFileWriter::CalculateFileChecksum -> UpdateFileChecksum
      * Simplify FileChecksumGenCrc32c with shared functions
      * Implement EndianSwapValue to replace unused EndianTransform
      And incidentally since I had trouble with 'make check-format' GitHub action disagreeing with local run,
      * Output full diagnostic information when 'make check-format' fails in CI
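      For the endian swap item in the list above, a generic byte swap for integral values might look roughly like this. The name SwapBytes is hypothetical and the code is an independent sketch, not the EndianSwapValue implementation from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Illustrative byte-order reversal for integral values.
template <typename T>
T SwapBytes(T value) {
  static_assert(std::is_integral<T>::value && !std::is_same<T, bool>::value,
                "integral, non-bool types only");
  using U = typename std::make_unsigned<T>::type;
  U in = static_cast<U>(value);
  U out = 0;
  // Move the lowest byte of `in` onto the low end of `out`, one byte at a
  // time, so the first byte read ends up as the most significant byte.
  for (std::size_t i = 0; i < sizeof(T); ++i) {
    out = static_cast<U>((out << 8) | (in & 0xff));
    in = static_cast<U>(in >> 8);
  }
  return static_cast<T>(out);
}
```

      Applying the function twice returns the original value, which makes it usable for both reading and writing a fixed byte order.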
      Fix a bug in crash_test_with_txn (#6860)
      anand76 authored
      In NoBatchedOpsStress::TestMultiGet, call txn->Get() when transactions
      are in use.
      Generate file checksum in SstFileWriter (#6859)
      Zhichao Cao authored
      If Option.file_checksum_gen_factory is set, RocksDB generates the file checksum during flush and compaction based on the checksum generator created by the factory and stores the checksum and function name in vstorage and Manifest.
      This PR enables file checksum generation in SstFileWriter and stores the checksum and checksum function name in the ExternalSstFileInfo, such that applications can use them for other purposes, for example, to ingest the file checksum with files in IngestExternalFile().
      Peter Dillinger authored
      ... so that we have freedom to upgrade it (see https://github.com/facebook/rocksdb/issues/6808).
      As a side benefit, gtest will no longer be linked into main library in
      buck build.
  14. 20 May, 2020 4 commits
      Status check enforcement for io_posix_test and options_settable_test (#6857)
      Akanksha Mahajan authored
      Added status check enforcement for io_posix_test and options_settable_test
      anand76 authored
      Add MultiGet to VerifyDb and check consistency with Get in TestMultiGet.
      Test plan -
      make crash_test
      ASAN crash test
      Levi Tamasi authored
      This patch is groundwork for an upcoming change to store the set of
      linked SSTs in `BlobFileMetaData`. With the current code, a new
      `BlobFileMetaData` object is created each time a `VersionEdit` touches
      a certain blob file. This is fine as long as these objects are lightweight
      and cheap to create; however, with the addition of the linked SST set, it would
      be very inefficient since the set would have to be copied over and over again.
      Note that this is the same kind of problem that `VersionBuilder` is solving
      w/r/t `Version`s and files, and we can apply the same solution; that is, we can
      accumulate the changes in a different mutable object, and apply the delta in
      one shot when the changes are committed. The patch does exactly that by
      adding a new `BlobFileMetaDataDelta` class to `VersionBuilder`. In addition,
      it turns the existing `GetBlobFileMetaData` helper into `IsBlobFileInVersion`
      (which is fine since that's the only thing the method's clients care about now),
      and adds a couple of helper methods that can create a `BlobFileMetaData`
      object from the `BlobFileMetaData` in the base (if applicable) and the delta
      when the `Version` is saved.
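      The accumulate-then-apply idea can be sketched as follows. The types below are simplified, hypothetical stand-ins that track only a linked SST set; they are not the `BlobFileMetaData` or `BlobFileMetaDataDelta` classes.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <set>

// Hypothetical immutable metadata: treated as read-only once shared.
struct BlobMeta {
  std::set<uint64_t> linked_ssts;
};

// Hypothetical mutable delta: edits are accumulated here and never touch the
// base object until Apply() is called.
class BlobMetaDelta {
 public:
  void LinkSst(uint64_t sst) {
    unlinked_.erase(sst);
    linked_.insert(sst);
  }
  void UnlinkSst(uint64_t sst) {
    linked_.erase(sst);
    unlinked_.insert(sst);
  }

  // Applies all accumulated changes in one shot; the base set is copied
  // exactly once, no matter how many edits were recorded.
  std::shared_ptr<const BlobMeta> Apply(const BlobMeta& base) const {
    auto result = std::make_shared<BlobMeta>(base);
    for (uint64_t sst : unlinked_) {
      result->linked_ssts.erase(sst);
    }
    result->linked_ssts.insert(linked_.begin(), linked_.end());
    return result;
  }

 private:
  std::set<uint64_t> linked_;
  std::set<uint64_t> unlinked_;
};
```

      The cost of copying the set is paid once when the changes are committed, rather than once per edit that touches the blob file.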
      mrambacher authored
      Under MacOS when running with make -j 8 check, the temporary directory generated was > 100 characters.  This caused the tests to do nothing under MacOS.  Most of them still reported success for doing nothing, but ReadaheadSize was expecting the test to run.
      By making the option name longer, the tests will now run successfully (and do something!)
