1. 10 Oct, 2018 6 commits
    • avoid copying when iterating using range-based for (#4459) · 141ef7f8
      jsteemann authored
      This avoids a few copies of std::string and other structs
      in range-based for loops. Instead of copying
      the values for each iteration, use a const reference to avoid the copy.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4459
      Differential Revision: D10282045
      Pulled By: sagar0
      fbshipit-source-id: 5012e910dca279abd2be847e1fb432d96274edfb
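The effect described in #4459 can be illustrated with a minimal sketch. The `Tracked` type and both helper functions are hypothetical names made up for this example and are not taken from the RocksDB sources; the copy counter only exists to make the difference visible.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type that counts how often it is copy-constructed.
struct Tracked {
  std::string value;
  static int copies;

  explicit Tracked(std::string v) : value(std::move(v)) {}
  Tracked(const Tracked& other) : value(other.value) { ++copies; }
  Tracked(Tracked&&) noexcept = default;
};
int Tracked::copies = 0;

// Iterates by value: every iteration copy-constructs one element.
std::size_t TotalLengthByValue(const std::vector<Tracked>& items) {
  std::size_t total = 0;
  for (auto item : items) {
    total += item.value.size();
  }
  return total;
}

// Iterates by const reference: no element is copied.
std::size_t TotalLengthByRef(const std::vector<Tracked>& items) {
  std::size_t total = 0;
  for (const auto& item : items) {
    total += item.value.size();
  }
  return total;
}
```

Both functions return the same total; only the by-value loop increments `Tracked::copies`, once per element.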
    • JNI support for ReadOptions::iterate_lower_bound (#4444) · f45c0d20
      moozzyk authored
      Fixes: #4401
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4444
      Differential Revision: D10282120
      Pulled By: sagar0
      fbshipit-source-id: d9ddcc1b132208ae7f806fa2106add6fec1baa11
    • fix typo in error message, twice (#4457) · 517d3b8b
      jsteemann authored
      Fixes a typo in error messages returned by Iterator::GetProperty(...)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4457
      Differential Revision: D10281965
      Pulled By: sagar0
      fbshipit-source-id: 1cd3c665f467ef06cdfd9f482692e6f8568f3d22
    • Enable building of ARM32 (#4349) · b0026e1f
      Jiri Appl authored
      The original logic was assuming that the only architectures that the code would build for on Windows were x86 and x64. This change will enable building for arm32 on Windows as well.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4349
      Differential Revision: D10280887
      Pulled By: sagar0
      fbshipit-source-id: 9ca0bede25505d22e13acf916d38aeeaaf5d981a
    • Truncate range tombstones by leveraging InternalKeys (#4432) · 3a4bd36f
      Abhishek Madan authored
      To more accurately truncate range tombstones at SST boundaries,
      we now represent them in RangeDelAggregator using InternalKeys, which
      are end-key-exclusive as they were before this change.
      During compaction, "atomic compaction unit boundaries" (the range of
      keys contained in neighbouring and overlapping SSTs) are propagated down
      to RangeDelAggregator to truncate range tombstones at those boundaries
      instead. See https://github.com/facebook/rocksdb/pull/4432#discussion_r221072219 and https://github.com/facebook/rocksdb/pull/4432#discussion_r221138683
      for motivating examples.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4432
      Differential Revision: D10263952
      Pulled By: abhimadan
      fbshipit-source-id: 2fe85ff8a02b3a6a2de2edfe708012797a7bd579
    • add locking around calls to RecalculateWriteStallConditions in column_family_test (#4474) · 283a700f
      Zhongyi Xie authored
      this should fix the current failing TSAN jobs:
      The callstack for TSAN:
      > WARNING: ThreadSanitizer: data race (pid=87440)
        Read of size 8 at 0x7d580000fce0 by thread T22 (mutexes: write M548703):
          #0 rocksdb::InternalStats::DumpCFStatsNoFileHistogram(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) db/internal_stats.cc:1204 (column_family_test+0x00000080eca7)
          #1 rocksdb::InternalStats::DumpCFStats(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) db/internal_stats.cc:1169 (column_family_test+0x0000008106d0)
          #2 rocksdb::InternalStats::HandleCFStats(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, rocksdb::Slice) db/internal_stats.cc:578 (column_family_test+0x000000810720)
          #3 rocksdb::InternalStats::GetStringProperty(rocksdb::DBPropertyInfo const&, rocksdb::Slice const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) db/internal_stats.cc:488 (column_family_test+0x00000080670c)
          #4 rocksdb::DBImpl::DumpStats() db/db_impl.cc:625 (column_family_test+0x00000070ce9a)
      >  Previous write of size 8 at 0x7d580000fce0 by main thread:
          #0 rocksdb::InternalStats::AddCFStats(rocksdb::InternalStats::InternalCFStatsType, unsigned long) db/internal_stats.h:324 (column_family_test+0x000000693bbf)
          #1 rocksdb::ColumnFamilyData::RecalculateWriteStallConditions(rocksdb::MutableCFOptions const&) db/column_family.cc:818 (column_family_test+0x000000693bbf)
          #2 rocksdb::ColumnFamilyTest_WriteStallSingleColumnFamily_Test::TestBody() db/column_family_test.cc:2563 (column_family_test+0x0000005e5a49)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4474
      Differential Revision: D10262099
      Pulled By: miasantreble
      fbshipit-source-id: 1247973a3ca32e399b4575d3401dd5439c39efc5
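The kind of race in the TSAN report above (one thread reading a counter while another writes it) and the fix of taking a lock around the call can be sketched in isolation. This is an assumption-laden illustration, not the actual test change: `FakeColumnFamily`, `Recalculate`, `Dump`, and `RunDumperAndTest` are hypothetical names, and a plain `std::mutex` stands in for whatever mutex the real test acquires.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical stand-in for per-column-family state shared between threads.
struct FakeColumnFamily {
  uint64_t stall_count = 0;  // must only be touched while holding the mutex

  // Both methods assume the caller already holds the mutex.
  void Recalculate() { ++stall_count; }
  uint64_t Dump() const { return stall_count; }
};

uint64_t RunDumperAndTest(int iterations) {
  std::mutex mu;
  FakeColumnFamily cf;
  uint64_t last_dump = 0;

  // Background thread: reads the counter under the lock, like a stats dump.
  std::thread dumper([&] {
    for (int i = 0; i < iterations; ++i) {
      std::lock_guard<std::mutex> guard(mu);
      last_dump = cf.Dump();
    }
  });

  // Test thread: takes the same lock around every call that writes the counter.
  for (int i = 0; i < iterations; ++i) {
    std::lock_guard<std::mutex> guard(mu);
    cf.Recalculate();
  }

  dumper.join();
  assert(last_dump <= static_cast<uint64_t>(iterations));
  std::lock_guard<std::mutex> guard(mu);
  return cf.stall_count;
}
```

Without the `lock_guard` in the writer loop, the read and the write would be unsynchronized, which is the pattern ThreadSanitizer flags.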
  2. 09 Oct, 2018 7 commits
    • move dump stats to a separate thread (#4382) · cac87fcf
      Zhongyi Xie authored
      Currently statistics are supposed to be dumped to the info log at intervals of `options.stats_dump_period_sec`. However, the implementation binds the dump to the compaction thread, meaning that if the database has been serving very light traffic, the stats may not get dumped at all.
      We decided to separate stats dumping into a new timed thread using `TimerQueue`, which is already used in blob_db. This will allow us to schedule new timed tasks with more deterministic behavior.
      Tested with db_bench using `--stats_dump_period_sec=20` in command line:
      > LOG:2018/09/17-14:07:45.575025 7fe99fbfe700 [WARN] [db/db_impl.cc:605] ------- DUMPING STATS -------
      LOG:2018/09/17-14:08:05.643286 7fe99fbfe700 [WARN] [db/db_impl.cc:605] ------- DUMPING STATS -------
      LOG:2018/09/17-14:08:25.691325 7fe99fbfe700 [WARN] [db/db_impl.cc:605] ------- DUMPING STATS -------
      LOG:2018/09/17-14:08:45.740989 7fe99fbfe700 [WARN] [db/db_impl.cc:605] ------- DUMPING STATS -------
      LOG content:
      > 2018/09/17-14:07:45.575025 7fe99fbfe700 [WARN] [db/db_impl.cc:605] ------- DUMPING STATS -------
      2018/09/17-14:07:45.575080 7fe99fbfe700 [WARN] [db/db_impl.cc:606]
      ** DB Stats **
      Uptime(secs): 20.0 total, 20.0 interval
      Cumulative writes: 4447K writes, 4447K keys, 4447K commit groups, 1.0 writes per commit group, ingest: 5.57 GB, 285.01 MB/s
      Cumulative WAL: 4447K writes, 0 syncs, 4447638.00 writes per sync, written: 5.57 GB, 285.01 MB/s
      Cumulative stall: 00:00:0.012 H:M:S, 0.1 percent
      Interval writes: 4447K writes, 4447K keys, 4447K commit groups, 1.0 writes per commit group, ingest: 5700.71 MB, 285.01 MB/s
      Interval WAL: 4447K writes, 0 syncs, 4447638.00 writes per sync, written: 5.57 MB, 285.01 MB/s
      Interval stall: 00:00:0.012 H:M:S, 0.1 percent
      ** Compaction Stats [default] **
      Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4382
      Differential Revision: D9933051
      Pulled By: miasantreble
      fbshipit-source-id: 6d12bb1e4977674eea4bf2d2ac6d486b814bb2fa
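The idea of running a task on its own timed thread, independent of other background work, can be sketched as follows. This is not RocksDB's `TimerQueue`; `PeriodicTask` and `CountTicks` are hypothetical names, and the sketch assumes a single fixed period and a single task.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical helper: runs `fn` every `period` on a dedicated thread.
class PeriodicTask {
 public:
  PeriodicTask(std::chrono::milliseconds period, std::function<void()> fn)
      : period_(period), fn_(std::move(fn)), worker_([this] { Run(); }) {}
  ~PeriodicTask() { Stop(); }

  // Wakes the worker early and waits for it to exit. Safe to call twice.
  void Stop() {
    {
      std::lock_guard<std::mutex> guard(mu_);
      stopped_ = true;
    }
    cv_.notify_all();
    if (worker_.joinable()) worker_.join();
  }

 private:
  void Run() {
    std::unique_lock<std::mutex> lock(mu_);
    while (!stopped_) {
      // Returns true only if stopped_ was set before the period elapsed.
      if (cv_.wait_for(lock, period_, [this] { return stopped_; })) break;
      lock.unlock();
      fn_();  // run the task without holding the lock
      lock.lock();
    }
  }

  std::chrono::milliseconds period_;
  std::function<void()> fn_;
  std::mutex mu_;
  std::condition_variable cv_;
  bool stopped_ = false;
  std::thread worker_;  // declared last so it starts after the other members
};

// Runs a counting task for `run_for` and reports how often it fired.
int CountTicks(std::chrono::milliseconds period,
               std::chrono::milliseconds run_for) {
  std::atomic<int> ticks{0};
  PeriodicTask task(period, [&ticks] { ++ticks; });
  std::this_thread::sleep_for(run_for);
  task.Stop();
  return ticks.load();
}
```

Because the task fires on a timer rather than as a side effect of other work, it runs even when the process is otherwise idle, which is the behavior the commit message asks for.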
    • Update version macro for 5.17 (#4472) · 35f26bec
      Fosco Marotto authored
      Forgot this in previous commit.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4472
      Differential Revision: D10244227
      Pulled By: gfosco
      fbshipit-source-id: ba0cf7a2f5271f0d9f9443004e2620887cd5fd11
    • Fix DBImpl::GetColumnFamilyHandleUnlocked race condition (#4391) · 27090ae8
      DorianZheng authored
      - Fix DBImpl API race condition
      The timeline of the execution flow is as follows:
      timeline              user_thread1                      user_thread2
      t1   |     cfh = GetColumnFamilyHandleUnlocked(0)
      t2   |     id1 = cfh->GetID()
      t3   |                                                GetColumnFamilyHandleUnlocked(1)
      t4   |     id2 = cfh->GetID()
      The original implementation returns a pointer to a stateful variable, so the returned `ColumnFamilyHandle` changes when another thread calls `GetColumnFamilyHandleUnlocked` with a different `column family id`.
      - Expose ColumnFamily ID to compaction event listener
      - Fix the return status of `DBImpl::GetLatestSequenceForKey`
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4391
      Differential Revision: D10221243
      Pulled By: yiwu-arbug
      fbshipit-source-id: dec60ee9ff0c8261a2f2413a8506ec1063991993
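The aliasing problem in the timeline above can be reproduced deterministically even on a single thread. In this sketch `Handle`, `FakeDB`, `GetHandleShared`, and `GetHandleOwned` are hypothetical names; returning an owned object is shown as one possible way to avoid the shared state, without claiming it is exactly what the commit does.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for a column family handle.
struct Handle {
  uint32_t id;
  uint32_t GetID() const { return id; }
};

class FakeDB {
 public:
  // Problematic pattern: every call overwrites one member and returns a
  // pointer to it, so earlier callers see the value change underneath them.
  Handle* GetHandleShared(uint32_t id) {
    shared_.id = id;
    return &shared_;
  }

  // Possible alternative: each caller receives its own object.
  std::unique_ptr<Handle> GetHandleOwned(uint32_t id) {
    return std::unique_ptr<Handle>(new Handle{id});
  }

 private:
  Handle shared_{0};
};
```

Calling `GetHandleShared(0)`, reading the ID, then calling `GetHandleShared(1)` and reading the ID again through the first pointer yields 0 and then 1, matching steps t1 to t4 of the timeline. With two threads the same interleaving becomes a race.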
    • Expose column family id to OnCompactionCompleted (#4466) · e0f05754
      DorianZheng authored
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4466
      Differential Revision: D10241358
      Pulled By: yiwu-arbug
      fbshipit-source-id: 99664eb286860a6c8844d50efeb0ef6f0e10dd1e
    • Fix return status of DBImpl::GetLatestSequenceForKey · 7487a762
      DorianZheng authored
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4467
      Differential Revision: D10241418
      Pulled By: yiwu-arbug
      fbshipit-source-id: f6adbe7292b2c934e14971c7432b3eb115c35026
    • Update HISTORY.md to current status (#4471) · b787cf9e
      Fosco Marotto authored
      The 5.16.x status wasn't tracked; this also updates HISTORY.md for the pending 5.17 release.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4471
      Differential Revision: D10240925
      Pulled By: gfosco
      fbshipit-source-id: 95ab368a04a65b201d2518097af69edf2402f544
    • RocksJava: memory_util support (#4446) · c9048021
      Ben Clay authored
      JNI passthrough for utilities/memory/memory_util.cc
      sagar0 adamretter
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4446
      Differential Revision: D10174578
      Pulled By: sagar0
      fbshipit-source-id: d1d196d771dff22afb7ef7500f308233675696f8
  3. 06 Oct, 2018 2 commits
  4. 05 Oct, 2018 4 commits
  5. 04 Oct, 2018 1 commit
  6. 03 Oct, 2018 3 commits
    • Introduce CacheAllocator, a custom allocator for cache blocks (#4437) · 1cf5deb8
      Igor Canadi authored
      This is a conceptually simple change, but it touches many files to
      pass the allocator through function calls.
      We introduce CacheAllocator, which clients can use to configure a
      custom allocator for cache blocks. Our motivation is to hook this up
      with folly's `JemallocNodumpAllocator`, but there are many other
      possible use cases.
      Additionally, this commit cleans up memory allocation in
      `util/compression.h`, making sure that all allocations are wrapped in a
      unique_ptr as soon as possible.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4437
      Differential Revision: D10132814
      Pulled By: yiwu-arbug
      fbshipit-source-id: be1343a4b69f6048df127939fea9bbc96969f564
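The two ideas in this commit message, a pluggable allocator and wrapping every allocation in a `unique_ptr` as early as possible, can be sketched together. The interface below is an assumption made for illustration: `BlockAllocator`, `CountingAllocator`, `BlockDeleter`, and `AllocateBlock` are hypothetical names and do not reproduce the actual `CacheAllocator` API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical allocator interface that a client could implement.
class BlockAllocator {
 public:
  virtual ~BlockAllocator() = default;
  virtual void* Allocate(std::size_t size) = 0;
  virtual void Deallocate(void* p) = 0;
};

// Example implementation: forwards to malloc/free and counts live blocks.
class CountingAllocator : public BlockAllocator {
 public:
  void* Allocate(std::size_t size) override {
    ++live_;
    return std::malloc(size);
  }
  void Deallocate(void* p) override {
    --live_;
    std::free(p);
  }
  int live() const { return live_; }

 private:
  int live_ = 0;
};

// Deleter that returns the block to the allocator it came from.
struct BlockDeleter {
  BlockAllocator* allocator;
  void operator()(char* p) const { allocator->Deallocate(p); }
};
using BlockPtr = std::unique_ptr<char[], BlockDeleter>;

// Wraps the raw allocation in a unique_ptr immediately, so it cannot leak.
BlockPtr AllocateBlock(BlockAllocator* allocator, std::size_t size) {
  assert(allocator != nullptr);
  return BlockPtr(static_cast<char*>(allocator->Allocate(size)),
                  BlockDeleter{allocator});
}
```

Carrying the allocator pointer inside the deleter is one reason such a change touches many call sites: every place that creates or passes a block needs access to the allocator.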
    • Check for compression lib support before test exec (#4443) · 4e58b2ea
      Yanqin Jin authored
      Before running CompactFilesTest.SentinelCompressionType, we should check
      whether zlib and snappy are supported.
      CompactFilesTest.SentinelCompressionType is a newly added test. Compilation and
      linking with different options, e.g. COMPILE_WITH_TSAN, COMPILE_WITH_ASAN, etc.
      lead to generation of different binaries. On the one hand, it's not clear why
      zlib or snappy is present under ASAN, but not under TSAN. On the other hand,
      changing the compilation flags for TSAN or ASAN seems a bigger change worth much
      more attention. To unblock the cont-runs, I suggest that we simply add these
      two checks at the beginning of the test, as we did for
      GeneralTableTest.ApproximateOffsetOfCompressed in table/table_test.cc.
      Future actions include investigating the absence of zlib and snappy when
      compiling with TSAN, i.e. COMPILE_WITH_TSAN=1, if necessary.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4443
      Differential Revision: D10140935
      Pulled By: riversand963
      fbshipit-source-id: 62f96d1e685386accd2ef0b98f6f754d3fd67b3e
    • Adding IOTA Foundation to USERS.MD (#4436) · d78b2893
      Jakub Cech authored
      Adding IOTA Foundation to USERS.MD
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4436
      Differential Revision: D10108142
      Pulled By: sagar0
      fbshipit-source-id: 948dc9f7169cec5c113ae347f1af765a41355aae
  7. 02 Oct, 2018 2 commits
    • Add proper newline markdown (#4434) · 477107d6
      Gihwan Oh authored
      Add newline for readability
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4434
      Differential Revision: D10127684
      Pulled By: riversand963
      fbshipit-source-id: 39f3ed7eaea655b6ff83474bc9f7616c6ad59107
    • Remove a race condition between lsdir and rm (#4440) · be5cc4c7
      Yanqin Jin authored
      In DBCompactionTestWithParam::ManualLevelCompactionOutputPathId, there is
      a race condition between `DBTestBase::GetSstFileCount` and
      `DBImpl::PurgeObsoleteFiles`. The following graph explains why.
      Timeline  db_compact_test_t              bg_flush_t         bg_compact_t
          |  [initiate bg flush and
          |      start waiting]
          |                                     flush
          |                                     DeleteObsoleteFiles
          |  [woken up by bg_flush_t which
          |   signaled in DeleteObsoleteFiles]
          |  [initiate compaction and
          |   start waiting]
          |                                                         [compact,
          |                                                          set manual.done to true]
          |                                   [signal at the end of
          |                                    BackgroundCallFlush]
          |  [woken up by bg_flush_t
          |   which signaled before
          |   returning from
          |   BackgroundCallFlush]
          |  Check manual.done is true
          |  GetSstFileCount    <-- race condition -->           PurgeObsoleteFiles
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4440
      Differential Revision: D10122628
      Pulled By: riversand963
      fbshipit-source-id: 3ede73c39fee6ad804dc6ac1ed84759c7e63977f
  8. 01 Oct, 2018 1 commit
  9. 28 Sep, 2018 2 commits
  10. 27 Sep, 2018 3 commits
    • assert in PosixEnv::FileExists should be based on errno (#4427) · b1dad4cf
      Sagar Vemuri authored
      The assert in PosixEnv::FileExists is currently based on the return value of `access` syscall. Instead it should be based on errno.
      Initially I wanted to remove this assert as [`access`](https://linux.die.net/man/2/access) can error out in a few other cases (like EROFS). But on further thought, the assert seems to be doing the right thing: it's good to crash on EROFS, EFAULT, EINVAL, and other major filesystem-related problems so that the user is immediately aware of them while testing.
      (I think it might be ok to crash on EIO as well, but there might be a specific reason why it was decided not to crash for EIO, and I don't have that context. So I am letting the assert checks remain as is for now.)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4427
      Differential Revision: D10037200
      Pulled By: sagar0
      fbshipit-source-id: 5cc96116a2e53cef701f444a8b5290576f311e51
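A minimal sketch of an existence check whose assert looks at `errno` rather than at the return value of `access` is shown below. It is not the `PosixEnv::FileExists` implementation: `ExistsResult` and `FileExistsSketch` are hypothetical names, and the set of `errno` values mapped to "not found" is an assumption chosen for illustration. Only the special treatment of EIO is taken from the message above.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <unistd.h>

// Hypothetical result type standing in for a status object.
enum class ExistsResult { kExists, kNotFound, kIOError };

ExistsResult FileExistsSketch(const std::string& path) {
  if (access(path.c_str(), F_OK) == 0) {
    return ExistsResult::kExists;
  }
  const int err = errno;  // capture immediately, before any other call
  switch (err) {
    // Assumed set of "the file is simply not there" conditions.
    case EACCES:
    case ELOOP:
    case ENAMETOOLONG:
    case ENOENT:
    case ENOTDIR:
      return ExistsResult::kNotFound;
    default:
      // The assert inspects errno, not the -1 returned by access().
      // Anything other than EIO (e.g. EROFS, EFAULT, EINVAL) aborts in
      // debug builds so the problem is noticed during testing.
      assert(err == EIO);
      return ExistsResult::kIOError;
  }
}
```

Since `access` returns -1 for every failure, an assert on the return value cannot distinguish between error causes; `errno` is the only place that information is available.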
    • Fix benchmark script with vector memtable (#4428) · d56070d8
      Andrew Kryczka authored
      I guess we didn't update this script when `--allow_concurrent_memtable_write` became true by default.
      Fixes #4413.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4428
      Differential Revision: D10036452
      Pulled By: ajkr
      fbshipit-source-id: f464be0642bd096d9040f82cdc3eae614a902183
    • Improve log handling when recover without flush (#4405) · dc813e4b
      Yi Wu authored
      Improve log handling when avoid_flush_during_recovery=true.
      1. restore total_log_size_ after recovery, by summing up existing log sizes. Fixes #4253.
      2. truncate the last existing log, since this log can contain preallocated space and it would be a waste to keep that space. This avoids the case where a crash loop in the user application causes many logs of non-trivial size to be created, ultimately taking up all disk space.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4405
      Differential Revision: D9953933
      Pulled By: yiwu-arbug
      fbshipit-source-id: 967780fee8acec7f358b6eb65190fb4684f82e56
  11. 26 Sep, 2018 2 commits
    • Handle tombstones at the same seqno in the CollapsedRangeDelMap (#4424) · 17edc82a
      Nikhil Benesch authored
      The CollapsedRangeDelMap was entirely mishandling tombstones at the same
      sequence number when the tombstones did not have identical start and end
      keys. Such tombstones are common since 90fc4069, which causes
      tombstones to be split during compactions.
      For example, if the tombstone [a, c) @ 1 lies across a compaction
      boundary at b, it will be split into [a, b) @ 1 and [b, c) @ 1. Without
      this patch, the collapsed range deletion map would look like this:
        a -> 1
        b -> 1
        c -> 0
      Notice how the b -> 1 entry is redundant. When the tombstones overlap,
      the problem is even worse. Consider tombstones [a, c) @ 1 and [b, d) @
      1, which produces this map without this patch:
        a -> 1
        b -> 1
        c -> 0
        d -> 0
      This map is corrupt, as a map can never contain adjacent sentinel (zero)
      entries. When the iterator advances from b to c, it will notice that c
      is a sentinel entry and skip to d--but d is also a sentinel entry! Asking
      what tombstone this iterator points to will trigger an assertion, as it
      is not pointing to a valid tombstone.
      /cc ajkr
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4424
      Differential Revision: D10039248
      Pulled By: abhimadan
      fbshipit-source-id: 6d737c1e88d60e80cf27286726627ba44463e7f4
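The invariants described above (no redundant entry repeating the previous sequence number, and never two adjacent sentinel entries) can be checked with a small from-scratch construction. This sketch is not the `CollapsedRangeDelMap` algorithm: `Tombstone` and `Collapse` are hypothetical names, plain strings stand in for keys, and the quadratic loop is chosen for clarity rather than speed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// A range tombstone covering [start, end) at a given sequence number.
struct Tombstone {
  std::string start;
  std::string end;
  uint64_t seqno;
};

// Builds a map where each entry k -> s means "keys from k up to the next
// entry are covered at seqno s"; s == 0 is the sentinel for "not covered".
std::map<std::string, uint64_t> Collapse(const std::vector<Tombstone>& ts) {
  std::map<std::string, uint64_t> result;

  // Every start and end key is a potential boundary.
  for (const auto& t : ts) {
    assert(t.start < t.end);
    result[t.start] = 0;
    result[t.end] = 0;
  }

  // The seqno at a boundary is the largest seqno of any tombstone covering it.
  for (auto& entry : result) {
    for (const auto& t : ts) {
      if (t.start <= entry.first && entry.first < t.end) {
        entry.second = std::max(entry.second, t.seqno);
      }
    }
  }

  // Drop entries that repeat the previous value; this removes redundant
  // entries and guarantees that sentinels are never adjacent.
  uint64_t prev = 0;
  for (auto it = result.begin(); it != result.end();) {
    if (it->second == prev) {
      it = result.erase(it);
    } else {
      prev = it->second;
      ++it;
    }
  }
  return result;
}
```

For the split tombstones [a, b) @ 1 and [b, c) @ 1 this yields a -> 1, c -> 0, and for the overlapping tombstones [a, c) @ 1 and [b, d) @ 1 it yields a -> 1, d -> 0, so neither the redundant b -> 1 entry nor the adjacent sentinels appear.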
    • Update TARGETS file template (#4426) · 31d46993
      Yi Wu authored
      Update template of TARGETS file according to recent changes in #4371 , #4363 and https://github.com/facebook/rocksdb/commit/dbf44c314b4adf3276afc1ca797b88944ca3162c.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4426
      Differential Revision: D10025053
      Pulled By: yiwu-arbug
      fbshipit-source-id: e6a0a702bfd401fc1af240ee446f5690f0bcd85d
  12. 22 Sep, 2018 1 commit
    • Improve RangeDelAggregator benchmarks (#4395) · 3c350a7c
      Abhishek Madan authored
      Improve time measurements for AddTombstones to only include the
      call and not the VectorIterator setup. Also add a new
      add_tombstones_per_run flag to call AddTombstones multiple times per
      aggregator, which will help simulate more realistic workloads.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4395
      Differential Revision: D9996811
      Pulled By: abhimadan
      fbshipit-source-id: 5865a95c323fbd9b3606493013664b4890fe5a02
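The measurement change described above, timing only the call under test and repeating it several times per aggregator, can be sketched generically. `FakeAggregator`, `RunResult`, and `RunOnce` are hypothetical names, and a vector of integers stands in for the iterator setup; none of this is taken from the actual benchmark.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the object being benchmarked.
struct FakeAggregator {
  std::size_t tombstones = 0;
  void AddTombstones(const std::vector<int>& batch) {
    tombstones += batch.size();
  }
};

struct RunResult {
  std::chrono::nanoseconds add_time{0};  // time spent inside AddTombstones only
  std::size_t tombstones = 0;
};

// One benchmark run: a single aggregator receives several batches.
RunResult RunOnce(int add_calls_per_run, std::size_t batch_size) {
  assert(add_calls_per_run > 0);
  FakeAggregator agg;
  RunResult result;
  for (int i = 0; i < add_calls_per_run; ++i) {
    // Setup happens outside the timed region.
    std::vector<int> batch(batch_size, i);

    const auto start = std::chrono::steady_clock::now();
    agg.AddTombstones(batch);
    const auto stop = std::chrono::steady_clock::now();

    result.add_time +=
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
  }
  result.tombstones = agg.tombstones;
  return result;
}
```

Keeping the clock reads directly around the call means that changes in setup cost do not show up as changes in the reported time.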
  13. 21 Sep, 2018 2 commits
  14. 20 Sep, 2018 3 commits
  15. 19 Sep, 2018 1 commit