1. 21 Apr, 2018 1 commit
  2. 10 Apr, 2018 3 commits
  3. 24 Mar, 2018 3 commits
  4. 17 Mar, 2018 2 commits
    • update history and bump patch number · b4fc156f
      Andrew Kryczka authored
    • Fix WAL corruption from checkpoint/backup race condition · 34ccab02
      Andrew Kryczka authored
      `Writer::WriteBuffer` was always called at the beginning of checkpoint/backup. But that log writer has no internal synchronization, so under a race the same buffer could be flushed twice, duplicating a WAL entry. Subsequent WAL entries would then land at unexpected offsets and overlap the 32KB block boundaries, which manifested as a corruption.
      This PR fixes the behavior to only use `WriteBuffer` (via `FlushWAL`) in checkpoint/backup when manual WAL flush is enabled. In that case, users are responsible for providing synchronization between WAL flushes. We can also consider removing the call entirely.
      Closes https://github.com/facebook/rocksdb/pull/3603
      Differential Revision: D7277447
      Pulled By: ajkr
      fbshipit-source-id: 1b15bd7fd930511222b075418c10de0aaa70a35a
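The double-flush hazard described above can be illustrated with a toy buffered writer. This is a minimal sketch, not RocksDB's log writer: `ToyLogWriter`, `Append`, `Flush`, and `FileContents` are hypothetical names, and the mutex stands in for the synchronization that callers are expected to provide between WAL flushes.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical stand-in for a buffered log writer; not RocksDB's log::Writer.
class ToyLogWriter {
 public:
  void Append(const std::string& record) {
    assert(!record.empty());  // empty records are not meaningful in this toy
    std::lock_guard<std::mutex> guard(mu_);
    buffer_ += record;
  }

  // Moves the pending bytes to the "file" exactly once. Writing and clearing
  // the buffer inside one critical section means a second Flush() finds
  // nothing left to write, so no entry is duplicated.
  void Flush() {
    std::lock_guard<std::mutex> guard(mu_);
    file_ += buffer_;
    buffer_.clear();
  }

  std::string FileContents() const {
    std::lock_guard<std::mutex> guard(mu_);
    return file_;
  }

 private:
  mutable std::mutex mu_;
  std::string buffer_;  // bytes appended but not yet flushed
  std::string file_;    // bytes already written out
};
```

Without the lock, two callers could both read the same `buffer_` contents before either clears it, which is the duplication the commit message describes.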
  5. 15 Mar, 2018 4 commits
  6. 09 Mar, 2018 1 commit
  7. 01 Mar, 2018 2 commits
  8. 28 Feb, 2018 2 commits
    • skip CompactRange flush based on memtable contents · 3ae00472
      Andrew Kryczka authored
      CompactRange has a call to Flush because we guarantee that, at the time it's called, all existing keys in the range will be pushed through the user's compaction filter. However, previously the flush was done blindly, so it happened even if the memtable did not contain keys in the range specified by the user. This created unnecessarily many L0 files, leading to write stalls in some cases. This PR checks the memtable's contents and flushes only if they overlap with `CompactRange`'s range.
      - Move the memtable overlap check logic from `ExternalSstFileIngestionJob` to `ColumnFamilyData::RangesOverlapWithMemtables`
      - Reuse the above logic in `CompactRange` and skip flushing if no overlap
      Closes https://github.com/facebook/rocksdb/pull/3520
      Differential Revision: D7018897
      Pulled By: ajkr
      fbshipit-source-id: a3c6b1cfae56687b49dd89ccac7c948e53545934
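The overlap check can be sketched with an ordered set standing in for the memtable. This is an illustration under stated assumptions, not the code in `ColumnFamilyData::RangesOverlapWithMemtables`: `ToyMemtable` and both function names are hypothetical, and the range is treated as inclusive on both ends.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a memtable: just an ordered set of user keys.
using ToyMemtable = std::set<std::string>;

// Returns true if any key in `mem` lies in the inclusive range [begin, end].
bool RangeOverlapsMemtable(const ToyMemtable& mem, const std::string& begin,
                           const std::string& end) {
  assert(begin <= end);
  // First key >= begin; it overlaps only if it is also <= end.
  auto it = mem.lower_bound(begin);
  return it != mem.end() && *it <= end;
}

// Decides whether a CompactRange-style call needs to flush first.
bool ShouldFlushBeforeCompact(const ToyMemtable& mem, const std::string& begin,
                              const std::string& end) {
  return RangeOverlapsMemtable(mem, begin, end);
}
```

With keys `apple` and `mango` in the toy memtable, a range of `c` to `d` finds no overlap, so the flush (and the extra L0 file) would be skipped.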
    • Update comments in DB::Close() · c287c098
      Siying Dong authored
      Summary: Closes https://github.com/facebook/rocksdb/pull/3543
      Differential Revision: D7093251
      Pulled By: siying
      fbshipit-source-id: 4066b82c95ecb65866c5842d68ab13ab9f85d567
  9. 27 Feb, 2018 3 commits
    • Adding CentOS 7 Vagrantfile & build script · d6336563
      Istvan Szukacs authored
      I have updated the Vagrantfile to have an entry for CentOS 7. Also created a simple build script which is pretty similar to the one in Beringei.
      How to test:
      vagrant up centos7
      Implement -j X for the build.
      Closes https://github.com/facebook/rocksdb/pull/3530
      Differential Revision: D7090739
      Pulled By: ajkr
      fbshipit-source-id: 9f9eda5b507568993543d08de7ce168dfc12282e
    • DB:Open should fail on tmpfs when use_direct_reads=true · ad05cbb1
      Zhongyi Xie authored
      > $ TEST_TMPDIR=/dev/shm ./db_bench -use_direct_reads=true -benchmarks=readrandomwriterandom -num=10000000 -reads=100000 -write_buffer_size=1048576 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -max_background_jobs=12 -readwritepercent=50 -key_size=16 -value_size=48 -threads=32
      DB path: [/dev/shm/dbbench]
      put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument
      put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument
      put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument
      put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument
      put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument
      put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument
      put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument
      put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument
      put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument
      db_bench: tpp.c:84: __pthread_tpp_change_priority: Assertion `new_prio == -1 || (new_prio >= fifo_min_prio && new_prio <= fifo_max_prio)' failed.
      put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument
      put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument
      > TEST_TMPDIR=/dev/shm ./db_bench -use_direct_reads=true -benchmarks=readrandomwriterandom -num=10000000 -reads=100000 -write_buffer_size=1048576 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -max_background_jobs=12 -readwritepercent=50 -key_size=16 -value_size=48 -threads=32
      Initializing RocksDB Options from the specified file
      Initializing RocksDB Options from command-line flags
      open error: Not implemented: Direct I/O is not supported by the specified DB.
      Closes https://github.com/facebook/rocksdb/pull/3539
      Differential Revision: D7082658
      Pulled By: miasantreble
      fbshipit-source-id: f9d9c6ec3b5e9e049cab52154940ee101ba4d342
    • Fix a memory leak in WindowsThread · 7eb292da
      Dmitri Smirnov authored
      _endthreadex does not return, and thus destructors
        for stack objects do not run. This creates a memory leak.
        We remove the calls since _endthreadex is called automatically after the
        threadproc returns, i.e. when the thread exits.
      Closes https://github.com/facebook/rocksdb/pull/3542
      Differential Revision: D7088713
      Pulled By: ajkr
      fbshipit-source-id: 749ecafc6a9572f587f76e516547e07734349a54
  10. 24 Feb, 2018 2 commits
  11. 23 Feb, 2018 5 commits
  12. 22 Feb, 2018 2 commits
    • BackupEngine gluster-friendly file naming convention · b0929776
      Andrew Kryczka authored
      Use the rsync tempfile naming convention in our `BackupEngine`. The temp file follows the format, `.<filename>.<suffix>`, which is later renamed to `<filename>`. We fix `tmp` as the `<suffix>` as we don't need to use random bytes for now. The benefit is gluster treats this tempfile naming convention specially and applies hashing only to `<filename>`, so the file won't need to be linked or moved when it's renamed. Our gluster team suggested this will make things operationally easier.
      Closes https://github.com/facebook/rocksdb/pull/3463
      Differential Revision: D6893333
      Pulled By: ajkr
      fbshipit-source-id: fd7622978f4b2487fce33cde40dd3124f16bcaa8
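The naming convention itself is simple enough to sketch. The helpers below are hypothetical (they are not `BackupEngine` functions) and assume a plain file name with no directory component; only the `.<filename>.<suffix>` format and the fixed `tmp` suffix come from the commit message.

```cpp
#include <cassert>
#include <string>

// Builds the rsync-style temporary name ".<filename>.<suffix>" for a plain
// file name. The fixed "tmp" suffix follows the commit message; the function
// name is hypothetical.
std::string TempFileName(const std::string& filename) {
  assert(!filename.empty());
  assert(filename.find('/') == std::string::npos);
  return "." + filename + ".tmp";
}

// Recovers the final name that the temporary file is later renamed to.
std::string FinalFileName(const std::string& temp_name) {
  const std::string suffix = ".tmp";
  assert(temp_name.size() > suffix.size() + 1 && temp_name.front() == '.');
  // Drop the leading '.' and the trailing ".tmp".
  return temp_name.substr(1, temp_name.size() - 1 - suffix.size());
}
```

For example, `TempFileName("000007.sst")` yields `.000007.sst.tmp`, and `FinalFileName` maps it back to `000007.sst`.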
    • WritePrepared Txn: fix non-emptied PreparedHeap bug · 828211e9
      Maysam Yabandeh authored
      Under a certain sequence of accesses to PreparedHeap, a bug prevented the heap from being emptied. This would result in performance issues when the heap content is moved to old_prepared_ after max_evicted_seq_ advances the orphan prepared sequence numbers. The patch fixes the bug and adds more unit tests. It also adds more logging for when the unlikely scenarios are encountered.
      Closes https://github.com/facebook/rocksdb/pull/3526
      Differential Revision: D7038486
      Pulled By: maysamyabandeh
      fbshipit-source-id: f1e40bea558f67b03d2a29131fcb8734c65fce97
  13. 21 Feb, 2018 4 commits
    • Add rocksdb.iterator.internal-key property · 8ada876d
      Sagar Vemuri authored
      Added a new iterator property: `rocksdb.iterator.internal-key` to get the internal-key (converted to user key) at which the iterator stopped.
      Closes https://github.com/facebook/rocksdb/pull/3525
      Differential Revision: D7033694
      Pulled By: sagar0
      fbshipit-source-id: d51e6c00f5e9d766c6276ef79774b81c6c5216f8
    • save redundant key lookup in map of locked keys · e9c31ab1
      jsteemann authored
      In case it is found that a key is already marked as locked in a
      stripe's map of locked keys, it is not necessary to look it up
      again using `std::unordered_map<std::string, ...>::at(const std::string&)`.
      Instead, we can use the already found position using the iterator
      produced by the previous `find` operation. Reusing the iterator
      will avoid having to hash the key again and do additional "random"
      memory lookups in the map of keys (though the data will very
      likely sit available in caches here already due to the previous
      find operation)
      Closes https://github.com/facebook/rocksdb/pull/3505
      Differential Revision: D7036446
      Pulled By: sagar0
      fbshipit-source-id: cced51547b2bd2d49394f6bc8c5896f09fa80f68
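The find-then-reuse pattern can be sketched as follows. `ToyLockInfo`, `ToyLockMap`, and `TryLock` are hypothetical and far simpler than the transaction lock manager; the point is only that the iterator returned by `find` is dereferenced directly instead of calling `at(key)` afterwards.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical per-key lock record; a real lock manager stores more state.
struct ToyLockInfo {
  uint64_t owner_txn_id;
};

using ToyLockMap = std::unordered_map<std::string, ToyLockInfo>;

// Tries to lock `key` for `txn_id`. Returns true if the transaction holds
// the lock afterwards.
bool TryLock(ToyLockMap& locks, const std::string& key, uint64_t txn_id) {
  assert(txn_id != 0);  // 0 is reserved for "no owner" in this sketch
  auto it = locks.find(key);
  if (it != locks.end()) {
    // Already locked: reuse the iterator rather than calling locks.at(key),
    // which would hash the key and walk the bucket a second time.
    return it->second.owner_txn_id == txn_id;
  }
  locks.emplace(key, ToyLockInfo{txn_id});
  return true;
}
```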
    • fix handling of empty string as checkpoint directory · 1960e73e
      Andrew Kryczka authored
      - made `CreateCheckpoint` properly return `InvalidArgument` when called with an empty directory. Previously it triggered an assertion failure due to a bug in the logic.
      - made `ldb` set empty `checkpoint_dir` if that's what the user specifies, so that we can use it to properly test `CreateCheckpoint` in the future.
      Differential Revision: D6874562
      fbshipit-source-id: dcc1bd41768261d9338987fa7711444289707ed7
    • fix shift UBSAN error in col_buf_encoder.cc · 5263da63
      Igor Sugak authored
      Add a static cast so that the left shift is performed on an unsigned type.
      make ubsan_check
      Closes https://github.com/facebook/rocksdb/pull/3517
      Reviewed By: sagar0
      Differential Revision: D7016044
      Pulled By: igorsugak
      fbshipit-source-id: baf72f6197edd8f7220d010b15a23d6de6a72c49
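The kind of fix described can be sketched in isolation. The actual expression in `col_buf_encoder.cc` is not shown in this log, so `PlaceByte` below is a hypothetical example of the same idea: cast to an unsigned 64-bit type before shifting.

```cpp
#include <cassert>
#include <cstdint>

// Places `byte` at bit offset `shift` of a 64-bit word.
// Without the cast, `byte` is promoted to (signed) int before the shift, and
// a shift count of 32 or more on a 32-bit int is undefined behavior.
uint64_t PlaceByte(uint8_t byte, unsigned shift) {
  assert(shift <= 56);  // keep the whole byte inside the 64-bit word
  return static_cast<uint64_t>(byte) << shift;
}
```

For instance, `PlaceByte(0xFF, 56)` produces `0xFF00000000000000` with well-defined behavior, which the uncast expression `byte << 56` would not.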
  14. 17 Feb, 2018 3 commits
    • Fix build with USE_RTTI=0 · ab446dc2
      Po-Chuan Hsieh authored
      utilities/column_aware_encoding_util.cc:61:23: error: cannot use dynamic_cast with -fno-rtti
      1 error generated.
      It was added as a [local patch](https://svnweb.freebsd.org/ports/head/databases/rocksdb/files/patch-utilities-column_aware_encoding_util.cc) on FreeBSD since RocksDB 5.8.
      It also fixes #2707.
      Closes https://github.com/facebook/rocksdb/pull/3514
      Differential Revision: D7005571
      Pulled By: siying
      fbshipit-source-id: 351a9055d21d0accdd7a932e8e7bfcd3c8e22068
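One common way to keep a downcast buildable with `-fno-rtti` is to use `static_cast` and only verify it with `dynamic_cast` when RTTI is available. The sketch below illustrates that approach; it is not necessarily what this commit does. `TOY_USE_RTTI`, `CheckedDownCast`, and the two structs are hypothetical names.

```cpp
#include <cassert>

// TOY_USE_RTTI is a hypothetical macro standing in for a build-system switch.
template <class Derived, class Base>
Derived* CheckedDownCast(Base* base) {
#ifdef TOY_USE_RTTI
  // With RTTI available, verify the dynamic type in debug builds.
  assert(dynamic_cast<Derived*>(base) == base);
#endif
  // Compiles with or without RTTI; the caller must know the real type.
  return static_cast<Derived*>(base);
}

struct ToyEncoder {
  virtual ~ToyEncoder() = default;
};

struct ToyColumnEncoder : ToyEncoder {
  int column = 3;
};
```

The trade-off is that a build without RTTI loses the runtime type check, so the cast is only as safe as the caller's knowledge of the object's real type.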
    • WritePrepared Txn: optimizations for sysbench update_noindex · c178da05
      Maysam Yabandeh authored
      These are optimizations that we applied to improve sysbench's update_noindex performance.
      1. Make use of LIKELY compiler hint
      2. Move std::atomic to the subclass
      3. Make use of skip_prepared in non-2pc transactions.
      Closes https://github.com/facebook/rocksdb/pull/3512
      Differential Revision: D7000075
      Pulled By: maysamyabandeh
      fbshipit-source-id: 1ab8292584df1f6305a4992973fb1b7933632181
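A branch-prediction hint of the kind mentioned in item 1 is typically a thin macro over a compiler builtin. The sketch below uses hypothetical `TOY_LIKELY`/`TOY_UNLIKELY` macros and a made-up `IsVisible` check; it does not reproduce the hot paths touched by this commit.

```cpp
#include <cassert>
#include <cstdint>

// Branch-prediction hints. __builtin_expect is a GCC/Clang builtin; other
// compilers fall back to the plain condition.
#if defined(__GNUC__) || defined(__clang__)
#define TOY_LIKELY(x) (__builtin_expect(!!(x), 1))
#define TOY_UNLIKELY(x) (__builtin_expect(!!(x), 0))
#else
#define TOY_LIKELY(x) (x)
#define TOY_UNLIKELY(x) (x)
#endif

// Hypothetical hot-path check: most sequence numbers are assumed to be at or
// below the published one, so that branch is marked as the likely one.
bool IsVisible(uint64_t seq, uint64_t published_seq) {
  if (TOY_LIKELY(seq <= published_seq)) {
    return true;
  }
  return false;
}
```

The hint changes only code layout and prediction, never the result, so it is worth adding only where profiling shows a heavily skewed branch.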
    • Fix deadlock in ColumnFamilyData::InstallSuperVersion() · 97307d88
      Mike Kolupaev authored
      Deadlock: a memtable flush holds DB::mutex_ and calls ThreadLocalPtr::Scrape(), which locks ThreadLocalPtr mutex; meanwhile, a thread exit handler locks ThreadLocalPtr mutex and calls SuperVersionUnrefHandle, which tries to lock DB::mutex_.
      This deadlock is hit all the time on our workload. It blocks our release.
      In general, the problem is that ThreadLocalPtr takes an arbitrary callback and calls it while holding a lock on a global mutex. The same global mutex is (at least in some cases) locked by almost all ThreadLocalPtr methods, on any instance of ThreadLocalPtr. So, there'll be a deadlock if the callback tries to do anything to any instance of ThreadLocalPtr, or waits for another thread to do so.
      So, probably the only safe way to use ThreadLocalPtr callbacks is to do only simple and lock-free things in them.
      This PR fixes the deadlock by making sure that local_sv_ never holds the last reference to a SuperVersion, and therefore SuperVersionUnrefHandle never has to do any nontrivial cleanup.
      I also searched for other uses of ThreadLocalPtr to see if they may have similar bugs. There's only one other use, in transaction_lock_mgr.cc, and it looks fine.
      Closes https://github.com/facebook/rocksdb/pull/3510
      Reviewed By: sagar0
      Differential Revision: D7005346
      Pulled By: al13n321
      fbshipit-source-id: 37575591b84f07a891d6659e87e784660fde815f
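The invariant behind the fix (a thread-local copy never holds the last reference) can be sketched with `std::shared_ptr`. This is an illustration only: RocksDB uses its own reference counting, and `ToySuperVersion`, `ToyOwner`, and `ThreadExitHandler` are hypothetical names.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a SuperVersion with expensive cleanup. Imagine
// the destructor needs the DB mutex; here it only bumps a counter.
struct ToySuperVersion {
  explicit ToySuperVersion(int* cleanup_counter) : cleanups(cleanup_counter) {}
  ~ToySuperVersion() { ++*cleanups; }
  int* cleanups;
};

// The owner (think: column family) always keeps its own reference, so a
// thread-local copy is never the last one.
struct ToyOwner {
  std::shared_ptr<ToySuperVersion> current;
};

// What a thread-exit handler does in this sketch: drop the thread-local
// reference. Because the owner still holds one, no cleanup runs here and no
// further lock has to be taken inside the callback.
void ThreadExitHandler(std::shared_ptr<ToySuperVersion>* thread_local_copy) {
  assert(thread_local_copy->use_count() > 1);  // never the last reference
  thread_local_copy->reset();
}
```

Cleanup then happens only when the owner drops its reference, on a code path where taking the heavier lock is safe.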
  15. 16 Feb, 2018 3 commits
    • fix advance reservation of arena block addresses · 0454f781
      Andrew Kryczka authored
      Calling `std::vector::reserve()` causes memory to be reallocated and then data to be moved. It was called prior to adding every block. This reallocation could be done a huge number of times, e.g., for users with large index blocks.
      Instead, we can simply use `std::vector::emplace_back()` in such a way that preserves the no-memory-leak guarantee, while letting the vector decide when to reallocate space. Now I see reallocation/moving happen O(logN) times, rather than O(N) times, where N is the final size of vector.
      Closes https://github.com/facebook/rocksdb/pull/3508
      Differential Revision: D6994228
      Pulled By: ajkr
      fbshipit-source-id: ab7c11e13ff37c8c6c8249be7a79566a4068cd27
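The approach described above can be sketched with a miniature arena. `ToyArena` is hypothetical and uses `std::unique_ptr` for brevity, which may differ from the real arena; the part taken from the commit message is the ordering: grow the vector with `emplace_back()` first, then allocate the block and store it in the slot that already exists.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical miniature arena that owns a list of heap blocks.
class ToyArena {
 public:
  // Allocates a new block and returns its address.
  char* AllocateBlock(std::size_t bytes) {
    assert(bytes > 0);
    const std::size_t old_capacity = blocks_.capacity();
    // Grow the vector first. If this throws, no block has been allocated
    // yet, so nothing can leak; the vector chooses its own growth policy.
    blocks_.emplace_back(nullptr);
    if (blocks_.capacity() != old_capacity) {
      ++reallocations_;
    }
    // Then allocate and store into the slot that already exists.
    blocks_.back().reset(new char[bytes]);
    return blocks_.back().get();
  }

  std::size_t num_blocks() const { return blocks_.size(); }
  std::size_t reallocations() const { return reallocations_; }

 private:
  std::vector<std::unique_ptr<char[]>> blocks_;
  std::size_t reallocations_ = 0;  // times the vector's storage changed
};
```

After many calls to `AllocateBlock`, `reallocations()` stays far below `num_blocks()` on common standard library implementations, in line with the O(logN) behavior reported in the commit message.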
    • Legocastle job to report lite build binary size to scuba · 989d1231
      Yi Wu authored
      Add a legocastle job to continuously build the last 10 commits every 4 hours and report lite build binary size to scuba.
      Closes https://github.com/facebook/rocksdb/pull/3511
      Differential Revision: D7001730
      Pulled By: yiwu-arbug
      fbshipit-source-id: 7c8ca87c46d663c786a0d32be69ebbe7b19a5eb9
    • Unbreak MemTableRep API change · 8eb1d445
      Maysam Yabandeh authored
      The MemTableRep API was broken by this commit: 813719e9
      This patch reverts the changes and instead adds InsertKey (etc.) overloads to extend the MemTableRep API without breaking the existing classes that inherit from it.
      Closes https://github.com/facebook/rocksdb/pull/3513
      Differential Revision: D7004134
      Pulled By: maysamyabandeh
      fbshipit-source-id: e568d91fe1e17dd76c0c1f6c7dd51a18633b1c4f
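Extending a base class without breaking existing subclasses usually means adding a new virtual method whose default implementation forwards to the old one. The sketch below shows that pattern with hypothetical, heavily simplified types; the signatures are assumptions and do not mirror the real `MemTableRep` interface.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for a memtable representation.
class ToyMemTableRep {
 public:
  virtual ~ToyMemTableRep() = default;

  // Original entry point: existing subclasses override this.
  virtual void Insert(const std::string& key) = 0;

  // Newer entry point that returns a success flag. The default forwards to
  // Insert(), so subclasses written against the old API keep compiling and
  // working unchanged.
  virtual bool InsertKey(const std::string& key) {
    Insert(key);
    return true;
  }
};

// A "legacy" subclass that only knows about Insert().
class LegacyVectorRep : public ToyMemTableRep {
 public:
  void Insert(const std::string& key) override { keys.push_back(key); }
  std::vector<std::string> keys;
};
```

Callers can switch to `InsertKey` immediately, while implementations that care about the new return value override it at their own pace.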