1. 06 Jun, 2020 1 commit
    • Misc things for ASSERT_STATUS_CHECKED, also gcc 4.8.5 (#6871) · 1482c869
      Peter Dillinger authored
      Summary:
      * Print stack trace on status checked failure
      * Make folly_synchronization_distributed_mutex_test a parallel test
      * Disable ldb_test.py and rocksdb_dump_test.sh with
        ASSERT_STATUS_CHECKED (broken)
      * Fix shadow warning in random_access_file_reader.h reported by gcc
        4.8.5 (ROCKSDB_NO_FBCODE), also https://github.com/facebook/rocksdb/issues/6866
      * Work around compiler bug on max_align_t for gcc < 4.9
      * Remove an apparently wrong comment in status.h
      * Use check_some in Travis config (for proper diagnostic output)
      * Fix ignored Status in loop in options_helper.cc
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6871
      
      Test Plan: manual, CI
      
      Reviewed By: ajkr
      
      Differential Revision: D21706619
      
      Pulled By: pdillinger
      
      fbshipit-source-id: daf6364173d6689904eb394461a69a11f5bee2cb
  2. 01 May, 2020 1 commit
    • Make users explicitly be aware of prepare before commit (#6775) · ef0c3eda
      Cheng Chang authored
      Summary:
      In the current commit protocol of pessimistic transactions, if the transaction is not prepared before commit, the commit protocol implicitly assumes that the user wants to commit without prepare.
      
      This PR adds TransactionOptions::skip_prepare. The default value is `true`, because if it defaulted to `false`, all existing users who commit without prepare would need to update their code to set skip_prepare to true. Although this does not force users to state their intention to skip prepare explicitly, it at least makes them aware of the assumption that a transaction can commit without prepare.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6775
      
      Test Plan: added a new unit test TransactionTest::CommitWithoutPrepare
      
      Reviewed By: lth
      
      Differential Revision: D21313270
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: 3d95b7c9b2d6cdddc09bdd66c561bc4fae8c3251
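The effect of the option can be illustrated with a small stand-alone sketch. `ToyTransaction`, `ToyTransactionOptions` and `ToyStatus` are hypothetical stand-ins, not the RocksDB classes, and rejecting an unprepared commit when `skip_prepare` is false is an assumption about the intended behaviour rather than the PR's actual code.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the RocksDB types.
enum class ToyStatus { kOk, kTxnNotPrepared };

struct ToyTransactionOptions {
  // Same default as described in the summary: committing without
  // prepare stays allowed unless the caller opts out.
  bool skip_prepare = true;
};

class ToyTransaction {
 public:
  explicit ToyTransaction(const ToyTransactionOptions& options)
      : skip_prepare_(options.skip_prepare) {}

  void Prepare() {
    assert(!committed_);  // preparing after commit makes no sense here
    prepared_ = true;
  }

  // Assumed behaviour: an unprepared commit is rejected only when the
  // caller set skip_prepare to false.
  ToyStatus Commit() {
    if (!prepared_ && !skip_prepare_) {
      return ToyStatus::kTxnNotPrepared;
    }
    committed_ = true;
    return ToyStatus::kOk;
  }

  bool committed() const { return committed_; }

 private:
  bool skip_prepare_;
  bool prepared_ = false;
  bool committed_ = false;
};
```

With the default options an unprepared `Commit()` succeeds, so existing callers keep working; only callers who set `skip_prepare = false` see the new error.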
  3. 28 Apr, 2020 1 commit
    • Stats for redundant insertions into block cache (#6681) · 249eff0f
      Peter Dillinger authored
      Summary:
      Since read threads do not coordinate on loading data into block
      cache, two threads between Lookup and Insert can end up loading and
      inserting the same data. This is particularly concerning with
      cache_index_and_filter_blocks since those are hot and more likely to
      be race targets if ejected from (or not pre-populated in) the cache.
      
      Particularly with moves toward disaggregated / network storage, the cost
      of redundant retrieval might be high, and we should at least have some
      hard statistics from which we can estimate impact.
      
      Example with full filter thrashing "cliff":
      
          $ ./db_bench --benchmarks=fillrandom --num=15000000 --cache_index_and_filter_blocks -bloom_bits=10
          ...
          $ ./db_bench --db=/tmp/rocksdbtest-172704/dbbench --use_existing_db --benchmarks=readrandom,stats --num=200000 --cache_index_and_filter_blocks --cache_size=$((130 * 1024 * 1024)) --bloom_bits=10 --threads=16 -statistics 2>&1 | egrep '^rocksdb.block.cache.(.*add|.*redundant)' | grep -v compress | sort
          rocksdb.block.cache.add COUNT : 14181
          rocksdb.block.cache.add.failures COUNT : 0
          rocksdb.block.cache.add.redundant COUNT : 476
          rocksdb.block.cache.data.add COUNT : 12749
          rocksdb.block.cache.data.add.redundant COUNT : 18
          rocksdb.block.cache.filter.add COUNT : 1003
          rocksdb.block.cache.filter.add.redundant COUNT : 217
          rocksdb.block.cache.index.add COUNT : 429
          rocksdb.block.cache.index.add.redundant COUNT : 241
          $ ./db_bench --db=/tmp/rocksdbtest-172704/dbbench --use_existing_db --benchmarks=readrandom,stats --num=200000 --cache_index_and_filter_blocks --cache_size=$((120 * 1024 * 1024)) --bloom_bits=10 --threads=16 -statistics 2>&1 | egrep '^rocksdb.block.cache.(.*add|.*redundant)' | grep -v compress | sort
          rocksdb.block.cache.add COUNT : 1182223
          rocksdb.block.cache.add.failures COUNT : 0
          rocksdb.block.cache.add.redundant COUNT : 302728
          rocksdb.block.cache.data.add COUNT : 31425
          rocksdb.block.cache.data.add.redundant COUNT : 12
          rocksdb.block.cache.filter.add COUNT : 795455
          rocksdb.block.cache.filter.add.redundant COUNT : 130238
          rocksdb.block.cache.index.add COUNT : 355343
          rocksdb.block.cache.index.add.redundant COUNT : 172478
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6681
      
      Test Plan: Some manual testing (above) and unit test covering key metrics is included
      
      Reviewed By: ltamasi
      
      Differential Revision: D21134113
      
      Pulled By: pdillinger
      
      fbshipit-source-id: c11497b5f00f4ffdfe919823904e52d0a1a91d87
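The race described in the summary can be replayed on a single thread. The sketch below is a toy cache, not the RocksDB block cache: `ToyBlockCache`, `ToyCacheStats` and `SimulateRace` are hypothetical names, and it assumes that a redundant insert is also counted as an ordinary add.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical counters, loosely named after the tickers shown above.
struct ToyCacheStats {
  uint64_t add = 0;
  uint64_t add_redundant = 0;
};

class ToyBlockCache {
 public:
  bool Lookup(const std::string& key) const {
    return blocks_.count(key) > 0;
  }

  // Assumption: a redundant insert still counts as an add.
  void Insert(const std::string& key, const std::string& block) {
    ++stats_.add;
    if (blocks_.count(key) > 0) {
      ++stats_.add_redundant;  // another reader loaded it first
    }
    blocks_[key] = block;
  }

  const ToyCacheStats& stats() const { return stats_; }

 private:
  std::unordered_map<std::string, std::string> blocks_;
  ToyCacheStats stats_;
};

// Replays the race on one thread: both readers miss before either inserts.
inline ToyCacheStats SimulateRace() {
  ToyBlockCache cache;
  const bool reader1_hit = cache.Lookup("filter-block");
  const bool reader2_hit = cache.Lookup("filter-block");
  assert(!reader1_hit && !reader2_hit);
  cache.Insert("filter-block", "payload");  // reader 1 loads and inserts
  cache.Insert("filter-block", "payload");  // reader 2 repeats the work
  return cache.stats();
}
```

The second insert is the wasted load: the block was read from storage twice although only one copy ends up in the cache.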
  4. 21 Feb, 2020 1 commit
    • Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) · fdf882de
      sdong authored
      Summary:
      When dynamically linking two binaries together, different builds of RocksDB from two sources might cause errors. To give users a tool to solve the problem, the RocksDB namespace is changed to a flag which can be overridden at build time.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6433
      
      Test Plan: Build release, all and jtest. Try to build with ROCKSDB_NAMESPACE with another flag.
      
      Differential Revision: D19977691
      
      fbshipit-source-id: aa7f2d0972e1c31d75339ac48478f34f6cfcfb3e
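A minimal sketch of the general technique, assuming a preprocessor macro supplies the namespace name. `TOY_NAMESPACE`, `toydb` and `NamespaceName` are hypothetical names used only here; the sketch does not reproduce the RocksDB headers.

```cpp
#include <cassert>
#include <string>

// TOY_NAMESPACE is a hypothetical macro used only in this sketch.
// A build could pass -DTOY_NAMESPACE=toydb_v2 to rename the namespace.
#ifndef TOY_NAMESPACE
#define TOY_NAMESPACE toydb
#endif

// Two-step expansion so the macro's value, not its name, is stringified.
#define TOY_STRINGIFY_IMPL(x) #x
#define TOY_STRINGIFY(x) TOY_STRINGIFY_IMPL(x)

namespace TOY_NAMESPACE {
// Reports the namespace this translation unit was compiled into.
inline std::string NamespaceName() { return TOY_STRINGIFY(TOY_NAMESPACE); }
}  // namespace TOY_NAMESPACE
```

Two copies of the library built with different values of the macro export differently mangled symbols, so they can be linked into one process without clashing.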
  5. 14 Dec, 2019 1 commit
    • Introduce a new storage specific Env API (#5761) · afa2420c
      anand76 authored
      Summary:
      The current Env API encompasses both storage/file operations and OS-related operations. Most of the APIs return a Status, which does not carry enough metadata about an error, such as whether it is retryable or not, the scope (i.e. fault domain) of the error, etc., that may be required in order to properly handle a storage error. The file APIs also do not provide enough control over the IO SLA, such as timeout, prioritization, hinting about placement and redundancy, etc.
      
      This PR separates out the file/storage APIs from Env into a new FileSystem class. The APIs are updated to return an IOStatus with metadata about the error, as well as to take an IOOptions structure as input in order to allow more control over the IO.
      
      The user can set both ```options.env``` and ```options.file_system``` to specify that RocksDB should use the former for OS related operations and the latter for storage operations. Internally, a ```CompositeEnvWrapper``` has been introduced that inherits from ```Env``` and redirects individual methods to either an ```Env``` implementation or the ```FileSystem``` as appropriate. When options are sanitized during ```DB::Open```, ```options.env``` is replaced with a newly allocated ```CompositeEnvWrapper``` instance if both env and file_system have been specified. This way, the rest of the RocksDB code can continue to function as before.
      
      This PR also ports PosixEnv to the new API by splitting it into two - PosixEnv and PosixFileSystem. PosixEnv is defined as a sub-class of CompositeEnvWrapper, and threading/time functions are overridden with Posix specific implementations in order to avoid an extra level of indirection.
      
      The ```CompositeEnvWrapper``` translates ```IOStatus``` return code to ```Status```, and sets the severity to ```kSoftError``` if the io_status is retryable. The error handling code in RocksDB can then recover the DB automatically.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5761
      
      Differential Revision: D18868376
      
      Pulled By: anand1976
      
      fbshipit-source-id: 39efe18a162ea746fabac6360ff529baba48486f
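The translation step in the last paragraph of the summary can be sketched as follows. `ToyIOStatus`, `ToyStatus`, `ToySeverity` and `ToStatus` are simplified, hypothetical stand-ins. Only the rule that a retryable error becomes a soft error comes from the summary; leaving other failures at `kNoError` is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-ins for IOStatus and Status.
struct ToyIOStatus {
  bool ok = true;
  bool retryable = false;
  std::string message;
};

enum class ToySeverity { kNoError, kSoftError };

struct ToyStatus {
  bool ok = true;
  ToySeverity severity = ToySeverity::kNoError;
  std::string message;
};

// Translates the richer I/O result into the plain status. Only the
// retryable -> soft-error rule comes from the summary; leaving other
// failures at kNoError is an assumption of this sketch.
inline ToyStatus ToStatus(const ToyIOStatus& io) {
  assert(io.ok || !io.message.empty());  // failures should explain themselves
  ToyStatus s;
  s.ok = io.ok;
  s.message = io.message;
  if (!io.ok && io.retryable) {
    s.severity = ToySeverity::kSoftError;
  }
  return s;
}
```

Code that only understands the plain status can then decide on recovery from the severity alone, without knowing about the richer I/O type.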
  6. 17 Sep, 2019 1 commit
    • Allow users to stop manual compactions (#3971) · 62268300
      andrew authored
      Summary:
      Manual compaction may bring in very high load because sometimes the amount of data involved in a compaction can be large, which may affect online service. So it would be good if a running compaction that is making the server busy could be stopped immediately. In this implementation, the stop condition for manual compaction is checked only in the slow processing path. Deletion compactions and trivial moves are allowed to go through.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/3971
      
      Test Plan: add tests at more spots.
      
      Differential Revision: D17369043
      
      fbshipit-source-id: 575a624fb992ce0bb07d9443eb209e547740043c
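A minimal sketch of the idea, assuming a shared atomic flag that only the expensive per-key loop consults. `ToyJobKind`, `ToyResult` and `RunToyCompaction` are hypothetical names and do not reflect the RocksDB implementation.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical job kinds; only kSlowMerge consults the cancel flag,
// mirroring "deletion compaction and trivial move go through".
enum class ToyJobKind { kTrivialMove, kDeletion, kSlowMerge };

struct ToyResult {
  bool finished = false;
  std::size_t keys_processed = 0;
};

inline ToyResult RunToyCompaction(ToyJobKind kind,
                                  const std::vector<int>& keys,
                                  const std::atomic<bool>& canceled) {
  ToyResult result;
  if (kind != ToyJobKind::kSlowMerge) {
    // Cheap jobs are never interrupted in this sketch.
    result.keys_processed = keys.size();
    result.finished = true;
    return result;
  }
  for (std::size_t i = 0; i < keys.size(); ++i) {
    if (canceled.load(std::memory_order_relaxed)) {
      return result;  // stop early; finished stays false
    }
    ++result.keys_processed;  // stand-in for the expensive per-key work
  }
  result.finished = true;
  return result;
}
```

Checking the flag once per key keeps the cost of the check negligible while bounding how long a cancelled job keeps running.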
  7. 07 Aug, 2019 1 commit
    • New API to get all merge operands for a Key (#5604) · d150e014
      Vijay Nadimpalli authored
      Summary:
      This is a new API added to db.h to allow for fetching all merge operands associated with a Key. The main motivation for this API is to support use cases where doing a full online merge is not necessary as it is performance sensitive. Example use-cases:
      1. Update subset of columns and read subset of columns -
      Imagine a SQL Table, a row is encoded as a K/V pair (as it is done in MyRocks). If there are many columns and users only updated one of them, we can use merge operator to reduce write amplification. While users only read one or two columns in the read query, this feature can avoid a full merging of the whole row, and save some CPU.
      2. Updating very few attributes in a value which is a JSON-like document -
      Updating one attribute can be done efficiently using merge operator, while reading back one attribute can be done more efficiently if we don't need to do a full merge.
      ----------------------------------------------------------------------------------------------------
      API :
      Status GetMergeOperands(
            const ReadOptions& options, ColumnFamilyHandle* column_family,
            const Slice& key, PinnableSlice* merge_operands,
            GetMergeOperandsOptions* get_merge_operands_options,
            int* number_of_operands)
      
      Example usage :
      int size = 100;
      int number_of_operands = 0;
      std::vector<PinnableSlice> values(size);
      GetMergeOperandsOptions merge_operands_info;
      merge_operands_info.expected_max_number_of_operands = size;
      db_->GetMergeOperands(ReadOptions(), db_->DefaultColumnFamily(), "k1", values.data(), &merge_operands_info, &number_of_operands);
      
      Description :
      Returns all the merge operands corresponding to the key. If the number of merge operands in the DB is greater than get_merge_operands_options->expected_max_number_of_operands, no merge operands are returned and the status is Incomplete. Merge operands are returned in the order of insertion.
      merge_operands -> Points to an array of at least get_merge_operands_options->expected_max_number_of_operands elements; the caller is responsible for allocating it. If the status returned is Incomplete, then number_of_operands will contain the total number of merge operands found in the DB for the key.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5604
      
      Test Plan:
      Added unit test and perf test in db_bench that can be run using the command:
      ./db_bench -benchmarks=getmergeoperands --merge_operator=sortlist
      
      Differential Revision: D16657366
      
      Pulled By: vjnadimpalli
      
      fbshipit-source-id: 0faadd752351745224ee12d4ae9ef3cb529951bf
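The contract in the description can be modelled without a database. In the sketch below, `ToyGetMergeOperands` and `ToyStatus` are hypothetical, `stored` plays the role of the operands recorded for one key, and plain `std::string` replaces `PinnableSlice`. It mirrors the documented behaviour only and is not the RocksDB code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class ToyStatus { kOk, kIncomplete };

// Toy model of the contract in the description above. `stored` holds
// the operands recorded for one key, in insertion order.
inline ToyStatus ToyGetMergeOperands(const std::vector<std::string>& stored,
                                     std::string* merge_operands,
                                     int expected_max_number_of_operands,
                                     int* number_of_operands) {
  assert(merge_operands != nullptr && number_of_operands != nullptr);
  *number_of_operands = static_cast<int>(stored.size());
  if (*number_of_operands > expected_max_number_of_operands) {
    // Too many operands for the caller's array: return none of them,
    // but report how many exist so the caller can retry.
    return ToyStatus::kIncomplete;
  }
  for (int i = 0; i < *number_of_operands; ++i) {
    merge_operands[i] = stored[static_cast<std::size_t>(i)];
  }
  return ToyStatus::kOk;
}
```

A caller that receives `kIncomplete` can resize its array to `number_of_operands` and call again.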
  8. 21 May, 2019 1 commit
  9. 28 Mar, 2019 1 commit
  10. 27 Mar, 2019 1 commit
    • Support for single-primary, multi-secondary instances (#4899) · 9358178e
      Yanqin Jin authored
      Summary:
      This PR allows RocksDB to run in single-primary, multi-secondary process mode.
      The writer is a regular RocksDB (e.g. a `DBImpl`) instance playing the role of a primary.
      Multiple `DBImplSecondary` processes (secondaries) share the same set of SST files, MANIFEST, WAL files with the primary. Secondaries tail the MANIFEST of the primary and apply updates to their own in-memory state of the file system, e.g. `VersionStorageInfo`.
      
      This PR has several components:
      1. (Originally in #4745). Add a `PathNotFound` subcode to `IOError` to denote the failure when a secondary tries to open a file which has been deleted by the primary.
      
      2. (Similar to #4602). Add `FragmentBufferedReader` to handle partially-read, trailing record at the end of a log from where future read can continue.
      
      3. (Originally in #4710 and #4820). Add implementation of the secondary, i.e. `DBImplSecondary`.
      3.1 Tail the primary's MANIFEST during recovery.
      3.2 Tail the primary's MANIFEST during normal processing by calling `ReadAndApply`.
      3.3 Tailing WAL will be in a future PR.
      
      4. Add an example in 'examples/multi_processes_example.cc' to demonstrate the usage of secondary RocksDB instance in a multi-process setting. Instructions to run the example can be found at the beginning of the source code.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4899
      
      Differential Revision: D14510945
      
      Pulled By: riversand963
      
      fbshipit-source-id: 4ac1c5693e6012ad23f7b4b42d3c374fecbe8886
  11. 06 Sep, 2018 1 commit
  12. 24 Jul, 2018 1 commit
    • move static msgs out of Status class (#4144) · 374c37da
      Chang Su authored
      Summary:
      The member msgs of class Status contains all types of status messages.
      When users dump a Status object, msgs will confuse them. So move it out
      of class Status by making it a file-local static variable.
      
      Closes #3831 .
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4144
      
      Differential Revision: D8941419
      
      Pulled By: sagar0
      
      fbshipit-source-id: 56b0510258465ff26db15aa6b04e01532e053e3d
  13. 14 Jul, 2018 1 commit
  14. 29 Jun, 2018 1 commit
    • Allow DB resume after background errors (#3997) · 52d4c9b7
      Anand Ananthabhotla authored
      Summary:
      Currently, if RocksDB encounters errors during a write operation (user requested or BG operations), it sets DBImpl::bg_error_ and fails subsequent writes. This PR allows the DB to be resumed for certain classes of errors. It consists of 3 parts -
      1. Introduce Status::Severity in rocksdb::Status to indicate whether a given error can be recovered from or not
      2. Refactor the error handling code so that setting bg_error_ and deciding on severity is in one place
      3. Provide an API for the user to clear the error and resume the DB instance
      
      This whole change is broken up into multiple PRs. Initially, we only allow clearing the error for Status::NoSpace() errors during background flush/compaction. Subsequent PRs will expand this to include more errors and foreground operations such as Put(), and implement a polling mechanism for out-of-space errors.
      Closes https://github.com/facebook/rocksdb/pull/3997
      
      Differential Revision: D8653831
      
      Pulled By: anand1976
      
      fbshipit-source-id: 6dc835c76122443a7668497c0226b4f072bc6afd
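The three parts listed in the summary can be sketched with a toy database object. `ToyDB`, `ToyError` and `ToySeverity` are hypothetical names, and the mapping of errors to severities is an assumption for illustration: out-of-space is treated as recoverable, as the summary describes, and corruption serves as an example of an error that is not.

```cpp
#include <cassert>

enum class ToyError { kNone, kNoSpace, kCorruption };
enum class ToySeverity { kNoError, kSoftError, kFatalError };

class ToyDB {
 public:
  // Single place that records the error and decides its severity.
  void SetBackgroundError(ToyError e) {
    assert(e != ToyError::kNone);
    bg_error_ = e;
    severity_ = (e == ToyError::kNoSpace) ? ToySeverity::kSoftError
                                          : ToySeverity::kFatalError;
  }

  // Writes fail for as long as a background error is set.
  bool Put() const { return bg_error_ == ToyError::kNone; }

  // Clears the error only when it is recoverable.
  bool Resume() {
    if (severity_ == ToySeverity::kFatalError) {
      return false;
    }
    bg_error_ = ToyError::kNone;
    severity_ = ToySeverity::kNoError;
    return true;
  }

 private:
  ToyError bg_error_ = ToyError::kNone;
  ToySeverity severity_ = ToySeverity::kNoError;
};
```

Keeping the severity decision inside `SetBackgroundError` means `Resume` never has to inspect the kind of error itself.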
  15. 28 Jun, 2018 1 commit
    • Remove bogus gcc-8.1 warning (#3870) · e5ae1bb4
      Daniel Black authored
      Summary:
      Neither various rearrangements of the cch maths nor replacing = '\0' with
      memset convinced the compiler that the string was nul terminated. So took
      the perverse option of changing strncpy to strcpy.
      
      Return null if memory couldn't be allocated.
      
      util/status.cc: In static member function ‘static const char* rocksdb::Status::CopyState(const char*)’:
      util/status.cc:28:15: error: ‘char* strncpy(char*, const char*, size_t)’ output truncated before terminating nul copying as many bytes from a string as its length [-Werror=stringop-truncation]
         std::strncpy(result, state, cch - 1);
         ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
      util/status.cc:19:18: note: length computed here
             std::strlen(state) + 1; // +1 for the null terminator
             ~~~~~~~~~~~^~~~~~~
      cc1plus: all warnings being treated as errors
      make: *** [Makefile:645: shared-objects/util/status.o] Error 1
      
      closes #2705
      Closes https://github.com/facebook/rocksdb/pull/3870
      
      Differential Revision: D8594114
      
      Pulled By: anand1976
      
      fbshipit-source-id: ab20f3a456a711e4d29144ebe630e4fe3c99ec25
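A stand-alone sketch of a string copy that sidesteps the truncation warning and returns null on allocation failure. `ToyCopyState` is a hypothetical name. Unlike the commit, which switched to strcpy, this sketch uses memcpy with the length already computed from strlen, and it assumes a non-throwing allocation so that the null check is meaningful.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Returns a heap copy of `state`, or nullptr if allocation fails.
// The caller releases the copy with delete[].
inline const char* ToyCopyState(const char* state) {
  assert(state != nullptr);
  const std::size_t cch = std::strlen(state) + 1;  // +1 for the terminator
  char* result = new (std::nothrow) char[cch];
  if (result == nullptr) {
    return nullptr;
  }
  // The buffer was sized from strlen(state), so copying the terminator
  // along with the characters cannot overflow or truncate.
  std::memcpy(result, state, cch);
  return result;
}
```

Because the copy length includes the terminator, the compiler has no reason to suspect a truncated, unterminated result.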
  16. 07 Mar, 2018 1 commit
    • Disallow compactions if there isn't enough free space · 0a3db28d
      amytai authored
      Summary:
      This diff handles cases where compaction causes an ENOSPC error.
      This does not handle corner cases where another background job is started while compaction is running, and the other background job triggers ENOSPC, although we do allow the user to provision for these background jobs with SstFileManager::SetCompactionBufferSize.
      It also does not handle the case where compaction has finished and some other background job independently triggers ENOSPC.
      
      Usage: Functionality is inside SstFileManager. In particular, users should set SstFileManager::SetMaxAllowedSpaceUsage, which is the reference highwatermark for determining whether to cancel compactions.
      Closes https://github.com/facebook/rocksdb/pull/3449
      
      Differential Revision: D7016941
      
      Pulled By: amytai
      
      fbshipit-source-id: 8965ab8dd8b00972e771637a41b4e6c645450445
  17. 16 Jul, 2017 1 commit
  18. 11 Apr, 2017 1 commit
  19. 18 Feb, 2017 1 commit
    • New subcode for IOError to detect the ESTALE errno · a618a16f
      Marcin Dlugajczyk authored
      Summary:
      I'd like to propose a patch to expose a new IOError type with subcode kStaleFile, so that callers can detect when an ESTALE error is returned. This allows RocksDB consumers to handle this error separately from other IOErrors.
      
      I've also added a missing string representation for the kDeadlock subcode. I believe calling ToString() on a Status object with that subcode would otherwise result in an out-of-bounds access in the msgs array.
      
      Please let me know if you have any questions or would like me to make any changes to this pull request.
      Closes https://github.com/facebook/rocksdb/pull/1748
      
      Differential Revision: D4387675
      
      Pulled By: IslamAbdelRahman
      
      fbshipit-source-id: 67feb13
  20. 04 Jan, 2017 1 commit
  21. 20 Oct, 2016 1 commit
    • Implement deadlock detection · 4edd39fd
      Manuel Ung authored
      Summary: Implement deadlock detection. This is done by maintaining a TxnID -> TxnID map which represents the edges in the wait for graph (this is named `wait_txn_map_`).
      
      Test Plan: transaction_test
      
      Reviewers: IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D64491
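A minimal sketch of cycle detection over a TxnID -> TxnID wait-for map like the one the summary describes. `ToyWaitForGraph`, `TryWait` and `StopWaiting` are hypothetical names, and the sketch assumes each transaction waits on at most one other transaction at a time. It illustrates the idea only and is not the committed implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using TxnID = uint64_t;

class ToyWaitForGraph {
 public:
  // Records the edge waiter -> holder unless it would close a cycle.
  // Returns false, and records nothing, when a deadlock is detected.
  bool TryWait(TxnID waiter, TxnID holder) {
    TxnID next = holder;
    // Follow the chain of waits starting at the holder. The stored
    // graph is acyclic by construction, so this walk terminates.
    while (true) {
      if (next == waiter) {
        return false;  // the holder (transitively) waits on the waiter
      }
      auto it = wait_txn_map_.find(next);
      if (it == wait_txn_map_.end()) {
        break;
      }
      next = it->second;
    }
    wait_txn_map_[waiter] = holder;
    return true;
  }

  // Called when the waiter acquires its lock or gives up.
  void StopWaiting(TxnID waiter) { wait_txn_map_.erase(waiter); }

 private:
  std::unordered_map<TxnID, TxnID> wait_txn_map_;
};
```

The check runs before the edge is added, so the map never contains a cycle and the walk is bounded by the number of waiting transactions.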
  22. 08 Sep, 2016 1 commit
  23. 23 Aug, 2016 1 commit
  24. 10 Feb, 2016 1 commit
  25. 23 Dec, 2015 1 commit
    • Make Status moveable · dbb8260f
      Dmitri Smirnov authored
        Status is a class which is frequently returned by value from functions.
        Making it movable avoids 99% of the copies automatically
        on return by value.
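What "movable" buys can be shown with a toy status that owns a heap-allocated message. `ToyStatus` is a hypothetical class written for this sketch, not the RocksDB `Status`. The move constructor takes over the pointer instead of duplicating the string.

```cpp
#include <cassert>
#include <string>
#include <utility>

class ToyStatus {
 public:
  ToyStatus() = default;
  explicit ToyStatus(const std::string& msg) : state_(new std::string(msg)) {}

  // Copy: allocates and duplicates the message.
  ToyStatus(const ToyStatus& other)
      : state_(other.state_ ? new std::string(*other.state_) : nullptr) {}

  // Move: steals the pointer and leaves the source in the OK state.
  ToyStatus(ToyStatus&& other) noexcept : state_(other.state_) {
    other.state_ = nullptr;
  }

  // Takes its argument by value, so it serves both copy and move assignment.
  ToyStatus& operator=(ToyStatus other) noexcept {
    std::swap(state_, other.state_);
    return *this;
  }

  ~ToyStatus() { delete state_; }

  bool ok() const { return state_ == nullptr; }
  const std::string* state() const { return state_; }

 private:
  std::string* state_ = nullptr;  // nullptr means OK
};
```

Returning such an object by value then costs a pointer swap at most, rather than an allocation and a string copy.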
  26. 11 Nov, 2015 1 commit
    • Enable RocksDB to persist Options file. · e114f0ab
      Yueh-Hsuan Chiang authored
      Summary:
      This patch allows rocksdb to persist options into a file on
      DB::Open, SetOptions, and Create / Drop ColumnFamily.
      Options files are created under the same directory as the rocksdb
      instance.
      
      In addition, this patch also adds a fail_if_missing_options_file in DBOptions
      that makes any function call return non-ok status when it is not able to
      persist options properly.
      
        // If true, then DB::Open / CreateColumnFamily / DropColumnFamily
        // / SetOptions will fail if options file is not detected or properly
        // persisted.
        //
        // DEFAULT: false
        bool fail_if_missing_options_file;
      
      Options file names are formatted as OPTIONS-<number>, and RocksDB
      will always keep the latest two options files.
      
      Test Plan:
      Add options_file_test.
      
      options_test
      column_family_test
      
      Reviewers: igor, IslamAbdelRahman, sdong, anthony
      
      Reviewed By: anthony
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D48285
  27. 01 Sep, 2015 1 commit
    • Support static Status messages · 77a28615
      agiardullo authored
      Summary: Provide a way to specify a detailed static error message for a Status without incurring a memcpy.  Let me know what people think of this approach.
      
      Test Plan: added simple test
      
      Reviewers: igor, yhchiang, rven, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D44259
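One way to attach a detailed message without a memcpy is to store a pointer to a string with static storage duration. The sketch below assumes exactly that. `ToyStatus` and `Busy` are hypothetical names, and the patch's real mechanism may differ.

```cpp
#include <cassert>

class ToyStatus {
 public:
  ToyStatus() = default;

  // `msg` must outlive the status (for example a string literal or a
  // static array); it is stored by pointer and never copied or freed.
  static ToyStatus Busy(const char* msg) {
    assert(msg != nullptr);
    ToyStatus s;
    s.busy_ = true;
    s.static_msg_ = msg;
    return s;
  }

  bool ok() const { return !busy_; }
  const char* message() const { return static_msg_ ? static_msg_ : ""; }

 private:
  bool busy_ = false;
  const char* static_msg_ = nullptr;  // not owned
};
```

The trade-off is that the caller must guarantee the lifetime of the message, which is why this only suits fixed, compile-time strings.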
  28. 18 Aug, 2015 1 commit
    • Simplify querying of merge results · f32a5720
      Andres Notzli authored
      Summary:
      While working on supporting mixing merge operators with
      single deletes ( https://reviews.facebook.net/D43179 ),
      I realized that returning and dealing with merge results
      can be made simpler. Submitting this as a separate diff
      because it is not directly related to single deletes.
      
      Before, callers of merge helper had to retrieve the merge
      result in one of two ways depending on whether the merge
      was successful or not (success = result of merge was single
      kTypeValue). For successful merges, the caller could query
      the resulting key/value pair and for unsuccessful merges,
      the result could be retrieved in the form of two deques of
      keys and values. However, with single deletes, a successful merge
      does not return a single key/value pair (if merge
      operands are merged with a single delete, we have to generate
      a value and keep the original single delete around to make
      sure that we are not accidentally producing a key overwrite).
      In addition, the two existing call sites of the merge
      helper were taking the same actions independently from whether
      the merge was successful or not, so this patch simplifies that.
      
      Test Plan: make clean all check
      
      Reviewers: rven, sdong, yhchiang, anthony, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D43353
  29. 12 Aug, 2015 2 commits
    • Transaction error statuses · 0db807ec
      agiardullo authored
      Summary:
      Based on feedback from spetrunia, we should better differentiate error statuses for transaction failures.
      
      https://github.com/MySQLOnRocksDB/mysql-5.6/issues/86#issuecomment-124605954
      
      Test Plan: unit tests
      
      Reviewers: rven, kradhakrishnan, spetrunia, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D43323
    • Pessimistic Transactions · c2f2cb02
      agiardullo authored
      Summary:
      Initial implementation of Pessimistic Transactions.  This diff contains the api changes discussed in D38913.  This diff is pretty large, so let me know if people would prefer to meet up to discuss it.
      
      MyRocks folks:  please take a look at the API in include/rocksdb/utilities/transaction[_db].h and let me know if you have any issues.
      
      Also, you'll notice a couple of TODOs in the implementation of RollbackToSavePoint().  After chatting with Siying, I'm going to send out a separate diff for an alternate implementation of this feature that implements the rollback inside of WriteBatch/WriteBatchWithIndex.  We can then decide which route is preferable.
      
      Next, I'm planning on doing some perf testing and then integrating this diff into MongoRocks for further testing.
      
      Test Plan: Unit tests, db_bench parallel testing.
      
      Reviewers: igor, rven, sdong, yhchiang, yoshinorim
      
      Reviewed By: sdong
      
      Subscribers: hermanlee4, maykov, spetrunia, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D40869
  30. 21 Jul, 2015 1 commit
    • Improved FileExists API · 06429408
      agiardullo authored
      Summary: Add new CheckFileExists method.  Considered changing the FileExists api but didn't want to break anyone's builds.
      
      Test Plan: unit tests
      
      Reviewers: yhchiang, igor, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42003
  31. 14 Jul, 2015 1 commit
    • Deprecate WriteOptions::timeout_hint_us · 5aea98dd
      Igor Canadi authored
      Summary:
      In one of our recent meetings, we discussed deprecating features that are not being actively used. One of those features, at least within Facebook, is timeout_hint. The feature is really nicely implemented, but if nobody needs it, we should remove it from our code-base (until we get a valid use-case). Some arguments:
      * Less code == better icache hit rate, smaller builds, simpler code
      * The motivation for adding timeout_hint_us was to work-around RocksDB's stall issue. However, we're currently addressing the stall issue itself (see @sdong's recent work on stall write_rate), so we should never see sharp lock-ups in the future.
      * Nobody is using the feature within Facebook's code-base. Googling for `timeout_hint_us` also doesn't yield any users.
      
      Test Plan: make check
      
      Reviewers: anthony, kradhakrishnan, sdong, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: sdong, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D41937
  32. 30 May, 2015 1 commit
    • Optimistic Transactions · dc9d70de
      agiardullo authored
      Summary: Optimistic transactions supporting begin/commit/rollback semantics.  Currently relies on checking the memtable to determine if there are any collisions at commit time.  Not yet implemented is a way of ensuring the memtable has some minimum amount of history so that we won't fail to commit when the memtable is empty.  You should probably start with transaction.h to get an overview of what is currently supported.
      
      Test Plan: Added a new test, but still need to look into stress testing.
      
      Reviewers: yhchiang, igor, rven, sdong
      
      Reviewed By: sdong
      
      Subscribers: adamretter, MarkCallaghan, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D33435
  33. 08 Nov, 2014 1 commit
    • CompactFiles, EventListener and GetDatabaseMetaData · 28c82ff1
      Yueh-Hsuan Chiang authored
      Summary:
      This diff adds three sets of APIs to RocksDB.
      
      = GetColumnFamilyMetaData =
      * This API allows users to obtain the current state of a RocksDB instance on one column family.
      * See GetColumnFamilyMetaData in include/rocksdb/db.h
      
      = EventListener =
      * A virtual class that allows users to implement a set of
        call-back functions which will be called when specific
        events of a RocksDB instance happen.
      * To register an EventListener, simply insert it into ColumnFamilyOptions::listeners
      
      = CompactFiles =
      * CompactFiles API inputs a set of file numbers and an output level, and RocksDB
        will try to compact those files into the specified level.
      
      = Example =
      * Example code can be found in example/compact_files_example.cc, which implements
        a simple external compactor using EventListener, GetColumnFamilyMetaData, and
        CompactFiles API.
      
      Test Plan:
      listener_test
      compactor_test
      example/compact_files_example
      export ROCKSDB_TESTS=CompactFiles
      db_test
      export ROCKSDB_TESTS=MetaData
      db_test
      
      Reviewers: ljin, igor, rven, sdong
      
      Reviewed By: sdong
      
      Subscribers: MarkCallaghan, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D24705
  34. 07 Nov, 2014 1 commit
    • Turn -Wshadow back on · 9f20395c
      Igor Canadi authored
      Summary: It turns out that -Wshadow has different rules for gcc than clang. Previous commit fixed clang. This commit fixes the rest of the warnings for gcc.
      
      Test Plan: compiles
      
      Reviewers: ljin, yhchiang, rven, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D28131
  35. 03 Sep, 2014 1 commit
    • fix dropping column family bug · 8438a193
      Feng Zhu authored
      Summary: 1. db/db_impl.cc:2324 (DBImpl::BackgroundCompaction) should not raise bg_error_ when column family is dropped during compaction.
      
      Test Plan: 1. db_stress
      
      Reviewers: ljin, yhchiang, dhruba, igor, sdong
      
      Reviewed By: igor
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D22653
  36. 04 Jul, 2014 1 commit
    • Add timeout_hint_us to WriteOptions and introduce Status::TimeOut. · d4d338de
      Yueh-Hsuan Chiang authored
      Summary:
      This diff adds timeout_hint_us to WriteOptions.  If it's non-zero, then
      1) writes associated with these options MAY be aborted when they have been
        waiting for longer than the specified time.  If an abort happens,
        associated writes will return Status::TimeOut.
      2) the stall time of the associated write caused by flush or compaction
        will be limited by timeout_hint_us.
      
      The default value of timeout_hint_us is 0 (i.e., OFF.)
      
      The statistics of timeout writes will be recorded in WRITE_TIMEDOUT.
      
      Test Plan:
      export ROCKSDB_TESTS=WriteTimeoutAndDelayTest
      make db_test
      ./db_test
      
      Reviewers: igor, ljin, haobo, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D18837
  37. 15 May, 2014 1 commit
  38. 13 Feb, 2014 1 commit
    • IOError cleanup · 994c327b
      Lei Jin authored
      Summary: Clean up IOErrors so that they only indicate errors talking to the device.
      
      Test Plan: make all check
      
      Reviewers: igor, haobo, dhruba, emayanke
      
      Reviewed By: igor
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15831
  39. 27 Dec, 2013 1 commit
    • Avoid malloc in NotFound key status if no message is given. · 18df47b7
      Siying Dong authored
      Summary:
      In some places a NotFound status is created with an empty message, but that doesn't avoid a malloc. With this patch, the malloc is avoided in that case.
      
      The motivation is that in the db_bench readrandom test, when none of the keys exist, about 4% of the total running time is spent on malloc for Status, plus a similar amount of CPU on freeing them, which is unnecessary.
      
      Test Plan: make all check
      
      Reviewers: dhruba, haobo, igor
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D14691
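The optimisation can be sketched with a toy status that allocates heap state only when there is a message to keep. `ToyStatus` and `has_heap_message` are hypothetical names used for illustration, and the layout is not that of the RocksDB `Status`.

```cpp
#include <cassert>
#include <memory>
#include <string>

class ToyStatus {
 public:
  // No message: no heap allocation at all.
  static ToyStatus NotFound() { return ToyStatus(true, nullptr); }

  // An empty message is treated the same as no message.
  static ToyStatus NotFound(const std::string& msg) {
    return ToyStatus(true, msg.empty() ? nullptr : new std::string(msg));
  }

  bool IsNotFound() const { return not_found_; }
  bool has_heap_message() const { return message_ != nullptr; }

  std::string ToString() const {
    std::string out = not_found_ ? "NotFound: " : "OK";
    if (message_) {
      out += *message_;
    }
    return out;
  }

 private:
  ToyStatus(bool not_found, std::string* msg)
      : not_found_(not_found), message_(msg) {}

  bool not_found_;
  std::unique_ptr<std::string> message_;  // null when there is no message
};
```

On a read path where most lookups miss, skipping the allocation for the common message-free case removes one malloc and one free per lookup.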