Skip to content

[bugfix] compare key equality in blob file reader using user comparator instead of simply using equal sign #13457

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

lnparker
Copy link

With custom comparator which only compare part of user_key, record.key is always not equal to user_key
in BlobFileReader::VerifyBlob().
For example, user_key is always suffixed with ttl, and custom comparator without ttl like below would make record.key not equal to user_key.

const size_t TTL_LEN_IN_BYTES = 8;
class NoTTLComparator: public rocksdb::Comparator {
public:
    static const char* kClassName() { return "NoTTLComparator";  }
    const char* Name() const override {return kClassName();}
    int Compare(const rocksdb::Slice& a, const rocksdb::Slice& b) const override {
          rocksdb::Slice a_no_ttl = rocksdb::Slice(a.data(), a.size()-TTL_LEN_IN_BYTES);
          rocksdb::Slice b_no_ttl = rocksdb::Slice(b.data(), b.size()-TTL_LEN_IN_BYTES);
          return rocksdb::BytewiseComparator()->Compare(a_no_ttl, b_no_ttl);
    }

    void FindShortestSeparator(std::string* start, const rocksdb::Slice& limit) const {};
    void FindShortSuccessor(std::string* key) const {};
};

@facebook-github-bot
Copy link
Contributor

Hi @lnparker!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@lnparker lnparker changed the title [bugfix] compare key equality in blob file reader using user comparator instead simply using equal sign [bugfix] compare key equality in blob file reader using user comparator instead of simply using equal sign Mar 12, 2025
@facebook-github-bot
Copy link
Contributor

@jaykorean has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@jaykorean
Copy link
Contributor

@lnparker Would you be able to make format and update the PR? Then I will reimport this for internal review.

@facebook-github-bot
Copy link
Contributor

@lnparker has updated the pull request. You must reimport the pull request before landing.

lnparker and others added 2 commits March 17, 2025 15:50
@facebook-github-bot
Copy link
Contributor

@lnparker has updated the pull request. You must reimport the pull request before landing.

@lnparker
Copy link
Author

@lnparker Would you be able to make format and update the PR? Then I will reimport this for internal review.

@jaykorean already done. thanks

@jaykorean jaykorean self-requested a review March 18, 2025 00:42
@facebook-github-bot
Copy link
Contributor

@bigfootjon has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

2 similar comments
@facebook-github-bot
Copy link
Contributor

@bigfootjon has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@bigfootjon has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@jaykorean has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@jowlyzhang jowlyzhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blob verification exactly needs to do a byte by byte check. If you are doing user defined ttl with the example customer comparator, how ttl is encoded into a user key should be completely opaque to RocksDB and consistently followed through out all DB read/write invocations. All keys provided via DB::Write(key, value), DB::Get(key) should have its key format follow the expectation of the Comparator and the last 8 bytes being TTL. Otherwise, it will by design fail. Both record_slice.user_key and user_key are expected to have TTL already included in it and blob verification expect them to be exactly the same.

@lnparker
Copy link
Author

lnparker commented Apr 1, 2025

Blob verification exactly needs to do a byte by byte check. If you are doing user defined ttl with the example customer comparator, how ttl is encoded into a user key should be completely opaque to RocksDB and consistently followed through out all DB read/write invocations. All keys provided via DB::Write(key, value), DB::Get(key) should have its key format follow the expectation of the Comparator and the last 8 bytes being TTL. Otherwise, it will by design fail. Both record_slice.user_key and user_key are expected to have TTL already included in it and blob verification expect them to be exactly the same.

@jowlyzhang @jaykorean
Hi, in my case, when setting the key via DB::Write(key, value), the key is suffixed with timestamp like (now() +36000).
When getting the key via DB::Get(key), we can not confirm the exact timestamp, so we get the key suffixed with empty timestamp.
In another scenario, one key can be written repeatedly suffixed with different timestamp, but we want to treat them to be the same key considering the following compaction.
Using the custom comparator provided before, [userkey][timstamp:0] is equal to [userkey][timestamp:now()+36000], but record_slice.user_key and user_key is not exactly the same.
Can this case be supportted?

@jowlyzhang
Copy link
Contributor

when setting the key via DB::Write(key, value), the key is suffixed with timestamp like (now() +36000). When getting the key via DB::Get(key), we can not confirm the exact timestamp, so we get the key suffixed with empty timestamp.

Get(key) is a point look up that is intended to look for the exact match. A range scan is needed in this case with the key without ttl part as prefix. That gets all versions of keys.

Making a custom comparator to treat [userkey][timstamp:0] and [userkey][timestamp:now()+36000] as the same keys is a hack that would break RocksDB invariants, such as this blob verification. With that said, why not encode ttl into the value, as opposed to key so that point look up can work. It seems like you do not want to retain multiple versions of the same key with different timestamps. RocksDB's built-in ttl is doing similar thing to encode time into value: https://github.com/facebook/rocksdb/wiki/Time-to-Live

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants