Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support get all key from state for composite keys #153

Merged
merged 1 commit into from
Feb 12, 2025

Conversation

pfi79
Copy link
Contributor

@pfi79 pfi79 commented Feb 10, 2025

I may need to get all the keys (composite and not composite) for system service.
I don't have that option right now.
My change offers an option on how this can be done.

If I could get them, I could figure out which key is for what.

@pfi79 pfi79 requested a review from a team as a code owner February 10, 2025 16:22
@denyeart
Copy link
Contributor

This goes against guardrails that were intentionally put in place so that users would get back either regular keys or composite keys, so that they could handle them accordingly. Returning a mix of key types will likely confuse users.

If you really need this, maybe it would be better to split into two functions, one for regular keys and one for composite keys?

@pfi79
Copy link
Contributor Author

pfi79 commented Feb 10, 2025

This goes against guardrails that were intentionally put in place so that users would get back either regular keys or composite keys, so that they could handle them accordingly. Returning a mix of key types will likely confuse users.

If you really need this, maybe it would be better to split into two functions, one for regular keys and one for composite keys?

Ok.
I'll think about it.
I'll take my time with it.

@pfi79
Copy link
Contributor Author

pfi79 commented Feb 10, 2025

This goes against guardrails that were intentionally put in place so that users would get back either regular keys or composite keys, so that they could handle them accordingly. Returning a mix of key types will likely confuse users.

If you really need this, maybe it would be better to split into two functions, one for regular keys and one for composite keys?

We should probably add a function just to get all composite keys.
Because with existing methods, I can get all non-composite keys

@pfi79 pfi79 force-pushed the get-aqll-key-value branch 2 times, most recently from a9ff69c to 15acc10 Compare February 10, 2025 18:54
@pfi79
Copy link
Contributor Author

pfi79 commented Feb 10, 2025

This goes against guardrails that were intentionally put in place so that users would get back either regular keys or composite keys, so that they could handle them accordingly. Returning a mix of key types will likely confuse users.

If you really need this, maybe it would be better to split into two functions, one for regular keys and one for composite keys?

Made only for composite keys

@pfi79 pfi79 changed the title add support get all key from state add support get all key from state for composite keys Feb 10, 2025
Copy link
Contributor

@denyeart denyeart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine, but let's see if others have comments tomorrow.

Signed-off-by: Fedor Partanskiy <[email protected]>
@pfi79 pfi79 force-pushed the get-aqll-key-value branch from 15acc10 to 8b55978 Compare February 11, 2025 04:45
@bestbeforetoday
Copy link
Member

Just for reference, a related discussion that I dug out of the old Jira data:

When we do range query on simple keys, it does return composite key as well. For example, the following range query on marble02 chaincode example returns both simple key such as marble2, marble3, and composite key such as colornamebluemarble2, colornamebluemarble2 where it should have returned only marble2 and marble3.

$ peer chaincode query -n mycc1 -v 0 -c {"Args":["getMarblesByRange","a","z"]} -o 127.0.0.1:7050 -C ch1
Query Result: [

{"Key":"color~namebluemarble3", "Record":} \{"Key":"color~nameredmarble2", "Record":}

{"Key":"marble2", "Record":\{"docType":"marble","name":"marble2","color":"red","size":50,"owner":"tom"}},
{"Key":"marble3", "Record":\{"docType":"marble","name":"marble3","color":"blue","size":70,"owner":"tom"}}

How fabric can differentiate between a simple or composite so that GetStateByRange() can return only simple keys?

Approach-1 Ledger adds a null character 0x00 as a first character in simple key.
The constructCompositeKey(namespace, key) in ledger check whether the key contains \x00. If it does not contain \x00, then the key is a simple key and add a \x00 in the front of the key. The splitCompositeKey() removes the added \x00 in the front of the key if exist. For this approach to work, we shouldn't allow 0x00 in original simple key. Still, this approach will not work as PutState() cannot be used for both simple key and composite key when we want to allow \x00 in composite key (added by CreateCompositeKey) but not in simplekey.

Approach-2: Ledger adds a null character 0x00 as a first character in simple key
This approach is an extension to approach-1. Expose a new chaincode API named PutCompositeState(objectType, attributes, value). Internally, we can call createCompositeKey() instead of exposing an API for that followed by handlePutState(). We have done a similar thing in GetStateByPartialCompositeKey(objectType, attributes). This would simplify the chaincode as well. We can then let chaincode use PutState() API only for simple key and do not allow \x00. At ledger side, we use the approach-1 .

Approach-3: Add a separate namespace for simple and composite key.
Expose two new chaincode APIs named GetCompositeState() and PutCompositeState(). In front to the composite key, chaincode APIs adds c to denote that the key is composite. Similarly, chaincode APIs such as GetState(), PutState(), GetStateByRange() adds ‘s’ in the front of simple key. At ledger side, Next() of each result iterator removes ‘c’ or ‘s’ from the front of each key.

Approach-4 Add separate namespace for simple and composite key.
Instead of adding ‘c’ and ‘s’ from chaincode APIs, we can add these namespace in ledger side. For this, we need to expose two new chaincode APIs and ledger APIs named GetCompositeState() and PutCompositeState(). Now, the ledger API adds ‘s’ and ‘c’ and Next() of each result iterator removes ‘c’ or ‘s’ from the front of each key.

Approach-5 Chaincode API CreateCompositeKey adds a null character 0x00 as a first character in composite key.
In current implementation of GetStateByRange(), we are not concatenating anything with startKey and endKey. In GetStateByPartialCompositeKey(), we concatenate U+10FFFF with the endKey. Hence, prepending 0x00 on composite key will not induce a collision. For this approach to work, we shouldn’t allow 0x00 as first character in simple key. Still, this approach will not work as PutState() cannot be used for both simple key and composite key when we want to allow \x00 as first character in composite key (added by CreateCompositeKey) but not in simplekey.

If startKey of a range query is empty (for an open ended search), current implementation treats the startKey as 0x00 (null). Hence, we replace the empty startKey with 0x01 to avoid collision. Now, there is no collision when chaincode issues open-ended search like (i) startKey="" endKey="", (ii) startKey="a" endKey="z".

TODO: we need to document somewhere that a simpleKey cannot start with a null (0x00) as we cannot introduce a new chaincode API PutCompositeState() at this time.


Composite keys (ck) have an objectType prefix. Each part of composite key is delimited by null character, for example (spaces included for visual clarity only):

chaincodeid 0x00 objectType 0x00 ck1 0x00 ck2 0x00// composite key

This design ensures that various composite key types have an objectType namespace to guarantee no collisions across types. For example range queries and partial composite key queries will never overlap.

Issue: Simple keys (sk) and composite key objectType are in the same namespace. Range queries on simple keys will pick up composite keys in an objectType scope. They should be in different scopes to ensure no collisions. To illustrate the issue, note that simple key and composite key objectType are in the same namespace / level:

chaincodeid 0x00 sk                   // simple key 
chaincodeid 0x00 objectType 0x00 ck1 0x00 ck2 0x00   // composite key

Solution
Automatically put simple keys in a reserved namespace, for example the empty “” object space.

Note the simple key and composite key distinct namespaces now (spaces added for visual clarity):

chaincodeid 0x00                    0x00 sk         // simple key
chaincodeid 0x00 objectType 0x00 ck1 0x00 ck2 0x00// composite key 

A range query from a to z on a simple key would look like:

chaincodeid 0x00 0x00 a  TO   chaincodeid 0x00 0x00 z

Approach 2,3,4 are out assuming we do not allow chaincode API changes.

Can approach 1 and approach 5 work, if we document (but don't validate in code) that simple keys, or any part of a composite key, should not contain 0x00?


Yes, both approach would work for that assumption. Currently, we are checking for 0x00 in attributes of composite key and return error if exist. We only need to document for simple keys.

Approach 1 requires change only in ledger side whereas Approach 5 requires change only in chaincode shim.


After discussion with Senthil it was decided to proceed with Approach 5. [~Senthil1] Please add any more details to the Approach 5 Description above, for example the unbounded range query impact and anything else you find during imlementation. Also note the typo ""CreateCompositeCreate"" :-)


Before this change, it was possible to specify composite keys as parameters to getStateByRange(), now it is not possible. Was this intended?


Yes, it was intended. Now, getStateByRange() returns only simple keys and getStateByPartialCompositeKey() returns only composite keys. We have different namespaces for simple keys and composite keys to avoid collisions.


Let me be devil's advocate and propose relaxing the validation constraint...

If we allowed getStateByRange() to accept composite keys, I can see there would be collisions on the open-ended range queries.

But if we allowed it when both startKey and endKey were included, I think there would be value without issues.

Using the marbles02 example, say I want to query for blue marble1 through blue marble9, where <color,id> is the composite key. Wouldn't it be safe to query on <blue,marble1> to <blue,marble9>?

So perhaps instead of rejecting all range queries with composite keys, we could relax it and instead validate that the first parts of the composite keys are the same in startKey and endKey. I believe this would protect against the concerns, while allowing range queries within a composite key space. What do you think?

@pfi79
Copy link
Contributor Author

pfi79 commented Feb 11, 2025

Just for reference, a related discussion that I dug out of the old Jira data:

Thank you for this excursion.

Could you please clarify how this should or could affect my pr?

@bestbeforetoday
Copy link
Member

It shouldn't necessarily. I just wanted to capture related information so it doesn't get lost. Since there is no issue raised for this change, this PR is currently the only place to do that.

@bestbeforetoday
Copy link
Member

Your change to allow query of all composite keys does have some relation to previous user requests to query ranges of composite keys. I don't see any issue with a specific function to get all composite keys, especially as the implementation seems pretty straightforward. Maybe the range query requirement should be (re)considered alongside this too.

I've raised issue #154 to capture the information around that request to avoid cluttering up this PR.

@pfi79
Copy link
Contributor Author

pfi79 commented Feb 11, 2025

I have my grievances with the ‘range’. I'll list them there.

But here I have a utilitarian task. I maintain a shared library. Others use it to create their own chaincodes. These others can create any composite and non-composite keys. But there is a requirement for the library: it must provide functionality that will enumerate all keys with values.
And to search all non-composite keys is a trivial task.
But for composite keys there is no such functionality. That's what I'm trying to add.

@pfi79
Copy link
Contributor Author

pfi79 commented Feb 11, 2025

I think this is fine, but let's see if others have comments tomorrow.

I don't think there's anything against it.

@denyeart denyeart merged commit fc2b4bb into hyperledger:main Feb 12, 2025
6 checks passed
@denyeart
Copy link
Contributor

@pfi79 Do you have other enhancements planned in the short term, or shall I cut another release now?

@pfi79
Copy link
Contributor Author

pfi79 commented Feb 12, 2025

@pfi79 Do you have other enhancements planned in the short term, or shall I cut another release now?

No, I'm not planning anything else. It can be released.

@pfi79 pfi79 deleted the get-aqll-key-value branch February 12, 2025 05:41
@denyeart
Copy link
Contributor

@pfi79 released - https://github.com/hyperledger/fabric-chaincode-go/releases/tag/v2.3.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants