Skip to content
This repository was archived by the owner on Nov 26, 2024. It is now read-only.

SER-13233 Add support for sorting on multiple columns using CollectinOptions sortby. #680

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

YouWillNeverWalkAl1
Copy link

Added support for specifying multiple sort by columns in the CollectionOptions sortBy string separated by commas. Each column needs to be defined in allowedSortColumns in order to be used in the sort.
The sortBy string could e.g. contain the string "name asc, id desc" which would result in the corresponding sortby clause.

@JonasHallFnx JonasHallFnx self-requested a review January 18, 2024 07:45
Copy link
Contributor

@JonasHallFnx JonasHallFnx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice feature!

Some things to consider though:

  • If there is even a slight risk for sql-injection, I don't think it is worth the risk. I currently see such a risk with this PR.
  • Readability of the code is important, especially to assess the risk of the code. I think the readability and structure of the code can be improved.
  • Performance is key in code that runs often. If the input looks like the old normal case, you could make an early exit without running regex and other slow stuff. Perhaps queryAnnotation.allowedSortColumns() could be a Set in order to avoid nested loops?

For me the security part is really important. This is one of the few places where an injection could happen. I would like the code to be very easy to read and that the query is only built from strings that are defined by the application, using lookups based on the input, rather than using the input and propagating it to the sql.

@YouWillNeverWalkAl1
Copy link
Author

YouWillNeverWalkAl1 commented Jan 19, 2024

I've added that if sortby contains only one column it will continue as before (without using regexp).
As to the security considerations I can only say that if you look at the code you will see that the sort columns inserted comes solely from allowed columns and sort order only from the enum SortOrder. If they try anything else the CollectionOptions.SortOrder.valueOf will fail (actually it will not be recognized by the regexp).
As to the readability of the code there might be improvements to make (as I wrote the code I think it's pretty obvious what it does but for other it might not be). Please feel free to do any changes you see fit but I think this functionality should be included in the framework.

@@ -17,6 +21,7 @@ public class CollectionOptionsQueryPart implements QueryPart {
private final int collectionOptionsArgIndex;
private final Query queryAnnotation;
private final boolean isMono;
private static final Pattern SORT_BY_PART = Pattern.compile("(?<sortby>[a-zåäö0-9_]+)(\\s+(?<order>asc|desc))?");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need åäö?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. We are a swedish company...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to think that åäö is not being used even though Fortnox is a swedish company :). Regardless, it is probably ok to leave it as is.

}
} else {
Collections.reverse(Arrays.asList(sortByParts));
for (var sortByPart: sortByParts) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to filter out the columns that have already been appended to the ordering. For instance, sortBy="name asc, id desc, name asc, id desc, ..." should not become "ORDER BY name ASC, id DESC, name ASC, id DESC, ..." but rather "ORDER BY name ASC, id DESC".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would make the code more complicated and would not change the result. It is also a fault made by the user of RW and should not occur.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it would make the code more complicated and would not change the output. I am not sure I understand what you mean that it is a fault made by the user of RW and should not occur. Do you mean that the responsibility of dealing with "sortby=id asc, name desc, id asc, name desc, ..." falls on the application developer?

Even though it would make the code more complicated and doesn't change the output (as well as probably a slight performance impact), I prefer to see the filtering because I find it unreasonable that we would allow an end user to craft a query that looks like that. Furthermore, while it may not be a security concern related to postgres, I have other security concerns that we can discuss in private.

collectionOptionsHasOrderBy = true;
addOrderBy(sql, buildOrderBy(collectionOptions.getOrder(), allowed));
break;
var sortByParts = sortBy.split(",");
Copy link
Member

@softqwewasd softqwewasd Jan 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be good to have a check for max length for the sortBy unless Netty's query decoder already checks that?

Copy link
Author

@YouWillNeverWalkAl1 YouWillNeverWalkAl1 Feb 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know. sortBy is however not a new parameter and that change request is not addressing this change. How long should it be?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but we are now splitting it and running a regex for each column when sorting by multiple columns compared to only sorting by a single column with an equals comparison? I'd say 256 bytes is a reasonable limit, but we can put it at 1024 to be on the safe side, doesn't make that much of a difference performance wise.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants