Don't return a parent field if its type is not listed in the types parameter #123148

Open
wants to merge 6 commits into
base: main

luizgpsantos
Contributor

When the types query parameter is set in a field capabilities request, we aim to return only fields of the specified types. However, we currently also return object fields whenever an object has at least one child of a requested type. This PR checks whether the parent field's type is included in the types parameter before adding the parent to the list of fields to be returned.

fixes: #109797
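
To make the intended behavior concrete, here is a minimal sketch in plain Java. It is not the Elasticsearch implementation: the class `ParentTypeFilterSketch`, the method `fieldsToReturn`, and the flat name-to-type map are hypothetical, every parent is assumed to be of type `object` (nested parents are ignored), and an empty `types` set is assumed to mean "no type restriction".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified stand-in for type filtering in field caps; not Elasticsearch code. */
final class ParentTypeFilterSketch {

    /**
     * @param leafFields flat field name to type, e.g. "user.name" -> "keyword"
     * @param types      requested types; an empty set means no restriction (assumption)
     * @return field name to type for every field that would be returned
     */
    static Map<String, String> fieldsToReturn(Map<String, String> leafFields, Set<String> types) {
        Map<String, String> result = new LinkedHashMap<>();
        boolean unrestricted = types.isEmpty();
        // The check described above: parents are only eligible if "object" was requested.
        boolean parentsAllowed = unrestricted || types.contains("object");

        for (Map.Entry<String, String> entry : leafFields.entrySet()) {
            if (!unrestricted && !types.contains(entry.getValue())) {
                continue; // leaf type was not requested
            }
            result.put(entry.getKey(), entry.getValue());

            if (parentsAllowed) {
                // Walk up the dotted path and register each ancestor as an object.
                String name = entry.getKey();
                int dot = name.lastIndexOf('.');
                while (dot > 0) {
                    name = name.substring(0, dot);
                    result.putIfAbsent(name, "object");
                    dot = name.lastIndexOf('.');
                }
            }
        }
        return result;
    }
}
```

With the fields `user.name` (keyword), `user.age` (long), and `title` (text), requesting only `keyword` yields just `user.name` in this sketch, while requesting `keyword` and `object` also yields the parent `user`.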

@luizgpsantos added the >enhancement, Team:Search Foundations, and :Search Foundations/Search labels on Feb 21, 2025
@luizgpsantos requested a review from piergm on February 21, 2025 14:17
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@elasticsearchmachine added the external-contributor label on Feb 21, 2025
Member

@piergm piergm left a comment

The PR looks good @luizgpsantos! I left a few comments, but we are on the right track!
Thanks!

@@ -214,7 +215,10 @@ static Map<String, IndexFieldCapabilities> retrieveFieldCaps(
null,
Map.of()
);
responseMap.put(parentField, fieldCap);

if (filter == null || Arrays.asList(types).contains(type)) {
Member

Careful: calling Arrays.asList inside a loop creates a new list from the array on every iteration, which is not very efficient.
There are a couple of options here:

  1. Convert the array to a List once, before the for loop (this should be safe since we are not changing the array's contents).
  2. Use a Set instead of a List (also created outside the loop), so that the .contains call is O(1) rather than O(n).
    Side note: I don't expect this array to ever become huge, so both approaches should be safe, but to be extra safe I'd opt for the Set.
    If we use a Set, we can even go for Set.of(types), since it returns an unmodifiable set!
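
A minimal sketch of the second option, with the lookup set built once outside the loop. The class `TypeLookupSketch` and the method `keepRequested` are hypothetical and only illustrate the pattern; they are not taken from the PR. One caveat worth checking before adopting `Set.of(types)` directly: `Set.of` throws `IllegalArgumentException` on duplicate elements, so this sketch uses `Set.copyOf`, which also returns an unmodifiable set but tolerates duplicates (both reject `null` elements).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of hoisting the type lookup out of a loop. */
final class TypeLookupSketch {

    /** Returns the candidate types that are present in the requested types array. */
    static List<String> keepRequested(String[] types, List<String> candidateTypes) {
        // Built once, before the loop. Set.copyOf returns an unmodifiable set and
        // tolerates duplicates in the input, unlike Set.of(types).
        Set<String> requested = Set.copyOf(Arrays.asList(types));

        List<String> kept = new ArrayList<>();
        for (String candidate : candidateTypes) {
            // Hash-based lookup instead of a linear scan per iteration.
            if (requested.contains(candidate)) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```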

Contributor Author

Yep, that makes sense. To avoid the duplicated initialization of Set.of(types), I changed some other places. Also, now I have to check for the parent in the filters, so I used the same Set strategy.

assertNotNull(response.get("field4.field5"));
assertNotNull(response.get("_index"));
}

Member

How should this work if we use the filters -parent or +parent? (ref).
Maybe we can add some more test coverage to make sure we have the expected behaviour here!

Contributor Author

I added two extra tests to cover that. What surprised me is that +parent doesn't exist, contrary to what I expected. Instead, parent is the affirmative form for retrieving the parent fields.
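
One way to picture how the filters and types parameters might combine for parent fields is the small predicate below. It is a sketch under stated assumptions, not the PR's code: the class `ParentFilterSketch` and the method `includeParent` are hypothetical, `-parent` is assumed to suppress parents entirely, `parent` (or no filter) is assumed to leave them enabled, and an empty `types` set is assumed to mean "no type restriction".

```java
import java.util.Set;

/** Hypothetical predicate combining the parent filter with the requested types. */
final class ParentFilterSketch {

    /**
     * @param filters    requested filters, e.g. "parent" or "-parent"
     * @param types      requested types; empty means no restriction (assumption)
     * @param parentType type of the parent field, e.g. "object"
     */
    static boolean includeParent(Set<String> filters, Set<String> types, String parentType) {
        if (filters.contains("-parent")) {
            return false; // parents explicitly excluded
        }
        // "parent" or no filter: parents are eligible, subject to the types check.
        return types.isEmpty() || types.contains(parentType);
    }
}
```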

@piergm
Member

piergm commented Feb 21, 2025

@elasticmachine test this please

@piergm piergm self-assigned this Feb 21, 2025
@javanna
Member

javanna commented Feb 24, 2025

Drive-by comment: in general, I don't understand why it is useful to return objects at all in the field caps output. What capabilities do objects have? They are just containers, and they are not very useful for querying or aggregating.

@javanna changed the title from "Don't return a parent filed if its type is not listed in the types parameter" to "Don't return a parent field if its type is not listed in the types parameter" on Feb 24, 2025
@elasticsearchmachine
Collaborator

Hi @luizgpsantos, I've created a changelog YAML for you.

Labels
>enhancement, external-contributor, :Search Foundations/Search, Team:Search Foundations, v9.1.0
Development

Successfully merging this pull request may close these issues.

_field_caps with types query parameter returns also objects types if it's a parent of a requested type
4 participants