Skip to content

Return empty data instead of throwing when Fields API tries to access non-available "fielddata" #99583

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

ldematte
Copy link
Contributor

@ldematte ldematte commented Sep 14, 2023

Address the problem highlighted in #94403

The PR is unfortunately large because I went through to make the change complete and buildable (fixing all the tests and mocks too). For review, I think the best way is to concentrate on 5 files/classes:

  • The new method in MappedFieldType.java
  • SearchExecutionContext.java, the constructor and how it is used in IndexService.java (I added a self-comment there)
    • Compare with the constructor in SearchLookup.java - I added a self-comment there as well, to request input
  • How the new method is used in LeafDocLookup.java in the code path specific for field-API access
    • See also self-comment on FieldFactoryWrapper alternatives to return an EmptyField - either make it better or remove it/move things one level up.

After exploring the interactions between context, docs, fielddata etc. I think that the best way to solve the issue is if there is a way for MappedFieldTypes to expose if they can read fielddata. This way the field API can avoid using the fielddata builder (and throwing), and return an EmptyField instead.

To do this, I added a public boolean isFielddataSupported(FieldDataContext fieldDataContext) to MappedFieldType; exposing it to LeafDocLookup with the appropriate context required a bit of propagation, through IndexService, SearchExecutionContext and SearchLookup and the implementation of the new method in every MappedFieldType derivate class. In the end however code changes were simple, but spread across the codebase.

Closes #94403

@@ -106,7 +111,7 @@ public SearchExecutionContext(
int shardRequestIndex,
IndexSettings indexSettings,
BitsetFilterCache bitsetFilterCache,
BiFunction<MappedFieldType, FieldDataContext, IndexFieldData<?>> indexFieldDataLookup,
IndexFieldDataLookup indexFieldDataLookup,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that introducing an interface here instead of adding another BiFunction would have made things clearer, but I have second thoughts about it. See SearchLookup for contrast, where instead I introduced an additional function

*/
public SearchLookup(
Function<String, MappedFieldType> fieldTypeLookup,
TriFunction<MappedFieldType, Supplier<SearchLookup>, MappedFieldType.FielddataOperation, IndexFieldData<?>> fieldDataLookup,
TriFunction<MappedFieldType, Supplier<SearchLookup>, MappedFieldType.FielddataOperation, Boolean> fieldDataPredicate,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I introduced a second TriFunction instead of an interface that includes both the lookup and the predicate - see SearchExecutionContext for that approach.
I'd say we keep one for consistency, but it's not clear to me which one is more readable so I kept both and I am asking for feedback.

@@ -660,7 +661,17 @@ public SearchExecutionContext newSearchExecutionContext(
shardRequestIndex,
indexSettings,
indexCache.bitsetFilterCache(),
indexFieldData::getForField,
new SearchExecutionContext.IndexFieldDataLookup() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is where the single interface with both lookup and predicate methods is "worse": the anonymous class is not brilliant to read.
If we like the single interface, I would like to extract a static inner class (adapter) to make it nicer (more readable). Thoughts?

@@ -100,6 +104,23 @@ private FieldFactoryWrapper getFactoryForField(String fieldName) {
throw new IllegalArgumentException("No field found for [" + fieldName + "] in mapping");
}

if (fieldDataPredicate.apply(fieldType, SCRIPT) == false) {
return new FieldFactoryWrapper(new DocValuesScriptFieldFactory() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is temporary; if we are happy with this approach, I would suggest to make FieldFactoryWrapper an interface with 2 concrete implementations, one equivalent to the concrete class now, one "empty". Hiding the factory, etc.

But there is a preliminary Q on this point: I see the factory wrapper is (can) shared with doc-API access and cached. Should we avoid it? Should we pull the check one level up (inside getScriptField), avoid the wrapper and the cache altogether? (I think it might be wrong to cache it, as the predicate could return different answers based on context).
I wrote it like this before realizing the predicate could change based on context; maybe the cache is for the captured context only, so no problem, but better check and in doubt move this out.

@ldematte ldematte changed the title Fields api fieldata no throw Return empty data instead of throwing when Fields API tries to access non-available "fielddata" Sep 19, 2023
@mattc58 mattc58 added v8.12.0 and removed v8.11.0 labels Oct 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Script: Fields API should not throw for disabled field data
5 participants