Adding support to exclude semantic_text subfields #127664

Merged
38 commits
1e18ab8
Adding support to exclude semantic_text subfields
Samiul-TheSoccerFan May 2, 2025
fe2fc53
Update docs/changelog/127664.yaml
Samiul-TheSoccerFan May 2, 2025
92cfc54
Updating changelog file
Samiul-TheSoccerFan May 2, 2025
5cddc44
remove duplicate test from yaml file
Samiul-TheSoccerFan May 2, 2025
313ba7b
Adding support to exclude semantic_text subfields from mapper builders
Samiul-TheSoccerFan May 6, 2025
1d37d43
Adding support for generic field types
Samiul-TheSoccerFan May 8, 2025
bec04f5
refactoring to use builder and setting exclude value from semantic_te…
Samiul-TheSoccerFan May 8, 2025
f98b8da
update in semantic_text mapper and fetcher to incorporate the support…
Samiul-TheSoccerFan May 8, 2025
b270339
Fix code style issue
Samiul-TheSoccerFan May 8, 2025
0ca741d
adding node feature for yaml tests
Samiul-TheSoccerFan May 8, 2025
be2d6ef
Adding more restrictive checks on yaml tests and few refactoring
Samiul-TheSoccerFan May 8, 2025
e47c818
Returns metadata fields from metadata mappers
Samiul-TheSoccerFan May 22, 2025
c6f768a
returns all source fields for fieldcaps
Samiul-TheSoccerFan May 22, 2025
485bd85
gather all fields and iterate to process for fieldcaps api
Samiul-TheSoccerFan May 22, 2025
45ada04
revert back all changes from MappedFieldtype and subclasses
Samiul-TheSoccerFan May 22, 2025
978f771
revert back exclude logic from semantic_text mapper
Samiul-TheSoccerFan May 22, 2025
a5c0772
fix lint issues
Samiul-TheSoccerFan May 22, 2025
a790296
fix lint issues
Samiul-TheSoccerFan May 22, 2025
2afd580
Adding runtime fields into fieldCaps
Samiul-TheSoccerFan May 23, 2025
59d497c
Fix linting issue
Samiul-TheSoccerFan May 23, 2025
0dbfb54
removing unused functions that used in previous implementation
Samiul-TheSoccerFan May 23, 2025
c1f6f60
fix multifield tests failure
Samiul-TheSoccerFan May 23, 2025
a7ee4ea
getting alias fields for field caps
Samiul-TheSoccerFan May 23, 2025
bb67f02
adding support for query time runtime fields
Samiul-TheSoccerFan May 26, 2025
a95e888
[CI] Auto commit changes from spotless
elasticsearchmachine May 26, 2025
0fcb4d2
Fix empty mapping fieldCaps call
Samiul-TheSoccerFan May 26, 2025
46e1074
Address passthrough behavior for mappers
Samiul-TheSoccerFan May 27, 2025
17025b3
Fix SearchAsYoutype mapper failures
Samiul-TheSoccerFan May 28, 2025
5384dec
rename abstract method to have more meaningful name
Samiul-TheSoccerFan May 28, 2025
1cef39f
Rename mapper function to match its functionality
Samiul-TheSoccerFan May 28, 2025
a831ac8
Adding filtering for infernece subfields
Samiul-TheSoccerFan Jun 5, 2025
adcb2d2
revert back previous implementation changes
Samiul-TheSoccerFan Jun 5, 2025
225bc23
merge from main
Samiul-TheSoccerFan Jun 5, 2025
0b11a20
Merge branch 'main' into exclude-subfields-for-field-caps-api
elasticmachine Jun 9, 2025
22f9af9
Adding yaml test for field caps not filtering multi-field
Samiul-TheSoccerFan Jun 9, 2025
7589cde
Fixing yaml test
Samiul-TheSoccerFan Jun 9, 2025
5fd925e
Merge branch 'main' into exclude-subfields-for-field-caps-api
elasticmachine Jun 9, 2025
cbce7c5
Adding comment why .infernece filter is added
Samiul-TheSoccerFan Jun 10, 2025
5 changes: 5 additions & 0 deletions docs/changelog/127664.yaml
@@ -0,0 +1,5 @@
+pr: 127664
+summary: Exclude `semantic_text` subfields from field capabilities API
+area: "Mapping"
+type: enhancement
+issues: []

@@ -9,6 +9,7 @@

 package org.elasticsearch.action.fieldcaps;

+import org.elasticsearch.cluster.metadata.InferenceFieldMetadata;
 import org.elasticsearch.cluster.metadata.MappingMetadata;
 import org.elasticsearch.core.Booleans;
 import org.elasticsearch.core.Nullable;
@@ -30,6 +31,7 @@
 import org.elasticsearch.tasks.CancellableTask;

 import java.io.IOException;
+import java.util.Collection;
 import java.util.Collections;
 import java.util.HashMap;
 import java.util.Map;
@@ -256,6 +258,14 @@ private static Predicate<MappedFieldType> buildFilter(String[] filters, String[]
             Set<String> acceptedTypes = Set.of(fieldTypes);
             fcf = ft -> acceptedTypes.contains(ft.familyTypeName());
         }
+
+        // Exclude internal ".inference" subfields of semantic_text fields from the field capabilities response
+        Collection<InferenceFieldMetadata> inferenceFields = context.getMappingLookup().inferenceFields().values();
+        for (InferenceFieldMetadata inferenceField : inferenceFields) {
+            Predicate<MappedFieldType> next = ft -> ft.name().startsWith(inferenceField.getName() + ".inference") == false;
+            fcf = fcf == null ? next : fcf.and(next);
+        }
+
         for (String filter : filters) {
             if ("parent".equals(filter) || "-parent".equals(filter)) {
                 continue;
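
The hunk above chains one exclusion predicate per semantic_text field onto the existing filter. A minimal standalone sketch of that chaining pattern follows. It is not the Elasticsearch code: it assumes plain String field names in place of MappedFieldType, and the class and method names (InferenceSubfieldFilterSketch, excludeInferenceSubfields, visibleFields) are hypothetical. Returning an accept-all predicate when there are no semantic_text fields is also an assumption made only for this sketch.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class InferenceSubfieldFilterSketch {

    // Hypothetical helper: builds a single predicate that rejects every field name
    // starting with "<semantic_text field>.inference", one chained check per field.
    public static Predicate<String> excludeInferenceSubfields(List<String> inferenceFieldNames) {
        Predicate<String> filter = null;
        for (String fieldName : inferenceFieldNames) {
            String prefix = fieldName + ".inference";
            Predicate<String> next = name -> name.startsWith(prefix) == false;
            filter = filter == null ? next : filter.and(next);
        }
        // Assumption for this sketch only: with nothing to exclude, accept every field.
        return filter == null ? name -> true : filter;
    }

    // Applies the combined predicate to a list of field names, preserving order.
    public static List<String> visibleFields(List<String> allFields, List<String> inferenceFieldNames) {
        return allFields.stream()
            .filter(excludeInferenceSubfields(inferenceFieldNames))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> all = List.of(
            "sparse_field",
            "sparse_field.inference.chunks.embeddings",
            "sparse_field.sparse_keyword_field",
            "dense_field",
            "dense_field.inference.chunks.offset",
            "another_field"
        );
        System.out.println(visibleFields(all, List.of("sparse_field", "dense_field")));
    }
}
```

In this sketch the multi-field name sparse_field.sparse_keyword_field passes the filter because it does not start with the sparse_field.inference prefix, which matches the behaviour the multi-field YAML test in this PR checks for.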

@@ -15,6 +15,7 @@

 import java.util.Set;

+import static org.elasticsearch.xpack.inference.mapper.SemanticTextFieldMapper.SEMANTIC_TEXT_EXCLUDE_SUB_FIELDS_FROM_FIELD_CAPS;
 import static org.elasticsearch.xpack.inference.mapper.SemanticTextFieldMapper.SEMANTIC_TEXT_SUPPORT_CHUNKING_CONFIG;
 import static org.elasticsearch.xpack.inference.queries.SemanticKnnVectorQueryRewriteInterceptor.SEMANTIC_KNN_FILTER_FIX;
 import static org.elasticsearch.xpack.inference.queries.SemanticKnnVectorQueryRewriteInterceptor.SEMANTIC_KNN_VECTOR_QUERY_REWRITE_INTERCEPTION_SUPPORTED;
@@ -59,7 +60,8 @@ public Set<NodeFeature> getTestFeatures() {
             SemanticTextFieldMapper.SEMANTIC_TEXT_HANDLE_EMPTY_INPUT,
             TEST_RULE_RETRIEVER_WITH_INDICES_THAT_DONT_RETURN_RANK_DOCS,
             SEMANTIC_TEXT_SUPPORT_CHUNKING_CONFIG,
-            SEMANTIC_TEXT_MATCH_ALL_HIGHLIGHTER
+            SEMANTIC_TEXT_MATCH_ALL_HIGHLIGHTER,
+            SEMANTIC_TEXT_EXCLUDE_SUB_FIELDS_FROM_FIELD_CAPS
         );
     }
 }

@@ -134,6 +134,9 @@ public class SemanticTextFieldMapper extends FieldMapper implements InferenceFie
     public static final NodeFeature SEMANTIC_TEXT_SKIP_INFERENCE_FIELDS = new NodeFeature("semantic_text.skip_inference_fields");
     public static final NodeFeature SEMANTIC_TEXT_BIT_VECTOR_SUPPORT = new NodeFeature("semantic_text.bit_vector_support");
     public static final NodeFeature SEMANTIC_TEXT_SUPPORT_CHUNKING_CONFIG = new NodeFeature("semantic_text.support_chunking_config");
+    public static final NodeFeature SEMANTIC_TEXT_EXCLUDE_SUB_FIELDS_FROM_FIELD_CAPS = new NodeFeature(
+        "semantic_text.exclude_sub_fields_from_field_caps"
+    );

     public static final String CONTENT_TYPE = "semantic_text";
     public static final String DEFAULT_ELSER_2_INFERENCE_ID = DEFAULT_ELSER_ID;

@@ -359,3 +359,76 @@ setup:
         index: test-always-include-inference-id-index

   - exists: test-always-include-inference-id-index.mappings.properties.semantic_field.inference_id
+
+---
+"Field caps exclude chunks and embedding fields":
+  - requires:
+      cluster_features: "semantic_text.exclude_sub_fields_from_field_caps"
+      reason: field caps api exclude semantic_text subfields from 9.1.0 & 8.19.0
+
+  - do:
+      field_caps:
+        include_empty_fields: true
+        index: test-index
+        fields: "*"
+
+  - match: { indices: [ "test-index" ] }
+  - exists: fields.sparse_field
+  - exists: fields.dense_field
+  - not_exists: fields.sparse_field.inference.chunks.embeddings
+  - not_exists: fields.sparse_field.inference.chunks.offset
+  - not_exists: fields.sparse_field.inference.chunks
+  - not_exists: fields.sparse_field.inference
+  - not_exists: fields.dense_field.inference.chunks.embeddings
+  - not_exists: fields.dense_field.inference.chunks.offset
+  - not_exists: fields.dense_field.inference.chunks
+  - not_exists: fields.dense_field.inference
+
+---
+"Field caps does not exclude multi-fields under semantic_text":
+  - requires:
+      cluster_features: "semantic_text.exclude_sub_fields_from_field_caps"
+      reason: field caps api exclude semantic_text subfields from 9.1.0 & 8.19.0
+  - do:
+      indices.create:
+        index: test-multi-field-index
+        body:
+          settings:
+            index:
+              mapping:
+                semantic_text:
+                  use_legacy_format: false
+          mappings:
+            properties:
+              sparse_field:
+                type: semantic_text
+                inference_id: sparse-inference-id
+                fields:
+                  sparse_keyword_field:
+                    type: keyword
+              dense_field:
+                type: semantic_text
+                inference_id: dense-inference-id
+                fields:
+                  dense_keyword_field:
+                    type: keyword
+
+  - do:
+      field_caps:
+        include_empty_fields: true
+        index: test-multi-field-index
+        fields: "*"
+
+  - match: { indices: [ "test-multi-field-index" ] }
+  - exists: fields.sparse_field
+  - exists: fields.dense_field
+  - exists: fields.sparse_field\.sparse_keyword_field
+  - exists: fields.dense_field\.dense_keyword_field
+  - not_exists: fields.sparse_field.inference.chunks.embeddings
+  - not_exists: fields.sparse_field.inference.chunks.offset
+  - not_exists: fields.sparse_field.inference.chunks
+  - not_exists: fields.sparse_field.inference
+  - not_exists: fields.dense_field.inference.chunks.embeddings
+  - not_exists: fields.dense_field.inference.chunks.offset
+  - not_exists: fields.dense_field.inference.chunks
+  - not_exists: fields.dense_field.inference

@@ -307,3 +307,26 @@ setup:
               another_field:
                 type: keyword

+---
+"Field caps exclude chunks embedding and text fields":
+  - requires:
+      cluster_features: "semantic_text.exclude_sub_fields_from_field_caps"
+      reason: field caps api exclude semantic_text subfields from 9.1.0 & 8.19.0
+
+  - do:
+      field_caps:
+        include_empty_fields: true
+        index: test-index
+        fields: "*"
+
+  - match: { indices: [ "test-index" ] }
+  - exists: fields.sparse_field
+  - exists: fields.dense_field
+  - not_exists: fields.sparse_field.inference.chunks.embeddings
+  - not_exists: fields.sparse_field.inference.chunks.text
+  - not_exists: fields.sparse_field.inference.chunks
+  - not_exists: fields.sparse_field.inference
+  - not_exists: fields.dense_field.inference.chunks.embeddings
+  - not_exists: fields.dense_field.inference.chunks.text
+  - not_exists: fields.dense_field.inference.chunks
+  - not_exists: fields.dense_field.inference