[Feature request] Specialized language segmentation for mixed-language pages #1184

@atomarna

I would like to use the newly introduced specialized language segmentation on a mixed-language site. Specifically, I have English text interspersed with quotes in Japanese. Since the site has an English lang attribute, each Japanese quote is indexed as a single entry (as shown in the playground), so searching for words within a quote returns no results; only searching for the entire quote or its beginning does.

I am not sure how common the use case is, but it would be extremely useful if I could turn on the segmentation for a site with an English lang attribute, provided it is applied only to the non-Latin script present on the pages.
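
To illustrate the behaviour I have in mind, here is a rough Python sketch. It is not the project's actual implementation: `index_terms`, `per_character`, and `whole_run` are hypothetical names, the Unicode ranges cover only Hiragana, Katakana, and the basic CJK ideograph block, and the per-character splitter is just a stand-in for whatever specialized segmenter is really used. The idea is that Latin text is tokenized as usual, while each run of Japanese script is handed to the segmenter.

```python
import re
from typing import Callable, List

# Assumption: a "Japanese run" is any stretch of Hiragana, Katakana,
# or basic CJK Unified Ideographs. A real detector would cover more blocks.
JAPANESE_RUN = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff]+")


def index_terms(text: str, segment: Callable[[str], List[str]]) -> List[str]:
    """Tokenize Latin text on word boundaries; pass Japanese runs to `segment`."""
    terms: List[str] = []
    pos = 0
    for match in JAPANESE_RUN.finditer(text):
        # Text before the run is treated as ordinary (Latin-script) content.
        terms.extend(re.findall(r"\w+", text[pos:match.start()]))
        # Only the non-Latin run goes through the specialized segmenter.
        terms.extend(segment(match.group()))
        pos = match.end()
    terms.extend(re.findall(r"\w+", text[pos:]))
    return terms


def per_character(run: str) -> List[str]:
    """Placeholder segmenter: one term per character (not real word segmentation)."""
    return list(run)


def whole_run(run: str) -> List[str]:
    """Mimics the behaviour described above: the quote stays a single entry."""
    return [run]


sample = "She wrote 猫が好き in the margin."
current_terms = index_terms(sample, whole_run)
wanted_terms = index_terms(sample, per_character)
```

With `whole_run`, the quote `猫が好き` ends up as one term, so a search for a word inside it finds nothing. With a segmenter applied only to the Japanese run, the English terms are unchanged and the quote becomes searchable by its parts.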
