Skip to content

[BUG] PPL multisearch without further processing fails #5145

@VladNastase

Description

@VladNastase

What is the bug?
When doing a multisearch with two (or more) subsearches, the multisearch fails if no fields are selected either in the subsearch or the multisearch or if no further processing is done (e.g. by using the stats command).

All example queries are runnable in the OpenSearch Playground, using the sample data (specifically the opensearch_dashboards_sample_data_logs index/datasource).
The following query:

| multisearch 
    [search source=opensearch_dashboards_sample_data_logs | where geo.src='US'] 
    [search source=opensearch_dashboards_sample_data_logs | where geo.src='IT']

results in the following error (formatted for better readability):

{
  "error": {
    "reason": "There was internal problem at backend",
    "details": "java.sql.SQLException: Error while preparing plan [LogicalSystemLimit(sort0=[$20], dir0=[DESC], fetch=[10000], type=[QUERY_SIZE_LIMIT])
  LogicalProject(referer=[$0], request=[$1], agent=[$2], extension=[$3], memory=[$4], ip=[$5], index=[$6], message=[$7], url=[$8], tags=[$9], geo=[$10], utc_time=[$15], bytes=[$16], machine=[$17], response=[$20], clientip=[$21], host=[$22], event=[$23], phpmemory=[$25], timestamp=[$26], @timestamp=[$27])
    LogicalSort(sort0=[$27], dir0=[DESC])
      LogicalUnion(all=[true])
        LogicalProject(referer=[$0], request=[$1], agent=[$2], extension=[$3], memory=[$4], ip=[$5], index=[$6], message=[$7], url=[$8], tags=[$9], geo=[$10], geo.coordinates=[$11], geo.dest=[$12], geo.src=[$13], geo.srcdest=[$14], utc_time=[$15], bytes=[$16], machine=[$17], machine.os=[$18], machine.ram=[$19], response=[$20], clientip=[$21], host=[$22], event=[$23], event.dataset=[$24], phpmemory=[$25], timestamp=[$26], @timestamp=[$33])
          LogicalFilter(condition=[=($13, 'US')])
            LogicalProject(referer=[$0], request=[$1], agent=[$2], extension=[$3], memory=[$4], ip=[$5], index=[$6], message=[$7], url=[$8], tags=[$9], geo=[$10], geo.coordinates=[$11], geo.dest=[$12], geo.src=[$13], geo.srcdest=[$14], utc_time=[$15], bytes=[$16], machine=[$17], machine.os=[$18], machine.ram=[$19], response=[$20], clientip=[$21], host=[$22], event=[$23], event.dataset=[$24], phpmemory=[$25], timestamp=[$26], _id=[$27], _index=[$28], _score=[$29], _maxscore=[$30], _sort=[$31], _routing=[$32], @timestamp=[$26])
              CalciteLogicalIndexScan(table=[[OpenSearch, opensearch_dashboards_sample_data_logs]])
        LogicalProject(referer=[$0], request=[$1], agent=[$2], extension=[$3], memory=[$4], ip=[$5], index=[$6], message=[$7], url=[$8], tags=[$9], geo=[$10], geo.coordinates=[$11], geo.dest=[$12], geo.src=[$13], geo.srcdest=[$14], utc_time=[$15], bytes=[$16], machine=[$17], machine.os=[$18], machine.ram=[$19], response=[$20], clientip=[$21], host=[$22], event=[$23], event.dataset=[$24], phpmemory=[$25], timestamp=[$26], @timestamp=[$33])
          LogicalFilter(condition=[=($13, 'IT')])
            LogicalProject(referer=[$0], request=[$1], agent=[$2], extension=[$3], memory=[$4], ip=[$5], index=[$6], message=[$7], url=[$8], tags=[$9], geo=[$10], geo.coordinates=[$11], geo.dest=[$12], geo.src=[$13], geo.srcdest=[$14], utc_time=[$15], bytes=[$16], machine=[$17], machine.os=[$18], machine.ram=[$19], response=[$20], clientip=[$21], host=[$22], event=[$23], event.dataset=[$24], phpmemory=[$25], timestamp=[$26], _id=[$27], _index=[$28], _score=[$29], _maxscore=[$30], _sort=[$31], _routing=[$32], @timestamp=[$26])
              CalciteLogicalIndexScan(table=[[OpenSearch, opensearch_dashboards_sample_data_logs]])]",
    "type": "RuntimeException"
  },
  "status": 500
}

The following two queries work:

| multisearch 
    [search source=opensearch_dashboards_sample_data_logs | where geo.src='US'] 
    [search source=opensearch_dashboards_sample_data_logs | where geo.src='IT'] 
| fields @timestamp, geo, bytes
| multisearch 
    [search source=opensearch_dashboards_sample_data_logs | where geo.src='US' | fields @timestamp, geo, bytes] 
    [search source=opensearch_dashboards_sample_data_logs | where geo.src='IT' | fields @timestamp, geo, bytes] 

Trying to select all fields using * does not work. The following query:

| multisearch 
    [search source=opensearch_dashboards_sample_data_logs | where geo.src='US'] 
    [search source=opensearch_dashboards_sample_data_logs | where geo.src='IT'] 
| fields `*`

results in the following error (formatted for better readability):

{
  "error": {
    "reason": "There was internal problem at backend",
    "details": "java.sql.SQLException: Error while preparing plan [LogicalSystemLimit(sort0=[$27], dir0=[DESC], fetch=[10000], type=[QUERY_SIZE_LIMIT])
  LogicalSort(sort0=[$27], dir0=[DESC])
    LogicalUnion(all=[true])
      LogicalProject(referer=[$0], request=[$1], agent=[$2], extension=[$3], memory=[$4], ip=[$5], index=[$6], message=[$7], url=[$8], tags=[$9], geo=[$10], geo.coordinates=[$11], geo.dest=[$12], geo.src=[$13], geo.srcdest=[$14], utc_time=[$15], bytes=[$16], machine=[$17], machine.os=[$18], machine.ram=[$19], response=[$20], clientip=[$21], host=[$22], event=[$23], event.dataset=[$24], phpmemory=[$25], timestamp=[$26], @timestamp=[$33])
        LogicalFilter(condition=[=($13, 'US')])
          LogicalProject(referer=[$0], request=[$1], agent=[$2], extension=[$3], memory=[$4], ip=[$5], index=[$6], message=[$7], url=[$8], tags=[$9], geo=[$10], geo.coordinates=[$11], geo.dest=[$12], geo.src=[$13], geo.srcdest=[$14], utc_time=[$15], bytes=[$16], machine=[$17], machine.os=[$18], machine.ram=[$19], response=[$20], clientip=[$21], host=[$22], event=[$23], event.dataset=[$24], phpmemory=[$25], timestamp=[$26], _id=[$27], _index=[$28], _score=[$29], _maxscore=[$30], _sort=[$31], _routing=[$32], @timestamp=[$26])
            CalciteLogicalIndexScan(table=[[OpenSearch, opensearch_dashboards_sample_data_logs]])
      LogicalProject(referer=[$0], request=[$1], agent=[$2], extension=[$3], memory=[$4], ip=[$5], index=[$6], message=[$7], url=[$8], tags=[$9], geo=[$10], geo.coordinates=[$11], geo.dest=[$12], geo.src=[$13], geo.srcdest=[$14], utc_time=[$15], bytes=[$16], machine=[$17], machine.os=[$18], machine.ram=[$19], response=[$20], clientip=[$21], host=[$22], event=[$23], event.dataset=[$24], phpmemory=[$25], timestamp=[$26], @timestamp=[$33])
        LogicalFilter(condition=[=($13, 'IT')])
          LogicalProject(referer=[$0], request=[$1], agent=[$2], extension=[$3], memory=[$4], ip=[$5], index=[$6], message=[$7], url=[$8], tags=[$9], geo=[$10], geo.coordinates=[$11], geo.dest=[$12], geo.src=[$13], geo.srcdest=[$14], utc_time=[$15], bytes=[$16], machine=[$17], machine.os=[$18], machine.ram=[$19], response=[$20], clientip=[$21], host=[$22], event=[$23], event.dataset=[$24], phpmemory=[$25], timestamp=[$26], _id=[$27], _index=[$28], _score=[$29], _maxscore=[$30], _sort=[$31], _routing=[$32], @timestamp=[$26])
            CalciteLogicalIndexScan(table=[[OpenSearch, opensearch_dashboards_sample_data_logs]])]",
    "type": "RuntimeException"
  },
  "status": 500
}

How can one reproduce the bug?
Steps to reproduce the behavior:

  1. Go to the OpenSearch playground.
  2. Use the queries provided above either in the Dev Console or the Query Workbench.

What is the expected behavior?
In the absence of any field selection, select all the fields in the result.

What is your host/environment?

  • OS: N/A
  • Version 3.4.0
  • Plugins: N/A

Do you have any screenshots?
N/A

Do you have any additional context?
The following query also works:

| multisearch 
    [search source=opensearch_dashboards_sample_data_logs | where geo.src='US'] 
    [search source=opensearch_dashboards_sample_data_logs | where geo.src='IT'] 
| stats avg(bytes)

Metadata

Metadata

Assignees

No one assigned

    Labels

    PPLPiped processing languagebugSomething isn't working

    Type

    No type

    Projects

    Status

    Not Started

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions