bug: BIGQUERY backend generates invalid query when calling .distinct on subset on table with array column #10553

Open
@greg-offerfit

Description

What happened?

Given a GBQ table with the following schema:

id: int
int_ids: array<!int64>  (int64 REPEATED)

Calling table.distinct(on=["id"]).execute() fails with the following error:

google.api_core.exceptions.BadRequest: 400 The argument to ARRAY_AGG must not be an array type but was ARRAY<INT64> at [7:5]; reason: invalidQuery, location: query, message: The argument to ARRAY_AGG must not be an array type but was ARRAY<INT64> at [7:5]

I expected this to generate a valid query and return a DataFrame.
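
To make the expected result concrete, here is a minimal pure-Python sketch of the semantics I would expect from distinct(on=["id"]) with ibis's default keep="first": one row per id, with the array column carried along unchanged. The helper name distinct_on_first and the sample rows are hypothetical and only illustrate the expectation; this is not how ibis or BigQuery implements it, and without an explicit ordering BigQuery is free to pick a different row as the "first" one.

```python
from typing import Any, Hashable, Iterable

Row = dict[str, Any]


def distinct_on_first(rows: Iterable[Row], on: list[str]) -> list[Row]:
    """Keep the first row seen for each distinct combination of the `on` columns.

    Hypothetical helper that mimics the expected result of
    table.distinct(on=[...]) with keep="first"; it is not part of ibis.
    """
    seen: set[tuple[Hashable, ...]] = set()
    result: list[Row] = []
    for row in rows:
        # Only the `on` columns form the key, so unhashable values such as
        # arrays in the remaining columns are never hashed or compared.
        key = tuple(row[col] for col in on)
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


# Made-up rows matching the schema above: id (int), int_ids (array of int64).
rows = [
    {"id": 1, "int_ids": [10, 20]},
    {"id": 1, "int_ids": [30]},
    {"id": 2, "int_ids": []},
]

deduped = distinct_on_first(rows, on=["id"])
print(deduped)
# [{'id': 1, 'int_ids': [10, 20]}, {'id': 2, 'int_ids': []}]
```

The point is that the array column is only passed through, never aggregated, so an array-typed column should not prevent deduplicating on id.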

See the comment below for a minimal reproduction.

What version of ibis are you using?

9.5.0

What backend(s) are you using, if any?

BigQuery

Relevant log output

google.api_core.exceptions.BadRequest: 400 The argument to ARRAY_AGG must not be an array type but was ARRAY<INT64> at [7:5]; reason: invalidQuery, location: query, message: The argument to ARRAY_AGG must not be an array type but was ARRAY<INT64> at [7:5]

Metadata

Labels: bug (Incorrect behavior inside of ibis)

Project status: backlog
