docs(ops): operation matrix for streaming ops #9751

Open
@chloeh13q

Please describe the issue

Which streaming operations Ibis supports is a question that has come up several times. Right now the operation matrix 1) doesn't distinguish between streaming ops and batch ops and 2) doesn't list backend support separately for streaming mode and batch mode.

A few thoughts:

  1. Some streaming ops are not distinct Ibis operators, e.g. stream-table join and stream-stream join. So the matrix would require some manual work (although the number of such streaming ops is small and we only have to do it for 3 streaming backends right now).
  2. The support matrix for non-streaming-specific ops can get a little complicated because, e.g., in Spark Structured Streaming, op support depends on the output mode.
  3. But in general, I think it's helpful to give users a general sense of what they can and cannot do in Ibis streaming right now. To some extent, I think we should already support, out of the box, whatever is expressible in SQL, except where batch and streaming syntax differ (which should be the minority of cases, not the majority). And in most cases, what we don't support is probably limited by what the engine itself doesn't support (e.g. Spark Structured Streaming does not support over aggregation). But we haven't really validated this, because we don't run the generic backend unit test suite in streaming mode (Flink/Spark).
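
To make the shape of such a matrix concrete, here is a rough sketch of a manually curated table keyed by (op, backend, mode). Everything in it is hypothetical (`MANUAL_STREAMING_OPS`, `COLUMNS`, `SUPPORT`, `render_matrix`); none of it is existing Ibis API. The only filled-in cell is the Spark over-aggregation example mentioned above, and every other cell is left as "not validated" rather than guessed.

```python
# Hypothetical sketch, not existing Ibis code.

# Streaming ops that are not distinct Ibis operators and would need
# hand-written entries in the matrix.
MANUAL_STREAMING_OPS = ["stream-table join", "stream-stream join"]

# One column per (backend, mode) pair, so batch and streaming support
# are listed separately. Only the backends named in this issue appear.
COLUMNS = [
    ("flink", "batch"),
    ("flink", "streaming"),
    ("pyspark", "batch"),
    ("pyspark", "streaming"),
]

# True = supported, False = unsupported, missing key = not yet validated.
SUPPORT = {
    # The one data point stated in this issue.
    ("over aggregation", "pyspark", "streaming"): False,
}


def render_matrix(ops, columns, support):
    """Render the support table as a markdown string."""
    symbols = {True: "yes", False: "no", None: "?"}
    header = "| op | " + " | ".join(f"{b} ({m})" for b, m in columns) + " |"
    divider = "|" + " --- |" * (len(columns) + 1)
    rows = []
    for op in ops:
        # dict.get returns None for unvalidated cells, which maps to "?".
        cells = [symbols[support.get((op, b, m))] for b, m in columns]
        rows.append(f"| {op} | " + " | ".join(cells) + " |")
    return "\n".join([header, divider, *rows])


print(render_matrix(MANUAL_STREAMING_OPS + ["over aggregation"], COLUMNS, SUPPORT))
```

A flat (op, backend, mode) key keeps the sketch small. Spark's dependence on output mode (point 2) would need an extra dimension, for example a fourth key element, which is left out here.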

Metadata

    Labels

    docs (Documentation related issues or PRs), streaming (Issue related to streaming APIs or backends)

    Projects

    Status

    backlog
