Add a hint to tell that the row iterator state is the row number in tables with row access #342

Open
@ronisbr

Description

Hi!

We have a problem in PrettyTables.jl with huge tables with row access. AFAIK, the current way to access an element in those tables is to iterate over the rows until we reach the desired one and then retrieve the column:

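    # Get the i-th row by iterating the row table.
    next = iterate(rtable.table)

    for _ in 2:i
        # Stop early if the table ends before row `i`.
        next === nothing && break
        next = iterate(rtable.table, next[2])
    end

    # Check before destructuring: `iterate` returns `nothing` when exhausted.
    next === nothing && error("The row $i does not exist.")
    it, state = next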

This method is OK for any usage that requires reading the entire data. However, when you are printing huge tables, you usually want only a part of the table. PrettyTables.jl has a mode in which we print only the beginning and the end of the table. Hence, for each printed column, we need to iterate through all the rows down to the very bottom, even though we only need a few of them, which makes printing very slow.
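To make the cost concrete, here is a minimal sketch that counts how many `iterate` calls sequential access needs to fetch the last few rows. The `CountingRows` type and the `row_by_iteration` helper are hypothetical and exist only for this illustration; they are not part of Tables.jl or PrettyTables.jl.

```julia
# Toy row iterator (hypothetical, for illustration only).
# The iteration state is the number of the row that was returned last.
struct CountingRows
    data::Vector{Int}
    calls::Base.RefValue{Int}   # counts how many times `iterate` is called
end

CountingRows(data) = CountingRows(data, Ref(0))

function Base.iterate(t::CountingRows, state::Int = 0)
    t.calls[] += 1
    state >= length(t.data) && return nothing
    return t.data[state + 1], state + 1
end

# Sequential access: walk the iterator from the first row up to row `i`.
function row_by_iteration(t, i)
    next = iterate(t)
    for _ in 2:i
        next === nothing && break
        next = iterate(t, next[2])
    end
    next === nothing && error("The row $i does not exist.")
    return next[1]
end

rows = CountingRows(collect(1:1_000_000))

# Fetch the last five rows, as a printout of the end of the table would.
tail = [row_by_iteration(rows, i) for i in 999_996:1_000_000]

println(tail)
println(rows.calls[])
```

Each lookup restarts from the first row, so fetching row `i` costs `i` calls to `iterate`; the five lookups above need about five million calls in total to return five values.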

The problem appeared here: ronisbr/PrettyTables.jl#217 (comment)

If we add to the Tables.jl API some hint that the iteration state is the row ID, I can access, let's say, the element in row 1,000,000 with just:

    it, _ = iterate(rtable.table, 1_000_000)

instead of looping through the entire iterator.
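One way such a hint could look is an opt-in trait function, sketched below. Everything here is an assumption made for illustration: `state_is_row_number`, `IndexedRows`, and `get_row` are hypothetical names, not part of the Tables.jl API, and the sketch assumes the convention that passing state `i - 1` yields row `i` (a real table might use a different offset).

```julia
# Hypothetical trait: NOT part of the Tables.jl API. Row tables would opt in
# by declaring that their iteration state is the row number.
state_is_row_number(::Any) = false

# Toy row table whose state is the number of the last row returned.
struct IndexedRows
    data::Vector{String}
end

function Base.iterate(t::IndexedRows, state::Int = 0)
    state >= length(t.data) && return nothing
    return t.data[state + 1], state + 1
end

state_is_row_number(::IndexedRows) = true

# Return row `i`, jumping directly when the table has opted in and walking
# the iterator otherwise.
function get_row(t, i::Int)
    if state_is_row_number(t)
        # Assumed convention: passing state `i - 1` yields row `i`.
        next = i == 1 ? iterate(t) : iterate(t, i - 1)
    else
        next = iterate(t)
        for _ in 2:i
            next === nothing && break
            next = iterate(t, next[2])
        end
    end
    next === nothing && error("The row $i does not exist.")
    return next[1]
end

indexed = IndexedRows(["row $k" for k in 1:1_000])
generic = ("row $k" for k in 1:1_000)   # plain generator, no opt-in

fast = get_row(indexed, 750)   # one `iterate` call
slow = get_row(generic, 750)   # walks 750 rows

println(fast)
println(slow)
```

Defaulting the trait to `false` keeps existing row tables on the sequential path, so only sources that explicitly opt in would be accessed by state.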

In the case of GeoTables.jl, which uses the row ID as the state, the printing process was reduced from 8s to 0.001s when we only show 10 rows and 10 columns with "middle" cropping mode.
