Currently, there is no way to determine whether a row group can be pruned when trying to limit the work needed to find a particular record or set of records. Coarse-grained column chunk statistics can indicate whether a row group could contain a particular value in a sorted column. This is useful for avoiding a full scan of the whole file, and with small enough row groups it works well on its own. The statistics just aren't visible on the JavaScript API of ColumnChunkMetaData.
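To illustrate the kind of pruning this would enable, here is a minimal sketch. It assumes per-row-group min/max values for one numeric column have already been obtained somehow. The `ChunkStats` interface and `candidateRowGroups` function are hypothetical names for illustration and are not part of any existing API.

```typescript
// Hypothetical shape: min/max statistics for one column chunk in one row group.
// Nothing like this is exposed to JavaScript today; the names are illustrative only.
interface ChunkStats {
  min: number;
  max: number;
}

// Return the indices of row groups whose [min, max] range could contain `target`.
// Row groups without statistics are kept, because they cannot be ruled out.
function candidateRowGroups(
  stats: Array<ChunkStats | undefined>,
  target: number,
): number[] {
  const keep: number[] = [];
  stats.forEach((s, i) => {
    if (s === undefined || (s.min <= target && target <= s.max)) {
      keep.push(i);
    }
  });
  return keep;
}
```

Only the row groups returned here would need to be read. On a sorted column the ranges do not overlap much, so most row groups are skipped.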
The page index is more granular, and lets you use the limit and offset options on ReaderOptions without doing more explicit scanning of a file from JavaScript. Reading it makes initially opening the file more expensive, so as currently implemented it is not read automatically, and it is not visible in the JavaScript API either.
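As a rough sketch of why page-level information helps with limit and offset: the Parquet offset index records a first row index for each page, so the pages that overlap a requested row range can be found without decoding anything. The function below assumes those first row indices are already available as a plain array; `pagesForRowRange` and its parameters are hypothetical and only illustrate the idea.

```typescript
// Given the first row index of every page in a column chunk (ascending order),
// return the indices of the pages that overlap rows [offset, offset + limit).
// `rowGroupRows` is the total number of rows in the row group.
// All names here are hypothetical and for illustration only.
function pagesForRowRange(
  firstRowIndex: number[],
  rowGroupRows: number,
  offset: number,
  limit: number,
): number[] {
  if (limit <= 0) {
    return [];
  }
  // Exclusive end of the requested range, clamped to the row group.
  const end = Math.min(offset + limit, rowGroupRows);
  const selected: number[] = [];
  for (let i = 0; i < firstRowIndex.length; i++) {
    const pageStart = firstRowIndex[i];
    // A page ends where the next one starts; the last page ends with the row group.
    const pageEnd =
      i + 1 < firstRowIndex.length ? firstRowIndex[i + 1] : rowGroupRows;
    if (pageStart < end && pageEnd > offset) {
      selected.push(i);
    }
  }
  return selected;
}
```

With four pages of 100 rows each, a request for 100 rows starting at row 150 touches only the second and third pages.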
Supporting ColumnChunkMetaData.statistics is fairly simple: it means adding a new method to the existing ColumnChunkMetaData WASM type that mirrors the native one. Supporting the page index involves adding a new method that loads the indices for one or more columns and packs them up so they can be used from JavaScript, which could be much more work.