We've been enjoying using this crate and have built our own Rust-Python Parquet tooling with PyO3 to circumvent the numerous and massive memory leaks we've encountered with `pyarrow`. So far it's been a blast!
One thing that struck us, though, is that most of `pyarrow`'s methods seem to be heavily multithreaded in the underlying C++ code. At least, they appear to use more than one core and are faster than our own Rust tooling.
Is there an easy way to enable multithreading with this crate, or does one have to implement everything oneself?
How are even trivial methods multithreaded in `pyarrow`? For example, just reading a RowGroup as a `RecordBatch` seems to use many cores. Is there one thread per RowGroup column, with the data "merged" somehow afterwards, or how exactly is this achieved?
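To make the question concrete, here is a rough, std-only sketch of the "one thread per column, then merge" pattern we have in mind. It is only an illustration of our guess: `decode_column` and `read_row_group_parallel` are hypothetical names, the "decoding" is a placeholder, and none of this is this crate's API or a claim about how `pyarrow` actually does it.

```rust
use std::thread;

/// Hypothetical stand-in for decoding one column chunk of a row group.
/// A real implementation would call into a Parquet reader here.
fn decode_column(encoded: &[u8]) -> Vec<i64> {
    encoded.iter().map(|&b| i64::from(b) * 2).collect()
}

/// Decode every column on its own scoped thread, then collect the results
/// in the original column order (the "merge" step).
fn read_row_group_parallel(columns: &[Vec<u8>]) -> Vec<Vec<i64>> {
    thread::scope(|s| {
        // Spawn all threads first so the columns are decoded concurrently.
        let handles: Vec<_> = columns
            .iter()
            .map(|col| s.spawn(move || decode_column(col)))
            .collect();
        // Joining in spawn order keeps the columns in their original order.
        handles
            .into_iter()
            .map(|h| h.join().expect("column decode thread panicked"))
            .collect()
    })
}

fn main() {
    let row_group = vec![vec![1u8, 2, 3], vec![10, 20, 30]];
    let batch = read_row_group_parallel(&row_group);
    assert_eq!(batch.len(), 2);
    println!("{:?}", batch);
}
```

Scoped threads (`std::thread::scope`, Rust 1.63+) let each worker borrow its column without cloning; a thread pool would presumably be the better choice for many columns, but the shape of the split-and-merge is the same.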