burn-dataset: Catch import.py unsuccessful exits #3236
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #3236 +/- ##
==========================================
- Coverage 82.12% 82.12% -0.01%
==========================================
Files 966 966
Lines 123257 123261 +4
==========================================
+ Hits 101229 101232 +3
- Misses 22028 22029 +1
LGTM!
Did you manage to find what the root segfault issue was?
Yes and no. The general cause seems to be a disagreement between NixOS and the pyarrow native lib pulled in by Burn's venv. The details remain unclear from looking at the core file. My Nix shell provides a couple of things relevant to this crash: mostly libc stuff, libstdc++ and Python (3.12).
Full ldd output:
$ ldd ~/.cache/burn-dataset/venv/lib/python3.12/site-packages/pyarrow/libarrow_python.so.2000
linux-vdso.so.1 (0x00007ffd867f9000)
libarrow_substrait.so.2000 => /home/drozdziak1/.cache/burn-dataset/venv/lib/python3.12/site-packages/pyarrow/libarrow_substrait.so.2000 (0x00007f6e00a00000)
libarrow_dataset.so.2000 => /home/drozdziak1/.cache/burn-dataset/venv/lib/python3.12/site-packages/pyarrow/libarrow_dataset.so.2000 (0x00007f6e00833000)
libarrow_acero.so.2000 => /home/drozdziak1/.cache/burn-dataset/venv/lib/python3.12/site-packages/pyarrow/libarrow_acero.so.2000 (0x00007f6e006b4000)
libparquet.so.2000 => /home/drozdziak1/.cache/burn-dataset/venv/lib/python3.12/site-packages/pyarrow/libparquet.so.2000 (0x00007f6dffe00000)
libarrow.so.2000 => /home/drozdziak1/.cache/burn-dataset/venv/lib/python3.12/site-packages/pyarrow/libarrow.so.2000 (0x00007f6dfc800000)
libdl.so.2 => /nix/store/p9kdj55g5l39nbrxpjyz5wc1m0s7rzsx-glibc-2.40-66/lib/libdl.so.2 (0x00007f6e00def000)
librt.so.1 => /nix/store/p9kdj55g5l39nbrxpjyz5wc1m0s7rzsx-glibc-2.40-66/lib/librt.so.1 (0x00007f6e00dea000)
libstdc++.so.6 => /nix/store/w47x7y9xjgq7n3171ckg722ddwf2pi3v-gcc-13.3.0-lib/lib/libstdc++.so.6 (0x00007f6dfc400000)
libm.so.6 => /nix/store/p9kdj55g5l39nbrxpjyz5wc1m0s7rzsx-glibc-2.40-66/lib/libm.so.6 (0x00007f6dffd19000)
libgcc_s.so.1 => /nix/store/w47x7y9xjgq7n3171ckg722ddwf2pi3v-gcc-13.3.0-lib/lib/libgcc_s.so.1 (0x00007f6e00dc3000)
libc.so.6 => /nix/store/p9kdj55g5l39nbrxpjyz5wc1m0s7rzsx-glibc-2.40-66/lib/libc.so.6 (0x00007f6dfc208000)
libpthread.so.0 => /nix/store/p9kdj55g5l39nbrxpjyz5wc1m0s7rzsx-glibc-2.40-66/lib/libpthread.so.0 (0x00007f6e00dbe000)
/nix/store/p9kdj55g5l39nbrxpjyz5wc1m0s7rzsx-glibc-2.40-66/lib64/ld-linux-x86-64.so.2 (0x00007f6e00ff0000)
coredump excerpt
I stared at the dumped core with
and the thread's backtrace:
Pull Request Template
Checklist
cargo run-checks command has been executed.
Related Issues/PRs
n/a (searched for "exit", "status", "sqlite")
Changes
Problem
I was chasing an unrelated issue that manifested as SqliteDataset being unable to open the database file created by importer.py. After running the script manually with the args that burn used, it turned out to be a segfault. After looking at downloader.rs I noticed that only the wait() result is validated, but not the ExitStatus. This resulted in Ok being returned despite the import.py segfault, which later caused issues for SqliteDataset.
Solution
Check for !ExitStatus::success() and return ImporterError::Unknown, like the existing code does.
Note: A cleaner solution might be something like handle.wait().map(|status| status.exit_ok()).map_err(...)? but exit_ok() is not stable yet.
Testing
Before
After
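For reference, the check described under Solution can be sketched in isolation as follows. This is a minimal illustration, not the code in downloader.rs: ImportError and run_checked are hypothetical names, and sh -c "exit 3" stands in for the failing Python script (assuming a Unix-like sh is on the PATH).

```rust
use std::process::{Command, ExitStatus};

/// Hypothetical error type standing in for the importer's own error enum.
#[derive(Debug)]
enum ImportError {
    /// The child could not be spawned or waited on.
    Io(std::io::Error),
    /// The child ran but did not exit successfully.
    Failed(ExitStatus),
}

/// Runs `program` with `args` and treats any unsuccessful exit as an error.
fn run_checked(program: &str, args: &[&str]) -> Result<(), ImportError> {
    let mut handle = Command::new(program)
        .args(args)
        .spawn()
        .map_err(ImportError::Io)?;

    // `wait()` returns `Err` only if waiting itself fails.
    // A child that crashed or exited non-zero still yields `Ok(status)`.
    let status = handle.wait().map_err(ImportError::Io)?;

    // The extra check: inspect the status instead of trusting `Ok`.
    if !status.success() {
        return Err(ImportError::Failed(status));
    }
    Ok(())
}

fn main() {
    // Stand-in for a failing import script (assumes a Unix-like `sh`).
    match run_checked("sh", &["-c", "exit 3"]) {
        Ok(()) => println!("importer succeeded"),
        Err(err) => println!("importer failed: {err:?}"),
    }
}
```

On Unix, a child killed by a signal such as SIGSEGV has no exit code, so status.code() returns None there, while status.success() is still false. That is why checking success() also covers the segfault case.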