I was able to reproduce it once with this script, but it doesn't reproduce consistently:
using StableRNGs, MLJ, MLJXGBoostInterface, DataFrames
X = float.(rand(StableRNG(1), [0,0,0,0,0,1], 10000, 4));
y = rand(StableRNG(2), 0:1, 10000);
classifier = Pipeline(; standardizer=Standardizer(), classifier=XGBoostClassifier());
mach = machine(classifier, DataFrame(X, :auto), coerce(y, OrderedFactor))
fit!(mach)
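Since the crash is intermittent, one way to chase it might be to rerun the script in fresh Julia processes and count how many exit abnormally. This is only a minimal sketch: it assumes the script above is saved as `repro.jl` (a hypothetical filename), and `count_crashes` is an illustrative helper, not something provided by XGBoost.jl or MLJ.

```julia
# Hypothetical helper: rerun a script in fresh Julia processes and count
# how many of them exit abnormally (non-zero exit code or killed by a signal).
function count_crashes(script::AbstractString; attempts::Integer=20)
    crashes = 0
    for i in 1:attempts
        # A segfault takes down the whole process, so every attempt
        # needs its own process; `--project` matches how I launched Julia.
        cmd = `$(Base.julia_cmd()) --project $script`
        # `ignorestatus` stops `run` from throwing when the child fails.
        proc = run(ignorestatus(cmd))
        if !success(proc)
            crashes += 1
            @info "Attempt $i exited abnormally" exitcode=proc.exitcode termsignal=proc.termsignal
        end
    end
    return crashes
end
```

Calling `count_crashes("repro.jl"; attempts=50)` would then report how many of the 50 runs died. A run killed by a segfault should show a non-zero `termsignal` in the log line.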
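Here's the REPL session from the one time it happened (note that it samples from `[0,0,0,0,1]` and builds the pipeline positionally, unlike the script above):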
julia> using StableRNGs, MLJ, MLJXGBoostInterface, DataFrames
julia> X = float.(rand(StableRNG(1), [0,0,0,0,1], 10000, 4))
10000×4 Matrix{Float64}:
0.0 1.0 0.0 1.0
0.0 1.0 0.0 0.0
1.0 1.0 0.0 1.0
0.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0
0.0 0.0 0.0 0.0
⋮
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0
0.0 1.0 0.0 0.0
0.0 0.0 0.0 0.0
julia> y = rand(StableRNG(2), 0:1, 10000);
julia> classifier = Pipeline(Standardizer(), XGBoostClassifier());
julia> mach = machine(classifier, DataFrame(X, :auto), coerce(y, OrderedFactor))
untrained Machine; does not cache data
model: ProbabilisticPipeline(standardizer = Standardizer(features = Symbol[], …), …)
args:
1: Source @134 ⏎ Table{AbstractVector{Continuous}}
2: Source @800 ⏎ AbstractVector{OrderedFactor{2}}
julia> fit!(mach)
[ Info: Training machine(ProbabilisticPipeline(standardizer = Standardizer(features = Symbol[], …), …), …).
[ Info: Training machine(:standardizer, …).
[ Info: Training machine(:xg_boost_classifier, …).
[ Info: XGBoost: starting training.
[19107] signal (11.2): Segmentation fault: 11
in expression starting at REPL[53]:1
_ZN7xgboost4tree20CommonRowPartitioner14UpdatePositionIhLb0ELb0ENS0_14CPUExpandEntryEEEvPKNS_7ContextERKNS_16GHistIndexMatrixERKNS_6common12ColumnMatrixERKNSt3__16vectorIT2_NSE_9allocatorISG_EEEEPKNS_7RegTreeE at /Users/eph/.julia/artifacts/0079d93a46694d4e5e45f4e0b6bd6a35e24f4346/lib/libxgboost.dylib (unknown line)
_ZN7xgboost4tree20CommonRowPartitioner14UpdatePositionINS0_14CPUExpandEntryEEEvPKNS_7ContextERKNS_16GHistIndexMatrixERKNSt3__16vectorIT_NSA_9allocatorISC_EEEEPKNS_7RegTreeE at /Users/eph/.julia/artifacts/0079d93a46694d4e5e45f4e0b6bd6a35e24f4346/lib/libxgboost.dylib (unknown line)
_ZN7xgboost4tree11HistUpdater14UpdatePositionEPNS_7DMatrixEPKNS_7RegTreeERKNSt3__16vectorINS0_14CPUExpandEntryENS7_9allocatorIS9_EEEE at /Users/eph/.julia/artifacts/0079d93a46694d4e5e45f4e0b6bd6a35e24f4346/lib/libxgboost.dylib (unknown line)
_ZN7xgboost4tree10UpdateTreeINS0_14CPUExpandEntryENS0_11HistUpdaterEEEvPNS_6common7MonitorENS_6linalg10TensorViewIKNS_6detail20GradientPairInternalIfEELi2EEEPT0_PNS_7DMatrixEPKNS0_10TrainParamEPNS_16HostDeviceVectorIiEEPNS_7RegTreeE at /Users/eph/.julia/artifacts/0079d93a46694d4e5e45f4e0b6bd6a35e24f4346/lib/libxgboost.dylib (unknown line)
_ZN7xgboost4tree17QuantileHistMaker6UpdateEPKNS0_10TrainParamEPNS_16HostDeviceVectorINS_6detail20GradientPairInternalIfEEEEPNS_7DMatrixENS_6common4SpanINS5_IiEELm18446744073709551615EEERKNSt3__16vectorIPNS_7RegTreeENSH_9allocatorISK_EEEE at /Users/eph/.julia/artifacts/0079d93a46694d4e5e45f4e0b6bd6a35e24f4346/lib/libxgboost.dylib (unknown line)
_ZN7xgboost3gbm6GBTree13BoostNewTreesEPNS_16HostDeviceVectorINS_6detail20GradientPairInternalIfEEEEPNS_7DMatrixEiPNSt3__16vectorINS2_IiEENSA_9allocatorISC_EEEEPNSB_INSA_10unique_ptrINS_7RegTreeENSA_14default_deleteISI_EEEENSD_ISL_EEEE at /Users/eph/.julia/artifacts/0079d93a46694d4e5e45f4e0b6bd6a35e24f4346/lib/libxgboost.dylib (unknown line)
_ZN7xgboost3gbm6GBTree7DoBoostEPNS_7DMatrixEPNS_16HostDeviceVectorINS_6detail20GradientPairInternalIfEEEEPNS_20PredictionCacheEntryEPKNS_11ObjFunctionE at /Users/eph/.julia/artifacts/0079d93a46694d4e5e45f4e0b6bd6a35e24f4346/lib/libxgboost.dylib (unknown line)
_ZN7xgboost11LearnerImpl13UpdateOneIterEiNSt3__110shared_ptrINS_7DMatrixEEE at /Users/eph/.julia/artifacts/0079d93a46694d4e5e45f4e0b6bd6a35e24f4346/lib/libxgboost.dylib (unknown line)
XGBoosterUpdateOneIter at /Users/eph/.julia/artifacts/0079d93a46694d4e5e45f4e0b6bd6a35e24f4346/lib/libxgboost.dylib (unknown line)
XGBoosterUpdateOneIter at /Users/eph/.julia/packages/XGBoost/nqMqQ/src/Lib.jl:282 [inlined]
xgbcall at /Users/eph/.julia/packages/XGBoost/nqMqQ/src/Lib.jl:25 [inlined]
#updateone!#72 at /Users/eph/.julia/packages/XGBoost/nqMqQ/src/booster.jl:374
updateone! at /Users/eph/.julia/packages/XGBoost/nqMqQ/src/booster.jl:369 [inlined]
#update!#77 at /Users/eph/.julia/packages/XGBoost/nqMqQ/src/booster.jl:446
update! at /Users/eph/.julia/packages/XGBoost/nqMqQ/src/booster.jl:429 [inlined]
#xgboost#82 at /Users/eph/.julia/packages/XGBoost/nqMqQ/src/booster.jl:602
xgboost at /Users/eph/.julia/packages/XGBoost/nqMqQ/src/booster.jl:579
unknown function (ip: 0x2a35bc457)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:3076
fit at /Users/eph/.julia/packages/MLJXGBoostInterface/uFARS/src/MLJXGBoostInterface.jl:168
unknown function (ip: 0x2a34bc71f)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/./julia.h:1982 [inlined]
do_apply at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/builtins.c:768
#fit_only!#57 at /Users/eph/.julia/packages/MLJBase/iIhiI/src/machines.jl:681
fit_only! at /Users/eph/.julia/packages/MLJBase/iIhiI/src/machines.jl:607 [inlined]
#fit_only!#62 at /Users/eph/.julia/packages/MLJBase/iIhiI/src/machines.jl:752
fit_only! at /Users/eph/.julia/packages/MLJBase/iIhiI/src/machines.jl:735 [inlined]
#80 at /Users/eph/.julia/packages/MLJBase/iIhiI/src/composition/learning_networks/nodes.jl:235
unknown function (ip: 0x2a2f4c0bb)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/./julia.h:1982 [inlined]
start_task at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/task.c:1238
Allocations: 41540566 (Pool: 41501312; Big: 39254); GC: 43
zsh: segmentation fault julia --project
From that stacktrace, the last calls before we get into the C code are:
XGBoosterUpdateOneIter at /Users/eph/.julia/packages/XGBoost/nqMqQ/src/Lib.jl:282 [inlined]
xgbcall at /Users/eph/.julia/packages/XGBoost/nqMqQ/src/Lib.jl:25 [inlined]
#updateone!#72 at /Users/eph/.julia/packages/XGBoost/nqMqQ/src/booster.jl:374
updateone! at /Users/eph/.julia/packages/XGBoost/nqMqQ/src/booster.jl:369 [inlined]
#update!#77 at /Users/eph/.julia/packages/XGBoost/nqMqQ/src/booster.jl:446
update! at /Users/eph/.julia/packages/XGBoost/nqMqQ/src/booster.jl:429 [inlined]
#xgboost#82 at /Users/eph/.julia/packages/XGBoost/nqMqQ/src/booster.jl:602
xgboost at /Users/eph/.julia/packages/XGBoost/nqMqQ/src/booster.jl:579
which is why I filed it here rather than on MLJXGBoostInterface.jl.
I'm on XGBoost.jl v2.5.1 and Julia v1.10.2:
julia> versioninfo()
Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: macOS (arm64-apple-darwin22.4.0)
CPU: 8 × Apple M1
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, apple-m1)
Threads: 4 default, 0 interactive, 2 GC (on 4 virtual cores)
Environment:
JULIA_NUM_THREADS = 4
JULIA_PKG_SERVER_REGISTRY_PREFERENCE = eager