Cheers. When training a model with FluxTraining.fit!(learner, epochs), meeting an early-stopping condition raises a hard error that terminates the Julia script, so any code placed after the fit! call never runs. I believe this is unintended behavior; could you please verify? Thanks in advance.
The code is as follows (early-stopping parameters purposely set to small values):
ms = [accuracy,
      t.Metric(LibML.IoU, device=gpu, name="IoU"),
     ]
cbs = [ToGPU(),
       StopOnNaNLoss(),
       Checkpointer(modelsfolder),
       EarlyStopping(1),
       EarlyStopping(NumberSinceBest(1)),
       EarlyStopping(Threshold(0.5)),
       Metrics(ms...),
       LogMetrics(TensorBoardBackend(tbfolder)),
      ]
learner = t.Learner(model, lossFunction;
                    data=(trainset, validset),
                    optimizer=modelOptimizer,
                    callbacks=cbs,
                    )
epochs = 100
FluxTraining.fit!(learner, epochs)
@info "project finished"
The error message is as follows:
ERROR: CancelFittingException("Stop triggered by EarlyStopping.Patience(1) stopping criterion. ")
Stacktrace:
[1] on(::FluxTraining.Events.EpochEnd, phase::ValidationPhase, cb::EarlyStopping, learner::FluxTraining.Protected{Learner})
@ FluxTraining ~/.julia/packages/FluxTraining/xCOPx/src/callbacks/earlystopping.jl:72
[2] _on(e::FluxTraining.Events.EpochEnd, p::ValidationPhase, cb::EarlyStopping, learner::Learner)
@ FluxTraining ~/.julia/packages/FluxTraining/xCOPx/src/callbacks/callback.jl:254
[3] handle(runner::FluxTraining.LinearRunner, event::FluxTraining.Events.EpochEnd, phase::ValidationPhase, learner::Learner)
@ FluxTraining ~/.julia/packages/FluxTraining/xCOPx/src/callbacks/execution.jl:12
[4] (::FluxTraining.var"#handlefn#81"{Learner, ValidationPhase})(e::FluxTraining.Events.EpochEnd)
@ FluxTraining ~/.julia/packages/FluxTraining/xCOPx/src/training.jl:102
[5] runepoch(epochfn::FluxTraining.var"#71#72"{…}, learner::Learner, phase::ValidationPhase)
@ FluxTraining ~/.julia/packages/FluxTraining/xCOPx/src/training.jl:106
[6] epoch!
@ FluxTraining ~/.julia/packages/FluxTraining/xCOPx/src/training.jl:22 [inlined]
[7] fit!(learner::Learner, nepochs::Int64, ::Tuple{MLUtils.DataLoader{…}, MLUtils.DataLoader{…}})
@ FluxTraining ~/.julia/packages/FluxTraining/xCOPx/src/training.jl:169
[8] fit!(learner::Learner, nepochs::Int64)
@ FluxTraining ~/.julia/packages/FluxTraining/xCOPx/src/training.jl:174
[9] top-level scope
@ ~/projects/pascalvoc-segmentation/8-training.jl:123
Some type information was truncated. Use `show(err)` to see complete types.
julia>
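One possible workaround, shown only as a sketch, is to wrap the fit! call in try/catch so that the script continues after an early stop. The exception name CancelFittingException is taken from the error message above. Everything else here is a hypothetical stand-in so the snippet runs without FluxTraining installed: the local struct definition, fake_fit!, and fit_allowing_early_stop are not part of the library. With the real package, the catch branch would presumably test for FluxTraining.CancelFittingException instead; that qualified name is an assumption.

```julia
# Hypothetical stand-in for the exception type named in the error message.
struct CancelFittingException <: Exception
    msg::String
end

# Hypothetical stand-in for FluxTraining.fit!: throws when a stop criterion fires.
function fake_fit!(nepochs::Int; stop_at::Int = 2)
    for epoch in 1:nepochs
        epoch == stop_at && throw(CancelFittingException("Stop triggered at epoch $epoch"))
    end
    return nepochs
end

# Run `fitfn`, treating an early-stop exception as a normal way to finish.
# Any other exception is rethrown unchanged.
function fit_allowing_early_stop(fitfn)
    try
        fitfn()
        return :completed
    catch err
        err isa CancelFittingException || rethrow()
        @info "Training stopped early" reason = err.msg
        return :stopped_early
    end
end

status = fit_allowing_early_stop(() -> fake_fit!(100))
@info "project finished" status
```

In the script above, the equivalent change would be to pass `() -> FluxTraining.fit!(learner, epochs)` as the function argument, so that the final @info line is reached whether or not early stopping fires.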