[early WIP] Fix/rationalize loss-tallying #2922
gojomo wants to merge 3 commits into piskvorky:develop from
Conversation
gojomo force-pushed from 8c61787 to 33ef202
Changes so far are in Word2Vec only. Though the real goal is sensible loss-tallying across all classes, I think these small changes already remedy #2735 (float32 swallows large loss-values) & #2743 (worker losses clobber each other). An oddity from looking at per-epoch loss across a full run: all my per-epoch loss totals kept increasing, epoch over epoch, rather than decreasing.
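To make the two referenced failure modes concrete, here's a small standalone sketch (toy code, not gensim's internals; the "worker" part shows only one simplified way such clobbering can arise):

```python
import numpy as np

# Part 1: a float32 running tally "swallows" further small losses (#2735).
# At ~16.8M the gap between adjacent float32 values is already 2.0, so adding
# a per-example loss of 0.5 rounds away to nothing, while float64 keeps it.
tally32 = np.float32(2 ** 24)
tally64 = np.float64(2 ** 24)
for _ in range(1000):
    per_example_loss = 0.5          # stand-in for one small loss contribution
    tally32 += np.float32(per_example_loss)
    tally64 += per_example_loss
print("float32 tally:", tally32)    # still 16777216.0 -- increments lost
print("float64 tally:", tally64)    # 16777716.0

# Part 2: a simplified shape of "worker losses clobber each other" (#2743).
# If each worker snapshots the shared tally, accumulates privately, and then
# writes back (snapshot + own loss), only the last write survives.
worker_losses = [10.0, 20.0, 30.0]
shared_tally = 0.0
snapshots = [shared_tally] * len(worker_losses)   # every worker reads 0.0
for snapshot, own_loss in zip(snapshots, worker_losses):
    shared_tally = snapshot + own_loss            # each write overwrites prior ones
print("clobbered total:", shared_tally)           # 30.0, not the expected 60.0

# Keeping independent per-worker tallies and summing them avoids the overwrite.
print("per-worker sum:", sum(worker_losses))      # 60.0
```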
Training FB FastText displays a running avg.loss, which looks like a cumulative loss divided by some trial-count. As a point of comparison with Facebook's reporting, Gensim should probably collect & report 2Vec-class training loss in a comparable way, so that numbers on algorithmically-analogous runs are broadly similar, for familiarity to users & as a cross-check of whatever it is we're doing.
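As a rough sketch of surfacing a per-epoch number from today's API (not part of this PR), a user can difference the running tally inside a callback; the existing `CallbackAny2Vec` and `get_latest_training_loss()` are real gensim API, but the choice of divisor below is a placeholder, since exactly what FB's "trial-count" counts is the open question here:

```python
from gensim.models import Word2Vec
from gensim.models.callbacks import CallbackAny2Vec

class EpochLossReporter(CallbackAny2Vec):
    """Report per-epoch loss by differencing the cumulative running tally."""

    def __init__(self):
        self.epoch = 0
        self.prev_cumulative = 0.0

    def on_epoch_end(self, model):
        # get_latest_training_loss() is a running total for the whole train()
        # call, so subtract the previous value to get this epoch's share.
        cumulative = model.get_latest_training_loss()
        epoch_loss = cumulative - self.prev_cumulative
        self.prev_cumulative = cumulative
        # corpus_count (number of sentences) is only a rough stand-in divisor,
        # not necessarily whatever FB's avg.loss divides by.
        print(f"epoch {self.epoch}: total={epoch_loss:.1f} "
              f"avg-per-sentence={epoch_loss / model.corpus_count:.4f}")
        self.epoch += 1

sentences = [["first", "sentence"], ["second", "sentence"]]
model = Word2Vec(sentences, min_count=1, compute_loss=True,
                 callbacks=[EpochLossReporter()])
```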
+1 on matching FB's logic. What is "trial-count"? Is the average taken over words or something else?
Unsure; their C++ (with a separate class for 'loss') is different enough from our code that I couldn't tell at a glance & will need to study it a bit more.
@gojomo cleaning up the loss-tallying logic is still very much welcome. Did you figure out the "increasing loss" mystery? We're planning to make a Gensim release soon – whether this PR gets in now or later, it will be a great addition.
These changes would likely apply, & help a bit, in the other *2Vec classes. But getting consistent loss-tallying working across all of them will take more rework. Never figured out why our per-epoch loss totals kept increasing.
PR to eventually address loss-tallying issues: #2617, #2735, #2743. Early tinkering stage.