Commit 1e4d227

alexanderguzhva authored and meta-codesync[bot] committed
Introduce an early stop threshold for Kmeans (#4894)
Summary: Allows k-means iterations to stop once the relative change in the objective (error) between two consecutive iterations becomes marginal, that is, no larger than a configurable threshold.

Pull Request resolved: #4894
Reviewed By: limqiying
Differential Revision: D95987848
Pulled By: alibeklfc
fbshipit-source-id: 4422d08c9e04880cbef84093f0c4ae714dff254c
1 parent c24598a commit 1e4d227

2 files changed

Lines changed: 12 additions & 1 deletion

faiss/Clustering.cpp

Lines changed: 6 additions & 1 deletion
@@ -569,7 +569,12 @@ void Clustering::train_encoded(
         if (i > 0) {
             float prev_obj =
                     iteration_stats[iteration_stats.size() - 2].obj;
-            if (obj == prev_obj) {
+
+            double change = (prev_obj == 0)
+                    ? std::numeric_limits<double>::max()
+                    : std::abs(prev_obj - obj) / std::abs(prev_obj);
+
+            if (change >= 0 && change <= early_stop_threshold) {
                 if (verbose) {
                     printf("\n Converged at iteration %d: "
                            "objective did not change\n",
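
As a reading aid, the new stopping test in this hunk can be isolated into a standalone sketch. The names `relative_change` and `should_stop` are hypothetical helpers introduced here for illustration and are not Faiss functions; only the expression itself is taken from the hunk, and `obj` is assumed to be a `float` like `prev_obj`.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: relative change of the objective between two
// consecutive iterations, using the same expression as the hunk above.
// A previous objective of exactly 0 maps to the largest double, which
// can never pass a threshold in [0, 1].
inline double relative_change(float prev_obj, float obj) {
    return (prev_obj == 0)
            ? std::numeric_limits<double>::max()
            : std::abs(prev_obj - obj) / std::abs(prev_obj);
}

// Hypothetical helper: true when the iteration loop would stop.
// `change >= 0` is false for NaN, so a NaN objective never triggers a stop.
inline bool should_stop(
        float prev_obj,
        float obj,
        double early_stop_threshold) {
    double change = relative_change(prev_obj, obj);
    return change >= 0 && change <= early_stop_threshold;
}
```

Under these assumptions, `should_stop(100.0f, 99.0f, 0.05)` is true (a change of about 1%), while `should_stop(100.0f, 90.0f, 0.05)` is false (about 10%). One detail that follows from the expression: when both objectives are exactly 0, the test does not stop the loop, whereas the removed `obj == prev_obj` comparison would have.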

faiss/Clustering.h

Lines changed: 6 additions & 0 deletions
@@ -69,6 +69,12 @@ struct ClusteringParameters {
     /// Only used when init_method = AFK_MC2.
     /// Longer chains give better approximation but are slower.
     uint16_t afkmc2_chain_length = 50;
+
+    /// Early stop threshold, the range is [0, 1].
+    /// The value of 0 implies a default Faiss behavior,
+    /// so the training process stops only if an error
+    /// is unchanged from the previous iteration.
+    double early_stop_threshold = 0.0;
 };
 
 struct ClusteringIterationStats {
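
To show how the threshold value affects when training stops, here is a toy sketch that replays a fixed list of objective values through the same test. `ToyParameters` and `first_stop_iteration` are hypothetical stand-ins for illustration only; the real field lives in `faiss::ClusteringParameters`, and the real loop is `Clustering::train_encoded`.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in carrying only the new field and its default.
struct ToyParameters {
    double early_stop_threshold = 0.0;
};

// Hypothetical helper: index of the first iteration whose objective
// passes the stopping test, or -1 if no iteration does.
inline int first_stop_iteration(
        const std::vector<float>& objs,
        const ToyParameters& cp) {
    for (std::size_t i = 1; i < objs.size(); i++) {
        float prev_obj = objs[i - 1];
        float obj = objs[i];
        // Same relative-change expression as in Clustering.cpp.
        double change = (prev_obj == 0)
                ? std::numeric_limits<double>::max()
                : std::abs(prev_obj - obj) / std::abs(prev_obj);
        if (change >= 0 && change <= cp.early_stop_threshold) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

With the made-up objective sequence `{1000, 800, 760, 756, 756}`, the default threshold of 0 stops only at index 4, where the objective repeats exactly. A threshold of 0.01 stops at index 3 (a change of roughly 0.5%), and a threshold of 0.1 stops at index 2 (a change of 5%).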
