-
Notifications
You must be signed in to change notification settings - Fork 471
OPENNLP-124: Maxent/Perceptron training should report progess back via an API #758
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Hello @mawiesne / @kottmann . Good day! Is this item waiting to be picked up? https://issues.apache.org/jira/browse/OPENNLP-124. Attached the output of a couple of existing Tests (Perceptron Trainer) based on the integration with Console based TrainingProgressMonitor. |
FYI: @jzonthemtn + @rzo1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the draft. I added some thoughts / comments.
opennlp-tools/src/main/java/opennlp/tools/monitoring/ConsoleTrainingProgressMonitor.java
Outdated
Show resolved
Hide resolved
opennlp-tools/src/main/java/opennlp/tools/monitoring/ConsoleTrainingProgressMonitor.java
Outdated
Show resolved
Hide resolved
opennlp-tools/src/main/java/opennlp/tools/monitoring/ConsoleTrainingProgressMonitor.java
Outdated
Show resolved
Hide resolved
opennlp-tools/src/main/java/opennlp/tools/monitoring/ConsoleTrainingProgressMonitor.java
Outdated
Show resolved
Hide resolved
...lp-tools/src/main/java/opennlp/tools/monitoring/PrevNIterationAccuracyLessThanTolerance.java
Outdated
Show resolved
Hide resolved
...lp-tools/src/main/java/opennlp/tools/monitoring/PrevNIterationAccuracyLessThanTolerance.java
Outdated
Show resolved
Hide resolved
opennlp-tools/src/main/java/opennlp/tools/monitoring/StopCriteria.java
Outdated
Show resolved
Hide resolved
opennlp-tools/src/main/java/opennlp/tools/monitoring/TrainingProgressMonitor.java
Show resolved
Hide resolved
opennlp-tools/src/main/java/opennlp/tools/ml/perceptron/PerceptronTrainer.java
Outdated
Show resolved
Hide resolved
Hope you are well. I have tried to fix the review comments. Would you pls. be able to review once more and direct towards the intended solution? Many thanks in advance. Some queries and ToDos:
Please can you clarify, what is the use of numberCorrectEvents and totalEvents parameters? In my current implementation, I have not used them, instead I found stopCriteria is sufficient. Pls. take a look.
|
/** | ||
* Stop-criteria for a {@link opennlp.tools.ml.model.AbstractModel} training . | ||
*/ | ||
public interface StopCriteria extends Predicate<Double> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be made as extends Predicate<Object>
;
So that this common API is flexible to receive any object as input to verify testing Criteria? views pls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or even nicer
public interface StopCriteria<T extends Number> extends Predicate<T> {
This way, several numerical stop criteria can be defined, either as integer/long or float/double precision.
Don't think that we need sth different than Number
as base type for T
, do we?
* used to store the training progress events. Callers of this method can invoke it periodically | ||
* during training progress to avoid storing too much progress related data. | ||
*/ | ||
void displayAndClear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be made to accept a boolean input parameter which decides whether the underlying data structure is cleared or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If parameter clear
is introduced, method should be renamed to void display(boolean clear)
opennlp-tools/src/test/java/opennlp/tools/monitoring/DefaultTrainingProgressMonitorTest.java
Show resolved
Hide resolved
Hi Reviewers - All checks are green now. This is available for review. If possible, pls. take a look. |
Thx @NishantShri4 for moving this topic forward! Could you squash those commits and force push the resulting single commit? Once available, we'll have a detailed look and provide feedback. |
Thanks @mawiesne. This is done (rebase, squash and force push). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx @NishantShri4 for providing this substantial contribution. I've left feedback by comments to further improve it. Once addressed, I'll re-check and potentially, @rzo1 can add his final thoughts/checks then.
@@ -72,6 +72,13 @@ | |||
<scope>test</scope> | |||
</dependency> | |||
|
|||
<dependency> | |||
<groupId>org.assertj</groupId> | |||
<artifactId>assertj-core</artifactId> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NishantShri4 Can we avoid AssertJ as extra dependency? I wonder, why assertions of JUnit 5.x are not sufficient?
* | ||
* @param trainParams The {@link TrainingParameters} to use. | ||
* @param reportMap The {@link Map} instance used as report map. | ||
* @param config The {@link TrainingConfiguration} to use. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The training configuration should not be null at runtime, I guess?
Pls therefore add:
"Must not be {@code null}." to the parameter JavaDoc.
Same applies for different scenarios, pls re-iterate JavaDoc with that idea in mind.
* | ||
* @param trainParams The {@link TrainingParameters} to use. | ||
* @param reportMap The {@link Map} instance used as report map. | ||
* @param config The {@link TrainingConfiguration} to use. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment for Trainer
interface.
@@ -108,4 +124,12 @@ protected void addToReport(String key, String value) { | |||
reportMap.put(key, value); | |||
} | |||
|
|||
/** | |||
* Retrieves the {@link TrainingConfiguration} associated with a {@link AbstractTrainer}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should read "with an...", extra "n"
* and a null {@link opennlp.tools.monitoring.StopCriteria}. | ||
* If not provided, the actual {@link opennlp.tools.monitoring.StopCriteria} | ||
* will be decided by the {@link EventTrainer} implementation. | ||
* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls add specific parameter JavaDoc.
Pls add specific return value JavaDoc.
import ch.qos.logback.classic.Logger; | ||
import ch.qos.logback.classic.spi.ILoggingEvent; | ||
import ch.qos.logback.core.read.ListAppender; | ||
import org.assertj.core.api.Assertions; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try to eliminate dependance on org.assertj.core...
here and elsewhere.
} | ||
|
||
@ParameterizedTest() | ||
@CsvSource( {"0.01,false", "-0.01,false", "0.00001,true", "-0.00001,true"}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a nice approach! 👍
.00002))); | ||
} | ||
|
||
@ParameterizedTest() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove extra brackets here, not required when no arguments are specified.
new TrainingParameters(Map.of(LOG_LIKELIHOOD_THRESHOLD_PARAM,5.))); | ||
} | ||
|
||
@ParameterizedTest() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove extra brackets here, not required when no arguments are specified.
@@ -180,6 +180,7 @@ | |||
<!-- Dependency versions --> | |||
<junit.version>5.12.1</junit.version> | |||
<junit5-system-exit.version>2.0.2</junit5-system-exit.version> | |||
<assertj-core.version>3.27.3</assertj-core.version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid it, see comment(s) above.
Thank you for contributing to Apache OpenNLP.
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
For all changes:
Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?
Does your PR title start with OPENNLP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically main)?
Is your initial contribution a single, squashed commit?
For code changes:
For documentation related changes:
Note:
Please ensure that once the PR is submitted, you check GitHub Actions for build issues and submit an update to your PR as soon as possible.