OPENNLP-124: Maxent/Perceptron training should report progress back via an API #758
Hello @mawiesne / @kottmann. Good day! Is this item still waiting to be picked up? https://issues.apache.org/jira/browse/OPENNLP-124 I have attached the output of a couple of existing tests (Perceptron trainer), run with the console-based TrainingProgressMonitor integrated.
FYI: @jzonthemtn + @rzo1
Thanks for the draft. I added some thoughts / comments.
opennlp-tools/src/main/java/opennlp/tools/monitoring/ConsoleTrainingProgressMonitor.java
...lp-tools/src/main/java/opennlp/tools/monitoring/PrevNIterationAccuracyLessThanTolerance.java
opennlp-tools/src/main/java/opennlp/tools/monitoring/StopCriteria.java
opennlp-tools/src/main/java/opennlp/tools/monitoring/TrainingProgressMonitor.java
opennlp-tools/src/main/java/opennlp/tools/ml/perceptron/PerceptronTrainer.java
Hope you are well. I have tried to address the review comments. Could you please review once more and point me towards the intended solution? Many thanks in advance. Some queries and ToDos:
Could you please clarify the purpose of the numberCorrectEvents and totalEvents parameters? My current implementation does not use them; I found stopCriteria to be sufficient. Please take a look.
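To make the question about numberCorrectEvents and totalEvents concrete, here is a minimal, self-contained sketch. It is not the code from this PR and not the OpenNLP API: the names ProgressSketch, ProgressMonitor, AccuracyDeltaStop, CollectingMonitor and train are all hypothetical. It assumes one plausible reading, namely that the two counts let the monitor report per-iteration accuracy, while the stop criterion only explains why training ended.

```java
import java.util.ArrayList;
import java.util.List;

public class ProgressSketch {

  /** Hypothetical callback interface; NOT the interface introduced by this PR. */
  public interface ProgressMonitor {
    void finishedIteration(int iteration, int numberCorrectEvents, int totalEvents);
    void finishedTraining(int iterations, String stopReason);
  }

  /** Hypothetical criterion: stop once accuracy changes by less than a tolerance. */
  public static final class AccuracyDeltaStop {
    private final double tolerance;

    public AccuracyDeltaStop(double tolerance) {
      this.tolerance = tolerance;
    }

    public boolean test(double previousAccuracy, double currentAccuracy) {
      return Math.abs(currentAccuracy - previousAccuracy) < tolerance;
    }

    public String message() {
      return "accuracy change fell below " + tolerance;
    }
  }

  /** Collects progress lines in memory so the caller decides where to show them. */
  public static final class CollectingMonitor implements ProgressMonitor {
    public final List<String> lines = new ArrayList<>();

    @Override
    public void finishedIteration(int iteration, int numberCorrectEvents, int totalEvents) {
      // The two counts are only needed here, to report accuracy per iteration.
      double accuracy = (double) numberCorrectEvents / totalEvents;
      lines.add(iteration + ": " + numberCorrectEvents + "/" + totalEvents + " = " + accuracy);
    }

    @Override
    public void finishedTraining(int iterations, String stopReason) {
      lines.add("Stopped after " + iterations + " iterations: " + stopReason);
    }
  }

  /**
   * Simulated training loop. correctPerIteration stands in for the counts a
   * real trainer would compute; returns the number of iterations performed.
   */
  public static int train(int[] correctPerIteration, int totalEvents,
                          AccuracyDeltaStop stop, ProgressMonitor monitor) {
    double previous = 0.0;
    for (int i = 0; i < correctPerIteration.length; i++) {
      int iteration = i + 1;
      monitor.finishedIteration(iteration, correctPerIteration[i], totalEvents);
      double current = (double) correctPerIteration[i] / totalEvents;
      // The criterion needs a previous value, so it is skipped on the first iteration.
      if (i > 0 && stop.test(previous, current)) {
        monitor.finishedTraining(iteration, stop.message());
        return iteration;
      }
      previous = current;
    }
    monitor.finishedTraining(correctPerIteration.length, "iteration limit reached");
    return correctPerIteration.length;
  }

  public static void main(String[] args) {
    CollectingMonitor monitor = new CollectingMonitor();
    train(new int[] {60, 80, 89, 90, 90, 95}, 100, new AccuracyDeltaStop(0.005), monitor);
    monitor.lines.forEach(System.out::println);
  }
}
```

Under this reading the stop criterion alone is enough to decide when training ends, but the per-iteration counts are what a console or UI monitor would need in order to show progress while training is still running.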
opennlp-tools/src/main/java/opennlp/tools/monitoring/StopCriteria.java
opennlp-tools/src/main/java/opennlp/tools/monitoring/TrainingProgressMonitor.java
opennlp-tools/src/test/java/opennlp/tools/monitoring/DefaultTrainingProgressMonitorTest.java
Hi reviewers, all checks are green now and this is ready for review. Please take a look when you can.
Thx @NishantShri4 for moving this topic forward! Could you squash those commits and force push the resulting single commit? Once available, we'll have a detailed look and provide feedback.
Thanks @mawiesne. This is done (rebase, squash and force push).
Thx @NishantShri4 for providing this substantial contribution. I've left feedback by comments to further improve it. Once addressed, I'll re-check and potentially, @rzo1 can add his final thoughts/checks then.
opennlp-tools/src/main/java/opennlp/tools/ml/AbstractTrainer.java
opennlp-tools/src/main/java/opennlp/tools/ml/TrainerFactory.java
opennlp-tools/src/test/java/opennlp/tools/monitoring/DefaultTrainingProgressMonitorTest.java
opennlp-tools/src/test/java/opennlp/tools/monitoring/IterDeltaAccuracyUnderToleranceTest.java
opennlp-tools/src/test/java/opennlp/tools/monitoring/LogLikelihoodThresholdBreachedTest.java
LGTM.
We are currently discussing whether an ICLA is required due to the size of this contribution. Stay tuned 🙂
Thanks very much @mawiesne for the detailed review earlier. Very useful. I have pushed changes to address the review comments. I can see two approvals now. Thanks to the approvers for their time.
@NishantShri4 can you fill out an ICLA for your contribution please? Details can be found here: https://www.apache.org/licenses/contributor-agreements.html If you sign it, please add "OpenNLP" in the section "notify project". You don't need to add an Apache ID. Thanks!
Thanks @rzo1. This is done (signed ICLA is sent to [email protected]).
Thanks again (and of course, we truly appreciate your contribution)! We’ll go ahead and merge this PR once we get confirmation from the secretary.
commit 52955e9
Author: Nishant Shrivastava <[email protected]>
Date: Sat Jun 14 18:50:09 2025 +0100
    OPENNLP-1745: SentenceDetector - Add Junit test for useTokenEnd = false

commit fe59eb9
Merge: 67ac7b2 05f69a4
Author: Nishant Shrivastava <[email protected]>
Date: Sat Jun 14 07:29:36 2025 +0100
    Merge remote-tracking branch 'origin/main'

commit 05f69a4
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Mon Jun 9 11:49:38 2025 +0200
    OPENNLP-1724: Update JUnit to 5.13.1 (apache#790)
    Bumps `junit.version` from 5.13.0 to 5.13.1.
    Updates `org.junit.jupiter:junit-jupiter-api`, `org.junit.jupiter:junit-jupiter-engine` and `org.junit.jupiter:junit-jupiter-params` from 5.13.0 to 5.13.1 (semver-patch).
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 32f4ef7
Author: Richard Zowalla <[email protected]>
Date: Sat Jun 7 21:21:36 2025 +0200
    Disable merge request requirement for opennlp-2.x (apache#789)

commit 8abfe0d
Author: Richard Zowalla <[email protected]>
Date: Sat Jun 7 20:45:08 2025 +0200
    Remove code review requirement for 2.x branch to allow cherry picking already reviewed commits. (apache#788)

commit 89e4260
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Mon Jun 2 09:17:43 2025 +0200
    OPENNLP-1724: Update JUnit to 5.13.0 (apache#787)
    Bumps `junit.version` from 5.12.2 to 5.13.0.
    Updates `org.junit.jupiter:junit-jupiter-api`, `org.junit.jupiter:junit-jupiter-engine` and `org.junit.jupiter:junit-jupiter-params` from 5.12.2 to 5.13.0 (semver-minor).
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 2c8e58b
Author: Martin Wiesner <[email protected]>
Date: Sat May 24 20:59:20 2025 +0200
    OPENNLP-1708: Raise OpenNLP version to 3.x on main branch (apache#785)
    * OPENNLP-1708: Raise OpenNLP version to 3.x on main branch
    - adjusts all pom.xml files towards 3.0.0-SNAPSHOT
    - adjusts upper major model version to 3.x
    - adds static method Version#between for simpler version range checks in BaseModel
    - adds 'opennlp-2.x' branch to protected branches in .asf.yml
    - updates README.md with infos on 'Branches and Merging Strategy'
    - cures a typo
    - adds external link to the ONNX website

commit 0db3c10
Author: Richard Zowalla <[email protected]>
Date: Tue May 20 21:35:27 2025 +0200
    OPENNLP-1545 - Close ZipInputStream in BaseModel (apache#784)

commit 2ed9949
Author: Martin Wiesner <[email protected]>
Date: Tue May 20 16:26:59 2025 +0200
    OPENNLP-1734: Adjust GH CI config to build with Java 25-ea (apache#781)

commit 5eec98c
Author: NishantShri4 <[email protected]>
Date: Thu May 15 09:25:06 2025 +0100
    OPENNLP-1731: Add Junits for NGramLanguageModelTool (apache#778)
    * OPENNLP-1731: Add Junits for NGramLanguageModelTool
    * OPENNLP-1731: AbstractLoggerTest : Corrected a javadoc comment.
    * OPENNLP-1731: Add Junits for NGramLanguageModelTool
    * OPENNLP-1731: AbstractLoggerTest : Corrected a javadoc comment.
    * OPENNLP-1731: Fixed a Generic RawType warning.
    * OPENNLP-1731: Rebased against upstream.
    * OPENNLP-1731: Rebased against upstream.
    * OPENNLP-1731: Rebased against upstream (removed extra new line).
    * OPENNLP-1731: Removed an extra newline.

commit 67ac7b2
Author: Nishant Shrivastava <[email protected]>
Date: Mon May 12 20:23:00 2025 +0100
    OPENNLP-1731: Removed an extra newline.

commit 0d95dd9
Merge: 35de220 2580a20
Author: Nishant Shrivastava <[email protected]>
Date: Mon May 12 20:20:59 2025 +0100
    Merge remote-tracking branch 'origin/main'
    # Conflicts:
    # opennlp-tools/src/test/java/opennlp/tools/AbstractLoggerTest.java

commit 35de220
Author: Nishant Shrivastava <[email protected]>
Date: Mon May 12 20:19:54 2025 +0100
    OPENNLP-1731: Rebased against upstream (removed extra new line).

commit e09f2ad
Author: Nishant Shrivastava <[email protected]>
Date: Mon May 12 20:18:29 2025 +0100
    OPENNLP-1731: Rebased against upstream.

commit 6d84e2f
Author: Nishant Shrivastava <[email protected]>
Date: Mon May 12 20:16:21 2025 +0100
    OPENNLP-1731: Rebased against upstream.

commit 2580a20
Merge: 0a20ef5 46d2d78
Author: Nishant Shrivastava <[email protected]>
Date: Mon May 12 20:09:17 2025 +0100
    Merge remote-tracking branch 'origin/main'
    # Conflicts:
    # opennlp-tools/src/test/java/opennlp/tools/monitoring/DefaultTrainingProgressMonitorTest.java

commit 0a20ef5
Author: Nishant Shrivastava <[email protected]>
Date: Mon May 12 20:06:51 2025 +0100
    OPENNLP-1731: Fixed a Generic RawType warning.

commit cfa425f
Author: Nishant Shrivastava <[email protected]>
Date: Sun May 11 17:32:59 2025 +0100
    OPENNLP-1731: AbstractLoggerTest : Corrected a javadoc comment.

commit a7eb44a
Author: Nishant Shrivastava <[email protected]>
Date: Sat May 10 23:24:16 2025 +0100
    OPENNLP-1731: Add Junits for NGramLanguageModelTool

commit f7be29d
Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Date: Mon May 12 20:35:53 2025 +0200
    Minor: Regenerated NOTICE File for 21a2a2a (apache#783)
    Signed-off-by: GitHub <[email protected]>
    Co-authored-by: mawiesne <[email protected]>

commit 21a2a2a
Author: Martin Wiesner <[email protected]>
Date: Mon May 12 20:34:19 2025 +0200
    OPENNLP-1733: Remove implements Serializable from LanguageDetector (apache#780)

commit 7c72cb0
Author: Martin Wiesner <[email protected]>
Date: Mon May 12 20:32:46 2025 +0200
    OPENNLP-1732: Eliminate use of raw types for StopCriteria (apache#779)

commit e4f5ce2
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Mon May 12 20:32:10 2025 +0200
    OPENNLP-1730: Update ONNX runtime to 1.22.0 (apache#782)
    Bumps `onnxruntime.version` from 1.21.1 to 1.22.0.
    Updates `com.microsoft.onnxruntime:onnxruntime` and `com.microsoft.onnxruntime:onnxruntime_gpu` from 1.21.1 to 1.22.0 (semver-minor).
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 46d2d78
Author: Nishant Shrivastava <[email protected]>
Date: Sun May 11 17:32:59 2025 +0100
    OPENNLP-1731: AbstractLoggerTest : Corrected a javadoc comment.

commit 01a4695
Author: Nishant Shrivastava <[email protected]>
Date: Sat May 10 23:24:16 2025 +0100
    OPENNLP-1731: Add Junits for NGramLanguageModelTool

commit 1675317
Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Date: Wed Apr 30 18:16:50 2025 +0200
    Minor: Regenerated NOTICE File for 95cd7c8 (apache#776)
    Signed-off-by: GitHub <[email protected]>
    Co-authored-by: mawiesne <[email protected]>

commit 95cd7c8
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Wed Apr 30 18:15:51 2025 +0200
    OPENNLP-1730: Update ONNX runtime to 1.21.1 (apache#774)
    Bumps `onnxruntime.version` from 1.21.0 to 1.21.1.
    Updates `com.microsoft.onnxruntime:onnxruntime` and `com.microsoft.onnxruntime:onnxruntime_gpu` from 1.21.0 to 1.21.1 (semver-patch).
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 7c85b94
Author: Martin Wiesner <[email protected]>
Date: Wed Apr 30 18:12:19 2025 +0200
    OPENNLP-1729: Provide easier loading of Models for given model lang and type (apache#775)
    - extracts ModelType from DownloadUtil
    - adds new methods to ClassPathModelLoader to obtain actual model instances easily
    - adds ClassPathModelProvider interface
    - adds DefaultClassPathModelProvider which combines existing classes to achieve easier access to model objects via classpath loading
    - adds JUnit tests for the new classes
    - adds and improves JavaDoc

commit 28e2de6
Author: NishantShri4 <[email protected]>
Date: Fri Apr 25 18:59:14 2025 +0100
    OPENNLP-124: Maxent/Perceptron training should report progress back via an API (apache#758)
    * OPENNLP-124 : Maxent/Perceptron training should report progress back via an API
    * OPENNLP-124 : Fixed Review Comments
    * OPENNLP-124 : Updated javadoc for the new Trainer.init method

commit 2720a1b
Author: Martin Wiesner <[email protected]>
Date: Fri Apr 25 17:32:20 2025 +0200
    OPENNLP-1728: Improve JavaDoc of opennlp.tools.models package (apache#772)

commit e1843dc
Author: Martin Wiesner <[email protected]>
Date: Wed Apr 23 21:42:29 2025 +0200
    OPENNLP-1727: Correct example snippet for loading a model from the classpath (apache#771)
Thank you for contributing to Apache OpenNLP.
In order to streamline the review of the contribution we ask you to ensure the following steps have been taken:
For all changes:
- Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
- Does your PR title start with OPENNLP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- Has your PR been rebased against the latest commit within the target branch (typically main)?
- Is your initial contribution a single, squashed commit?
For code changes:
For documentation related changes:
Note:
Please ensure that once the PR is submitted, you check GitHub Actions for build issues and submit an update to your PR as soon as possible.