This repository was archived by the owner on Dec 13, 2021. It is now read-only.

JOSHUA-252 Make it possible to use Maven to build Joshua #12

Merged: 29 commits, Jun 1, 2016

Conversation

lewismc
Member

@lewismc lewismc commented May 14, 2016

Hi Folks,
This PR is a beast. I think a Google Hangout would be best to talk through what is going on.
Basically, it:

  • Mavenizes the entire codebase, i.e. sets out the src/main/java/org/apache/joshua/... structure.
  • Goes a long way toward implementing the correct structure and notation for unit testing, e.g. src/test/java/org/apache/joshua/...
  • Moves everything from $JOSHUA_HOME/tst to the correct src/test/java/... directories, then removes tst.
  • Moves all test .java files from $JOSHUA_HOME/test to src/test/java/...
  • Removes the previous Eclipse project definitions which were committed to the repo.
  • Removes the thrax submodule. We should just pull this in as a dependency from now on. I've therefore put up a PR for this: "Publish thrax to Sonatype" (joshua-decoder/thrax#11).
    ...

As I expected, there are a number of files which reference objects that no longer exist. If you pull this PR into a branch and execute mvn clean install, you will see that this is the case for around 6 or 7 test files. We need to collectively decide what to do about the missing imports.

Apart from that, the work still to be done here is as follows:

  • Ensure that all testing resources are moved into $JOSHUA_HOME/src/test/resources. This folder does not exist yet.
  • Remove build.xml.
  • Plan how we are going to compile and package GIZA++, KenLM, etc. This is something @KellenSunderland, @buggtd and I discussed at the ApacheCon meetup; however, there is still work to be done here.

The first thing, I think, is to get the test classes compiling. Once we have done that, we can move on to getting the tests passing and stabilizing the Java build code.
After that we can move on to the C++ build challenges.

All in all, a VERY successful week for Joshua. Looking forward to stabilizing the Maven build, as it will make things so much easier for us all moving forward.

@thammegowda
Member

@lewismc Please merge this ASAP.
This is a great PR. Maven for the win!

@thammegowda
Member

My Ant build is failing:

[ivy:resolve]     http://ivyroundup.googlecode.com/svn/trunk/repo/modules/org.apache.commons/commons-cli/1.2/packager.xml
[ivy:resolve]       ::::::::::::::::::::::::::::::::::::::::::::::
[ivy:resolve]       ::          UNRESOLVED DEPENDENCIES         ::
[ivy:resolve]       ::::::::::::::::::::::::::::::::::::::::::::::
[ivy:resolve]       :: net.sourceforge.ant-doxygen#ant-doxygen;1.6.1: not found
[ivy:resolve]       :: org.apache.commons#commons-cli;1.2: not found
[ivy:resolve]       ::::::::::::::::::::::::::::::::::::::::::::::
[ivy:resolve] 
[ivy:resolve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS

Merging this PR would help me move forward on resolving other issues.

@lewismc
Member Author

lewismc commented May 16, 2016

The PR is not finished yet. There are a number of issues as highlighted above which need attention.

@KellenSunderland
Contributor

Hey Lewis. I'm flying today but can take a look at getting the tests to
compile next week. Thanks for all the work. I'd be game for a google
hangout to discuss any issues we have.


@lewismc
Member Author

lewismc commented May 16, 2016

ACK
Safe flight Kellen.


Lewis

@thammegowda
Member

This is indeed a big change, and I think it's better to split it into multiple small PRs.

Maybe we can have a 'maven' branch in parallel for a short transition time,
resolve the remaining issues on that branch with smaller PRs and merge it with master?

@mjpost
Contributor

mjpost commented May 16, 2016

I second @thammegowda's idea of merging this into a "maven" branch first which will also continue to pull the latest from master. This is a big change and I'm not going to have time to look at it too closely for a while yet (probably later next week).

@lewismc
Member Author

lewismc commented May 16, 2016

I agree as well.
It's pretty huge, as I said, so we need lots of eyes on it.
The current compilation issues are as follows:

[INFO] 100 errors
[INFO] -------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 9.155 s
[INFO] Finished at: 2016-05-15T20:09:45-07:00
[INFO] Final Memory: 29M/252M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.3.2:testCompile (default-testCompile) on project joshua: Compilation failure: Compilation failure:
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/ArityPhrasePenaltyFFTest.java:[21,38] error: cannot find symbol
[ERROR] package org.apache.joshua.decoder.ff.tm
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/ArityPhrasePenaltyFFTest.java:[22,38] error: cannot find symbol
[ERROR] package org.apache.joshua.decoder.ff.tm
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[192,22] error: cannot find symbol
[ERROR] class ArpaFileTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/packed/PrintRules.java:[29,42] error: package org.apache.joshua.util.quantization does not exist
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/packed/PrintRules.java:[30,42] error: package org.apache.joshua.util.quantization does not exist
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/packed/PrintRules.java:[42,10] error: cannot find symbol
[ERROR] class PrintRules
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/ArityPhrasePenaltyFFTest.java:[38,24] error: cannot find symbol
[ERROR] class ArityPhrasePenaltyFFTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/ArityPhrasePenaltyFFTest.java:[46,16] error: cannot find symbol
[ERROR] class ArityPhrasePenaltyFFTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/ArityPhrasePenaltyFFTest.java:[50,4] error: cannot find symbol
[ERROR] class ArityPhrasePenaltyFFTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/ArityPhrasePenaltyFFTest.java:[50,47] error: cannot find symbol
[ERROR] class ArityPhrasePenaltyFFTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/ArityPhrasePenaltyFFTest.java:[58,25] error: cannot find symbol
[ERROR] class ArityPhrasePenaltyFFTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/ArityPhrasePenaltyFFTest.java:[60,69] error: cannot find symbol
[ERROR] class ArityPhrasePenaltyFFTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[106,4] error: cannot find symbol
[ERROR] class ArpaFileTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[106,28] error: cannot find symbol
[ERROR] class ArpaFileTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[118,4] error: cannot find symbol
[ERROR] class ArpaFileTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[118,28] error: cannot find symbol
[ERROR] class ArpaFileTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[124,9] error: cannot find symbol
[ERROR] class ArpaFileTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[156,4] error: cannot find symbol
[ERROR] class ArpaFileTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[156,28] error: cannot find symbol
[ERROR] class ArpaFileTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[163,4] error: cannot find symbol
[ERROR] class ArpaFileTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[163,28] error: cannot find symbol
[ERROR] class ArpaFileTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[165,4] error: cannot find symbol
[ERROR] class ArpaFileTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[165,20] error: cannot find symbol
[ERROR] class ArpaFileTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[172,4] error: cannot find symbol
[ERROR] class ArpaFileTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[172,28] error: cannot find symbol
[ERROR] class ArpaFileTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[174,4] error: cannot find symbol
[ERROR] class ArpaFileTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[174,20] error: cannot find symbol
[ERROR] class ArpaFileTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[183,27] error: constructor LMGrammarBerkeley in class LMGrammarBerkeley cannot be applied to given types;
[ERROR] actual and formal argument lists differ in length
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[194,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[195,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[196,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[197,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[198,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[199,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[200,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[201,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[204,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[204,92] error: cannot find symbol
[ERROR] class JoshuaConfiguration
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[207,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[208,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[209,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[210,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[213,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[216,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[217,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[218,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[219,52] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java:[222,23] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/zmert/BLEUTest.java:[64,28] error: maxGramLength has protected access in BLEU
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/zmert/BLEUTest.java:[67,28] error: effLengthMethod has protected access in BLEU
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/zmert/BLEUTest.java:[67,50] error: EffectiveLengthMethod is not public in BLEU; cannot be accessed from outside package
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/lattice/LatticeTest.java:[108,37] error: cannot find symbol
[ERROR] class Lattice
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/packed/CountRules.java:[49,26] error: incompatible types: String cannot be converted to File
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/packed/PrintRules.java:[63,24] error: incompatible types: String cannot be converted to File
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/packed/PrintRules.java:[66,23] error: cannot find symbol
[ERROR] class PrintRules
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/packed/PrintRules.java:[160,6] error: cannot find symbol
[ERROR] class PrintRules
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/packed/VocabTest.java:[32,41] error: incompatible types: String cannot be converted to File
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/DecoderThreadTest.java:[78,38] error: cannot find symbol
[ERROR] class Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/DecoderThreadTest.java:[82,28] error: cannot find symbol
[ERROR] class DecoderThreadTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/DecoderThreadTest.java:[94,38] error: cannot find symbol
[ERROR] class Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/DecoderThreadTest.java:[100,28] error: cannot find symbol
[ERROR] class DecoderThreadTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/DecoderThreadTest.java:[112,6] error: cannot find symbol
[ERROR] class DecoderThreadTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/DecoderThreadTest.java:[112,33] error: cannot find symbol
[ERROR] class DecoderThreadTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/DecoderThreadTest.java:[141,6] error: cannot find symbol
[ERROR] class DecoderThreadTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/DecoderThreadTest.java:[141,35] error: cannot find symbol
[ERROR] class DecoderThreadTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/DecoderThreadTest.java:[148,6] error: cannot find symbol
[ERROR] class DecoderThreadTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/decoder/DecoderThreadTest.java:[148,38] error: cannot find symbol
[ERROR] class DecoderThreadTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/system/MultithreadedTranslationTests.java:[110,16] error: cannot find symbol
[ERROR] variable joshuaConfig of type JoshuaConfiguration
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/system/MultithreadedTranslationTests.java:[118,4] error: cannot find symbol
[ERROR] class MultithreadedTranslationTests
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/system/MultithreadedTranslationTests.java:[118,33] error: cannot find symbol
[ERROR] class MultithreadedTranslationTests
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/CorpusArrayTest.java:[48,6] error: cannot find symbol
[ERROR] class CorpusArrayTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/CorpusArrayTest.java:[49,6] error: cannot find symbol
[ERROR] class CorpusArrayTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/CorpusArrayTest.java:[49,27] error: cannot find symbol
[ERROR] class CorpusArrayTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/CorpusArrayTest.java:[54,6] error: cannot find symbol
[ERROR] class CorpusArrayTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/CorpusArrayTest.java:[54,45] error: cannot find symbol
[ERROR] class CorpusArrayTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/CorpusArrayTest.java:[91,4] error: cannot find symbol
[ERROR] class CorpusArrayTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/CorpusArrayTest.java:[107,32] error: cannot find symbol
[ERROR] class Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/CorpusArrayTest.java:[110,22] error: cannot find symbol
[ERROR] class CorpusArrayTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/CorpusArrayTest.java:[142,16] error: cannot find symbol
[ERROR] class Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/CorpusArrayTest.java:[143,6] error: cannot find symbol
[ERROR] class CorpusArrayTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/CorpusArrayTest.java:[143,27] error: cannot find symbol
[ERROR] class CorpusArrayTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/CorpusArrayTest.java:[147,6] error: cannot find symbol
[ERROR] class CorpusArrayTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/CorpusArrayTest.java:[147,45] error: cannot find symbol
[ERROR] class CorpusArrayTest
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/util/io/BinaryTest.java:[48,23] error: constructor Vocabulary in class Vocabulary cannot be applied to given types;
[ERROR] actual and formal argument lists differ in length
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/util/io/BinaryTest.java:[55,11] error: cannot find symbol
[ERROR] variable vocab of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/util/io/BinaryTest.java:[57,36] error: type argument Vocabulary is not within bounds of type-variable E
[ERROR]
[ERROR] E extends Externalizable declared in class BinaryIn
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/system/StructuredTranslationTest.java:[118,16] error: cannot find symbol
[ERROR] variable joshuaConfig of type JoshuaConfiguration
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/system/StructuredTranslationTest.java:[131,16] error: cannot find symbol
[ERROR] variable joshuaConfig of type JoshuaConfiguration
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/system/StructuredTranslationTest.java:[146,16] error: cannot find symbol
[ERROR] variable joshuaConfig of type JoshuaConfiguration
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/system/StructuredTranslationTest.java:[168,16] error: cannot find symbol
[ERROR] variable joshuaConfig of type JoshuaConfiguration
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/system/StructuredTranslationTest.java:[187,16] error: cannot find symbol
[ERROR] variable joshuaConfig of type JoshuaConfiguration
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/system/StructuredTranslationTest.java:[207,16] error: cannot find symbol
[ERROR] variable joshuaConfig of type JoshuaConfiguration
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/vocab/VocabularyTest.java:[48,24] error: constructor Vocabulary in class Vocabulary cannot be applied to given types;
[ERROR] actual and formal argument lists differ in length
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/vocab/VocabularyTest.java:[52,29] error: cannot find symbol
[ERROR] variable vocab1 of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/vocab/VocabularyTest.java:[54,29] error: no suitable method found for getWords(no arguments)
[ERROR] is not applicable
[ERROR] (actual and formal argument lists differ in length)
[ERROR] method Vocabulary.getWords(Iterable<Integer>) is not applicable
[ERROR] (actual and formal argument lists differ in length)
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/vocab/VocabularyTest.java:[55,28] error: cannot find symbol
[ERROR] variable vocab1 of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/vocab/VocabularyTest.java:[55,51] error: cannot find symbol
[ERROR] class Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/vocab/VocabularyTest.java:[56,30] error: no suitable method found for getWords(no arguments)
[ERROR] is not applicable
[ERROR] (actual and formal argument lists differ in length)
[ERROR] method Vocabulary.getWords(Iterable<Integer>) is not applicable
[ERROR] (actual and formal argument lists differ in length)
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/vocab/VocabularyTest.java:[56,49] error: cannot find symbol
[ERROR] variable vocab1 of type Vocabulary
[ERROR] /usr/local/incubator-joshua/src/test/java/org/apache/joshua/corpus/vocab/VocabularyTest.java:[59,49] error: UNKNOWN_WORD is not public in Vocabulary; cannot be accessed from outside package
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException

@lewismc
Member Author

lewismc commented May 16, 2016

@mjpost do you know where I can find the class ArpaFile.java? It is referenced within src/test/java/org/apache/joshua/decoder/ff/lm/ArpaFileTest.java but is not available within the master source.

@mjpost
Contributor

mjpost commented May 16, 2016

I don't. However, the "Arpa" denotes the standard SRILM format for language model files (read by both BerkeleyLM and KenLM), so the file would have been very similar to this one in BerkeleyLM.

As a warning, I have not touched many of those unit test files in years, and many of them may be way, way out of date.

@lewismc
Member Author

lewismc commented May 16, 2016

ACK @mjpost I knew this. I am tempted to merely skip the ones for which classes no longer exist. I'm going to try and stabilize the build tonight.

@lewismc
Member Author

lewismc commented May 16, 2016

@mjpost
Contributor

mjpost commented May 16, 2016

Wow, nice. We might want to put some of that back in there for completeness / posterity.

@thammegowda
Member

+1
@lewismc I will then fix logging, System.exit() calls and other trivial issues in the code.

@lewismc
Member Author

lewismc commented May 16, 2016

Hi Folks, this is now a feature branch and can be found at https://github.com/apache/incubator-joshua/tree/JOSHUA-252
I have set up a build for the branch as well to make interpretation and understanding a little clearer. The build can be seen at https://builds.apache.org/view/H-L/view/Joshua/job/joshua_maven/
I would kindly ask that all code committed to master branch also be committed to the JOSHUA-252 branch otherwise the codebases will quickly diverge.

@lewismc
Member Author

lewismc commented May 16, 2016

The next step here is for me to stabilize the test suite. This will involve putting the test resources into src/test/resources and then removing the top-level test directory. I'll work on this over the next few days.

@mjpost
Contributor

mjpost commented May 17, 2016

I didn't see your JOSHUA-252 branch. I just created a "maven" branch that pulls PRs 4 and 5 into this one. However, after pushing that up, I don't see that on github / incubator either. It seems that Apache is not pushing branches down to github? Do you know how to fix this?

Everything seems to compile for me when I type the commands @thammegowda provided. However, I don't really know how to use Maven, and eclipse is totally broken. Can you add a BUILD file that describes how to compile and how to get things set up in eclipse?

@lewismc
Member Author

lewismc commented May 17, 2016

Yep I'll add all of this to the wiki space. In the meantime you can do

mvn eclipse:eclipse

Easy as that


Lewis

@thammegowda
Member

@lewismc I tried to create a training model from the Maven build and found that the pipeline.pl script uses berkeleyaligner.jar.

  • berkeleyaligner.jar is not in a Maven repo. I see a project in @mjpost's GitHub which has an Ant build, so I couldn't do a local Maven install to try it out. Do we need to port that to a Maven build, or do you know a quick way to install an Ant-built jar into the local Maven repository?
  • The original licence is GPL v2; I am not sure whether it has been relicensed to Apache. Can we add this as a dependency to create a fat jar of all dependencies?

@lewismc
Member Author

lewismc commented May 17, 2016

You can build it locally by simply using a system-scoped dependency such as the following:

    <dependency>
      <groupId>edu.berkeley.nlp</groupId>
      <artifactId>jberkeleyaligner</artifactId>
      <version>X</version>
      <scope>system</scope>
      <systemPath>/path/to/berkeleyaligner.jar</systemPath>
    </dependency>

I think you can remove the version element for the time being and try that out. We may need to create a Maven build for Berkeley Aligner and push it to Maven Central.

@mjpost
Contributor

mjpost commented May 27, 2016

Okay, I just pushed up the changes after merging in master and making some other changes so that the tests will run (with manual calls). The code on master runs fine, so it must have been something in the 252 or 262 branches. I've done some initial work wrapping LOG.debug() calls in checks for LOG.isDebugEnabled(), but that didn't seem to help. All my tests were done with

log4j.rootLogger=OFF, stdout

I may have some more time to look at this tomorrow.

@lewismc
Member Author

lewismc commented May 27, 2016

What kind of 'tests' are we talking about here? The only tests which I know of are invoked by running mvn clean test. These execute very quickly by the looks of it; however, I assume that they cover little or none of the multithreaded functionality you are referring to.
Do you have something we could run and profile to see what is taking so long, @mjpost? Thanks

One last thing: no, I changed absolutely no code. I did, however, introduce some classes, such as ArpaFile, which were required to get the current unit tests running.

@mjpost
Contributor

mjpost commented May 27, 2016

On a hunch, I went through and removed all the LOG.*() calls in Decoder, DecoderThread, JoshuaDecoder, Translation, Chart, DotChart, and ComputeNodeResult. This fixed the problem; the decoder now runs as fast as before.

@mjpost
Contributor

mjpost commented May 27, 2016

I went into joshua/decoder/chart_parser/ComputeNodeResult; adding and removing the following lines makes a huge difference in speed. So it looks like logging is neither (a) time-safe nor (b) thread-safe.

Or that something is wrong. I can't get the decoder to output any statements using the logging properties. It's sort of a pain to have to rebuild a tarball every time you want to test something...

FYI, I won't have any more time to spend on this before Tuesday.

if (LOG.isDebugEnabled()) {
  LOG.debug("ComputeNodeResult():");
  LOG.debug("-> RULE {}", rule);
}

[snip]

if (LOG.isDebugEnabled()) {
  LOG.debug("-> item.bestedge: {}", item);
  LOG.debug("-> TAIL NODE {}", item);
}

@mjpost
Contributor

mjpost commented May 27, 2016

Okay, can't stop scratching this itch.

On even further investigation, with some System.err.println()s, I discovered that the properties file isn't getting loaded correctly. Consequently, isDebugEnabled() is always returning true, but the logging statements are not printed anywhere visible. So the work is getting triggered, and is just not seen, and that is the problem.
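
To illustrate the effect described above, here is a minimal, self-contained sketch. It uses java.util.logging as a stand-in (an assumption made so the example runs without dependencies; the decoder itself logs through LOG.debug(...) with {} placeholders), and the names HiddenLoggingCost, Expensive, and DiscardingHandler are hypothetical. When the level check passes but the handler discards its output, every message argument is still rendered:

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

// Hypothetical demo class; not part of Joshua.
final class HiddenLoggingCost {

  /** Stand-in for a rule or hypergraph node: counts how often its string form is built. */
  static final class Expensive {
    int rendered = 0;

    @Override
    public String toString() {
      rendered++;
      return "expensive";
    }
  }

  /** Formats every record it receives and then throws the text away. */
  static final class DiscardingHandler extends Handler {
    private final SimpleFormatter formatter = new SimpleFormatter();

    @Override
    public void publish(LogRecord record) {
      formatter.formatMessage(record); // the work happens, the result is never shown
    }

    @Override
    public void flush() {}

    @Override
    public void close() {}
  }

  /** Issues `calls` guarded debug-style statements and returns how many times the argument was rendered. */
  static int renderCount(Level loggerLevel, int calls) {
    Logger log = Logger.getLogger("hidden.cost." + loggerLevel.getName());
    log.setUseParentHandlers(false); // keep the console quiet
    log.setLevel(loggerLevel);
    Handler handler = new DiscardingHandler();
    handler.setLevel(Level.ALL);
    log.addHandler(handler);

    Expensive item = new Expensive();
    for (int i = 0; i < calls; i++) {
      if (log.isLoggable(Level.FINE)) { // analogous to LOG.isDebugEnabled()
        log.log(Level.FINE, "-> item {0}", item);
      }
    }
    log.removeHandler(handler);
    return item.rendered;
  }

  public static void main(String[] args) {
    System.err.println("level FINE: " + renderCount(Level.FINE, 1000) + " renders");
    System.err.println("level OFF:  " + renderCount(Level.OFF, 1000) + " renders");
  }
}
```

With the level at FINE the argument is rendered once per call even though nothing is printed; with the level at OFF the guard skips the work entirely.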

I've pushed up the merge. Can you folks take care of making sure that the logger file is running correctly? You can test this by running

cd $JOSHUA/src/test/resources/bn-en/hiero/
time ./test.sh

Depending on your hardware, of course, that should run in just a few seconds tops.

Incidentally, there are at least two other problems I am hoping you can help address?

  • I see how to build a big tarball and run Maven that way, but that's a pain for development, since you have to incur the time of rebuilding the tarball. Is there a way to just run against the files in target/classes directly? I've tried this, but the problem is that the jar dependencies don't seem to get downloaded anywhere.

  • Your response might be, "actually, you just have to regenerate the tarball, that's the proper way to do things." In which case, can you describe how to make Eclipse do this? Right now, when I'm developing, Eclipse is always compiling files, so you can just switch over and test right away. It's a small but significant pain to have to type "(cd $JOSHUA; mvn compile assembly:single install)".

  • I really like being able to use the -v (verbosity) flag to Joshua, which sets the debugging level. This is much simpler than changing a properties file and linking to that from the command line. Is it possible to set the log4j debug level programmatically based on the value of -v? We could do

    -v 0 = OFF
    -v 1 = INFO
    -v 2 = DEBUG

Or I would be okay to switch to those values directly (e.g., "-v OFF"). But the former seems more Unix-like.
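A minimal sketch of the -v mapping proposed above, assuming a hypothetical helper class `VerbosityLevels` (not part of Joshua). It only translates the integer into a level name. With log4j 1.x on the classpath, the name could then be applied to the root logger, which is shown as a comment so the example runs without dependencies. Treating values above 2 as DEBUG and negative values as OFF is an assumption.

```java
public class VerbosityLevels {
    // Hypothetical helper: maps the value of the -v flag to a log level name.
    public static String levelNameFor(int verbosity) {
        if (verbosity <= 0) return "OFF";   // -v 0 = OFF (negatives treated the same, by assumption)
        if (verbosity == 1) return "INFO";  // -v 1 = INFO
        return "DEBUG";                     // -v 2 and above = DEBUG (assumption for values > 2)
    }

    public static void main(String[] args) {
        int v = args.length > 0 ? Integer.parseInt(args[0]) : 1;
        String name = levelNameFor(v);
        // With log4j 1.x available, this could become:
        // org.apache.log4j.Logger.getRootLogger()
        //     .setLevel(org.apache.log4j.Level.toLevel(name));
        System.err.println("log level: " + name);
    }
}
```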

Thanks again for all your work on this!

matt

@lewismc
Member Author

lewismc commented May 27, 2016

@thammegowda please revert logging patch.

@mjpost
Contributor

mjpost commented May 27, 2016

Just to be clear, not sure it needs to be reverted. We just need to find a way to make sure the config file gets read and that logging gets set to INFO by default. And logging should all go to STDERR, I think, not STDOUT.

In general, too, having INFO prepended to all informative lines is a bit ugly. Is this a useful convention for reasons I can't understand? It's often useful to see the decoder working with just these high-level labels, and I don't mind debug lines being prepended with DEBUG.

@lewismc
Member Author

lewismc commented May 27, 2016

Make an entry in logging file?
Parameterized logging as implemented by @thammegowda is magnitudes more efficient than traditional string + <?>Exception logging.
@thammegowda maybe we need INFO degraded (in the logging sense) to DEBUG.
Thank you @thammegowda for trying to understand the logging. When I was doing the build, that's something I didn't approach :)
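To illustrate the efficiency point, here is a small sketch that uses `java.util.logging` as a dependency-free stand-in for SLF4J (the `{}` placeholders in SLF4J defer formatting in a similar way). The `Rule` class and its call counter are hypothetical and exist only to make the cost visible: with debug output disabled, string concatenation still calls `toString()`, while the parameterized call does not.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LazyLoggingDemo {
    // Stand-in for an object whose toString() is expensive, e.g. a grammar rule.
    static class Rule {
        int toStringCalls = 0;

        @Override
        public String toString() {
            toStringCalls++;
            return "Rule[X -> a b c]";
        }
    }

    // Returns the number of toString() calls for each style while FINE is disabled.
    public static int[] countToStringCalls() {
        Logger log = Logger.getLogger("LazyLoggingDemo");
        log.setLevel(Level.INFO); // FINE (debug-level) messages are dropped

        Rule concatenated = new Rule();
        // The message string is built before fine() is even entered.
        log.fine("-> RULE " + concatenated);

        Rule parameterized = new Rule();
        // The level check fails first, so the argument is never formatted.
        log.log(Level.FINE, "-> RULE {0}", parameterized);

        return new int[] { concatenated.toStringCalls, parameterized.toStringCalls };
    }

    public static void main(String[] args) {
        int[] calls = countToStringCalls();
        System.out.println("concatenation: " + calls[0] + ", parameterized: " + calls[1]);
    }
}
```

This also suggests why explicit `isDebugEnabled()` guards are mostly needed only when computing the arguments themselves is expensive.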

@mjpost
Contributor

mjpost commented May 27, 2016

Yes, I hope I was clear that this is all great work. We just have to figure out how to get log4j to read the config file, which it's currently not doing. Then, we can make sure the test cases pass, and merge back into master?

@thammegowda
Member

@lewismc @mjpost Just looked into this. I will debug this issue and come back with my findings soon.

+1 for rethinking the log levels. We can lower some frequent INFO messages to DEBUG level.

@thammegowda
Member

Found the issue with the logger config file. It seems that overriding it with -Dlog4j.configuration is no longer supported.
Please review #18 and merge.

@thammegowda
Member

@mjpost I am unable to run test.sh script, because -

Caused by: java.lang.RuntimeException: java.lang.UnsatisfiedLinkError: no ken in java.library.path
    at org.apache.joshua.decoder.ff.lm.KenLM.(KenLM.java:52)
    ... 10 more
Caused by: java.lang.UnsatisfiedLinkError: no ken in java.library.path
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
    at java.lang.Runtime.loadLibrary0(Runtime.java:870)
    at java.lang.System.loadLibrary(System.java:1122)
    at org.apache.joshua.decoder.ff.lm.KenLM.(KenLM.java:43)
    ... 10 more
Exception in thread "main" java.lang.RuntimeException: * FATAL: could not find a feature 'LanguageModel'

How do I get past this?
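For context, the trace says the JVM could not find a native library named "ken" (for example `libken.so` on Linux) in any directory listed in `java.library.path`. The following diagnostic sketch, with a hypothetical class name `NativeLibCheck`, attempts the same `System.loadLibrary` call and prints the search path when it fails. It does not build or install KenLM; it only helps confirm where the JVM is looking.

```java
public class NativeLibCheck {
    // Tries to load a native library by name; returns false instead of throwing if it is missing.
    public static boolean tryLoad(String libName) {
        try {
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            System.err.println("Could not load '" + libName + "': " + e.getMessage());
            System.err.println("java.library.path = " + System.getProperty("java.library.path"));
            return false;
        }
    }

    public static void main(String[] args) {
        // "ken" is the library name taken from the stack trace above.
        String name = args.length > 0 ? args[0] : "ken";
        System.out.println(tryLoad(name) ? "loaded" : "not found");
    }
}
```

If the library exists somewhere on disk, passing `-Djava.library.path=<directory>` to `java` is the standard way to add that directory to the search path.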

@thammegowda
Member

These are the logs I am getting when the level is OFF, WARN, INFO and DEBUG.

I ran echo "Subject is not object" | bin/joshua > file.txt

off.txt
warn.txt
info.txt
debug.txt

@lewismc
Member Author

lewismc commented May 27, 2016

OK @mjpost thanks
@thammegowda for starters, more or less every file that you write at Apache, and that you wish to build community around, should probably have a license header.
https://github.com/apache/incubator-joshua/blob/JOSHUA-252/src/main/resources/log4j.properties
does not.
For starters, can you please update it.
Can you also please scope the Slf4j over Log4j I implemented over on Nutch for an example of how to help @mjpost with the logging issues.
It is important in this build that we retain the efficiency of everyone's commits over the build. The build is an enabler, not a master.

@lewismc
Member Author

lewismc commented May 27, 2016

It's more than overdue as well for me to say @thammegowda thank you. The Logging effort is absolutely paradise. I looked at a lot of code during reformat and your logging patches are very welcome. Thank you Sir.

@mjpost
Contributor

mjpost commented May 27, 2016

Okay, great! That worked. I assume the edits to $JOSHUA/bin/joshua mean that eclipse compiled files will override the jar? So I can do fast development in Eclipse?

We still have to solve the KenLM problem. My tests ran because I linked $JOSHUA/ext and $JOSHUA/lib to directories on master. Do we have ideas for how to pull in KenLM and compile it? And BerkeleyLM? I think that's the big thing remaining.

After that, I'll have to go over the log messages in a bit more detail.

@lewismc
Member Author

lewismc commented May 27, 2016

I assume the edits to $JOSHUA/bin/joshua mean that eclipse compiled files will override the jar? So I can do fast development in Eclipse?

Yes all you need to execute in your terminal is mvn eclipse:eclipse

We still have to solve the KenLM problem.

Yes we do :(

Do we have ideas for how to pull in KenLM and compile it? And BerkeleyLM?

https://bitbucket.org/atlassian/bash-maven-plugin

Honestly, that's the most I can find right now without hacking a Maven GitHub download build plugin. I am fed up of XML @mjpost :0 :) :) :)

@thammegowda
Member

Hi @lewismc Thanks so much.

For starters, can you please update it.

Oh, I missed the licence header. Got it. I do not yet have write permissions to directly fix it, so I guess I have to raise another PR to fix it.

Can you also please scope the Slf4j over Log4j I implemented over on Nutch for an example of how to help @mjpost with the logging issues.

Sorry, I am not clear on what you mean.

The Logging effort is absolutely paradise.

Thanks :-)

@thammegowda
Member

@mjpost

Okay, great! That worked. I assume the edits to $JOSHUA/bin/joshua mean that eclipse compiled files will override the jar? So I can do fast development in Eclipse?

It seems like you are having a hard time with the Eclipse Maven integration.
The last time I used Eclipse, there was an Eclipse plugin that recognized Maven projects. That plugin took care of all the complexities under the hood. If you have a proper setup, you can just go to "JoshuaDecoder" (or whichever class you want to run) and "Run main" (no need to package, because that takes time).

P.S. I use the IntelliJ IDEA open-source edition, and the Maven integration is a breeze.

@lewismc
Member Author

lewismc commented May 28, 2016

Yes @thammegowda, the process is welcome and appreciated, Sir. On a side note, have you tried the Docker containers?

@thammegowda
Member

thammegowda commented May 28, 2016

@lewismc Thank you very much, Sir.
Joshua in Docker - not yet tried so far, I will definitely try it.

I am much interested in your PR to Tika to integrate this translator 👍

@lewismc
Member Author

lewismc commented May 28, 2016

ack

@mjpost
Contributor

mjpost commented May 31, 2016

Another issue: BerkeleyLM is currently pulled from the Maven repository, but it is very old (1.1.2, but 1.1.6 is available). I'm pretty sure its author has abandoned it; how can we get it updated? Joshua has a few bug fixes on it, even. One option: we could just roll the codebase into Joshua (it's Apache 2.0).

@mjpost
Contributor

mjpost commented May 31, 2016

So here is what is outstanding:

  • We need to either (a) get a new BerkeleyLM in Maven (anyone know how to do this?) or (b) pull the codebase into Joshua (is this permissible, including the renaming?)
  • I think we can safely leave Thrax as an external dependency that the user has to install, and provide instructions on the website for how to do so.
  • We can do the same for KenLM, and make the BerkeleyLM build and runtime tools the default.

That way we don't have to mess with any shell scripts from Maven, which is not advised (because of portability concerns).

matt

@mjpost
Contributor

mjpost commented May 31, 2016

Unless there are any objections, I'd like to merge this branch back into master. There are a few small issues, but none that can't be addressed with documentation. I'd love to do this either tonight or tomorrow, so please voice any objections as soon as possible.

@chrismattmann
Contributor

+1 great work! Watch out, there are conflicts to resolve before merging @mjpost

@lewismc
Member Author

lewismc commented May 31, 2016

Hi @mjpost can you point me at the latest BerkeleyLM codebase? I can Mavenize it and get it in to Maven Central so we can use it as a dependency in pom.xml

@mjpost
Contributor

mjpost commented May 31, 2016

@lewismc See https://github.com/joshua-decoder/berkeleylm/, which has the 1.1.6 release, plus a few changes that I committed. The only important change is an ability to recognize compressed files that don't happen to end with .gz, but if it's easier to just take the official 1.1.6, that's fine with me; I don't think the changes are crucial.

@mjpost
Contributor

mjpost commented May 31, 2016

Actually @chrismattmann it's just a fast-forward at this point — all is merged on JOSHUA-252, it's just not showing up on this PR.

https://github.com/apache/incubator-joshua/tree/JOSHUA-252

@asfgit asfgit merged commit 9d6f84d into apache:master Jun 1, 2016

6 participants