
Mango Build Error #447

@ssabnis

Description

Hello,

I am new to genomics projects.
I am running Mango and encountering build errors. Any help is greatly appreciated.

I have the following setup:

Package Versions:

- Python 2.7.5
- java version "1.8.0_171"
- Scala code runner version 2.11.12
- Hadoop 3.1.0
- Spark 2.3.1
- npm 3.10.10

My .bashrc file entries:

export JAVA_HOME=/usr
export SPARK_HOME=/opt/spark/spark-2.3.1-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.7-src.zip:$PYTHONPATH

ASSEMBLY_DIR=/home/hadoop/mango/mango-assembly/target
ASSEMBLY_JAR="$(ls -1 "$ASSEMBLY_DIR" | grep "^mango-assembly[0-9A-Za-z\_\.-]*\.jar$" | grep -v javadoc | grep -v sources || true)"
export PYSPARK_SUBMIT_ARGS="--jars ${ASSEMBLY_DIR}/${ASSEMBLY_JAR} --driver-class-path ${ASSEMBLY_DIR}/${ASSEMBLY_JAR} pyspark-shell"
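
For what it's worth, one failure mode with this setup is that the `grep` pipeline silently matches nothing, so `PYSPARK_SUBMIT_ARGS` ends up pointing `--jars` at a non-existent path and the Py4J gateway dies with no useful message. A small sketch of a check I could add before the export (`pick_assembly_jar` is just a helper name I made up, not part of Mango):

```shell
# Sketch: resolve the mango assembly jar and fail loudly if it is missing,
# instead of handing spark-submit a bogus --jars path.
pick_assembly_jar() {
  # $1 = assembly directory (e.g. /home/hadoop/mango/mango-assembly/target)
  ls -1 "$1" 2>/dev/null \
    | grep '^mango-assembly[0-9A-Za-z_.-]*\.jar$' \
    | grep -v javadoc \
    | grep -v sources \
    | head -n 1
}

jar="$(pick_assembly_jar "$ASSEMBLY_DIR")"
if [ -z "$jar" ] || [ ! -f "$ASSEMBLY_DIR/$jar" ]; then
  echo "no mango assembly jar found in $ASSEMBLY_DIR -- run 'mvn package' first" >&2
fi
```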

Command: mvn package -P python

BUILD ERROR

bdgenomics/mango/test/alignment_test.py::AlignmentTest::test_coverage_distribution FAILED
bdgenomics/mango/test/alignment_test.py::AlignmentTest::test_fragment_distribution FAILED
bdgenomics/mango/test/alignment_test.py::AlignmentTest::test_indel_distribution FAILED
bdgenomics/mango/test/alignment_test.py::AlignmentTest::test_indel_distribution_maximal_bin_size FAILED
bdgenomics/mango/test/alignment_test.py::AlignmentTest::test_indel_distribution_no_elements FAILED
bdgenomics/mango/test/alignment_test.py::AlignmentTest::test_mapq_distribution FAILED
bdgenomics/mango/test/alignment_test.py::AlignmentTest::test_visualize_alignments FAILED
bdgenomics/mango/test/coverage_test.py::CoverageTest::test_coverage_distribution FAILED
bdgenomics/mango/test/coverage_test.py::CoverageTest::test_example_coverage FAILED
bdgenomics/mango/test/distribution_test.py::DistributionTest::test_cumulative_count_distribution FAILED
bdgenomics/mango/test/distribution_test.py::DistributionTest::test_fail_on_invalid_sample FAILED
bdgenomics/mango/test/distribution_test.py::DistributionTest::test_normalized_count_distribution FAILED
bdgenomics/mango/test/distribution_test.py::DistributionTest::test_sampling FAILED
bdgenomics/mango/test/feature_test.py::FeatureTest::test_visualize_features FAILED
bdgenomics/mango/test/notebook_test.py::NotebookTest::test_alignment_example FAILED
bdgenomics/mango/test/notebook_test.py::NotebookTest::test_coverage_example FAILED
bdgenomics/mango/test/notebook_test.py::NotebookTest::test_example FAILED
bdgenomics/mango/test/variant_test.py::VariantTest::test_visualize_variants FAILED

=================================== FAILURES ===================================
___________________ AlignmentTest.test_coverage_distribution ___________________
bdgenomics/mango/test/__init__.py:65: in setUp
    self.ss = SparkSession.builder.master('local[4]').appName(class_name).getOrCreate()
/opt/spark/spark-2.3.1-bin-hadoop2.7/python/pyspark/sql/session.py:173: in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
/opt/spark/spark-2.3.1-bin-hadoop2.7/python/pyspark/context.py:343: in getOrCreate
    SparkContext(conf=conf or SparkConf())
/opt/spark/spark-2.3.1-bin-hadoop2.7/python/pyspark/context.py:115: in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
/opt/spark/spark-2.3.1-bin-hadoop2.7/python/pyspark/context.py:292: in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

conf = <pyspark.conf.SparkConf object at 0x7f8fcdb7fd10>

    def launch_gateway(conf=None):
        """
        launch jvm gateway
        :param conf: spark configuration passed to spark-submit
        :return:
        """
        if "PYSPARK_GATEWAY_PORT" in os.environ:
            gateway_port = int(os.environ["PYSPARK_GATEWAY_PORT"])
            gateway_secret = os.environ["PYSPARK_GATEWAY_SECRET"]
        else:
            SPARK_HOME = _find_spark_home()
            # Launch the Py4j gateway using Spark's run command so that we pick up the
            # proper classpath and settings from spark-env.sh
            on_windows = platform.system() == "Windows"
            script = "./bin/spark-submit.cmd" if on_windows else "./bin/spark-submit"
            command = [os.path.join(SPARK_HOME, script)]
            if conf:
                for k, v in conf.getAll():
                    command += ['--conf', '%s=%s' % (k, v)]
            submit_args = os.environ.get("PYSPARK_SUBMIT_ARGS", "pyspark-shell")
            if os.environ.get("SPARK_TESTING"):
                submit_args = ' '.join([
                    "--conf spark.ui.enabled=false",
                    submit_args
                ])
            command = command + shlex.split(submit_args)

            # Create a temporary directory where the gateway server should write the connection
            # information.
            conn_info_dir = tempfile.mkdtemp()
            try:
                fd, conn_info_file = tempfile.mkstemp(dir=conn_info_dir)
                os.close(fd)
                os.unlink(conn_info_file)

                env = dict(os.environ)
                env["_PYSPARK_DRIVER_CONN_INFO_PATH"] = conn_info_file

                # Launch the Java gateway.
                # We open a pipe to stdin so that the Java gateway can die when the pipe is broken
                if not on_windows:
                    # Don't send ctrl-c / SIGINT to the Java gateway:
                    def preexec_func():
                        signal.signal(signal.SIGINT, signal.SIG_IGN)
                    proc = Popen(command, stdin=PIPE, preexec_fn=preexec_func, env=env)
                else:
                    # preexec_fn not supported on Windows
                    proc = Popen(command, stdin=PIPE, env=env)

                # Wait for the file to appear, or for the process to exit, whichever happens first.
                while not proc.poll() and not os.path.isfile(conn_info_file):
                    time.sleep(0.1)

                if not os.path.isfile(conn_info_file):
                    raise Exception("Java gateway process exited before sending its port number")
E                   Exception: Java gateway process exited before sending its port number

/opt/spark/spark-2.3.1-bin-hadoop2.7/python/pyspark/java_gateway.py:93: Exception
----------------------------- Captured stderr call -----------------------------
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/mango/mango-assembly/target/mango-assembly-0.0.2-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/spark/spark-2.3.1-bin-hadoop2.7/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2018-10-12 11:20:03 ERROR SparkUncaughtExceptionHandler:91 - Uncaught exception in thread Thread[main,5,main]
java.util.NoSuchElementException: key not found: _PYSPARK_DRIVER_CALLBACK_HOST
        at scala.collection.MapLike$class.default(MapLike.scala:228)
        at scala.collection.AbstractMap.default(Map.scala:59)
        at scala.collection.MapLike$class.apply(MapLike.scala:141)
        at scala.collection.AbstractMap.apply(Map.scala:59)
        at org.apache.spark.api.python.PythonGatewayServer$$anonfun$main$1.apply$mcV$sp(PythonGatewayServer.scala:50)
        at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1262)
        at org.apache.spark.api.python.PythonGatewayServer$.main(PythonGatewayServer.scala:37)
        at org.apache.spark.api.python.PythonGatewayServer.main(PythonGatewayServer.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
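
From what I have read, `key not found: _PYSPARK_DRIVER_CALLBACK_HOST` usually indicates that the Python-side pyspark and the JVM-side Spark classes come from different Spark releases (the gateway handshake changed in 2.3.1). A small sketch of how I could compare the two; the `spark_versions_match` helper is my own name, and it reads the `RELEASE` file that ships in the Spark binary distribution:

```python
import re

def spark_versions_match(pyspark_version, release_line):
    """Compare the pyspark package version against the first line of
    $SPARK_HOME/RELEASE (e.g. 'Spark 2.3.1 built for Hadoop 2.7.3')."""
    match = re.search(r"Spark (\S+)", release_line)
    return match is not None and match.group(1) == pyspark_version

# On a live setup (sketch; uses the SPARK_HOME exported above):
#   import os, pyspark
#   with open(os.path.join(os.environ["SPARK_HOME"], "RELEASE")) as f:
#       print(spark_versions_match(pyspark.__version__, f.readline()))
```

If this returns `False`, the pyspark picked up via `PYTHONPATH` is not the one that matches the `spark-submit` being launched.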
