Mango submit fails with parquet class not found #465

@SatyaGsk

I am using mango-distribution-0.0.1 on Hadoop 2.6.0-cdh5.14.4, Spark 2.2.0, and Scala 2.11.8.
mango-submit fails with a Parquet class-not-found error. I tried passing the Parquet package on the command line with --packages, but it did not help, as the output below shows. I have also included the layout of the data files on HDFS.

[sm@bluedata750 bin]$ ./mango-submit --packages org.apache.parquet:parquet-hadoop:1.8.2 /user/sm/hg19.17.2bit -genes /user/sm/ensGene.bb -reads /user/sm/chr17.7500000-7515000.sam.adam -variants /user/sm/chr17.adam -show_genotypes -discover
Using spark-submit=/usr/bin/spark2-submit
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/parquet/hadoop/metadata/CompressionCodecName
at org.bdgenomics.utils.cli.ParquetArgs$class.$init$(ParquetArgs.scala:40)
at org.bdgenomics.mango.cli.VizReadsArgs.<init>(VizReads.scala:252)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at java.lang.Class.newInstance(Class.java:442)
at org.bdgenomics.utils.cli.Args4j$.apply(Args4j.scala:34)
at org.bdgenomics.mango.cli.VizReads$.apply(VizReads.scala:196)
at org.bdgenomics.utils.cli.BDGCommandCompanion$class.main(BDGCommand.scala:33)
at org.bdgenomics.mango.cli.VizReads$.main(VizReads.scala:125)
at org.bdgenomics.mango.cli.VizReads.main(VizReads.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.parquet.hadoop.metadata.CompressionCodecName
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 21 more
[sm@bluedata750 bin]$ ./mango-submit /user/sm/hg19.17.2bit -genes /user/sm/ensGene.bb -reads /user/sm/chr17.7500000-7515000.sam.adam -variants /user/sm/chr17.adam -show_genotypes -discover
Using spark-submit=/usr/bin/spark2-submit
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/parquet/hadoop/metadata/CompressionCodecName
at org.bdgenomics.utils.cli.ParquetArgs$class.$init$(ParquetArgs.scala:40)
at org.bdgenomics.mango.cli.VizReadsArgs.<init>(VizReads.scala:252)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at java.lang.Class.newInstance(Class.java:442)
at org.bdgenomics.utils.cli.Args4j$.apply(Args4j.scala:34)
at org.bdgenomics.mango.cli.VizReads$.apply(VizReads.scala:196)
at org.bdgenomics.utils.cli.BDGCommandCompanion$class.main(BDGCommand.scala:33)
at org.bdgenomics.mango.cli.VizReads$.main(VizReads.scala:125)
at org.bdgenomics.mango.cli.VizReads.main(VizReads.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.parquet.hadoop.metadata.CompressionCodecName
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 21 more
[sm@bluedata750 bin]$ hadoop fs -ls /user/sm
Found 7 items
drwxrwx---+ - sm supergroup 0 2018-10-30 21:27 /user/sm/.sparkStaging
-rw-rw----+ 3 sm supergroup 5440756334 2018-10-30 21:03 /user/sm/LN44765.bed
drwxrwx---+ - sm supergroup 0 2018-11-09 07:19 /user/sm/chr17.7500000-7515000.sam.adam
drwxrwx---+ - sm supergroup 0 2018-10-30 21:27 /user/sm/chr17.adam
-rw-rw----+ 3 sm supergroup 91866 2018-10-17 10:28 /user/sm/chr17.vcf
-rw-rw----+ 3 sm supergroup 3344732 2018-11-09 07:28 /user/sm/ensGene.bb
-rw-rw----+ 3 sm supergroup 21252941 2018-11-09 07:21 /user/sm/hg19.17.2bit
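
Both traces end in a ClassNotFoundException for org.apache.parquet.hadoop.metadata.CompressionCodecName, so the class loader that ran VizReads could not see that class. One way to narrow this down is to check which jars, if any, actually contain it. The following is a minimal sketch and not part of Mango: the helper names (class_entry, jars_containing) are hypothetical, and the directory to scan is an assumption that depends on where the Mango and Spark jars live on the machine.

```python
import zipfile
from pathlib import Path

# Class named in the "Caused by" line of the stack traces above.
MISSING_CLASS = "org.apache.parquet.hadoop.metadata.CompressionCodecName"


def class_entry(class_name):
    """Convert a dotted class name to the entry path used inside a jar."""
    return class_name.replace(".", "/") + ".class"


def jars_containing(class_name, jar_dir):
    """Return the jars under jar_dir (searched recursively) that hold class_name."""
    entry = class_entry(class_name)
    hits = []
    for jar in sorted(Path(jar_dir).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar)
        except (zipfile.BadZipFile, OSError):
            # Skip unreadable or corrupt archives instead of aborting the scan.
            continue
    return hits
```

Calling jars_containing(MISSING_CLASS, some_dir) on the directory that holds the Mango assembly jar, and again on the Spark jars directory, shows whether either one ships the class. An empty result for both would be consistent with the error above.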
