The library looks interesting. I tried a simple example in a sample app, but I got the following error:
[error] (run-main-0) org.apache.spark.SparkException: Job aborted due to stage failure: Task 5 in stage 3.0 failed 1 times, most recent failure: Lost task 5.0 in stage 3.0 (TID 29, localhost): java.lang.NullPointerException
[error] at com.tribbloids.spookystuff.utils.Utils$.uriSlash(Utils.scala:55)
[error] at com.tribbloids.spookystuff.utils.Utils$$anonfun$uriConcat$1.apply(Utils.scala:49)
[error] at com.tribbloids.spookystuff.utils.Utils$$anonfun$uriConcat$1.apply(Utils.scala:48)
[error] at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
[error] at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
[error] at com.tribbloids.spookystuff.utils.Utils$.uriConcat(Utils.scala:48)
[error] at com.tribbloids.spookystuff.pages.PageUtils$.autoRestore(PageUtils.scala:183)
[error] at com.tribbloids.spookystuff.actions.TraceView$$anonfun$4.apply(TraceView.scala:95)
[error] at com.tribbloids.spookystuff.actions.TraceView$$anonfun$4.apply(TraceView.scala:95)
[error] at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
[error] at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
[error] at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
[error] at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
[error] at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
[error] at scala.collection.AbstractTraversable.map(Traversable.scala:105)
[error] at com.tribbloids.spookystuff.actions.TraceView.fetchOnce(TraceView.scala:95)
[error] at com.tribbloids.spookystuff.actions.TraceView$$anonfun$2.apply(TraceView.scala:83)
[error] at com.tribbloids.spookystuff.actions.TraceView$$anonfun$2.apply(TraceView.scala:83)
[error] at scala.util.Try$.apply(Try.scala:161)
[error] at com.tribbloids.spookystuff.utils.Utils$.retry(Utils.scala:22)
[error] at com.tribbloids.spookystuff.actions.TraceView.fetch(TraceView.scala:82)
[error] at com.tribbloids.spookystuff.sparkbinding.PageRowRDD$$anonfun$26.apply(PageRowRDD.scala:491)
[error] at com.tribbloids.spookystuff.sparkbinding.PageRowRDD$$anonfun$26.apply(PageRowRDD.scala:490)
[error] at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
[error] at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
[error] at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
[error] at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
[error] at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
[error] at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
[error] at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.insertAll(BypassMergeSortShuffleWriter.java:99)
[error] at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
[error] at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
[error] at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
[error] at org.apache.spark.scheduler.Task.run(Task.scala:88)
[error] at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
[error] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[error] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[error] at java.lang.Thread.run(Thread.java:745)
The application is pretty simple:
import org.apache.spark.{SparkConf, SparkContext}
import com.tribbloids.spookystuff.actions._ // for Wget

object SimpleApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("Test")
    val sc = new SparkContext(conf)
    val spooky = new com.tribbloids.spookystuff.SpookyContext(sc)
    import spooky.dsl._
    val df = spooky
      .wget("https://news.google.com/?output=rss&q=barack%20obama")
      .join(S"item title".texts)(
        Wget(x"http://api.mymemory.translated.net/get?q=${'A}&langpair=en|fr")
      )('A ~ 'title, S"translatedText".text ~ 'translated)
      .toDF()
    val csv = df.toCSV()
    csv.foreach(println)
  }
}
Do you have any ideas?