Skip to content

[Bug] Error in Executor setup After Permanent UDF Deletion  #6812

Open
@hzxiongyinke

Description

@hzxiongyinke

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the bug

Hello everyone,

I've encountered an issue with Kyuubi that I'm hoping the community can help with.

I created a permanent UDF in a Kyuubi instance, and later, due to requirement changes, I deleted this UDF through another driver. However, any SQL executed by the previously initiated driver now results in an error, indicating that the UDF cannot be found. Currently, the only solution I have is to restart the Kyuubi engine.

Affects Version(s)

1.8.0

Kyuubi Server Log Output

Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 279.0 failed 4 times, most recent failure: Lost task 0.3 in stage 279.0 (TID 251) (core-xxxx.cn-shanghai.emr.aliyuncs.com executor 29): java.io.FileNotFoundException:  [ErrorMessage]: File not found: .GalaxyResource/bigdata_emr_sh/xxx in bucket xxx
        at com.aliyun.jindodata.api.spec.JdoNativeResult.get(JdoNativeResult.java:54)
        at com.aliyun.jindodata.api.spec.protos.coder.JdolistDirectoryReplyDecoder.decode(JdolistDirectoryReplyDecoder.java:23)
        at com.aliyun.jindodata.api.JindoCommonApis.listDirectory(JindoCommonApis.java:112)
        at com.aliyun.jindodata.call.JindoListCall.execute(JindoListCall.java:65)
        at com.aliyun.jindodata.common.JindoHadoopSystem.listStatus(JindoHadoopSystem.java:665)
        at com.aliyun.jindodata.common.JindoHadoopSystem.listStatus(JindoHadoopSystem.java:60)
        at org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:851)
        at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:820)
        at org.apache.spark.util.Utils$.fetchFile(Utils.scala:544)
        at org.apache.spark.executor.Executor.$anonfun$updateDependencies$13(Executor.scala:1010)
        at org.apache.spark.executor.Executor.$anonfun$updateDependencies$13$adapted(Executor.scala:1002)
        at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:985)
        at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
        at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
        at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)
        at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:984)
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:1002)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:506)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)

Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2673)
        at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2609)
        at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2608)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2608)
        at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1182)
        at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1182)
        at scala.Option.foreach(Option.scala:407)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1182)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2861)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2803)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2792)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:952)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2241)
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:269)

Kyuubi Engine Log Output

spark executor log error:
24/11/18 16:06:18 INFO [Executor task launch worker for task 0.0 in stage 283.0 (TID 264)] Executor: Running task 0.0 in stage 283.0 (TID 264)
24/11/18 16:06:18 INFO [Executor task launch worker for task 0.0 in stage 283.0 (TID 264)] Executor: Fetching oss://xxx/xxxx/xxx with timestamp 1731900662592
24/11/18 16:06:18 INFO [Executor task launch worker for task 0.0 in stage 283.0 (TID 264)] HadoopLoginUserInfo: TOKEN: YARN_AM_RM_TOKEN
24/11/18 16:06:18 INFO [Executor task launch worker for task 0.0 in stage 283.0 (TID 264)] HadoopLoginUserInfo: User: xxxx, authMethod: SIMPLE, ugi: xxxx (auth:SIMPLE)
24/11/18 16:06:18 INFO [Executor task launch worker for task 0.0 in stage 283.0 (TID 264)] JindoHadoopSystem: Initialized native file system: 
24/11/18 16:06:18 INFO [Executor task launch worker for task 0.0 in stage 283.0 (TID 264)] FsStats: cmd=getFileStatus, src=oss://xxxx/.xxxx/xxx/xxxx, dst=null, size=0, parameter=null, time-in-ms=77, version=6.2.0
24/11/18 16:06:18 INFO [Executor task launch worker for task 0.0 in stage 283.0 (TID 264)] FsStats: cmd=list, src=oss://xxxx/.xxxx/xxxx/xxxx, dst=null, size=0, parameter=null, time-in-ms=26, version=6.2.0
24/11/18 16:06:18 ERROR [Executor task launch worker for task 0.0 in stage 283.0 (TID 264)] Executor: Exception in task 0.0 in stage 283.0 (TID 264)
java.io.FileNotFoundException:  [ErrorMessage]: File not found: .xxxx/xxxx/xxxx in bucket xxxx
	at com.aliyun.jindodata.api.spec.JdoNativeResult.get(JdoNativeResult.java:54) ~[jindo-core-6.2.0.jar:?]
	at com.aliyun.jindodata.api.spec.protos.coder.JdolistDirectoryReplyDecoder.decode(JdolistDirectoryReplyDecoder.java:23) ~[jindo-core-6.2.0.jar:?]
	at com.aliyun.jindodata.api.JindoCommonApis.listDirectory(JindoCommonApis.java:112) ~[jindo-core-6.2.0.jar:?]
	at com.aliyun.jindodata.call.JindoListCall.execute(JindoListCall.java:65) ~[jindo-sdk-6.2.0.jar:?]
	at com.aliyun.jindodata.common.JindoHadoopSystem.listStatus(JindoHadoopSystem.java:665) ~[jindo-sdk-6.2.0.jar:?]
	at com.aliyun.jindodata.common.JindoHadoopSystem.listStatus(JindoHadoopSystem.java:60) ~[jindo-sdk-6.2.0.jar:?]
	at org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:851) ~[spark-core_2.12-3.3.1-dw1.2.10.jar:3.3.1-dw1.2.10]
	at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:820) ~[spark-core_2.12-3.3.1-dw1.2.10.jar:3.3.1-dw1.2.10]
	at org.apache.spark.util.Utils$.fetchFile(Utils.scala:544) ~[spark-core_2.12-3.3.1-dw1.2.10.jar:3.3.1-dw1.2.10]
	at org.apache.spark.executor.Executor.$anonfun$updateDependencies$13(Executor.scala:1010) ~[spark-core_2.12-3.3.1-dw1.2.10.jar:3.3.1-dw1.2.10]
	at org.apache.spark.executor.Executor.$anonfun$updateDependencies$13$adapted(Executor.scala:1002) ~[spark-core_2.12-3.3.1-dw1.2.10.jar:3.3.1-dw1.2.10]
	at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:985) ~[scala-library-2.12.15.jar:?]
	at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149) ~[scala-library-2.12.15.jar:?]
	at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237) ~[scala-library-2.12.15.jar:?]
	at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230) ~[scala-library-2.12.15.jar:?]
	at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44) ~[scala-library-2.12.15.jar:?]
	at scala.collection.mutable.HashMap.foreach(HashMap.scala:149) ~[scala-library-2.12.15.jar:?]
	at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:984) ~[scala-library-2.12.15.jar:?]
	at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:1002) ~[spark-core_2.12-3.3.1-dw1.2.10.jar:3.3.1-dw1.2.10]
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:506) ~[spark-core_2.12-3.3.1-dw1.2.10.jar:3.3.1-dw1.2.10]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_392]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_392]
	at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_392]
24/11/18 16:06:18 INFO [dispatcher-Executor] YarnCoarseGrainedExecutorBackend: Got assigned task 265

Kyuubi Server Configurations

No response

Kyuubi Engine Configurations

No response

Additional context

No response

Are you willing to submit PR?

  • Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
  • No. I cannot submit a PR at this time.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions