I can’t start spark-cluster - Kerberos. #1010

Open
@Armadik

Good day!

I can’t start spark-cluster.

Description

I set up a notebook with a gateway URL: --gateway-url=http://enterprise-gateway:8888
The instructions do not make clear how to work with Kerberos. I created the hadoop.proxyuser settings.
When I launch the enterprise-gateway pod, I obtain a TGT.
I do not understand whether I also need to obtain a TGT for KERNEL_USERNAME.
I see a strange request made as the proxy user. Does the proxy user need to be granted access to the directory that stores all YARN logs?
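To make the two Kerberos prerequisites mentioned above concrete, here is a minimal sketch in Python. It assumes MIT Kerberos `klist` is on the PATH inside the enterprise-gateway pod and that the proxy-user settings follow Hadoop's usual `hadoop.proxyuser.<user>.hosts` / `hadoop.proxyuser.<user>.groups` naming. The function names and the `gateway_user` argument are hypothetical and only for illustration; this is not taken from Enterprise Gateway itself.

```python
import subprocess


def has_valid_tgt(run=subprocess.run):
    """Return True if `klist -s` finds a readable, unexpired credential cache.

    `klist -s` prints nothing and signals the result through its exit status.
    The `run` parameter is injectable only so the check can be exercised
    without a real Kerberos installation.
    """
    try:
        result = run(["klist", "-s"], check=False)
    except FileNotFoundError:
        # klist is not installed in this image, so no ticket can be verified.
        return False
    return result.returncode == 0


def proxyuser_keys(gateway_user):
    """Names of the core-site.xml properties that let `gateway_user` impersonate others."""
    prefix = f"hadoop.proxyuser.{gateway_user}"
    return [f"{prefix}.hosts", f"{prefix}.groups"]


if __name__ == "__main__":
    print("TGT present:", has_valid_tgt())
    print("Expected proxy-user properties:", proxyuser_keys("proxy-user"))
```

Running this as the account that owns the gateway process shows whether that account holds a ticket. It says nothing about whether KERNEL_USERNAME needs one of its own.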

Screenshots / Logs

`Starting IPython kernel for Spark in Yarn Cluster mode on behalf of user my-user

+ eval exec /opt/spark/bin/spark-submit ' -v ' '--master yarn --deploy-mode cluster --name ${KERNEL_ID:-ERROR__NO__KERNEL_ID} --conf spark.yarn.submit.waitAppCompletion=false --conf spark.yarn.appMasterEnv.PATH=/opt/Anaconda-2020.11-1.0/bin/python ${KERNEL_EXTRA_SPARK_OPTS}' '' /usr/local/share/jupyter/kernels/spark_python_yarn_cluster-/scripts/launch_ipykernel.py '' --RemoteProcessProxy.kernel-id 74d3c3ef-6bfb-478c-9398-ff9e471f0ea1 --RemoteProcessProxy.port-range 0..0 --RemoteProcessProxy.response-address 1.3.32.200:44837 --RemoteProcessProxy.spark-context-initialization-mode eager
    ++ exec /opt/spark/bin/spark-submit -v --master yarn --deploy-mode cluster --name 74d3c3ef-6bfb-478c-9398-ff9e471f0ea1 --conf spark.yarn.submit.waitAppCompletion=false --conf spark.yarn.appMasterEnv.PATH=/opt/Anaconda-2020.11-1.0/bin/python /usr/local/share/jupyter/kernels/spark_python_yarn_cluster/scripts/launch_ipykernel.py --RemoteProcessProxy.kernel-id 74d3c3ef-6bfb-478c-9398-ff9e471f0ea1 --RemoteProcessProxy.port-range 0..0 --RemoteProcessProxy.response-address 1.3.32.200:44837 --RemoteProcessProxy.spark-context-initialization-mode eager
    [D 2021-11-10 16:06:42.032 EnterpriseGatewayApp] ApplicationID not yet assigned for KernelID: '74d3c3ef-6bfb-478c-9398-ff9e471f0ea1' - retrying...
    [D 2021-11-10 16:06:43.035 EnterpriseGatewayApp] ApplicationID not yet assigned for KernelID: '74d3c3ef-6bfb-478c-9398-ff9e471f0ea1' - retrying...
    [D 2021-11-10 16:06:44.336 EnterpriseGatewayApp] ApplicationID not yet assigned for KernelID: '74d3c3ef-6bfb-478c-9398-ff9e471f0ea1' - retrying...
    [D 2021-11-10 16:06:45.234 EnterpriseGatewayApp] ApplicationID not yet assigned for KernelID: '74d3c3ef-6bfb-478c-9398-ff9e471f0ea1' - retrying...
    [D 2021-11-10 16:06:46.034 EnterpriseGatewayApp] ApplicationID not yet assigned for KernelID: '74d3c3ef-6bfb-478c-9398-ff9e471f0ea1' - retrying...
    [D 2021-11-10 16:06:46.935 EnterpriseGatewayApp] ApplicationID not yet assigned for KernelID: '74d3c3ef-6bfb-478c-9398-ff9e471f0ea1' - retrying...
    [D 2021-11-10 16:06:47.833 EnterpriseGatewayApp] ApplicationID not yet assigned for KernelID: '74d3c3ef-6bfb-478c-9398-ff9e471f0ea1' - retrying..
    ...
    [D 2021-11-10 16:09:41.232 EnterpriseGatewayApp] ApplicationID not yet assigned for KernelID: '74d3c3ef-6bfb-478c-9398-ff9e471f0ea1' - retrying...
    [D 2021-11-10 16:09:41.433 EnterpriseGatewayApp] ApplicationID not yet assigned for KernelID: '74d3c3ef-6bfb-478c-9398-ff9e471f0ea1' - retrying...
    [D 2021-11-10 16:09:41.433 EnterpriseGatewayApp] BaseProcessProxy.terminate(): None
    [D 2021-11-10 16:09:41.632 EnterpriseGatewayApp] ApplicationID not yet assigned for KernelID: '74d3c3ef-6bfb-478c-9398-ff9e471f0ea1' - retrying...
    [D 2021-11-10 16:09:41.632 EnterpriseGatewayApp] YarnClusterProcessProxy.kill, application ID: None, kernel ID: 74d3c3ef-6bfb-478c-9398-ff9e471f0ea1, state: None, result: None
    [D 2021-11-10 16:09:41.633 EnterpriseGatewayApp] response socket still open, close it
    [E 2021-11-10 16:09:41.633 EnterpriseGatewayApp] KernelID: '74d3c3ef-6bfb-478c-9398-ff9e471f0ea1' launch timeout due to: Application ID is None. Failed to submit a new application to YARN within 180.0 seconds. Check Enterprise Gateway log for more information.
    [E 211110 16:09:41 web:2239] 500 POST /api/kernels (1.5.41.210) 181250.55ms`
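One thing that can be checked directly from the log above is whether the submitted command carries spark-submit's `--proxy-user` flag, which is how impersonation is normally passed to Spark. The sketch below is only an illustration: the helper name `proxy_user_of` is hypothetical, and the command string is an abbreviated copy of the expanded `exec` line from the log.

```python
import shlex


def proxy_user_of(command_line):
    """Return the value that follows --proxy-user, or None if the flag is absent."""
    tokens = shlex.split(command_line)
    for i, token in enumerate(tokens):
        if token == "--proxy-user" and i + 1 < len(tokens):
            return tokens[i + 1]
    return None


# Abbreviated from the expanded spark-submit line in the log.
logged = (
    "/opt/spark/bin/spark-submit -v --master yarn --deploy-mode cluster "
    "--name 74d3c3ef-6bfb-478c-9398-ff9e471f0ea1 "
    "--conf spark.yarn.submit.waitAppCompletion=false"
)

print(proxy_user_of(logged))  # None
```

The logged command contains no `--proxy-user` argument, so the job appears to be submitted as the gateway's own account, not as my-user. Whether that is intended depends on how the kernelspec reads EG_IMPERSONATION_ENABLED, which this sketch does not cover.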

conda list | grep jupyter
jupyter-client              6.2.0    pypi_0            pypi
jupyter-enterprise-gateway  2.5.1    pypi_0            pypi
jupyter_core                4.8.1    py38h06a4308_0    defaults
jupyter_server              1.4.1    py38h06a4308_0    defaults
jupyter_telemetry           0.1.0    py_0              defaults
jupyterhub                  1.4.2    py38h06a4308_0    defaults
jupyterlab                  3.2.1    pyhd3eb1b0_1      defaults
jupyterlab_pygments         0.1.2    py_0              defaults
jupyterlab_server           2.8.2    pyhd3eb1b0_0      defaults

yarn logs -applicationId application_1635924651123_123
Unable to get ApplicationState. Attempting to fetch logs directly from the filesystem.
Guessed logs' owner is proxy-user and current user <proxy-user> does not have permission to access /tmp/logs/*/logs/application_1635924651123_123.
Error message found: Permission denied: user=<proxy-user>, access=EXECUTE, inode="/tmp/logs/test":test:hadoop:drwxrwx---
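The permission error can be read off the inode string: `/tmp/logs/test` is owned by `test`, belongs to group `hadoop`, and has mode `drwxrwx---`, so anyone who is neither the owner nor in that group gets no access at all. Below is a minimal sketch of that owner/group/other check. It ignores ACLs, the sticky bit, and the HDFS superuser, and the assumption that the proxy user is not a member of `hadoop` is mine, not something the log states.

```python
def can_access(mode, owner, group, user, user_groups, want):
    """Simplified POSIX-style check, e.g. mode='drwxrwx---', want='x' for EXECUTE."""
    bits = mode[1:]  # drop the leading file-type character ('d' for a directory)
    if user == owner:
        triplet = bits[0:3]   # owner permissions
    elif group in user_groups:
        triplet = bits[3:6]   # group permissions
    else:
        triplet = bits[6:9]   # permissions for everyone else
    return want in triplet


# Values from the error message; the group memberships are assumptions.
print(can_access("drwxrwx---", "test", "hadoop", "proxy-user", [], "x"))          # False
print(can_access("drwxrwx---", "test", "hadoop", "proxy-user", ["hadoop"], "x"))  # True
```

Under these assumptions, the failing `yarn logs` call shows that the account running it cannot traverse another user's log directory. That may be a separate problem from the kernel launch timeout.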

Environment

  • EG_IMPERSONATION_ENABLED = True
  • KERNEL_USERNAME = my-user
  • HIVE_CONF_DIR = /etc/spark/conf.cloudera.spark_on_yarn/yarn-conf
  • HADOOP_CONF_DIR = /etc/spark/conf.cloudera.spark_on_yarn/yarn-conf
  • SPARK_CONF_DIR = /etc/spark/conf.cloudera.spark_on_yarn
  • Others [e.g. Jupyter Server 5.7, JupyterHub 1.0, etc]
