Getting stuck when executing Papermill with sparkmagic kernel #717

Open
@bhadrip

🐛 Bug

I am running the following command with the sparkmagic kernel (https://github.com/jupyter-incubator/sparkmagic):

/opt/conda/bin/papermill ./input.ipynb ./output.ipynb --log-output --log-level DEBUG --progress-bar --autosave-cell-every 5

This kernel connects to an EMR cluster and executes SQL queries. Normally things work, but execution gets stuck whenever we use the %%sql magic. The issue occurs only when running with papermill; it does not happen when running interactively.

%%sql
show databases

The kernel returns the output, but papermill does not end the current cell. The autosave option captures the output, but the cell metadata shows the cell as still running:

duration: null
end_time: null
exception: false
start_time: "2023-03-16T23:41:30.662269"
status: "running"
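
For anyone trying to reproduce this, here is a minimal sketch for listing the cells that the autosaved output notebook still marks as running. It assumes the fields shown above are stored under each cell's `metadata.papermill` key in the `.ipynb` JSON. The helper names `find_running_cells` and `load_notebook` are hypothetical and not part of papermill.

```python
import json


def load_notebook(path):
    """Parse an .ipynb file into a plain dict (an .ipynb file is JSON)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def find_running_cells(notebook):
    """Return (cell_index, start_time) for every cell still marked as running.

    Assumption: papermill's per-cell bookkeeping lives in
    cell["metadata"]["papermill"] with a "status" field.
    """
    running = []
    for index, cell in enumerate(notebook.get("cells", [])):
        pm = cell.get("metadata", {}).get("papermill", {})
        if pm.get("status") == "running":
            running.append((index, pm.get("start_time")))
    return running


# Example usage against the autosaved output from the command above:
#   print(find_running_cells(load_notebook("./output.ipynb")))
```

Running this against `./output.ipynb` while papermill is hung should point at the `%%sql` cell if the metadata layout matches the assumption above.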

I am running out of ideas for debugging this issue. Can someone help me understand why papermill does not end the cell execution?

Debug messages right before it gets stuck:

msg_type: display_data
content: {'data': {'text/plain': '<IPython.core.display.HTML object>', 'text/html': '<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border="1" class="dataframe hideme">\n  <thead>\n    <tr style="text-align: right;">\n      <th></th>\n      <th>namespace</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>default</td>\n    </tr>\n  </tbody>\n</table>\n</div>'}, 'metadata': {}, 'transient': {}}
msg_type: comm_msg
content: {'data': {'method': 'update', 'state': {'msg_id': ''}, 'buffer_paths': []}, 'comm_id': 'df913063ede044d29a76d4febf6d372e'}
msg_type: status
content: {'execution_state': 'idle'}
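
One possible way to narrow this down, stated as an assumption rather than a confirmed diagnosis: as far as I understand, the executor behind papermill (nbclient) treats a cell as finished only once it has seen both an `execute_reply` on the shell channel and a `status: idle` message on IOPub whose `parent_header.msg_id` matches the original `execute_request`. The log above shows an idle status, but not which request it belongs to or whether a reply arrived. The sketch below applies that rule to a list of captured messages. `completion_signals` is a hypothetical helper, not part of papermill or nbclient, and it assumes each message is a dict with `msg_type`, `parent_header`, and `content` keys.

```python
def completion_signals(messages, request_id):
    """Report which completion signals were seen for one execute_request.

    messages:   captured kernel messages as dicts (assumed shape, see above)
    request_id: msg_id of the execute_request sent for the cell
    """
    seen = {"execute_reply": False, "idle": False}
    for msg in messages:
        parent_id = msg.get("parent_header", {}).get("msg_id")
        if parent_id != request_id:
            # Messages belonging to another request do not count.
            continue
        msg_type = msg.get("msg_type")
        if msg_type == "execute_reply":
            seen["execute_reply"] = True
        elif msg_type == "status":
            if msg.get("content", {}).get("execution_state") == "idle":
                seen["idle"] = True
    return seen
```

If either flag stays `False` for the `%%sql` cell, that would suggest which signal the client is still waiting for.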
