Commit e073f7b

1011 Disable full stack trace when using spark connect (#1024)
* Prune sparkConnect SQL stack trace on AnalysisException
* Add AnalysisException on MissingPackageError check
* Lint
* Revert changes
* Fix typo
* Add AnalysisException to is_non_sqlalchemy_error
* Lint
* Fix typo
* Update is_non_sqlalchemy_error to support pyspark's AnalysisException
* Add Changelog
* Fix typo
* Update changelog
1 parent 9cd72c9 commit e073f7b

File tree: 3 files changed, +17 −6 lines


CHANGELOG.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -21,6 +21,8 @@

 * [Feature] `ploomber-extension` is no longer a dependency

+* [Feature] Disable full stack trace when using spark connect ([#1011](https://github.com/ploomber/jupysql/issues/1011)) (by [@b1ackout](https://github.com/b1ackout))
+
 ## 0.10.12 (2024-07-12)

 * [Feature] Remove sqlalchemy upper bound ([#1020](https://github.com/ploomber/jupysql/pull/1020))
```

src/sql/run/sparkdataframe.py

Lines changed: 2 additions & 2 deletions
```diff
@@ -9,9 +9,9 @@


 def handle_spark_dataframe(dataframe, should_cache=False):
-    """Execute a ResultSet sqlaproxy using pysark module."""
+    """Execute a ResultSet sqlaproxy using pyspark module."""
     if not DataFrame and not CDataFrame:
-        raise exceptions.MissingPackageError("pysark not installed")
+        raise exceptions.MissingPackageError("pyspark not installed")

     return SparkResultProxy(dataframe, dataframe.columns, should_cache)

```
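The two-line typo fix above lives inside a guard that degrades gracefully when pyspark is absent. A minimal, self-contained sketch of that guard, assuming names from the diff; `fake_spark_pkg` is a deliberately nonexistent module so the fallback branch runs here, and `MissingPackageError` stands in for JupySQL's `sql.exceptions.MissingPackageError`:

```python
# Guarded import: when the optional package is missing, bind the names to
# None instead of crashing at import time. The real code imports the classic
# and Spark Connect DataFrame classes from pyspark.
try:
    from fake_spark_pkg import DataFrame, CDataFrame  # hypothetical module
except ModuleNotFoundError:
    DataFrame = None
    CDataFrame = None


class MissingPackageError(Exception):
    """Stand-in for sql.exceptions.MissingPackageError."""


def handle_spark_dataframe(dataframe, should_cache=False):
    """Execute a ResultSet sqlaproxy using pyspark module."""
    # Fail fast with a clear message instead of a NameError deep in the call.
    if not DataFrame and not CDataFrame:
        raise MissingPackageError("pyspark not installed")
    return dataframe  # the real function wraps this in SparkResultProxy


try:
    handle_spark_dataframe(object())
except MissingPackageError as e:
    print(e)  # pyspark not installed
```

The check uses `not DataFrame and not CDataFrame` so the function keeps working as long as either the classic or the Spark Connect class imported successfully.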

src/sql/util.py

Lines changed: 13 additions & 4 deletions
```diff
@@ -7,6 +7,12 @@
 from sqlglot.errors import ParseError
 from sqlalchemy.exc import SQLAlchemyError
 from ploomber_core.dependencies import requires
+
+try:
+    from pyspark.sql.utils import AnalysisException
+except ModuleNotFoundError:
+    AnalysisException = None
+
 import ast
 from os.path import isfile
 import re
@@ -556,11 +562,14 @@ def is_non_sqlalchemy_error(error):
         "pyodbc.ProgrammingError",
         # Clickhouse errors
         "DB::Exception:",
-        # Pyspark
-        "UNRESOLVED_ROUTINE",
-        "PARSE_SYNTAX_ERROR",
     ]
-    return any(msg in str(error) for msg in specific_db_errors)
+    is_pyspark_analysis_exception = (
+        isinstance(error, AnalysisException) if AnalysisException else False
+    )
+    return (
+        any(msg in str(error) for msg in specific_db_errors)
+        or is_pyspark_analysis_exception
+    )


 def if_substring_exists(string, substrings):
```
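The core of this change is the optional-dependency pattern: bind `AnalysisException` to `None` when pyspark is not installed, then guard the `isinstance` check so the function never touches the missing class. A runnable sketch, with the error-substring list abbreviated from the diff:

```python
# Optional import: AnalysisException is None when pyspark is unavailable.
try:
    from pyspark.sql.utils import AnalysisException
except ModuleNotFoundError:
    AnalysisException = None


def is_non_sqlalchemy_error(error):
    # Substrings that identify driver-specific (non-SQLAlchemy) errors;
    # abbreviated here from the full list in src/sql/util.py.
    specific_db_errors = [
        "pyodbc.ProgrammingError",
        "DB::Exception:",
    ]
    # Only attempt isinstance when the class actually imported.
    is_pyspark_analysis_exception = (
        isinstance(error, AnalysisException) if AnalysisException else False
    )
    return (
        any(msg in str(error) for msg in specific_db_errors)
        or is_pyspark_analysis_exception
    )


print(is_non_sqlalchemy_error(ValueError("DB::Exception: bad query")))  # True
print(is_non_sqlalchemy_error(ValueError("unrelated failure")))         # False
```

Matching on the exception type rather than on message substrings such as `UNRESOLVED_ROUTINE` is what lets the commit drop those pyspark-specific strings from the list: any `AnalysisException` is now caught, not just the two previously enumerated error codes.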
