Issue checklist
- This is not a bug or a feature/enhancement request.
- I searched through the GitHub issues and this issue has not been opened before.
Issue
How to reproduce:
Running the dummy transform on my local machine fails with a Spark driver binding error; see the traceback below.
```text
java.net.BindException: Can't assign requested address: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.
	at java.base/sun.nio.ch.Net.bind0(Native Method)
...
...
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/.../lib/python3.11/site-packages/foundry_dev_tools/utils/caches/spark_caches.py", line 272, in _read_parquet
    return get_spark_session().read.format("parquet").load(os.fspath(path.joinpath("*")))
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/.../lib/python3.11/site-packages/foundry_dev_tools/utils/spark.py", line 26, in get_spark_session
    .getOrCreate()
     ^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/.../lib/python3.11/site-packages/pyspark/sql/session.py", line 497, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/.../lib/python3.11/site-packages/pyspark/context.py", line 515, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/.../lib/python3.11/site-packages/pyspark/context.py", line 203, in __init__
    self._do_init(
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/.../lib/python3.11/site-packages/pyspark/context.py", line 296, in _do_init
    self._jsc = jsc or self._initialize_context(self._conf._jconf)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/.../lib/python3.11/site-packages/pyspark/context.py", line 421, in _initialize_context
    return self._jvm.JavaSparkContext(jconf)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/.../lib/python3.11/site-packages/py4j/java_gateway.py", line 1587, in __call__
    return_value = get_return_value(
                   ^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/.../lib/python3.11/site-packages/py4j/protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.net.BindException: Can't assign requested address: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.
	at java.base/sun.nio.ch.Net.bind0(Native Method)
```
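This `BindException` typically occurs when Spark derives the driver bind address from the machine's hostname, which on macOS often does not resolve, or resolves to an address no local interface owns. A quick stdlib check (a hypothetical diagnostic, not part of the reproduction) shows what Spark would pick up:

```python
import socket

# Spark's default driver bind address is derived from the hostname; on
# macOS this lookup frequently fails or returns a non-local address.
hostname = socket.gethostname()
try:
    addr = socket.gethostbyname(hostname)
    print(f"{hostname} resolves to {addr}")
except socket.gaierror as exc:
    print(f"{hostname} does not resolve: {exc}")

# Loopback is always bindable, which is why forcing 127.0.0.1 works.
print("loopback:", socket.gethostbyname("localhost"))
```

If the first lookup fails or prints an address that is not assigned to any local interface, that explains the repeated bind failures in the traceback.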
Pseudo-code:
```python
# import os
from pyspark.sql import DataFrame
from transforms.api import Input, Output, transform_df

from myproject.datasets.utils import code_runs_windows_or_macOS

# Fix Spark driver binding issue on Mac
# os.environ.setdefault("SPARK_LOCAL_IP", "127.0.0.1")
# os.environ.setdefault("SPARK_DRIVER_BIND_ADDRESS", "127.0.0.1")


@transform_df(
    Output("[OUTPUT_PATH]"),
    input_df=Input("[INPUT_PATH]"),
)
def dummy_transform(input_df: DataFrame) -> DataFrame:
    """Return the input dataset unchanged (dummy transform for reproduction)."""
    return input_df


if __name__ == "__main__":
    df = dummy_transform.compute()
    print("done")
```
How I solved it for now:
I would like to report this in case others encounter the same issue. Setting the environment variables (i.e., uncommenting the two `os.environ.setdefault` lines in the pseudo-code) solves the problem for me. There are probably more elegant solutions; I would be glad to hear opinions from experts.
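The workaround can also be wrapped in a small stdlib helper that runs before any SparkSession is created. This is a sketch, not part of foundry-dev-tools; `configure_spark_local_binding` is a hypothetical name:

```python
import os


def configure_spark_local_binding(address: str = "127.0.0.1") -> None:
    """Force the Spark driver to bind to a known-good local address.

    Must be called before the first SparkSession/SparkContext is created,
    since Spark reads these variables at JVM startup. Using setdefault
    leaves any values the user has already exported untouched.
    """
    os.environ.setdefault("SPARK_LOCAL_IP", address)
    os.environ.setdefault("SPARK_DRIVER_BIND_ADDRESS", address)


configure_spark_local_binding()
```

With plain PySpark one could instead set `spark.driver.bindAddress` on the SparkConf (the option the error message itself suggests), but since foundry-dev-tools builds the session internally (see `utils/spark.py` in the traceback), the environment-variable route seems the least invasive.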
System:
macOS Sequoia 15.6.1
Chip: Apple M1 Pro