We are getting the following error:
Job failed with error RuntimeError: UDF function/class data is too large!
for code like this:
data_sample = parsed_df
input_df = dc.read_pandas(data_sample).select("test")
This happens when the data is large enough to exceed the 10M limit.
We need to rewrite read_values so that it avoids gen and performs direct operations instead.
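
To illustrate why the size of the input matters, here is a minimal sketch that uses only the Python standard library, not the library from the report. It assumes that the in-memory data ends up serialized together with the UDF, which the error message suggests but which is not confirmed here. The helper payload_size and the file name rows.txt are hypothetical and exist only for this illustration.

```python
import functools
import pickle


def payload_size(callable_obj):
    """Return the number of bytes needed to pickle a callable."""
    return len(pickle.dumps(callable_obj))


# Stand-ins for a small and a large in-memory dataset.
rows_small = list(range(1_000))
rows_large = list(range(1_000_000))

# A callable that carries its data: the whole list is part of the payload.
embedded_small = functools.partial(iter, rows_small)
embedded_large = functools.partial(iter, rows_large)

# A callable that carries only a pointer to the data (a hypothetical path).
# Nothing is opened here; only the function reference and the string are pickled.
by_reference = functools.partial(open, "rows.txt")

size_small = payload_size(embedded_small)
size_large = payload_size(embedded_large)
size_by_reference = payload_size(by_reference)

print("embedded, 1K rows:", size_small, "bytes")
print("embedded, 1M rows:", size_large, "bytes")
print("by reference:", size_by_reference, "bytes")
```

The payload of the embedding callables grows with the number of rows, while the by-reference callable stays small no matter how much data it points to. If the assumption above holds, a rewrite that keeps the data out of the serialized UDF would stay under the limit regardless of input size.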