
Spark - "spark.sql.shuffle.partitions" set to "auto" is unsupported (crashes) #2898

@mbell697

Description

What happens?

If you do not provide num_partitions_on_repartition to SparkAPI, the code that attempts to determine this setting automatically consults self.spark.conf.get("spark.default.parallelism") and self.spark.conf.get("spark.sql.shuffle.partitions"); see https://github.com/moj-analytical-services/splink/blob/master/splink/internals/spark/database_api.py#L212

However, this doesn't handle the case where either of these settings is present but not numeric: a string is left in parallelism_value, which later causes an exception when the division by 2 happens below on line 217.
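The failure mode can be simulated outside Spark. Assuming parallelism_value ends up holding the raw conf string (the variable name here is illustrative), the division fails immediately:

```python
# Simulate the crash: a non-numeric conf value such as "auto" survives
# into parallelism_value, and dividing a str by an int raises TypeError.
parallelism_value = "auto"  # e.g. spark.sql.shuffle.partitions on Databricks

try:
    num_partitions = parallelism_value / 2  # mirrors the division by 2
except TypeError as exc:
    print(f"TypeError: {exc}")
```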

In particular, spark.sql.shuffle.partitions is often set to "auto" on Databricks clusters.
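One possible fix is to treat the conf values defensively: fall back to a sane default whenever a setting is missing or non-numeric. A minimal sketch, assuming a free-standing helper (the function name and default value are hypothetical, not Splink's actual code):

```python
def parse_parallelism(values, default=200):
    """Return the first value that parses as a positive int, else default.

    Tolerates settings like "auto" (Databricks) or missing confs (None).
    """
    for value in values:
        try:
            parsed = int(value)
        except (TypeError, ValueError):
            continue  # e.g. "auto" or None
        if parsed > 0:
            return parsed
    return default

# With shuffle partitions set to "auto", the default is used instead of crashing:
print(parse_parallelism(["auto", None]))  # → 200
```

Splink could apply something like this to the two conf lookups before dividing, so "auto" degrades gracefully rather than raising.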

To Reproduce

Instantiate SparkAPI without num_partitions_on_repartition while spark.sql.shuffle.partitions is set to "auto".

OS:

Databricks 16.4

Splink version:

4.0.12

Have you tried this on the latest master branch?

  • I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • I agree

Metadata


    Labels

    bug (Something isn't working)
