chore(geopandas): Replace assert_series_equal calls with self.check_pd_series_equal() #2379

@petern48

Description

While reviewing #2378, I noticed that a number of assert_series_equal() calls from earlier development are still scattered through the test files. We should use the self.check_pd_series_equal() wrapper instead, because it also validates that the result of our Geopandas functions is a ps.Series (pyspark series) and not a pd.Series (pandas series).

Usually the to_pandas() call is inlined in the assertion, but sometimes it sits on a separate line, as in the example below. We should update these cases, too.

result = geoseries.length.to_pandas()
expected = pd.Series([0.000000, 1.414214, 3.414214, 4.828427])
assert_series_equal(result, expected)
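
For illustration, here is a minimal sketch of the intended rewrite. It is not the actual wrapper from the test base class: FakeSparkSeries is a hypothetical stand-in so the snippet runs without Spark, and the body of check_pd_series_equal (including the distributed_type parameter) is an assumption based on the behavior described above.

```python
import pandas as pd
from pandas.testing import assert_series_equal


class FakeSparkSeries:
    """Hypothetical stand-in for a distributed series, so this runs without Spark."""

    def __init__(self, data):
        self._data = list(data)

    def to_pandas(self):
        return pd.Series(self._data)


def check_pd_series_equal(actual, expected, distributed_type=FakeSparkSeries):
    # Assumed behavior: reject results that were already converted to pandas,
    # then compare the converted values against the expected pd.Series.
    assert isinstance(actual, distributed_type), (
        f"expected {distributed_type.__name__}, got {type(actual).__name__}"
    )
    assert_series_equal(actual.to_pandas(), expected)


expected = pd.Series([0.000000, 1.414214, 3.414214, 4.828427])
# Stands in for `geoseries.length` WITHOUT the trailing .to_pandas() call.
result = FakeSparkSeries([0.000000, 1.414214, 3.414214, 4.828427])

# Pass the unconverted result so the type check actually runs.
check_pd_series_equal(result, expected)
```

The point of the change is that .to_pandas() moves out of the test and into the wrapper. A test that converts first can never notice that the function returned a plain pd.Series.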

Definition of Done:
Running the following grep command should ideally return only the call inside the check_pd_series_equal function (and the import). There are definitely many more matches than that at the moment. If a call can't be replaced, that's a bad sign: we may be returning a non-scalable pd.Series instead of the Spark version.

grep -r "assert_series_equal" sedona/python/tests/geopandas
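
If it helps to track progress, the same check can be sketched in Python. This is only a rough equivalent of the grep above, restricted to .py files. The helper names find_calls and drop_imports are made up for this example and do not exist in the repo.

```python
from pathlib import Path


def find_calls(root, needle="assert_series_equal"):
    """Return (path, line number, stripped line) for every line containing `needle`."""
    root = Path(root)
    if not root.is_dir():
        return []
    hits = []
    for path in sorted(root.rglob("*.py")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if needle in line:
                hits.append((str(path), lineno, line.strip()))
    return hits


def drop_imports(hits):
    """Filter out import statements, leaving only call sites to review."""
    return [h for h in hits if not h[2].startswith(("from ", "import "))]


if __name__ == "__main__":
    # Same directory as the grep command above.
    for path, lineno, text in drop_imports(find_calls("sedona/python/tests/geopandas")):
        print(f"{path}:{lineno}: {text}")
```

Once the cleanup is finished, the only remaining call site printed should be the one inside check_pd_series_equal itself.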
