Consider the following code run on a cluster that has databricks-mosaic v0.4.2 installed on it:
import geopandas as gpd
shapefile_dir_path = "/dbfs/FileStore/tl_2024_us_state/"
states_gdf = gpd.read_file(shapefile_dir_path)
It results in the following error:
AttributeError: module 'fiona' has no attribute 'path'
Note that I did not install fiona and geopandas separately, so the code above is using the versions that were installed as mosaic's dependencies.
This is a known error. Elsewhere I have found the following explanation:
It seems fiona recently upgraded to 1.10.0 and that may have broken some older versions of geopandas, which only depend on fiona being higher than some version, not lower than. Upon closer look, geopandas up to version 0.14.3 still calls fiona.path, but in version 0.14.4 it no longer does. So upgrading geopandas to 0.14.4 should fix it. Alternatively, forcing fiona to stay on version 1.9.6 should also work.
NOTE: upgrading geopandas to >=1.0 seems to remove fiona as a dependency altogether, so it will also solve this issue.
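To check whether a given environment falls into the failing combination described in that explanation, something like the following sketch could be run in a notebook cell. The thresholds (geopandas below 0.14.4, fiona at 1.10 or later) are taken from the quoted explanation, not from the fiona or geopandas changelogs, and the helper names are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(v):
    """Turn '0.14.3' into (0, 14, 3); stops at the first non-numeric part."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def likely_affected(geopandas_version, fiona_version):
    """True for the pairing the quoted explanation blames (an assumption):
    geopandas < 0.14.4 (still calls fiona.path) together with fiona >= 1.10."""
    return (
        parse_release(geopandas_version) < (0, 14, 4)
        and parse_release(fiona_version) >= (1, 10)
    )


def installed(name):
    """Installed version of a distribution, or None if it is not installed."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    gpd_v, fiona_v = installed("geopandas"), installed("fiona")
    print("geopandas:", gpd_v, "fiona:", fiona_v)
    if gpd_v and fiona_v:
        print("likely affected:", likely_affected(gpd_v, fiona_v))
```

Comparing parsed tuples rather than raw strings matters here, because "1.10.0" sorts before "1.9.6" as text.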
The first suggestion (upgrading geopandas to 0.14.4) conflicts with the note made directly in the Mosaic README: "geopandas 0.14.4 not supported".
However, installing mosaic in my notebook like this:
%pip install databricks-mosaic
%pip uninstall -y fiona
%pip install fiona==1.9.6
dbutils.library.restartPython()
...restores geopandas' ability to read a shapefile.
For reference, mosaic's 0.4.2 dependencies on these two packages appear to be defined like this:
geopandas<0.14.4,>=0.14
fiona>=1.8.21
So, given the suggestion above, it appears that a possible fix would be to cap the fiona dependency at 1.9.6 in mosaic's requirements.
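As a rough illustration of what such a cap would admit, here is a small sketch. The range 1.8.21 to 1.9.6 inclusive is only my reading of the suggestion above, not an actual mosaic requirement, and the function names are hypothetical. It only handles plain numeric versions such as "1.9.6".

```python
# Hypothetical bounds: the existing lower bound from mosaic 0.4.2 plus the
# proposed upper cap. Neither tuple comes from a published mosaic release.
LOWER = (1, 8, 21)
UPPER = (1, 9, 6)


def release(v):
    """Parse a plain numeric version like '1.9.6' into (1, 9, 6)."""
    return tuple(int(part) for part in v.split("."))


def allowed_by_proposed_cap(fiona_version):
    """True if the version lies within LOWER..UPPER, both inclusive."""
    return LOWER <= release(fiona_version) <= UPPER


if __name__ == "__main__":
    for candidate in ("1.8.20", "1.8.21", "1.9.6", "1.10.0"):
        print(candidate, allowed_by_proposed_cap(candidate))
```

With these bounds, 1.9.6 is accepted and 1.10.0 is rejected, which matches the workaround of pinning fiona==1.9.6 in the notebook.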