Issue Description:
Hello.
I have found a performance regression in the df.groupby function of pandas 1.1.5. setup_scripts/conda_env_linux.sh and other files show that this repository depends on pandas 1.1.5. I am not sure whether this pandas performance problem affects this repository. There are discussions of it on GitHub, including #38495 and #38892.
I also found that legacy/cluster/cluster.py uses the affected API. Other files may use it as well.
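To check whether a given environment is affected, a rough timing sketch like the one below could be run once under pandas 1.1.5 and once under a newer release, then compared. The data shape, the column names `key` and `value`, and the helper names `make_frame` and `time_groupby` are my own assumptions for illustration. They are not taken from this repository or from the linked pandas issues, and the `sum` aggregation may not be the exact code path those issues describe.

```python
import time

import numpy as np
import pandas as pd


def make_frame(n_rows=200_000, n_groups=1_000, seed=0):
    """Build a synthetic frame (hypothetical shape, not this repository's data)."""
    rng = np.random.RandomState(seed)
    return pd.DataFrame({
        "key": rng.randint(0, n_groups, size=n_rows),   # integer group labels
        "value": rng.random_sample(n_rows),              # floats in [0, 1)
    })


def time_groupby(df, repeats=3):
    """Return the best wall-clock time in seconds for one groupby aggregation."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        df.groupby("key")["value"].sum()
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    frame = make_frame()
    elapsed = time_groupby(frame)
    print(f"pandas {pd.__version__}: best groupby time = {elapsed:.4f} s")
```

Replacing the `sum` call with the groupby call actually used in legacy/cluster/cluster.py would give a more representative number.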
Suggestion:
I would recommend upgrading to a pandas version newer than 1.1.5, or exploring other ways to avoid the slowdown.
Any other workarounds or solutions would be greatly appreciated.
Thank you!