Pandas should support reading a directory of CSV files, as this is a common data engineering need. Currently, the filepath_or_buffer argument of read_csv in Pandas must point to a single CSV file. Other frameworks (e.g. Spark, Bodo) support reading a directory containing multiple partitioned CSV files, reading only the files inside the directory that have the .csv extension.
Assuming ~/path/to/csvs contains CSV files with the same schema, we wish to do
pd.read_csv('~/path/to/csvs')
rather than read the files individually.
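For reference, the behaviour requested above can be approximated today with a small wrapper. This is only a sketch of a workaround, not a proposed implementation: read_csv_dir is a hypothetical helper name, and it assumes that every .csv file directly inside the directory shares the same schema and that all files fit in memory together.

```python
from pathlib import Path

import pandas as pd


def read_csv_dir(directory, **read_csv_kwargs):
    """Hypothetical helper: read every *.csv file in `directory` into one DataFrame."""
    # expanduser() resolves a leading "~"; sorted() makes the row order deterministic.
    files = sorted(Path(directory).expanduser().glob("*.csv"))
    if not files:
        raise FileNotFoundError(f"no .csv files found in {directory}")
    # Read each file separately, then stack the rows; assumes a shared schema.
    frames = [pd.read_csv(f, **read_csv_kwargs) for f in files]
    return pd.concat(frames, ignore_index=True)
```

With this helper, read_csv_dir('~/path/to/csvs') returns a single DataFrame, and files without the .csv extension are skipped. Extra keyword arguments are forwarded unchanged to pd.read_csv for each file.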