Description
Currently, running ContourUSV with the USCMed dataset fails in the "generate annotations" step. This step reformats the ground truth CSV files found in the dataset so that they have the column headers expected by the evaluation step. The error traceback is shown below:
Traceback (most recent call last):
File "/Users/evana_anis/Desktop/VSCode/github_tests/contourusv/contourusv/main.py", line 375, in <module>
generate_annotations(experiment, trial, root_path, file_ext)
File "/Users/evana_anis/Desktop/VSCode/github_tests/contourusv/contourusv/generate_annotation.py", line 189, in generate_annotations
save_annotations(matched_csv, audio_file, output_path, file_ext)
File "/Users/evana_anis/Desktop/VSCode/github_tests/contourusv/contourusv/generate_annotation.py", line 109, in save_annotations
usv_data = loaders[file_ext](f)
^^^^^^^^^^^^^^^^^^^^
File "/Users/evana_anis/Desktop/VSCode/github_tests/contourusv/contourusv/generate_annotation.py", line 80, in load_csv_usv
data = pd.read_csv(file_name, header=None, names=['begin_time', 'end_time'], usecols=[0, 1])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/evana_anis/anaconda3/envs/research/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
return _read(filepath_or_buffer, kwds)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/evana_anis/anaconda3/envs/research/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 626, in _read
return parser.read(nrows)
^^^^^^^^^^^^^^^^^^
File "/Users/evana_anis/anaconda3/envs/research/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1923, in read
) = self._engine.read( # type: ignore[attr-defined]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/evana_anis/anaconda3/envs/research/lib/python3.11/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 234, in read
chunks = self._reader.read_low_memory(nrows)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "parsers.pyx", line 838, in pandas._libs.parsers.TextReader.read_low_memory
File "parsers.pyx", line 921, in pandas._libs.parsers.TextReader._read_rows
File "parsers.pyx", line 983, in pandas._libs.parsers.TextReader._convert_column_data
pandas.errors.ParserError: Too many columns specified: expected 2 and found 1
From the traceback, we can see that the line that causes this issue is line 80 in generate_annotation.py (in load_csv_usv):
data = pd.read_csv(file_name, header=None, names=['begin_time', 'end_time'], usecols=[0, 1])
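A likely cause, inferred from the `sep='\t'` in the proposed fix rather than confirmed against the dataset, is that the USCMed ground truth files are tab-separated. With the default comma separator, pandas then sees a single column per row while two are requested. The snippet below is a minimal reproduction under that assumption; the column headers and values in `sample` are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical tab-separated annotation file (headers and values are invented).
sample = (
    "Selection\tBegin Time (s)\tEnd Time (s)\n"
    "1\t0.10\t0.25\n"
    "2\t0.40\t0.55\n"
)

try:
    # Same arguments as the failing call: the default sep=',' finds no commas,
    # so every row is parsed as one field while two columns are requested.
    pd.read_csv(
        io.StringIO(sample),
        header=None,
        names=["begin_time", "end_time"],
        usecols=[0, 1],
    )
    error_message = None
except ValueError as exc:  # pandas.errors.ParserError is a ValueError subclass
    error_message = str(exc)

print(error_message)
```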
The issue can be resolved by replacing that line with:
data = pd.read_csv(file_name, header=None, skiprows=1, names=['begin_time', 'end_time'], sep='\t', usecols=[1, 2])
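This solution is specific to the USCMed dataset's ground truth annotations. It relies on their current column structure (a first row to skip, tab-separated fields, and the begin and end times in the second and third columns), so it may break other datasets that the original line handled correctly.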
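One way to avoid hard-coding the USCMed layout is to pass the layout in as parameters. The sketch below is only an illustration: `load_usv_times` and its parameters `sep`, `skip_header`, and `time_cols` are hypothetical names, not part of ContourUSV, and the sample headers and values are invented.

```python
import io

import pandas as pd


def load_usv_times(source, sep=",", skip_header=False, time_cols=(0, 1)):
    """Read begin/end times from an annotation file (hypothetical helper).

    The defaults mirror the original call (comma-separated, no header row,
    times in the first two columns); USCMed-style files override them.
    """
    return pd.read_csv(
        source,
        header=None,
        skiprows=1 if skip_header else 0,  # drop the header row if present
        names=["begin_time", "end_time"],
        sep=sep,
        usecols=list(time_cols),
    )


# Hypothetical USCMed-style content: header row, tabs, times in columns 1 and 2.
sample = (
    "Selection\tBegin Time (s)\tEnd Time (s)\n"
    "1\t0.10\t0.25\n"
    "2\t0.40\t0.55\n"
)
data = load_usv_times(
    io.StringIO(sample), sep="\t", skip_header=True, time_cols=(1, 2)
)
print(data)
```

With this shape, the existing datasets keep using the defaults, and only the USCMed call site needs the extra arguments.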