Getting "was not completed in time" error when preprocessing dataset #115

Open
@AftabHussain

Hi, I am getting an error when preprocessing my dataset (by running source preprocess.sh). The same dataset preprocesses without any error in code2vec. I would appreciate any advice. Here is the output:

Extracting paths from training set...
dir: <dataset dir> was not completed in time
dir: <dataset dir> was not completed in time
dir: <dataset dir> was not completed in time
dir: <dataset dir> was not completed in time
Finished extracting paths from training set
Creating histograms from the training data
subtoken vocab size:  0
node vocab size:  0
target vocab size:  0
File: <dataset_name>.raw.txt
Traceback (most recent call last):
  File "preprocess.py", line 115, in <module>
    max_contexts=int(args.max_contexts), max_data_contexts=int(args.max_data_contexts))
  File "preprocess.py", line 53, in process_file
    print('Average total contexts: ' + str(float(sum_total) / total))
ZeroDivisionError: float division by zero
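
As far as I can tell from the traceback, the division fails because total is 0 by the time the average is printed, which matches the three vocab sizes of 0 above: no examples seem to have been read from the raw file. A minimal sketch of a guard around that computation, assuming sum_total and total are plain counters (the helper names average_contexts and report are hypothetical and not part of preprocess.py):

```python
def average_contexts(sum_total, total):
    """Return the mean number of contexts per example, or None if nothing was read."""
    if total == 0:
        # No examples were counted, so an average is undefined.
        return None
    return float(sum_total) / total


def report(sum_total, total):
    """Build the summary line, or a diagnostic message when the input was empty."""
    avg = average_contexts(sum_total, total)
    if avg is None:
        return 'No examples were read; the extraction step may have produced an empty file'
    return 'Average total contexts: ' + str(avg)


if __name__ == '__main__':
    print(report(10, 4))
    print(report(0, 0))
```

This would only turn the crash into a readable message; the underlying problem would still be that path extraction produced no output.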

This is the line that is being triggered:

print('dir: ' + str(dir) + ' was not completed in time', file=sys.stderr)
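
The message by itself does not tell me whether the extractor really ran out of time or failed for some other reason. A minimal sketch of how the two cases could be told apart, assuming the extractor is launched as an external command (run_with_timeout is a hypothetical helper, not the repository's code, and the command and timeout are placeholders):

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run a command and return (ok, reason, stderr_text)."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # The process was killed because it exceeded the time limit.
        return False, 'timed out', ''
    stderr_text = result.stderr.decode(errors='replace')
    if result.returncode != 0:
        # The process finished in time but reported a failure.
        return False, 'exit code ' + str(result.returncode), stderr_text
    return True, 'ok', stderr_text


if __name__ == '__main__':
    # Placeholder command: a child Python process that exits with status 3.
    ok, reason, err = run_with_timeout([sys.executable, '-c', 'import sys; sys.exit(3)'], 30)
    print(ok, reason, err, file=sys.stderr)
```

Printing the captured stderr next to the reason would show whether the run failed immediately or was actually cut off by the timer.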

Appreciate any thoughts.
