About converting the MARCO dataset to DuReader style #70

@pengwei-iie

When I use the script marcov2_to_dureader.py to convert MARCOv2 to DuReader format, it fails with ValueError: Trailing data.

The command: sh run_marco2dureader_preprocess.sh ../Marco/train_v2.1.json ../Marco/train_v2.1_dureaderformat.json

The full traceback is as follows:
Traceback (most recent call last):
  File "marcov1_to_dureader.py", line 33, in <module>
    df = pd.read_json(sys.argv[1])
  File "/home/user/anaconda3/lib/python3.6/site-packages/pandas/io/json/json.py", line 366, in read_json
    return json_reader.read()
  File "/home/user/anaconda3/lib/python3.6/site-packages/pandas/io/json/json.py", line 467, in read
    obj = self._get_object_parser(self.data)
  File "/home/user/anaconda3/lib/python3.6/site-packages/pandas/io/json/json.py", line 484, in _get_object_parser
    obj = FrameParser(json, **kwargs).parse()
  File "/home/user/anaconda3/lib/python3.6/site-packages/pandas/io/json/json.py", line 576, in parse
    self._parse_no_numpy()
  File "/home/user/anaconda3/lib/python3.6/site-packages/pandas/io/json/json.py", line 793, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None)
ValueError: Trailing data
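
For reference, pandas usually raises this error when more content follows the first complete JSON value, for example when a file holds one JSON object per line. Whether that is the case for train_v2.1.json has not been confirmed here. The sketch below reproduces the error on a small in-memory sample and falls back to `lines=True`; the sample data and the `read_records` helper are hypothetical and are not part of the conversion script.

```python
import io

import pandas as pd

# Hypothetical two-record sample in line-delimited form (one JSON object
# per line). This is NOT the real MARCO data.
sample = '{"query_id": 1, "query": "a"}\n{"query_id": 2, "query": "b"}\n'


def read_records(text):
    """Try a plain read first; fall back to line-delimited parsing.

    Note: this catches any ValueError from the plain read, not only
    "Trailing data", so it is a diagnostic sketch rather than a fix.
    """
    try:
        return pd.read_json(io.StringIO(text))
    except ValueError:
        # lines=True tells pandas to parse each line as a separate JSON object.
        return pd.read_json(io.StringIO(text), lines=True)


df = read_records(sample)
print(df.shape)
```

If the real file is a single JSON document, the fallback will not help, and the file itself (for example an incomplete download) would be the next thing to check.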
