Parse 1998 data error #5

Open
@GengYuIsland

Description

Hello, when I try to parse the 1998 data I get an error: the function "get_patents_list" returns an empty list. If I change the code to this:

def get_patents_list(patents_txt_data):
    patents_data = []
    current_patent = []
    # Skip the first line, then start a new record at every line beginning with 'PATN'.
    for line in patents_txt_data[1:]:
        cleaned_line = ' '.join(line.split())  # collapse runs of whitespace
        if cleaned_line.startswith('PATN') and current_patent:
            patents_data.append(current_patent)
            current_patent = []
        current_patent.append(cleaned_line)
    if current_patent:
        patents_data.append(current_patent)
    # Split every line of every record into its words.
    return [[line.split() for line in patent] for patent in patents_data]

Then it works.
However, it only works for 1998; when I use the new function to parse 1999, it fails.
I guess not all years were tested. Could you help me solve this problem and make the code more robust? Thanks a lot.
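
For discussion, here is a minimal sketch of a somewhat more defensive variant of the grouping step. It is not the repository's implementation, and I have not verified it against the 1999 file. The name `group_records`, the `marker` parameter, and the sample lines are all hypothetical. It assumes that every record begins with a line whose first whitespace-separated token is exactly the marker; if 1999 uses a different marker or layout, this will not help either. Unlike the function above, it does not skip the first line unconditionally. Instead it ignores everything before the first marker and drops blank lines.

```python
from typing import Iterable, List


def group_records(lines: Iterable[str], marker: str = "PATN") -> List[List[List[str]]]:
    """Group lines into records; each record starts at a line whose first token is `marker`.

    Hypothetical helper for illustration only.
    """
    records: List[List[List[str]]] = []
    current = None  # stays None until the first marker line is seen
    for line in lines:
        tokens = line.split()  # also strips leading/trailing whitespace
        if not tokens:
            continue  # skip blank lines
        if tokens[0] == marker:
            current = [tokens]  # start a new record
            records.append(current)
        elif current is not None:
            current.append(tokens)  # line belongs to the current record
        # Lines before the first marker (e.g. a file header) are ignored.
    return records


# Made-up sample input, not real patent data.
sample = [
    "HHHHHT  header line",
    "PATN",
    "WKU  058000001",
    "",
    "  PATN  ",
    "WKU  058000002",
]
records = group_records(sample)
print(len(records))
```

If this returns an empty list for a given year, no line in that file starts with the marker, which would suggest that the file uses a different record delimiter rather than that the grouping logic is wrong.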
