Module to convert raw scraped data into a standardised format #5

@adiah80

The raw scraped data from Issue #4 needs to be processed before it can be used to train the models. We need a module that aggregates the raw data into a single dataset (a .csv file) containing the training features and labels.

Each tweet posted by an account the user follows should be treated as one data point. Every tweet the user interacted with (liked, retweeted, or commented on) should be labelled as a positive instance.
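
A minimal sketch of this labelling rule, assuming each raw record is a dict with a hypothetical `tweet_id` field and that the user's interactions are available as collections of tweet IDs. The real field names depend on the scraper output from Issue #4. Labelling every other tweet as 0 is also an assumption, since only the positive class is specified here.

```python
def label_tweets(timeline_tweets, liked_ids, retweeted_ids, commented_ids):
    """Attach a binary `label` to every tweet from followed accounts.

    The `tweet_id` field and the argument shapes are assumptions for
    illustration, not the scraper's confirmed output format.
    """
    # A tweet is positive if the user interacted with it in any way.
    interacted = set(liked_ids) | set(retweeted_ids) | set(commented_ids)

    labelled = []
    for tweet in timeline_tweets:
        row = dict(tweet)  # copy so the raw record is not mutated
        # Assumption: tweets with no interaction are treated as negatives.
        row["label"] = 1 if tweet["tweet_id"] in interacted else 0
        labelled.append(row)
    return labelled
```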

Features should include the tweet text, the tweet's author, the global interaction metrics for the tweet (counts of likes, retweets, and comments), and the time the tweet was posted.
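
One possible shape for the output file, sketched with the standard-library `csv` module. The column names in `FEATURE_COLUMNS` and the `write_dataset` helper are hypothetical and would need to match whatever the scraper actually produces.

```python
import csv
import io

# Hypothetical column names for the features listed above.
FEATURE_COLUMNS = [
    "text",
    "author",
    "like_count",
    "retweet_count",
    "comment_count",
    "created_at",
]


def write_dataset(rows, out_file):
    """Write the feature columns plus a `label` column as CSV.

    `rows` is an iterable of dicts; keys not listed above are ignored
    and missing keys are written as empty strings.
    """
    writer = csv.DictWriter(
        out_file,
        fieldnames=FEATURE_COLUMNS + ["label"],
        extrasaction="ignore",
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Usage with an in-memory buffer; a real run would pass
# open("dataset.csv", "w", newline="", encoding="utf-8") instead.
buffer = io.StringIO()
write_dataset(
    [{"text": "hello", "author": "someone", "like_count": 3,
      "retweet_count": 1, "comment_count": 0,
      "created_at": "2020-01-01T00:00:00", "label": 1}],
    buffer,
)
```

Keeping the raw counts and timestamp as plain columns leaves room for the more complex derived features to be added as extra columns later.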

More complex features can also be devised and included.
