
data problem #30

@bakrilawzi

Description


While working on the project, I realized there is an issue in the training_pipeline, specifically in the hyperparameter file. I was following along with your code, but when I reached the cv_evaluate section and attempted to fit the data for cross-validation, I encountered a problem.

The issue arises due to the small number of rows available for the (1, 111) area and consumer_type combination. This limited data leads to a mismatch with the required window size, causing the process to crash. The problem stems from the grouping logic, which results in very few data points for certain groups.
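A quick way to confirm this diagnosis is to count rows per group and compare the counts against the cross-validation window length. The column names and window size below are assumptions for illustration, not the repo's actual values:

```python
import pandas as pd

# Toy data mimicking the problem: the (1, 111) group has too few rows.
df = pd.DataFrame({
    "area": [1, 1, 1, 2, 2, 2, 2, 2],
    "consumer_type": [111, 111, 111, 112, 112, 112, 112, 112],
    "energy_consumption": [10.0, 11.0, 9.5, 20.0, 21.0, 19.0, 22.0, 20.5],
})

WINDOW_SIZE = 4  # assumed cross-validation window length

# Groups with fewer rows than the window will crash cv_evaluate's fit.
group_sizes = df.groupby(["area", "consumer_type"]).size()
too_small = group_sizes[group_sizes < WINDOW_SIZE]
print(too_small)
```

Running a check like this before the `cv_evaluate` step would surface the offending groups instead of crashing mid-fit.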

I suggest generating new dummy data to address this. However, there's another challenge: the datetime_utc column currently contains only two identical timestamps, with variation only in energy_consumption and consumer_type.

The latest update of Hopsworks no longer accepts redundant data in the training set. As a result, this redundancy triggers an error during saving. You'll need to eliminate these duplicate entries to avoid the issue.
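One hedged sketch of the suggested fix: generate dummy rows on a proper hourly index so `datetime_utc` is unique within each (area, consumer_type) group, then drop any remaining duplicates on the key columns before saving the training set. Every name here (the generator function, column names, group keys) is an assumption based on the issue text, not the project's actual API:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Hypothetical dummy-data generator: one row per hour per group,
# so datetime_utc never repeats within a group.
def make_dummy_data(groups, hours=24, start="2023-01-01"):
    index = pd.date_range(start=start, periods=hours, freq="h", tz="UTC")
    frames = []
    for area, consumer_type in groups:
        frames.append(pd.DataFrame({
            "datetime_utc": index,
            "area": area,
            "consumer_type": consumer_type,
            "energy_consumption": rng.uniform(5.0, 25.0, size=hours),
        }))
    return pd.concat(frames, ignore_index=True)

df = make_dummy_data([(1, 111), (2, 112)], hours=48)

# Hopsworks rejects redundant rows, so dedupe on the key columns
# before writing the training set.
df = df.drop_duplicates(subset=["datetime_utc", "area", "consumer_type"])
```

With enough hours per group, this also sidesteps the window-size mismatch described above, since every group gets the same number of evenly spaced rows.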
