[Bug]: Classification Model always predicting 0 #217

@yc-um

Description

Contact Details

[email protected]

Short description of the problem here.

Hi, I built a classification model based on the hybrid structural model learned from a dataset with 6 features and ~40k records, following the tutorial. When I evaluated the model on the test set, I got 0 precision and 0 recall for class 1. I tried constructing the model with different parameters, but that does not fix the issue: precision and recall for class 1 stay at 0, even though about 20% of the full dataset belongs to class 1. What might be wrong here? I would appreciate any feedback or suggestions.

{
    'Target_0': {'precision': 0.8104828298476633, 'recall': 1.0, 'f1-score': 0.8953223046206502, 'support': 3139.0},
    'Target_1': {'precision': 0.0, 'recall': 0.0, 'f1-score': 0.0, 'support': 734.0},
    'accuracy': 0.8104828298476633,
    'macro avg': {'precision': 0.40524141492383164, 'recall': 0.5, 'f1-score': 0.4476611523103251, 'support': 3873.0},
    'weighted avg': {'precision': 0.6568824174778763, 'recall': 0.8104828298476633, 'f1-score': 0.7256433550746763, 'support': 3873.0},
}
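
The report above is what a classifier that outputs 0 for every test row would produce. As a quick consistency check, the sketch below recomputes the headline numbers from the two support counts alone, in plain Python. The variable names are illustrative and are not part of CausalNex.

```python
# Support counts taken from the report above.
n_class0 = 3139
n_class1 = 734
n_total = n_class0 + n_class1

# A model that always predicts 0: every class-0 row is a true positive for
# class 0, and every class-1 row is a false positive for class 0.
precision_0 = n_class0 / n_total
recall_0 = n_class0 / n_class0
f1_0 = 2 * precision_0 * recall_0 / (precision_0 + recall_0)

# Class 1 is never predicted, so its metrics are reported as 0.
precision_1 = recall_1 = f1_1 = 0.0

accuracy = n_class0 / n_total
macro_f1 = (f1_0 + f1_1) / 2
weighted_f1 = (f1_0 * n_class0 + f1_1 * n_class1) / n_total

print(f"precision_0={precision_0:.4f} f1_0={f1_0:.4f}")
print(f"accuracy={accuracy:.4f} macro_f1={macro_f1:.4f} weighted_f1={weighted_f1:.4f}")
```

The recomputed values match the report, so the accuracy of about 0.81 is simply the share of class 0 in the test set and does not indicate that the model learned anything about class 1.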

CausalNex Version

0.12.1

Python Version

3.9.18

Relevant code snippet

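# Imports assumed from the CausalNex tutorial; they were not shown in the original snippet
import pandas as pd
from causalnex.structure.notears import from_pandas

# Learn the structure; the tabu edges forbid any edge pointing out of Target
sm = from_pandas(df, tabu_edges=[("Target","X1"),("Target","X2"),("Target","X3"),("Target","X4"),("Target","X5")], w_threshold=0.8)
# Manual adjustment of connections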
sm.add_edge("X1", "Target")
sm.add_edge("X2", "X3")

from causalnex.network import BayesianNetwork

bn = BayesianNetwork(sm)

discretised_data = df.copy()

columns_to_bin = ['X1', 'X2', 'X3', 'X4', 'X5']

# Bin input data
num_bins = 5
bin_range = (0, 10)

# Loop through the columns and apply pd.cut to create equal-width buckets
for column in columns_to_bin:
    discretised_data[column] = pd.cut(discretised_data[column], bins=num_bins, labels=False, retbins=False, right=True, include_lowest=True)


# Split 90% train and 10% test
from sklearn.model_selection import train_test_split
train, test = train_test_split(discretised_data, train_size=0.9, test_size=0.1, random_state=7)

bn = bn.fit_node_states(discretised_data)
bn = bn.fit_cpds(train, method="BayesianEstimator", bayes_prior="K2")

from causalnex.evaluation import classification_report
classification_report(bn, test, "Target")
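
One possible explanation, offered as a hypothesis rather than a diagnosis: in the structure above, the only edge into `Target` that the snippet guarantees is the manually added `X1 -> Target`, so the prediction for a row depends on the conditional probability of `Target` given that row's `X1` bin (plus any parents the structure learning kept). If the predicted class is the most probable state, which appears to be how `bn.predict` behaves, and no parent configuration has P(Target=1) above 0.5, then class 1 is never predicted. The sketch below illustrates this with made-up numbers. The table `p_target1_given_x1`, the `rows` list, and the 0.25 threshold are all hypothetical and are not taken from the data in this issue.

```python
# Hypothetical P(Target=1 | X1 bin) for 5 bins; illustrative values only.
p_target1_given_x1 = {0: 0.08, 1: 0.15, 2: 0.22, 3: 0.31, 4: 0.44}

# Hypothetical test rows: (X1 bin, true Target).
rows = [(0, 0), (1, 0), (2, 0), (3, 1), (4, 1), (4, 0), (2, 1), (0, 0)]


def predict(x1_bin, threshold=0.5):
    """Predict 1 when P(Target=1 | X1 bin) exceeds the threshold."""
    return int(p_target1_given_x1[x1_bin] > threshold)


def recall_class1(threshold):
    """Fraction of true class-1 rows that are predicted as class 1."""
    positives = [x for x, y in rows if y == 1]
    hits = sum(predict(x, threshold) for x in positives)
    return hits / len(positives)


# Most-probable-state rule (threshold 0.5): no bin crosses 0.5, so recall is 0.
print(recall_class1(0.5))
# A lower, hypothetical threshold recovers some class-1 rows.
print(recall_class1(0.25))
```

With the real model, `bn.predict_probability(test, "Target")` should expose the per-row class probabilities, which would show whether any test row ever exceeds 0.5 for class 1.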

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
