week2 pickle model for Lasso #201

This refers to the modified week 2 version of the duration-prediction.ipynb notebook ([link](https://github.com/DataTalksClub/mlops-zoomcamp/blob/main/02-experiment-tracking/duration-prediction.ipynb)).

I think the pickled model passed to `log_artifact()` in the last line of the following code block is the wrong one. `lin_reg.bin` is saved outside the MLflow run and contains the linear regression model fitted without regularization, which is different from the Lasso model fitted inside the experiment run.

```
lr = LinearRegression()
lr.fit(X_train, y_train)

y_pred = lr.predict(X_val)

mean_squared_error(y_val, y_pred, squared=False)
# notebook output: 7.758715210382775

with open('models/lin_reg.bin', 'wb') as f_out:
    pickle.dump((dv, lr), f_out)

with mlflow.start_run():

    mlflow.set_tag("developer", "cristian")

    mlflow.log_param("train-data-path", "./data/green_tripdata_2021-01.csv")
    mlflow.log_param("valid-data-path", "./data/green_tripdata_2021-02.csv")

    alpha = 0.1
    mlflow.log_param("alpha", alpha)
    lr = Lasso(alpha)
    lr.fit(X_train, y_train)

    y_pred = lr.predict(X_val)
    rmse = mean_squared_error(y_val, y_pred, squared=False)
    mlflow.log_metric("rmse", rmse)

    mlflow.log_artifact(local_path="models/lin_reg.bin", artifact_path="models_pickle")
```
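One way to check the mismatch, and a possible fix, is sketched below. This is only an illustration under assumptions: the toy records stand in for the notebook's taxi data, and `dump_model` and the `lasso.bin` file name are hypothetical names that do not appear in the notebook. It pickles each fitted model to its own file and reads both back to see which estimator each file actually holds.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import Lasso, LinearRegression


def dump_model(dv, model, path):
    """Pickle (dv, model) to path, the same tuple layout the notebook uses."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "wb") as f_out:
        pickle.dump((dv, model), f_out)
    return path


# Toy records standing in for the notebook's data (assumption, not real trips).
records = [
    {"PU_DO": "1_2", "trip_distance": 1.0},
    {"PU_DO": "1_3", "trip_distance": 2.5},
    {"PU_DO": "2_3", "trip_distance": 4.0},
    {"PU_DO": "1_2", "trip_distance": 1.5},
]
y_train = [6.0, 12.0, 20.0, 8.0]

dv = DictVectorizer()
X_train = dv.fit_transform(records)

# Model fitted outside the run, as in the first notebook cell.
lin_reg = LinearRegression().fit(X_train, y_train)
# Model fitted inside the run, as in the mlflow.start_run() cell.
lasso = Lasso(alpha=0.1).fit(X_train, y_train)

with tempfile.TemporaryDirectory() as tmp:
    lin_path = dump_model(dv, lin_reg, Path(tmp) / "models" / "lin_reg.bin")
    # Hypothetical separate file for the Lasso model.
    lasso_path = dump_model(dv, lasso, Path(tmp) / "models" / "lasso.bin")

    with open(lin_path, "rb") as f_in:
        _, logged_now = pickle.load(f_in)
    with open(lasso_path, "rb") as f_in:
        _, should_log = pickle.load(f_in)

# Class names of the estimators stored in each file.
logged_now_name = type(logged_now).__name__
should_log_name = type(should_log).__name__
print(logged_now_name, should_log_name)
```

If this matches what the notebook intends, the run could dump the Lasso model to its own file inside the `with mlflow.start_run():` block and pass that path to `mlflow.log_artifact(local_path=..., artifact_path="models_pickle")` instead of `models/lin_reg.bin`.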
