I get a "RepresenterError" when trying to log a model using service principal #1564

Open
@poppeGabrielsen


Which example? Describe the issue

The code below is based on this sample notebook:
https://github.com/Azure/azureml-examples/blob/main/notebooks/using-mlflow/train-with-mlflow/xgboost_service_principal.ipynb

I get a "RepresenterError" when logging a model while authenticated as a service principal (SP), but not when authenticated as a user. The SP is an app registration and has the role "AzureML Data Scientist" on the AML workspace. Logging metrics with the SP produces no errors, but logging a model does. I have tested with mlflow 1.27.0 and 1.28.0. Is this expected behaviour or a bug? Calling mlflow.sklearn.autolog(), as done in the sample notebook, does not work either.

example:

"name": "RepresenterError",
	"message": "('cannot represent an object', OrderedDict([('name', 'mlflow-env'), ('channels', ['conda-forge']), ('dependencies', ['python=3.9.7', 'pip<=21.2.4', {'pip': ['mlflow', 'cloudpickle==2.1.0', 'psutil==5.9.1', 'scikit-learn==1.1.2', 'typing-extensions==4.3.0']}])]))", .... and so on ....
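For reference, the message has the same shape as the error PyYAML raises when `yaml.safe_dump` is given a `collections.OrderedDict`. The sketch below reproduces only that PyYAML behaviour, using a shortened copy of the environment from the message. That mlflow reaches this same code path when it writes the model's conda environment is an assumption and has not been confirmed here; `try_safe_dump` is a hypothetical helper name.

```python
from collections import OrderedDict

import yaml  # PyYAML

# Shortened copy of the structure shown in the error message above.
conda_env = OrderedDict(
    [
        ("name", "mlflow-env"),
        ("channels", ["conda-forge"]),
        ("dependencies", ["python=3.9.7", {"pip": ["mlflow"]}]),
    ]
)


def try_safe_dump(obj):
    """Return the dumped YAML text, or the RepresenterError if dumping fails."""
    try:
        return yaml.safe_dump(obj, default_flow_style=False)
    except yaml.representer.RepresenterError as exc:
        return exc


# The safe dumper has no representer registered for OrderedDict.
ordered_result = try_safe_dump(conda_env)
# The same content as a plain dict is dumped without complaint.
plain_result = try_safe_dump(dict(conda_env))

print(type(ordered_result).__name__)
print(plain_result)
```

If this is what happens here, the open question is why the environment ends up as an OrderedDict only in the service principal setup.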

Additional context

This code will generate the error:

#%%
import os
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
import mlflow
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
import pandas as pd
from azureml.core import Workspace


### SECTION A: Authenticate with a service principal (this produces the error)
os.environ['AZURE_TENANT_ID']="..."
os.environ['AZURE_CLIENT_ID']="..."
os.environ['AZURE_CLIENT_SECRET']="..."
credentials = DefaultAzureCredential()
ml_client = MLClient.from_config(credential=credentials)
ws = ml_client.workspaces.get(name="wsname")
mlflow.set_tracking_uri(ws.mlflow_tracking_uri)
### END A

### SECTION B: Alternative, authenticate as a user (no error). Replace A with B.
#ws = Workspace.from_config('config.json')
#ws.get_mlflow_tracking_uri()
#mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())
### END B


mlflow.set_experiment("expname")

# Fit a simple model and log it
data_uri = "https://azuremlexamples.blob.core.windows.net/datasets/iris.csv"
df = pd.read_csv(data_uri)

X = df.drop(["species"], axis=1)
y = df["species"]
enc = LabelEncoder()
y = enc.fit_transform(y)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = RandomForestClassifier()
model.fit(X_train, y_train)

# log model
with mlflow.start_run() as run:
    mlflow.sklearn.log_model(model, 'modelpath')
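A possible diagnostic step, untested against the service principal setup: build the environment from plain dicts and lists and pass it explicitly through the `conda_env` argument of `mlflow.sklearn.log_model`, so that no OrderedDict is handed to the serializer. This assumes the OrderedDict in the error message is what breaks serialization. `to_plain` is a hypothetical helper that only does the conversion, and the environment shown is a shortened copy of the one in the message.

```python
from collections import OrderedDict
from collections.abc import Mapping


def to_plain(obj):
    """Recursively replace mappings with plain dicts and lists/tuples with lists."""
    if isinstance(obj, Mapping):
        return {key: to_plain(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_plain(item) for item in obj]
    return obj


# Shortened copy of the environment from the error message, nested on purpose.
env = OrderedDict(
    [
        ("name", "mlflow-env"),
        ("channels", ["conda-forge"]),
        ("dependencies", ["python=3.9.7", OrderedDict([("pip", ["mlflow"])])]),
    ]
)

plain_env = to_plain(env)

# Hypothetical usage, not verified to avoid the error:
# mlflow.sklearn.log_model(model, "modelpath", conda_env=plain_env)
```

If the error persists even with a plain-dict environment, the OrderedDict is probably created later in the logging path, which would be useful information for narrowing this down.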
