Commit d5c0a23

committed
Automatically generated by github-worflow[bot] for commit: f4bd787
1 parent a5cefcb commit d5c0a23

24 files changed

+3695
-61
lines changed

README.md

Lines changed: 49 additions & 0 deletions
@@ -1,3 +1,52 @@
+### Change log [2026-01-20 18:09:09]
+1. Item Updated: `verify_schema` (from version: `1.0.0` to `1.0.0`)
+
+### Change log [2026-01-20 18:09:03]
+1. Item Updated: `agent_deployer` (from version: `1.0.0` to `1.0.0`)
+2. Item Updated: `histogram_data_drift` (from version: `1.0.0` to `1.0.0`)
+3. Item Updated: `openai_proxy_app` (from version: `1.0.0` to `1.0.0`)
+4. Item Updated: `vllm_module` (from version: `1.0.0` to `1.0.0`)
+5. Item Updated: `count_events` (from version: `1.0.0` to `1.0.0`)
+6. Item Updated: `evidently_iris` (from version: `1.0.0` to `1.0.0`)
+
+### Change log [2026-01-20 18:08:55]
+1. Item Updated: `test_classifier` (from version: `1.1.0` to `1.1.0`)
+2. Item Updated: `sklearn_classifier` (from version: `1.2.0` to `1.2.0`)
+3. Item Updated: `model_server_tester` (from version: `1.1.0` to `1.1.0`)
+4. Item Updated: `azureml_serving` (from version: `1.1.0` to `1.1.0`)
+5. Item Updated: `describe_dask` (from version: `1.2.0` to `1.2.0`)
+6. Item Updated: `batch_inference` (from version: `1.8.0` to `1.8.0`)
+7. Item Updated: `v2_model_server` (from version: `1.2.0` to `1.2.0`)
+8. Item Updated: `gen_class_data` (from version: `1.3.0` to `1.3.0`)
+9. Item Updated: `send_email` (from version: `1.2.0` to `1.2.0`)
+10. Item Updated: `tf2_serving` (from version: `1.1.0` to `1.1.0`)
+11. Item Updated: `aggregate` (from version: `1.4.0` to `1.4.0`)
+12. Item Updated: `open_archive` (from version: `1.2.0` to `1.2.0`)
+13. Item Updated: `describe` (from version: `1.4.0` to `1.4.0`)
+14. Item Updated: `v2_model_tester` (from version: `1.1.0` to `1.1.0`)
+15. Item Updated: `text_to_audio_generator` (from version: `1.3.0` to `1.3.0`)
+16. Item Updated: `pii_recognizer` (from version: `0.4.0` to `0.4.0`)
+17. Item Updated: `github_utils` (from version: `1.1.0` to `1.1.0`)
+18. Item Updated: `sklearn_classifier_dask` (from version: `1.1.1` to `1.1.1`)
+19. Item Updated: `azureml_utils` (from version: `1.4.0` to `1.4.0`)
+20. Item Updated: `question_answering` (from version: `0.5.0` to `0.5.0`)
+21. Item Updated: `structured_data_generator` (from version: `1.6.0` to `1.6.0`)
+22. Item Updated: `arc_to_parquet` (from version: `1.5.0` to `1.5.0`)
+23. Item Updated: `silero_vad` (from version: `1.4.0` to `1.4.0`)
+24. Item Updated: `load_dataset` (from version: `1.2.0` to `1.2.0`)
+25. Item Updated: `auto_trainer` (from version: `1.8.0` to `1.8.0`)
+26. Item Updated: `feature_selection` (from version: `1.6.0` to `1.6.0`)
+27. Item Updated: `translate` (from version: `0.3.0` to `0.3.0`)
+28. Item Updated: `describe_spark` (from version: `1.1.0` to `1.1.0`)
+29. Item Updated: `pyannote_audio` (from version: `1.3.0` to `1.3.0`)
+30. Item Updated: `onnx_utils` (from version: `1.3.0` to `1.3.0`)
+31. Item Updated: `batch_inference_v2` (from version: `2.6.0` to `2.6.0`)
+32. Item Updated: `transcribe` (from version: `1.2.0` to `1.2.0`)
+33. Item Updated: `model_server` (from version: `1.2.0` to `1.2.0`)
+34. Item Updated: `mlflow_utils` (from version: `1.2.0` to `1.2.0`)
+35. Item Updated: `noise_reduction` (from version: `1.1.0` to `1.1.0`)
+36. Item Updated: `hugging_face_serving` (from version: `1.1.0` to `1.1.0`)
+
 ### Change log [2026-01-11 09:32:46]
 1. Item Updated: `verify_schema` (from version: `1.0.0` to `1.0.0`)
 

catalog.json

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

functions/master/catalog.json

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+kind: serving
+verbose: false
+metadata:
+  categories:
+  - model-serving
+  - utilities
+  name: mlflow-utils
+  tag: ''
+spec:
+  image: mlrun/mlrun
+  function_kind: serving_v2
+  disable_auto_mount: false
+  max_replicas: 4
+  min_replicas: 1
+  function_handler: mlflow-utils-nuclio:handler
+  build:
functionSourceCode: aW1wb3J0IHppcGZpbGUKZnJvbSB0eXBpbmcgaW1wb3J0IEFueSwgRGljdAppbXBvcnQgbWxmbG93CmZyb20gbWxydW4uc2VydmluZy52Ml9zZXJ2aW5nIGltcG9ydCBWMk1vZGVsU2VydmVyCmltcG9ydCBwYW5kYXMgYXMgcGQKCgpjbGFzcyBNTEZsb3dNb2RlbFNlcnZlcihWMk1vZGVsU2VydmVyKToKICAgICIiIgogICAgTUxGbG93IHRyYWNrZXIgTW9kZWwgc2VydmluZyBjbGFzcywgaW5oZXJpdGluZyB0aGUgVjJNb2RlbFNlcnZlciBjbGFzcyBmb3IgYmVpbmcgaW5pdGlhbGl6ZWQgYXV0b21hdGljYWxseSBieSB0aGUgbW9kZWwKICAgIHNlcnZlciBhbmQgYmUgYWJsZSB0byBydW4gbG9jYWxseSBhcyBwYXJ0IG9mIGEgbnVjbGlvIHNlcnZlcmxlc3MgZnVuY3Rpb24sIG9yIGFzIHBhcnQgb2YgYSByZWFsLXRpbWUgcGlwZWxpbmUuCiAgICAiIiIKCiAgICBkZWYgbG9hZChzZWxmKToKICAgICAgICAiIiIKICAgICAgICBsb2FkcyBhIG1vZGVsIHRoYXQgd2FzIGxvZ2dlZCBieSB0aGUgTUxGbG93IHRyYWNrZXIgbW9kZWwKICAgICAgICAiIiIKICAgICAgICAjIFVuemlwIHRoZSBtb2RlbCBkaXIgYW5kIHRoZW4gdXNlIG1sZmxvdydzIGxvYWQgZnVuY3Rpb24KICAgICAgICBtb2RlbF9maWxlLCBfID0gc2VsZi5nZXRfbW9kZWwoIi56aXAiKQogICAgICAgIG1vZGVsX3BhdGhfdW56aXAgPSBtb2RlbF9maWxlLnJlcGxhY2UoIi56aXAiLCAiIikKCiAgICAgICAgd2l0aCB6aXBmaWxlLlppcEZpbGUobW9kZWxfZmlsZSwgInIiKSBhcyB6aXBfcmVmOgogICAgICAgICAgICB6aXBfcmVmLmV4dHJhY3RhbGwobW9kZWxfcGF0aF91bnppcCkKCiAgICAgICAgc2VsZi5tb2RlbCA9IG1sZmxvdy5weWZ1bmMubG9hZF9tb2RlbChtb2RlbF9wYXRoX3VuemlwKQoKICAgIGRlZiBwcmVkaWN0KHNlbGYsIHJlcXVlc3Q6IERpY3Rbc3RyLCBBbnldKSAtPiBsaXN0OgogICAgICAgICIiIgogICAgICAgIEluZmVyIHRoZSBpbnB1dHMgdGhyb3VnaCB0aGUgbW9kZWwuIFRoZSBpbmZlcnJlZCBkYXRhIHdpbGwKICAgICAgICBiZSByZWFkIGZyb20gdGhlICJpbnB1dHMiIGtleSBvZiB0aGUgcmVxdWVzdC4KCiAgICAgICAgOnBhcmFtIHJlcXVlc3Q6IFRoZSByZXF1ZXN0IHRvIHRoZSBtb2RlbCB1c2luZyB4Z2Jvb3N0J3MgcHJlZGljdC4KICAgICAgICAgICAgICAgIFRoZSBpbnB1dCB0byB0aGUgbW9kZWwgd2lsbCBiZSByZWFkIGZyb20gdGhlICJpbnB1dHMiIGtleS4KCiAgICAgICAgOnJldHVybjogVGhlIG1vZGVsJ3MgcHJlZGljdGlvbiBvbiB0aGUgZ2l2ZW4gaW5wdXQuCiAgICAgICAgIiIiCgogICAgICAgICMgR2V0IHRoZSBpbnB1dHMgYW5kIHNldCB0byBhY2NlcHRlZCB0eXBlOgogICAgICAgIGlucHV0cyA9IHBkLkRhdGFGcmFtZShyZXF1ZXN0WyJpbnB1dHMiXSkKCiAgICAgICAgIyBQcmVkaWN0IHVzaW5nIHRoZSBtb2RlbCdzIHByZWRpY3QgZnVuY3Rpb246CiAgICAgICAgcHJlZGljdGlvbnMgPSBzZWxmLm1vZGVsLnBy
ZWRpY3QoaW5wdXRzKQoKICAgICAgICAjIFJldHVybiBhcyBsaXN0OgogICAgICAgIHJldHVybiBwcmVkaWN0aW9ucy50b2xpc3QoKQoKZnJvbSBtbHJ1bi5ydW50aW1lcyBpbXBvcnQgbnVjbGlvX2luaXRfaG9vawpkZWYgaW5pdF9jb250ZXh0KGNvbnRleHQpOgogICAgbnVjbGlvX2luaXRfaG9vayhjb250ZXh0LCBnbG9iYWxzKCksICdzZXJ2aW5nX3YyJykKCmRlZiBoYW5kbGVyKGNvbnRleHQsIGV2ZW50KToKICAgIHJldHVybiBjb250ZXh0Lm1scnVuX2hhbmRsZXIoY29udGV4dCwgZXZlbnQpCg==
+    requirements:
+    - mlflow~=3.5
+    origin_filename: ''
+    code_origin: ''
+  description: Mlflow model server, and additional utils.
+  command: ''
+  base_image_pull: false
+  default_class: MLFlowModelServer
+  source: ''
+  default_handler: ''
+  env:
+  - name: MLRUN_HTTPDB__NUCLIO__EXPLICIT_ACK
+    value: enabled
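Reviewer note: the `functionSourceCode` value in this file appears to be the base64-encoded Python source of the serving class plus the Nuclio wrapper. A minimal sketch of inspecting such a field with the standard library follows. The sample string is only the first 20 characters of the value above (five complete base64 groups), and `decode_function_source` is a hypothetical helper name, not part of the commit or of MLRun.

```python
import base64


def decode_function_source(encoded: str) -> str:
    """Decode a base64-encoded source string into UTF-8 text."""
    return base64.b64decode(encoded).decode("utf-8")


# First 20 characters of the functionSourceCode value shown in the diff.
prefix = "aW1wb3J0IHppcGZpbGUK"
print(decode_function_source(prefix))
```

Decoding the full value the same way should yield the contents of `mlflow_utils.py` followed by the generated `init_context` and `handler` functions.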
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+apiVersion: v1
+categories:
+- model-serving
+- utilities
+description: Mlflow model server, and additional utils.
+doc: ''
+example: mlflow_utils.ipynb
+generationDate: 2024-05-23:12-00
+hidden: false
+icon: ''
+labels:
+  author: Iguazio
+maintainers: []
+marketplaceType: ''
+mlrunVersion: 1.10.0
+name: mlflow_utils
+platformVersion: ''
+spec:
+  customFields:
+    default_class: MLFlowModelServer
+  filename: mlflow_utils.py
+  handler: handler
+  image: mlrun/mlrun
+  kind: serving
+  requirements:
+  - mlflow~=3.5
+url: ''
+version: 1.2.0

functions/master/mlflow_utils/1.2.0/src/mlflow_utils.ipynb

Lines changed: 1353 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+import zipfile
+from typing import Any, Dict
+import mlflow
+from mlrun.serving.v2_serving import V2ModelServer
+import pandas as pd
+
+
+class MLFlowModelServer(V2ModelServer):
+    """
+    MLFlow tracker Model serving class, inheriting the V2ModelServer class for being initialized automatically by the model
+    server and be able to run locally as part of a nuclio serverless function, or as part of a real-time pipeline.
+    """
+
+    def load(self):
+        """
+        loads a model that was logged by the MLFlow tracker model
+        """
+        # Unzip the model dir and then use mlflow's load function
+        model_file, _ = self.get_model(".zip")
+        model_path_unzip = model_file.replace(".zip", "")
+
+        with zipfile.ZipFile(model_file, "r") as zip_ref:
+            zip_ref.extractall(model_path_unzip)
+
+        self.model = mlflow.pyfunc.load_model(model_path_unzip)
+
+    def predict(self, request: Dict[str, Any]) -> list:
+        """
+        Infer the inputs through the model. The inferred data will
+        be read from the "inputs" key of the request.
+
+        :param request: The request to the model using xgboost's predict.
+                The input to the model will be read from the "inputs" key.
+
+        :return: The model's prediction on the given input.
+        """
+
+        # Get the inputs and set to accepted type:
+        inputs = pd.DataFrame(request["inputs"])
+
+        # Predict using the model's predict function:
+        predictions = self.model.predict(inputs)
+
+        # Return as list:
+        return predictions.tolist()
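Reviewer note on `load`: `model_file.replace(".zip", "")` removes every `.zip` occurrence in the path, not only the trailing extension, so a parent directory whose name contains `.zip` would change the extraction target. A minimal sketch of the same unzip step using `pathlib` follows, assuming the archive is a local file. `extract_model_archive` is a hypothetical helper name and is not part of this commit or of MLRun.

```python
import zipfile
from pathlib import Path


def extract_model_archive(model_file: str) -> str:
    """Extract a zipped model directory next to the archive and return its path."""
    archive = Path(model_file)
    # with_suffix("") drops only the final extension, leaving parent directories untouched.
    target_dir = archive.with_suffix("")
    with zipfile.ZipFile(archive, "r") as zip_ref:
        zip_ref.extractall(target_dir)
    return str(target_dir)
```

The returned directory could then be passed to `mlflow.pyfunc.load_model`, as the committed `load` method does with `model_path_unzip`.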
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+mlflow~=3.5
+lightgbm
+xgboost
Lines changed: 207 additions & 0 deletions
@@ -0,0 +1,207 @@
+# Copyright 2018 Iguazio
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+import tempfile
+import shutil
+import lightgbm as lgb
+import mlflow
+import mlflow.environment_variables
+import mlflow.xgboost
+import pytest
+import xgboost as xgb
+from sklearn import datasets
+from sklearn.metrics import accuracy_score, log_loss
+from sklearn.model_selection import train_test_split
+
+import os
+# os.environ["MLRUN_IGNORE_ENV_FILE"] = "True" #TODO remove before push
+
+import mlrun
+import mlrun.launcher.local
+# Important:
+# unlike mlconf which resets back to default after each test run, the mlflow configurations
+# and env vars don't, so at the end of each test we need to redo anything we set in that test.
+# what we cover in these tests: logging "regular" runs with, experiment name, run id and context
+# name (last two using mlconf), failing run mid-way, and a run with no handler.
+# we also test here importing of runs, artifacts and models from a previous run.
+
+# simple mlflow example of lgb logging
+def lgb_run():
+    # prepare train and test data
+    iris = datasets.load_iris()
+    X = iris.data
+    y = iris.target
+    X_train, X_test, y_train, y_test = train_test_split(
+        X, y, test_size=0.2, random_state=42
+    )
+
+    # enable auto logging
+    mlflow.lightgbm.autolog()
+
+    train_set = lgb.Dataset(X_train, label=y_train)
+
+    with mlflow.start_run():
+        # train model
+        params = {
+            "objective": "multiclass",
+            "num_class": 3,
+            "learning_rate": 0.1,
+            "metric": "multi_logloss",
+            "colsample_bytree": 1.0,
+            "subsample": 1.0,
+            "seed": 42,
+        }
+        # model and training data are being logged automatically
+        model = lgb.train(
+            params,
+            train_set,
+            num_boost_round=10,
+            valid_sets=[train_set],
+            valid_names=["train"],
+        )
+
+        # evaluate model
+        y_proba = model.predict(X_test)
+        y_pred = y_proba.argmax(axis=1)
+        loss = log_loss(y_test, y_proba)
+        acc = accuracy_score(y_test, y_pred)
+
+        # log metrics
+        mlflow.log_metrics({"log_loss": loss, "accuracy": acc})
+
+
+# simple mlflow example of xgb logging
+def xgb_run():
+    # prepare train and test data
+    iris = datasets.load_iris()
+    x = iris.data
+    y = iris.target
+    x_train, x_test, y_train, y_test = train_test_split(
+        x, y, test_size=0.2, random_state=42
+    )
+
+    # enable auto logging
+    mlflow.xgboost.autolog()
+
+    dtrain = xgb.DMatrix(x_train, label=y_train)
+    dtest = xgb.DMatrix(x_test, label=y_test)
+
+    with mlflow.start_run():
+        # train model
+        params = {
+            "objective": "multi:softprob",
+            "num_class": 3,
+            "learning_rate": 0.3,
+            "eval_metric": "mlogloss",
+            "colsample_bytree": 1.0,
+            "subsample": 1.0,
+            "seed": 42,
+        }
+        # model and training data are being logged automatically
+        model = xgb.train(params, dtrain, evals=[(dtrain, "train")])
+        # evaluate model
+        y_proba = model.predict(dtest)
+        y_pred = y_proba.argmax(axis=1)
+        loss = log_loss(y_test, y_proba)
+        acc = accuracy_score(y_test, y_pred)
+        # log metrics
+        mlflow.log_metrics({"log_loss": loss, "accuracy": acc})
+
+
+@pytest.mark.parametrize("handler", ["xgb_run", "lgb_run"])
+def test_track_run_with_experiment_name(handler):
+    """
+    This test is for tracking a run logged by mlflow into mlrun while it's running using the experiment name.
+    first activate the tracking option in mlconf, then we name the mlflow experiment,
+    then we run some code that is being logged by mlflow using mlrun,
+    and finally compare the mlrun we tracked with the original mlflow run using the validate func
+    """
+    # Enable general tracking
+    mlrun.mlconf.external_platform_tracking.enabled = True
+    # Set the mlflow experiment name
+    mlflow.environment_variables.MLFLOW_EXPERIMENT_NAME.set(f"{handler}_test_track")
+    with tempfile.TemporaryDirectory() as test_directory:
+        # Use SQLite backend instead of filesystem (filesystem will be deprecated in Feb 2026)
+        db_uri = f"sqlite:///{os.path.join(test_directory, 'mlflow.db')}"
+        mlflow.set_tracking_uri(db_uri)  # Tell mlflow where to save logged data
+
+        # Create a project for this tester:
+        project = mlrun.get_or_create_project(name="default", context=test_directory)
+
+        # Create a MLRun function using the tester source file (all the functions must be located in it):
+        func = project.set_function(
+            func=__file__,
+            name=f"{handler}-test",
+            kind="job",
+            image="mlrun/mlrun",
+            requirements=["mlflow"],
+        )
+        # mlflow creates a dir to log the run, this makes it in the tmpdir we create
+        trainer_run = func.run(
+            local=True,
+            handler=handler,
+            output_path=test_directory,
+        )
+
+        # Find the MLflow logged model and prepare it for serving
+        # Note: In MLflow 2.24+, we must dynamically discover model paths since MLflow changed
+        # its directory structure from predictable paths (e.g., experiment_name/0/model/) to
+        # UUID-based paths (e.g., experiment_id/run_uuid/artifacts/model/).
+
+        # Create MLflow client to query the tracking server
+        mlflow_client = mlflow.tracking.MlflowClient(tracking_uri=db_uri)
+
+        # Get the experiment by name to obtain its ID
+        experiment = mlflow_client.get_experiment_by_name(f"{handler}_test_track")
+
+        # Search for runs in this experiment and get the run ID
+        # (There should only be one run from our training above)
+        run_id = mlflow_client.search_runs(experiment_ids=[experiment.experiment_id])[0].info.run_id
+
+        # Find all models logged in this run
+        logged_models = mlflow.search_logged_models(filter_string=f"source_run_id = '{run_id}'")
+
+        # Extract the artifact location and remove the "file://" prefix
+        model_artifacts_dir = logged_models["artifact_location"].tolist()[0].replace("file://", "")
+
+        # Package the model artifacts as a zip file for MLFlowModelServer
+        # Note: MLFlowModelServer requires models to be packaged as zip archives
+        # rather than loose directories for deployment
+        model_path = os.path.join(test_directory, f"{handler}-model-serving")
+        os.makedirs(model_path, exist_ok=True)
+        shutil.make_archive(os.path.join(model_path, "model"), 'zip', model_artifacts_dir)
+
+        serving_func = project.set_function(
+            func=os.path.abspath("function.yaml"),
+            name=f"{handler}-server",
+        )
+        model_name = f"{handler}-model"
+        # Add the model
+        serving_func.add_model(
+            model_name,
+            class_name="MLFlowModelServer",
+            model_path=model_path,
+        )
+
+        # Create a mock server
+        server = serving_func.to_mock_server()
+
+        # An example taken randomly
+        result = server.test(f"/v2/models/{model_name}/predict", {"inputs": [[5.1, 3.5, 1.4, 0.2]]})
+        print(result)
+        assert result
+        # unset mlflow experiment name to default
+        mlflow.environment_variables.MLFLOW_EXPERIMENT_NAME.unset()
+
+
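Reviewer note on the packaging step in the test: it strips the `file://` prefix with `str.replace` and then zips the artifact directory. `str.replace` does not decode percent-escapes (for example `%20` for a space), so a location containing such characters would not map to a real directory. A minimal standard-library sketch of the same two steps follows, assuming the artifact location is either a plain local path or a `file://` URI. Both helper names are hypothetical and are not part of this commit, MLflow, or MLRun.

```python
import os
import shutil
from urllib.parse import urlparse
from urllib.request import url2pathname


def local_path_from_location(location: str) -> str:
    """Return a filesystem path for a plain path or a file:// URI."""
    parsed = urlparse(location)
    if parsed.scheme == "file":
        # url2pathname also decodes percent-escapes in the URI path.
        return url2pathname(parsed.path)
    # Anything else is assumed to already be a local path.
    return location


def package_model_dir(artifacts_dir: str, output_dir: str) -> str:
    """Zip artifacts_dir into <output_dir>/model.zip and return the archive path."""
    os.makedirs(output_dir, exist_ok=True)
    return shutil.make_archive(os.path.join(output_dir, "model"), "zip", artifacts_dir)
```

In the test, `output_dir` would correspond to `model_path`, the directory later passed to `add_model`.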
