
Commit 935504d

Merge pull request #13 from mlrun/main
merge main
2 parents a3e6f76 + 07ba674 commit 935504d


5 files changed: +108 additions, −88 deletions


01_churn_ml_model.ipynb

Lines changed: 39 additions & 25 deletions
@@ -5,25 +5,37 @@
 "id": "326b3005-9a8e-4b07-9d4c-bb2e2ae5ffd9",
 "metadata": {},
 "source": [
-"# Churn Model Training + Deployment Pipeline\n",
+"# Churn Model: training and deployment pipeline\n",
 "\n",
-"The first part of this demo is to train a tabular ML model for predicting churn. This will be used later in our application pipeline to provide context to the LLM to customize the response.\n",
+"The first part of this demo is to train a tabular ML model for predicting churn. This will be used later in the application pipeline to provide context to the LLM to customize the response.\n",
 "\n",
-"This training pipeline is a standard ML training and deployment pipeline. The only \"unusual\" step is an additional pre-processing step that generates a sentiment analysis score using historical conversational data from a chat log. This sentiment score is used as a model input - we will see later that it is actually the most important feature when predicting churn.\n",
-"\n",
-"In addition to model training, this notebook demonstrates how to log datasets, run ML pipelines remotely, deploy models as serving functions, and enable model monitoring for production use. The workflow is fully automated using MLRun, making it easy to track experiments, manage artifacts, and monitor deployed models in real time.\n",
-"\n",
-"# Churn Model Training + Deployment Pipeline\n",
-"\n",
-"The first part of this demo is to train a tabular ML model for predicting churn. This will be used later in our application pipeline to provide context to the LLM to customize the response.\n",
-"\n",
-"This training pipeline is a standard ML training and deployment pipeline. The only \"unusual\" step is an additional pre-processing step that generates a sentiment analysis score using historical conversational data from a chat log. This sentiment score is used as a model input - we will see later that it is actually the most important feature when predicting churn.\n",
+"This training pipeline is a standard ML training and deployment pipeline. The only \"unusual\" step is an additional pre-processing step that generates a sentiment analysis score using historical conversational data from a chat log. This sentiment score is used as a model input - you'll see later that it is actually the most important feature when predicting churn.\n",
 "\n",
 "In addition to model training, this notebook demonstrates how to log datasets, run ML pipelines remotely, deploy models as serving functions, and enable model monitoring for production use. The workflow is fully automated using MLRun, making it easy to track experiments, manage artifacts, and monitor deployed models in real time.\n",
 "\n",
 "![](images/01_churn_ml_model_architecture.png)"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "6fdd7bba",
+"metadata": {},
+"source": [
+"## Table of contents\n",
+" 1. [Install MLRun](#install-mlrun)\n",
+" 2. [Set up the project](#set-up-the-project)\n",
+" 2. [Log the dataset](#log-the-dataset)\n",
+" 2. [Run the pipeline](#run-the-pipeline)\n",
+" 2. [View the output training artifacts](#view-the-output-training-artifacts)\n",
+" 2. [Test the model endpoint](#test-the-model-endpoint)\n",
+" 2. [Model monitoring](#model-monitoring)\n",
+" \n",
+" (install-mlrun)=\n",
+" ## Install MLRun and set up the environment\n",
+"\n",
+" First import mlrun and other required packages:"
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
@@ -51,13 +63,13 @@
 "id": "3a1ca344-b7fd-47a7-86b7-575d352851a4",
 "metadata": {},
 "source": [
-"### Setup Project\n",
+"### Set up the project\n",
 "\n",
-"First, create and populate the MLRun project. Under the hood, this will execute [project_setup.py](project_setup.py) to add our MLRun functions, workflows, and build the project image - see the [documentation](https://docs.mlrun.org/en/stable/projects/project-setup.html) for more information.\n",
+"Now, create and populate the MLRun project. Under the hood, this executes [project_setup.py](project_setup.py) to add our MLRun functions, workflows, and build the project image. See the [documentation](https://docs.mlrun.org/en/stable/projects/project-setup.html) for more information.\n",
 "\n",
 "Make sure to enable `force_build` on the first run to build the project image. After the initial setup, you can disable `force_build` to speed up subsequent runs. This setup ensures all dependencies and source code are packaged for reproducible and scalable ML workflows.\n",
 "\n",
-"Set the OpenAI credentials in the project and local environment - **be sure to update [.env.example](.env.example) as described in the [README](README.md)**."
+"Set the OpenAI credentials in the project and local environment - **be sure to update [.env.example](.env.example) as described in the [README](README.md#prerequisites)**."
 ]
 },
 {
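A minimal sketch of what the project-setup cell might look like; the project name, the `build_image` parameter key, and the secret handling are assumptions based on the surrounding text, not the demo's exact code:

```python
import os
import mlrun

# Load OpenAI credentials from the local .env file (see the README prerequisites).
mlrun.set_env_from_file(".env")

# Create (or load) the project; project_setup.py runs under the hood and receives
# the parameters dict. The project name and parameter key are illustrative.
project = mlrun.get_or_create_project(
    name="churn-demo",
    context="./",
    parameters={"build_image": True},  # set False after the image has been built once
)

# Make the OpenAI credentials available to the project's runtime functions.
project.set_secrets(secrets={"OPENAI_API_KEY": os.environ["OPENAI_API_KEY"]})
```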
@@ -103,7 +115,7 @@
 "id": "43af04dc",
 "metadata": {},
 "source": [
-"*Please note that you will need to set the `\"build_image\": True` once to build the default image for the project. After you successfully built the image, set the `\"build_image\": False\"` to continue.*"
+"**Set `\"build_image\": True` once to build the default image for the project. After you have successfully built the image, set `\"build_image\": False` to continue.**"
 ]
 },
 {
@@ -133,7 +145,7 @@
 "id": "6c1e6f8a-b144-4311-bdf4-7a5fee3e6bd9",
 "metadata": {},
 "source": [
-"Enables project model monitoring and deploys required infrastructure - see [documentation](https://docs.mlrun.org/en/stable/tutorials/05-model-monitoring.html#realtime-monitor-drift-tutor) for more information. Only needs to be run once."
+"Enable model monitoring on the project and deploy the required infrastructure. See the [documentation](https://docs.mlrun.org/en/stable/tutorials/05-model-monitoring.html#realtime-monitor-drift-tutor) for more information. This only needs to be run once."
 ]
 },
 {
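A sketch of the one-time enablement call, assuming a recent MLRun version where `enable_model_monitoring` is exposed on the project object; the `base_period` value is illustrative:

```python
# One-time step: deploy the model-monitoring controller and stream infrastructure
# for this project. base_period is the controller interval in minutes (illustrative).
project.enable_model_monitoring(base_period=10)
```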
@@ -154,7 +166,7 @@
 "id": "7e50bd0d-096f-4322-b509-31b3ac358782",
 "metadata": {},
 "source": [
-"### Log Dataset"
+"### Log the dataset"
 ]
 },
 {
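A sketch of what the dataset-logging cell might look like; the CSV path and artifact key are assumptions for illustration:

```python
import pandas as pd

# Read the raw churn data and register it as a project artifact so downstream
# workflow steps can reference it by key/URI (path and key are illustrative).
df = pd.read_csv("data/customer_churn.csv")
dataset = project.log_dataset(key="churn-dataset", df=df, format="parquet")
print(dataset.uri)
```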
@@ -186,15 +198,17 @@
 "id": "a813c814-e7e1-45b8-b991-72579efe8eda",
 "metadata": {},
 "source": [
-"### Run Pipeline"
+"### Run the pipeline"
 ]
 },
 {
 "cell_type": "markdown",
 "id": "cab7d535-244d-4f21-9e1e-61310afed118",
 "metadata": {},
 "source": [
-"Submits [train_and_deploy_workflow.py](src/workflows/train_and_deploy_workflow.py) via Kubeflow Pipelines. ***Note**: Requires minimum 6 CPU to run (see `sentiment_fn` step).*"
+"This step submits the [train_and_deploy_workflow.py](src/workflows/train_and_deploy_workflow.py) via Kubeflow Pipelines.\n",
+"\n",
+"This requires a minimum of 6 CPUs to run (see `sentiment_fn` step in train_and_deploy_workflow.py)."
 ]
 },
 {
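A hedged sketch of submitting the workflow from the notebook; the workflow name and argument names are assumptions (the real ones are registered by project_setup.py):

```python
# Submit the training-and-deployment workflow to the remote Kubeflow Pipelines engine.
run_id = project.run(
    name="main",                             # workflow name registered in project_setup.py (assumed)
    arguments={"dataset": "churn-dataset"},  # illustrative argument
    watch=True,                              # stream pipeline status until completion
    dirty=True,                              # allow running with uncommitted local changes
)
```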
@@ -517,13 +531,13 @@
 "id": "8e4207f4-4fe7-474d-b5d0-761da5a3226e",
 "metadata": {},
 "source": [
-"### View Output Training Artifacts\n",
+"### View the output training artifacts\n",
 "\n",
 "View the training artifacts in the MLRun UI like so:\n",
 "\n",
 "![](images/churn_pipeline.png)\n",
 "\n",
-"Or pull the artifacts directly into your notebook for additional exploration like below - see nore information about using data & artifacts in the [documentation](https://docs.mlrun.org/en/stable/concepts/data.html)."
+"Alternatively, you can pull the artifacts directly into your notebook for additional exploration like below. See more information about using data and artifacts in the [documentation](https://docs.mlrun.org/en/stable/concepts/data.html)."
 ]
 },
 {
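A sketch of pulling a training artifact back into the notebook; the artifact key is illustrative:

```python
# Fetch an artifact produced by the training step and load it as a DataFrame.
test_set = project.get_artifact("train_test_set").to_dataitem().as_df()
test_set.head()
```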
@@ -926,15 +940,15 @@
 "id": "0dfeee46-4f82-4277-aad1-3cc5583ca8f3",
 "metadata": {},
 "source": [
-"### Test Model Endpoint"
+"### Test the model endpoint"
 ]
 },
 {
 "cell_type": "markdown",
 "id": "9c4d263a-297a-4d2d-b8f4-03e5790e0789",
 "metadata": {},
 "source": [
-"Test newly deployed real-time endpoint using test data."
+"Test the newly deployed real-time endpoint using test data."
 ]
 },
 {
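A sketch of invoking the deployed serving function with a test record; the function name, model route, and feature values are illustrative:

```python
# Grab the deployed real-time serving function and send it a test record.
serving_fn = project.get_function("churn-serving")        # assumed function name
response = serving_fn.invoke(
    path="/v2/models/churn_model/infer",                  # assumed model route
    body={"inputs": [[0.42, 12, 1, 0, 79.85, 942.3]]},    # illustrative feature row
)
print(response)
```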
@@ -998,9 +1012,9 @@
 "id": "e536babe-e987-4b33-a93c-243fd8abeb43",
 "metadata": {},
 "source": [
-"### Model Monitoring\n",
+"### Model monitoring\n",
 "\n",
-"Once the churn model is deployed and invoked, you will be able to view the model monitoring results in the MLRun UI:\n",
+"Once the churn model is deployed and invoked, you can view the model monitoring results in the MLRun UI:<br>\n",
 "![](images/tabular_model_monitoring.png)"
 ]
 }

02_guardrail_deployment.ipynb

Lines changed: 11 additions & 11 deletions
@@ -5,9 +5,9 @@
 "id": "7bd0dfc6-4ed5-40f1-ab93-9fcb72fdb3b6",
 "metadata": {},
 "source": [
-"# Guardrail Deployment\n",
+"# Guardrail deployment\n",
 "\n",
-"The second part of the demo is to deploy guardrails to be used later in our application pipeline to filter user inputs. This notebook will also deploy an LLM as a Judge monitoring application to monitor our generative input guardrail for banking topic adherence.\n",
+"The second part of the demo is to deploy guardrails to be used later in the application pipeline to filter user inputs. This notebook will also deploy an LLM as a Judge monitoring application to monitor our generative input guardrail for banking topic adherence.\n",
 "\n",
 "In this notebook, you will:\n",
 "- Set up and configure project secrets and environment variables.\n",
@@ -51,7 +51,7 @@
 "id": "a763105c-c7b4-4fa4-b719-a371fd897f51",
 "metadata": {},
 "source": [
-"### Setup Project\n",
+"### Set up the project\n",
 "\n",
 "Load the previously created project in the first notebook."
 ]
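A minimal sketch of loading the existing project; the project name must match whatever was used in the first notebook ("churn-demo" is illustrative):

```python
import mlrun

# Loads the project if it already exists; creates it otherwise.
project = mlrun.get_or_create_project(name="churn-demo", context="./")
```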
@@ -71,15 +71,15 @@
 "id": "0e8481d6-7361-4bfa-b705-0a5765399f48",
 "metadata": {},
 "source": [
-"### LLM as a Judge Monitoring Application"
+"### LLM as a judge monitoring application"
 ]
 },
 {
 "cell_type": "markdown",
 "id": "d88f46ba",
 "metadata": {},
 "source": [
-"The \"LLM as a Judge\" monitoring application leverages a large language model (LLM) to automatically evaluate and score the effectiveness of deployed guardrails. By providing a rubric and clear examples, the LLM acts as an impartial evaluator, determining whether user inputs are correctly classified according to defined criteria (e.g., banking-topic relevance). This approach enables scalable, consistent, and automated assessment of guardrail performance, ensuring that only appropriate and relevant inputs are processed by downstream applications.\n",
+"The \"LLM as a judge\" monitoring application leverages a large language model (LLM) to automatically evaluate and score the effectiveness of deployed guardrails. By providing a rubric and clear examples, the LLM acts as an impartial evaluator, determining whether user inputs are correctly classified according to defined criteria (e.g., banking-topic relevance). This approach enables scalable, consistent, and automated assessment of guardrail performance, ensuring that only appropriate and relevant inputs are processed by downstream applications.\n",
 "\n",
 "This implementation is pulled from another [MLRun demo - LLM monitoring and feedback loop: Banking](https://github.com/mlrun/demo-monitoring-and-feedback-loop/tree/main)."
 ]
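To make the rubric-plus-examples idea concrete, here is a hedged local sketch of a judge call using the OpenAI chat API; the model name, rubric wording, and 1-5 scale are assumptions, and the actual demo packages this logic as an MLRun monitoring application (see the linked repo):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JUDGE_RUBRIC = (
    "You are an impartial judge. Given a user message and the guardrail's verdict "
    "(true = banking-related, false = not banking-related), score the verdict from "
    "1 (clearly wrong) to 5 (clearly correct) and briefly explain why."
)

def judge_guardrail(user_message: str, guardrail_verdict: bool) -> str:
    """Grade a single guardrail decision with the judge LLM (illustrative)."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system", "content": JUDGE_RUBRIC},
            {"role": "user", "content": f"Message: {user_message}\nVerdict: {guardrail_verdict}"},
        ],
    )
    return response.choices[0].message.content

print(judge_guardrail("What is my checking account balance?", True))
```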
@@ -121,7 +121,7 @@
 "id": "a1566010",
 "metadata": {},
 "source": [
-"More information about model monitoring applications in the context of LLM's can be found in the [documentation](https://docs.mlrun.org/en/stable/tutorials/genai-02-model-monitor-llm.html#genai-02-mm-llm)."
+"More information about model monitoring applications in the context of LLMs can be found in the [documentation's Model monitoring using LLM tutorial](https://docs.mlrun.org/en/stable/tutorials/genai-02-model-monitor-llm.html)."
 ]
 },
 {
@@ -187,9 +187,9 @@
 "id": "73812b75-e859-43c4-8ed1-3bd88b2d1f28",
 "metadata": {},
 "source": [
-"### Banking Topic Guardrail\n",
+"### Banking topic guardrail\n",
 "\n",
-"The Banking Topic Guardrail is an LLM-powered filter designed to ensure that only banking-related user inputs are processed by downstream applications. It acts as a first line of defense, automatically classifying each user message as either relevant (`True`) or irrelevant (`False`) to banking topics, based on the context of the entire conversation.\n",
+"The Banking topic guardrail is an LLM-powered filter designed to ensure that only banking-related user inputs are processed by downstream applications. It acts as a first line of defense, automatically classifying each user message as either relevant (`True`) or irrelevant (`False`) to banking topics, based on the context of the entire conversation.\n",
 "\n",
 "It's important to distinguish between the guardrail itself (this component), which enforces topic adherence in real time within the application, and the monitoring application described above. The monitoring application uses an LLM as a \"judge\" to independently evaluate and score the effectiveness of this guardrail, providing oversight and ensuring that the guardrail is functioning as intended. This separation allows for both proactive filtering and ongoing quality assurance of user input handling."
 ]
@@ -202,7 +202,7 @@
 "outputs": [],
 "source": [
 "SYSTEM_PROMPT_GUARDRAILS_V2 = \"\"\"\n",
-"You are input guardrails for an AI banking agent that responds exclusively to questions pertaining to banking topics. Respond only with a boolean true/false value on whether the input adheres to banking topics. Consider the most recent input in the context of the whole conversation. Do not include any pre or post amble.\n",
+"You are input-guardrails for an AI banking agent that responds exclusively to questions pertaining to banking topics. Respond only with a boolean true/false value on whether the input adheres to banking topics. Consider the most recent input in the context of the whole conversation. Do not include any pre or post amble.\n",
 "\n",
 "Examples:\n",
 "Q: What is the process to apply for a mortgage?\n",
@@ -467,9 +467,9 @@
 "id": "ac63e4d8-e9c7-487f-b4a8-6f410c0df1cb",
 "metadata": {},
 "source": [
-"### Toxicity Filter Guardrail\n",
+"### Toxicity filter guardrail\n",
 "\n",
-"The Toxicity Filter Guardrail is designed to automatically detect and filter out user inputs that contain toxic, offensive, or inappropriate language. By leveraging a toxicity classification model, this guardrail ensures that only safe and respectful messages are processed by downstream applications. This helps maintain a positive user experience and protects the system from harmful or disruptive content. The toxicity filter can be customized with a threshold to determine the sensitivity of the filter, allowing for flexible adaptation to different application requirements.\n",
+"The Toxicity filter guardrail is designed to automatically detect and filter out user inputs that contain toxic, offensive, or inappropriate language. By leveraging a toxicity classification model, this guardrail ensures that only safe and respectful messages are processed by downstream applications. This helps maintain a positive user experience and protects the system from harmful or disruptive content. The toxicity filter can be customized with a threshold to determine the sensitivity of the filter, allowing for flexible adaptation to different application requirements.\n",
 "\n",
 "The output of the toxicity guardrail is a boolean value (`True` or `False`). A result of `True` means the input passes the guardrail (i.e., is non-toxic and allowed through), while `False` indicates the input is flagged as toxic and is blocked from further processing."
 ]
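A hedged sketch of a thresholded toxicity check using a Hugging Face text-classification pipeline; the model name and label scheme are assumptions, not necessarily what the demo deploys:

```python
from transformers import pipeline

# Illustrative toxicity classifier (model choice is an assumption).
toxicity_classifier = pipeline("text-classification", model="unitary/toxic-bert")

def passes_toxicity_guardrail(text: str, threshold: float = 0.7) -> bool:
    """Return True if the input is allowed through (i.e., not flagged as toxic)."""
    result = toxicity_classifier(text)[0]
    is_toxic = result["label"].lower() == "toxic" and result["score"] >= threshold
    return not is_toxic

print(passes_toxicity_guardrail("How do I reset my online banking password?"))
```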
