cmu-mlip-model-testing-lab

Lab 4: Model Testing with Zeno and LLMs

In this lab, you will gain hands-on experience with Zeno and LLM-based test case generation.

  • Zeno is an interactive AI evaluation platform for exploring, debugging, and sharing how your AI systems perform. Evaluate any task and data type with Zeno's modular views which support everything from chatbot conversations to object detection and audio transcription.
  • LLMs are increasingly used to generate synthetic data; one use case is generating additional test cases for a model.

To receive credit for this lab, show your work to the TA during recitation.

Deliverables

  • Successfully start a local Zeno server on the dataset provided, with metrics and model predictions
  • Create 5 slices in the Zeno interface, derive meaningful insights and showcase them to the TA
  • Write down 3 additional slices you want to create and successfully generate 10 examples for one selected slice

Hint: For each slice you create, you should be able to justify why you created it and demonstrate what you observed in it.
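To make the idea of a slice concrete outside the Zeno interface, here is a minimal sketch in plain Python: a slice is a subset of the data selected by a predicate, and a metric is computed per slice. The rows and the field names `text`, `label`, and `output` are hypothetical and are not taken from the lab dataset; Zeno itself is not used here.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]

# Hypothetical rows; "text", "label", and "output" are assumed field names,
# not the actual schema of the lab dataset.
rows: List[Row] = [
    {"text": "I love this!", "label": "positive", "output": "positive"},
    {"text": "worst day ever #fail", "label": "negative", "output": "negative"},
    {"text": "ok", "label": "neutral", "output": "positive"},
    {"text": "great game tonight #sports", "label": "positive", "output": "neutral"},
]

# Each slice is a name plus a predicate that decides whether a row belongs to it.
slices: Dict[str, Callable[[Row], bool]] = {
    "all": lambda r: True,
    "has_hashtag": lambda r: "#" in r["text"],
    "short_text": lambda r: len(r["text"]) < 10,
}


def accuracy(subset: List[Row]) -> float:
    """Fraction of rows whose model output matches the label."""
    if not subset:
        return float("nan")
    return sum(r["label"] == r["output"] for r in subset) / len(subset)


def slice_report(data: List[Row], preds: Dict[str, Callable[[Row], bool]]) -> Dict[str, dict]:
    """Compute the size and accuracy of every slice."""
    report = {}
    for name, pred in preds.items():
        subset = [r for r in data if pred(r)]
        report[name] = {"size": len(subset), "accuracy": accuracy(subset)}
    return report


report = slice_report(rows, slices)
for name, stats in report.items():
    print(f"{name}: size={stats['size']}, accuracy={stats['accuracy']:.2f}")
```

A slice is worth showing when its metric differs noticeably from the overall metric, since that gap is the observation you can justify to the TA.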

Getting started

  • Clone the starter code from this Git repository.
  • The repository includes a Python notebook that contains the starter code.

Installation instructions

  • Python 3.10 is required for the Zeno packages to run correctly
  • (Optional) Create a new virtual environment by running python -m venv mlip-lab4 in the terminal
  • Install the dependencies: pip install zenoml datasets transformers tqdm torch
  • Restart the notebook kernel after running all installation commands
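As a quick sanity check before installing, the version requirement above can be verified from a notebook cell. This is a minimal sketch; the helper name `is_supported` is hypothetical and only the standard library is used.

```python
import sys

# Major and minor version stated in the installation instructions.
REQUIRED = (3, 10)


def is_supported(version_info=sys.version_info, required=REQUIRED) -> bool:
    """Return True when the interpreter's major.minor matches the required version."""
    return tuple(version_info[:2]) == tuple(required)


current = ".".join(str(part) for part in sys.version_info[:3])
if is_supported():
    print(f"Python {current} matches the required 3.10")
else:
    print(f"Python {current} detected; the lab expects 3.10")
```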

Code-related details

  • Finish all 7 steps in the Python notebook
  • If you have trouble downloading the datasets and/or running model inference, use tweets.csv shared in the folder
  • If you have trouble starting a local Zeno server, copy the code in zenohub.py to the notebook and follow the steps
  • If you have trouble using the GPTs provided, use plain ChatGPT for test case generation
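If you fall back to plain ChatGPT, it may help to write the request for a slice in a fixed format so the generated examples are easy to copy back into the notebook. The sketch below is one possible way to do this, not the lab's prescribed method: the function names, the `|||` separator, the example slice description, and the label set are all assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def build_prompt(slice_description: str, labels: Sequence[str], n_examples: int = 10) -> str:
    """Build a prompt asking an LLM for labeled examples that belong to one slice."""
    label_list = ", ".join(labels)
    return (
        f"Generate {n_examples} realistic tweets that belong to this slice: "
        f"{slice_description}.\n"
        f"Label each tweet with one of: {label_list}.\n"
        "Return one example per line in the form: <tweet> ||| <label>"
    )


def parse_examples(response: str, labels: Sequence[str]) -> List[Tuple[str, str]]:
    """Keep only lines of the form '<tweet> ||| <label>' that use a known label."""
    examples = []
    for line in response.splitlines():
        if "|||" not in line:
            continue  # skip commentary or blank lines in the reply
        text, label = (part.strip() for part in line.rsplit("|||", 1))
        if text and label in labels:
            examples.append((text, label))
    return examples


# Hypothetical slice and label set; replace with your own.
labels = ["positive", "negative", "neutral"]
prompt = build_prompt("tweets that use sarcasm", labels)
print(prompt)

# A made-up reply, used only to show how parsing behaves.
sample_reply = "Oh great, another Monday ||| negative\nHere are your tweets:\nLove it ||| happy"
print(parse_examples(sample_reply, labels))
```

Paste the printed prompt into the chat, then pass the reply text to `parse_examples` to drop any lines that do not follow the requested format.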
