gwasiakshay/LLM-Evaluation-Prototype

🧠 LLM Evaluation Prototype

This project explores how to evaluate GenAI models on financial summarization tasks using metrics such as ROUGE. It includes a hands-on pipeline built with Hugging Face Transformers, Python, and Streamlit, and is designed to support a transition from compliance roles to AI engineering.


📌 Objectives

  • Build an end-to-end evaluation pipeline for financial text summarization
  • Compare GenAI-generated summaries against reference texts using ROUGE
  • Visualize evaluation results with Streamlit dashboards
  • Document learnings and progress toward AI engineering readiness
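
To make the ROUGE comparison in the objectives concrete, here is a minimal pure-Python sketch of ROUGE-N and ROUGE-L F1 scores. It is an illustration only, not the project's code: the project relies on the `evaluate` library (where the metric is loaded with `evaluate.load("rouge")`), which applies its own tokenization and normalization, so its scores can differ from this sketch. The function names and the two example sentences are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (a simplification of real ROUGE tooling)."""
    return text.lower().split()


def ngrams(tokens, n):
    """Count every contiguous n-gram in the token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, pred_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    if overlap == 0 or pred_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / pred_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(prediction, reference, n=1):
    """ROUGE-N F1: clipped n-gram overlap between prediction and reference."""
    pred = ngrams(tokenize(prediction), n)
    ref = ngrams(tokenize(reference), n)
    overlap = sum((pred & ref).values())  # Counter & keeps the minimum count
    return f1(overlap, sum(pred.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(prediction, reference):
    """ROUGE-L F1: based on the longest common subsequence of tokens."""
    pred, ref = tokenize(prediction), tokenize(reference)
    return f1(lcs_length(pred, ref), len(pred), len(ref))


# Hypothetical example sentences, for illustration only.
generated = "revenue rose 10% in Q3"
reference = "Q3 revenue rose 10%"
print(round(rouge_n(generated, reference, 1), 3))  # 0.889
print(round(rouge_n(generated, reference, 2), 3))  # 0.571
print(round(rouge_l(generated, reference), 3))     # 0.667
```

ROUGE-1 rewards shared words regardless of order, while ROUGE-2 and ROUGE-L drop when the word order differs, which is why the three scores diverge for the same pair.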

πŸ› οΈ Tech Stack

| Tool/Library | Purpose |
| --- | --- |
| Python | Core scripting and data handling |
| Hugging Face | GenAI model loading and inference |
| evaluate | ROUGE metric computation |
| Streamlit | Dashboard visualization |
| Git + GitHub | Version control and project tracking |
| VS Code + Jupyter | Development environment |
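
As a rough sketch of how these pieces could fit together, the loop below runs a summarizer over a small dataset, scores each summary against its reference, and collects per-document rows plus a mean score that a dashboard could display. Everything here is an assumption for illustration: `first_sentence_summarizer` stands in for a Hugging Face model, `unigram_f1` stands in for the `evaluate` ROUGE metric, and the record fields (`id`, `text`, `reference`) are hypothetical rather than the project's actual data format.

```python
from statistics import mean


def first_sentence_summarizer(text):
    """Hypothetical stand-in for a model call: returns the first sentence."""
    return text.split(".")[0].strip() + "."


def unigram_f1(prediction, reference):
    """Stand-in scorer (not ROUGE itself): F1 over unique lowercase tokens."""
    pred = set(prediction.lower().split())
    ref = set(reference.lower().split())
    overlap = len(pred & ref)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evaluate_dataset(examples, summarize, score):
    """Summarize and score each example; return per-example rows and the mean score."""
    rows = []
    for example in examples:
        summary = summarize(example["text"])
        rows.append({
            "id": example["id"],
            "summary": summary,
            "score": score(summary, example["reference"]),
        })
    average = mean([row["score"] for row in rows]) if rows else 0.0
    return rows, average


# Hypothetical records, for illustration only.
examples = [
    {"id": "a", "text": "Profit rose. Costs fell.", "reference": "profit rose."},
    {"id": "b", "text": "Shares dropped sharply. More later.", "reference": "shares fell."},
]
rows, average = evaluate_dataset(examples, first_sentence_summarizer, unigram_f1)
for row in rows:
    print(row["id"], round(row["score"], 2))
print("mean", round(average, 2))
```

Passing the summarizer and scorer in as functions keeps the loop independent of any particular model or metric, so a real model call and a real ROUGE scorer can be swapped in without changing the aggregation code.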

📂 Folder Structure

LLM-Evaluation-Prototype/
├── Mini_Project/
│   ├── scripts/          # Evaluation scripts
│   ├── data/             # Sample inputs and outputs
├── Learning_Notes/       # Daily reflections and learnings
├── README.md             # Project overview

