
Intel® Unnati Industrial Training 2025 - Bug Detection and Fixing

🔗 See how our model performs: Buggy Code Fixer 2.0 on Hugging Face (https://huggingface.co/spaces/Eleuther/buggy-code-fixer_2.0)

Problem Statement 1: Bug Detection and Fixing

Project Overview

This project focuses on bug detection and fixing using a fine-tuned LLaMA 3.1 8B model. It was developed as part of the Intel® Unnati Industrial Training 2025 program.

Model Fine-Tuning

We fine-tuned the LLaMA 3.1 8B model using (see the setup sketch after this list):

  • LoRA (Low-Rank Adaptation) and QLoRA (Quantized LoRA) techniques
  • Transformer models from HuggingFace
  • Unsloth library for efficient fine-tuning
  • Google Colab with free T4 GPU
  • Quantization to optimize model size and performance
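
As a rough illustration (not the exact notebook code), a QLoRA setup with Unsloth looks like the sketch below; the checkpoint name, rank (r), alpha, and target modules used in intell_buggy_to_fix.ipynb may differ.

```python
from unsloth import FastLanguageModel

# Load the base model with 4-bit quantized weights (QLoRA-style).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B",  # assumed checkpoint
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; only these low-rank matrices are trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```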

LoRA Model Components

The lora_model folder contains the following key files for the fine-tuned model:

  • adapter_config.json: Configuration file specifying LoRA parameters like rank (r), alpha, and target modules
  • adapter_model.safetensors: Contains the trained LoRA adapter weights in a safe tensor format
  • tokenizer_config.json & tokenizer.json: Tokenizer configuration and vocabulary files
  • special_tokens_map.json: Defines special tokens used during training

These files work together to enable:

  • Efficient storage of only the adapter weights (typically <1% of full model size)
  • Safe loading of the fine-tuned components
  • Seamless integration with the base LLaMA model (see the loading sketch below)
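
For example, the adapter can be attached to a base model with the PEFT library. A minimal sketch, assuming the base checkpoint is the stock LLaMA 3.1 8B (the adapter must match whatever base model it was trained on):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model (assumed checkpoint shown here).
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("lora_model")

# Reads adapter_config.json and loads adapter_model.safetensors.
model = PeftModel.from_pretrained(base, "lora_model")
```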

The fine-tuning process is documented in intell_buggy_to_fix.ipynb, which includes:

  • Dataset downloading and preprocessing
  • Model fine-tuning implementation

Dataset

All datasets used for training are available in the Data_sets folder, originally sourced from the freely downloadable HumanEvalPack datasets on HuggingFace. The model is trained on multiple programming languages: Python, JavaScript, Java, Rust, and Go. The datasets contain buggy/fixed code pairs for each language:

  • humanevalpack_python.csv
  • humanevalpack_js.csv
  • humanevalpack_java.csv
  • humanevalpack_rust.csv
  • humanevalpack_go.csv

These datasets are combined and formatted into a single JSON file (formatted_datasetfordataset_buggy_fixed_code_dataset.json), which is used to fine-tune the model; each entry contains a 'buggy' and a 'fixed' version of a code snippet.
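
A minimal sketch of that combining step, assuming each CSV exposes 'buggy' and 'fixed' columns (check the actual headers in Data_sets/ before running):

```python
import glob
import json

import pandas as pd

pairs = []
for path in sorted(glob.glob("Data_sets/humanevalpack_*.csv")):
    df = pd.read_csv(path)
    # Column names are an assumption based on the description above.
    pairs.extend(
        {"buggy": row["buggy"], "fixed": row["fixed"]}
        for _, row in df.iterrows()
    )

with open("formatted_datasetfordataset_buggy_fixed_code_dataset.json", "w") as f:
    json.dump(pairs, f, indent=2)
```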


Model Deployment

To run the fine-tuned model:

  1. Use loading_fine_tuned_model.ipynb to load the model
  2. The notebook includes Gradio deployment code (a minimal sketch follows this list)
  3. The model performs bug detection and provides fixed versions of code snippets
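
A minimal sketch of such a Gradio wrapper; fix_code here is a stand-in for the notebook's real inference function, which prompts the fine-tuned model and decodes its output:

```python
import gradio as gr

def fix_code(buggy_code: str) -> str:
    # Placeholder: the notebook builds a prompt from the input,
    # calls model.generate, and decodes the fixed code.
    prompt = f"### Buggy code:\n{buggy_code}\n### Fixed code:\n"
    return prompt  # echoed back so this sketch runs standalone

demo = gr.Interface(
    fn=fix_code,
    inputs=gr.Textbox(lines=12, label="Buggy code"),
    outputs=gr.Textbox(lines=12, label="Fixed code"),
    title="Buggy Code Fixer",
)
demo.launch()
```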

Deployment Details

We have successfully deployed our model using:

  • Gradio for creating an interactive user interface
  • Hugging Face Spaces for hosting the deployment

Sample Outputs

We have included example outputs in the Sample_outputs folder, which demonstrate:

  • Various bug detection scenarios
  • Fixed code examples
  • Multi-language support results

Getting Started

  1. Clone this repository
  2. Install required dependencies (see intell_buggy_to_fix.ipynb for package requirements)
  3. Run loading_fine_tuned_model.ipynb to test the model (an end-to-end inference sketch follows this list)
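
An end-to-end inference sketch under the same assumptions as the loading example above; the prompt template and generation settings in the notebook may differ:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Assumed base checkpoint; the adapter in lora_model/ must match it.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B", torch_dtype=torch.float16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("lora_model")
model = PeftModel.from_pretrained(base, "lora_model")

buggy = "def add(a, b):\n    return a - b"  # toy buggy snippet
prompt = f"### Buggy code:\n{buggy}\n### Fixed code:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```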

Team Members

  • Dhanush Raja – Second Year, Department of Computer Science and Business Systems
  • Koushik – Second Year, Department of Computer Science and Business Systems
