🔗 See how our model performs: Buggy Code Fixer 2.0 on Hugging Face (https://huggingface.co/spaces/Eleuther/buggy-code-fixer_2.0)
This project focuses on bug detection and fixing using a fine-tuned LLaMA 3.1 8B model. It was developed as part of the Intel® Unnati Industrial Training 2025 program.
We fine-tuned the LLaMA 3.1 8B model using:
- LoRA (Low-Rank Adaptation) and QLoRA (Quantized LoRA) techniques
- Transformer models from HuggingFace
- Unsloth library for efficient fine-tuning
- Google Colab with free T4 GPU
- Quantization to optimize model size and performance
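As a rough back-of-the-envelope check (illustrative figures, not measured numbers), 4-bit quantization is what makes an 8B-parameter model fit on a free Colab T4:

```python
# Rough memory estimate for why QLoRA fits on a free Colab T4 (16 GB VRAM).
# These are ballpark figures for the weights alone, not measured values.

PARAMS = 8e9  # LLaMA 3.1 8B


def weight_memory_gb(params: float, bits_per_param: int) -> float:
    """Memory needed just to hold the model weights, in GB."""
    return params * bits_per_param / 8 / 1e9


fp16 = weight_memory_gb(PARAMS, 16)  # half-precision baseline
int4 = weight_memory_gb(PARAMS, 4)   # 4-bit quantized base model (QLoRA)

print(f"fp16 weights: ~{fp16:.0f} GB")   # too large for a 16 GB T4 once activations are added
print(f"4-bit weights: ~{int4:.0f} GB")  # leaves headroom for LoRA adapters and optimizer state
```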
The lora_model folder contains the following key files for the fine-tuned model:
- adapter_config.json: Configuration file specifying LoRA parameters such as rank (r), alpha, and target modules
- adapter_model.safetensors: Contains the trained LoRA adapter weights in safetensors format
- tokenizer_config.json & tokenizer.json: Tokenizer configuration and vocabulary files
- special_tokens_map.json: Defines special tokens used during training
These files work together to enable:
- Efficient storage of only the adapter weights (typically <1% of full model size)
- Safe loading of the fine-tuned components
- Seamless integration with the base LLaMA model
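To illustrate the "<1% of full model size" point, here is a hypothetical size estimate for a LoRA adapter. The rank, hidden dimension, and module counts below are illustrative assumptions, not the actual values in adapter_config.json:

```python
# Illustrative estimate of LoRA adapter size relative to the base model.
# Rank, hidden size, and module counts are assumptions for this sketch.

BASE_PARAMS = 8e9              # LLaMA 3.1 8B
HIDDEN = 4096                  # hidden dimension typical of this model class
LAYERS = 32
TARGET_MODULES_PER_LAYER = 4   # e.g. q/k/v/o projections (assumed)
RANK = 16                      # LoRA rank r (assumed)

# Each adapted weight matrix W (d x d) gets two low-rank factors,
# A (r x d) and B (d x r), adding 2 * d * r parameters per target module.
adapter_params = LAYERS * TARGET_MODULES_PER_LAYER * 2 * HIDDEN * RANK
fraction = adapter_params / BASE_PARAMS

print(f"adapter params: {adapter_params:,}")
print(f"fraction of base model: {fraction:.4%}")  # well under 1%
```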
The fine-tuning process is documented in intell_buggy_to_fix.ipynb, which includes:
- Dataset downloading and preprocessing
- Model fine-tuning implementation
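The preprocessing step renders each buggy/fixed pair into a single training string. The sketch below assumes an Alpaca-style instruction template; the exact template used in intell_buggy_to_fix.ipynb may differ:

```python
# Minimal sketch of prompt formatting for instruction fine-tuning on
# buggy/fixed pairs. The template is an illustrative assumption.

TEMPLATE = (
    "### Instruction:\nFix the bugs in the following code.\n\n"
    "### Input:\n{buggy}\n\n"
    "### Response:\n{fixed}"
)


def format_example(buggy: str, fixed: str) -> str:
    """Render one buggy/fixed pair into a single training string."""
    return TEMPLATE.format(buggy=buggy, fixed=fixed)


sample = format_example("def add(a, b): return a - b",
                        "def add(a, b): return a + b")
print(sample)
```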
All datasets used for training are available in the Data_sets folder, originally sourced from HuggingFace datasets (free to download). The model is trained on multiple programming languages, with buggy/fixed code pairs for Python, JavaScript, Java, Rust, and Go:
- humanevalpack_python.csv
- humanevalpack_js.csv
- humanevalpack_java.csv
- humanevalpack_rust.csv
- humanevalpack_go.csv
These datasets are combined and formatted into a single JSON file (formatted_datasetfordataset_buggy_fixed_code_dataset.json), which is used to fine-tune the model. Each entry contains 'buggy' and 'fixed' versions of a code snippet.
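The combination step can be sketched as follows. The column names ("buggy", "fixed") and the in-memory CSV strings are assumptions for illustration; the real notebook reads the actual files from the Data_sets folder:

```python
# Sketch of merging the per-language CSVs into one JSON training file.
# Column names and the inline CSV samples below are assumptions;
# in practice you would open the files in the Data_sets folder.
import csv
import io
import json


def combine_to_json(csv_texts):
    """Collect buggy/fixed pairs from several CSV sources into one list."""
    entries = []
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            entries.append({"buggy": row["buggy"], "fixed": row["fixed"]})
    return entries


python_csv = 'buggy,fixed\n"print \'hi\'","print(\'hi\')"\n'
js_csv = 'buggy,fixed\n"cosnole.log(1)","console.log(1)"\n'

dataset = combine_to_json([python_csv, js_csv])
with open("combined_dataset.json", "w") as f:  # stand-in for the real output filename
    json.dump(dataset, f, indent=2)
print(len(dataset), "entries")
```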
Sample outputs of the model are available in the Sample_outputs folder.
To run the fine-tuned model:
- Use loading_fine_tuned_model.ipynb to load the model
- The notebook includes Gradio deployment code
- The model performs bug detection and provides fixed versions of code snippets
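When parsing the model's answer, it helps to separate the fixed code from the surrounding prompt text. This helper is a hypothetical sketch: the "### Response:" marker is an assumption about the prompt template, so adapt it to whatever format loading_fine_tuned_model.ipynb actually uses:

```python
# Small helper for pulling the fixed code out of a raw model response.
# The "### Response:" marker is an assumed template convention.

def extract_fix(response: str, marker: str = "### Response:") -> str:
    """Return everything after the response marker, stripped of whitespace."""
    if marker in response:
        return response.split(marker, 1)[1].strip()
    return response.strip()  # fall back to the raw text if no marker is found


raw = "### Instruction:\nFix...\n### Response:\ndef add(a, b): return a + b"
print(extract_fix(raw))
```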
We have successfully deployed our model using:
- Gradio for creating an interactive user interface
- Hugging Face Spaces for hosting the deployment
We have included example outputs in the Sample_outputs folder, which demonstrate:
- Various bug detection scenarios
- Fixed code examples
- Multi-language support results
- Clone this repository
- Install required dependencies (see intell_buggy_to_fix.ipynb for package requirements)
- Run loading_fine_tuned_model.ipynb to test the model
- Dhanush Raja – Second Year, Department of Computer Science and Business Systems
- Koushik – Second Year, Department of Computer Science and Business Systems