
Fine-tuning Large Language Models (LLMs) for medical reasoning to enhance their ability to understand, analyze, and generate accurate medical information.


renaldiangsar/Medical-LLM-Fine-Tuning


LLM Fine-Tuning on Medical Reasoning Dataset

Overview

This project focuses on fine-tuning large language models (LLMs) for medical reasoning using Supervised Fine-Tuning (SFT) on the medical-o1-reasoning dataset. The goal is to improve model performance in answering medical reasoning questions accurately and reliably.

Fine-Tuned Models

We have fine-tuned the following models:

  • DeepSeek-R1-Distill-Llama-8B
  • Llama-3-8B-bnb-4bit
  • Mistral-7B-Instruct-v0.2-bnb-4bit

These models are optimized for handling complex medical reasoning tasks with improved domain-specific accuracy.


What is LLM Fine-Tuning?

Fine-tuning is the process of training a pre-trained Large Language Model (LLM) on a specific dataset to improve its performance on a targeted task. It involves:

  1. Supervised Fine-Tuning (SFT): Training the model using labeled medical reasoning datasets.
  2. Low-Rank Adaptation (LoRA) / QLoRA: Efficient fine-tuning methods that allow large models to adapt without extensive computational resources.
  3. Evaluation: Measuring perplexity, accuracy, and reasoning coherence.
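To make step 2 concrete, here is a minimal pure-Python sketch of the core LoRA idea: the frozen weight matrix W is adapted by a low-rank product B·A scaled by alpha/r, so only A and B are trained. Real implementations (e.g. the `peft` library) do this on GPU tensors inside each attention/MLP layer; the toy dimensions and nested lists below are purely illustrative.

```python
import random

def lora_update(W, A, B, alpha, r):
    """Apply a LoRA update: W' = W + (alpha / r) * (B @ A).

    W: d_out x d_in frozen weight matrix (list of lists).
    A: r x d_in and B: d_out x r are the small trainable matrices.
    """
    d_out, d_in = len(W), len(W[0])
    scale = alpha / r
    W_prime = [row[:] for row in W]
    for i in range(d_out):
        for j in range(d_in):
            delta = sum(B[i][k] * A[k][j] for k in range(r))
            W_prime[i][j] += scale * delta
    return W_prime

# Toy dimensions: a 64x64 layer adapted with rank r = 4.
d_out = d_in = 64
r, alpha = 4, 8
random.seed(0)
W = [[random.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 0.02) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]  # B starts at zero, so W' == W before training

full_params = d_out * d_in          # 4096 params if we fine-tuned the whole layer
lora_params = r * d_in + d_out * r  # 512 params trained with LoRA (rank 4)
print(full_params, lora_params)
```

The parameter count is the whole point: here LoRA trains 512 values instead of 4096 for the same layer, and the saving grows with layer size, which is what lets 7B-8B models fine-tune on a single GPU.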

Why Is Fine-Tuning Necessary?

While general-purpose LLMs are powerful, they often lack:

  • Domain-Specific Knowledge: General LLMs may lack a deep understanding of medical concepts.
  • Reasoning Accuracy: Without fine-tuning, responses can be vague or incorrect.
  • Terminology Alignment: Medical jargon and precise wording require adaptation.

Fine-tuning ensures that the models perform well in clinical reasoning, diagnosis suggestions, and evidence-based medical responses.
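For SFT, each labeled example must be rendered into a single training string the model learns to complete. The sketch below shows one plausible template combining a question, the intermediate reasoning chain, and the final answer; the exact field names and delimiters are illustrative assumptions, not the dataset's actual schema, and the clinical content is a made-up example.

```python
def format_sft_example(question, chain_of_thought, answer):
    """Render one labeled medical reasoning example as an SFT training string.

    The section markers and field names here are an assumed template; any
    real dataset's column names and preferred prompt format may differ.
    """
    return (
        "### Question:\n"
        f"{question}\n\n"
        "### Reasoning:\n"
        f"{chain_of_thought}\n\n"
        "### Response:\n"
        f"{answer}"
    )

# Hypothetical example for illustration only -- not from the dataset.
example = format_sft_example(
    question="A 55-year-old presents with crushing chest pain radiating to the left arm. Most likely diagnosis?",
    chain_of_thought="Crushing chest pain radiating to the left arm is the classic presentation of myocardial ischemia...",
    answer="Acute myocardial infarction; obtain an ECG and cardiac troponins immediately.",
)
print(example.splitlines()[0])
```

Training on strings like this teaches the model to produce the reasoning chain before the answer, which is what distinguishes reasoning-style SFT from plain question-answer tuning.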


Model Details

1. DeepSeek-R1-Distill-Llama-8B

  • A distilled version of DeepSeek-R1, designed for efficiency.
  • Optimized for reasoning tasks with reduced compute requirements.
  • Supports multi-turn conversations and structured medical queries.

2. Llama-3-8B-bnb-4bit

  • Quantized 4-bit version of Llama 3 (8B) using bitsandbytes for efficient inference.
  • Provides improved reasoning with lower latency.
  • Fine-tuned for medical decision support.
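The "bnb-4bit" suffix refers to 4-bit weight quantization via bitsandbytes. The sketch below shows the basic idea with a simplified symmetric absmax scheme: each float is mapped to an integer code in [-7, 7] plus one shared scale. bitsandbytes itself uses more sophisticated schemes (e.g. NF4 with blockwise scales), so treat this as a conceptual illustration only.

```python
def quantize_4bit(weights):
    """Simplified symmetric absmax 4-bit quantization.

    Maps each float to an integer code in [-7, 7] with one shared scale.
    This is a conceptual sketch, not the actual bitsandbytes scheme.
    """
    absmax = max(abs(w) for w in weights) or 1.0
    scale = absmax / 7.0
    codes = [max(-7, min(7, round(w / scale))) for w in weights]
    return codes, scale

def dequantize_4bit(codes, scale):
    """Recover approximate float weights from 4-bit codes."""
    return [c * scale for c in codes]

weights = [0.12, -0.05, 0.31, -0.28, 0.007, 0.19]
codes, scale = quantize_4bit(weights)
restored = dequantize_4bit(codes, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(codes)
print(max_err)  # bounded by half the quantization step (scale / 2)
```

Storing 4-bit codes instead of 16- or 32-bit floats shrinks the weights roughly 4-8x, which is what makes an 8B model fit in consumer GPU memory, at the cost of the small reconstruction error shown above.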

3. Mistral-7B-Instruct-v0.2-bnb-4bit

  • Instruction-tuned variant of Mistral 7B.
  • Optimized for instruction-following and multi-step reasoning.
  • Performs well in explanatory medical answers and logical deductions.

Dataset & Training Approach

  • Dataset Used: FreedomIntelligence/medical-o1-reasoning-SFT, a curated medical reasoning dataset including case studies and diagnostic Q&A.
  • Training Strategy: Supervised fine-tuning using LoRA for efficiency.
  • Evaluation Metrics: Perplexity, domain-specific accuracy, and human validation.
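Of the metrics above, perplexity is the one with a simple closed form: the exponential of the mean negative log-likelihood the model assigns to the reference tokens. The sketch below computes it from per-token log-probabilities; the numbers are hypothetical values chosen only to show that a model assigning higher probability to the reference text gets a lower perplexity.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood per token).

    token_logprobs: natural-log probabilities the model assigned to each
    reference token. Lower is better: the model finds the text less surprising.
    """
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Hypothetical per-token probabilities for the same answer before and after SFT.
before = [math.log(p) for p in [0.10, 0.20, 0.05, 0.15]]
after = [math.log(p) for p in [0.40, 0.55, 0.30, 0.50]]
print(round(perplexity(before), 2))
print(round(perplexity(after), 2))
```

Perplexity alone is a coarse signal of domain fit, which is why the list above pairs it with domain-specific accuracy and human validation.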

Future Work

  • Further fine-tuning with larger and more diverse datasets.
  • Incorporating Reinforcement Learning from Human Feedback (RLHF).
  • Testing performance in real-world clinical decision support.

Conclusion

Fine-tuning LLMs for medical reasoning improves their domain knowledge, logical coherence, and diagnostic accuracy. By fine-tuning on the medical-o1-reasoning dataset, this project enhances LLM reasoning capabilities, making these models more useful for healthcare applications.
