Title
Less Confusion in Diffusion
Leaders
Nancy Newlin @nancynewlin-masi, Elias Levy @eliasxlevy
Collaborators
Karthik Ramadass @KarthikMasi, Gaurav Rudravaram @GauravR1206, Andre Hucke @AndreHucke
Project Description
The goal of “Less Confusion in Diffusion” is to develop an LLM-based tool that identifies issues in diffusion weighted images (DWI), such as eddy current distortions, significant motion, poor resolution, an insufficient number of b-vector directions, missing slices, or the top of the brain falling outside the field of view (FOV), and recommends solutions. Working with DWI can be tricky (especially if you don’t have a diffusion imaging expert on call!), and there is a wide range of distortions, artifacts, and noise that need to be corrected. The proposed tool is designed for people getting started in DWI. This project is the first step toward a tool that gives advice based on data acquisition and quality.
Contributors to this project will gain experience in 1) using HuggingFace, 2) fine-tuning LLMs, 3) debugging LLMs, and 4) creating a user interface.
This project requires GPUs for model training. However, anyone is welcome to join the discussions and help develop the research plan.
Link to project repository/sources
https://github.com/nancynewlin-masi/LessConfusionInDiffusion
Concrete Goals with Specific Tasks for Brainhack Vanderbilt 2025
- Get a Hugging Face account
- Dataset curation: There is a directory of diffusion image slices. Have an expert create their associated text labels (what should the completion look like?).
- Data visualization: How many samples are there of each type? What do the responses look like? This is the time to look at the data and understand what each of the expected cases is.
- Model selection: Which Hugging Face models are appropriate for this image-to-text task?
- Test run model: Run the model as-is on a test dataset (hopefully provided by the Hugging Face project). At this stage, we need to make sure the model acts as expected (inputs, outputs).
- Dataloading: Set up a dataloader to get slices and labels from the directory and properly interface with the chosen model.
- Test: Try the model on a few samples and observe the behaviour!
- Documentation: Input/Output description, open problems, how to use
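The data visualization and dataloading steps above could start from something like the following minimal sketch. It assumes a hypothetical layout that the project has not fixed: PNG slices in one directory plus a `labels.csv` with the columns `filename`, `issue`, and `completion`. The class name `SliceLabelDataset` and the `load_image` hook are also illustrative. The class only implements `__len__` and `__getitem__`, which is the map-style interface a PyTorch `DataLoader` expects, so it should be possible to wrap it once an image loader is chosen.

```python
import csv
from collections import Counter
from pathlib import Path


class SliceLabelDataset:
    """Pairs each image slice in a directory with its expert-written text label.

    Assumed (hypothetical) layout: slices are files in `slice_dir`, and
    `labels_csv` has the columns `filename`, `issue`, and `completion`.
    """

    def __init__(self, slice_dir, labels_csv, load_image=None):
        self.slice_dir = Path(slice_dir)
        # load_image turns a path into model input; by default the path itself is returned.
        self.load_image = load_image or (lambda path: path)
        with open(labels_csv, newline="") as handle:
            rows = list(csv.DictReader(handle))
        # Keep only rows whose slice file actually exists on disk.
        self.rows = [r for r in rows if (self.slice_dir / r["filename"]).is_file()]

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, index):
        row = self.rows[index]
        image = self.load_image(self.slice_dir / row["filename"])
        return image, row["completion"]

    def issue_counts(self):
        """Count how many samples there are of each issue type."""
        return Counter(row["issue"] for row in self.rows)
```

Calling `issue_counts()` before any training gives a quick view of class balance, and indexing a few items shows what the expected completions look like.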
Advanced: Joint embedding of text and image inputs (e.g., “here is my image and I have a b-vector file with 100 directions and b-values ranging from 0 to 2000”)
Extra 1: Improve on response quality (more conversational, more information provided).
Extra 2: User interface: Set up a local server that can take an image slice as input, and provide a response.
Extra 3: Upload model to Hugging Face!
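For the advanced goal, one possible way to produce the kind of text input quoted there is to summarize the gradient files directly. The sketch below assumes FSL-style files (a b-value file with one row of values and a b-vector file with three rows of components). The function names, the `b0_threshold` of 50, the rounding to three decimals, and the prompt wording are all illustrative assumptions rather than project decisions.

```python
def read_rows(path):
    """Read a whitespace-separated text file into a list of float rows."""
    with open(path) as handle:
        return [[float(v) for v in line.split()] for line in handle if line.strip()]


def summarize_gradients(bval_path, bvec_path, b0_threshold=50.0):
    """Count unique diffusion directions and report the b-value range."""
    bvals = read_rows(bval_path)[0]
    bvecs = read_rows(bvec_path)
    if len(bvecs) != 3 or any(len(row) != len(bvals) for row in bvecs):
        raise ValueError("expected 3 b-vector rows, each matching the number of b-values")
    directions = set()
    for b, x, y, z in zip(bvals, *bvecs):
        if b <= b0_threshold:
            continue  # skip non-diffusion-weighted (b0) volumes
        vec = (round(x, 3), round(y, 3), round(z, 3))
        flipped = tuple(-c for c in vec)
        # A vector and its negation describe the same diffusion direction,
        # so store one canonical representative of the pair.
        directions.add(max(vec, flipped))
    return {
        "n_volumes": len(bvals),
        "n_directions": len(directions),
        "b_min": min(bvals),
        "b_max": max(bvals),
    }


def build_prompt(summary):
    """Format the summary as a text input to accompany an image slice."""
    return (
        f"Here is my image and I have a b-vector file with "
        f"{summary['n_directions']} directions and b-values ranging "
        f"{summary['b_min']:g}-{summary['b_max']:g}."
    )
```

The same summary could also feed a simple rule-based check for an insufficient number of directions, which would give a baseline to compare the model's advice against.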
Good first issues
- Issue one: Explore Hugging Face. Find three potential models for this project and weigh the benefits and limitations of each: What size are they? What was the model pre-trained on? What are the expected inputs (pngs, npy, nii.gz) and outputs?
- Issue two: Practical experience. Get one of those models and run it on your machine as-is with a simple training dataset. Observe the training curves/losses. What’s the quality of the output?
Skills
Must haves:
- Proficient in Python
- Proficient in PyTorch
- Basic knowledge of medical images (they have headers and metadata)
- Working knowledge of machine learning principles (training, inference, data loaders)
- Able to pull/push from/to GitHub
Preferred: Experience with diffusion weighted MRI
Onboarding documentation
Add your name to CONTRIBUTING.md by committing to the repo
Get a Hugging Face account: https://huggingface.co/
Basics of the diffusion weighted MRI modality: https://radiopaedia.org/articles/diffusion-weighted-imaging-2?lang=us
Common issues and solutions with these images: https://pubmed.ncbi.nlm.nih.gov/33533094/
Downloading a model from Hugging Face: https://huggingface.co/docs/hub/models-downloading
What will participants learn?
Contributors to this project will gain experience in 1) using HuggingFace, 2) fine-tuning LLMs, 3) debugging LLMs, and 4) creating a user interface.
Public data to use
Data is currently in NIfTI format here: https://vanderbilt.box.com/s/v50gfkqzirr2pp05dgf9rs45sum3lq8h
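As a first look at a downloaded file, the image dimensions and voxel sizes can be read from the header without extra dependencies. The following is a minimal sketch that assumes the files follow the NIfTI-1 layout (a 348-byte header, optionally gzip-compressed); the function name `read_nifti1_shape` is hypothetical, and in practice a library such as nibabel or DIPY would likely be used to load the image data itself.

```python
import gzip
import struct


def read_nifti1_shape(path):
    """Return (shape, voxel_sizes) from a NIfTI-1 header (.nii or .nii.gz)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as handle:
        header = handle.read(348)
    if len(header) < 348:
        raise ValueError("file is too short to contain a NIfTI-1 header")
    # The first field (sizeof_hdr) must be 348; use it to detect byte order.
    for order in ("<", ">"):
        if struct.unpack(order + "i", header[:4])[0] == 348:
            break
    else:
        raise ValueError("not a NIfTI-1 header")
    dim = struct.unpack(order + "8h", header[40:56])      # dim[0] is the number of dimensions
    pixdim = struct.unpack(order + "8f", header[76:108])  # pixdim[1..] are the grid spacings
    ndim = dim[0]
    if not 1 <= ndim <= 7:
        raise ValueError("invalid number of dimensions in header")
    return dim[1:ndim + 1], pixdim[1:ndim + 1]
```

For a DWI series, the shape is typically four-dimensional, with the last entry giving the number of volumes; the first three voxel sizes are a quick way to spot coarse resolution before looking at any slices.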
Number of collaborators
4+
Credit to collaborators
Name listed in the README (make sure you have added your name to CONTRIBUTING.md) and co-authorship on any resulting publication or conference proceeding.
Project Summary
This project aims to help beginners in diffusion-weighted imaging (DWI) by detecting issues in DWIs and offering LLM-powered preprocessing recommendations.
Type
method_development
Development status
0_concept_no_content
Topic
diffusion, machine_learning, MR_methodologies
Tools
ANTs, DIPY, Freesurfer, HuggingFace, MRtrix, Pytorch
Programming language
documentation, Python, html_css
Modalities
DWI
Git skills
1_commit_push
Anything else?
No response
Things to do after the project is submitted and ready to review.
- Add a comment below the main post of your issue saying:
Hi @brainhack-vandy/project-monitors my project is ready!
