This repository contains a collection of notebooks for gaining insights into presentation slides collected in the NFDI4BIOIMAGE Training Material through multimodal AI models. Some goals are:
- Comparing different models and their performance on summarizing the content of presentation slides. This is not implemented through text-to-text models but rather through image-to-text (multimodal) models. As a first test PDF, I3D:bio's training material 'WhatIsOMERO.pdf' (Schmidt, C., Bortolomeazzi, M. et al., 2023) is used.
- Improving our understanding of how different types of embeddings represent the same content. For this task, some presentation slides are modified to see whether text, visual or mixed-modal embeddings perform comparably well in representing a slide's features when the slide is changed in a specific manner. The slides are adapted from the Bio-image Data Science Lectures.
- Establishing a cache that stores a text embedding, a visual embedding, a mixed embedding and the extracted text for each slide in the collection. In addition, a dataset is available that stores each slide as an image and shares a corresponding key with the embeddings in the cache dataset.
- The cached embeddings can then be used to visualize the embeddings and gain insights into the contents.
- Two different approaches were tested for this task:
  - Using the Byaldi package, see Notebook.
  - Using an OpenAI CLIP model to compare image and query embeddings, see Notebook.
- The two approaches were compared against each other, as shown in the corresponding Notebook. To further evaluate their performance on the desired task, a simple Benchmark was created.
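
The cache layout and the comparison of image and query embeddings described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual implementation: the `SlideRecord` class, its field names, the `rank_slides` helper and the toy 3-dimensional vectors are all assumptions made for this example. Real embeddings would come from the models used in the notebooks, and cosine similarity is assumed here as the comparison metric.

```python
import math
from dataclasses import dataclass


@dataclass
class SlideRecord:
    """One hypothetical cache entry; the field names are illustrative assumptions."""
    key: str                      # key shared with the slide-image dataset
    extracted_text: str
    text_embedding: list[float]
    visual_embedding: list[float]
    mixed_embedding: list[float]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_slides(query_embedding, records, kind="visual_embedding", top_k=3):
    """Return up to top_k (key, score) pairs, best match first."""
    scored = [
        (record.key, cosine_similarity(query_embedding, getattr(record, kind)))
        for record in records
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors and placeholder text stand in for real model outputs.
cache = [
    SlideRecord("slide_001", "placeholder text A",
                [1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.95, 0.05, 0.0]),
    SlideRecord("slide_002", "placeholder text B",
                [0.0, 1.0, 0.0], [0.1, 0.9, 0.0], [0.05, 0.95, 0.0]),
]

# The query vector would normally be produced by the same model as the slides.
ranking = rank_slides([1.0, 0.0, 0.0], cache, kind="visual_embedding")
for key, score in ranking:
    print(f"{key}: {score:.3f}")
```

Switching `kind` to `"text_embedding"` or `"mixed_embedding"` ranks the same slides by a different representation, which is one simple way to check whether the embedding types agree on a given query.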
The AI models used in this repository are accessed through this free service from GitHub.
Be aware that there are certain rate limits for each model!
Make sure to generate a developer key / personal access token on GitHub and set it as an environment variable. You can generate the token on the GitHub website under your user settings and then set it for your current session as follows:
Linux / macOS (bash, zsh):
export GITHUB_TOKEN="your-github-token-goes-here"
Windows PowerShell:
$Env:GITHUB_TOKEN = "your-github-token-goes-here"
Windows Command Prompt (cmd):
set GITHUB_TOKEN=your-github-token-goes-here
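
Inside a notebook, it can help to check that the token is actually visible to the Python process before calling any model. The following is a minimal sketch using only the standard library. The helper name `get_github_token` is hypothetical and not part of this repository; how the token is then passed to a model client depends on the client used in the notebooks.

```python
import os


def get_github_token(env=None):
    """Return the token stored in GITHUB_TOKEN, or raise a descriptive error.

    `env` defaults to the process environment; passing a plain dict
    makes the function easy to try out without touching real settings.
    """
    env = os.environ if env is None else env
    token = env.get("GITHUB_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "GITHUB_TOKEN is not set. Generate a personal access token on "
            "GitHub and export it in the shell that starts the notebook."
        )
    return token
```

Because the variable is set only for the current session, the notebook server has to be started from the same shell in which the token was set.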