This repository contains code and resources for Timeline-based List-Question Answering (TLQA). TLQA focuses on questions that
- (1) require list-based answers and
- (2) incorporate temporal constraints or time periods. For example:
Question: "List all sports teams Robert Lewandowski played for from 2010 to 2020."
Answer:
- Lech Poznań (2010)
- Borussia Dortmund (2010-2014)
- FC Bayern Munich (2014-2020)
Our project explores the abilities of the Flan-T5 model to:
- Provide complete lists of answers.
- Correctly identify and align answers with time periods.
Our work investigates three main research questions:
- RQ1: How do fine-tuned generative models and few-shot prompting of generative models in a closed-book QA setting perform on TLQA?
- RQ2: How do fine-tuned generative models and few-shot prompting of generative models with retrieved top-k evidence perform on TLQA?
- RQ3: Does special handling of temporal markers (e.g., explicit time intervals in retrieval or generation) improve performance for TLQA?
Below we outline the main experiments described in our research. Detailed instructions for each experiment can be found in the scripts under `code/`.
- Use the generative models FlanT5-large and FlanT5-XL.
- Implement a KNN-based sample selection for demonstrations. We use an embedding model (from sentence-transformers) to find k nearest neighbors in the training set that are similar to the test question.
- Vary k from {0, 3, 5, 7, 10} to measure performance changes.
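The KNN-based demonstration selection can be sketched as follows. In the real pipeline a sentence-transformers model produces the question embeddings; here small toy vectors stand in for them, and the helper name `select_demonstrations` is illustrative:

```python
import numpy as np

def select_demonstrations(test_emb, train_embs, train_examples, k):
    """Return the k training examples whose embeddings are most
    similar (by cosine) to the test question embedding."""
    if k == 0:
        return []  # zero-shot: no demonstrations in the prompt
    # Normalize so dot products equal cosine similarities.
    test_emb = test_emb / np.linalg.norm(test_emb)
    train_embs = train_embs / np.linalg.norm(train_embs, axis=1, keepdims=True)
    sims = train_embs @ test_emb
    top = np.argsort(-sims)[:k]  # indices of the k nearest neighbors
    return [train_examples[i] for i in top]

# Toy 3-d embeddings stand in for sentence-transformers output.
train_embs = np.array([[1.0, 0.0, 0.0],
                       [0.9, 0.1, 0.0],
                       [0.0, 1.0, 0.0]])
train_examples = ["Q1/A1", "Q2/A2", "Q3/A3"]
demos = select_demonstrations(np.array([1.0, 0.05, 0.0]),
                              train_embs, train_examples, k=2)
```

The selected demonstrations are then prepended to the test question to form the few-shot prompt; varying `k` over {0, 3, 5, 7, 10} only changes how many neighbors are kept.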
- Fine-tune the models FlanT5-base and FlanT5-large.
- Run inference on the test set.
- Combine fine-tuning with few-shot prompting.
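Fine-tuning a seq2seq model such as FlanT5 requires each TLQA example to be serialized into an (input, target) text pair. A minimal sketch under an assumed prompt template (the exact template used in `code/` may differ):

```python
def to_seq2seq_pair(question, answers):
    """Serialize a TLQA example into (input, target) strings for a
    seq2seq model such as FlanT5. The template is illustrative, not
    the project's exact prompt format."""
    source = f"Answer with a timeline list. Question: {question}"
    # One "entity (period)" item per line, matching the answer format above.
    target = "\n".join(f"{entity} ({period})" for entity, period in answers)
    return source, target

src, tgt = to_seq2seq_pair(
    "List all sports teams Robert Lewandowski played for from 2010 to 2020.",
    [("Lech Poznań", "2010"),
     ("Borussia Dortmund", "2010-2014"),
     ("FC Bayern Munich", "2014-2020")],
)
```

At inference time the model's generated text is split back into lines and parsed into (entity, period) pairs for evaluation.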
- Retrieve top-k relevant contexts from a Wikipedia infobox collection.
- Provide the retrieved text as input to the generative model.
- Vary k from {1, 3, 5, 7, 10} to see how retrieval depth affects performance.
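The retrieve-then-read step can be sketched as follows. To stay self-contained this uses bag-of-words cosine similarity; the actual pipeline would use dense embeddings, and the passages here are toy stand-ins for the Wikipedia infobox collection:

```python
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_top_k(question, passages, k):
    """Rank passages by similarity to the question and keep the top k."""
    q = Counter(question.lower().split())
    scored = sorted(passages,
                    key=lambda p: cosine(q, Counter(p.lower().split())),
                    reverse=True)
    return scored[:k]

passages = [
    "Robert Lewandowski infobox: clubs Lech Poznan, Borussia Dortmund, Bayern Munich",
    "Lionel Messi infobox: clubs Barcelona, Paris Saint-Germain",
]
question = "Which teams did Robert Lewandowski play for?"
context = retrieve_top_k(question, passages, k=1)
# The retrieved text is prepended to the question for the generative model.
prompt = "Context: " + " ".join(context) + f"\nQuestion: {question}"
```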
- Combine fine-tuning and few-shot prompting with RAG.
- Incorporate temporal relevance when searching for context in addition to simple cosine similarity.
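One way to incorporate temporal relevance is to blend cosine similarity with the overlap between the question's time interval and the years a passage mentions. A sketch, where `alpha` is an illustrative weight rather than a tuned project parameter:

```python
import re

def extract_years(text):
    """Pull four-digit years (1900-2099) out of a passage."""
    return [int(y) for y in re.findall(r"\b(?:19|20)\d{2}\b", text)]

def temporal_overlap(q_start, q_end, years):
    """Fraction of a passage's years that fall inside the question's interval."""
    if not years:
        return 0.0
    return sum(q_start <= y <= q_end for y in years) / len(years)

def rerank(scored_passages, q_start, q_end, alpha=0.5):
    """Re-rank (passage, cosine_score) pairs by a blend of cosine
    similarity and temporal overlap with the question interval."""
    return sorted(
        scored_passages,
        key=lambda item: (1 - alpha) * item[1]
        + alpha * temporal_overlap(q_start, q_end, extract_years(item[0])),
        reverse=True,
    )

# Two passages with equal cosine similarity: the temporal signal
# prefers the one whose years fall inside 2010-2020.
ranked = rerank(
    [("Played for Team A from 1998 to 2004", 0.8),
     ("Played for Borussia Dortmund 2010 to 2014", 0.8)],
    q_start=2010, q_end=2020,
)
```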
Follow these steps to prepare your environment for the project:
If you don't already have WSL, run:

```shell
wsl --install
```

Otherwise:

```shell
wsl
```

Change to the project's root directory:

```shell
cd <.../Timeline-based-List-Question-Answering>
```

Create an isolated environment for your project:

- Create a virtual environment:

  ```shell
  python3 -m venv venv
  ```

- Activate the virtual environment:

  ```shell
  source venv/bin/activate
  ```

Install the necessary packages listed in requirements.txt:

```shell
pip install -r requirements.txt
```

To start the notebooks in a later session:

```shell
wsl
cd <.../Timeline-based-List-Question-Answering>
source venv/bin/activate
jupyter notebook
```

In the WSL window, you will see a link similar to http://localhost:8889/tree?token=string. Copy it and paste it into your browser.