Feat/prompt by FrejaThoresen · Pull Request #18 · alexandrainst/faithful_eval

FrejaThoresen · 2026-05-01T09:16:30Z

Minor bugfixes.
Remove selfcheckgpt code.

Copilot

Pull request overview

This PR aligns the QA prompting format with the EuroEval-style templates, updates generation/post-processing behavior, and removes the legacy SelfCheckGPT script/prompts.

Changes:

- Updated all `qa_prompt_*.txt` templates to use a `${text}`-based prompt format (max 3 words), and updated `PromptUtils.format_context()` accordingly.
- Adjusted local HF generation to decode only newly generated tokens, and added markdown marker stripping.
- Tweaked the dataset splitting logic and the hallucination detector model naming, reduced generation/training lengths in config, and removed the SelfCheckGPT script and prompt templates.
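
As an illustration of the `${text}`-based prompt format, here is a minimal sketch using Python's standard `string.Template`, which uses the same `${name}` placeholder syntax. The template wording and the `format_prompt` helper are hypothetical; the actual contents of the `qa_prompt_*.txt` files and the signature of `PromptUtils.format_context()` are not shown in this PR summary.

```python
from string import Template

# Hypothetical template text standing in for a qa_prompt_*.txt file;
# the real wording in the repository may differ.
QA_PROMPT_EN = Template(
    "Text: ${text}\n"
    "Answer the question about the text above in at most 3 words.\n"
)


def format_prompt(template: Template, text: str) -> str:
    """Fill the ${text} placeholder.

    Template.substitute raises KeyError if the template contains a
    placeholder that is not supplied, which surfaces template mistakes early.
    """
    return template.substitute(text=text)


prompt = format_prompt(QA_PROMPT_EN, "Copenhagen is the capital of Denmark.")
print(prompt)
```

Using `substitute` rather than `safe_substitute` is a deliberate choice in this sketch: a template with an unexpected placeholder fails loudly instead of sending a literal `${...}` to the model.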

Reviewed changes

Copilot reviewed 39 out of 39 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
`src/scripts/selfcheckgpt.py`	Removed legacy SelfCheckGPT runner script.
`src/scripts/detect_hallucinations.py`	Adjusts Hugging Face detector path naming for synthetic-hallucination dataset.
`src/prompts/selfcheckgpt_prompt_en.txt`	Removed SelfCheckGPT prompt template (EN).
`src/prompts/selfcheckgpt_prompt_de.txt`	Removed SelfCheckGPT prompt template (DE).
`src/prompts/selfcheckgpt_prompt_da.txt`	Removed SelfCheckGPT prompt template (DA).
`src/prompts/qa_prompt_uk.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_sv.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_sr.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_sl.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_sk.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_ro.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_pt.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_pl.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_no.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_nl.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_lv.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_lt.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_it.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_is.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_hu.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_hr.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_fr.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_fo.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_fi.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_et.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_es.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_en.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_el.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_de.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_da.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_cs.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_ca.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_bs.txt`	Updated QA prompt to `${text}` format.
`src/prompts/qa_prompt_bg.txt`	Updated QA prompt to `${text}` format.
`src/factuality_eval/prompt_utils.py`	Updates prompt formatting to pass `${text}` and removes passage-label formatting.
`src/factuality_eval/model_generation.py`	Improves HF generation decoding and strips markdown markers from outputs.
`src/factuality_eval/hallucination_detection.py`	Removes per-example exception handling around detector prediction.
`src/factuality_eval/dataset_generation.py`	Changes dataset split handling when only `train` is available; otherwise raises.
`config/hallucination_detection.yaml`	Reduces `training.max_length` and `generation.max_new_tokens`.
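
The post-processing described for `src/factuality_eval/model_generation.py` can be sketched as follows. With decoder-only models, Hugging Face `generate` returns the prompt tokens followed by the new tokens, so decoding only the new ones amounts to slicing off the prompt length. The sketch uses plain lists of token ids so that it runs without a model. The function names and the exact set of markdown markers removed are assumptions, not the code from this PR.

```python
import re


def new_token_ids(output_ids: list[int], prompt_length: int) -> list[int]:
    """Drop the echoed prompt tokens and keep only the generated ones."""
    return output_ids[prompt_length:]


# Assumed marker set for illustration: runs of asterisks, runs of backticks,
# and leading '#' heading markers. The PR may strip a different set.
_MARKDOWN_MARKERS = re.compile(r"\*+|`+|^#+\s*", flags=re.MULTILINE)


def strip_markdown_markers(text: str) -> str:
    """Remove the assumed markdown markers and surrounding whitespace."""
    return _MARKDOWN_MARKERS.sub("", text).strip()


# Toy example: a 3-token prompt followed by 2 generated tokens.
generated = new_token_ids([101, 102, 103, 7, 8], prompt_length=3)
cleaned = strip_markdown_markers("**Copenhagen**")
```

In a real pipeline the sliced ids would then be passed to the tokenizer's `decode` method before the markers are stripped.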

FrejaThoresen added 6 commits February 13, 2026 19:52

Exception add for splitting the dataset

ef39014

Experiment with new prompt

2247eba

Change name for model when loading

b6b042a

Merge branch 'bugfix/dataset-load' into feat/prompt

6798d86

Fix for model output format

c6fe9ae

Remove selfcheckpgt

54a2d1c

Copilot AI review requested due to automatic review settings May 1, 2026 09:16

Copilot started reviewing on behalf of FrejaThoresen May 1, 2026 09:16

Copilot AI reviewed May 1, 2026

Copilot started work on behalf of FrejaThoresen May 1, 2026 09:29

FrejaThoresen and others added 2 commits May 1, 2026 11:30

Update src/factuality_eval/model_generation.py

5d0c332

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Remove all SelfCheckGPT-related code

c305d70

Agent-Logs-Url: https://github.com/alexandrainst/factuality_eval/sessions/045c333a-8816-4b8a-919f-217ac634e975 Co-authored-by: FrejaThoresen <13599833+FrejaThoresen@users.noreply.github.com>

Copilot finished work on behalf of FrejaThoresen May 1, 2026 09:32

alexandrainst deleted a comment from Copilot AI May 1, 2026

FrejaThoresen and others added 5 commits May 1, 2026 12:21

Update src/prompts/qa_prompt_it.txt

9cf624e

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update src/factuality_eval/dataset_generation.py

25f1a70

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Revert "Experiment with new prompt"

6ef657b

This reverts commit 2247eba.

Don't push to hub by default

ff212d7

Merge branch 'feat/prompt' of https://github.com/alexandrainst/factua…

21df172

…lity_eval into feat/prompt

FrejaThoresen merged commit 0385f2c into main May 1, 2026
4 checks passed

FrejaThoresen deleted the feat/prompt branch May 1, 2026 10:29

FrejaThoresen merged 13 commits into main from feat/prompt