Feat/prompt #18

Merged
FrejaThoresen merged 13 commits into main from feat/prompt on May 1, 2026

@FrejaThoresen (Collaborator) commented May 1, 2026

Minor bugfixes.
Remove selfcheckgpt code.

Copilot AI review requested due to automatic review settings May 1, 2026 09:16
Copilot AI (Contributor) left a comment

Pull request overview

This PR aligns the QA prompting format with the EuroEval-style templates, updates generation/post-processing behavior, and removes the legacy SelfCheckGPT script/prompts.

Changes:

  • Updated all qa_prompt_*.txt templates to use a ${text}-based prompt format (max 3 words), and updated PromptUtils.format_context() accordingly.
  • Adjusted local HF generation to decode only newly generated tokens and added markdown marker stripping.
  • Tweaked dataset splitting logic, hallucination detector model naming, and reduced generation/training lengths in config; removed SelfCheckGPT script and prompt templates.
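The `${text}` placeholder mentioned in the first bullet matches the syntax of Python's standard `string.Template`. The sketch below shows how such a template could be filled in. It is not the PR's `PromptUtils.format_context()` implementation, which is not shown here; the template wording, the `${question}` placeholder, and the `format_prompt` helper are assumptions made for illustration.

```python
from string import Template

# Hypothetical template text. The real qa_prompt_*.txt files are not
# reproduced in this PR summary; only the ${text} placeholder is confirmed.
EXAMPLE_TEMPLATE = (
    "Text: ${text}\n"
    "Question: ${question}\n"
    "Answer in max 3 words:"
)


def format_prompt(template_text: str, **fields: str) -> str:
    """Fill ${...} placeholders in a prompt template.

    Template.substitute raises KeyError when a placeholder has no value,
    so a template/argument mismatch fails loudly instead of silently
    leaving "${text}" in the prompt.
    """
    return Template(template_text).substitute(**fields)


if __name__ == "__main__":
    print(
        format_prompt(
            EXAMPLE_TEMPLATE,
            text="Paris is the capital of France.",
            question="What is the capital of France?",
        )
    )
```

Using `substitute` rather than `safe_substitute` is a deliberate choice in this sketch: a missing field surfaces as an error at formatting time rather than as a malformed prompt sent to the model.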

Reviewed changes

Copilot reviewed 39 out of 39 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/scripts/selfcheckgpt.py | Removed legacy SelfCheckGPT runner script. |
| src/scripts/detect_hallucinations.py | Adjusts Hugging Face detector path naming for synthetic-hallucination dataset. |
| src/prompts/selfcheckgpt_prompt_en.txt | Removed SelfCheckGPT prompt template (EN). |
| src/prompts/selfcheckgpt_prompt_de.txt | Removed SelfCheckGPT prompt template (DE). |
| src/prompts/selfcheckgpt_prompt_da.txt | Removed SelfCheckGPT prompt template (DA). |
| src/prompts/qa_prompt_uk.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_sv.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_sr.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_sl.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_sk.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_ro.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_pt.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_pl.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_no.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_nl.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_lv.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_lt.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_it.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_is.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_hu.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_hr.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_fr.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_fo.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_fi.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_et.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_es.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_en.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_el.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_de.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_da.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_cs.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_ca.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_bs.txt | Updated QA prompt to ${text} format. |
| src/prompts/qa_prompt_bg.txt | Updated QA prompt to ${text} format. |
| src/factuality_eval/prompt_utils.py | Updates prompt formatting to pass ${text} and removes passage-label formatting. |
| src/factuality_eval/model_generation.py | Improves HF generation decoding and strips markdown markers from outputs. |
| src/factuality_eval/hallucination_detection.py | Removes per-example exception handling around detector prediction. |
| src/factuality_eval/dataset_generation.py | Changes dataset split handling when only train is available; otherwise raises. |
| config/hallucination_detection.yaml | Reduces training.max_length and generation.max_new_tokens. |
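
The `model_generation.py` change (decoding only newly generated tokens, then stripping markdown markers) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: the function names and the exact set of markers removed are hypothetical, and plain integer lists stand in for token tensors so the example runs without any model. With decoder-only Hugging Face models, the sequence returned by generation typically begins with the prompt tokens, which is why slicing at the prompt length isolates the answer.

```python
import re
from typing import List, Sequence


def new_token_ids(prompt_ids: Sequence[int], output_ids: Sequence[int]) -> List[int]:
    """Return only the token ids that follow the prompt.

    Assumes output_ids starts with prompt_ids, as is typical for
    decoder-only generation where the prompt is echoed in the output.
    """
    return list(output_ids[len(prompt_ids):])


# Hypothetical marker set: runs of asterisks or backticks, plus leading
# heading hashes. The PR does not list which markers it strips.
_INLINE_MARKERS = re.compile(r"\*+|`+")
_HEADING_PREFIX = re.compile(r"^\s*#+\s*", flags=re.MULTILINE)


def strip_markdown_markers(text: str) -> str:
    """Remove common markdown emphasis, code, and heading markers."""
    text = _HEADING_PREFIX.sub("", text)
    text = _INLINE_MARKERS.sub("", text)
    return text.strip()


if __name__ == "__main__":
    prompt = [101, 7, 8, 9]
    output = [101, 7, 8, 9, 42, 43]
    print(new_token_ids(prompt, output))
    print(strip_markdown_markers("**Paris**"))
```

Slicing before decoding, rather than decoding everything and removing the prompt string afterwards, avoids mismatches when detokenization does not reproduce the prompt text character for character.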

Comment thread src/factuality_eval/prompt_utils.py Outdated
Comment thread src/factuality_eval/model_generation.py Outdated
Comment thread src/factuality_eval/hallucination_detection.py
Comment thread src/factuality_eval/dataset_generation.py Outdated
Comment thread src/prompts/qa_prompt_it.txt Outdated
FrejaThoresen and others added 2 commits May 1, 2026 11:30
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Agent-Logs-Url: https://github.com/alexandrainst/factuality_eval/sessions/045c333a-8816-4b8a-919f-217ac634e975

Co-authored-by: FrejaThoresen <13599833+FrejaThoresen@users.noreply.github.com>
@alexandrainst alexandrainst deleted a comment from Copilot AI May 1, 2026
FrejaThoresen and others added 5 commits May 1, 2026 12:21
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@FrejaThoresen FrejaThoresen merged commit 0385f2c into main May 1, 2026
4 checks passed
@FrejaThoresen FrejaThoresen deleted the feat/prompt branch May 1, 2026 10:29
3 participants