Dataset: A domain-expert-annotated corpus of solar chemistry papers covering seven experimental parameters (catalyst, co-catalyst, light source, lamp, reactor type, reaction medium, operation mode), with a filtered benchmark subset, an LLM-evaluation sample, and sentence-level retrieval evidence.
Generation pipeline (src/generation): a RAG pipeline that extracts evidence and infers the seven parameters from each paper.
Benchmark (src/evaluation): three evaluation tasks covering information retrieval (scored with NDCG), RAG-strategy comparison, and an LLM performance leaderboard.
Reproducibility & citation: pinned requirements.txt, Poetry pyproject.toml + lockfile, CITATION.cff, codemeta.json, and an Apache-2.0 license.
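For readers unfamiliar with the NDCG metric named in the benchmark item, the following is a minimal sketch of how it is commonly computed for one ranked list of retrieved sentences. It is not taken from src/evaluation: the function names `dcg` and `ndcg`, the linear gain, and the optional cutoff `k` are illustrative assumptions, and the repository's own implementation may differ (for example, by using exponential gain).

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances, k=None):
    """NDCG for one ranked list of graded relevance labels.

    `relevances` holds the relevance of each retrieved item, in the order
    the retriever returned them. `k` is a hypothetical cutoff; None means
    the whole list is scored.
    """
    ranked = list(relevances)[:k]
    # Ideal ordering: the same labels sorted from most to least relevant.
    ideal = sorted(relevances, reverse=True)[: len(ranked)]
    ideal_dcg = dcg(ideal)
    # A list with no relevant items is scored 0.0 by convention here.
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# A perfectly ordered list scores 1.0; misordered lists score lower.
perfect = ndcg([3, 2, 1])
swapped = ndcg([0, 1])
```

A score of 1.0 means the retriever ranked the evidence sentences in the ideal order; pushing a relevant sentence further down the list lowers the score through the logarithmic discount.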