Clever Materials

Open-source scientific article built with the showyourwork workflow.

Gist

Machine learning can accelerate materials discovery, but strong benchmark scores do not guarantee that models learn chemistry. This article tests an alternative hypothesis: that property prediction may be driven by bibliographic confounding. Across five tasks (MOF thermal stability, MOF solvent stability, perovskite efficiency, battery capacity, and TADF emission), models can predict author, journal, and year from standard descriptors at above-chance accuracy. When those predicted metadata ("bibliographic fingerprints") are used as the only inputs, performance sometimes approaches that of conventional descriptor-based models. The results show that many datasets do not rule out non-chemical explanations of success. They motivate routine falsification tests (group/time splits, metadata ablations), better dataset design, and a clearer separation between predictive utility and evidence of chemical understanding.
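The group and time splits mentioned above can be illustrated with a minimal, standard-library sketch. It is not the code used in the article: the function names `group_split` and `time_split`, the toy `authors` and `years` lists, and the cutoff year are all hypothetical and chosen only to show the idea. A group split keeps every record from a given author on one side of the split, and a time split trains on older records and tests on newer ones.

```python
import random
from collections import defaultdict


def group_split(groups, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so that no group label appears on both sides."""
    by_group = defaultdict(list)
    for idx, label in enumerate(groups):
        by_group[label].append(idx)

    # Shuffle the group labels (not the records) so whole groups move together.
    labels = sorted(by_group)
    random.Random(seed).shuffle(labels)

    target = test_fraction * len(groups)
    train_idx, test_idx = [], []
    for label in labels:
        # Fill the test side with whole groups until it reaches the target size.
        if len(test_idx) < target:
            test_idx.extend(by_group[label])
        else:
            train_idx.extend(by_group[label])
    return sorted(train_idx), sorted(test_idx)


def time_split(years, cutoff):
    """Train on records published up to and including `cutoff`, test on later ones."""
    train_idx = [i for i, year in enumerate(years) if year <= cutoff]
    test_idx = [i for i, year in enumerate(years) if year > cutoff]
    return train_idx, test_idx


# Hypothetical toy metadata: one author label and one publication year per record.
authors = ["A", "A", "B", "B", "C", "C", "D", "D", "E", "E"]
years = [2015, 2016, 2016, 2017, 2018, 2018, 2019, 2020, 2021, 2021]

train_idx, test_idx = group_split(authors, test_fraction=0.2, seed=0)
shared = {authors[i] for i in train_idx} & {authors[i] for i in test_idx}
print("authors present in both train and test:", shared)

old_idx, new_idx = time_split(years, cutoff=2018)
print("records up to 2018:", len(old_idx), "records after 2018:", len(new_idx))
```

If a model scores much worse under such splits than under a random split, that gap is one sign that the random-split score may partly reflect bibliographic structure rather than chemistry.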

Build

showyourwork build

Folders and files

.github/workflows
src
.gitignore
.pre-commit-config.yaml
LICENSE
README.md
Snakefile
environment.yml
showyourwork.yml
zenodo.yml
