Draft
24 commits
c20bbeb
Starting a branch for a new chemiscope JOSS paper
ceriottm Jan 31, 2026
504a974
Drafty version 1.0
sofiia-chorna Feb 3, 2026
f2c346d
Drafty version 1.1
sofiia-chorna Feb 3, 2026
409762b
Refine Summary
sofiia-chorna Feb 3, 2026
f0bfa40
Add implementation section
sofiia-chorna Feb 4, 2026
370c772
Refine summare, references, clean up
sofiia-chorna Feb 4, 2026
b6e0601
Replace svg with png, mention chemfiles
sofiia-chorna Feb 4, 2026
ef8d193
Small updates
sofiia-chorna Feb 4, 2026
0525ae3
Update paper.md
rosecers Feb 5, 2026
79edac8
Update paper.bib
rosecers Feb 5, 2026
c933c17
Update paper.md
rosecers Feb 5, 2026
80104ab
Added lod with petmadfeaturizer figure
sofiia-chorna Feb 6, 2026
73a62cd
Apply MC review comments, small fixes
sofiia-chorna Feb 8, 2026
4ea785b
Fix typos
sofiia-chorna Feb 9, 2026
3e64c8f
Update DOI as required by JOSS, plus update metatensor paper with rec…
sofiia-chorna Feb 16, 2026
fca893e
Explicitly add sections required by joss 2026, add research impact st…
sofiia-chorna Feb 16, 2026
a27d2e9
A bit better start of section
sofiia-chorna Feb 16, 2026
744eb40
Update first figure
sofiia-chorna Feb 17, 2026
4479076
Merge branch 'main' into paper-1.0
sofiia-chorna Feb 17, 2026
4fec801
Autofix formatting (no change to the text)
sofiia-chorna Feb 17, 2026
da3e9dd
Update with MC comments, add more references, clean up bib to JOSS re…
sofiia-chorna Feb 18, 2026
12533b7
Rename paper 2020 to avoid joss bot conflict
sofiia-chorna Mar 24, 2026
358ccc1
Merge branch 'main' into paper-1.0
sofiia-chorna Mar 24, 2026
d2fc0a8
Merge branch 'main' into paper-1.0
ceriottm Apr 2, 2026
4 changes: 4 additions & 0 deletions paper/chemiscope-v1.0-overview.svg
4 changes: 4 additions & 0 deletions paper/chemiscope-v1.0.svg
120 changes: 120 additions & 0 deletions paper/paper-2020/paper-2020.md
@@ -0,0 +1,120 @@
---
title: 'Chemiscope: interactive structure-property explorer for materials and molecules'
tags:
- TypeScript
- JavaScript
- chemistry
- material science
- machine learning
authors:
  - name: Guillaume Fraux
    orcid: 0000-0003-4824-6512
    affiliation: 1
  - name: Rose K. Cersonsky
    orcid: 0000-0003-4515-3441
    affiliation: 1
  - name: Michele Ceriotti
    orcid: 0000-0003-2571-2832
    affiliation: 1
affiliations:
  - name: Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
    index: 1
date: 30 January 2020
bibliography: paper.bib
---

# Summary

The number of materials or molecules that can be created by combining different
chemical elements in various proportions and spatial arrangements is enormous.
Computational chemistry can be used to generate databases containing billions of
potential structures [@Ruddigkeit2012], and predict some of the associated
properties [@Montavon2013; @Ramakrishnan2014]. Unfortunately, the very large
number of structures makes exploring such databases, whether to understand
structure-property relations or to find the _best_ structure for a given
application, a daunting task. In recent years, multiple molecular
_representations_ [@Behler2007; @Bartok2013; @Willatt2019] have been developed
to compute structural similarities between materials or molecules, incorporating
physically-relevant information and symmetries. The features associated with
these representations can be used for unsupervised machine learning
applications, such as clustering or classification of the different structures,
and high-throughput screening of databases for specific properties [@Maier2007;
@De2017; @Hautier2019]. However, the dimensionality of these features (as
well as that of most other descriptors used in chemical and materials
informatics) is very high, which makes the resulting classifications,
clusterings or maps very hard to visualize. Dimensionality reduction algorithms [@Schlkopf1998;
@Ceriotti2011; @McInnes2018] can reduce the number of relevant dimensions to a
handful, creating 2D or 3D maps of the full database.
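
As a minimal illustration of this dimensionality-reduction step, the sketch below projects a few high-dimensional feature vectors onto two principal components using only the Python standard library. Plain linear PCA (computed by power iteration with deflation) is used here as a simple stand-in for the kernel PCA, sketch-map and UMAP methods cited above; the function names and the toy numbers are hypothetical and are not part of chemiscope.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean from every row."""
    n, dim = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(dim)]
    return [[row[j] - means[j] for j in range(dim)] for row in rows]


def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, dim = len(rows), len(rows[0])
    return [
        [sum(row[i] * row[j] for row in rows) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]


def leading_eigenvector(matrix, iterations=500, seed=0):
    """Power iteration: returns (eigenvalue estimate, unit vector)."""
    rng = random.Random(seed)
    dim = len(matrix)
    vec = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    for _ in range(iterations):
        new = [sum(matrix[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0.0:
            break
        vec = [x / norm for x in new]
    # Rayleigh quotient of the final vector.
    value = sum(
        vec[i] * sum(matrix[i][j] * vec[j] for j in range(dim)) for i in range(dim)
    )
    return value, vec


def pca_2d(features):
    """Project feature vectors onto two principal directions."""
    rows = center(features)
    cov = covariance(rows)
    dim = len(cov)
    components = []
    for k in range(2):
        value, vec = leading_eigenvector(cov, seed=k)
        components.append(vec)
        # Deflation: remove the direction just found from the covariance.
        cov = [
            [cov[i][j] - value * vec[i] * vec[j] for j in range(dim)]
            for i in range(dim)
        ]
    return [
        tuple(sum(row[j] * comp[j] for j in range(dim)) for comp in components)
        for row in rows
    ]


# Toy 4-dimensional "descriptors" for five structures (made-up numbers).
features = [
    [1.0, 2.0, 0.5, 0.1],
    [2.0, 4.1, 0.4, 0.2],
    [3.0, 6.2, 0.6, 0.1],
    [4.0, 7.9, 0.5, 0.3],
    [5.0, 10.1, 0.4, 0.2],
]
points = pca_2d(features)  # one (x, y) pair per structure
```

Each resulting pair can serve as the coordinates of one point on a 2D map. For real datasets, an established numerical library would be a better choice than this hand-written loop.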

![The QM7b database [@Montavon2013] visualized with chemiscope](screenshot.png)

Chemiscope is a graphical tool for the interactive exploration of materials and
molecular databases, correlating local and global structural descriptors with
the physical properties of the different systems. The interface consists of
two panels. The left panel displays a 2D or 3D scatter plot, in which each
point corresponds to a chemical entity. The axes, color, and style of each point
can be set to represent a property or a structural descriptor to visualize
structure-property relations directly. Structural descriptors are not computed
by chemiscope itself, but must be obtained from one of the many codes
implementing general-purpose atomic representations [@librascal; @QUIP] or more
specialized descriptors. Since the most common descriptors can be very
high-dimensional, it can be convenient to apply a dimensionality reduction
algorithm that maps them to a lower-dimensional space for easier visualization.
For example, the sketch-map algorithm [@Ceriotti2011] was used with the Smooth
Overlap of Atomic Positions representation [@Bartok2013] to generate the
visualization in Figure 1. The right panel displays the
three-dimensional structure of the chemical entities, possibly including
periodic repetition for crystals. Visualizing the chemical structure can help
in finding an intuitive rationalization of the layout of the dataset and the
structure-property relations.
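
To make the link between the two panels concrete, the sketch below bundles a few structures with per-structure properties and records which property drives each visual channel of the map. The dictionary layout, the `make_dataset` helper and the `channels` mapping are assumptions made for illustration (they are not described in this paper), and all numbers are placeholders.

```python
import json


def make_dataset(name, structures, properties):
    """Bundle structures and per-structure properties into one dictionary."""
    n_structures = len(structures)
    for prop_name, values in properties.items():
        if len(values) != n_structures:
            raise ValueError(
                f"property {prop_name!r} has {len(values)} values, "
                f"expected {n_structures}"
            )
    return {
        "meta": {"name": name},
        "structures": structures,
        "properties": {
            prop_name: {"target": "structure", "values": list(values)}
            for prop_name, values in properties.items()
        },
    }


# Two toy molecules; coordinates are made up, for illustration only.
water = {
    "size": 3,
    "names": ["O", "H", "H"],
    "x": [0.0, 0.76, -0.76],
    "y": [0.0, 0.59, 0.59],
    "z": [0.0, 0.0, 0.0],
}
hydrogen = {
    "size": 2,
    "names": ["H", "H"],
    "x": [0.0, 0.74],
    "y": [0.0, 0.0],
    "z": [0.0, 0.0],
}

dataset = make_dataset(
    "toy dataset",
    [water, hydrogen],
    {
        "map x": [0.3, -1.2],      # first reduced coordinate (placeholder)
        "map y": [1.1, 0.4],       # second reduced coordinate (placeholder)
        "energy": [-76.4, -1.17],  # property shown as color (placeholder)
    },
)

# Which property drives which visual channel of the scatter plot.
channels = {"x": "map x", "y": "map y", "color": "energy"}
serialized = json.dumps(dataset)
```

Keeping one value per structure for every property is what allows a point selected on the map to be matched with the structure shown in the other panel.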

Whereas similar tools [@Gong2013; @Gutlein2014; @Probst2017; @ISV] only allow
visualizing maps and structures in which each data point corresponds to a
molecule, or a crystal structure, a distinctive feature of chemiscope is the
possibility of visualizing maps in which points correspond to atom-centred
environments. This is useful, for instance, to rationalize the relationship
between structure and atomic properties such as nuclear chemical shieldings
(Figure 2). This is also useful as a diagnostic tool for the many
machine-learning schemes that decompose properties into atom-centred
contributions [@Behler2007; @Bartok2010].
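
A minimal sketch of what an atom-centred environment is, assuming a simple spherical cutoff and ignoring periodic images: each atom becomes one data point, described by the neighbours found within the cutoff radius. The function name, the cutoff value and the coordinates are hypothetical.

```python
import math


def atomic_environments(positions, cutoff):
    """For each atom, list the indices of the other atoms within `cutoff`."""
    environments = []
    for center, r_center in enumerate(positions):
        neighbors = [
            other
            for other, r_other in enumerate(positions)
            if other != center and math.dist(r_center, r_other) <= cutoff
        ]
        environments.append({"center": center, "neighbors": neighbors})
    return environments


# Toy water-like geometry (made-up coordinates, no periodic boundaries).
positions = [(0.0, 0.0, 0.0), (0.76, 0.59, 0.0), (-0.76, 0.59, 0.0)]
environments = atomic_environments(positions, cutoff=1.2)
```

With one map point per environment, a three-atom structure contributes three points, so an atomic property such as a chemical shielding can be attached to each of them.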

![Database of chemical shieldings [@Paruzzo2018] in chemiscope demonstrating the use of a 3D plot and highlighting of atomic environments](./screenshot-3d.png)

Chemiscope took strong inspiration from an earlier, similar graphical tool,
the interactive sketch-map visualizer [@ISV]. This earlier software was used in
multiple research publications related to the exploration of large-scale
databases and the mapping of structure-property relationships [@De2016;
@De2017; @Musil2018].

# Implementation

Chemiscope is implemented using the web platform: HTML5, CSS and WebGL to
display graphical elements, and TypeScript (compiled to JavaScript) for
interactivity. It uses [Plotly.js](https://plot.ly/javascript/) to render and
animate 2D and 3D plots, and the JavaScript version of [Jmol](http://jmol.org/)
to display atomic structures. The visualization is fast enough to be used with
datasets containing up to a million points, reacting to user input within a few
hundred milliseconds in the default 2D mode. More elaborate visualizations are
slower, while still handling 100k points easily.

The use of web technologies makes chemiscope usable from different operating
systems without the need to develop, maintain and package the code for each
operating system. It also means that we can provide an online service at
http://chemiscope.org that allows users to visualize their own dataset without any
local installation. Chemiscope is implemented as a library of re-usable
components linked together via callbacks. This makes it easy to modify the
default interface to generate more elaborate visualizations, for example,
displaying multiple maps generated with different parameters of a dimensionality
reduction algorithm. Chemiscope can also be distributed in a standalone mode,
where the code and a predefined dataset are merged together as a single HTML
file. This standalone mode is useful for archival purposes, for example as
supplementary information for a published article, and for use in corporate
environments with sensitive datasets.
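
The idea of re-usable components linked together via callbacks can be sketched as follows. This is an illustration in Python of the general pattern only: the class and method names are hypothetical and do not correspond to chemiscope's actual TypeScript API.

```python
class ScatterMap:
    """Stand-in for a map panel: tracks which point is selected."""

    def __init__(self, n_points):
        self.n_points = n_points
        self.selected = None
        self._on_select = []

    def on_select(self, callback):
        """Register a function called with the index of the selected point."""
        self._on_select.append(callback)

    def select(self, index):
        """Select a point and notify every registered callback."""
        if not 0 <= index < self.n_points:
            raise IndexError(f"point {index} out of range")
        self.selected = index
        for callback in self._on_select:
            callback(index)


class StructureViewer:
    """Stand-in for the structure panel: remembers what it displays."""

    def __init__(self, structures):
        self.structures = structures
        self.displayed = None

    def show(self, index):
        self.displayed = self.structures[index]


structures = ["H2O", "H2", "CH4"]
viewer = StructureViewer(structures)

# Two maps (e.g. built with different projection parameters) share one viewer.
map_a = ScatterMap(len(structures))
map_b = ScatterMap(len(structures))
for scatter in (map_a, map_b):
    scatter.on_select(viewer.show)

map_b.select(2)  # the shared viewer now displays "CH4"
```

Because the components only know about each other through the registered callbacks, adding a second map requires no change to the viewer, which is the kind of flexibility the paragraph above describes.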

# Acknowledgements

The development of chemiscope has been funded by the [NCCR
MARVEL](http://nccr-marvel.ch/), the [MAX](http://max-centre.eu/) European
centre of excellence, and the European Research Council (Horizon 2020 grant
agreement no. 677013-HBMAP).

# References
273 changes: 273 additions & 0 deletions paper/paper-2020/paper.bib
@@ -0,0 +1,273 @@
@article{Ceriotti2011,
doi = {10.1073/pnas.1108486108},
url = {https://doi.org/10.1073/pnas.1108486108},
year = {2011},
month = {07},
publisher = {Proceedings of the National Academy of Sciences},
volume = {108},
number = {32},
pages = {13023--13028},
author = {Michele Ceriotti and Gareth A. Tribello and Michele Parrinello},
title = {Simplifying the representation of complex free-energy landscapes using sketch-map},
journal = {Proceedings of the National Academy of Sciences}
}

@article{Bartok2013,
doi = {10.1103/physrevb.87.184115},
url = {https://doi.org/10.1103/physrevb.87.184115},
year = {2013},
month = {05},
publisher = {American Physical Society ({APS})},
volume = {87},
number = {18},
author = {Albert P. Bart{\'{o}}k and Risi Kondor and G{\'{a}}bor Cs{\'{a}}nyi},
title = {On representing chemical environments},
journal = {Physical Review B}
}

@article{Montavon2013,
doi = {10.1088/1367-2630/15/9/095003},
url = {https://doi.org/10.1088/1367-2630/15/9/095003},
year = {2013},
month = {09},
publisher = {{IOP} Publishing},
volume = {15},
number = {9},
pages = {095003},
author = {Grégoire Montavon and Matthias Rupp and Vivekanand Gobre and Alvaro Vazquez-Mayagoitia and Katja Hansen and Alexandre Tkatchenko and Klaus-Robert M\"{u}ller and O Anatole von Lilienfeld},
title = {Machine learning of molecular electronic properties in chemical compound space},
journal = {New Journal of Physics}
}

@article{Gutlein2014,
doi = {10.1186/s13321-014-0041-7},
url = {https://doi.org/10.1186/s13321-014-0041-7},
year = {2014},
month = sep,
publisher = {Springer Science and Business Media {LLC}},
volume = {6},
number = {1},
author = {Martin G\"{u}tlein and Andreas Karwath and Stefan Kramer},
title = {{CheS}-Mapper 2.0 for visual validation of (Q){SAR} models},
journal = {Journal of Cheminformatics}
}

@article{Probst2017,
doi = {10.1093/bioinformatics/btx760},
url = {https://doi.org/10.1093/bioinformatics/btx760},
year = {2017},
month = {10},
publisher = {Oxford University Press ({OUP})},
volume = {34},
number = {8},
pages = {1433--1435},
author = {Daniel Probst and Jean-Louis Reymond},
editor = {Jonathan Wren},
title = {{FUn}: a framework for interactive visualizations of large, high-dimensional datasets on the web},
journal = {Bioinformatics}
}

@article{Gong2013,
doi = {10.1093/bioinformatics/btt270},
url = {https://doi.org/10.1093/bioinformatics/btt270},
year = {2013},
month = {05},
publisher = {Oxford University Press ({OUP})},
volume = {29},
number = {14},
pages = {1827--1829},
author = {Jiayu Gong and Chaoqian Cai and Xiaofeng Liu and Xin Ku and Hualiang Jiang and Daqi Gao and Honglin Li},
title = {{ChemMapper}: a versatile web server for exploring pharmacology and chemical structure association based on molecular 3D similarity method},
journal = {Bioinformatics}
}

@article{Paruzzo2018,
doi = {10.1038/s41467-018-06972-x},
url = {https://doi.org/10.1038/s41467-018-06972-x},
year = {2018},
month = oct,
publisher = {Springer Science and Business Media {LLC}},
volume = {9},
number = {1},
author = {Federico M. Paruzzo and Albert Hofstetter and Félix Musil and Sandip De and Michele Ceriotti and Lyndon Emsley},
title = {Chemical shifts in molecular solids by machine learning},
journal = {Nature Communications}
}

@software{ISV,
author = {De, Sandip and Ceriotti, Michele},
title = {Interactive Sketchmap Visualizer},
publisher = {Zenodo},
year = {2019},
version = {1.0.0},
doi = {10.5281/zenodo.3541831},
url = {https://doi.org/10.5281/zenodo.3541831}
}

@article{De2016,
doi = {10.1039/c6cp00415f},
url = {https://doi.org/10.1039/c6cp00415f},
year = {2016},
publisher = {Royal Society of Chemistry ({RSC})},
volume = {18},
number = {20},
pages = {13754--13769},
author = {Sandip De and Albert P. Bart{\'{o}}k and G{\'{a}}bor Cs{\'{a}}nyi and Michele Ceriotti},
title = {Comparing molecules and solids across structural and alchemical space},
journal = {Physical Chemistry Chemical Physics}
}

@article{De2017,
doi = {10.1186/s13321-017-0192-4},
url = {https://doi.org/10.1186/s13321-017-0192-4},
year = {2017},
month = {02},
publisher = {Springer Science and Business Media {LLC}},
volume = {9},
number = {1},
author = {Sandip De and Félix Musil and Teresa Ingram and Carsten Baldauf and Michele Ceriotti},
title = {Mapping and classifying molecules from a high-throughput structural database},
journal = {Journal of Cheminformatics}
}

@article{Musil2018,
doi = {10.1039/c7sc04665k},
url = {https://doi.org/10.1039/c7sc04665k},
year = {2018},
publisher = {Royal Society of Chemistry ({RSC})},
volume = {9},
number = {5},
pages = {1289--1300},
author = {Félix Musil and Sandip De and Jack Yang and Joshua E. Campbell and Graeme M. Day and Michele Ceriotti},
title = {Machine learning for the structure-energy-property landscapes of molecular crystals},
journal = {Chemical Science}
}

@article{Hautier2019,
doi = {10.1016/j.commatsci.2019.02.040},
url = {https://doi.org/10.1016/j.commatsci.2019.02.040},
year = {2019},
month = {06},
publisher = {Elsevier {BV}},
volume = {163},
pages = {108--116},
author = {Geoffroy Hautier},
title = {Finding the needle in the haystack: Materials discovery and design through computational ab initio high-throughput screening},
journal = {Computational Materials Science}
}

@article{Willatt2019,
doi = {10.1063/1.5090481},
url = {https://doi.org/10.1063/1.5090481},
year = {2019},
month = {04},
publisher = {{AIP} Publishing},
volume = {150},
number = {15},
pages = {154110},
author = {Michael J. Willatt and F{\'{e}}lix Musil and Michele Ceriotti},
title = {Atom-density representations for machine learning},
journal = {The Journal of Chemical Physics}
}

@article{Behler2007,
doi = {10.1103/physrevlett.98.146401},
url = {https://doi.org/10.1103/physrevlett.98.146401},
year = {2007},
month = {04},
publisher = {American Physical Society ({APS})},
volume = {98},
number = {14},
author = {J\"{o}rg Behler and Michele Parrinello},
title = {Generalized Neural-Network Representation of High-Dimensional Potential-Energy Surfaces},
journal = {Physical Review Letters}
}

@article{Ruddigkeit2012,
doi = {10.1021/ci300415d},
url = {https://doi.org/10.1021/ci300415d},
year = {2012},
month = {11},
publisher = {American Chemical Society ({ACS})},
volume = {52},
number = {11},
pages = {2864--2875},
author = {Lars Ruddigkeit and Ruud van Deursen and Lorenz C. Blum and Jean-Louis Reymond},
title = {Enumeration of 166 Billion Organic Small Molecules in the Chemical Universe Database {GDB}-17},
journal = {Journal of Chemical Information and Modeling}
}

@article{Ramakrishnan2014,
doi = {10.1038/sdata.2014.22},
url = {https://doi.org/10.1038/sdata.2014.22},
year = {2014},
month = {08},
publisher = {Springer Science and Business Media {LLC}},
volume = {1},
number = {1},
author = {Raghunathan Ramakrishnan and Pavlo O. Dral and Matthias Rupp and O. Anatole von Lilienfeld},
title = {Quantum chemistry structures and properties of 134 kilo molecules},
journal = {Scientific Data}
}

@article{Bartok2010,
doi = {10.1103/physrevlett.104.136403},
url = {https://doi.org/10.1103/physrevlett.104.136403},
year = {2010},
month = {04},
publisher = {American Physical Society ({APS})},
volume = {104},
number = {13},
author = {Albert P. Bart{\'{o}}k and Mike C. Payne and Risi Kondor and G{\'{a}}bor Cs{\'{a}}nyi},
title = {Gaussian Approximation Potentials: The Accuracy of Quantum Mechanics, without the Electrons},
journal = {Physical Review Letters}
}

@article{Schlkopf1998,
doi = {10.1162/089976698300017467},
url = {https://doi.org/10.1162/089976698300017467},
year = {1998},
month = {08},
publisher = {{MIT} Press - Journals},
volume = {10},
number = {5},
pages = {1299--1319},
author = {Bernhard Sch\"{o}lkopf and Alexander Smola and Klaus-Robert M\"{u}ller},
title = {Nonlinear Component Analysis as a Kernel Eigenvalue Problem},
journal = {Neural Computation}
}

@article{Maier2007,
doi = {10.1002/anie.200603675},
url = {https://doi.org/10.1002/anie.200603675},
year = {2007},
month = aug,
publisher = {Wiley},
volume = {46},
number = {32},
pages = {6016--6067},
author = {Wilhelm{\hspace{0.25em}}F. Maier and Klaus St\"{o}we and Simone Sieg},
title = {Combinatorial and High-Throughput Materials Science},
journal = {Angewandte Chemie International Edition}
}

@article{McInnes2018,
title = {{UMAP}: Uniform Manifold Approximation and Projection for Dimension Reduction},
author = {Leland McInnes and John Healy and James Melville},
year = {2018},
eprint = {1802.03426},
archivePrefix = {arXiv}
}

@online{librascal,
author = {librascal},
title = {},
date = {},
url = {https://github.com/lab-cosmo/librascal}
}

@online{QUIP,
author = {QUIP},
title = {},
date = {},
url = {http://libatoms.github.io/QUIP/}
}
File renamed without changes
File renamed without changes