Interpretation of thyroid-relevant bioactivity data for comparison to in vivo exposures: A prioritization approach for putative chemical inhibitors of in vitro deiodinase activity
Kimberly T. Truong, John F. Wambaugh, Dustin F. Kapraun, Sarah E. Davidson-Fritz, Stephanie Eytcheson, Richard Judson, Katie Paul Friedman
This repository contains all the code and data files needed to reproduce the figures and tables in the manuscript and to replicate the prioritization workflow described in Figure 1 of the manuscript.
The code in this repository can be used to run the prioritization pipeline, from left to right, on your own (see Usage for running the prioritization workflow from ToxCast data). However, it's also possible to simply generate the figures in the manuscript without having to generate the underlying data from scratch (see Usage for generating figures using preprocessed data from workflow).
Scripts in the `scripts/` directory reference source files in the `data/` folder. Because this work relies on many data sources, we highlight the major ones here:
- High-throughput screening (HTS) data: concentration-response profiling data on thyroid-related endpoints for thousands of chemicals from the US EPA's ToxCast program (`data/invitrodb_v3_5_thyroid_data.RData`)
- High-throughput toxicokinetic (HTTK) information for tens of thousands of chemicals available in the httk R package (will be available in version v2.6.0 on CRAN)
- High-throughput exposure predictions representing the median of total US population aggregate exposures from all exposure pathways considered, available for almost 700k substances from the ExpoCast SEEM3 model (`data/chem.preds-2018-11-28.RData`)
- In vivo toxicity information from repeat-dose studies for over 7000 substances summarized from the US EPA's ToxValDB v9.4 (`data/toxval pods chemical level oral mgkgday.xlsx`)
If you'd rather not spend the time needed to process the data and run the HTTK models, I've included most of the numeric outputs in the `data/invitrodb_v3_5_deiod_filtered_httk.RData` object. The complete enrichment analysis can be reproduced from the ToxCast in vitro data in about 2 minutes by knitting `Supp3_Enrichment_Analysis.Rmd` in the `supplement/` subfolder of `scripts/`. All other scripts for generating the main figures in the manuscript are found in the `scripts/` directory.
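As a minimal sketch of working with the preprocessed outputs, the object can be loaded into its own environment so that it does not overwrite anything in the global workspace. The file path is the one given above; the helper name `load_preprocessed` is hypothetical, and the code assumes the working directory is the repository root.

```r
# Path taken from this README; assumes the working directory is the repository root.
rdata_path <- file.path("data", "invitrodb_v3_5_deiod_filtered_httk.RData")

# Hypothetical helper: load an .RData file into a fresh environment and
# report which objects it contained. Returns NULL if the file is missing.
load_preprocessed <- function(path) {
  if (!file.exists(path)) {
    message("File not found: ", path)
    return(invisible(NULL))
  }
  env <- new.env()
  loaded <- load(path, envir = env)  # load() returns the names of the restored objects
  message("Loaded objects: ", paste(loaded, collapse = ", "))
  env
}

preprocessed <- load_preprocessed(rdata_path)
```

If the file is found, `ls(preprocessed)` lists the restored objects, which can then be retrieved with `get()` or `preprocessed$name`.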
Details for reproducing individual figures are as follows:

- Figures 4 and 5: knit `scripts/supplement/Supp3_Enrichment_Analysis.Rmd` (the R chunks responsible for generating Figures 4 and 5 are noted).
- Figures 6-11 mostly map to individual R files in the `scripts/` directory as follows:
  - Figure 6: `model_stitching.R`
  - Figure 7: `critical_times.R`
  - Figure 8: `is_Cplasma_protective.R`
  - Figures 9-10: knit `Truong_etal_Full_Gestational_IVIVE.Rmd` with `execute.vignette = FALSE`
  - Figure 11: `devtox_pods.R`

For Figures 7-8, you can run just the plotting sections noted in the corresponding scripts with Ctrl + Alt + T in RStudio. Figures 1-3 were made in PowerPoint.
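The figure-to-script mapping can also be expressed as a small lookup, sketched below. The helper `run_figure` is hypothetical, the paths assume the working directory is the repository root, and the location of `Truong_etal_Full_Gestational_IVIVE.Rmd` under `scripts/` is an assumption. Note that this sources a whole script rather than only its plotting sections.

```r
# Figure number -> script, as listed in this README.
# The scripts/ location of the IVIVE .Rmd is an assumption.
figure_scripts <- c(
  "4"  = "scripts/supplement/Supp3_Enrichment_Analysis.Rmd",
  "5"  = "scripts/supplement/Supp3_Enrichment_Analysis.Rmd",
  "6"  = "scripts/model_stitching.R",
  "7"  = "scripts/critical_times.R",
  "8"  = "scripts/is_Cplasma_protective.R",
  "9"  = "scripts/Truong_etal_Full_Gestational_IVIVE.Rmd",
  "10" = "scripts/Truong_etal_Full_Gestational_IVIVE.Rmd",
  "11" = "scripts/devtox_pods.R"
)

# Hypothetical helper: knit .Rmd files, source .R files.
# Returns the script path invisibly; does nothing if the file is missing.
run_figure <- function(fig, scripts = figure_scripts) {
  key <- as.character(fig)
  if (!key %in% names(scripts)) stop("No script listed for Figure ", key)
  path <- scripts[[key]]
  if (!file.exists(path)) {
    message("Script not found: ", path)
    return(invisible(path))
  }
  if (grepl("\\.Rmd$", path)) rmarkdown::render(path) else source(path)
  invisible(path)
}
```

For example, `run_figure(6)` would source `scripts/model_stitching.R` when called from the repository root.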
If you're interested in replicating the prioritization pipeline from top to bottom, the steps are as follows:

- Run all chunks in `data/invitrodb_v3_5_data.Rmd` for ToxCast data retrieval from invitrodb v3.5.
- Run `scripts/deiod_invitrodb_v3_5_processing.R`, which carries out the ToxCast data filtering (see the "Assessment for Selectivity + Assay Interference" and "Refinement" steps of the workflow).
- Change the variable `execute.vignette` to `TRUE` in `Truong_etal_Full_Gestational_IVIVE.Rmd` and knit it, or run all the chunks in RStudio (the "Targeted bioactivity:exposure ratios" step of the workflow).
After running the workflow, you can proceed to make the figures as described above. It is recommended to run the script for Figure 7 before that for Figure 8, since some redundancy has been factored out.
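The three workflow steps could be chained from the R console along the following lines. This is a sketch, not part of the repository: `run_workflow` and `vignette_enabled` are hypothetical helpers, the `.Rmd` for the last step is assumed to live in `scripts/`, and the check for `execute.vignette` assumes the flag is set with a plain `<-` or `=` assignment on its own line.

```r
# Ordered workflow steps from this README (IVIVE .Rmd location is an assumption).
workflow_steps <- c(
  retrieve = "data/invitrodb_v3_5_data.Rmd",
  filter   = "scripts/deiod_invitrodb_v3_5_processing.R",
  ivive    = "scripts/Truong_etal_Full_Gestational_IVIVE.Rmd"
)

# TRUE if any line assigns TRUE to execute.vignette.
vignette_enabled <- function(lines) {
  any(grepl("^\\s*execute\\.vignette\\s*(<-|=)\\s*TRUE\\b", lines, perl = TRUE))
}

# Run the steps in order: knit .Rmd files, source .R files.
# Returns FALSE without running anything if a file is missing.
run_workflow <- function(steps = workflow_steps) {
  missing <- steps[!file.exists(steps)]
  if (length(missing) > 0) {
    message("Missing files, nothing run: ", paste(missing, collapse = ", "))
    return(invisible(FALSE))
  }
  if (!vignette_enabled(readLines(steps[["ivive"]], warn = FALSE))) {
    stop("Set execute.vignette to TRUE in ", steps[["ivive"]], " first")
  }
  for (path in steps) {
    if (grepl("\\.Rmd$", path)) rmarkdown::render(path) else source(path)
  }
  invisible(TRUE)
}

# run_workflow()  # uncomment to run the full pipeline from the repository root
```

The guard on `execute.vignette` is there only to fail early; the flag itself still has to be edited by hand in the `.Rmd` file.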
All code was written and tested using R 4.4.1, and should run using later versions. Figures in the manuscript were generated with the versions of each library listed below:
Use Case | Package(s) |
---|---|
General Data Manipulation | data.table 1.16.2, dplyr 1.1.4, tidyr 1.3.1, reshape2 1.4.4 |
Plotting and Visualization | ggplot2 3.5.1, ggrepel 0.9.5, ggvenn 0.1.10, ggstar 1.0.4, cowplot 1.1.3, ggpubr 0.6.0, RColorBrewer 1.1-3, viridis 0.6.5, viridisLite 0.4.2, pheatmap 1.0.12, dendextend 1.18.0, latex2exp 0.9.6 |
Data Download and Writing | openxlsx 4.2.6.1, readxl 1.4.3, tcpl 3.2.0 (dev) |
- High-throughput toxicokinetics (httk) R package v2.6.0 and its relevant dependencies are required.
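
To compare a local installation against the versions listed above, something like the following sketch may help. The helper `check_versions` is hypothetical and only a subset of the packages is shown; extend the `expected` vector as needed.

```r
# Subset of the versions listed in this README.
expected <- c(data.table = "1.16.2", dplyr = "1.1.4", tidyr = "1.3.1",
              ggplot2 = "3.5.1", RColorBrewer = "1.1-3", httk = "2.6.0")

# Hypothetical helper: report installed vs. expected versions without
# attaching or loading any package. Missing packages get NA.
check_versions <- function(expected) {
  pkgs <- names(expected)
  installed <- vapply(pkgs, function(pkg) {
    if (nzchar(system.file(package = pkg))) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_
    }
  }, character(1))
  # Compare as version objects so that "1.1-3" and "1.1.3" count as equal.
  matches <- vapply(seq_along(pkgs), function(i) {
    !is.na(installed[[i]]) &&
      package_version(installed[[i]]) == package_version(expected[[i]])
  }, logical(1))
  data.frame(package = pkgs,
             expected = unname(expected),
             installed = unname(installed),
             matches = matches,
             stringsAsFactors = FALSE)
}

print(check_versions(expected))
```

A mismatch does not necessarily mean the scripts will fail, since the code is expected to run on later versions as well; the table is mainly useful when trying to reproduce the figures exactly.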