
Refactored tcplPlot and auxiliary functions to support any number of curves on a comparison plot #325


Merged
cthunes merged 111 commits into dev on Apr 8, 2025

Conversation

cthunes
Contributor

@cthunes cthunes commented Jan 24, 2025

Removed the compare.val param from tcplPlot and replaced it with compare. Use compare to choose a field from the loaded plot data to compare on. For example, compare = "dsstox_substance_id" will match all instances of the same chemical across the loaded data (say, fld = "aeid", val = c(<list of 4 endpoints>)). If all 4 endpoints test the same chemicals, then the output (for 'pdf') would be a set of plots, one per chemical, each containing 4 curves/point sets. By default compare = "m4id", which means all samples will be plotted individually, since every m4id/s2id is always unique.
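As a rough sketch of the compare usage described above (not taken from the PR's own tests), the helper below wraps a tcplPlot call that overlays the same chemical across several endpoints. The function name plot_by_chemical and the fileprefix default are hypothetical. Calling it assumes tcpl is installed and connected to an invitrodb instance.

```r
# Hypothetical wrapper: one comparison plot per chemical across the given endpoints.
# Defining it is harmless; calling it needs tcpl and a configured database connection.
plot_by_chemical <- function(aeids, fileprefix = "compare_by_chemical") {
  tcpl::tcplPlot(
    type = "mc",
    fld = "aeid",
    val = aeids,
    compare = "dsstox_substance_id",  # curves sharing a chemical land on one plot
    output = "pdf",
    fileprefix = fileprefix
  )
}
```

Leaving compare at its default of "m4id" in the same call would instead plot every sample individually.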

Use dat to supply preloaded (and potentially manipulated) data for more flexibility. For example, one could load plot data using tcplPlotLoadData the same way tcplPlot would, then add a custom column, say for a user-defined compare grouping, rather than relying on the default fields available to compare. If dat is instead a list of data.tables, no compare field is needed and each list item will be interpreted by tcplPlot as a separate comparison plot.
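A minimal sketch of the preloaded-data workflow, assuming a working tcpl database connection. The helper name plot_by_custom_group, the column name my_group, and the group_of argument are all hypothetical and only illustrate the idea of comparing on a user-added column.

```r
# Hypothetical helper: load plot data, tag each row with a user-defined group,
# then compare on that custom column instead of a built-in field.
plot_by_custom_group <- function(aeids, group_of) {
  dat <- tcpl::tcplPlotLoadData(type = "mc", fld = "aeid", val = aeids)
  # group_of is a user-supplied function returning one group label per row of dat
  dat$my_group <- group_of(dat)
  tcpl::tcplPlot(dat = dat, type = "mc", compare = "my_group", output = "pdf")
}
```

Here group_of could be something as simple as function(d) d$spid, or any rule that assigns rows to the comparison plot they should share.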

If you have an especially large number of curves to plot on each comparison plot, the new type of comparison plot could be useful. There are two new tcplPlot parameters, with these defaults: group.fld = NULL and group.threshold = 9. This means that at the specified group.threshold value (9 by default), curves on comparison plots will be grouped differently by color and in the legend, and the verbose table will no longer be printed because it becomes excessively large. If the number of curves on a plot exceeds group.threshold, the default group.fld is modl for mc and hitc for sc (open to suggestions for better defaults). Both parameters are fully customizable: group.fld can be any field available in the data, including a custom field if dat is supplied, and group.threshold can be any number, small or large, as the minimum at which to switch over to the other style. Set group.threshold to a large number to effectively disable this functionality. The most common use case for this functionality currently is plotting an entire endpoint on one comparison plot. Extensive stress testing has not been done to find the limits, but so far, using a Tox21 endpoint, I have been successful up to about 3000 curves before running into "node stack overflow" errors. I think ggplot may have some limits on the number of layers.
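The grouping parameters could be exercised roughly as follows. This is a sketch under assumptions: the wrapper name plot_whole_endpoint is hypothetical, and using compare = "aeid" to put a whole endpoint on one plot is my reading of the use case rather than a call shown in this PR. Running it requires a tcpl database connection.

```r
# Hypothetical wrapper: plot an entire endpoint on one comparison plot,
# switching to grouped coloring once the curve count passes group.threshold.
plot_whole_endpoint <- function(aeid, group.fld = NULL, group.threshold = 9) {
  tcpl::tcplPlot(
    type = "mc",
    fld = "aeid",
    val = aeid,
    compare = "aeid",                   # assumption: all curves share the endpoint's plot
    group.fld = group.fld,              # NULL falls back to the default (modl for mc)
    group.threshold = group.threshold,  # a very large value effectively disables grouping
    output = "pdf"
  )
}
```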

Note - verbose = TRUE is the new default for tcplPlot!

248 new unit tests passing. devtools::check() successful. Closes #293. Closes #215. Closes #280. Closes #249. Closes #228. Closes #175. Closes #311. Closes #117. Closes #241. Closes #296. Closes #50. Closes #336.

LOEC plotting is also included within this pull request.
tcplPlot PR.pptx

also halved the 'number of curves' in the error comparison test cases that had too many plots for console plotting
@madison-feshuk
Collaborator

Recommendation: consider hiding data points when group.threshold > 9. Data points are black and distract from the curves.
image

@cthunes cthunes added this to the invitrodb v4.3 / tcpl v3.3 milestone Mar 5, 2025
essentially rebase dev onto plotting branch
@madison-feshuk
Collaborator

Had some install warnings related to these packages:
namespace 'rlang' 1.1.3 is already loaded, but >= 1.1.4 is required
namespace 'htmltools' 0.5.7 is being loaded, but >= 0.5.8.1 is required

@madison-feshuk
Collaborator

madison-feshuk commented Mar 20, 2025

Trying to make comparison plots for some zebrafish data fit with LOEC vs normal tcplfit2. Getting the following error despite same units:

tcplPlot(dat = mc5, compare = c("aeid", "spid"), output = "pdf")
Error in FUN(X[[i]], ...) :
Concentration or normalized data type units do not match.
unique(mc5$conc_unit)
[1] "uM"
unique(mc5$resp_unit)
[1] "percent_activity"

This was actually user error related to not using tcplPlotLoadData. Thanks @cthunes!

New error:

tcplPlot(dat = mc5, type="mc", compare = c("aeid", "spid"), output = "pdf")
Error in sum(act_lens) : invalid 'type' (list) of argument

@cthunes
Contributor Author

cthunes commented Mar 20, 2025

Had some install warnings related to these packages:
namespace 'rlang' 1.1.3 is already loaded, but >= 1.1.4 is required
namespace 'htmltools' 0.5.7 is being loaded, but >= 0.5.8.1 is required

This is due to the addition of the "gt" package to our DESCRIPTION: https://cran.r-project.org/web/packages/gt/index.html. See the version requirements listed under Imports.
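For illustration, the version numbers quoted in the warnings can be checked against the required minimums with base R alone. This is only a sketch using the values from the warning text, not output from anyone's session. The variable names are hypothetical.

```r
# Versions reported in the warnings vs. the minimums they ask for.
installed <- c(rlang = "1.1.3", htmltools = "0.5.7")
required  <- c(rlang = "1.1.4", htmltools = "0.5.8.1")

# compareVersion() returns -1 when the first version is older than the second.
outdated <- mapply(function(have, need) utils::compareVersion(have, need) < 0,
                   installed, required)

# Names of the packages that would need updating.
names(outdated)[outdated]
```

In a live session, as.character(utils::packageVersion("rlang")) could supply the installed value instead of the hard-coded string.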

@cthunes
Contributor Author

cthunes commented Mar 20, 2025

New error:

tcplPlot(dat = mc5, type="mc", compare = c("aeid", "spid"), output = "pdf")
Error in sum(act_lens) : invalid 'type' (list) of argument

Resolved with the latest commit! The logic didn't cover the case where all table components within a comparison, besides m4id, are shared, as happens when comparing across versions.

@madison-feshuk
Collaborator

The resultant PDF file is now much larger when plotting on this test branch compared to using dev. File is too large to share over email.

In my example, I'm only plotting two endpoints:

tcplPlot(type="mc", fld = "aeid", val = c(3223,3226), verbose = TRUE, output = "pdf", fileprefix =
"plots_CCTE_Deisenroth_DEVTOX_25MAR2025")

image

@cthunes
Contributor Author

cthunes commented Mar 26, 2025

The resultant PDF file is now much larger when plotting on this test branch compared to using dev. File is too large to share over email.

In my example, I'm only plotting two endpoints:

tcplPlot(type="mc", fld = "aeid", val = c(3223,3226), verbose = TRUE, output = "pdf", fileprefix =
"plots_CCTE_Deisenroth_DEVTOX_25MAR2025")

@madison-feshuk I ran this example alongside several other combinations: single/multi plots, compare/not compare, verbose/not verbose. I never saw a difference remotely that big (max 30KB). 311_DEVTOX and dev_DEVTOX are the two files I produced following your example, and I'm not seeing this issue. Can we find a way to reproduce this?
image

@madison-feshuk
Collaborator

Had the following error when trying to output a plot to console:

tcplPlot(fld="m4id", val=10469134, type="mc", output = "console")
Error in if (compare.dat$fitc == 100) { : argument is of length zero

@cthunes
Contributor Author

cthunes commented Mar 28, 2025

Had the following error when trying to output a plot to console:

tcplPlot(fld="m4id", val=10469134, type="mc", output = "console")
Error in if (compare.dat$fitc == 100) { : argument is of length zero

Resolved by making the plotly plot compatible with LOEC plots!

Collaborator

@madison-feshuk madison-feshuk left a comment


Approving! After much testing, the plotting updates seem to be working great.

@cthunes cthunes merged commit a09fd5e into dev Apr 8, 2025
6 checks passed