Context
Some flow cytometry datasets contain FSC/SSC in linear mode and the other channels in some variation of a logarithmic mode. Fluorescent channels are typically transformed with a zero-robust, logarithm-like method (arcsinh, auto-logicle, ...), because this matches how the populations are distributed. FSC/SSC are often used on a linear scale, matching the biological behaviour of size and granularity. Similarly, gating figures typically show markers on a log-like scale and FSC/SSC on a linear scale.
Problem with prep_fcd()
When importing the data, only a single global choice can be made for all channels, e.g. arcsinh for every channel. Transformation of the marker channels is required for proper analysis, but this forces FSC/SSC to be transformed as well.
Having the FSC/SSC channels transformed against my will makes it very hard to recreate FlowJo-like plots, e.g. for inspecting a clustering result on the FSC-W/FSC-A doublet plot.
auto_logi-transformed FSC/SSC values have very odd ranges for a very normal dataset (FSC: 4-4.5, SSC: 3-4).
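To illustrate why a log-like transformation squeezes scatter channels into such a narrow range, here is a minimal base R sketch. The FSC-A values and the cofactor of 5 are made-up assumptions, and plain `asinh()` stands in for whatever `prep_fcd()` does internally.

```r
# Hypothetical linear FSC-A values (made up for illustration)
fsc <- c(20000, 50000, 100000, 200000)

# Assumed cofactor; not taken from cyCONDOR
cofactor <- 5

# arcsinh transformation as a stand-in for a log-like transform
fsc_asinh <- asinh(fsc / cofactor)

# Ratio between the largest and smallest value before and after
range_lin   <- max(fsc) / min(fsc)
range_asinh <- max(fsc_asinh) / min(fsc_asinh)

print(fsc_asinh)
print(c(linear = range_lin, arcsinh = range_asinh))
```

The tenfold spread of the linear values shrinks to a much smaller relative spread after the transformation, which is what makes the doublet plot hard to read.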
Brainstorming for possible solutions
- high precision: import all markers in all ways (e.g. FSC as lin, arcsinh, auto_logi; CD3 as lin, arcsinh, auto_logi; ...)
many combinations to choose from for all algorithms, very rich for inspecting all transformations, but very storage-heavy
- high flexibility: specify the transformation for each marker, e.g.
prep_fcd(..., lin = c("FSC-A", "FSC-W", ...), auto_logi = c("CD3", "CD19", ...), ...)
data-efficient, flexible, but complicated input
- marker anno solution: set up a marker info .csv with the marker name in the .fcs, the transformation to use, the name to use in the cyCONDOR object, etc.
high flexibility, also a renaming option (useful for cases like FJComp-Alexa Fluor XXX -> CD3), reusable .csv file, optional marker sets like "lineage", "activation", "markers_for_clustering", ...
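The "high flexibility" option could look roughly like the sketch below. `transform_by_channel()` and its arguments are hypothetical and not part of cyCONDOR; only `lin` and `arcsinh` are handled, and the cofactor of 5 is an assumption.

```r
# Hypothetical helper, NOT part of cyCONDOR: transform only the listed
# channels and leave the channels named in `lin` untouched.
transform_by_channel <- function(expr, arcsinh = character(), lin = character(),
                                 cofactor = 5) {
  not_found <- setdiff(c(arcsinh, lin), colnames(expr))
  if (length(not_found) > 0) {
    stop("Channels not found: ", paste(not_found, collapse = ", "))
  }
  out <- expr
  for (ch in arcsinh) {
    out[[ch]] <- asinh(expr[[ch]] / cofactor)
  }
  out
}

# Made-up expression values for two events
expr <- data.frame("FSC-A" = c(50000, 100000),
                   "CD3"   = c(0, 500),
                   check.names = FALSE)

res <- transform_by_channel(expr, arcsinh = "CD3", lin = "FSC-A")
print(res)
```

Naming the linear channels explicitly, even though they are left unchanged, lets the helper catch misspelled channel names early.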
I have a strong preference for an optional marker anno table. Maybe extend the prep_fcd function so that it continues to accept the current inputs, but also accepts a marker anno .csv. Such a file could be used to:
- extract data from the channels of interest in a column "fcs_channel_name"
- rename channels, e.g. from auto-assigned/fluorophore/misspelled names to proper antigen names in a column "cyCONDOR_channel_name"
- specify transformations per channel in a column "transformation"
- specify categories of a channel in custom columns for later selection, e.g. "clustering (y/n)" to use or not use this channel in clustering
- document additional information, e.g. antibody clone/brand/...
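As a rough sketch of how such a marker anno table could be applied, the base R example below reads a small inline .csv, keeps only the listed channels, transforms them per row and renames them. The function `apply_marker_anno()`, the column `clustering`, the fluorophore channel names and all values are hypothetical; only `lin` and `arcsinh` are implemented, with an assumed cofactor of 5.

```r
# Hypothetical marker anno table (channel names are made up)
anno_csv <- 'fcs_channel_name,cyCONDOR_channel_name,transformation,clustering
FSC-A,FSC-A,lin,n
FSC-W,FSC-W,lin,n
FJComp-Alexa Fluor 488-A,CD3,arcsinh,y
FJComp-PE-A,CD19,arcsinh,y'
anno <- read.csv(text = anno_csv, stringsAsFactors = FALSE)

# Hypothetical helper, NOT part of cyCONDOR
apply_marker_anno <- function(expr, anno, cofactor = 5) {
  stopifnot(all(anno$fcs_channel_name %in% colnames(expr)))
  # keep only the channels of interest, in the order given by the table
  out <- expr[, anno$fcs_channel_name, drop = FALSE]
  for (i in seq_len(nrow(anno))) {
    method <- anno$transformation[i]
    if (method == "arcsinh") {
      out[[i]] <- asinh(out[[i]] / cofactor)
    } else if (method != "lin") {
      stop("Unsupported transformation: ", method)
    }
  }
  # rename to the antigen names used downstream
  colnames(out) <- anno$cyCONDOR_channel_name
  out
}

# Made-up expression values for two events, plus one unused channel
expr <- data.frame("FSC-A" = c(50000, 100000),
                   "FSC-W" = c(60000, 65000),
                   "FJComp-Alexa Fluor 488-A" = c(0, 500),
                   "FJComp-PE-A" = c(10, 2000),
                   "Time" = c(1, 2),
                   check.names = FALSE)

res <- apply_marker_anno(expr, anno)
clustering_markers <- anno$cyCONDOR_channel_name[anno$clustering == "y"]
print(res)
print(clustering_markers)
```

The same table then doubles as documentation and as a way to pick marker sets, here the channels flagged for clustering.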