Sage-Bionetworks/genie-bpc-nsclc-2l

Feasibility analysis for 2L NSCLC

Overview

A feasibility analysis for Revolution Medicines examining NSCLC 2L therapies in KRAS G12D+ patients. The target population is non-small cell lung cancer (NSCLC) patients who have received second line therapy, having also received anti-PD-1 or anti-PD-L1 therapy and platinum chemotherapy before the second line. We leverage the existing curation in the BPC (v3.1-consortium) NSCLC cohort. Originally completed by Alex Paynter at Sage Bionetworks.

Installation and Setup

To clone this repository, run the following in the command line from a machine with git installed:

git clone https://github.com/Sage-Bionetworks/genie-bpc-nsclc-2l

Reproducibility tools

This repository:

  • Was tested and run on R version 4.4.2.
  • Uses R projects. Before running any code, please open the .RProj file first.
  • Does not use renv to manage package environments.
  • Does not use docker or other containerization to manage deployment.

The code may still work if you ignore these conventions, but there are no guarantees.
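The points above can be checked before running anything. The following is a minimal sketch in base R, not part of this repository: `check_environment()` is a hypothetical helper name, and the only facts it relies on are the tested R version (4.4.2) and the presence of an R project file in the working directory.

```r
# Hypothetical helper (not part of this repository): warn when the current
# session differs from the setup described above.
check_environment <- function(project_dir = getwd(), tested_version = "4.4.2") {
  running <- as.character(getRversion())
  version_ok <- running == tested_version
  if (!version_ok) {
    warning("Tested on R ", tested_version, "; you are running R ", running, ".")
  }

  # Look for an R project file; match the extension case-insensitively.
  proj_files <- list.files(project_dir, pattern = "\\.rproj$", ignore.case = TRUE)
  in_project_root <- length(proj_files) > 0
  if (!in_project_root) {
    warning("No R project file found in ", project_dir, ".")
  }

  invisible(list(version_ok = version_ok, in_project_root = in_project_root))
}
```

Calling `check_environment()` from the repository root only emits warnings; it never stops execution, since a different R version may still work.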

Requirements

To run the code in this repository you will need:

  • A Synapse account which has download rights for GENIE data. See below on data versions.
  • The synapser R package, which will also require a python installation (follow instructions at that link).
    • Note: This is only used to acquire the data. It is technically possible to download the data by pointing and clicking if you want to.
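As a rough illustration of the data acquisition step, here is a sketch of downloading one file with synapser. It is not the code used in this repository: `download_release_file()` is a hypothetical wrapper, the Synapse ID in the usage comment is a placeholder rather than a real GENIE file, and it assumes your Synapse credentials are already configured (for example, through a cached login or the `SYNAPSE_AUTH_TOKEN` environment variable).

```r
# Hypothetical wrapper around synapser; not a function from this repository.
download_release_file <- function(synapse_id, dest_dir = "data-raw") {
  # Synapse entity IDs have the form "syn" followed by digits.
  if (!grepl("^syn[0-9]+$", synapse_id)) {
    stop("Expected a Synapse ID such as 'syn123', got: ", synapse_id)
  }
  if (!requireNamespace("synapser", quietly = TRUE)) {
    stop("The synapser package is not installed.")
  }
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)

  # Assumes credentials are already configured for this machine.
  synapser::synLogin()
  entity <- synapser::synGet(synapse_id, downloadLocation = dest_dir)
  entity$path  # local path of the downloaded file
}

# Usage (the ID below is a placeholder, not a real GENIE file):
# download_release_file("syn00000000")
```

The download will fail with a permissions error if your account lacks download rights for the requested data.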

Code structure

The top-level workflow of the project is in main.R, which calls the other analysis scripts in the correct sequence to reproduce my workflow. Other top-level folders include:

  • /analysis - Scripts (analysis/script), Quarto/R Markdown files (analysis/report) and any other analysis code excluding function definitions.
  • /data-raw - Raw data, where raw means "as it comes in the data release."
  • /data - Processed data, saved at various stages in the analysis.
  • /R - Function definitions. Sometimes I include {roxygen}-style documentation, as a package would.
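To make the layout above concrete, here is a sketch of how a driver script in the style of main.R could load the function definitions and then run the analysis scripts. This is an assumption about the general pattern, not a copy of main.R: `source_all()` is a hypothetical helper, and running scripts in alphabetical order is a simplification (the real main.R may name each script explicitly).

```r
# Hypothetical helper: source every .R file in a directory, in alphabetical order.
source_all <- function(dir) {
  files <- sort(list.files(dir, pattern = "\\.R$", full.names = TRUE))
  for (f in files) {
    source(f, local = FALSE)  # evaluate each file in the global environment
  }
  invisible(files)
}

# A main.R-style driver could then look like this:
# source_all("R")                # load function definitions first
# source_all("analysis/script")  # then run the analysis scripts in order
```

Loading /R first matters because the analysis scripts depend on the functions defined there.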

Data

We use the latest available version of each BPC cohort, public if available and private (consortium) if not. The public versions are listed here.

If you are not a consortium member and you want to access the exact data version to reproduce this analysis, please send a request explaining this to genieinfo@aacr.org. If your request is denied then please write me at the email below - reproduction or critique of my analysis is a great service and I want to help you do it.

The structure, processing and flow of data is described in detail in the PDF data guides, which accompany the data files.

Acknowledgments/References

We wish to thank the following groups for their upstream contributions to the data:

License

The license for this material is GNU GPLv3. That means you can use this code, as long as your code remains open for others to use. We're on a strict honor system with no repercussions here so thanks in advance for definitely following this.

Contact

If you have additional questions, please write to alexander.paynter@sagebase.org. If that fails, try genie.dcc@sagebase.org and ask them to put you in touch with me.
