fhdsl/SOTA2024_ReportOut

SOTA2024_ReportOut

This repository contains all of the code to reproduce the analysis done for the State of the AnVIL 2024 Poll.

Directory Structure:

data

Raw data for this project is stored in a password-protected, controlled-access shared Google Drive because it contains some identifying information. The raw data is processed, de-identified, and made available in the wrangled_data subdirectory.

annotations

These are codebook files created by the analysts. They explain the columns in the raw data and their possible values, and they provide dictionaries for categorizing certain columns (e.g., institution).

  • codebook.txt: codebook for the raw data
  • controlledAccessData_codebook.txt: controlled-access data mentioned in the poll and whether AnVIL hosts it
  • institution_codebook.txt: institutions and their simplified categorization

wrangled_data

  • resultsTidy.rds: wrangled data saved from 1_TidyData.Rmd (with the identifying information, email and raw institutional affiliation, removed)
  • resultsTidy_personas.rds: wrangled data saved from 2_PersonaStats.Rmd

analyses

  • 1_TidyData.Rmd: Fetches the raw data and wrangles it into a de-identified tidy data file for later analysis.
  • 2_PersonaStats.Rmd: Identifies personas and joins the persona categorization with the tidy data.
  • 3_MainAnalysis.Rmd: Main analysis and plotting driver
  • 4_Stats.Rmd: Code supporting every statistic and general observation stated in the report out that cannot be read directly from the plots or figures. The file is organized as follows:
    • Statements and sections appear in chronological order, matching the layout of the preprint.
    • Within a section, if one table supports multiple statements, that table is constructed in an expandable details section before any of the statements from the preprint.
    • Each statement is introduced by a section separator and the statement itself, followed by an expandable details section containing the code that supports it.
  • 5_PCA.Rmd: Performs PCA for all respondents after subsetting and wrangling the data

reports

This directory contains the knitted HTML file for each R Markdown file in the analyses directory, as well as for the figure creation R Markdown file in the figures directory.

resources

  • scripts/shared_functions.R: some functions used repeatedly in analysis or for plotting
  • plots/: plots from the main analysis saved as png files
  • supplemental_material/: Includes the complete poll, supplementary Table 1 (relation of study aims and poll questions), and supplementary Table 2 (raw responses translated to awareness and use)

figures

  • figureCreation.Rmd: Uses patchwork to combine plots from 3_MainAnalysis.Rmd into figure panels and adjusts aesthetics as necessary.
  • The figure panels themselves are saved as png files within this directory as well

Other notes:

  • Preprint information
  • A poster presented at the AnVIL Community Conference 2025
  • Companion website information
  • AnVIL Collection and other outreach information
