markdown linting

colevandersWands · colevandersWands · commit e5d72a4d8e84 · 2025-05-15T10:35:58.000-04:00
diff --git a/0_domain_study/guide.md b/0_domain_study/guide.md
@@ -1,5 +1,10 @@
 # Domain Study: Guide
 
-To do meaningful research in a domain, you need to learn what others already do and don't understand in this area. Use this folder to organize your group's understanding of your research domain including: your own summaries, helpful PDFs, links you found helpful, ...
+To do meaningful research in a domain, you need to learn what others already do
+and don't understand in this area. Use this folder to organize your group's
+understanding of your research domain including: your own summaries, helpful
+PDFs, links you found helpful, ...
 
-This folder is different from `/notes` because it contains _only_ information about your research domain.  When deciding what goes here, ask yourself this question:  _Would someone need to know this to understand our research?_
+This folder is different from `/notes` because it contains _only_ information
+about your research domain. When deciding what goes here, ask yourself this
+question: _Would someone need to know this to understand our research?_
diff --git a/1_datasets/guide.md b/1_datasets/guide.md
@@ -1,14 +1,25 @@
 # Datasets: Guide
 
-Store your local datasets in this folder (`.csv`, `.xlsx`, `.json`, `.sqlite`, ...). You can use the README to document each dataset (where it's from, what data & types it contains, what you use it for, ...).
+Store your local datasets in this folder (`.csv`, `.xlsx`, `.json`, `.sqlite`,
+...). You can use the README to document each dataset (where it's from, what
+data & types it contains, what you use it for, ...).
 
-One of the primary goals of this repository is that anyone can clone and replicate your research. To make this possible **DO NOT modify or overwrite your raw datasets**! You should keep them _exactly_ as they were when you downloaded them, you may even want to name them `dataset.raw.ext` (eg. `daily_temperatures.raw.csv`).
+One of the primary goals of this repository is that anyone can clone and
+replicate your research. To make this possible **DO NOT modify or overwrite your
+raw datasets**! You should keep them _exactly_ as they were when you downloaded
+them, you may even want to name them `dataset.raw.ext` (eg.
+`daily_temperatures.raw.csv`).
 
-When cleaning and processing your datasets, you should save the prepared data to a _new_ file with a descriptive name. This approach will result in many dataset files, but that's ok!
+When cleaning and processing your datasets, you should save the prepared data to
+a _new_ file with a descriptive name. This approach will result in many dataset
+files, but that's ok!
 
 ## Types of Dataset
 
-A dataset is "simply" a collection of related measurements or observations. To create a good model of your problem using data you must understanding what _kinds_ of data exist, how to understand them, and the best ways to analyze each one. The kind of data you choose impacts:
+A dataset is "simply" a collection of related measurements or observations. To
+create a good model of your problem using data you must understand what
+_kinds_ of data exist, how to understand them, and the best ways to analyze each
+one. The kind of data you choose impacts:
 
 - The tools you use for exploration and analysis
 - How we visualize the data
@@ -32,17 +43,20 @@ Data that represents quantities and can represented as numbers.
 
 #### Continuous Data
 
-- **Definition**: Can take any value within a range (including fractions and decimals)
+- **Definition**: Can take any value within a range (including fractions and
+  decimals)
 - **Examples**: Height, weight, temperature, time, distance
 - **Analysis**: Mean, median, standard deviation, histograms, scatter plots
-- **Real-world example**: Recording daily temperature over a month (72.5°F, 68.3°F, etc.)
+- **Real-world example**: Recording daily temperature over a month (72.5°F,
+  68.3°F, etc.)
 
 #### Discrete Data
 
 - **Definition**: Countable values, typically whole numbers
 - **Examples**: Number of children, items sold, count of occurrences
 - **Analysis**: Frequency tables, bar charts, mode
-- **Real-world example**: Number of customers visiting a store each day (45, 52, 38, etc.)
+- **Real-world example**: Number of customers visiting a store each day (45, 52,
+  38, etc.)
 
 ### Qualitative (Categorical) Data
 
@@ -53,14 +67,16 @@ Data that describes qualities or characteristics of what you want to study.
 - **Definition**: Categories with no inherent order or ranking
 - **Examples**: Gender, blood type, country, color, product type
 - **Analysis**: Frequency counts, mode, chi-square tests, pie charts
-- **Real-world example**: Survey responses for favorite color (red, blue, green, etc.)
+- **Real-world example**: Survey responses for favorite color (red, blue, green,
+  etc.)
 
 #### Ordinal Data
 
 - **Definition**: Categories with a meaningful order or ranking
 - **Examples**: Education level, satisfaction ratings (1-5), economic status
 - **Analysis**: Median, percentiles, rank correlations, stacked bar charts
-- **Real-world example**: Customer satisfaction ratings (very dissatisfied, dissatisfied, neutral, satisfied, very satisfied)
+- **Real-world example**: Customer satisfaction ratings (very dissatisfied,
+  dissatisfied, neutral, satisfied, very satisfied)
 
 ### Binary Data
 
@@ -116,7 +132,8 @@ Data that describes qualities or characteristics of what you want to study.
 - **Examples**: Surveys, experiments, interviews, direct observations
 - **Advantages**: Tailored to research needs, higher control over quality
 - **Disadvantages**: Time-consuming, potentially expensive
-- **Real-world example**: Market research survey designed specifically for a new product
+- **Real-world example**: Market research survey designed specifically for a new
+  product
 
 ### Secondary Data
 
@@ -128,21 +145,26 @@ Data that describes qualities or characteristics of what you want to study.
 
 ### [Proxy Data](https://centerforgov.gitbooks.io/benchmarking/content/Proxy.html)
 
-- **Definition**: Data that is 
-- **Examples**: Tree rings to proxy historical weather patterns, tax data to proxy incomes
-- **Advantages**: Helos you understand phenomena that are difficult or impossible to study directly.
+- **Definition**: Data that is
+- **Examples**: Tree rings to proxy historical weather patterns, tax data to
+  proxy incomes
+- **Advantages**: Helps you understand phenomena that are difficult or
+  impossible to study directly.
 - **Disadvantages**: You cannot draw conclusions with the same confidence.
-- **Real-world example**:  Using the stock market + unemployment rates as a proxy for the economy..
+- **Real-world example**: Using the stock market + unemployment rates as a proxy
+  for the economy.
 
 ### Experimental Data
 
-- **Definition**: Generated from controlled experiments with manipulated variables
+- **Definition**: Generated from controlled experiments with manipulated
+  variables
 - **Examples**: A/B tests, clinical trials, laboratory experiments
 - **Characteristics**:
   - Control and treatment groups
   - Controlled conditions
   - Designed to establish causality
-- **Real-world example**: Testing whether a new website design increases conversion rates
+- **Real-world example**: Testing whether a new website design increases
+  conversion rates
 
 ### Observational Data
 
@@ -152,7 +174,8 @@ Data that describes qualities or characteristics of what you want to study.
   - Natural setting
   - No manipulation of variables
   - Good for establishing correlation (not causation)
-- **Real-world example**: Observing and recording consumer shopping behaviors in a store
+- **Real-world example**: Observing and recording consumer shopping behaviors in
+  a store
 
 ## Classification by Size and Complexity
 
@@ -186,8 +209,10 @@ Data that describes qualities or characteristics of what you want to study.
   - Curse of dimensionality
   - Feature selection importance
   - Visualization difficulties
-- **Analysis**: Dimension reduction techniques (PCA, t-SNE), specialized algorithms
-- **Real-world example**: Gene expression data with thousands of genes measured for each sample
+- **Analysis**: Dimension reduction techniques (PCA, t-SNE), specialized
+  algorithms
+- **Real-world example**: Gene expression data with thousands of genes measured
+  for each sample
 
 ## Classification by Access Type
 
@@ -204,7 +229,8 @@ Data that describes qualities or characteristics of what you want to study.
 ### Private Data
 
 - **Definition**: Access restricted to authorized users
-- **Examples**: Company internal data, personal health records, proprietary research
+- **Examples**: Company internal data, personal health records, proprietary
+  research
 - **Characteristics**:
   - Security measures required
   - Often subject to privacy regulations
@@ -251,7 +277,8 @@ Data that describes qualities or characteristics of what you want to study.
   - Reference data
   - Shared across systems
   - Requires governance
-- **Real-world example**: Product master list with SKUs, descriptions, and categories
+- **Real-world example**: Product master list with SKUs, descriptions, and
+  categories
 
 ### Metadata
 
@@ -276,7 +303,8 @@ Data that describes qualities or characteristics of what you want to study.
 
 ### Hierarchical Data
 
-- **Definition**: Organized in a tree-like structure with parent-child relationships
+- **Definition**: Organized in a tree-like structure with parent-child
+  relationships
 - **Examples**: XML, JSON, file systems
 - **Characteristics**:
   - Nested structure
diff --git a/2_data_preparation/guide.md b/2_data_preparation/guide.md
@@ -1,9 +1,13 @@
 # Data Preparation: Guide
 
-This folder is for any Python scripts or notebooks you use to clean & prepare your datasets. These files should:
+This folder is for any Python scripts or notebooks you use to clean & prepare
+your datasets. These files should:
 
 1. Read in datasets from `0_datasets`
 2. Clean, reformat, or otherwise process the datasets for later.
 3. Write the processed dataset into `0_datasets` with a helpful file name.
 
-**DO NOT modify an existing dataset in `0_datasets`! Instead, save your processed data to a _new_ file.** This is critical to open research: Someone should be able to clone this repository and run your scripts to replicate your research. If you modify an original dataset, others cannot replicate your work.
+**DO NOT modify an existing dataset in `0_datasets`! Instead, save your
+processed data to a _new_ file.** This is critical to open research: Someone
+should be able to clone this repository and run your scripts to replicate your
+research. If you modify an original dataset, others cannot replicate your work.
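Review note: the read, process, write-to-a-new-file rule in this guide could be sketched roughly as below. This is only an illustration, not code from the repository. The function name `prepare_dataset`, the "drop rows with empty cells" cleaning step, and the `.clean.csv` file name are all assumptions. The dataset folder is passed in as a path because the guide text says `0_datasets` while the folder in this diff is named `1_datasets`.

```python
import csv
from pathlib import Path


def prepare_dataset(raw_path: Path, prepared_path: Path) -> int:
    """Write a cleaned copy of raw_path to prepared_path.

    Hypothetical cleaning step: rows that are blank or contain an empty
    cell are dropped. Returns the number of data rows written.
    """
    # Guard against the one thing the guide forbids: overwriting raw data.
    if prepared_path.resolve() == raw_path.resolve():
        raise ValueError("refusing to overwrite the raw dataset")

    with raw_path.open(newline="") as src, prepared_path.open("w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        writer.writerow(next(reader))  # copy the header row unchanged
        kept = 0
        for row in reader:
            if row and all(cell.strip() for cell in row):
                writer.writerow(row)
                kept += 1
    return kept
```

A script in this folder might call it as `prepare_dataset(Path("daily_temperatures.raw.csv"), Path("daily_temperatures.clean.csv"))`, leaving the raw file untouched so that anyone who clones the repository can rerun the step.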
diff --git a/3_data_exploration/guide.md b/3_data_exploration/guide.md
@@ -1,13 +1,23 @@
 # Data Exploration: Guide
 
-This folder is for any Python scripts or notebooks you use to _explore and understand_ your datasets. These files should:
+This folder is for any Python scripts or notebooks you use to _explore and
+understand_ your datasets. These files should:
 
 1. Read in prepared datasets from `0_datasets`
 2. Explore and understand the dataset without running a deep analysis:
-   - Generate some visualizations (in a notebook, or in a separate image file saved to this folder)
-   - Run some descriptive statistics (_[beware](https://www.researchgate.net/publication/316652618_Same_Stats_Different_Graphs_Generating_Datasets_with_Varied_Appearance_and_Identical_Statistics_through_Simulated_Annealing) the [Datasaurus Dozen](https://www.research.autodesk.com/publications/same-stats-different-graphs/)!_)
-   - ... let your curiosity guide you, but _avoid_ running any inferential statistics or using any machine learning at this stage.
+   - Generate some visualizations (in a notebook, or in a separate image file
+     saved to this folder)
+   - Run some descriptive statistics
+     (_[beware](https://www.researchgate.net/publication/316652618_Same_Stats_Different_Graphs_Generating_Datasets_with_Varied_Appearance_and_Identical_Statistics_through_Simulated_Annealing)
+     the
+     [Datasaurus Dozen](https://www.research.autodesk.com/publications/same-stats-different-graphs/)!_)
+   - ... let your curiosity guide you, but _avoid_ running any inferential
+     statistics or using any machine learning at this stage.
 
-**DO NOT modify an existing dataset in `0_datasets`!** This is critical to open research: Someone should be able to clone this repository and run your scripts to replicate your research. If you modify an original dataset, others cannot replicate your work.
+**DO NOT modify an existing dataset in `0_datasets`!** This is critical to open
+research: Someone should be able to clone this repository and run your scripts
+to replicate your research. If you modify an original dataset, others cannot
+replicate your work.
 
-> [Chapter 4 - Exploratory Data Analysis](https://bookdown.org/rdpeng/artofdatascience/exploratory-data-analysis.html) from the Art of Data Science is a good starting reference.
+> [Chapter 4 - Exploratory Data Analysis](https://bookdown.org/rdpeng/artofdatascience/exploratory-data-analysis.html)
+> from the Art of Data Science is a good starting reference.
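Review note: the Datasaurus warning in this guide can be shown with a toy case. The two lists below are made-up values, not data from the project. They share the same mean and population variance, yet one is the mirror image of the other, which is why the guide pairs descriptive statistics with visualizations.

```python
import statistics

# Two made-up samples: mirror images of each other around zero.
left_heavy = [-1, -1, 2]
right_heavy = [-2, 1, 1]


def summarize(values):
    """Return (mean, population variance) for a list of numbers."""
    return statistics.mean(values), statistics.pvariance(values)


same_summary = summarize(left_heavy) == summarize(right_heavy)
same_data = sorted(left_heavy) == sorted(right_heavy)
print(f"same summary: {same_summary}, same data: {same_data}")
```

The summaries match while the data do not, so a table of means and variances alone would hide the difference that a simple plot makes obvious.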
diff --git a/4_data_analysis/README.md b/4_data_analysis/README.md
@@ -1 +1 @@
-# Data Analysis
+# Data Analysis
diff --git a/4_data_analysis/guide.md b/4_data_analysis/guide.md
@@ -1,10 +1,17 @@
 # Data Analysis: Guide
 
-This folder is for any Python scripts or notebooks you use to gain insights from your data through modeling, inferential statistics, and other analytical techniques. These files should:
+This folder is for any Python scripts or notebooks you use to gain insights from
+your data through modeling, inferential statistics, and other analytical
+techniques. These files should:
 
 1. Read in prepared datasets from `0_datasets`
-2. Learn from your datasets using methods that are appropriate to your research question, dataset and team's constraints.
+2. Learn from your datasets using methods that are appropriate to your research
+   question, dataset and team's constraints.
 
-**DO NOT modify an existing dataset in `0_datasets`!** This is critical to open research: Someone should be able to clone this repository and run your scripts to replicate your research. If you modify an original dataset, others cannot replicate your work.
+**DO NOT modify an existing dataset in `0_datasets`!** This is critical to open
+research: Someone should be able to clone this repository and run your scripts
+to replicate your research. If you modify an original dataset, others cannot
+replicate your work.
 
-> [Chapters 5-8](https://bookdown.org/rdpeng/artofdatascience) from the Art of Data Science are a good starting reference.
+> [Chapters 5-8](https://bookdown.org/rdpeng/artofdatascience) from the Art of
+> Data Science are a good starting reference.
diff --git a/5_communication_strategy/guide.md b/5_communication_strategy/guide.md
@@ -1,3 +1,6 @@
 # Communication Strategy: Guide
 
-This folder is here to organize the communication strategy for your research findings.  You can use it however you like.  Your communication artefact doesn't need to be stored here - it could be a video hosted on YouTube, a SM campaign, ... don't constrain yourself to something that can be stored on GitHub!
+This folder is here to organize the communication strategy for your research
+findings. You can use it however you like. Your communication artefact doesn't
+need to be stored here - it could be a video hosted on YouTube, a SM campaign,
+... don't constrain yourself to something that can be stored on GitHub!
diff --git a/6_final_presentation/guide.md b/6_final_presentation/guide.md
@@ -1,5 +1,6 @@
 # Final Presentation: Guide
 
-You can use this folder to plan your final presentation including presentation outlines, scripts, ...
+You can use this folder to plan your final presentation including presentation
+outlines, scripts, ...
 
 Don't forget to link to your final presentation in the repository README!
diff --git a/collaboration/communication.md b/collaboration/communication.md
@@ -9,7 +9,7 @@
 
 # Communication
 
-______________________________________________________________________
+---
 
 ## Communication Schedule
 
@@ -25,15 +25,15 @@ how often will we get in touch on each channel, and what we will discuss there:
 - **Slack/Discord**:
 - **Video Calls**:
 
-______________________________________________________________________
+---
 
 ## Availability
 
 ### Availability for calling/messaging
 
-| Day | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday | |
------- | :----: | :-----: | :-------: | :------: | :----: | :------: | :----: |
-| _name_ | | | | | | | |
+| Day    | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday |     |
+| ------ | :----: | :-----: | :-------: | :------: | :----: | :------: | :----: | --- |
+| _name_ |        |         |           |          |        |          |        |
 
 ### How many hours everyone has per day
 
diff --git a/collaboration/retrospective.md b/collaboration/retrospective.md
@@ -28,4 +28,4 @@
 
 ### Name
 
-<!-- write a 2-3 sentence reflection on your contributions, challenges and progress in this milestone -->
+<!-- reflect on your contributions, challenges and progress in this milestone -->
diff --git a/notes/guide.md b/notes/guide.md
@@ -1,3 +1,4 @@
 # Notes: Guide
 
-Use this folder to organize your team's notes about the project process, data science, and anything else you found useful while completing the CDSP project.  
+Use this folder to organize your team's notes about the project process, data
+science, and anything else you found useful while completing the CDSP project.
