README.md: 11 additions, 1 deletion
@@ -237,6 +237,13 @@ Processes raw deep mutational scanning (DMS) data from external sources into sta
2. Aligns the DMS NEP sequence (PR8 strain) to the reference using MUSCLE; uses actual DMS site numbers for coordinate mapping to handle non-consecutive site numbering
3. Outputs `results/dms_data/Teo_NEP/processed_dms_data.csv` — NEP DMS data with tree reference site numbering
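
The coordinate mapping in step 2 can be sketched as follows. This is a minimal illustration, not the notebook's actual code: it assumes the two sequences have already been aligned (for example by MUSCLE) into equal-length gapped strings, and the function name `map_dms_sites_to_reference` and its arguments are hypothetical. Passing the real DMS site numbers as a list, rather than counting residues, is what lets the mapping cope with non-consecutive numbering.

```python
def map_dms_sites_to_reference(aligned_ref, aligned_dms, dms_site_numbers):
    """Map each DMS site number to a 1-based reference site number.

    aligned_ref / aligned_dms: gapped sequences of equal length ("-" = gap).
    dms_site_numbers: the actual site label of each DMS residue, in order.
    Hypothetical helper for illustration only.
    """
    if len(aligned_ref) != len(aligned_dms):
        raise ValueError("aligned sequences must have equal length")
    n_dms_residues = sum(ch != "-" for ch in aligned_dms)
    if n_dms_residues != len(dms_site_numbers):
        raise ValueError("need one site number per ungapped DMS residue")

    site_map = {}
    ref_pos = 0   # running 1-based position in the ungapped reference
    dms_idx = 0   # index into dms_site_numbers
    for ref_char, dms_char in zip(aligned_ref, aligned_dms):
        if ref_char != "-":
            ref_pos += 1
        if dms_char != "-":
            dms_site = dms_site_numbers[dms_idx]
            dms_idx += 1
            # Only record columns where both sequences have a residue.
            if ref_char != "-":
                site_map[dms_site] = ref_pos
    return site_map


# Toy example: DMS numbering skips site 3.
example_map = map_dms_sites_to_reference("MDS-NT", "M-SQNT", [1, 2, 4, 5, 6])
```

DMS residues that align against a gap in the reference (site 4 in the toy example) get no reference site and are left out of the map.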
+**`process_dms_data_chen_pa.ipynb`** (Chen et al., PA):
+1. Parses the AA mutation column to extract wildtype AA, DMS site, and mutant AA; excludes indel and stop codon mutations
+2. Averages the two no-drug fitness replicates (P1NO-1-fit, P1NO-2-fit) and groups by AA mutation (multiple nucleotide changes can produce the same AA change)
+3. Computes log-scale DMS effects: `log(mean fitness without drug)`
+4. Aligns the DMS sequence (first 240 AA of PA) to the PA tree reference using MUSCLE to establish site numbering correspondence
+5. Outputs `results/dms_data/Chen_PA/processed_dms_data.csv` — PA DMS data with tree reference site numbering
+
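
Steps 1 to 3 of the Chen et al. PA processing could look roughly like the sketch below. It is an illustration under stated assumptions, not the notebook's code: the mutation string format (`A12G`), the `aa_mutation` key, and both function names are hypothetical, and the choices to average across nucleotide variants of the same AA change and to use the natural log are assumptions. Only the replicate column names `P1NO-1-fit` and `P1NO-2-fit` come from the description above.

```python
import math
import re
from collections import defaultdict
from statistics import mean

# Assumed format: wildtype AA, site number, mutant AA (e.g. "A12G").
# Anything else (indels, stop codons written as "*") fails to match.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_aa_mutation(mutation):
    """Return (wildtype, site, mutant), or None for excluded mutations."""
    match = MUTATION_RE.match(mutation)
    if match is None:
        return None
    wildtype, site, mutant = match.groups()
    return wildtype, int(site), mutant


def compute_dms_effects(rows):
    """Compute log mean no-drug fitness per amino-acid mutation.

    rows: iterable of dicts with keys "aa_mutation" (hypothetical name),
    "P1NO-1-fit" and "P1NO-2-fit".
    """
    grouped = defaultdict(list)
    for row in rows:
        parsed = parse_aa_mutation(row["aa_mutation"])
        if parsed is None:
            continue  # skip indels and stop codons
        replicate_mean = mean([row["P1NO-1-fit"], row["P1NO-2-fit"]])
        grouped[parsed].append(replicate_mean)

    effects = {}
    for key, values in grouped.items():
        fitness = mean(values)  # assumption: average over nucleotide variants
        if fitness > 0:         # log is undefined for non-positive fitness
            effects[key] = math.log(fitness)
    return effects


example_rows = [
    {"aa_mutation": "A12G", "P1NO-1-fit": 1.0, "P1NO-2-fit": 1.0},
    {"aa_mutation": "A12G", "P1NO-1-fit": 0.5, "P1NO-2-fit": 0.5},
    {"aa_mutation": "K5*", "P1NO-1-fit": 0.1, "P1NO-2-fit": 0.1},
    {"aa_mutation": "del7", "P1NO-1-fit": 0.2, "P1NO-2-fit": 0.2},
]
example_effects = compute_dms_effects(example_rows)
```

In the example, the two `A12G` rows stand in for different nucleotide changes that yield the same AA change, so they collapse into a single entry, while the stop codon and deletion rows are dropped.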
### Step 11: Analyze Site-Specific Rates
Executes an analysis notebook that:
@@ -450,6 +457,7 @@ Located in the `results/` root directory:
- `Li_PB1/processed_dms_data.csv`- PB1 DMS fitness effects (Li et al. 2023) with tree reference site numbering
- `Hom_M1/processed_dms_data.csv`- M1 DMS fitness effects (Hom et al. 2019) with tree reference site numbering
- `Teo_NEP/processed_dms_data.csv`- NEP DMS fitness effects (Teo et al. 2024) with tree reference site numbering
+- `Chen_PA/processed_dms_data.csv`- PA DMS fitness effects (Chen et al. 2024) for first 240 AA with tree reference site numbering