Skip to content

Polish of the usage and TODOs to clarify and simplify options#219

Merged
XingerTang merged 1 commit into
develfrom
gg/docPolish
Mar 5, 2026
Merged

Polish of the usage and TODOs to clarify and simplify options#219
XingerTang merged 1 commit into
develfrom
gg/docPolish

Conversation

@gregorgorjanc

Copy link
Copy Markdown
Member

I was always annoyed by the usage section since it was confusing us and was not consistent. I have now completely overhauled it and strived to describe one thing in one place and cross-link where needed.

I also went and sorted all options according to their input, output, peeling methods, and peeling parameters.

Then I went through all input file formats (in the order listed in input section).

Then I went through all output file formats (in the order listed in output section).

Then I went through all peeling parameters file formats (in the order listed in peeling parameters section).

This was helpful because it points out things that are inputted, outputted, how peeling methods are run, and how we can tweak peelling parameters.

It also shows areas that we should polish (some option renames) or check.

There is a number of simple TODOs for us to sort out! I will today work with @RosCraddock on the allele prob estimation parts and pheno penetrance part.

@gregorgorjanc gregorgorjanc left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@XingerTang @AprilYUZhang @RosCraddock critical review of these changes would be much appreciated.

This should now be a model on how we should polish AlphaImpute2 documentation too. Maybe we can leverage some text from here (say the zero_one_two_etc note, maybe also others)

@gregorgorjanc gregorgorjanc left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more, I tested most options by creating input files based on the input file format section and running these commands


AlphaPeel -ped_file ped.txt -geno_file geno.txt -out_file test - method multi-locus

AlphaPeel -ped_file ped.txt -geno_file geno.txt -pheno_penetrance_file pheno_penetrance.txt -phased_geno_prob -geno_prob -geno -hap -seg_prob -out_file test -update_alt_allele_prob -est_rec_prob -est_geno_error_prob

AlphaPeel -ped_file ped.txt -geno_file geno.txt -seq_file seq.txt -pheno_penetrance_file pheno_penetrance.txt -phased_geno_prob -geno_prob -geno -hap -seg_prob -out_file test -est_alt_allele_prob -update_alt_allele_prob -alt_allele_prob -est_geno_error_prob -est_seq_error_prob

AlphaPeel -ped_file ped.txt -geno_file geno_single.txt -pheno_file pheno.txt -pheno_penetrance_file pheno_penetrance.txt -phased_geno_prob -geno_prob -geno -hap -seg_prob -pheno_prob  -out_file test -est_pheno_error_prob

Update: here are all the input files used above:

Comment thread docs/source/changelog.rst Outdated
simple_output.dosage.txt
simple_output.rec_prob.txt
TODO: Let's round up these thresholds to 0.3333333333333333 to 0.33, so 2 digits
This should be enough, I reckon, but happy to discuss!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would be easier to read.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@XingerTang can you please take this one on? Decide if you want to open an issue or just work on it as part of your workflow. Once you do, please open PR to this PR and add edits to it regarding this rounding up.

@gregorgorjanc

Copy link
Copy Markdown
Member Author

@XingerTang I have met with @RosCraddock and we addressed some of the TODOs in the file, she has already done some of these changes in tinyhouse, but we should ensure they are also available in AlphaPeel and she will open PR to this PR to address the doc changes.

@XingerTang please go over the TODO's left in the changed files in this PR and reflect on the proposed changes. I think many of these are straightforward so let's do them ASAP;) Decide if you want to open an issue for each of those or just work on it as part of your workflow. Once you do, please open PR to this PR and add edits to it regarding the documentation updates. Happy to meet about some of the more gnarly TODOs.

Once we are done on all the ends we will have the code and docs sorted and polished and functionality clearer, phiu.

@XingerTang XingerTang left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gregorgorjanc, I may need a meeting with @RosCraddock and @AprilYUZhang to understand what to change and how to implement the change.

Comment thread docs/source/usage.rst

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original help messages match the output of AlphaPeel -h exactly. Do we need to update the AlphaPeel help messages as well? Some of the changes, such as the subsections of "Individuals" and "Markers", may not be replicated in the help message output.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, update the code doc strings - I only focused on the rst file! Good call!

Comment thread docs/source/usage.rst Outdated
Comment thread docs/source/usage.rst Outdated
-rec_length REC_LENGTH
Recombination length of the chromosome in Morgans.
Default: 1.00.
TODO: rename mutation_rate to mut_prob

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Comment thread docs/source/usage.rst Outdated
Default: 1.

Estimation of model parameters:
TODO: rename to est_start_alt_allele_prob

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Comment thread docs/source/usage.rst Outdated
Comment thread docs/source/usage.rst Outdated
Comment thread docs/source/usage.rst Outdated
Alternative allele probability file
===================================
TODO: how does seg_file look like?
TODO: how does seg_map_file look like?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Comment thread docs/source/usage.rst Outdated

The phenotype penetrance file allows for user flexibility to include phenotype information. This file contains the conditional probability table for true phenotype given the true genotype. Each column represents the different phenotype states (order from 0 to total number of states), and each row represents the genotype states in order the aa, aA, Aa, AA. Below, we provide an example for a monogenic recessive, binary trait.
The ``.geno_prob.txt`` file contains *genotype probabilities* for each individual.
TODO: remove the empty lines in the output files? (4 will be 3 in text!)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The empty lines are still an open issue (#117).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted. This should be a simple fix;)

Comment thread docs/source/usage.rst Outdated

::
1. The grand *paternal* allele from the father and
the grand *paternal* allele from the mother (``Pr(F_p,- & M_p,-)``),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can use LaTeX here.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the same, but didn't know how;) Does the notation make sense to you?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we do the following way:

Suggested change
the grand *paternal* allele from the mother (``Pr(F_p,- & M_p,-)``),
the grand *paternal* allele from the mother (:math:`Pr(F_{p,-} & M_{p,-})`),

Comment thread docs/source/usage.rst Outdated

Dosage file
===========
TODO: Do we need a diagram for this maybe even show the 4 cases so we bring home the message!?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it will be useful.

@XingerTang XingerTang merged commit 6f942be into devel Mar 5, 2026
5 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants