@@ -27,6 +27,66 @@ arbitrarily complex patterns of genetic inheritance.
2727The Genetic Inheritance Graph Library (giglib) is a proof-of-concept implementation of the idea
2828behind GIGs, and is inspired by the standard [tskit](https://tskit.dev) ARG library.
2929
30- :::{todo}
31- Fill out more details from the [README.md](https://github.com/hyanwong/GeneticInheritanceGraphLibrary/blob/main/README.md)
32- :::
30+ ## Details
31+
32+ In [`tskit`](https://tskit.dev) we use edge annotations to describe which pieces of DNA are inherited in terms of a left and right coordinate.
33+ In giglib, this is extended to track the left and right coordinates in the edge *child* and the left and right coordinates in the edge *parent* separately.
34+ The left and right values in each case refer to the coordinate system of the child and parent respectively.
35+
36+ ``` {note}
37+ For terminological clarity, we switch to using the term interval-edge (`iedge`)
38+ to refer to what is normally called an `edge` in a *tskit* Tree Sequence.
39+ Separating child from parent coordinates brings a host of extra complexities,
40+ and it is unclear whether the efficiency of the tskit approach,
41+ with its edge indexing etc., will carry over in any meaningful way to this new structure.
42+ ```
43+
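As a rough illustration of the difference, the sketch below models an iedge as a plain Python dataclass and shows how a tskit-style edge (one shared left and right) could be expressed in it. The class `IEdge` and the helper `iedge_from_shared_interval` are hypothetical names invented for this example; they are not the giglib API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IEdge:
    """Hypothetical stand-in for an interval-edge (not the giglib class)."""

    parent: str
    child: str
    child_left: int
    child_right: int
    parent_left: int
    parent_right: int


def iedge_from_shared_interval(parent, child, left, right):
    """Express a tskit-style edge, where one (left, right) pair applies to
    both nodes, as an iedge whose child and parent coordinates coincide."""
    return IEdge(
        parent=parent,
        child=child,
        child_left=left,
        child_right=right,
        parent_left=left,
        parent_right=right,
    )


edge = iedge_from_shared_interval("P", "C", 6, 14)
print(edge)
```

Keeping the two coordinate pairs as separate fields is what lets the structural variants described in the following sections be written down at all: each one is just an iedge whose parent coordinates differ from its child coordinates.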
44+ ## Structural variation
45+
46+ The examples below show how different sorts of structural variation can be encoded. They correspond to the
47+ following schematic:
48+
49+ ![GIG schematic](_static/schematic.png)
50+
51+ ### Inversions
52+
53+ The easiest example is an inversion. This would be an iedge like
54+
55+ ```
56+ {parent: P, child: C, child_left: 6, child_right: 14, parent_left: 14, parent_right: 6}
57+ ```
58+
59+ There is a subtle gotcha here, because intervals in a GIG, as in _tskit_, are treated as half-closed
60+ (i.e. they do not include the position given by the right coordinate). When we invert an interval, it
61+ therefore does not include the *left* parent coordinate, but does include the *right* parent coordinate.
62+ Any naively transformed position is thus out by one. To put it another way, an inversion specified
63+ by child_left=0, child_right=3, parent_left=3, parent_right=0 transforms the points
64+ 0, 1, 2 to 2, 1, 0: although the *interval* 0, 3 is transformed to 3, 0, the *point* 0 is transformed
65+ to position 2, not position 3. See
66+ [this issue comment](https://github.com/hyanwong/giglib/issues/41#issuecomment-1858530867)
67+ for more discussion.
68+
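The off-by-one behaviour described above can be checked with a few lines of Python. This is a minimal sketch, assuming integer positions; the function `child_to_parent` is a hypothetical helper written for this example and is not part of giglib.

```python
def child_to_parent(pos, child_left, child_right, parent_left, parent_right):
    """Map a child position to the corresponding parent position.

    Hypothetical helper: an iedge is treated as inverted when
    parent_left > parent_right.
    """
    if not child_left <= pos < child_right:
        raise ValueError("position lies outside the half-closed child interval")
    offset = pos - child_left
    if parent_left < parent_right:
        # Ordinary iedge: positions keep their order.
        return parent_left + offset
    # Inverted iedge: count backwards from parent_left, minus one because
    # parent_left itself is excluded and parent_right is included.
    return parent_left - offset - 1


mapped = [child_to_parent(p, 0, 3, 3, 0) for p in (0, 1, 2)]
print(mapped)  # [2, 1, 0]
```

Applied to the iedge with child_left=6, child_right=14, parent_left=14, parent_right=6, the same function sends child position 6 to parent position 13 and child position 13 to parent position 6, so the parent coordinate 14 is never reached.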
69+ ### Duplications
70+
71+ A tandem duplication is represented by two iedges, one for each duplicated region:
72+
73+ ```
74+ {parent: P, child: C, child_left: 10, child_right: 20, parent_left: 10, parent_right: 20}
75+ {parent: P, child: C, child_left: 20, child_right: 30, parent_left: 10, parent_right: 20}
76+ ```
77+
78+ Or one of the iedges could represent a non-adjacent duplication (e.g. corresponding to a transposable element):
79+ ```
80+ {parent: P, child: C, child_left: 25, child_right: 35, parent_left: 10, parent_right: 20}
81+ ```
82+
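One way to see that these iedges encode a duplication is to ask where a single parental position ends up in the child. The sketch below does this for the tandem duplication example, using plain dictionaries with the same keys as the iedges above. The function `child_positions` is a hypothetical helper, and it assumes non-inverted iedges only.

```python
def child_positions(parent_pos, iedges):
    """Return every child position that inherits from parent_pos.

    Hypothetical helper: assumes each iedge is non-inverted
    (parent_left < parent_right) with half-closed intervals.
    """
    hits = []
    for e in iedges:
        if e["parent_left"] <= parent_pos < e["parent_right"]:
            hits.append(e["child_left"] + (parent_pos - e["parent_left"]))
    return sorted(hits)


tandem = [
    {"parent": "P", "child": "C", "child_left": 10, "child_right": 20,
     "parent_left": 10, "parent_right": 20},
    {"parent": "P", "child": "C", "child_left": 20, "child_right": 30,
     "parent_left": 10, "parent_right": 20},
]

print(child_positions(12, tandem))  # [12, 22]
```

A parental position that appears at more than one child position is the signature of a duplication; swapping the second iedge for the non-adjacent one (child_left=25, child_right=35) would place the second copy of position 12 at child position 27 instead.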
83+ ### Deletions
84+
85+ A deletion simply occurs when a region of the parent is not transmitted to any of its children (and the child coordinate system shrinks accordingly):
86+
87+ ```
88+ # Deletion of parental region from 5-15
89+ {parent: P, child: C, child_left: 0, child_right: 5, parent_left: 0, parent_right: 5}
90+ {parent: P, child: C, child_left: 5, child_right: 10, parent_left: 15, parent_right: 20}
91+ ```
92+
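Because a deletion is encoded by the absence of an iedge, finding it means looking for parental regions that no iedge covers. The sketch below does this for the example above. The function `untransmitted_regions` is a hypothetical helper, and the parent sequence length of 20 is an assumption taken from the largest parent coordinate in the example.

```python
def untransmitted_regions(iedges, parent_length):
    """List half-closed parental intervals that no iedge passes on.

    Hypothetical helper: iedges are dictionaries that all share one parent.
    Inverted iedges are handled by ordering their parent coordinates.
    """
    covered = sorted(
        (min(e["parent_left"], e["parent_right"]),
         max(e["parent_left"], e["parent_right"]))
        for e in iedges
    )
    gaps = []
    cursor = 0  # first parental position not yet known to be covered
    for left, right in covered:
        if left > cursor:
            gaps.append((cursor, left))
        cursor = max(cursor, right)
    if cursor < parent_length:
        gaps.append((cursor, parent_length))
    return gaps


deletion = [
    {"parent": "P", "child": "C", "child_left": 0, "child_right": 5,
     "parent_left": 0, "parent_right": 5},
    {"parent": "P", "child": "C", "child_left": 5, "child_right": 10,
     "parent_left": 15, "parent_right": 20},
]

print(untransmitted_regions(deletion, parent_length=20))  # [(5, 15)]
```

Note that this only inspects the iedges supplied: to establish that a region is deleted in the sense used above, the list would need to contain the iedges to all children of the parent, not just one.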