`pLI` is based on the premise that genes fall into three natural categories with respect to sensitivity to loss-of-function variation: null (tolerant, where loss-of-function variation, whether heterozygous or homozygous, is completely tolerated by natural selection), recessive (where heterozygous variants are tolerated but homozygous ones are not), and haploinsufficient (where heterozygous loss-of-function variants are not tolerated). To create this metric, we assumed that tolerant genes would have the expected amount of loss-of-function variation (an observed/expected rate of 1) and then took the empirical observed/expected rate of loss-of-function variation for recessive disease genes (0.706) and severe haploinsufficient genes (0.207) to represent the average outcome of the homozygous and heterozygous intolerant scenarios, respectively. We then used an expectation-maximization (EM) algorithm to assign each transcript a probability of belonging to each category. `pLI` is the probability of belonging to the haploinsufficient class of genes. These empirical observed/expected rates have been updated from previous releases. More details on the original formulation of `pLI` can be found in section 4.4 of the supplement in [Lek _et al._ Nature 2016](https://www.nature.com/articles/nature19057).
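
The EM step described above can be illustrated with a minimal sketch. It is not the production implementation: it assumes that the observed loss-of-function count for a transcript is Poisson-distributed with mean equal to the class rate times the expected count, and the names `RATES`, `class_posteriors`, and `fit_pli` are hypothetical helpers introduced here for illustration. Only the three class rates (1.0, 0.706, 0.207) come from the text.

```python
import math

# Observed/expected LoF rates per class, taken from the text above:
# null (tolerant) = 1.0, recessive = 0.706, haploinsufficient = 0.207.
RATES = {"null": 1.0, "rec": 0.706, "hi": 0.207}


def poisson_log_pmf(k, mean):
    """Log of the Poisson probability of observing k events given a mean."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)


def class_posteriors(obs, exp, weights):
    """E-step for one transcript: posterior probability of each class.

    ASSUMPTION: obs ~ Poisson(rate * exp); this likelihood is illustrative.
    """
    log_terms = {
        c: math.log(weights[c]) + poisson_log_pmf(obs, RATES[c] * exp)
        for c in RATES
    }
    top = max(log_terms.values())  # subtract the max for numerical stability
    unnorm = {c: math.exp(v - top) for c, v in log_terms.items()}
    total = sum(unnorm.values())
    return {c: u / total for c, u in unnorm.items()}


def fit_pli(counts, n_iter=50):
    """Run EM over (observed, expected) LoF counts; expected must be > 0.

    Returns the per-transcript probability of the haploinsufficient class
    and the fitted class proportions.
    """
    weights = {c: 1.0 / len(RATES) for c in RATES}  # uniform starting point
    for _ in range(n_iter):
        post = [class_posteriors(o, e, weights) for o, e in counts]
        # M-step: each class proportion is its mean posterior across
        # transcripts, floored so that log(weight) stays defined.
        weights = {
            c: max(sum(p[c] for p in post) / len(post), 1e-12) for c in RATES
        }
    post = [class_posteriors(o, e, weights) for o, e in counts]
    return [p["hi"] for p in post], weights
```

Calling `fit_pli([(0, 30.0), (20, 20.0)])` on toy counts returns a value close to 1 for the transcript with no observed variants against 30 expected, and a value close to 0 for the transcript whose observed count matches its expectation. The class rates are held fixed throughout; only the class proportions are re-estimated on each iteration.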