Skip to content

loci RNA association

Sergey Venev edited this page Feb 1, 2016 · 4 revisions

nothing to note so far. See read for brief description.

Remarks on GLPK integer optimization for RNA folding

In our study, we are trying to determine structure of viral RNA (vRNA) encapsidated by nucleoprotein (NP). This structure is believed to be local: bulk of vRNA is condensed on NP, while the most stable hairpins are protruding from RNA-protein complex (vRNP).

We are trying to unveil those stable hairpins using local RNA folding. RNALfold local RNA folder from ViennaRNA package yields a set of local RNA structures (hairpins) smaller than window size. These hairpins are overlapping with each other in general case. To mimic encapsidated vRNA structure we select the best subset of non-overlapping hairpins from RNALfold output.

In our case, the best, implies that these hairpins yield minimum free energy of the vRNP, RNA-protein complex. Free energy functional:

F= a_1*(dg_1 - dg_NP)*L_1 + a_2*(dg_2 - dg_NP)*L_2 + ...,

where dg_i is MFE of i hairpin per nucleotide, dg_NP - free energy for nucleotide in NP-condensed form, L_i - hairpin's size in nucleotides, and a_i are binary coefficien assuming 1 if i hairpin is included in global RNA structure, and 0 otherwise. We are minimizing such a functional with respect to the set of inclusion coefficients a_1,a_2,..., given the non-overlapping constraint.

It is a Mixed Integer Programming (MIP) problem, with the objective function F(a_1,a_2,...), where (dg_i - dg_NP)*L_i are objective coefficients, and coefficients a_1,a_2,... are bound by the non-overlapping constraint. The constraint can be formalized as a set of linear inequalities, derived from matrix OM of hairpin overlap: OM_ij = 1 if hairpins i and j are overlapping and OM_ij = 0 otherwise. OM is symmetrical by definition and diagonal elements can be set OM_ii = 1 as well. Let us consider following example: hairpin 1 overlaps with 2, 2 overlaps with 3,4, and 5, while 3 and 4 overlaps with each other but not with 5:

1 ...((..))...........................
2 .......((((((....)))))).............
3 ...........((..))...................
4 ............(((..)))................
5 ......................(((...))).....

Given this set of hairpins (presumably RNALfold output), we can write non-overlap constraints as follows:

0<= a_1 + a_2 <=1
0<= a_2 + a_3 + a_4 <=1
0<= a_2 + a_5 <=1,

i.e., 1 and 2 are not allowed at the same time, 2,3,4 - as well, yet both 1,3 and 1,4 combinations are permitted, similarly combination 2,5 is prohibited, while 5 can combine with any of 1,3,4.

Clone this wiki locally