47 commits
1eed1f8
Adding ensemble metrics to sidebars.
vmullig Feb 8, 2022
9156829
Updating main RosettaScripts page.
vmullig Feb 8, 2022
35cb717
Adding page for CentralTendency metric.
vmullig Feb 8, 2022
cc023e6
Updating auto-generated docs.
vmullig Feb 8, 2022
a18aca6
Adding auto-generated ensemble metric docs.
vmullig Feb 8, 2022
3e8cc7f
Updating CentralTendency ensemble metric doc.
vmullig Feb 8, 2022
0e5918f
Working on documentation for EnsembleMetrics.
vmullig Feb 10, 2022
b3730e1
Fleshing out EnsembleMetric documentation.
vmullig Feb 10, 2022
97d7892
Updating auto-generated docs.
vmullig Feb 10, 2022
ca9eebd
Adding note about accessing named values.
vmullig Feb 10, 2022
bbfdb09
Adding note about filtering.
vmullig Feb 10, 2022
9dd6e89
Revising text slightly.
vmullig Feb 10, 2022
9629b4b
Adding note about MPI mode.
vmullig Feb 10, 2022
3301900
Adding example of internal generation mode.
vmullig Feb 10, 2022
a5778bb
Adding note about multithreading.
vmullig Feb 10, 2022
a045962
Updating note about multi-threading.
vmullig Feb 11, 2022
eb93e85
Adding example for mode 3.
vmullig Feb 11, 2022
62a9158
Add EnsembleFilter docs to filter list.
vmullig Feb 11, 2022
d38608f
Moving some filters that were in the wrong folder.
vmullig Feb 11, 2022
11215f2
Adding documentation for EnsembleFilter.
vmullig Feb 11, 2022
b7e9ef7
Minor typos.
vmullig Feb 11, 2022
fc6e674
Expanding note about mode.
vmullig Feb 11, 2022
08dc81e
Minor tweak.
vmullig Feb 11, 2022
0b2cc38
Merge remote-tracking branch 'origin/master' into vmullig/ensemble_me…
vmullig Feb 25, 2022
af0cded
Updating note about MPI.
vmullig Feb 25, 2022
41dacf1
Updating CentralTendency and FragmentScore auto-generated docs.
vmullig Feb 25, 2022
faa5e70
Merge branch 'vmullig/ensemble_metric_doc' into vmullig/ensemble_metr…
vmullig Feb 25, 2022
e0d4213
Add auto-generated docs for PNearEnsembleMetric.
vmullig Mar 11, 2022
5faa107
Add PNear ensemble metric to EnsembleMetrics documenation page.
vmullig Mar 11, 2022
d00043e
Updating CentralTendency.
vmullig Mar 11, 2022
f08de9e
Adding documentation for PNear ensemble metric.
vmullig Mar 11, 2022
f93b3be
Adding a little more description.
vmullig Mar 11, 2022
0801e4a
Adding MPI note.
vmullig Mar 11, 2022
22503cc
Merge remote-tracking branch 'origin/master' into vmullig/ensemble_me…
vmullig Mar 11, 2022
99146f7
Updating auto-generated docs.
vmullig Mar 11, 2022
e23b820
Merge branch 'vmullig/ensemble_metric_doc' into vmullig/ensemble_metr…
vmullig Mar 11, 2022
171afb3
Merge remote-tracking branch 'origin/vmullig/ensemble_metric_mpi_docs…
vmullig Mar 11, 2022
af0a491
Merge remote-tracking branch 'origin/master' into vmullig/ensemble_me…
vmullig Apr 18, 2022
4fa4729
Update documentation with mention of support in MPIFileBufJobDistribu…
vmullig Apr 18, 2022
2549d1b
Merge remote-tracking branch 'origin/master' into vmullig/ensemble_me…
vmullig Apr 27, 2022
bfe5514
Merge branch 'vmullig/ensemble_metric_doc' into vmullig/ensemble_metr…
vmullig Apr 27, 2022
f2cbac0
Merge remote-tracking branch 'origin/master' into vmullig/ensemble_me…
vmullig Jul 2, 2022
9c8700e
Merge branch 'vmullig/ensemble_metric_doc' into vmullig/ensemble_metr…
vmullig Jul 2, 2022
1bef5b8
Merge branch 'vmullig/ensemble_metric_mpi_docs' into vmullig/pnear_en…
vmullig Jul 2, 2022
802ce09
Merge remote-tracking branch 'origin/master' into vmullig/ensemble_me…
vmullig Oct 11, 2022
1aed331
Merge branch 'vmullig/ensemble_metric_doc' into vmullig/ensemble_metr…
vmullig Oct 11, 2022
07f0d4a
Merge branch 'vmullig/ensemble_metric_mpi_docs' into vmullig/pnear_en…
vmullig Oct 11, 2022
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
# CentralTendency Ensemble Metric
*Back to [[EnsembleMetrics]] page.*
## CentralTendency Ensemble Metric

[[_TOC_]]

### Description

The Central Tendency metric accepts as input a real-valued [[SimpleMetric|SimpleMetrics]]. It then applies it to each pose in an ensemble, collecting a series of values. At reporting time, the metric computes measures of central tendency (mean, median, and mode), plus other descriptive statistics about the distribution of the measured value over the ensemble (standard deviation, standard error, min, max, range).

### Author and history

Created Tuesday, 8 February 2022 by Vikram K. Mulligan, Center for Computational Biology, Flatiron Institute ([email protected]). This was the first [[EnsembleMetric|EnsembleMetrics]] implemented.

### Interface

[[include:ensemble_metric_CentralTendency_type]]

### Named values produced

Measure | Name (used for the [[EnsembleFilter]]) | Description
--------|----------------------------------------|------------
Mean | mean | The average of the values measured for the poses in the ensemble.
Median | median | When the values measured from all of the poses in the ensemble are listed in increasing order, this is the middle value. If the number of poses in the ensemble is even, the middle two values are averaged.
Mode | mode | The most frequently seen value among the values measured from the poses in the ensemble. If more than one value appears with equal frequency and this frequency is highest, the values are averaged.
Standard Deviation | stddev | Estimate of the standard deviation of the measured values, defined as sqrt( sum_i( ( S_i - mean )^2 ) / N ), where S_i is the ith sample, mean is the average of all the samples, and N is the number of samples.
Standard Error | stderr | Estimate of the standard error of the mean, defined by stddev / sqrt(N), where N is the number of samples.
Min | min | The minimum value seen.
Max | max | The maximum value seen.
Range | range | The largest value seen minus the smallest.
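
The definitions in the table can be sketched in a few lines of Python. This is an illustrative reimplementation, not the Rosetta code: the function name `central_tendency` and the dictionary it returns are assumptions made for this example, and only the keys mirror the named values listed above.

```python
import math
from collections import Counter


def central_tendency(values):
    """Return the eight named values for a non-empty list of floats.

    Hypothetical helper for illustration; it follows the table's
    definitions (population-style stddev with N in the denominator).
    """
    n = len(values)
    ordered = sorted(values)
    mean = sum(ordered) / n
    mid = n // 2
    # Even-sized ensembles average the two middle values.
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2.0
    counts = Counter(ordered)
    top = max(counts.values())
    # Values tied for the highest frequency are averaged.
    modes = [v for v, c in counts.items() if c == top]
    mode = sum(modes) / len(modes)
    stddev = math.sqrt(sum((s - mean) ** 2 for s in ordered) / n)
    return {
        "mean": mean,
        "median": median,
        "mode": mode,
        "stddev": stddev,
        "stderr": stddev / math.sqrt(n),
        "min": ordered[0],
        "max": ordered[-1],
        "range": ordered[-1] - ordered[0],
    }


stats = central_tendency([2.0, 4.0, 4.0, 6.0])
tied = central_tendency([1.0, 1.0, 3.0, 3.0])
```

Here `stats["mean"]`, `stats["median"]`, and `stats["mode"]` are all 4.0, and `tied["mode"]` is 2.0 because the two equally frequent values 1.0 and 3.0 are averaged.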

#### Note about mode

The mode of a set of floating-point numbers can be thrown off by floating-point error. For instance, two poses may both have energies of -3.7641 kJ/mol, but the process of computing that energy may yield values that differ slightly at the 15th decimal place. This could prevent the filter from recognizing this as the most frequent value. Mode is most useful as a metric when the "floating-point" values are actually integers (for instance, given a [[SimpleMetric|SimpleMetrics]] like the [[SelectedResidueCountMetric]], which returns integer counts).
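
The effect is easy to reproduce outside Rosetta. The snippet below is a generic Python illustration with made-up numbers, not output from any Rosetta run; the rounding step at the end is one possible workaround suggested here, not a feature of the metric.

```python
from collections import Counter

# Two routes to "the same" value that differ in the last bits.
a = 0.1 + 0.2
b = 0.3
same = (a == b)  # False: the two doubles are not bit-identical

# Each float is counted once, so no value stands out as the mode.
float_counts = Counter([a, b, 1.0])
highest_float_frequency = max(float_counts.values())

# Integer-valued measurements (e.g. residue counts) do not have this problem.
int_counts = Counter([12, 12, 15])
most_common_value, frequency = int_counts.most_common(1)[0]

# Possible workaround (an assumption, not part of the metric): round before counting.
rounded_counts = Counter(round(v, 9) for v in [a, b, 1.0])
highest_rounded_frequency = max(rounded_counts.values())
```

With raw floats the highest frequency is 1; after rounding to nine decimal places the two near-identical values collapse into one bin with frequency 2.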

## See Also

* [[SimpleMetrics]]: Available SimpleMetrics.
* [[EnsembleMetrics]]: Available EnsembleMetrics.
* [[PNear ensemble metric|PNearEnsembleMetric]]: An ensemble metric that computes propensity to favour a desired conformation given a conformational ensemble.
* [[I want to do x]]: Guide to choosing a tool in Rosetta.
@@ -0,0 +1,67 @@
# PNear Ensemble Metric
*Back to [[EnsembleMetrics]] page.*
## PNear Ensemble Metric

[[_TOC_]]

### Description

P<sub>Near</sub> is a metric used to describe the propensity of a sequence to favour a particular conformational state. It was first described in Bhardwaj, Mulligan, Bahl _et al_. (2016) _Nature_ 538(7625):329-335, doi: 10.1038/nature19791. The PNear [[EnsembleMetric|EnsembleMetrics]] computes P<sub>Near</sub> given a conformational ensemble generated by some protocol. It can optionally compute the propensity to favour some desired state, provided with the `-in:file:native` commandline option or with the `native_file` option in RosettaScripts, or it can compute the propensity to favour the lowest-energy state observed in the ensemble.

As with any ensemble metric, the analysis can be performed for an ensemble of structures on disk, an ensemble generated on the fly in memory (but never written to disk), or even a distributed ensemble sampled on many nodes across a large cluster via MPI (with no single node ever seeing all of the structures).

### Author and history

Created Thursday, 10 March 2022 by Vikram K. Mulligan, Center for Computational Biology, Flatiron Institute ([email protected]).

### Details of the calculation

P<sub>Near</sub> is defined as follows:

![Expression defining PNear](/scripting_documentation/RosettaScripts/EnsembleMetrics/ensemble_metric_pages/PNear_Eqn.png)

In the above, _N_ is the number of structures in the ensemble, _E<sub>i</sub>_ and _rmsd<sub>i</sub>_ are the energy and the RMSD to the target conformation of the ith sample, respectively, and λ (lambda) and k<sub>B</sub>T are two parameters controlling the calculation of P<sub>Near</sub>. The parameter λ, measured in Angstroms, defines how close a structure has to be to the target conformation (either the user-provided native state or the lowest-energy state in the ensemble) in order for it to be considered "close enough" to count as contributing to high propensity to favour the target state. The parameter k<sub>B</sub>T, measured in kcal/mol, is a Boltzmann temperature that determines how much each sampled conformation contributes to the distribution of states as its energy rises.

The value of P<sub>Near</sub> ranges from 0 to 1, with values close to 0 indicating that the molecule has very low propensity to favour the desired conformation and values close to 1 indicating that it has very high propensity to do so. It may be thought of as the Boltzmann probability of being near to the desired conformation, where "near" is defined fuzzily by a Gaussian function of RMSD with breadth λ. By defining this fuzzily rather than sharply, numerical instability on repeated sampling runs is avoided: small changes in the distribution of samples result in _small_ changes in P<sub>Near</sub>, but could result in _large_ changes in the values of earlier metrics that used hard cutoffs to determine whether a sampled state was "native-like" or not.
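
As a rough numerical illustration, the sketch below assumes the form of P<sub>Near</sub> published in the paper cited above, that is, the sum over samples of exp(-rmsd<sub>i</sub><sup>2</sup>/λ<sup>2</sup>)·exp(-E<sub>i</sub>/k<sub>B</sub>T) divided by the sum over samples of exp(-E<sub>i</sub>/k<sub>B</sub>T). The equation image is not reproduced in this text, so treat that form as an assumption. The function name `pnear` and its argument names are hypothetical, and this is not the Rosetta implementation.

```python
import math


def pnear(energies, rmsds, lam=0.5, kbt=1.0):
    """Boltzmann-weighted fraction of the ensemble that is "near" the target.

    energies: per-sample energies (kcal/mol); rmsds: per-sample RMSD to the
    target (Angstroms); lam and kbt are the two parameters described above.
    """
    if not energies or len(energies) != len(rmsds):
        raise ValueError("energies and rmsds must be non-empty and equally long")
    # Shifting by the minimum energy cancels in the ratio and avoids overflow.
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / kbt) for e in energies]
    numerator = sum(w * math.exp(-((r / lam) ** 2)) for w, r in zip(weights, rmsds))
    return numerator / sum(weights)


# A funnel-like landscape: the low-energy sample is close to the target.
funnel = pnear([-10.0, 0.0], [0.1, 5.0])
# The opposite case: the low-energy sample is far from the target.
anti_funnel = pnear([0.0, -10.0], [0.1, 5.0])
```

In this toy case `funnel` comes out close to 1 and `anti_funnel` close to 0, matching the interpretation given above. Because "near" is a smooth Gaussian in RMSD, nudging one sample's RMSD slightly changes the result only slightly.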

### Interface

[[include:ensemble_metric_PNear_type]]

### Vector-valued metrics produced

The PNear ensemble metric produces two vectors of outputs that can be accessed from C++ or Python by calling the `PNearEnsembleMetric::get_realvector_metric_value_by_name( std::string const & name, core::Size const index )` method. These are called "pnear_to_native" and "pnear_to_lowestE". There is one entry in the vector for every scoring function provided when configuring the PNear ensemble metric. (For ordinary use, when only one scoring function is provided, these are 1-vectors.)

### Example usage

The following script reads a series of PDB files or structures from a silent file (passed in with one of the `-in:file:s`, `-in:file:l` or `-in:file:silent` options), aligns each to a native structure (`inputs/native.pdb`), scores each with Rosetta's `ref2015` scoring function, and computes P<sub>Near</sub> to both the native structure and to the lowest-energy structure in the ensemble of input structures.


```xml
<ROSETTASCRIPTS>
    <SCOREFXNS>
        <ScoreFunction name="r15" weights="ref2015.wts" />
    </SCOREFXNS>
    <MOVERS>
        <DeclareBond name="cyclize" atom1="C" atom2="N" res1="7" res2="1" />
    </MOVERS>
    <ENSEMBLE_METRICS>
        <PNear name="pnear" scorefxns="r15" use_CB_in_rmsd="false"
            compute_pnear_to_lowestE="true" compute_pnear_to_native="true"
            output_filename="pnear_analysis.txt" output_mode="tracer_and_file"
            native_file="inputs/native.pdb" native_preparation_protocol="cyclize"
            superimpose_for_rmsd="true" lambda="0.5" kbt="1.0"
        />
    </ENSEMBLE_METRICS>
    <PROTOCOLS>
        <Add ensemble_metrics="pnear" />
    </PROTOCOLS>
</ROSETTASCRIPTS>
```

## See Also

* [[EnsembleMetrics]]: Available EnsembleMetrics.
* [[SimpleMetrics]]: Available SimpleMetrics.
* [[CentralTendency ensemble metric|CentralTendency]]: An ensemble metric that computes mean, median, mode, etc. of values produced by a [[SimpleMetric|SimpleMetrics]].
* [[I want to do x]]: Guide to choosing a tool in Rosetta.
@@ -20,6 +20,8 @@

* [[Simple Metrics | SimpleMetrics]]

* [[Ensemble Metrics|EnsembleMetrics]]

* [[Filters|Filters-RosettaScripts]]

* [[FeaturesReporters|Features-reporter-overview]]
@@ -14,6 +14,10 @@

* [[Filters|Filters-RosettaScripts]]

* [[Simple Metrics|SimpleMetrics]]

* [[Ensemble Metrics|EnsembleMetrics]]

* [[Residue Selectors|ResidueSelectors]]

* [[PackerPalettes|PackerPalette]]
@@ -14,6 +14,10 @@

* [[Residue Selectors|ResidueSelectors]]

* [[Simple Metrics|SimpleMetrics]]

* [[Ensemble Metrics|EnsembleMetrics]]

* [[PackerPalettes|PackerPalette]]

* [[Filters|Filters-RosettaScripts]]
@@ -37,6 +37,7 @@ Filter | Description
**[[CompoundStatement|CompoundStatementFilter]]** | Uses previously defined filters with logical operations to construct a compound filter.
**[[CombinedValue|CombinedValueFilter]]** | Weighted sum of multiple filters.
**[[CalculatorFilter]]** | Combine multiple filters with a mathematical expression.
**[[EnsembleFilter]]** | Filter based, not on a property of a single pose, but on a property of an _ensemble_ of many poses.
**[[ReplicateFilter]]** | Repeat a filter multiple times and average.
**[[Boltzmann|BoltzmannFilter]]** | Boltzmann weighted sum of positive/negative filters.
**[[MoveBeforeFilter]]** | Apply a mover before applying the filter.
2 changes: 2 additions & 0 deletions scripting_documentation/RosettaScripts/Filters/_Sidebar.md
@@ -20,6 +20,8 @@

* [[Simple Metrics | SimpleMetrics]]

* [[Ensemble Metrics | EnsembleMetrics]]

* [[Filters|Filters-RosettaScripts]]

* [[FeaturesReporters|Features-reporter-overview]]
@@ -0,0 +1,132 @@
# EnsembleFilter
*Back to [[SimpleMetrics]] page.*
*Back to [[Filters | Filters-RosettaScripts]] page.*
## EnsembleFilter

Created by Vikram K. Mulligan ([email protected]) on 10 February 2022.

[[_TOC_]]

### Description

This filter takes as input an [[EnsembleMetric|EnsembleMetrics]] that has been used to evaluate some set of properties of an ensemble of poses, retrieves a named floating-point value from the metric, and filters based on whether that value is greater than, equal to, or less than some threshold. (Note that [[EnsembleMetrics]] evaluate a property of a collection or _ensemble_ of poses, not of a single pose. This makes this filter unusual: where most discard a trajectory based on the state of a single pose, this can discard a trajectory based on the state of a large ensemble of poses -- for example, based on many sampled conformations of a single design.)
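
The decision the filter makes can be sketched as a single comparison. The Python below is a conceptual illustration only: the function `ensemble_filter_passes` is hypothetical, and of the mode names in `ACCEPTANCE_MODES` only `less_than_or_equal` appears in the example on this page; the other names are assumptions made by analogy, so check the Options section for the actual values.

```python
import operator

# Only "less_than_or_equal" is taken from this page's example; the
# remaining mode names are hypothetical placeholders.
ACCEPTANCE_MODES = {
    "less_than": operator.lt,
    "less_than_or_equal": operator.le,
    "greater_than": operator.gt,
    "greater_than_or_equal": operator.ge,
}


def ensemble_filter_passes(named_values, named_value, threshold, mode):
    """Return True if the chosen named value satisfies the threshold.

    named_values stands in for the values an ensemble metric reports,
    e.g. {"mean": 3.2, "median": 3.0}.
    """
    compare = ACCEPTANCE_MODES[mode]
    return compare(named_values[named_value], threshold)


accepted = ensemble_filter_passes({"mean": 3.2}, "mean", 4.0, "less_than_or_equal")
rejected = ensemble_filter_passes({"mean": 4.5}, "mean", 4.0, "less_than_or_equal")
```

With a threshold of 4.0, an ensemble whose mean is 3.2 passes and one whose mean is 4.5 is rejected.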


### Options

[[include:filter_SimpleMetricFilter_type]]

### Example:

In this example, we load one or more cyclic peptides (provided with the `-in:file:s` or `-in:file:l` commandline options), generate a conformational ensemble of slightly perturbed conformations for each peptide _in memory_, without writing all structures to disk, and perform ensemble analysis on that ensemble with the [[CentralTendency EnsembleMetric|CentralTendency]], filtering on the results with the EnsembleFilter. Only those peptides that have low-energy ensembles of perturbed conformations pass the filter.

```xml
<ROSETTASCRIPTS>
<!-- Example of using the EnsembleFilter to filter based on the properties of an ensemble of poses
generated from the current pose. -->
<SCOREFXNS>
<ScoreFunction name="r15" weights="ref2015.wts" />
</SCOREFXNS>
<MOVERS>
<!-- The movers that set up, perturb, and relax a cyclic peptide are set up here. We
later bundle the perturbation protocol in a ParsedProtocol: -->
<DeclareBond name="connect_termini" res1="8" res2="1" atom1="C" atom2="N" add_termini="true" />
<GeneralizedKIC name="perturb1" selector_scorefunction="r15" closure_attempts="200"
stop_when_n_solutions_found="1" selector="lowest_rmsd_selector"
>
<AddResidue res_index="3"/>
<AddResidue res_index="4"/>
<AddResidue res_index="5"/>
<AddResidue res_index="6"/>
<AddResidue res_index="7"/>
<SetPivots res1="3" atom1="CA" res2="5" atom2="CA" res3="7" atom3="CA" />
<AddPerturber effect="perturb_dihedral" >
<AddAtoms res1="3" atom1="N" res2="3" atom2="CA" />
<AddAtoms res1="3" atom1="CA" res2="3" atom2="C" />
<AddAtoms res1="4" atom1="N" res2="4" atom2="CA" />
<AddAtoms res1="4" atom1="CA" res2="4" atom2="C" />
<AddAtoms res1="5" atom1="N" res2="5" atom2="CA" />
<AddAtoms res1="5" atom1="CA" res2="5" atom2="C" />
<AddAtoms res1="6" atom1="N" res2="6" atom2="CA" />
<AddAtoms res1="6" atom1="CA" res2="6" atom2="C" />
<AddAtoms res1="7" atom1="N" res2="7" atom2="CA" />
<AddAtoms res1="7" atom1="CA" res2="7" atom2="C" />
<AddValue value="5.0"/>
</AddPerturber>
</GeneralizedKIC>
<GeneralizedKIC name="perturb2" selector_scorefunction="r15" closure_attempts="200"
stop_when_n_solutions_found="1" selector="lowest_rmsd_selector"
>
<AddResidue res_index="7"/>
<AddResidue res_index="1"/>
<AddResidue res_index="2"/>
<AddResidue res_index="3"/>
<AddResidue res_index="4"/>
<SetPivots res1="7" atom1="CA" res2="2" atom2="CA" res3="4" atom3="CA"></SetPivots>
<AddPerturber effect="perturb_dihedral" >
<AddAtoms res1="7" atom1="N" res2="7" atom2="CA" />
<AddAtoms res1="7" atom1="CA" res2="7" atom2="C" />
<AddAtoms res1="1" atom1="N" res2="1" atom2="CA" />
<AddAtoms res1="1" atom1="CA" res2="1" atom2="C" />
<AddAtoms res1="2" atom1="N" res2="2" atom2="CA" />
<AddAtoms res1="2" atom1="CA" res2="2" atom2="C" />
<AddAtoms res1="3" atom1="N" res2="3" atom2="CA" />
<AddAtoms res1="3" atom1="CA" res2="3" atom2="C" />
<AddAtoms res1="4" atom1="N" res2="4" atom2="CA" />
<AddAtoms res1="4" atom1="CA" res2="4" atom2="C" />
<AddValue value="5.0"/>
</AddPerturber>
</GeneralizedKIC>
<FastRelax name="frlx" repeats="1" scorefxn="r15" />
<!-- Bundling the perturbation steps together so that they can be passed
to the CentralTendency EnsembleMetric: -->
<ParsedProtocol name="ensemble_generating_protocol" >
<Add mover="perturb1" />
<Add mover="perturb2" />
<Add mover="frlx" />
</ParsedProtocol>
</MOVERS>
<SIMPLE_METRICS>
<!-- The SimpleMetric that will be passed to the CentralTendency EnsembleMetric: -->
<TotalEnergyMetric name="total_energy" scorefxn="r15" />
</SIMPLE_METRICS>
<ENSEMBLE_METRICS>
<!-- Setting up the EnsembleMetric with both a SimpleMetric and a
ParsedProtocol for generating the ensemble from a given pose: -->
<CentralTendency name="avg_energy" n_threads="0" real_valued_metric="total_energy"
output_mode="tracer_and_file" output_filename="report.txt"
ensemble_generating_protocol="ensemble_generating_protocol"
ensemble_generating_protocol_repeats="20"
/>
</ENSEMBLE_METRICS>
<FILTERS>
<!-- Set up the filter that can discard those peptides that yield an
ensemble with energy above a cutoff threshold: -->
<EnsembleFilter name="filter_on_avg_energy" ensemble_metric="avg_energy"
named_value="mean" filter_acceptance_mode="less_than_or_equal"
threshold="4.0"
/>
</FILTERS>
<PROTOCOLS>
<!-- Set up the peptide, but don't perturb it yet: -->
<Add mover="connect_termini" />
<!-- Accumulate data with the EnsembleMetric for every replicate of the
perturbation protocol (which in this case is run by the EnsembleMetric,
generating each member of the ensemble internally, in memory, without
exporting them): -->
<Add ensemble_metrics="avg_energy" />
<!-- Abandon the jobs that produce bad ensemble properties prior to
writing the structure back to disk: -->
<Add filter="filter_on_avg_energy" />
</PROTOCOLS>
<OUTPUT scorefxn="r15" />
</ROSETTASCRIPTS>
```
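
The control flow of the script above (generate each ensemble member in memory, measure it, discard it, then filter on the accumulated statistic) can be mimicked with a toy model. This sketch makes several assumptions: a "pose" is reduced to a single float standing in for its energy, each replicate is assumed to start from a copy of the input pose, and the names `run_ensemble`, `perturb`, and `metric` are hypothetical. None of this is Rosetta API.

```python
import random


def run_ensemble(start_pose, perturb, metric, repeats):
    """Return the mean of `metric` over `repeats` perturbed copies of the pose.

    Only a running total is kept; individual ensemble members are never stored.
    """
    total = 0.0
    for _ in range(repeats):
        member = perturb(start_pose)  # assumed: each replicate starts from the input pose
        total += metric(member)       # the member is dropped after measurement
    return total / repeats


rng = random.Random(0)  # fixed seed so the toy run is repeatable
mean_energy = run_ensemble(
    start_pose=2.0,
    perturb=lambda pose: pose + rng.uniform(-1.0, 1.0),
    metric=lambda pose: pose,
    repeats=20,
)
passes_filter = mean_energy <= 4.0  # mirrors threshold="4.0", less_than_or_equal
```

Because only the running total survives each iteration, memory use stays constant however many replicates are requested, which is the point of generating the ensemble internally rather than writing every structure to disk.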

### See also

* [[EnsembleMetrics]]: Available EnsembleMetrics
* [[SimpleMetrics]]: Available SimpleMetrics
* [[SimpleMetricFilter]]: Filter on an arbitrary SimpleMetric
* [[Movers|Movers-RosettaScripts]]: Available Movers
* [[I want to do x]]: Guide to choosing a Rosetta protocol.
2 changes: 2 additions & 0 deletions scripting_documentation/RosettaScripts/Movers/_Sidebar.md
@@ -20,6 +20,8 @@

* [[Simple Metrics | SimpleMetrics]]

* [[Ensemble Metrics|EnsembleMetrics]]

* [[Filters|Filters-RosettaScripts]]

* [[FeaturesReporters|Features-reporter-overview]]
1 change: 1 addition & 0 deletions scripting_documentation/RosettaScripts/RosettaScripts.md
@@ -19,6 +19,7 @@ Fleishman SJ, Leaver-Fay A, Corn JE, Strauch EM, Khare SD, et al. (2011) Rosetta
- [[JumpSelectors |JumpSelectors]]
- [[PackerPalettes|PackerPalette]]
- [[SimpleMetrics]]
- [[EnsembleMetrics]]

---------------------

@@ -14,6 +14,10 @@

* [[Residue Selectors|ResidueSelectors]]

* [[Simple Metrics|SimpleMetrics]]

* [[Ensemble Metrics|EnsembleMetrics]]

* [[PackerPalettes|PackerPalette]]

* [[Task Operations|TaskOperations-RosettaScripts]]
@@ -19,6 +19,8 @@
* [[Task Operations|TaskOperations-RosettaScripts]]

* [[Simple Metrics | SimpleMetrics]]

* [[Ensemble Metrics|EnsembleMetrics]]

* [[Filters|Filters-RosettaScripts]]

2 changes: 2 additions & 0 deletions scripting_documentation/RosettaScripts/_Sidebar.md
@@ -22,6 +22,8 @@

* [[Simple Metrics | SimpleMetrics]]

* [[Ensemble Metrics|EnsembleMetrics]]

* [[Filters|Filters-RosettaScripts]]

* [[FeaturesReporters|Features-reporter-overview]]
@@ -20,6 +20,8 @@

* [[Simple Metrics | SimpleMetrics]]

* [[Ensemble Metrics|EnsembleMetrics]]

* [[Filters|Filters-RosettaScripts]]

* [[Features Reporters|Features-reporter-overview]]