pl-unirep_analysis

Table of Contents

Abstract
Citations
Synopsis
Description
TL;DR
Arguments
Run
- Using docker run
Examples
Development

Abstract

unirep_analysis is a ChRIS app that wraps the UniRep project (https://github.com/churchlab/UniRep).

This plugin is GPU-capable. The 64-unit model should run on any machine. The full-sized (1900-unit) model requires a machine with more than 8GB of GPU RAM.

Citations

For full information about the underlying method, consult the UniRep publication:

Paper: https://www.nature.com/articles/s41592-019-0598-1

The source code of UniRep is available on GitHub: https://github.com/churchlab/UniRep.

Synopsis

unirep_analysis                                                     \
                            [--dimension <modelDimension>]          \
                            [--batch_size <batchSize>]              \
                            [--learning_rate <learningRate>]        \
                            [--inputFile <inputFileToProcess>]      \
                            [--inputGlob <inputGlobPattern>]        \
                            [--modelWeightPath <pathToWeights>]     \
                            [--outputFile <resultOutputFile>]       \
                            [--topModelTraining]                    \
                            [--jointModelTraining]                  \
                            [--json]                                \
                            <inputDir>                              \
                            <outputDir>

Description

unirep_analysis is a ChRIS-based "plugin" application that infers protein sequence representations and performs generative modelling, also known as "babbling".

TL;DR

Simply pull the docker image,

docker pull fnndsc/pl-unirep_analysis

and go straight to the examples section.

Arguments

[--dimension <modelDimension>]
By default, the <modelDimension> is 64. However, the value can be changed
to 1900 (full) or 256 and the corresponding weights files (present inside
the container) will be used.

[--batch_size <batchSize>]
This represents the batch size of the babbler. Default value is 12.

[--learning_rate <learningRate>]
The learning rate, which is needed to build the model. Default is 0.001.

[--inputFile <inputFileToProcess>]
The name of the input text file that contains your amino acid sequences.
The default file name is an empty string. The full path to the
<inputFileToProcess> is constructed by joining it to <inputDir>:

        <inputDir>/<inputFileToProcess>
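
The path construction described above can be illustrated with a minimal Python sketch. The function name `resolve_input_path` is hypothetical and is not taken from the plugin source; it only shows how `<inputDir>` and `<inputFileToProcess>` combine into a single path inside the (Linux) container.

```python
from pathlib import PurePosixPath


def resolve_input_path(input_dir: str, input_file: str) -> str:
    """Join <inputDir> and <inputFileToProcess> into one POSIX path.

    Hypothetical helper for illustration only; the plugin's own code
    may build the path differently.
    """
    return str(PurePosixPath(input_dir) / input_file)


# With the docker mounts used in this README, <inputDir> is /incoming.
print(resolve_input_path("/incoming", "sequence.txt"))  # /incoming/sequence.txt
```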

[--inputGlob <inputGlobPattern>]
A glob pattern string, default '**/*txt', that identifies the file containing
an amino acid sequence. This parameter allows the input space to be searched
dynamically for a sequence file; the first "hit" is used.
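
As a rough illustration of the "first hit" behaviour, the sketch below searches a directory tree with the default pattern. The helper name `find_first_sequence_file` is hypothetical, and sorting the matches is an assumption made here so the result is reproducible; the order in which the plugin itself encounters hits is not documented in this README.

```python
from pathlib import Path
from typing import Optional


def find_first_sequence_file(input_dir: str, pattern: str = "**/*txt") -> Optional[Path]:
    """Return the first regular file under input_dir matching pattern, or None.

    Hypothetical helper; sorting is an assumption added for determinism.
    """
    hits = sorted(p for p in Path(input_dir).glob(pattern) if p.is_file())
    return hits[0] if hits else None


if __name__ == "__main__":
    # Search the current directory; prints None when nothing matches.
    print(find_first_sequence_file("."))
```

Note that '**/*txt' matches any file whose name ends in "txt", at any depth, including files directly inside the input directory.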

[--modelWeightPath <pathToWeights>]
A path to a directory containing model weight files to use for inference.

[--outputFile <resultOutputFile>]
The name of the output or formatted 'txt' file. Default name is 'format.txt'.

[--topModelTraining]
If specified, run training that optimizes only the top model.

[--jointModelTraining]
If specified, jointly train the top model and the mLSTM.

[-h]
Display inline help

[--json]
If specified, print a JSON representation of the app.
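
The arguments and defaults listed above can be summarised in a small `argparse` sketch. This is not the plugin's actual argument handling (ChRIS apps define their arguments through the ChRIS app framework); it only mirrors the documented flags. The `choices` restriction on `--dimension` and the `None` default for `--modelWeightPath` are assumptions, since this README does not state them.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the flags documented in this README."""
    parser = argparse.ArgumentParser(prog="unirep_analysis")
    # Assumption: only the three documented model sizes are accepted.
    parser.add_argument("--dimension", type=int, default=64, choices=[64, 256, 1900])
    parser.add_argument("--batch_size", type=int, default=12)
    parser.add_argument("--learning_rate", type=float, default=0.001)
    parser.add_argument("--inputFile", default="")
    parser.add_argument("--inputGlob", default="**/*txt")
    # Assumption: the README gives no default for the weights path.
    parser.add_argument("--modelWeightPath", default=None)
    parser.add_argument("--outputFile", default="format.txt")
    parser.add_argument("--topModelTraining", action="store_true")
    parser.add_argument("--jointModelTraining", action="store_true")
    parser.add_argument("--json", action="store_true")
    parser.add_argument("inputDir")
    parser.add_argument("outputDir")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--dimension", "256", "/incoming", "/outgoing"])
    print(args.dimension, args.batch_size, args.outputFile)
```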

Run

This plugin is executed via docker.

Using `docker run`

To run using docker, be sure to assign an "input" directory to /incoming and an output directory to /outgoing. Make sure that the $(pwd)/out directory is world writable!

Now, prefix all calls with

docker run --rm -v $(pwd)/in:/incoming -v $(pwd)/out:/outgoing      \
        fnndsc/pl-unirep_analysis                                   \
        unirep_analysis                                             \

Thus, getting inline help is:

mkdir in out && chmod 777 out
docker run --rm -v $(pwd)/in:/incoming -v $(pwd)/out:/outgoing      \
        fnndsc/pl-unirep_analysis                                   \
        unirep_analysis                                             \
        -h                                                          \
        /incoming /outgoing

Examples

Assuming that the <inputDir> layout conforms to

<inputDir>
    │
    └──█ sequence.txt

to process this (here on a GPU) do

docker run   --rm --gpus all                                             \
             -v $(pwd)/in:/incoming -v $(pwd)/out:/outgoing              \
             fnndsc/pl-unirep_analysis unirep_analysis                   \
             --inputFile sequence.txt --outputFile formatted.txt         \
             /incoming /outgoing

(note that --gpus all is optional), which will create in the <outputDir>:

<outputDir>
    │
    └──█ formatted.txt
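
For scripted runs, the same invocation can be assembled in Python. The sketch below only builds the argument list for the `docker run` call shown above; the helper name `docker_command` is hypothetical, and the directory names `in` and `out` follow the convention used in this README.

```python
import os
import shlex
from typing import List

IMAGE = "fnndsc/pl-unirep_analysis"


def docker_command(in_dir: str, out_dir: str, plugin_args: List[str],
                   use_gpu: bool = False) -> List[str]:
    """Assemble the docker argv for one unirep_analysis run (does not execute it)."""
    cmd = ["docker", "run", "--rm"]
    if use_gpu:
        # Optional, as noted above.
        cmd += ["--gpus", "all"]
    cmd += [
        "-v", f"{os.path.abspath(in_dir)}:/incoming",
        "-v", f"{os.path.abspath(out_dir)}:/outgoing",
        IMAGE, "unirep_analysis",
        *plugin_args,
        "/incoming", "/outgoing",
    ]
    return cmd


if __name__ == "__main__":
    cmd = docker_command(
        "in", "out",
        ["--inputFile", "sequence.txt", "--outputFile", "formatted.txt"],
    )
    # Print the command for inspection rather than running it.
    print(shlex.join(cmd))
```

To actually launch the container, the returned list can be passed to `subprocess.run(cmd, check=True)`, provided docker is installed and the output directory is world writable.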

Development

To perform in-line debugging of the container, do

docker run --rm -it --userns=host -u $(id -u):$(id -g)                                      \
    -v $PWD/unirep_analysis.py:/usr/local/lib/python3.5/dist-packages/unirep_analysis.py:ro \
    -v $PWD/src:/usr/local/lib/python3.5/dist-packages/src                                  \
    -v $PWD/in:/incoming:ro -v $PWD/out:/outgoing:rw -w /outgoing                           \
    local/pl-unirep_analysis2 unirep_analysis /incoming /outgoing

Note, if you want to use pudb for debugging, then omit the -u $(id -u):$(id -g):

docker run --rm -it --userns=host                                                           \
    -v $PWD/unirep_analysis.py:/usr/local/lib/python3.5/dist-packages/unirep_analysis.py:ro \
    -v $PWD/src:/usr/local/lib/python3.5/dist-packages/src                                  \
    -v $PWD/in:/incoming:ro -v $PWD/out:/outgoing:rw -w /outgoing                           \
    local/pl-unirep_analysis2 unirep_analysis /incoming /outgoing

Of course, in both cases above, use appropriate CLI args if required.


_-30-_
