This repository was archived by the owner on Nov 15, 2019. It is now read-only.

Commit 1c45df3: Make pip installable
Parent: d5c856f

11 files changed (+107, -51 lines)

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -21,6 +21,9 @@ atlassian-ide-plugin.xml
 # Virtualenv
 .venv
 
+# Development build artifacts
+*.egg-info
+
 # OS X metadata files
 .DS_Store
 

.travis.yml

Lines changed: 8 additions & 1 deletion
@@ -5,6 +5,13 @@ before_install:
 - openssl aes-256-cbc -K $encrypted_e70c9f59db9f_key -iv $encrypted_e70c9f59db9f_iv
   -in client-secret.json.enc -out client-secret.json -d
 install:
-- pip install -r requirements-dev.txt
+- make develop
 script:
 - make test
+deploy:
+  provider: pypi
+  on:
+    tags: true
+  user: jessebrennan
+  password:
+    secure: eBqClaTiltLnIC/lwqAqQQpi5Qrb3cpKn666JQPIYUA5Nn36SqLmX8QxrN1lqc9rbE/H1CufMhdwQPgpM7D5n7nQ0e4o0dXU2UsggIuEyaD4ZHliRveogbj6rrk+zrvqczZXkvNp22nefoi4ehk9wYhYQL1SAA4ytgsLJm0Svd1X/Gm6RmLuvafwKdsXwFkSm2ihJIMb7VPZa8vQ+EX0TMllX4on+6P5lQNmaCo/9pdocnM3HQTPJohZ2lx6EfUMLaX/gkC5akqqJ5MHCcbCNezJpP/MC0JibF08GDcwUy7zc79f0mIqc4rpbyqPWjZeuBUFrkwh67v8BRWcC7al3r1L7xbQRXpL00CbS2ySUNOfXgmV2M4UUUSgga4TdA28tUuP/lxgcS7tUT07ccSpo+RlD+8xLWs28oBsTJ+J7ebRrKAdTKu4X2Qts8LklxWbYIfIy2XELth+GpxR9mHaIoGDpV7E+wHSFqwGQekLMiZyuzDKKw/pG2taMq2EK6JnUO/470hc7FZf6gBIDlUMFFBffoArPrEhC4LdCwxqtAlG2Vy/UBAYqLMcTYgm7IGHQAwt1pHqMwHfb3pVsFbHyJ7vH7HJMknWHmx7yCXv+/pE6zTRNrANEPlhkVNrNAve++PIIZvBIY7K213SmUwxdmftfaxKaaAMj8xSpe6xrKk=

MANIFEST.in

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+include README.md VERSION release.py
+include *.txt

Makefile

Lines changed: 12 additions & 2 deletions
@@ -10,15 +10,25 @@ lint:
 mypy:
 	mypy --ignore-missing-imports $(MODULES)
 
+check_readme:
+	python setup.py check -s
+
 tests:=$(wildcard tests/test_*.py)
 
 # A pattern rule that runs a single test module, for example:
 # make tests/test_gen3_input_json.py
 
-$(tests): %.py : mypy lint
+$(tests): %.py : mypy lint check_readme
 	python -m unittest --verbose $*.py
 
 test: $(tests)
 
-.PHONY: all lint mypy test
+develop:
+	pip install -e .
+	pip install -r requirements-dev.txt
 
+undevelop:
+	python setup.py develop --uninstall
+	pip uninstall -y -r requirements-dev.txt
+
+.PHONY: all lint mypy test

README.md

Lines changed: 27 additions & 32 deletions
@@ -2,65 +2,60 @@
 Simple data loader for CGP HCA Data Store
 
 ## Common Setup
-1. Clone the repo:
+1. **(optional)** We recommend using a Python 3
+   [virtual environment](https://docs.python.org/3/tutorial/venv.html).
 
-   `git clone https://github.com/DataBiosphere/cgp-dss-data-loader.git`
+1. Run:
 
-2. Go to the root directory of the cloned project:
-
-   `cd cgp-dss-data-loader`
+   `pip3 install cgp-dss-data-loader`
 
-3. Run (ideally in a new [virtual environment](https://docs.python.org/3/tutorial/venv.html)):
+## setup for development
+1. clone the repo:
 
-   `pip install -r requirements.txt`
+   `git clone https://github.com/databiosphere/cgp-dss-data-loader.git`
 
-## Setup for Development
-1. Clone the repo:
+1. go to the root directory of the cloned project:
 
-   `git clone https://github.com/DataBiosphere/cgp-dss-data-loader.git`
-
-2. Go to the root directory of the cloned project:
-
    `cd cgp-dss-data-loader`
-
-3. Make sure you are on the branch `develop`.
-
-4. Run (ideally in a new [virtual environment](https://docs.python.org/3/tutorial/venv.html)):
 
-   `pip install -r requirements-dev.txt`
+1. make sure you are on the branch `develop`.
 
-## Running Tests
-Run:
+1. run (ideally in a new [virtual environment](https://docs.python.org/3/tutorial/venv.html)):
+
+   `make develop`
+
+## running tests
+run:
 
 `make test`
 
-## Getting data from Gen3 and Loading it
+## getting data from gen3 and loading it
 
-1. The first step is to extract the Gen3 data you want using the
-   [sheepdog exporter](https://github.com/david4096/sheepdog-exporter). The TopMed public data extracted
+1. the first step is to extract the gen3 data you want using the
+   [sheepdog exporter](https://github.com/david4096/sheepdog-exporter). the topmed public data extracted
    from sheepdog is available [on the release page](https://github.com/david4096/sheepdog-exporter/releases/tag/0.3.1)
-   under Assets. Assuming you use this data, you will now have a file called `topmed-public.json`
-
-2. Make sure you are running the virtual environment you set up in the **Setup** instructions.
+   under assets. assuming you use this data, you will now have a file called `topmed-public.json`
+
+1. make sure you are running the virtual environment you set up in the **setup** instructions.
 
-3. Now we need to transform the data. We can transform to the outdated gen3 format, or to the new standard format.
+1. now we need to transform the data. we can transform to the outdated gen3 format, or to the new standard format.
 
    - for the standard format, follow instructions at
     [newt-transformer](https://github.com/jessebrennan/newt-transformer#transforming-data-from-sheepdog-exporter).
 
   - for the old gen3 format
-    From the root of the project run:
+    from the root of the project run:
 
   ```
  python transformer/gen3_transformer.py /path/to/topmed_public.json --output-json transformed-topmed-public.json
  ```
 
-4. Now that we have our new transformed output we can run it with the loader.
+1. now that we have our new transformed output we can run it with the loader.
 
-   If you used the standard transformer use the command:
+   if you used the standard transformer use the command:
 
  ```
-  python scripts/cgp_data_loader.py --no-dry-run --dss-endpoint MY_DSS_ENDPOINT --staging-bucket NAME_OF_MY_S3_BUCKET standard --json-input-file transformed-topmed-public.json
+  python scripts/cgp_data_loader.py --no-dry-run --dss-endpoint my_dss_endpoint --staging-bucket name_of_my_s3_bucket standard --json-input-file transformed-topmed-public.json
 ```
 
  otherwise for the outdated gen3 format run:
@@ -69,4 +64,4 @@ Run:
 python scripts/cgp_data_loader.py --no-dry-run --dss-endpoint MY_DSS_ENDPOINT --staging-bucket NAME_OF_MY_S3_BUCKET gen3 --json-input-file transformed-topmed-public.json
 ```
 
-5. You did it!
+1. You did it!

VERSION

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+0.0.1

requirements-dev.txt

Lines changed: 0 additions & 1 deletion
@@ -1,3 +1,2 @@
 flake8
 mypy >= 0.600
--r requirements.txt

requirements.txt

Lines changed: 1 addition & 13 deletions
@@ -1,13 +1 @@
-boto3 >= 1.6.0, < 2
-cloud-blobstore >= 2.1.1, < 3
-crcmod >= 1.7, < 2
-dcplib >= 1.1.0, < 2
-google-cloud-storage >= 1.9.0, < 2
-hca >= 3.5.1, < 4
-requests >= 2.18.4, < 3
-
-# topmed metadata exporter
-lifelines >= 0.14.2, < 1
-numpy >= 1.14.3, < 2
-scipy >= 1.1.0, < 2
-matplotlib >= 2.2.2, < 3
+.

scripts/__init__.py

Whitespace-only changes.

scripts/cgp_data_loader.py

Lines changed: 2 additions & 2 deletions
@@ -24,7 +24,7 @@
 GOOGLE_PROJECT_ID = "platform-dev-178517" # For requester pays buckets
 
 
-def main(argv):
+def main(argv=sys.argv[1:]):
     import argparse
     parser = argparse.ArgumentParser(description=__doc__)
     dry_run_group = parser.add_mutually_exclusive_group(required=True)
@@ -74,4 +74,4 @@ def main(argv):
 
 
 if __name__ == '__main__':
-    main(sys.argv[1:])
+    main()
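Why the default argument matters: the `dssload` console script declared in `setup.py` below points at `scripts.cgp_data_loader:main`, and setuptools console scripts call the target function with no arguments, so `main` now falls back to `sys.argv[1:]` on its own. A minimal sketch of the pattern in a standalone, hypothetical script (only the `--dss-endpoint` option name is taken from the README above):

```python
import argparse
import sys


def main(argv=sys.argv[1:]):  # default is captured at import time, as in the diff above
    # Hypothetical parser; the real loader defines many more options.
    parser = argparse.ArgumentParser(description='demo CLI using the same pattern')
    parser.add_argument('--dss-endpoint')
    args = parser.parse_args(argv)
    print(args.dss_endpoint)


if __name__ == '__main__':
    main()  # works both when run directly and when invoked via a console-script entry point
```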

setup.py

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+import os
+
+from setuptools import setup, find_packages
+
+VERSION_FILE = 'VERSION'
+
+
+def read_version():
+    with open(VERSION_FILE, 'r') as fp:
+        return tuple(map(int, fp.read().split('.')))
+
+
+def read(fname):
+    return open(os.path.join(os.path.dirname(__file__), fname)).read()
+
+
+setup(
+    name="cgp-dss-data-loader",
+    description="Simple data loader for CGP HCA Data Store",
+    packages=find_packages(exclude=('datasets', 'tests', 'transformer')), # include all packages
+    url="https://github.com/DataBiosphere/cgp-dss-data-loader",
+    entry_points={
+        'console_scripts': [
+            'dssload=scripts.cgp_data_loader:main'
+        ]
+    },
+    long_description=read('README.md'),
+    long_description_content_type="text/markdown",
+    install_requires=['boto3 >= 1.6.0, < 2',
+                      'cloud-blobstore >= 2.1.1, < 3',
+                      'crcmod >= 1.7, < 2',
+                      'dcplib >= 1.1.0, < 2',
+                      'google-cloud-storage >= 1.9.0, < 2',
+                      'hca >= 3.5.1, < 4',
+                      'requests >= 2.18.4, < 3'],
+    license='Apache License 2.0',
+    include_package_data=True,
+    zip_safe=True,
+    author="Jesse Brennan",
+    author_email="[email protected]",
+    classifiers=[
+        'Development Status :: 3 - Alpha',
+        'Intended Audience :: Science/Research',
+        'License :: OSI Approved :: Apache Software License',
+        'Natural Language :: English',
+        'Programming Language :: Python :: 3.6',
+        'Topic :: Scientific/Engineering :: Bio-Informatics',
+    ],
+    version='{}.{}.{}'.format(*read_version()),
+    keywords=['genomics', 'metadata', 'loading', 'NIHDataCommons'],
+)
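A note on the version plumbing above: `read_version()` parses the new `VERSION` file (`0.0.1`) into an integer tuple, and `setup()` re-joins it into a dotted version string; `make develop` then installs the package in editable mode, which also registers the `dssload` command. A minimal standalone sketch of that round trip (it does not import the project):

```python
def parse_version(text):
    # mirrors read_version() in setup.py: '0.0.1' -> (0, 0, 1)
    return tuple(map(int, text.strip().split('.')))


version_tuple = parse_version('0.0.1')  # the contents of the VERSION file added in this commit
version_string = '{}.{}.{}'.format(*version_tuple)
assert version_string == '0.0.1'
```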
