Commit e395a70

python SDK release 0.1.0

1 parent c42b110 commit e395a70

319 files changed: +30943 −0 lines
PKG-INFO (+208 lines)
Metadata-Version: 2.1
Name: tuneinsight
Version: 0.1.0
Summary: Diapason is the official Python SDK for the Tune Insight Agent API
License: Apache-2.0
Author: Tune Insight SA
Requires-Python: >=3.9,<3.11
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Requires-Dist: PyYAML (>=6.0,<7.0)
Requires-Dist: attrs (>=21.3.0)
Requires-Dist: docker (>=6.0.1,<7.0.0)
Requires-Dist: httpx (>=0.15.4,<0.24.0)
Requires-Dist: matplotlib (>=3.6.0,<4.0.0)
Requires-Dist: notebook (>=6.4.11,<7.0.0)
Requires-Dist: pandas (>=1.4.2,<2.0.0)
Requires-Dist: pylint (>=2.15.2,<3.0.0)
Requires-Dist: python-dateutil (>=2.8.0,<3.0.0)
Requires-Dist: python-dotenv (>=0.21.0,<0.22.0)
Requires-Dist: python-keycloak (>=0.27.0,<0.28.0)
Requires-Dist: pyvcf3 (>=1.0.3,<2.0.0)
Requires-Dist: scikit-learn (>=1.1.3,<2.0.0)
Requires-Dist: scipy (>=1.9.3,<2.0.0)
Requires-Dist: seaborn (>=0.12.1,<0.13.0)
Requires-Dist: wheel (>=0.37.1,<0.38.0)
Description-Content-Type: text/markdown
# Tune Insight Python SDK

Diapason is the Tune Insight Python SDK.

## Getting Started

### Installing

```bash
pip install tuneinsight-0.1.0.tar.gz
```

## Usage

To use the SDK you must be able to connect to a *Tune Insight* Agent.

### Creating a client to the agents

To create a new client to one of the running agents, simply run:

```python
from tuneinsight.client.diapason import Diapason

client = Diapason.from_config_path('conf.yml')
```
### Features

#### Computations

#### Preprocessing

Preprocessing operations should be defined in relation to a computation. The preprocessing is applied when the computation is run.
For example:

```python
aggregation = project.new_enc_aggregation()
aggregation.preprocessing.one_hot_encoding(target_column='gender', prefix='', specified_types=['Male', 'Female'])
```
Preprocessing operations can be applied to all nodes, or to specific nodes if the data format is different across nodes. This requires using the `nodes` argument, as follows:

```python
aggregation.preprocessing.one_hot_encoding(target_column='gender', prefix='', specified_types=['Male', 'Female'], nodes=['Organization_A'])
aggregation.preprocessing.one_hot_encoding(target_column='genre', prefix='', specified_types=['Male', 'Female'], nodes=['Organization_B'])
aggregation.preprocessing.one_hot_encoding(target_column='genero', prefix='', specified_types=['Male', 'Female'], nodes=['Organization_C'])
```
##### Select

Select specified columns from the data.

```
select(columns, create_if_missing, dummy_value, nodes)
```

* `columns` : list of column names to be selected (`List[str]`)
* `create_if_missing` : whether to create the columns if they do not exist, default = False (`bool`)
* `dummy_value` : what to fill the created columns with, default = "" (`str`)
* `nodes` : which nodes to apply the preprocessing operation to, if `None` it will apply to all (`List[str]`)
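For intuition, the local effect of `select` can be sketched with pandas (an illustrative analogue only, not the SDK call itself; the column names here are made up):

```python
import pandas as pd

df = pd.DataFrame({'age': [34, 51], 'gender': ['Male', 'Female']})

# Analogue of select(columns=['age', 'height'], create_if_missing=True, dummy_value=''):
# reindex keeps the requested columns and fills the missing one with the dummy value.
selected = df.reindex(columns=['age', 'height'], fill_value='')
print(selected.columns.tolist())  # ['age', 'height']
```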
##### One Hot Encoding

Encodes a target column into one-hot encoding and extends the table with these columns.

```
one_hot_encoding(target_column, prefix, specified_types, nodes)
```

* `target_column` : name of column to convert to one-hot encoding (`str`)
* `prefix` : prefix string to prepend to one-hot column names (`str`)
* `specified_types` : specified types to one-hot encode; if specified, then possible missing columns will be added (`List[str]`)
* `nodes` : which nodes to apply the preprocessing operation to, if `None` it will apply to all (`List[str]`)
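A hedged pandas analogue of what `one_hot_encoding` does to a node's table (not the SDK call; `specified_types` is mimicked by reindexing so a category absent from one node still gets a column):

```python
import pandas as pd

df = pd.DataFrame({'gender': ['Male', 'Female', 'Male']})

# Analogue of one_hot_encoding(target_column='gender', prefix='', specified_types=['Male', 'Female']):
# one indicator column per category, appended to the original table.
one_hot = pd.get_dummies(df['gender']).reindex(columns=['Male', 'Female'], fill_value=0)
encoded = pd.concat([df, one_hot], axis=1)
print(encoded.columns.tolist())  # ['gender', 'Male', 'Female']
```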
##### Filter

Filters rows from the data under a given condition.

```
filter(target_column, comparator, value, numerical, nodes)
```

* `target_column` : name of column to filter on (`str`)
* `comparator` : type of comparison (`ComparisonType` enum)
    * equal
    * nEqual
    * greater
    * greaterEq
    * less
    * lessEq
    * in
* `value` : value with which to compare (`str`)
* `numerical` : whether the comparison is on numerical values, default = False (`bool`)
* `nodes` : which nodes to apply the preprocessing operation to, if `None` it will apply to all (`List[str]`)
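The effect of a numerical filter can be sketched locally with pandas (an illustrative analogue; the comparator and column names are assumptions for the example):

```python
import pandas as pd

df = pd.DataFrame({'age': ['34', '51', '12'], 'gender': ['Male', 'Female', 'Male']})

# Analogue of filter(target_column='age', comparator=ComparisonType.greaterEq, value='18', numerical=True):
# with numerical=True, values are compared as numbers rather than as strings.
filtered = df[pd.to_numeric(df['age']) >= 18]
print(len(filtered))  # 2
```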
##### Counts

Concatenates a new column containing 1 for each row, in order to count the number of rows.

```
counts(output_column_name, nodes)
```

* `output_column_name` : name of the column to store the counts; if not specified, the name 'count' will be used (`str`)
* `nodes` : which nodes to apply the preprocessing operation to, if `None` it will apply to all (`List[str]`)
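Locally this amounts to the following pandas sketch (an analogue only): summing the constant column then yields the row count, which is what an aggregation over it computes.

```python
import pandas as pd

df = pd.DataFrame({'gender': ['Male', 'Female', 'Male']})

# Analogue of counts(output_column_name='count'): a constant-1 column
# whose sum across rows equals the number of rows.
df['count'] = 1
print(int(df['count'].sum()))  # 3
```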
##### Transpose

Transpose index and columns.

```
transpose(copy, nodes)
```

* `copy` : whether to copy the data after transposing, default = False (`bool`)
* `nodes` : which nodes to apply the preprocessing operation to, if `None` it will apply to all (`List[str]`)
##### Set Index

Set the DataFrame index using existing columns.

```
set_index(cols, drop, append, nodes)
```

* `cols` : list of column names to set as index (`List[str]`)
* `drop` : delete the columns to be used as the new index, default = True (`bool`)
* `append` : whether to append columns to the existing index, default = False (`bool`)
* `nodes` : which nodes to apply the preprocessing operation to, if `None` it will apply to all (`List[str]`)
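This mirrors the pandas operation of the same name; a minimal local sketch (illustrative only, with made-up column names):

```python
import pandas as pd

df = pd.DataFrame({'id': [0, 1], 'value': [10, 20]})

# Analogue of set_index(cols=['id'], drop=True, append=False):
# 'id' becomes the index and is dropped from the columns.
indexed = df.set_index(['id'], drop=True, append=False)
print(indexed.index.tolist())  # [0, 1]
```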
##### Reset Index

Reset the index, or a level of it.

```
reset_index(level, drop, nodes)
```

* `level` : list of column names to remove from the index (`List[str]`)
* `drop` : do not try to insert the index into the dataframe columns; this resets the index to the default integer index, default = False (`bool`)
* `nodes` : which nodes to apply the preprocessing operation to, if `None` it will apply to all (`List[str]`)
##### Rename

Alter axes labels.

```
rename(mapper, axis, copy, errors, nodes)
```

* `mapper` : dict of transformations to apply to that axis' values (`dict`)
* `axis` : axis to target with `mapper`; should be the axis name ('index', 'columns'), default = 'index' (`RenameAxis`)
* `copy` : also copy underlying data, default = True (`bool`)
* `errors` : if True, raise a KeyError when a dict-like mapper, index, or columns contains labels that are not present in the Index being transformed; if False, existing keys will be renamed and extra keys will be ignored (`bool`)
* `nodes` : which nodes to apply the preprocessing operation to, if `None` it will apply to all (`List[str]`)
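A typical use is harmonising column names across nodes before a computation; the local pandas analogue (illustrative only) is:

```python
import pandas as pd

df = pd.DataFrame({'genre': ['Male', 'Female']})

# Analogue of rename(mapper={'genre': 'gender'}, axis='columns'):
# map one node's local column name onto the shared schema.
renamed = df.rename(mapper={'genre': 'gender'}, axis='columns')
print(renamed.columns.tolist())  # ['gender']
```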
##### As Type

Cast column types.

```
astype(type_map, copy, errors, nodes)
```

* `type_map` : dict which maps column names to dtypes (`dict`)
* `copy` : return a copy, default = True (`bool`)
* `errors` : if True, raise a KeyError when a dict-like mapper, index, or columns contains labels that are not present in the Index being transformed; if False, existing keys will be renamed and extra keys will be ignored (`bool`)
* `nodes` : which nodes to apply the preprocessing operation to, if `None` it will apply to all (`List[str]`)
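The local pandas analogue (illustrative only): casting a string column to the numeric dtype a computation expects.

```python
import pandas as pd

df = pd.DataFrame({'age': ['34', '51']})

# Analogue of astype(type_map={'age': 'int64'}):
# after the cast, numeric operations on the column behave as expected.
cast = df.astype({'age': 'int64'})
print(int(cast['age'].sum()))  # 85
```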
##### Extract Dict Field

Extract a field value from dict-like columns.

```
extract(field, columns, names, nodes)
```

* `field` : dict field to extract (`str`)
* `columns` : list of column names from which to extract the field (`List[str]`)
* `names` : names of resulting columns; if None, no new columns are created (`List[str]`)
* `nodes` : which nodes to apply the preprocessing operation to, if `None` it will apply to all (`List[str]`)

For example, given:

| id | dict_col |
| -- | -- |
| 0 | { 'foo' : 3, 'bar' : 0.56} |
| 1 | { 'foo' : 8, 'bar' : 0.22} |
| 2 | { 'foo' : 5, 'bar' : 0.13} |

`extract(field='foo', columns=['dict_col'])` yields:

| id | dict_col |
| -- | -- |
| 0 | 3 |
| 1 | 8 |
| 2 | 5 |
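The table above can be reproduced with a local pandas sketch (an analogue of `extract`, not the SDK call; with `names=None` the extracted value replaces the dict in place):

```python
import pandas as pd

df = pd.DataFrame({'id': [0, 1, 2],
                   'dict_col': [{'foo': 3, 'bar': 0.56},
                                {'foo': 8, 'bar': 0.22},
                                {'foo': 5, 'bar': 0.13}]})

# Analogue of extract(field='foo', columns=['dict_col']):
# pull the 'foo' value out of each dict in the column.
df['dict_col'] = df['dict_col'].apply(lambda d: d['foo'])
print(df['dict_col'].tolist())  # [3, 8, 5]
```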
##### Apply RegEx

Apply a RegEx mapping to columns.

```
apply_regex(regex, columns, regex_type, names, nodes)
```

* `regex` : regular expression to apply (`str`)
* `columns` : list of column names to apply the regular expression to (`List[str]`)
* `regex_type` : defines what to retrieve from the regex (`ApplyRegExType`)
    * `ApplyRegExType.MATCH` : return the first match
    * `ApplyRegExType.FINDALL` : return the list of matching values
    * `ApplyRegExType.POSITION` : return the position of the first match
* `names` : names of resulting columns; if None, no new columns are created (`List[str]`)
* `nodes` : which nodes to apply the preprocessing operation to, if `None` it will apply to all (`List[str]`)
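The three `regex_type` modes can be sketched with Python's `re` module on a column of example values (the values and pattern are made up for illustration):

```python
import re

values = ['patient_0042', 'patient_0777']
regex = r'\d+'

# Analogues of the three ApplyRegExType modes:
matches = [re.search(regex, v).group() for v in values]    # MATCH: first match
findall = [re.findall(regex, v) for v in values]           # FINDALL: all matching values
positions = [re.search(regex, v).start() for v in values]  # POSITION: index of first match
print(matches)  # ['0042', '0777']
```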
## License

Apache License 2.0

pyproject.toml (+36 lines)
```toml
[tool.poetry]
name = "tuneinsight"
version = "0.1.0"
description = "Diapason is the official Python SDK for the Tune Insight Agent API"
authors = ["Tune Insight SA"]
license = "Apache-2.0"
include = ["src/tuneinsight/api/**/*.py"]
readme = "src/tuneinsight/README.md"

[tool.poetry.dependencies]
python = ">= 3.9, < 3.11"
python-keycloak = "^0.27.0"
PyYAML = "^6.0"
pandas = "^1.4.2"
matplotlib = "^3.6.0"
#cloudpickle = "^2.0.0" # TODO: to add in the future to run python on backend
scikit-learn = "^1.1.3"
#cryptolib = { file = "../../geco/internal/python/dist/cryptolib-0.1.0-cp39-cp39-linux_x86_64.whl" } # TODO: to add in the future for client side crypto
wheel = "^0.37.1"
pylint = "^2.15.2"
docker = "^6.0.1"
seaborn = "^0.12.1"
notebook = "^6.4.11"

# Required by ge_co_rest_api
httpx = ">=0.15.4,<0.24.0"
attrs = ">=21.3.0"
python-dateutil = "^2.8.0"
python-dotenv = "^0.21.0"
pyvcf3 = "^1.0.3"
scipy = "^1.9.3"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
```

setup.py (+69 lines)
```python
# -*- coding: utf-8 -*-
from setuptools import setup

package_dir = {'': 'src'}

packages = [
    'tuneinsight',
    'tuneinsight.api',
    'tuneinsight.api.sdk',
    'tuneinsight.api.sdk.api',
    'tuneinsight.api.sdk.api.api_admin',
    'tuneinsight.api.sdk.api.api_computations',
    'tuneinsight.api.sdk.api.api_dataobject',
    'tuneinsight.api.sdk.api.api_datasource',
    'tuneinsight.api.sdk.api.api_log',
    'tuneinsight.api.sdk.api.api_ml',
    'tuneinsight.api.sdk.api.api_network',
    'tuneinsight.api.sdk.api.api_project',
    'tuneinsight.api.sdk.api.api_protocols',
    'tuneinsight.api.sdk.api.api_query',
    'tuneinsight.api.sdk.api.api_sessions',
    'tuneinsight.api.sdk.api.health',
    'tuneinsight.api.sdk.api.metrics',
    'tuneinsight.api.sdk.models',
    'tuneinsight.client',
    'tuneinsight.computations',
    'tuneinsight.utils',
]

package_data = {'': ['*'], 'tuneinsight.utils': ['graphical/*']}

install_requires = [
    'PyYAML>=6.0,<7.0',
    'attrs>=21.3.0',
    'docker>=6.0.1,<7.0.0',
    'httpx>=0.15.4,<0.24.0',
    'matplotlib>=3.6.0,<4.0.0',
    'notebook>=6.4.11,<7.0.0',
    'pandas>=1.4.2,<2.0.0',
    'pylint>=2.15.2,<3.0.0',
    'python-dateutil>=2.8.0,<3.0.0',
    'python-dotenv>=0.21.0,<0.22.0',
    'python-keycloak>=0.27.0,<0.28.0',
    'pyvcf3>=1.0.3,<2.0.0',
    'scikit-learn>=1.1.3,<2.0.0',
    'scipy>=1.9.3,<2.0.0',
    'seaborn>=0.12.1,<0.13.0',
    'wheel>=0.37.1,<0.38.0',
]

setup_kwargs = {
    'name': 'tuneinsight',
    'version': '0.1.0',
    'description': 'Diapason is the official Python SDK for the Tune Insight Agent API',
    # The generated file embeds the full README as one long string here; it is
    # identical to the PKG-INFO description shown above and elided from this listing.
    'long_description': '...',
    'author': 'Tune Insight SA',
    'author_email': 'None',
    'maintainer': 'None',
    'maintainer_email': 'None',
    'url': 'None',
    'package_dir': package_dir,
    'packages': packages,
    'package_data': package_data,
    'install_requires': install_requires,
    'python_requires': '>=3.9,<3.11',
}

setup(**setup_kwargs)
```

0 commit comments