Here we provide the specifications, complete descriptions, and comprehensive usage examples for base commands. For a list of commands, type base --help.
Import data files or external meta data files into a Base project.
Synopsis
usage: base import project [-d <datafiles-dirpath>] [-e <datafile-extension>] [-c <path-parsing-rule>] [-m] [-p <external-filepath>] [-a <additional-key-value>]
positional arguments:
project your project name to import.
Description
This command imports meta data derived from data file paths, or meta data defined in external files such as .xlsx or .csv.
You have to choose the import mode: data files or external files.
To import data files, specify the -d, -c and -e options (otherwise the prompt asks for them interactively).
Base then takes the following actions.
- Calculate the file hash.
- Parse the file path with parsing-rule.
- Create meta data records with the file hash and parsed path data.
- Add those records to the project database table.
{
"FileHash": String,
"MetaKey1": ...,
...
}
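The data-file import steps above can be sketched in Python. This is a minimal illustration, not Base's implementation: the function names are hypothetical, SHA-256 is an assumed hash algorithm (the source does not name one), and the rule is matched against the path relative to the import directory.

```python
import hashlib
import re
from pathlib import Path


def rule_to_regex(rule: str) -> "re.Pattern[str]":
    """Turn a rule such as "{dataType}/{label}/{id}.png" into a regex."""
    pattern = ""
    # re.split with a capture group alternates literal text and key names.
    for i, part in enumerate(re.split(r"\{(\w+)\}", rule)):
        if i % 2 == 0:
            pattern += re.escape(part)         # literal text such as "/" or ".png"
        elif part == "_":
            pattern += r"[^/]+"                # {_}: match a phrase but discard it
        else:
            pattern += rf"(?P<{part}>[^/]+?)"  # {key-name}: capture under that key
    return re.compile(pattern)


def parse_path(relative_path: str, rule: str) -> dict:
    """Extract meta data from a '/'-separated relative path."""
    match = rule_to_regex(rule).fullmatch(relative_path)
    if match is None:
        raise ValueError(f"{relative_path!r} does not match rule {rule!r}")
    return match.groupdict()


def build_record(root: Path, file_path: Path, rule: str) -> dict:
    """Combine the file hash (SHA-256 is an assumption) with parsed path data."""
    digest = hashlib.sha256(file_path.read_bytes()).hexdigest()
    relative = file_path.relative_to(root).as_posix()
    return {"FileHash": digest, **parse_path(relative, rule)}
```

For a file stored as train/1/42485.png under the import directory, the rule "{dataType}/{label}/{id}.png" yields the keys dataType, label and id next to FileHash.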
To import external files, specify the -m and -p options.
Base then takes the following actions.
- Extract tables from the external file.
- Parse each table and detect the headers in the table.
- Use each header as a Key and create meta data records.
- Link the new meta data records to the existing records in the project database table and update them.
{
"Table0,MetaKey1": ...,
...
}
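As a rough illustration of the header-to-key step, the sketch below turns CSV text into records and attaches an additional key:value pair of the kind passed with -a. It is an assumption-laden simplification: the function name is hypothetical, only CSV is handled, values stay strings (no type detection), and the linking step against existing records is left out.

```python
import csv
import io
from typing import Dict, List, Optional


def records_from_csv(text: str, additional: Optional[str] = None) -> List[Dict[str, str]]:
    """Use the CSV header row as keys and build one record per data row."""
    extra: Dict[str, str] = {}
    if additional is not None:
        # The additional value must contain a colon between key name and value.
        key, sep, value = additional.partition(":")
        if not sep or not key:
            raise ValueError("additional meta data must look like 'key:value'")
        extra[key] = value
    reader = csv.DictReader(io.StringIO(text))
    # Every record of the file receives the same additional key and value.
    return [{**row, **extra} for row in reader]
```

Called with additional="dataType:test", every record extracted from the file carries dataType set to test.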
Options
- -d <datafiles-dirpath>, --directory <datafiles-dirpath> - specify a datafiles-dirpath from which to load data files that have the extension given with the -e option. Base searches the directory recursively.
- -e <datafile-extension>, --extension <datafile-extension> - specify a datafile-extension to filter the data files to load. If one dataset contains several extensions (such as png and jpg), run the import once per extension.
- -c <path-parsing-rule>, --parse <path-parsing-rule> - specify a path-parsing-rule to extract meta data from each data file path.
  - use {key-name} to capture a phrase under that key.
  - use {_} to ignore a phrase.
  - use '/' as the separator.
  >>> sample parsing rule: {_}/{name}/{timestamp}/{sensor}-{condition}_{iteration}.csv
The following options are used only when importing external files.
- -m, --external-file - parse the content of the external file specified with the -p option.
- -p <external-filepath>, --path <external-filepath> - specify an external-filepath to import. Base parses the content of that file, extracts the tables in it, and parses each table.
- -a <additional-key-value>, --additional <additional-key-value> - specify additional meta data to attach to every record of the file you import. The value must contain a colon (":") between the key name and the value string. For instance, to import an external file and join it only with files whose data type is "test", specify -a dataType:test.
- --extract - only extract the content of the external file; do not link it to or update existing tables. You can use the --export and --output options to save the extracted results.
- --export <export-file-type> - to convert the extracted results into CSV, specify CSV as the export-file-type.
- --output <output-filepath> - specify an output-filepath to save the dataset file. The default is "./{external-filepath}_Table{number}.csv".
- --estimate-rule - only estimate the join rule from the existing tables and the external file specified with the -p option; do not link it to or update existing tables.
Example: Import png files on project “mnist”
# after you download mnist data files based on Tutorial1
$ base import mnist --directory ~/dataset/mnist --extension png --parse "{dataType}/{label}/{id}.png"
Output
Check datafiles...
found 70000 files with png extension.
70000/70000 files uploaded.
Success!
Example: Import external csv file on project “mnist”
# download external csv
$ curl -SL https://raw.githubusercontent.com/youkaichao/mnist-wrong-test/master/wrongImagesInMNISTTestset.csv > ~/Downloads/wrongImagesInMNISTTestset.csv
$ base import mnist --external-file --path ~/Downloads/wrongImagesInMNISTTestset.csv -a dataType:test
Output
1 tables found!
now estimating the rule for table joining...
1 table joining rule was estimated!
Below table joining rule will be applied...
Rule no.1
key 'index' -> connected to 'id' key on exist table
key 'originalLabel' -> connected to 'label' key on exist table
key 'correction' -> newly added
1 tables will be applied
Table 1 sample record:
{'index': 8, 'originalLabel': 5, 'correction': '-1'}
Do you want to perform table join?
Base will join tables with that rule described above.
'y' will be accepted to approve.
If you need to modify it, please enter 'm'
Definition YML file with estimated table join rules will be downloaded, then you can modify it and apply the new join rule.
Enter a value: y
Success!
If you enter 'm', a definition YAML file with the estimated table join rules is downloaded.
You can modify this file and then run the command displayed in the terminal to apply the new join rule.
Do you want to perform table join?
Base will join tables with that rule described above.
'y' will be accepted to approve.
If you need to modify it, please enter 'm'
Definition YML file with estimated table join rules will be downloaded, then you can modify it and apply the new join rule.
Enter a value: m
Downloaded a YAML file 'joinrule_definition_mnist.yml' in current directory.
Key information for the new table and the existing table is as follows.
===== New Table1 =====
KEY NAME VALUE RANGE VALUE TYPE RECORDED COUNT
'index' 8 ~ 9850 int('index') 74
'originalLabel' 0 ~ 9 int('originalLabel') 74
'correction' -1 ~ 8or9 str('correction') 74
'dataType' test ~ test str('dataType') 74
===== Existing Table =====
KEY NAME VALUE RANGE VALUE TYPE RECORDED COUNT
'id' 0 ~ 59999 str('id') 70000
'label' 0 ~ 9 str('label') 70000
'dataType' test ~ train str('dataType') 70000
You can apply the new join-rule according to 2 steps.
1. Modify the file 'joinrule_definition_mnist.yml'. Open the file to see a detailed description.
2. Execute the following command.
base import mnist --external-file --additional dataType:test --join-rule joinrule_definition_mnist.yml
Success!
joinrule_definition_mnist.yml
RequestedTime: 1654257223.4988642
ProjectName: mnist
Body:
  Table1:
    FilePath: /Users/user/Downloads/wrongImagesInMNISTTestset.csv
    JoinRules:
      index: id
      originalLabel: label
      correction:
      dataType: dataType
New join rules can be defined by modifying the Body/Table/JoinRules section.
This section consists of key-value pairs. Each key is a key name from the new table extracted from the external file. Each value is a key name from the existing table.
How to define join rules:
If the new table and the existing table share a key, write it like this.
'New table key': 'Existing table key'
If the new table has new values for an existing key, write it like this.
'New table key': 'ADD:Existing table key'
If the key is new, leave the value empty.
'New table key':
For example:
JoinRules:
  first_name: name
  age: ADD:Age
  height:
- "first_name: name" joins the new key named "first_name" with the existing key named "name".
- "age: ADD:Age" adds the values of the new key named "age" to the existing key named "Age".
- "height: " adds the key named "height" as a new key.
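To make the three rule forms concrete, here is a small Python sketch that sorts a JoinRules mapping into the three cases. The function name and the shape of the returned plan are hypothetical; the input is assumed to be the mapping as a YAML loader would return it, where an empty value arrives as None.

```python
from typing import Dict, Optional


def classify_join_rules(join_rules: Dict[str, Optional[str]]) -> dict:
    """Split JoinRules into joins, value additions and brand-new keys."""
    plan = {"join": {}, "add_values": {}, "new_keys": []}
    for new_key, target in join_rules.items():
        if target is None or str(target).strip() == "":
            # "height:" -> the key is added as a new key
            plan["new_keys"].append(new_key)
        elif str(target).startswith("ADD:"):
            # "age: ADD:Age" -> new values are added to the existing key
            plan["add_values"][new_key] = str(target)[len("ADD:"):]
        else:
            # "first_name: name" -> join the new key with the existing key
            plan["join"][new_key] = str(target)
    return plan
```

For the example above, the plan joins first_name with name, adds the values of age to Age, and lists height as a new key.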
Invite collaborators into your Base project.
Synopsis
usage: base invite project [-m <member-id>] [-p <permission-level>] [-u]
positional arguments:
project your project name to invite.
Description
This command controls access to your project.
You can invite a new project member with one of the following permission levels.
- Viewer: can only read meta data in the project database. A viewer cannot import data files or external files and cannot control the permissions of other members.
- Editor: can read and write meta data in the project database. An editor cannot control the permissions of other members.
- Admin: can read and write meta data in the project database. An admin can also control the permissions of other members, but cannot transfer the Owner permission level.
- Owner: can transfer the Owner permission to others, and delete the project completely.
You can also update a member's permission level with the -u option if you are an admin or the owner.
If you are the project owner and update another member's permission to Owner, that member becomes the project owner and your permission is downgraded to Admin.
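The permission rules can be modeled as a small Python sketch. The names below are hypothetical and the model assumes the four levels are strictly ordered (Owner can do everything Admin can, and so on), which the description implies but does not state outright. Cases the description leaves open, such as an admin changing the owner's level, are not handled.

```python
from enum import IntEnum
from typing import Dict


class Level(IntEnum):
    VIEWER = 1
    EDITOR = 2
    ADMIN = 3
    OWNER = 4


def can_write(level: Level) -> bool:
    """Editors and above can write meta data; viewers can only read."""
    return level >= Level.EDITOR


def can_set_permission(actor: Level, new_level: Level) -> bool:
    """Admins manage members, but only the owner can hand out Owner."""
    if new_level == Level.OWNER:
        return actor == Level.OWNER
    return actor >= Level.ADMIN


def update_permission(members: Dict[str, Level], actor: str,
                      target: str, new_level: Level) -> Dict[str, Level]:
    """Return a new member table after the actor changes the target's level."""
    if not can_set_permission(members[actor], new_level):
        raise PermissionError(f"{actor} may not grant {new_level.name}")
    updated = dict(members)
    updated[target] = new_level
    if new_level == Level.OWNER and actor != target:
        # Transferring ownership downgrades the previous owner to Admin.
        updated[actor] = Level.ADMIN
    return updated
```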
Options
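- -m <member-id>, --member <member-id> - specify the member-id to invite. If you are to be invited by someone else, you have to tell them your user id.
- -p <permission-level>, --permission <permission-level> - specify the permission-level to grant (Viewer, Editor, Admin or Owner).
- -u, --update - update the permission level of an existing member.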
Example: Invite a viewer into mnist
check the current members of your project mnist with the [base show <project> --member-list](#show) command
$ base show mnist --member-list
Output
project Members
===============
xxxx@yyyy.com (Owner, invited at 2022-03-11 18:18:54)
then, invite zzzz@yyyy.com into mnist as a viewer
$ base invite mnist --member zzzz@yyyy.com --permission viewer
Output
Successfully invited zzzz@yyyy.com into mnist as Viewer
finally, you can check the invited user in the project member list.
$ base show mnist --member-list
Output
project Members
===============
xxxx@yyyy.com (Owner, invited at 2022-03-11 18:18:54)
zzzz@yyyy.com (Viewer, invited at 2022-03-12 13:45:04)
Example: Update project member’s permission
check your current project members
$ base show mnist --member-list
Output
project Members
===============
xxxx@yyyy.com (Owner, invited at 2022-03-11 18:18:54)
zzzz@yyyy.com (Viewer, invited at 2022-03-12 13:45:04)
then, update the permission of zzzz@yyyy.com to editor
$ base invite mnist --update --member zzzz@yyyy.com --permission editor
Output
Successfully update zzzz@yyyy.com's permission to Editor
finally, you can check the updated user permission in the project member list.
$ base show mnist --member-list
Output
project Members
===============
xxxx@yyyy.com (Owner, invited at 2022-03-11 18:18:54)
zzzz@yyyy.com (Editor, invited at 2022-03-12 13:45:04)
Link the paths of data files on your local computer to the meta data in a Base project.
Synopsis
usage: base link project [-d <datafiles-dirpath>] [-e <datafile-extension>]
positional arguments:
project your invited project name to link data files.
Description
This command links local data files to the meta data records in a Base project.
After being invited to a project, collaborators have to link the data files on their own computers.
These files are often stored in a different directory from the project owner's, and sometimes under different directory or file names.
Base therefore creates a linker that matches local file paths, which lets collaborators share Python scripts that load local files.
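The source does not say how the linker matches files. One plausible approach, consistent with the file hash computed on import, is to index local files by content hash so that records resolve to a local path regardless of directory or file name. The sketch below shows that idea only; the function name and the use of SHA-256 are assumptions.

```python
import hashlib
from pathlib import Path
from typing import Dict


def build_linker(directory: Path, extension: str) -> Dict[str, str]:
    """Map content hash -> local path for every matching file, recursively."""
    linker: Dict[str, str] = {}
    for path in sorted(directory.rglob(f"*.{extension}")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            # A record's FileHash can then be looked up to find the local file.
            linker[digest] = str(path)
    return linker
```

With such a mapping, a shared script can look up a record's FileHash and open the collaborator's own copy of the file.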
Options
- -d <datafiles-dirpath>, --directory <datafiles-dirpath> - specify a datafiles-dirpath from which to load data files that have the extension given with the -e option. Base searches the directory recursively.
- -e <datafile-extension>, --extension <datafile-extension> - specify a datafile-extension to filter the data files to load. If one dataset contains several extensions (such as png and jpg), run the command once per extension.
Example: Link mnist data files into invited project
$ base link mnist --directory ~/Downloads/mnist --extension png
then, you can search and export datasets as you want, or run Python modeling scripts shared by other collaborators.
Output
Check datafiles...
found 70000 files with png extension.
linked!
Show the list of Base projects you can access.
Synopsis
usage: base list [--archived]
Description
This command shows the projects you can access.
You can check the Project UID, your Role on the project ("Viewer", "Editor", "Admin" or "Owner"), whether the project is private, and the project's creation date.
Options
--archived- show archived projects
Example: Check projects (not archived)
if you have project “mnist”,
$ base list
Output
projects
========
[mnist]
Project UID: abcdefghij0123456789
Role: Owner
Private Project: yes
Created Date: 2022-03-11 18:18:54
Example: Check project (archived)
if you have archived project “fashion-mnist”,
$ base list --archived
Output
projects
========
[fashion-mnist]
Project UID: klmnopqrst0123456789
Role: Owner
Private Project: yes
Created Date: 2022-03-16 01:38:29
Note: you can archive your projects with the base rm <project> command.
Create a new Base project.
Synopsis
usage: base new project
positional arguments:
project your project name to create.
Description
This command creates the database tables for meta data.
- issue a 20-character project unique id (Project UID) and create tables.
- save the Project UID in the ~/.base/projects file on your local computer.
- you can use the project name as an alias for the Project UID with any Base command.
Example
$ base new mnist
Output
Your Project UID
----------------
abcdefghij0123456789
save Project UID in local file (~/.base/projects)
then, the Project UID is saved in ~/.base/projects.
$ cat ~/.base/projects
[xxxx@yyyy.com]
mnist = abcdefghij0123456789
Note: your user id is saved in Global section
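The file shown above uses an INI-like layout, so a script can resolve a project name to its Project UID with Python's standard configparser module. This is a sketch under that assumption; the function name is hypothetical and Base may read the file differently.

```python
import configparser
from typing import Dict


def load_project_uids(text: str) -> Dict[str, Dict[str, str]]:
    """Parse the content of ~/.base/projects into {section: {project: uid}}."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep project names case-sensitive
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}
```

A typical call would pass the text of the file, for example Path.home().joinpath(".base", "projects").read_text(), and then look up the UID by section and project name.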
Archive or completely delete your Base projects.
Synopsis
usage: base rm project [--confirm] [-m <member-id>]
positional arguments:
project your project name to archive or delete.
Description
This command removes a project member, or archives or deletes your project.
If you specify the -m option, the given member is removed from the project.
Otherwise, Base archives or deletes the specified project.
To prevent accidental deletion, we recommend archiving a project rather than deleting it.
As long as a project has not been deleted, you can restore it from the archive.
Note: only the project owner can delete a project.
Options
- --confirm - delete an archived project completely (Owner only)
- -m <member-id>, --member <member-id> - specify a member-id to remove from the project. You can see your project member list with the [base show <project> --member-list](#show) command.
Example: Remove project member
check your project members on mnist with [base show <project> --member-list](#show) command
$ base show mnist --member-list
Output
project Members
===============
xxxx@yyyy.com (Owner, invited at 2022-03-11 18:18:54)
zzzz@yyyy.com (Editor, invited at 2022-03-12 13:45:04)
then, remove zzzz@yyyy.com from mnist
$ base rm mnist --member zzzz@yyyy.com
Output
zzzz@yyyy.com was removed from mnist
finally, you can check that the removed user is no longer in the project member list.
$ base show mnist --member-list
Output
project Members
===============
xxxx@yyyy.com (Owner, invited at 2022-03-11 18:18:54)
Example: Archive mnist project
$ base rm mnist
Output
mnist was Archived
then, you can check whether the project was archived with [base list](#list) command.
Example: Delete mnist project
$ base rm mnist --confirm
Output
mnist was Deleted
then, you can check whether the project was deleted with [base list](#list) command.
$ base list --archived
Output
projects
========
Note: once you delete a project, its saved data can never be restored.
Search data files and export them based on the meta data of a Base project.
Synopsis
usage: base search project [-q <query-condition>] [-c <value-conditions>]
[-e <export-file-type>] [-o <output-filepath>] [-s]
positional arguments:
project your project name to search.
Description
This command provides a search engine for data files.
You can search for words in meta data with the -c option, or set a filter with the -q option.
You can also export the results as JSON or CSV with the -e and -o options.
Note: if the same value appears under different keys, the condition filter can be ambiguous and return unexpected results. For reliable filtering, specify the key name with the query option whenever a value appears under two or more keys.
Options
- -q <query-condition>, --query <query-condition> - specify a query-condition to filter the data files based on meta data. You can use various operators and specify multiple query-conditions.
  [query grammar]
  {KeyName} {Operator} {Values}
  - put exactly 1 space between the sections
  - do not use spaces anywhere else
  >>> sample query condition: CategoryName == airplane
  [operators]
  - == : equal
  - != : not equal
  - >= : greater than or equal
  - <= : less than or equal
  - > : greater than
  - < : less than
  - is : missing value (only 'None' is allowed as Values, ex. query='correction is None')
  - is not : any value (only 'None' is allowed as Values, ex. query='correction is not None')
  - in : in the list of Values
  - not in : not in the list of Values
  Note: you have to follow the query grammar.
- -c <value-conditions>, --conditions <value-conditions> - specify value-conditions to filter by meta data value. This is a powerful mode because you do not have to know the KeyName of the meta data.
  - if you specify multiple values of one meta data key, Base returns the union of the values.
    - ex.) if you specify "airplane,automobile" and both belong to the same meta data key "CategoryName", Base interprets this as "CategoryName is airplane or automobile".
  - if you specify multiple values of different meta data keys, Base returns the intersection of the values.
    - ex.) if you specify "airplane,2007", where one belongs to the meta data key "CategoryName" and the other to "Timestamp", Base interprets this as "CategoryName is airplane and also Timestamp is 2007".
  - you can combine these behaviors.
    - ex.) if you specify "airplane,automobile,2007", where two belong to the same meta data key "CategoryName" and one to "Timestamp", Base interprets this as "(CategoryName is airplane and also Timestamp is 2007) or (CategoryName is automobile and also Timestamp is 2007)".
  [conditions grammar]
  "{Value1},{Value2},..."
  - separate with comma
  >>> sample conditions: "airplane,automobile"
  Note: you have to follow the conditions grammar.
- -e <export-file-type>, --export <export-file-type> - to convert the search results into JSON or CSV, specify JSON or CSV as the export-file-type.
- -o <output-filepath>, --output <output-filepath> - specify an output-filepath to save the dataset file. The default is "./dataset.json" or "./dataset.csv".
- -s, --summary - summarize the result and hide the detailed output
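The union and intersection behavior of value conditions can be illustrated with a short Python sketch over in-memory records. It is not Base's search engine: the function name is hypothetical, and raising an error when a value appears under more than one key is this sketch's own choice for the ambiguous case mentioned in the Description.

```python
from collections import defaultdict
from typing import Dict, List, Set


def filter_by_conditions(records: List[dict], conditions: str) -> List[dict]:
    """Union of values within one key, intersection across different keys."""
    values = [v.strip() for v in conditions.split(",") if v.strip()]
    wanted: Dict[str, Set[str]] = defaultdict(set)
    for value in values:
        # Find which meta data key(s) hold this value.
        keys = {k for r in records for k, v in r.items() if str(v) == value}
        if not keys:
            return []  # the value occurs nowhere, so nothing can match
        if len(keys) > 1:
            raise ValueError(f"{value!r} appears under several keys: {sorted(keys)}")
        wanted[keys.pop()].add(value)
    return [
        r for r in records
        if all(k in r and str(r[k]) in vs for k, vs in wanted.items())
    ]
```

With "airplane,automobile,2007", records match when CategoryName is airplane or automobile and Timestamp is 2007, which is the combined interpretation described for the -c option.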
Example: Search mnist with value conditions
$ base search mnist --conditions "train" --query "label in ['1','2','3']"
Output
18831 files
========
'/home/xxxx/dataset/mnist/train/1/42485.png'
...
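A query condition such as the one in the example above follows the {KeyName} {Operator} {Values} grammar, so it can be split mechanically. The parser below is a hypothetical sketch of that grammar only; it does not evaluate the query, and how Base itself parses queries is not documented here.

```python
import ast
from typing import Any, Tuple

# Longer operators come first so "is not" is not mistaken for "is".
OPERATORS = ["is not", "not in", "==", "!=", ">=", "<=", ">", "<", "is", "in"]


def parse_query(query: str) -> Tuple[str, str, Any]:
    """Split a query condition into (key name, operator, values)."""
    key, _, rest = query.partition(" ")
    for op in OPERATORS:
        if rest.startswith(op + " "):
            raw = rest[len(op) + 1:]
            break
    else:
        raise ValueError(f"no known operator in {query!r}")
    if op in ("in", "not in"):
        values: Any = ast.literal_eval(raw)  # e.g. ['1','2','3'] -> list
    elif op in ("is", "is not"):
        if raw != "None":
            raise ValueError("only None is allowed with 'is' and 'is not'")
        values = None
    else:
        values = raw  # kept as a string; type handling is left open
    return key, op, values
```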
Example: Search mnist and export as JSON
$ base search mnist --conditions "test" --query "correction != -1" --export JSON --output ./dataset.json
Output
9963 files
========
'/home/xxxx/dataset/mnist/test/7/3329.png'
...
$ cat ./dataset.json
{
"Data": [
{
"FilePath": "/home/xxxx/dataset/mnist/test/7/3329.png",
"label": "7",
"dataType": "test",
"id": "3329"
},
...
]
}
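An exported JSON file with the layout shown above can be loaded directly in a Python script. The helper names below are hypothetical, and the sketch assumes the real file is valid JSON with a top-level "Data" list (the "..." in the listing is an elision, not file content).

```python
import json
from collections import Counter
from typing import List


def load_dataset(text: str) -> List[dict]:
    """Return the list of records stored under the "Data" key."""
    return json.loads(text)["Data"]


def count_by(records: List[dict], key: str) -> Counter:
    """Count records per value of one meta data key, e.g. per label."""
    return Counter(record[key] for record in records)
```

Each record's FilePath can then be passed to whatever loader the modeling script uses.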
Show detailed information about your Base project.
Synopsis
usage: base show project [--member-list]
positional arguments:
project your project name to show detail.
Description
This command shows the meta data in your project.
Each piece of meta data has a KeyName (like "CategoryName") and a KeyHash that identifies it even if you change the KeyName.
The returned structure looks like this.
{
"KeyHash": String,
"KeyName": String,
"RecordedCount": Number,
"Creator": String,
"LastEditor": String,
"EditerList": List,
"ValueHash": String,
"ValueType": String,
"UpperValue": String,
"LowerValue": String,
"UniqueValues": String,
"CreatedTime": String of unix time,
"LastModifiedTime": String of unix time
}
Options
--member-list- show project members
Example: Show mnist project
$ base show mnist
Output
project mnist
===============
You have 70000 records with 4 keys in this project.
[Keys Information]
KEY NAME VALUE RANGE VALUE TYPE RECORDED COUNT
'id','index' 0 ~ 59999 str('id'), int('index') 70000
'correction' 0or6 ~ -1 str('correction') 74
'label','originalLabel' 0 ~ 9 str('label'), int('originalLabel') 70000
'dataType' test ~ train str('dataType') 70000
...
Example: Show mnist project members
$ base show mnist --member-list
Output
project Members
===============
xxxx@yyyy.com (Owner, invited at 2022-03-11 18:18:54)
zzzz@yyyy.com (Editor, invited at 2022-03-12 13:45:04)