DataButler

A Python-based tool for automatic data profiling and cataloging

Transforming raw data dumps into forms that are easier to understand and better suited to databases takes considerable time and effort. DataButler aims to provide a framework that automates data profiling and cataloging, making data easier to search and discover.

Process Flow

Installation

pip install data_butler

# If the English spaCy model (en_core_web_sm) is missing, install it manually:
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz
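To decide whether the manual model install above is needed, a quick check like the following may help. It is only a sketch: it assumes the model is installed as an importable Python package named en_core_web_sm (which is how pip-installed spaCy models are normally exposed), and the helper name model_installed is hypothetical, not part of DataButler.

```python
import importlib.util


def model_installed(name="en_core_web_sm"):
    """Return True if a top-level package called `name` can be found.

    Hypothetical helper, not part of data_butler. It only looks the
    package up; it does not import or load the model.
    """
    return importlib.util.find_spec(name) is not None


if not model_installed():
    print("en_core_web_sm not found; run the pip install command above.")
```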

Directions

import data_butler

data_butler.db('file directory/filename.csv')
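A small wrapper can catch a mistyped path before the profiler runs. The sketch below is an illustration only: profile_csv is a hypothetical helper, and it takes the profiling function as an argument so that it makes no assumptions about what data_butler.db does or returns.

```python
from pathlib import Path


def profile_csv(csv_path, profiler):
    """Check that csv_path points to an existing file, then call profiler on it.

    Hypothetical helper, not part of data_butler. `profiler` is any callable
    that accepts a file path string, for example data_butler.db.
    """
    path = Path(csv_path).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"No such file: {path}")
    # Pass a plain string path and hand back whatever the profiler returns.
    return profiler(str(path))
```

With the package installed, this could be called as profile_csv('file directory/filename.csv', data_butler.db).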

Authors

Keerthi Pullela, Rahul Madhu, Rukmini Sunil, Sagar Kurada, Xema Pathak

Acknowledgments

Professor Matthew Lanham, Krannert School of Management, Purdue University
Mike Lutz, Caleb Keller, Samtec Inc.
