View article View on YouTube

Data Science Cookie Cutter

Why?

Structuring your data science project around a shared standard makes it easier for your teammates to maintain and modify it.

This repository provides a template that incorporates best practices for creating a maintainable and reproducible data science project.

Tools used in this project

  • hydra: Manage configuration files - article
  • pdoc: Automatically create API documentation for your project
  • pre-commit plugins: Automate code review and formatting
  • Poetry: Dependency management - article
  • uv: Ultra-fast Python package installer and resolver
  • pip: Traditional Python package installer

How to use this project

Install Cookiecutter:

pip install cookiecutter

Create a project based on the template:

cookiecutter https://github.com/khuyentran1401/data-science-template
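
If you would rather drive this step from a Python script, the sketch below wraps the same command. It is only an illustration: the helper names `build_command` and `create_project` are hypothetical and not part of this template, and it assumes the `cookiecutter` executable is on your PATH after the install step.

```python
import shlex
import shutil
import subprocess

TEMPLATE_URL = "https://github.com/khuyentran1401/data-science-template"


def build_command(template_url: str = TEMPLATE_URL) -> list[str]:
    """Return the argument list for the cookiecutter CLI call shown above."""
    return ["cookiecutter", template_url]


def create_project(template_url: str = TEMPLATE_URL) -> int:
    """Run cookiecutter interactively and return its exit code (hypothetical helper)."""
    if shutil.which("cookiecutter") is None:
        raise RuntimeError("cookiecutter not found; run `pip install cookiecutter` first")
    # stdin/stdout are inherited, so the template's prompts appear in the terminal.
    return subprocess.run(build_command(template_url), check=False).returncode


# Print the command without executing it; call create_project() to actually run it.
print(shlex.join(build_command()))
```

Building the argument list separately from running it lets you inspect or log the exact command before anything is generated.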

You will be prompted to choose your preferred dependency manager:

  • poetry: Modern Python package and dependency manager
  • uv: Ultra-fast Python package installer and resolver
  • pip: Traditional Python package installer
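
For scripted or CI use, the choice can probably be supplied up front instead of answered at the prompt. The sketch below relies on two real Cookiecutter CLI features, the `--no-input` flag and `key=value` extra context, but the key name `dependency_manager` is an assumption: check the template's `cookiecutter.json` for the actual variable name. With `--no-input`, every other prompt falls back to its default.

```python
import shlex

TEMPLATE_URL = "https://github.com/khuyentran1401/data-science-template"
DEPENDENCY_MANAGERS = ("poetry", "uv", "pip")  # the three choices listed above

# ASSUMPTION: name of the prompt variable; verify it in the template's cookiecutter.json.
CONTEXT_KEY = "dependency_manager"


def build_noninteractive_command(manager: str) -> list[str]:
    """Build a cookiecutter call that preselects the dependency manager."""
    if manager not in DEPENDENCY_MANAGERS:
        raise ValueError(f"expected one of {DEPENDENCY_MANAGERS}, got {manager!r}")
    # --no-input skips all prompts; the key=value pair overrides one default.
    return ["cookiecutter", TEMPLATE_URL, "--no-input", f"{CONTEXT_KEY}={manager}"]


# Print the command for inspection; nothing is executed here.
print(shlex.join(build_noninteractive_command("uv")))
```

Validating the choice before building the command catches typos early, rather than relying on Cookiecutter's own error message.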

Book: Production-Ready Data Science

Want to learn more about building production-ready data science projects? Check out my upcoming book:

Production Ready Data Science: From Prototyping to Production with Python

The book will cover:

  • Best practices for structuring data science projects
  • Tools and techniques for reproducible research
  • Deploying and monitoring machine learning models
  • And much more!

Sign up now to receive the first 3 chapters for free! You'll also be notified when the full book becomes available.

Other Resources: