This repository showcases an end-to-end TensorFlow pipeline for training an ML model that predicts taxi trip duration. The "Chicago Taxi Trips" Kaggle dataset is used for training and evaluation.
- Getting started
- Executing the pipeline
- Evaluation and model analysis
- Making predictions
- Contributing
To get your environment set up and start using this model, please follow the step-by-step instructions provided below.
First, you'll need to clone this repository to your local machine or development environment. Open your terminal, navigate to the directory where you want to clone the repository, and run the following command:
```
git clone <repository-url>
```
Replace <repository-url> with the actual URL of this repository. Once cloned, navigate into the repository's directory with cd <repository-name>.
Within the root directory of the cloned repository, create a .env file to store your project configurations. This file should include the following environment variables tailored to your project:
```
PROJECT_ID=<project_id>
REGION=<region>  # example: us-central1
```
Make sure to replace the placeholders with your specific project details.
To interact with Google Cloud resources, you need to install the Google Cloud Command Line Interface (CLI) on your system. Follow the detailed installation instructions provided in the official documentation here.
To authenticate to Google Cloud services from your development environment, configure Application Default Credentials (ADC) by following the guide here.
To create a JSON file that stores the service account credentials, follow the section "Create a service account key" in the guide here.
- Install Python 3.9.5 with
```
pyenv install 3.9.5
```
- Install Python dependencies with
```
poetry install
```
Executing the pipeline ingests and transforms the data, performs hyperparameter tuning and model training, evaluates the model and registers it to the model registry, creates a Vertex endpoint, and deploys the model to make it available for predictions. To run the pipeline on Vertex AI, use the following command:
```
python chicago_taxis/kubeflow_v2_runner.py
```
When the pipeline executes, the MAPE (mean absolute percentage error) on the test set is logged to a file in GCP. If the MAPE beats the current best (that of the blessed model), the trained model is registered in GCP along with its MAPE.
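The metric and the registration rule described above can be sketched as follows; the function names are illustrative, not the repository's actual implementation:

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes no zero targets)."""
    errors = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(errors) / len(errors)

def should_register(candidate_mape, blessed_mape):
    """Register the candidate only if it beats the current blessed model,
    or if no model has been blessed yet."""
    return blessed_mape is None or candidate_mape < blessed_mape
```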
Predictions are made with the predict.py script by running:
```
python chicago_taxis/predict.py \
    --endpoint_id <endpoint_id> \
    --input_file chicago_taxis/data/prediction_sample.json
```
Our project embraces a streamlined workflow that ensures high-quality software development and efficient collaboration among team members. To maintain this standard, we follow a specific branching strategy and commit convention outlined in our CONTRIBUTING.md file.
We highly encourage all contributors to familiarize themselves with these guidelines. Adhering to the outlined practices helps us keep our codebase organized, facilitates easier code reviews, and accelerates the development process. For detailed information on our branching strategy and how we commit changes, please refer to the CONTRIBUTING.md file.