Generate Synthetic Sales Database with Genkit

Author(s) Aiko Zhao

This project uses Firebase Genkit structured output with the Gemini 1.5 Pro model to generate synthetic sales data for a fictional dog food company called Bone Appetit. The generated data is then stored in a Firestore database, row by row, for further analysis or for use in AI/ML applications.

Prerequisites

  • Vertex AI for LLM access
  • Cloud Functions for deployment
  • Firestore for database storage
  • Firebase for the application

Overview

[Figure: synthetic-database-diagram]

  1. Setup:
    • Import the libraries needed for AI (Genkit), Firebase, and data handling.
    • Configure Genkit to use Google's Vertex AI and set logging preferences.
  2. Data Structures:
    • Order class: Defines the structure of each sales record (order ID, customer info, product, etc.).
    • menuItems: A list of dog food products and their prices.
    • BoneAppetitSalesDatabaseSchema: A strict schema (using Zod) that ensures generated data matches the expected format.
  3. Data Generation:
    • createBoneAppetitSalesRowSchema: A Genkit flow that takes a product as input, prompts the Gemini 1.5 Pro model, and returns structured JSON representing one sales record.
      • The prompt instructs the model to create realistic data, including reviews that align with customer ratings.
    • rateLimitedRunFlowGenerator: A generator function that paces data generation so that requests do not overwhelm the model or exceed API limits. It yields Promises that resolve to new sales records, pausing between requests when needed.
  4. Firestore Storage:
    • Batch write the synthetic sales data to Firestore.
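The pacing idea behind rateLimitedRunFlowGenerator can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the names `SalesRow`, `RowProducer`, and `rateLimitedRows` are hypothetical, the `produce` callback stands in for the Genkit flow call, and the fixed-window scheme is only one plausible way to add pauses.

```typescript
// Hypothetical, trimmed-down stand-in for one generated sales record.
interface SalesRow {
  orderId: number;
  product: string;
}

// Stand-in for the Genkit flow call (assumption: one call per product).
type RowProducer = (product: string) => Promise<SalesRow>;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Yields one Promise per product. Call number i is not started before
// t0 + floor(i / maxPerWindow) * windowMs, so each fixed window is
// assigned at most `maxPerWindow` calls.
function* rateLimitedRows(
  products: string[],
  produce: RowProducer,
  maxPerWindow: number,
  windowMs: number,
): Generator<Promise<SalesRow>> {
  const t0 = Date.now(); // set when the first Promise is requested
  let index = 0;
  for (const product of products) {
    const earliestStart = t0 + Math.floor(index / maxPerWindow) * windowMs;
    index += 1;
    const waitMs = Math.max(0, earliestStart - Date.now());
    // The pause is part of the yielded Promise, so the generator itself
    // never blocks the caller.
    yield sleep(waitMs).then(() => produce(product));
  }
}
```

A caller could collect every record with `await Promise.all(Array.from(rateLimitedRows(products, produce, 5, 1000)))` and then pass the results to the Firestore write step. The values 5 and 1000 are placeholders, not limits taken from this project.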

How to deploy to Cloud Functions

firebase deploy

How to test

curl -m 70 -X POST [YourCloudFunctionHTTPEndpoint] \
  -H "Authorization: bearer $(gcloud auth print-identity-token)" \
  -H "Content-Type: application/json"
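For a scripted smoke test, the same request could be sent from TypeScript. The sketch below assumes Node 18 or later (for the global `fetch`, `Request`, and `AbortSignal.timeout`); the helper names `buildTestRequest` and `invokeFunction` are hypothetical, and the ID token is assumed to be obtained separately, for example from `gcloud auth print-identity-token`.

```typescript
// Builds the same POST request as the curl command above.
// `endpoint` is your Cloud Function's HTTP endpoint.
function buildTestRequest(endpoint: string, idToken: string): Request {
  return new Request(endpoint, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${idToken}`,
      "Content-Type": "application/json",
    },
  });
}

// Sends the request and gives up after `timeoutMs` (70 s, like `curl -m 70`).
async function invokeFunction(
  endpoint: string,
  idToken: string,
  timeoutMs: number = 70_000,
): Promise<string> {
  const response = await fetch(buildTestRequest(endpoint, idToken), {
    signal: AbortSignal.timeout(timeoutMs),
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.text();
}
```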