This project is a cost-efficient prototype for deriving deep business insights from customer journey logs using OpenAI's GPT API. It provides a backend (FastAPI) and a minimal Streamlit-based frontend to upload JSON-formatted session data and generate actionable insights for product managers, analysts, and business leaders.
- Automatically understand user behavior from complex e-commerce session data.
- Help PMs and analysts discover conversion blockers, engagement trends, and missed product opportunities.
- Do this affordably using OpenAI APIs by applying smart token-optimization and data summarization techniques.
- Accepts a `customer_journeys.json` file via the Streamlit file uploader
- Backend reads and validates it with FastAPI (see the sketch below)
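A minimal sketch of what the upload-and-validate step might look like (the `/analyze` route, `sample_size` query parameter, and response shape are assumptions for illustration, not the actual code):

```python
# app/main.py (sketch) -- route and response shape are illustrative
import json

from fastapi import FastAPI, HTTPException, UploadFile

app = FastAPI()

@app.post("/analyze")
async def analyze(file: UploadFile, sample_size: int = 10):
    """Accept the uploaded journeys file, validate it, and hand off for analysis."""
    try:
        sessions = json.loads(await file.read())
    except json.JSONDecodeError:
        raise HTTPException(status_code=400, detail="File is not valid JSON")
    if not isinstance(sessions, list) or not sessions:
        raise HTTPException(status_code=400, detail="Expected a non-empty list of sessions")
    # Downstream: balanced sampling, summarization, and the GPT call.
    return {"sessions_received": len(sessions), "sample_size": sample_size}
```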
My prompt is specifically engineered to extract useful observations, such as:
- Behavioral patterns: what successful users do differently
- Drop-off points: where users lose interest
- Search quality & gaps: which searches succeed vs. frustrate
- Cart behavior: which products are added but not purchased
- Conversion insights: device, category, pricing influences
- Product trends: top viewed vs. top purchased discrepancies
- Customer hesitation signals: re-visits, search loops, time spent
These are directly rooted in the JSON schema: `activity_type`, `conversion`, `cart_value`, `search_query`, etc.
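For orientation, a single journey record might look roughly like this (field names beyond those listed above, and all values, are hypothetical):

```json
{
  "session_id": "s_1042",
  "device": "mobile",
  "conversion": false,
  "cart_value": 89.99,
  "activities": [
    {"activity_type": "search", "search_query": "running shoes"},
    {"activity_type": "product_view", "product_id": "p_773"},
    {"activity_type": "add_to_cart", "product_id": "p_773"}
  ]
}
```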

- Simple Streamlit-based frontend
- Displays GPT output in Markdown block
- Slider for sample size control
- The UI is minimal and task-focused
- Not styled or branded; uses Streamlit for a quick, functional interface (see the sketch below)
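A sketch of the core frontend flow, assuming the backend exposes the `/analyze` endpoint on port 8000 and returns its output under an `insights` key (all of which are assumptions):

```python
# streamlit_app.py (sketch) -- backend URL, endpoint, and response key are assumptions
import requests
import streamlit as st

st.title("Customer Journey Insights")

uploaded = st.file_uploader("Upload customer_journeys.json", type="json")
sample_size = st.slider("Sample size", min_value=2, max_value=50, value=10)

if uploaded and st.button("Generate insights"):
    resp = requests.post(
        "http://localhost:8000/analyze",
        files={"file": uploaded.getvalue()},
        params={"sample_size": sample_size},
    )
    resp.raise_for_status()
    st.markdown(resp.json()["insights"])  # render GPT output as Markdown
```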
- FastAPI: Efficient, async-ready backend
- OpenAI: LLM insights engine
- Streamlit: Lightweight, no-boilerplate UI
- Python: Simple for data processing
"You're reviewing {sample_size} customer journeys..."
The prompt intentionally avoids generic phrasing and instead mirrors how a human business analyst would review a spreadsheet or analytics dashboard. It explicitly:
- Aligns to the JSON structure (`activity_type`, `search_query`, `conversion`, etc.)
- Requests observations in bullet format
- Frames GPT as a collaborator instead of a content generator
- Encourages practical, non-obvious business insights
It covers all six insight areas from the assignment plus bonus ones:
- Price sensitivity
- Product performance
- Category drop-off
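A sketch of how such a prompt might be assembled; the wording is illustrative, not the exact production prompt:

```python
def build_prompt(summaries: list[str], sample_size: int) -> str:
    """Frame GPT as a business analyst reviewing pre-summarized sessions."""
    joined = "\n".join(f"- {s}" for s in summaries)
    return (
        f"You're reviewing {sample_size} customer journeys from an e-commerce "
        "site, summarized from fields like activity_type, search_query, "
        "cart_value, and conversion.\n\n"
        "As a business analyst, give concrete, non-obvious observations as "
        "bullets covering: behavioral patterns, drop-off points, search "
        "quality, cart behavior, conversion drivers, product trends, price "
        "sensitivity, and category drop-off.\n\n"
        f"Sessions:\n{joined}"
    )
```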
- Balanced sampling: An equal number of converted and abandoned sessions is selected via `select_balanced_sample` (see the sketch after this list). This improves signal and reduces hallucination.
- Pre-summarization: Long activity logs are transformed into compact summaries with session metadata before being sent to GPT (`json_summary`).
- Token limit control: Sessions are sliced to stay under 4K tokens.
- Environment-based config: API keys are securely loaded via config file or environment.
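A minimal sketch of what `select_balanced_sample` could look like; only the function name and the `conversion` field come from this project, the rest is assumed:

```python
import random

def select_balanced_sample(sessions: list[dict], sample_size: int = 10) -> list[dict]:
    """Pick a roughly equal mix of converted and abandoned sessions."""
    converted = [s for s in sessions if s.get("conversion")]
    abandoned = [s for s in sessions if not s.get("conversion")]
    half = sample_size // 2
    sample = random.sample(converted, min(half, len(converted)))
    sample += random.sample(abandoned, min(sample_size - len(sample), len(abandoned)))
    random.shuffle(sample)  # avoid presenting all converted sessions first
    return sample
```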
- Pre-processes JSON into summaries (e.g., activity flow, session duration, average duration); see the sketch after this list
- Filters out noise (no raw HTML or huge payloads sent to GPT)
- Uses a `sample_size` slider with a default of 10 sessions
- Uses balanced sampling to select a mix of converted and abandoned sessions for richer signal
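`json_summary` and the token cap might look roughly like this (the `activities` structure and the 4-characters-per-token heuristic are assumptions):

```python
def json_summary(session: dict) -> str:
    """Compress one session into a compact, token-cheap line for the prompt."""
    flow = " > ".join(a["activity_type"] for a in session.get("activities", []))
    outcome = "converted" if session.get("conversion") else "abandoned"
    return (
        f"{session.get('device', 'unknown')} session, {outcome}, "
        f"cart ${session.get('cart_value', 0)}: {flow}"
    )

def cap_tokens(text: str, max_tokens: int = 4000) -> str:
    """Rough slice: ~4 characters per token keeps the prompt under budget."""
    return text[: max_tokens * 4]
```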
- OpenAI GPT-4-turbo: Fast, cost-effective LLM
- FastAPI: Scalable, clean Python web framework
- Streamlit: Ideal for quick UI without HTML/JS
- Python + JSON: Native match to assignment's format
---
A successful run should:
- Run and analyze a real `customer_journeys.json`
- Return GPT-generated insights clearly tied to what’s in the data
- Respect cost and prompt constraints
- Avoid generic fluff — insights should be actionable and reflect real journey patterns (e.g., "high cart values on mobile drop at checkout").
- GPT insights reflect actual patterns in the sessions, not generic advice
- Cost remains low (<1000 tokens typical)
- Frontend works locally via Streamlit
- Reviewer sees clearly how AI, backend, and UI connect
The combination of LLM + domain-aligned prompt + session sampling + activity summarization = high-quality analysis at low cost.
- No authentication or rate limiting
- Prompt + analysis assumes English input and US market patterns
- UI is functional, not styled (per brief)
- Single-shot prompt approach
```
Gali-ai/
├── app/
│   ├── main.py                # FastAPI backend
│   ├── analyzer.py            # GPT-based insight engine
│   └── config.py              # Secure API key loader
├── streamlit_app.py           # Upload & UI
├── requirements.txt           # Dependencies
├── config.json                # (local only) OpenAI key
└── customer_journeys.json     # Input example
```
Install dependencies:

```
pip install -r requirements.txt
```

Save this file as `config.json` in the root:

```json
{
  "OPENAI_API_KEY": "sk-..."
}
```

Start the backend:

```
uvicorn app.main:app --reload
```

Launch the frontend:

```
streamlit run streamlit_app.py
```
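For reference, `app/config.py` might load the key by checking the environment first and falling back to `config.json`; this is a sketch under those assumptions, not the actual loader:

```python
# app/config.py (sketch) -- lookup order is an assumption
import json
import os
from pathlib import Path

def load_api_key() -> str:
    """Prefer the OPENAI_API_KEY environment variable, else read config.json."""
    key = os.getenv("OPENAI_API_KEY")
    if not key:
        config_path = Path(__file__).resolve().parent.parent / "config.json"
        if config_path.exists():
            key = json.loads(config_path.read_text()).get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY not found in environment or config.json")
    return key
```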