hienphan161/README.md

Hey there! I'm Hien Phan.

Data Analyst | Machine Learning Engineer | MLOps Enthusiast

I'm a data professional with over 4 years of experience as a Data Analyst and Machine Learning Engineer. My passion lies in designing, building, and deploying scalable, end-to-end machine learning systems that solve real-world business problems.

  • 🔭 I specialize in MLOps, Generative AI, and building robust Data Engineering pipelines.
  • ☁️ My core expertise is within the Google Cloud Platform (GCP) ecosystem.
  • 🌱 I'm a detail-oriented problem solver, driven by the challenge of turning complex data into impactful, automated solutions.

🚀 What I'm Currently Exploring

I believe in continuous learning and am always seeking opportunities to expand my skill set. Right now, I'm particularly focused on:

  • Large-Scale Data Processing with PySpark.
  • Modern Data Transformation workflows using dbt.
  • Handling Unstructured Data with NoSQL databases like MongoDB.
  • Expanding my knowledge of multi-cloud data architectures.

🛠️ My Core Technical Toolkit

This is a snapshot of the technologies I use regularly and have hands-on experience with.

| Category | Technologies |
| --- | --- |
| Programming Languages | Python, SQL, Kotlin, TypeScript, Bash Scripting |
| Data Science & ML | Pandas, PyTorch, Hugging Face |
| MLOps | Model Training & Evaluation, Testing (Pytest), API Serving (REST, gRPC) |
| Generative AI | LangChain, LangSmith, Prompt Engineering, LLMs |
| Data Engineering | Airflow, Kafka, Google Cloud Pub/Sub, Dimensional Modeling |
| Infrastructure & CI/CD | Terraform, Kubernetes, Docker, GitLab, Jenkins, Git |
| Databases | BigQuery, PostgreSQL, MySQL |
| Observability | Prometheus, Grafana, Elasticsearch |


📫 Reach me on

Pinned

  1. diabetes-prediction (Public)

    Production-grade MLOps project demonstrating end-to-end machine learning operations: from model training to deployment with CI/CD, monitoring, and Kubernetes orchestration.

    Python

  2. titanic-challenge-pipeline (Public)

    A machine learning project predicting Titanic passenger survival through domain-driven feature engineering and Random Forest classification.

    Python

  3. chicago-fare-prediction (Public)

    An end-to-end machine learning pipeline for predicting taxi fares in Chicago using real-world data from the City of Chicago's open data portal.

    HTML

  4. skynet-transportation-institute (Public)

    A comprehensive ML pipeline for Boston rideshare transportation analysis, forecasting, and multi-objective optimization.

    Python

  5. documents-analysis-with-rag (Public)

    AI-Powered Document Analysis: generate, analyze, and query customer service tickets using advanced LLM techniques including RAG, structured extraction, and adaptive reranking.

    Python

  6. translation-multi-agent-collaboration (Public)

    A simple translation system using multi-agent collaboration. Three AI agents work together to produce high-quality translations: a Translator Agent, an Editor Agent, and a Technical Reviewer Agent.

    Python