benitomartin/README.md

 

⬇️ You can find me here! ⬇️

LinkedIn, Website, Qdrant, Datacamp, Zilliz, Medium
 

 

🎵 While you're here, why not enhance your visit with a melodious twist? Tune into this enchanting Spanish AI Song. A perfect blend of technology and art. Enjoy the vibes! 🎷🎶

🌟 Crafting each piece of content is a journey that demands both time and passion. If you enjoy my work, consider supporting me on GitHub Sponsors 🚀

💰 My website was created using Hostinger. If you want to create your own, this referral code will give you a 20% discount on the selected plan 💶

👉 CONTACT ME! 👉 Book a Consultation, or use this Form 🚀

 

🏛️ Trusted by Companies

😎 My Profile

I am a passionate, innovative, and dynamic Data Scientist who provides a diverse range of services, including project development, teaching, workshops, technical writing, and career coaching. My skill set includes (but is not limited to):

 

  • Data Science, Analytics & ML: Python, NumPy, Pandas, TensorFlow, PyTorch, Scikit-learn, XGBoost, LightGBM
  • AI: LangChain, LlamaIndex, Hugging Face, Transformers, Vector Databases
  • Key Domains: Regression, Classification, NLP, LLM, RAG, Computer Vision, Neural Networks, Ensemble Methods, Clustering, Dimensionality Reduction
  • Data Engineering: dbt, Terraform, SQL, BigQuery, PySpark, Databricks
  • MLOps: MLflow, Prefect, Comet, Docker, Kubernetes
  • DevOps: Pydantic, Ruff, Poetry, uv, Conda, Linux, Pytest, Pre-commit, Coverage, CI/CD
  • APIs: Flask, FastAPI
  • Apps: Streamlit, Gradio
  • Cloud Platforms: GCP, AWS
  • Version Control: Git

 

✍️ Blogs

I write about Data Science, Machine Learning, and AI, covering topics such as end-to-end applications, LLMs, Retrieval-Augmented Generation (RAG), and optimization techniques. You can find my blogs on the following platforms:

Some of my blogs have been highlighted in the LlamaIndex Newsletter, GKE Newsletter, and the MLOps community: 

 

👨‍💻 Projects Portfolio

💰 My end-to-end projects can be found in these repositories. Feel free to click ⭐ if you like them 😎

 

ML/MLOps

| Project Name | Main Libraries/Tools | Cloud Service | App | DevOps Best Practices |
|---|---|---|---|---|
| MLOps Credit Default | Scikit-learn, LightGBM, MLflow, Databricks | AWS/Databricks | | Experiment Tracking, Model Registry, Model/Data Monitoring, Data Validation, Linting, Formatting, Testing, Error Handling, Pre-Commit, IaC, CI/CD |
| Medical Insurance Costs Prediction | Scikit-learn, TensorFlow, Amazon SageMaker, AWS Lambda, Comet ML, Flask | AWS | | Experiment Tracking, Model Registry, Model/Data Monitoring, Linting, Formatting, Testing, Error Handling, Coverage, CI/CD |
| Stroke Prediction | Scikit-learn, XGBoost, Amazon SageMaker, AWS Lambda, Amazon ECR, Comet ML, Flask, Docker | AWS | | Experiment Tracking, Model Registry, Model Monitoring, Containerization, Testing, Error Handling |
| Car Price Prediction | Scikit-learn, TensorFlow, MLflow, Prefect, Flask, Amazon ECR, AWS Lambda, Docker, Grafana, Terraform | AWS | | Experiment Tracking, Model Registry, Model Monitoring, Orchestration, Containerization, Linting, Formatting, Testing, Error Handling, IaC, CI/CD |
| Serverless API with AWS SAM | AWS SAM, AWS Lambda, AWS API Gateway, Docker | AWS | | Containerization, Error Handling, IaC |
| Taxi Rides Prediction | Scikit-learn, TensorFlow, MLflow, Prefect, FastAPI, Docker | GCP | | Experiment Tracking, Model Registry, Model Monitoring, Orchestration, Containerization, Error Handling |
| Music Clustering | Scikit-learn, FastAPI, Amazon ECR, Docker | AWS | Streamlit | Containerization, CI/CD |
| Birds Classification | PyTorch | | Gradio | |
| Food Prediction | Scikit-learn, TensorFlow, OpenCV, FastAPI, Docker | GCP | Streamlit | Containerization |

LLM, RAG and Fine-tuning

| Project Name | Main Libraries/Tools | Cloud Service | App | DevOps Best Practices |
|---|---|---|---|---|
| Serverless GenAI API with FastAPI, AWS, and CircleCI | OpenAI, AWS SAM, AWS Lambda, AWS API Gateway, AWS Secrets Manager, FastAPI, CircleCI | AWS | | Linting, Formatting, Testing, Error Handling, CI/CD |
| RAG Hybrid Search and Semantic Caching | Qdrant, FastEmbed, SPLADE, Hugging Face Transformers | | | Error Handling, Linting, Formatting |
| Multimodal Bill Scan System | AWS Bedrock, AWS Lambda, AWS DynamoDB, AWS SQS/SNS, AWS CDK, Claude 3 Sonnet | AWS | | Error Handling, Linting, Formatting, IaC |
| IaC in RAG Applications with Terraform | AWS Bedrock, LangChain, AWS OpenSearch, AWS Secrets Manager, Terraform, Titan | AWS | | Testing, Error Handling, Linting, Formatting, IaC |
| Scalable RAG in AWS with Fargate | OpenAI, LlamaIndex, Qdrant, AWS CDK, AWS Fargate, Amazon ECR, Amazon ECS, AWS Secrets Manager, FastAPI | AWS | | Testing, Error Handling |
| RAG Deployment with Azure Functions | OpenAI, LangChain, Qdrant, Azure Functions App | Azure | | Linting, Formatting, Testing, Error Handling |
| Scalable RAG with Kubernetes | OpenAI, LlamaIndex, Qdrant, Docker, FastAPI, GKE | GCP | Streamlit | Containerization, Linting, Formatting, Testing, Error Handling, CI/CD |
| Research Papers Semantic Search | OpenAI, LangChain, Qdrant, Docker, Amazon ECR, AWS Lambda, AWS API Gateway | AWS | Streamlit | Containerization, Linting, Formatting, Testing, Error Handling |
| Video Summarization | Hugging Face Transformers, Whisper, LangChain, ChromaDB | | Streamlit | Error Handling |
| Multimodal RAG with Video Frames | Gemini, LlamaIndex, Qdrant | | | |
| Books Reranking Semantic Search | OpenAI, LlamaIndex, Deep Lake | | | |
| RAG Evaluation with Ragas | OpenAI, Hugging Face Transformers, Faiss, LangChain, Ragas | | | |
| PII RAG LlamaIndex Milvus | OpenAI, Presidio, LlamaIndex, Milvus | | | |
| Multimodal RAG with PyMuPDF | OpenAI, Qdrant, LlamaIndex, PyMuPDF | | | |
| Agentic RAG LlamaIndex Milvus | OpenAI, Claude, LlamaIndex, Milvus | | | |
| Agentic RAG with LangChain | OpenAI, Groq, LangChain, Pinecone | | | |
| Agentic RAG with CrewAI | OpenAI, LangChain, Qdrant, CrewAI Agents | | | |
| Fine Tuning Gemma 2B | Hugging Face Transformers, PEFT (LoRA/QLoRA) | Hugging Face | | |

Data Analysis + Modeling

| Project Name | Main Libraries/Tools | Cloud Service | App | DevOps Best Practices |
|---|---|---|---|---|
| News Classification | Scikit-learn (Multinomial Naive Bayes), TensorFlow (CNN, RNN, feedforward) | | Streamlit | |
| Breast Cancer Classification | Scikit-learn, Spark | IBM | | |
| Bank Churn Classification | Scikit-learn, LightGBM, XGBoost, CatBoost | | | |

Data Engineering

| Project Name | Main Libraries/Tools | Cloud Service | App | DevOps Best Practices |
|---|---|---|---|---|
| Hotel Reviews | Prefect, Spark, SQL, BigQuery, dbt, Terraform, Looker | GCP | | Orchestration, Linting, Formatting, Error Handling, Pre-Commit, IaC, CI/CD |
| Air Quality Switzerland | Mage, dbt, SQL, BigQuery, Docker, Terraform, Looker | GCP | | Orchestration, IaC, Containerization, CI/CD |

Miscellaneous

| Project Name | Main Libraries/Tools | Cloud Service | App | DevOps Best Practices |
|---|---|---|---|---|
| Gradio Application with Descope Authentication | Flask, Descope | | Gradio | Error Handling |
| Justicio Web Scraping | Beautiful Soup, MySQL | | | Error Handling |
| Docker + uv Benchmark | Docker, FastAPI, uv | | | Error Handling |
| AWS S3 Buckets Deletion | AWS S3 | AWS | | Error Handling |

 

🧮 Tech Stack

Visual Studio Code, HTML5, CSS3, Jupyter Notebook, MySQL, SQLite, Looker Studio, Python, Pandas, NumPy, Plotly, Matplotlib, Databricks, Spark, scikit-learn, TensorFlow, PyTorch, Hugging Face, OpenAI, FastAPI, Flask, Docker, Kubernetes, Anaconda, Linux, Ubuntu, Google Cloud, AWS, Terraform, Prefect, dbt, MLflow, GitHub Actions, Git, Streamlit

Pinned

  1. crewai-rag-langchain-qdrant (Public)

    Agentic RAG with LangChain, Qdrant and CrewAI

    Jupyter Notebook · 53 stars · 8 forks

  2. multimodal-llm-pymupdf4llm (Public)

    Multimodal LLM Application with PyMuPDF4LLM

    Jupyter Notebook · 36 stars · 7 forks

  3. rag-aws-qdrant (Public)

    Serverless Application with AWS Lambda and Qdrant for Semantic Search

    Python · 16 stars · 3 forks

  4. mlops-databricks-credit-default (Public)

    End-to-end MLOps Credit Default Project using DABs

    Python · 17 stars · 14 forks

  5. aws-bedrock-opensearch-langchain (Public)

    RAG Application with LangChain, Terraform, AWS OpenSearch and AWS Bedrock

    Python · 8 stars · 1 fork

  6. rag-langchain-ragas (Public)

    RAG project for QA retrieval using LangChain and Ragas

    Jupyter Notebook · 7 stars · 3 forks