mrobee/README.md

Hi, I'm Muhammad Robby 👋

🚀 Senior Data Engineer with 10+ years of experience designing, building, and optimizing large-scale data platforms across fintech, e-commerce, edtech, and global tech companies.

I specialize in turning complex data challenges into scalable, cost-efficient, and reliable systems, from architecting multi-cloud platforms to implementing real-time streaming pipelines.


🛠 Tech Stack & Expertise

  • Languages: Python (Expert), SQL (Expert), Scala, Java, Golang
  • Data Engineering Tools: Apache Spark, Kafka, Apache Airflow, dbt, Debezium, GCP Dataflow, BigQuery, Dataproc
  • Cloud & Databases: Google Pub/Sub, AWS Lambda, AWS EMR, DynamoDB, Presto, Hive, Elasticsearch, MongoDB, PostgreSQL, MySQL
  • DevOps & CI/CD: GitLab CI, GitHub Actions, Terraform, Kubernetes, Docker, AWS CloudFormation

📌 Highlighted Projects

(Selected to showcase scale, complexity, and measurable results)

End-to-End Data Pipeline Monitoring System

  • Stack: Airflow, GCP Dataflow, BigQuery
  • Impact: Reduced user-reported data errors by introducing proactive monitoring & alerting.
  • Result: Improved trust and reliability in production data pipelines.

Cost-Optimized Data Ingestion Architecture

  • Stack: Kafka, Spark, GCP
  • Impact: Redesigned the ingestion process to cut costs by 50% while maintaining performance.
  • Result: Significant cloud cost savings without compromising throughput.

Machine Learning–Driven Credit Risk Scoring

  • Stack: Python, AWS, Scikit-learn
  • Impact: Deployed an ML scoring engine to improve the accuracy of credit risk assessments.
  • Result: Enabled better decision-making in loan approvals.

(More projects coming soon - stay tuned!)


🌍 Work Experience Snapshot

  • Senior Data Engineer - Leaf Grow (UK, Marketing Tech), 2024–Present
  • Data Engineer Lead - Hijra Group (Indonesia, Bank & Fintech)
  • Senior Data Engineer - Zenius Education (Indonesia, EdTech)
  • Data Engineer Lead - Warung Pintar (Indonesia, Retail Distribution)
  • Data Engineer - Bukalapak (Indonesia, E-commerce)
  • Data Engineer - Provetic Indonesia (Consulting)

📫 Let's Connect


💡 “I design and optimize large-scale, cost-efficient data systems that empower businesses to make smarter, faster decisions.”

Pinned

  1. airflow-spark (Public)

    Apache Airflow with Apache Spark cluster support

    Dockerfile 3

  2. Reroute Elasticsearch Unassigned Shards

    #!/usr/bin/sh

    host=$1
    port=$2