marcus-exe/bigdata-infra-suite


🧪 Big Data Infrastructure Test Suite

This repository documents my experiments and tests with a wide range of Big Data technologies and infrastructure components. The goal is to understand how these tools work individually and together within modern data platforms.


🛠️ Technologies Explored

  • Apache Hadoop – Distributed storage and processing
  • Apache Hive – Data warehousing on top of Hadoop
  • PostgreSQL – Relational database for metadata and integration
  • Apache Spark – Fast, in-memory data processing engine
  • Apache Kylin – Distributed OLAP engine for low-latency analytical queries on large datasets
  • Apache HBase – Wide-column NoSQL database on top of HDFS
  • Apache Kafka – Distributed messaging and streaming platform
  • Cloudflare – Edge networking and security (DNS, caching, etc.)
  • Docker Compose – Local container orchestration
  • Kubernetes – Scalable container orchestration for cloud-native deployments
  • MinIO – High-performance, S3-compatible object storage designed for large-scale data infrastructure and cloud-native environments
