🧪 Big Data Infrastructure Test Suite

This repository documents my experiments and tests with a wide range of Big Data technologies and infrastructure components. The goal is to understand how these tools work individually and together within modern data platforms.

🛠️ Technologies Explored

Apache Hadoop – Distributed storage and processing
Apache Hive – Data warehousing on top of Hadoop
PostgreSQL – Relational database for metadata and integration
Apache Spark – Fast, in-memory data processing engine
Apache Kylin – OLAP engine for real-time analytics
Apache HBase – NoSQL database on HDFS
Apache Kafka – Distributed messaging and streaming platform
Cloudflare – Edge networking and security (DNS, caching, etc.)
Docker Compose – Local container orchestration
Kubernetes – Scalable container orchestration for cloud-native deployments
MinIO – High-performance, S3-compatible object storage designed for large-scale data infrastructure and cloud-native environments

Folders and files (12 commits)

apache-airflow-clickhouse-sample
apache-airflow
apache-spark
clickhouse+grafana
hadoop
iceberg-minio
k8-minio
k8s
kafka
kylin
minIO+spark+iceberg
postgre+hive
scripts
README.md

marcus-exe/bigdata-infra-suite
