Stars

Data tools

62 repositories

Supports agile DataOps based on Flink, DataX, Flink-CDC, and Chunjun, with a web UI

Java 1,089 232 Updated Feb 25, 2025

The developer-first cloud governance platform

Go 6,011 520 Updated Feb 24, 2025

A curated list of awesome ETL frameworks, libraries, and software.

3,354 348 Updated Jul 23, 2024

Kuwala is the no-code data platform for BI analysts and engineers enabling you to build powerful analytics workflows. We are set out to bring state-of-the-art data engineering tools you love, such …

JavaScript 793 54 Updated Aug 10, 2022

Efficient data transformation and modeling framework that is backwards compatible with dbt.

Python 2,100 184 Updated Feb 25, 2025

Dozer is a real-time data movement tool that leverages CDC from various sources and moves data into various sinks.

Rust 1,533 125 Updated Jun 18, 2024

Addax is a versatile open-source ETL tool that can seamlessly transfer data between various RDBMS and NoSQL databases, making it an ideal solution for data migration.

Java 1,229 310 Updated Feb 21, 2025

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

Java 982 124 Updated Feb 25, 2025

Hop Orchestration Platform

Java 1,072 357 Updated Feb 25, 2025

Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.

Go 746 155 Updated Jun 8, 2024

Mirror of https://gitlab.com/Rehket/febrl

Python 6 6 Updated Dec 8, 2022

SQL tools (Dialect, Pagination, DDL dump, UrlParser, SqlStatementParser, WallFilter, BatchExecutor for Test) based on Java. Easy to integrate into any ORM framework.

Java 311 64 Updated Feb 15, 2025

Extensible SQL Lexer and Parser for Rust

Rust 2,944 583 Updated Feb 25, 2025

[ICLR 2025] The First Multimodal Search Engine Pipeline and Benchmark for LMMs

Python 416 30 Updated Jan 23, 2025

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.

Java 1,249 406 Updated Feb 25, 2025

Open-source low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.

Python 2,127 209 Updated Jun 27, 2024

🤖 MLE-Agent: Your intelligent companion for seamless AI engineering and research. 🔍 Integrate with arxiv and paper with code to provide better code/research plans 🧰 OpenAI, Anthropic, Ollama, etc s…

Python 1,233 60 Updated Feb 3, 2025

"AnyGraph: Graph Foundation Model in the Wild"

Python 206 18 Updated Sep 19, 2024

The Data Change Processing platform

Rust 1,051 36 Updated Feb 25, 2025

Big data computing platform based on Spark <至轻云: ultra-lightweight big data computing platform / data middle platform>

Java 150 43 Updated Feb 25, 2025

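A next-generation real-time computing foundation with computing performance 100 times that of Flink/Spark. XL-LightHouse is a general-purpose streaming big data statistics system that supports very large data volumes and very high concurrency [a standalone version is also supported]. Common use cases include: PV and UV statistics; e-commerce sales and ordering-user statistics; log volume statistics; API call counts, error counts, and latency statistics; server operations monitoring; and more. The system supports multi-dimensional statistics and complex conditional filtering and logical evaluation, with one-click deployment and one-line-of-code integration, making it easy to implement business…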

Java 291 40 Updated Feb 22, 2025

Convert Jupyter Notebooks to Web Apps

Python 4,139 261 Updated Dec 6, 2024

🔎 Open source distributed and RESTful search engine.

Java 10,246 1,962 Updated Feb 25, 2025

pingcap/autoflow is a Graph RAG based and conversational knowledge base tool built with TiDB Serverless Vector Storage. Demo: https://tidb.ai

TypeScript 2,339 129 Updated Feb 24, 2025

Ape Data Transfer Suite, written in Rust. Provides ultra-fast data replication between MySQL, PostgreSQL, Redis, MongoDB, Kafka and ClickHouse, ideal for disaster recovery (DR) and migration scenar…

Rust 323 41 Updated Feb 24, 2025

A data visualization and analytics component, especially well-suited for large and/or streaming datasets.

C++ 8,827 1,203 Updated Feb 23, 2025

SeekStorm - sub-millisecond full-text search library & multi-tenancy server in Rust

Rust 1,623 49 Updated Feb 15, 2025

Rust-powered, Dependency-Free DataFrame Library for Node.js

6 Updated Nov 28, 2024

Database diagrams editor that allows you to visualize and design your DB with a single query.

TypeScript 14,070 682 Updated Feb 24, 2025

Command-line tool for running SQL queries against JSON, CSV, Excel, Parquet, and more.

Go 3,787 162 Updated Sep 30, 2023