1 | 1 | # DataFusion-SQLancer
2 |
| -This is DataFusion's implementation of SQLancer on SQLancer's testing framework. See the original [README](https://github.com/sqlancer/sqlancer). |
3 |
| -`datafusion-sqlancer` has found ~10 [bugs](https://github.com/apache/datafusion/issues?q=is%3Aissue+label%3Abug+sqlancer+) with only a small subset of SQL features implemented. |
| 2 | +This is [DataFusion](https://github.com/apache/datafusion)'s implementation of [SQLancer](https://github.com/sqlancer/sqlancer) on SQLancer's testing framework. |
| 3 | + |
| 4 | +`datafusion-sqlancer` has found ~30 [bugs](https://github.com/apache/datafusion/issues?q=is%3Aissue+label%3Abug+sqlancer+) in DataFusion. |
4 | 5 | # Overview
5 | 6 | SQLancer (Synthesized Query Lancer) is a tool that automatically tests Database Management Systems (DBMS) to find logic bugs in their implementations. It is a black-box fuzzer that performs SQL-level testing.
6 | 7 |
@@ -88,11 +89,24 @@ Notes for query generation:
88 | 89 | - `generateAndTestDatabase()` is a "driver" method to create random databases, generate queries, and finally test results.
89 | 90 | - `DataFusionExpressionGenerator.java` includes top-down expression generation logic.
90 | 91 | - `test/DataFusionNoRECOracle.java` contains the final result check for generated queries.
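
The top-down generation mentioned in the notes above can be illustrated with a minimal sketch. This is not the code in `DataFusionExpressionGenerator.java`: the class name `ExprGenSketch`, the column names, and the operator set are all hypothetical, and only numeric expressions are produced. The idea shown is that generation starts at the root with a depth budget, picks a node kind at random, and recurses into the children until the budget is used up.

```java
import java.util.Random;

/** Hypothetical sketch of depth-bounded, top-down random expression generation. */
final class ExprGenSketch {
    // Assumed column names and operators, chosen only for illustration.
    private static final String[] COLUMNS = {"v0", "v1"};
    private static final String[] BINARY_OPS = {"+", "-", "*"};

    /** Returns a random numeric SQL expression nested at most maxDepth levels deep. */
    public static String generateNumeric(Random rng, int depth, int maxDepth) {
        // Stop at the depth limit, or randomly, by emitting a leaf node.
        if (depth >= maxDepth || rng.nextBoolean()) {
            return generateLeaf(rng);
        }
        // Otherwise pick an operator first (top-down), then generate both operands.
        String op = BINARY_OPS[rng.nextInt(BINARY_OPS.length)];
        String left = generateNumeric(rng, depth + 1, maxDepth);
        String right = generateNumeric(rng, depth + 1, maxDepth);
        return "(" + left + " " + op + " " + right + ")";
    }

    /** A leaf is either a column reference or a small integer literal. */
    private static String generateLeaf(Random rng) {
        if (rng.nextBoolean()) {
            return COLUMNS[rng.nextInt(COLUMNS.length)];
        }
        return Integer.toString(rng.nextInt(100));
    }

    public static void main(String[] args) {
        Random rng = new Random(); // unseeded, so each run prints a different expression
        System.out.println("SELECT * FROM t0 WHERE " + generateNumeric(rng, 0, 3) + " > 0");
    }
}
```

A real generator would also track the expected type of each subexpression so that, for example, string functions only receive string arguments. The sketch sidesteps that by staying within a single type.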
91 |
| -# Supported Features |
92 |
| -- SQL Features: `SELECT`, `FROM`, `WHERE` |
93 |
| -- Operators: Numeric, Comparison, Logical |
94 |
| -- Scalar Functions: Numeric Scalar Functions |
95 |
| -- SQLancer Test Oracles: `NoREC`, `TLP-Where` |
| 92 | +# Supported SQL Features |
| 93 | +- `JOIN`s, `ORDER BY`, `WHERE` |
| 94 | +- Numeric scalar functions/expression operators |
| 95 | +- String scalar functions/expression operators |
| 96 | +- Aggregate functions, `HAVING` clause |
| 97 | +- Window functions |
| 98 | +- (TODO) Time related data type functions |
| 99 | +- (TODO) Subquery |
| 100 | +- (TODO) Queries from parquet, csv |
| 101 | +- (TODO) Exploit different configurations (change config knobs like `target_partition`, `prefer_hash_join`, etc.) |
| 102 | +# Supported Test Oracles |
| 103 | +Note: most oracles apply only to a subset of the available query types. For advanced SQL features such as window functions, we can only generate random queries and report crashes. |
| 104 | +For more context on the test oracles below, see https://github.com/sqlancer/sqlancer/tree/main |
| 105 | +- NoREC |
| 106 | +- TLP |
| 107 | +- (TODO) PQS |
| 108 | +- (TODO) DQP for logic bugs in joins |
| 109 | +- (TODO) [EET](https://www.usenix.org/conference/osdi24/presentation/jiang#:~:text=To%20find%20logic%20bugs%20in,is%20independent%20of%20query%20patterns.) for logic bugs in joins and subqueries |
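
To make the two supported oracles concrete, here is a minimal sketch of the query pairs they compare. It is an illustration under stated assumptions, not the SQL that `DataFusionNoRECOracle.java` or the TLP oracle actually emits: the class `OracleSketch`, its method names, and the exact query shapes are hypothetical. NoREC checks that the number of rows passing a `WHERE` filter matches the number of rows for which the same predicate evaluates to true in the projection, where the optimizer is less likely to rewrite it. TLP splits a query into three partitions (predicate true, false, and `NULL`) whose union should equal the unfiltered result.

```java
import java.util.List;

/** Hypothetical sketch of the query pairs compared by the NoREC and TLP oracles. */
final class OracleSketch {

    /** NoREC, optimizable side: the predicate sits in WHERE, so the optimizer may rewrite it. */
    public static String norecOptimized(String table, String predicate) {
        return "SELECT COUNT(*) FROM " + table + " WHERE " + predicate;
    }

    /** NoREC, reference side: the predicate is evaluated per row in the projection. */
    public static String norecUnoptimized(String table, String predicate) {
        // Note: SUM over an empty table yields NULL, which the comparison must treat as 0.
        return "SELECT SUM(CASE WHEN " + predicate + " THEN 1 ELSE 0 END) FROM " + table;
    }

    /** TLP: three queries whose results together should cover every row exactly once. */
    public static List<String> tlpPartitions(String table, String predicate) {
        String base = "SELECT * FROM " + table + " WHERE ";
        return List.of(
                base + "(" + predicate + ")",
                base + "NOT (" + predicate + ")",
                base + "(" + predicate + ") IS NULL");
    }

    /** TLP: the combined query to compare against a plain SELECT * FROM table. */
    public static String tlpCombined(String table, String predicate) {
        return String.join(" UNION ALL ", tlpPartitions(table, predicate));
    }

    public static void main(String[] args) {
        System.out.println(norecOptimized("t0", "v0 > 1"));
        System.out.println(norecUnoptimized("t0", "v0 > 1"));
        System.out.println(tlpCombined("t0", "v0 > 1"));
    }
}
```

In both cases a mismatch between the two sides points to a logic bug rather than a crash, which is why these oracles only apply to query shapes where such an equivalent rewrite exists.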
96 | 110 | # Bug Report
97 | 111 | If `SQLancer` finds a bug, it prints a full reproducer to the terminal and also writes it to `logs/datafusion_custom_log/error_report.log`.
98 | 112 | 1. First, verify the bug against the latest `datafusion` main branch.