Learn how to use Data Connectors to query external data.

Data Connectors

Data Connectors provide connections to databases, data warehouses, and data lakes for federated SQL queries and data replication.

Supported Data Connectors include:

| Name | Description | Protocol/Format |
| --- | --- | --- |
| `databricks (mode: delta_lake)` | [Databricks][databricks] | S3/Delta Lake |
| `delta_lake` | Delta Lake | Delta Lake |
| `dremio` | [Dremio][dremio] | Arrow Flight |
| `duckdb` | DuckDB | Embedded |
| `github` | GitHub | GitHub API |
| `postgres` | PostgreSQL | |
| `s3` | [S3][s3] | Parquet, CSV |
| `mysql` | MySQL | |
| `graphql` | GraphQL | JSON |
| `databricks (mode: spark_connect)` | [Databricks][databricks] | [Spark Connect][spark] |
| `flightsql` | FlightSQL | Arrow Flight SQL |
| `mssql` | Microsoft SQL Server | Tabular Data Stream (TDS) |
| `snowflake` | Snowflake | Arrow |
| `spark` | Spark | [Spark Connect][spark] |
| `spice.ai` | [Spice.ai][spiceai] | Arrow Flight |
| `iceberg` | [Apache Iceberg][iceberg] | Parquet |
| `abfs` | Azure BlobFS | Parquet, CSV |
| `clickhouse` | Clickhouse | |
| `debezium` | Debezium CDC | Kafka + JSON |
| `dynamodb` | DynamoDB | |
| `ftp`, `sftp` | FTP/SFTP | Parquet, CSV |
| `http`, `https` | HTTP(s) | Parquet, CSV |
| `sharepoint` | Microsoft SharePoint | Unstructured UTF-8 documents |

Object Store File Formats

For data connectors that are object store compatible, if a folder is provided, the file format must be specified with params.file_format.

If a file is provided, the file format will be inferred, and params.file_format is unnecessary.
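
The folder-versus-file rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described, not the runtime's actual implementation: the helper name `resolve_file_format`, the extension mapping, and the use of a trailing slash to detect a folder are all assumptions made for the example.

```python
from pathlib import PurePosixPath
from typing import Optional

# Assumed mapping from file extension to a file_format value.
# Only formats listed in this section that are not marked "Roadmap" are included.
EXTENSION_TO_FORMAT = {
    ".parquet": "parquet",
    ".csv": "csv",
    ".md": "md",
    ".txt": "txt",
    ".pdf": "pdf",
    ".docx": "docx",
}


def resolve_file_format(path: str, params: Optional[dict] = None) -> str:
    """Hypothetical helper: pick the file format for an object store path."""
    params = params or {}

    # An explicit params.file_format always wins, for folders and files alike.
    explicit = params.get("file_format")
    if explicit:
        return explicit

    # Assumption: a trailing slash marks a folder, which cannot be inferred.
    if path.endswith("/"):
        raise ValueError("params.file_format is required when a folder is provided")

    # For a single file, infer the format from its extension.
    suffix = PurePosixPath(path).suffix.lower()
    if suffix not in EXTENSION_TO_FORMAT:
        raise ValueError(f"cannot infer file format from {path!r}")
    return EXTENSION_TO_FORMAT[suffix]
```

With this sketch, a path to a single `.parquet` file resolves without any parameters, while a folder path raises an error unless `file_format` is supplied in `params`.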

File formats currently supported are:

| Name | Parameter | Supported | Is Document Format |
| --- | --- | --- | --- |
| Apache Parquet | `file_format: parquet` | | |
| CSV | `file_format: csv` | | |
| Apache Iceberg | `file_format: iceberg` | Roadmap | |
| JSON | `file_format: json` | Roadmap | |
| Microsoft Excel | `file_format: xlsx` | Roadmap | |
| Markdown | `file_format: md` | | |
| Text | `file_format: txt` | | |
| PDF | `file_format: pdf` | Alpha | |
| Microsoft Word | `file_format: docx` | Alpha | |

File formats accept additional parameters in params (such as csv_has_header), which are described in File Formats.

If a format is a document format, each file will be treated as a document, as per document support below.

{% hint style="info" %}
Note: Document formats in Alpha (e.g. pdf, docx) may not parse all structure or text from the underlying documents correctly.
{% endhint %}