Commit 24bf225

Add async queries cookbook recipe (#357)
* Add async queries cookbook recipe

  Demonstrates the async queries API for long-running SQL queries in distributed cluster mode:
  - mTLS certificate setup with spice cluster tls
  - Scheduler and executor startup
  - Submit/poll/retrieve via HTTP REST API (curl examples)
  - spice query CLI and interactive REPL usage
  - Parameterized queries, timeouts, and size limits

  References:
  - https://spiceai.org/docs/features/async-queries-api
  - https://spiceai.org/docs/features/distributed-query

* Update cookbook links to point to integrated distributed-query page
* refactor(README): reorganize recipe sections and improve descriptions for clarity
* fix(README): clarify Spice CLI version requirement and improve REPL commands table formatting
1 parent 5b687cc commit 24bf225

3 files changed

Lines changed: 406 additions & 55 deletions

README.md

Lines changed: 66 additions & 55 deletions
@@ -11,51 +11,56 @@ Welcome to the Spice.ai OSS Cookbook—a comprehensive collection of recipes for
 ### Core scenarios

 - [Federated SQL Query](./federation/README.md) - Query data from S3, PostgreSQL, and Dremio in a single query.
+- [Cayenne Data Accelerator](./cayenne/README.md)
+- [Async Queries](./async-queries/README.md) - Submit long-running SQL queries and retrieve results asynchronously.
+- [Hybrid-Search](./search/README.md) - Combine keyword and vector search for improved retrieval.
+- [AI SQL Function](./ai/README.md) - Use the `ai()` SQL function to invoke LLMs directly in SQL queries for text generation, sentiment analysis, and data enrichment.

 ### Sample Applications

 - [Command Query Responsibility Segregation (CQRS)](./cqrs/README.md) - Sample application implementing the CQRS pattern with Spice.

 ### Models & AI - Connect data to hosted or local AI models

-- [AI SQL Function](./ai/README.md) - Use the `ai()` SQL function to invoke LLMs directly in SQL queries for text generation, sentiment analysis, and data enrichment.
-- [Azure OpenAI Models](./azure_openai/README.md)
-- [Generative Visualizations](./generative-visualisations/README.md) - Generate SQL queries and Chart.js visualizations from natural language using AI.
-- [Running Llama3 Locally](./llama/README.md) - Use the Llama family of models locally from HuggingFace using Spice.
+- [AI SQL Function](./ai/README.md) - Invoke LLMs directly in SQL queries for text generation and data enrichment.
+- [Azure OpenAI Models](./azure_openai/README.md) - Use Azure OpenAI for search and chat.
+- [Generative Visualizations](./generative-visualisations/README.md) - Generate SQL queries and visualizations from natural language.
+- [Running Llama3 Locally](./llama/README.md) - Run Llama models locally from HuggingFace.
 - [OpenAI Models](./models/openai/README.md) - Use OpenAI LLM and embedding models.
 - [OpenAI SDK](./openai_sdk/README.md) - Use the OpenAI SDK to connect to models hosted on Spice.
 - [LLM Memory](./llm-memory/README.md) - Persistent memory for language models.
-- [Text to SQL (Tools)](./text-to-sql/README.md)
-- [Nvidia NIM on Kubernetes](./nvidia-nim/kubernetes/README.md) - Deploy Nvidia NIM infrastructure, on Kubernetes, with GPUs connected to Spice.
-- [Nvidia NIM on AWS EC2](./nvidia-nim/ec2/README.md) - Deploy Nvidia NIM on AWS GPU-optimized EC2 instances connected to Spice.
-- [Searching GitHub Files](./search_github_files/README.md) - Search GitHub files with embeddings and vector similarity search.
+- [Text to SQL (Tools)](./text-to-sql/README.md) - Query data with natural language.
+- [Nvidia NIM on Kubernetes](./nvidia-nim/kubernetes/README.md) - Deploy Nvidia NIM on Kubernetes with GPUs.
+- [Nvidia NIM on AWS EC2](./nvidia-nim/ec2/README.md) - Deploy Nvidia NIM on AWS GPU-optimized EC2 instances.
+- [Searching GitHub Files](./search_github_files/README.md) - Search GitHub files with embeddings and vector search.
 - [xAI Models](./models/xai/README.md) - Use xAI models such as Grok.
 - [DeepSeek Model](./deepseek/README.md) - Use DeepSeek model through Spice.
 - [Filesystem Hosted Model](./models/filesystem/README.md) - Use models hosted directly on filesystems.
-- [Web Search Tools using Perplexity](./websearch/README.md) - Provide LLMs with web search access for more informed answers.
+- [Web Search Tools using Perplexity](./websearch/README.md) - Give LLMs web search access via Perplexity.
 - [Language Model Evaluations](./evals/README.md) - Use Spice to evaluate language models.
-- [LLM as a Judge](./llm-judge/README.md) - Define LLM judge models to evaluate the performance of other language models.
+- [LLM as a Judge](./llm-judge/README.md) - Define LLM judge models to evaluate other models.
 - [OpenAI Responses API](./openai-responses-api/README.md) - Use OpenAI's Responses API with Spice
+- [Model Context Protocol (MCP)](./mcp/README.md) - Connect to MCP servers and use MCP tools with Spice.

 ### Data Acceleration - Materializing & accelerating data locally with Data Accelerators

-- [Cayenne Data Accelerator](./cayenne/README.md)
-- [DuckDB Data Accelerator](./duckdb/accelerator/README.md)
-- [Hashed Partitioning with DuckDB](./hashed_partitioning/README.md)
-- [PostgreSQL Data Accelerator](./postgres/accelerator/README.md)
-- [SQLite Data Accelerator](./sqlite/accelerator/README.md)
-- [Database Snapshots](./acceleration/snapshots/README.md) - Bootstrap DuckDB accelerations from object storage to skip cold starts.
-- [Apache Arrow Data Accelerator](./arrow/README.md)
-- [Accelerated Views](./views/README.md)
+- [Cayenne Data Accelerator](./cayenne/README.md) - Accelerate data using Cayenne.
+- [DuckDB Data Accelerator](./duckdb/accelerator/README.md) - Accelerate data using DuckDB.
+- [Hashed Partitioning with DuckDB](./hashed_partitioning/README.md) - Prune data with hashed partitioning on categorical columns.
+- [PostgreSQL Data Accelerator](./postgres/accelerator/README.md) - Materialize data into an attached PostgreSQL instance.
+- [SQLite Data Accelerator](./sqlite/accelerator/README.md) - Accelerate data using SQLite.
+- [Database Snapshots](./acceleration/snapshots/README.md) - Bootstrap accelerations from object storage to skip cold starts.
+- [Apache Arrow Data Accelerator](./arrow/README.md) - Accelerate data using in-memory Arrow.
+- [Accelerated Views](./views/README.md) - Pre-calculate and materialize derived data for faster queries.
 - [Dataset Partitioning](./acceleration/partitioning/README.md) - Partition accelerated datasets to improve query performance.

 ### Consuming and visualizing data with clients

 - [Sales BI (Apache Superset)](./sales-bi/README.md) - Visualize data in Spice with Apache Superset.
 - [Grafana Datasource](./grafana-datasource/README.md) - Add Spice as a Grafana datasource.
-- [Python ADBC Client](./clients/adbc/README.md) - Query Spice using ADBC and Parameterized Queries with Python.
-- [Java JDBC Client](./clients/java/README.md) - Query Spice using JDBC and Parameterized Queries with Java.
-- [Scala JDBC Client](./clients/scala/README.md) - Query Spice using JDBC and Parameterized Queries with Scala.
+- [Python ADBC Client](./clients/adbc/README.md) - Query Spice using ADBC with Python.
+- [Java JDBC Client](./clients/java/README.md) - Query Spice using JDBC with Java.
+- [Scala JDBC Client](./clients/scala/README.md) - Query Spice using JDBC with Scala.

 ### Connecting to Data Sources with Data Connectors

@@ -65,49 +70,54 @@ Welcome to the Spice.ai OSS Cookbook—a comprehensive collection of recipes for
 - [MySQL Data Connector](./mysql/connector/README.md)
 - [AWS RDS Aurora (MySQL Compatible)](./mysql/rds-aurora/README.md)
 - [PlanetScale](./mysql/planetscale/README.md)
-- [Clickhouse Data Connector](./clickhouse/README.md)
+- [Clickhouse Data Connector](./clickhouse/README.md) - Connect to ClickHouse as a data source.
 - [Databricks Connector](./databricks/README.md) - Delta Lake and Spark Connect.
 - [Delta Lake Connector](./delta-lake/README.md) - Query data from Delta Lake tables.
-- [Debezium Change Data Capture (CDC) Data Connector from Postgres](./cdc-debezium/README.md) - Stream changes from a Postgres database to Spice.
-- [Debezium CDC SASL/SCRAM Authentication from MySQL](./cdc-debezium/sasl-scram/README.md) - Stream changes from a MySQL database to Spice using SASL/SCRAM authentication.
-- [Dremio Data Connector](./dremio/README.md)
+- [Debezium CDC Data Connector](./cdc-debezium/README.md) - Stream changes from Postgres to Spice.
+- [Debezium CDC SASL/SCRAM from MySQL](./cdc-debezium/sasl-scram/README.md) - Stream changes from MySQL using SASL/SCRAM.
+- [DynamoDB Data Connector](./dynamodb/README.md) - Query data from an AWS-hosted DynamoDB table.
+- [DynamoDB Streams](./dynamodb/streams/README.md) - Stream real-time changes from DynamoDB tables.
+- [Dremio Data Connector](./dremio/README.md) - Connect to a Dremio instance.
 - [DuckDB Data Connector](./duckdb/connector/README.md) - Use a DuckDB database with sample TPCH data.
 - [File Data Connector](./file/README.md) - Query data from local files.
 - [FTP Data Connector](./ftp/README.md) - Query data from an FTP server.
-- [Glue Data Connector](./glue/README.md)
-- [GitHub Data Connector](./github/README.md)
-- [GraphQL Data Connector](./graphql/README.md)
+- [Glue Data Connector](./glue/README.md) - Query tables in an AWS Glue Data Catalog.
+- [GitHub Data Connector](./github/README.md) - Query GitHub repository data.
+- [GraphQL Data Connector](./graphql/README.md) - Connect to GraphQL endpoints.
 - [HTTP Data Connector](./http/README.md) - Query data from HTTP(s) endpoints like REST APIs.
-- [MSSQL (Microsoft SQL Server) Data Connector](./mssql/README.md)
-- [ODBC Data Connector](./odbc/README.md)
+- [MongoDB Data Connector](./mongodb/connector/README.md) - Connect to MongoDB as a data source.
+- [MSSQL (Microsoft SQL Server) Data Connector](./mssql/README.md) - Query across multiple SQL Server instances.
+- [ODBC Data Connector](./odbc/README.md) - Connect to databases via ODBC.
 - [Amazon Redshift](./redshift/README.md) - Read and write TPC-H data with Amazon Redshift.
-- [Oracle Data Connector](./oracle/README.md)
-- [S3 Data Connector](./s3/README.md)
+- [Oracle Data Connector](./oracle/README.md) - Connect to and accelerate data from Oracle.
+- [S3 Data Connector](./s3/README.md) - Query data from an S3 bucket.
 - [ScyllaDB Data Connector](./scylladb/README.md) - Query data from ScyllaDB clusters using federated SQL.
-- [SharePoint/OneDrive for Business Data Connector](./sharepoint/README.md)
+- [SharePoint/OneDrive for Business Data Connector](./sharepoint/README.md) - Query documents in SharePoint.
 - [SMB Data Connector](./smb/README.md) - Query data files from SMB/CIFS network shares.
-- [Snowflake Data Connector](./snowflake/README.md)
-- [Spice.ai Cloud Platform Data Connector](./spiceai/README.md)
-- [Apache Spark Data Connector](./spark/README.md)
-- [Apache Kafka Data Connector](./kafka/README.md)
-- [IMAP Data Connector](./imap/README.md)
+- [Snowflake Data Connector](./snowflake/README.md) - Access a Snowflake database.
+- [Spice.ai Cloud Platform Data Connector](./spiceai/README.md) - Connect to Spice.ai Cloud Platform datasets.
+- [Apache Spark Data Connector](./spark/README.md) - Read data from an Apache Spark instance.
+- [Apache Kafka Data Connector](./kafka/README.md) - Stream data from Kafka with federated queries.
+- [IMAP Data Connector](./imap/README.md) - Connect to an IMAP email server.
 - [Connecting to an Outlook mailbox](./imap/outlook.md)

 ### Connecting to Data Sources with Catalog Connectors

-- [Spice.ai Cloud Platform Catalog Connector](./catalogs/spiceai/README.md)
-- [Databricks Unity Catalog Connector](./catalogs/databricks/README.md)
-- [Unity Catalog Connector](./catalogs/unity_catalog/README.md)
-- [Iceberg Catalog Connector](./catalogs/iceberg/README.md)
-- [Glue Catalog Connector](./catalogs/glue/README.md)
+- [Spice.ai Cloud Platform Catalog Connector](./catalogs/spiceai/README.md) - Query datasets in Spice.ai Cloud Platform.
+- [Databricks Unity Catalog Connector](./catalogs/databricks/README.md) - Query Databricks Unity Catalog tables.
+- [Unity Catalog Connector](./catalogs/unity_catalog/README.md) - Query an open-source Unity Catalog instance.
+- [Iceberg Catalog Connector](./catalogs/iceberg/README.md) - Query and write to Iceberg tables.
+- [Iceberg Hadoop Catalog Connector](./catalogs/iceberg-hadoop/README.md) - Connect to Hadoop catalogs on S3-compatible storage.
+- [Glue Catalog Connector](./catalogs/glue/README.md) - Query tables in an AWS Glue Data Catalog.

 ### Using Vector Engines

-- [Amazon S3 Vectors](./vectors/s3-vectors/README.md) - Use Amazon S3 as a vector engine for embeddings and similarity search.
+- [Amazon S3 Vectors](./vectors/s3-vectors/README.md) - Use S3 as a vector engine for embeddings and similarity search.

 ## Search

 - [Hybrid-Search](./search/README.md) - Combine keyword and vector search for improved retrieval.
+- [Full-Text Search](./full-text-search/README.md) - Retrieve records matching keywords using BM25 scoring.

 ### Deployment and Installation

@@ -118,24 +128,24 @@ Welcome to the Spice.ai OSS Cookbook—a comprehensive collection of recipes for

 ### Performance

-- [TPC-H Benchmarking](./tpc-h/README.md)
-- [SQL Results Caching](./caching/sql_results/README.md)
-- [Caching Accelerator](./caching/accelerator/README.md) - Intelligent HTTP response caching with Stale-While-Revalidate (SWR) support.
-- [Indexes on Accelerated Data](./acceleration/indexes/README.md)
+- [TPC-H Benchmarking](./tpc-h/README.md) - Run TPC-H benchmark queries.
+- [SQL Results Caching](./caching/sql_results/README.md) - Cache query results in memory for faster repeated queries.
+- [Caching Accelerator](./caching/accelerator/README.md) - HTTP response caching with SWR support.
+- [Indexes on Accelerated Data](./acceleration/indexes/README.md) - Create indexes to improve query performance.

 ### Acceleration Data Configuration

-- [Data Retention Policy](./retention/README.md)
-- [Refresh Data Window](./refresh-data-window/README.md)
-- [Advanced Data Refresh](./acceleration/data-refresh/README.md)
-- [Data Quality with Constraints](./acceleration/constraints/README.md)
+- [Data Retention Policy](./retention/README.md) - Evict data older than a specified duration.
+- [Refresh Data Window](./refresh-data-window/README.md) - Filter data refresh to only recent data.
+- [Advanced Data Refresh](./acceleration/data-refresh/README.md) - Configure and tune data refresh for accelerated datasets.
+- [Data Quality with Constraints](./acceleration/constraints/README.md) - Enforce data quality constraints on accelerated datasets.

 ## Client SDKs - Recipes for querying data from Spice with language-specific SDKs

 - [Rust SDK](client-sdk/spice-rs-sdk-sample/README.md)
 - [Python SDK](client-sdk/spicepy-sdk-sample/README.md)
 - [Go SDK](client-sdk/gospice-sdk-sample/README.md)
-- [JavaScript SDK (Node.js)](client-sdk/spice.js-sdk-sample/README.md) - Query NYC taxi trips data using the [`@spiceai/spice`](https://www.npmjs.com/package/@spiceai/spice) npm package.
+- [JavaScript SDK (Node.js)](client-sdk/spice.js-sdk-sample/README.md) - Query data using the `@spiceai/spice` npm package.
 - [Java SDK](client-sdk/spice-java-sdk-sample/README.md)

 ### Security
@@ -145,5 +155,6 @@ Welcome to the Spice.ai OSS Cookbook—a comprehensive collection of recipes for

 ### Advanced Topics

-- [Local dataset replication](./localpod/README.md) - Link datasets in a parent/child relationship within the current Spicepod
-- [Distributed Query](./distributed/README.md) - Run queries distributed across multiple nodes for maximum performance across large datasets
+- [Local dataset replication](./localpod/README.md) - Link datasets in a parent/child relationship.
+- [Distributed Query](./distributed/README.md) - Run queries distributed across multiple nodes.
+- [JSON Strings](./json_strings/README.md) - Work with JSON strings using JSON functions.
