spicepy

Build Lint Test codecov PyPI version Python 3.10+ License

Spice.ai client library for Python.

Installation

pip install git+https://github.com/spiceai/spicepy@v3.1.0

For parameterized query support, install with the optional params extra:

pip install "spicepy[params]"

Usage

Arrow Query with local spice runtime

Follow the quickstart guide to install and run Spice locally.

from spicepy import Client

client = Client()
data = client.query('SELECT trip_distance, total_amount FROM taxi_trips ORDER BY trip_distance DESC LIMIT 10;', timeout=5*60)
df = data.read_pandas()  # named df so it does not shadow the usual pandas alias pd

Arrow Query with spice.ai cloud

SQL Query

from spicepy import Client

client = Client(
      api_key='API_KEY',
      flight_url="grpc+tls://flight.spiceai.io"
)
data = client.query('SELECT * FROM taxi_trips LIMIT 10;', timeout=5*60)
df = data.read_pandas()  # named df so it does not shadow the usual pandas alias pd

Parameterized Queries (Recommended)

Use parameterized queries to prevent SQL injection and improve query performance. Parameters use positional placeholders ($1, $2, etc.):

from spicepy import Client

client = Client()

# Query with automatic type inference
reader = client.query_with_params(
    'SELECT trip_distance, fare_amount FROM taxi_trips WHERE trip_distance > $1 LIMIT 10',
    [5.0]
)

for batch in reader:
    print(batch.to_pandas())

# Query without parameters (use empty list)
reader = client.query_with_params(
    'SELECT * FROM taxi_trips LIMIT 10',
    []
)

Multiple Parameters

reader = client.query_with_params(
    'SELECT trip_distance, fare_amount FROM taxi_trips WHERE trip_distance > $1 AND fare_amount > $2 LIMIT 10',
    [5.0, 20.0]
)
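
The injection-safety point above can be illustrated without a running runtime. The sketch below uses a hypothetical helper, build_trip_filter, which is not part of spicepy: it keeps the SQL text fixed and returns the values separately, in the (sql, params) shape that query_with_params accepts in the examples above.

```python
def build_trip_filter(min_distance, min_fare):
    """Return a (sql, params) pair using positional placeholders.

    Hypothetical helper for illustration only; not part of spicepy.
    """
    sql = (
        "SELECT trip_distance, fare_amount FROM taxi_trips "
        "WHERE trip_distance > $1 AND fare_amount > $2 LIMIT 10"
    )
    # The values travel next to the SQL text, never inside it.
    return sql, [min_distance, min_fare]


sql, params = build_trip_filter(5.0, 20.0)

# Hostile input changes only the parameter list, not the SQL text.
hostile_sql, hostile_params = build_trip_filter("0 OR 1=1", 20.0)

# With a running Spice runtime (assumed), the pair is passed straight through:
# reader = client.query_with_params(sql, params)
```

Because the query string is identical for every input, user-supplied values cannot alter the statement's structure.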

Explicit Type Control

For precise control over Arrow types, use tuples of (value, pyarrow.DataType):

import pyarrow as pa
from spicepy import Client

client = Client()

reader = client.query_with_params(
    'SELECT * FROM table WHERE id = $1 AND amount = $2',
    [(123, pa.int32()), (99.99, pa.float64())]
)

Common PyArrow types:

  • Integers: pa.int8(), pa.int16(), pa.int32(), pa.int64(), pa.uint8(), pa.uint16(), pa.uint32(), pa.uint64()
  • Floating point: pa.float16(), pa.float32(), pa.float64()
  • Strings: pa.string(), pa.large_string()
  • Binary: pa.binary(), pa.large_binary()
  • Boolean: pa.bool_()
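  • Temporal: pa.date32(), pa.date64(), pa.time32(unit), pa.time64(unit), pa.timestamp(unit), pa.duration(unit); unit is 's' or 'ms' for time32, 'us' or 'ns' for time64, and any of the four for timestamp and duration
  • Decimals: pa.decimal128(precision, scale), pa.decimal256(precision, scale)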
  • Null: pa.null()

See the PyArrow documentation for the full list of available types.

Data is queried through a Client object, which initializes the connection to the Spice endpoint. Client takes the following arguments:

  • api_key (string, required for Spice.ai cloud): API key to authenticate with the endpoint. It can be omitted for a local runtime, as in the Client() example above.
  • flight_url (string, optional): URL of the endpoint to use (default: grpc+tls://flight.spiceai.io; firecache: grpc+tls://firecache.spiceai.io)
  • tls_root_cert (Path or string, optional): Path to the TLS certificate to use for the secure connection (omit for automatic detection)
  • user_agent (string, optional): A custom User-Agent string to pass when connecting to Spice. Use spicepy.config.get_user_agent to build the custom User-Agent
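
One way to keep the API key out of source code is to collect the Client arguments from environment variables. This is a minimal sketch: the variable names SPICE_API_KEY and SPICE_FLIGHT_URL and the helper client_kwargs_from_env are assumptions made for illustration, and the keyword names api_key and flight_url follow the cloud example earlier in this README.

```python
import os


def client_kwargs_from_env(env=None):
    """Collect Client keyword arguments from environment variables.

    SPICE_API_KEY and SPICE_FLIGHT_URL are hypothetical names chosen for
    this sketch; spicepy itself does not define them here.
    """
    env = os.environ if env is None else env
    kwargs = {}
    if env.get("SPICE_API_KEY"):
        kwargs["api_key"] = env["SPICE_API_KEY"]
    if env.get("SPICE_FLIGHT_URL"):
        kwargs["flight_url"] = env["SPICE_FLIGHT_URL"]
    return kwargs


# Example with an explicit mapping instead of the real environment.
kwargs = client_kwargs_from_env({"SPICE_API_KEY": "API_KEY"})

# With spicepy installed, the result can be unpacked into the constructor:
# from spicepy import Client
# client = Client(**kwargs)
```

Unset variables are simply left out, so the same call yields an empty dict for a local runtime and falls back to Client().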

Once a Client is obtained, queries can be made using the query() function, which takes the following arguments:

  • query (string, required): The SQL query.
  • timeout (int, optional): The timeout in seconds.

To set a custom timeout, pass the timeout parameter in the query() call. If no timeout is specified, it defaults to 10 minutes. When the timeout elapses, the query is cancelled and a TimeoutError exception is raised.
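
The timeout behavior described above can be handled with an ordinary try/except. The sketch below assumes the exception is Python's built-in TimeoutError. query_or_none is a hypothetical wrapper, and the stand-in client exists only so the timeout path can be exercised without a runtime; neither is part of spicepy.

```python
def query_or_none(client, sql, timeout=60):
    """Run client.query(sql, timeout=...) and return None on timeout.

    Hypothetical wrapper for illustration; not part of spicepy.
    """
    try:
        return client.query(sql, timeout=timeout)
    except TimeoutError:
        # Per the text above, the query has been cancelled by this point.
        return None


class AlwaysTimesOut:
    """Stand-in client that simulates a query exceeding its timeout."""

    def query(self, sql, timeout=None):
        raise TimeoutError(f"query exceeded {timeout}s")


result = query_or_none(AlwaysTimesOut(), "SELECT 1", timeout=1)
```

Returning None is only one possible policy; re-raising or retrying with a longer timeout may suit other callers better.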

Documentation

Check out our Documentation to learn more about how to use the Python SDK.