docs/website/versioned_docs/version-1.11.x/features/semantic-model/index.md at 911cf08cfee1e306315cbe4714228f7c51b8542c · spiceai/docs

title	Semantic Model
sidebar_label	Semantic Model
description	Learn how to define and use semantic data models with Spice.
sidebar_position	11
pagination_prev
pagination_next

Semantic data models in Spice are defined using the datasets[*].columns configuration. These models give data a structured, meaningful representation, which benefits both large language models (LLMs) and traditional data analysis.

Use-Cases

Large Language Models (LLMs)

Spice Models automatically use the semantic model as context, which produces more accurate, context-aware AI responses.

Defining a Semantic Model

Semantic data models are defined within the spicepod.yaml file, under the datasets section. Each dataset supports a description, metadata, and a columns field, where each column can be given its own description and embeddings configuration.

Example Configuration

Example spicepod.yaml:

datasets:
  - name: taxi_trips
    description: NYC taxi trip rides
    metadata:
      instructions: Always provide citations with reference URLs.
      reference_url_template: https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_<YYYY-MM>.parquet
    columns:
      - name: tpep_pickup_time
        description: 'The time the passenger was picked up by the taxi'
      - name: notes
        description: 'Optional notes about the trip'
        embeddings:
          - from: hf_minilm # A defined Spice Model
            chunking:
              enabled: true
              target_chunk_size: 512
              overlap_size: 128
              trim_whitespace: true
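
The from: hf_minilm reference in the example points to an embedding model that must be defined elsewhere in the same spicepod.yaml. A minimal sketch of such a definition might look like the following. The top-level embeddings section layout and the Hugging Face model path are assumptions that this page does not show, so check the embeddings reference for the exact syntax.

```yaml
# Assumed layout for the embedding model referenced by `from: hf_minilm`.
# The model path below is illustrative only.
embeddings:
  - name: hf_minilm
    from: huggingface:huggingface.co/sentence-transformers/all-MiniLM-L6-v2
```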

Dataset Metadata

Datasets can be defined with the following metadata:

instructions: Optional. Instructions to provide to a language model when using this dataset.
reference_url_template: Optional. A URL template for citation links.
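
As a minimal sketch, a dataset that sets only these two metadata fields could look like the following. The dataset name, instructions text, and URL are hypothetical placeholders, and other dataset fields (such as the data source) are omitted for brevity.

```yaml
datasets:
  - name: support_tickets # hypothetical dataset name
    description: Customer support tickets
    metadata:
      # Instructions provided to a language model when it uses this dataset.
      instructions: Cite the ticket URL for every ticket referenced.
      # Hypothetical template; <TICKET_ID> mirrors the <YYYY-MM> placeholder
      # style used in the example configuration.
      reference_url_template: https://example.com/tickets/<TICKET_ID>
```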

For detailed metadata configuration, see the Dataset Reference.

Column Definitions

Each column in the dataset can be defined with the following attributes:

description: Optional. A description of the column's contents and purpose.
embeddings: Optional. Vector embeddings configuration for this column.
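
Because both attributes are optional, a column can carry only a description, or add embeddings on top of it. The sketch below reuses the taxi_trips dataset and the hf_minilm model from the example configuration. The fare_amount column is hypothetical, and omitting the chunking block is an assumption that it is optional.

```yaml
datasets:
  - name: taxi_trips
    columns:
      # Description only: no embeddings are configured for this column.
      - name: fare_amount # hypothetical column
        description: 'Fare charged for the trip'
      # Description plus embeddings, without chunking (assumed optional).
      - name: notes
        description: 'Optional notes about the trip'
        embeddings:
          - from: hf_minilm
```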

For detailed column configuration, see the Dataset Reference.
