Architecture Documentation

This document describes the technical architecture, data models, and project structure of the project.

Table of Contents

Overview
System Architecture

Overview

This unified data stack uses incident management as an example use-case relatable to its real-world business process, which may include registering new incidents, assessing their categories for prioritization, tracking their resolution time, etc., for KPI tracking and reporting, but also may include looking at policy documents and other review reports to answer questions and provide insights to the team.

The idea is to showcase how teams can now bring in multi-modal data into Snowflake with ease and create a more rounded, 360-degree view of the entire business process.

System Architecture

The platform follows a modern data stack architecture built entirely on Snowflake:

Components

Sources
1. Slack App
  - User-facing interface for incident reporting from Slack channels where a bot is deployed to capture the incident details and attachments
  - This bot is connected to an OpenFlow connector to ingest the data into Snowflake
  - Supports text messages and image attachments
    
    OpenFlow Slack Connector
  - Real-time data ingestion from Slack (seconds latency)
  - Automatic message and file capture
  - Stores raw data in Bronze Zone tables
2. Unstructured Documents
  - Snowflake Staged of unstructured documents like PDFs, Word documents, etc.
  - 2 primary sub-directories:
    
    full - for full documents that are not expected to be split into chunks
    
    qa - for documents that are expected to be split into chunks for question and answer extraction
  - These documents are queried through a View in Bronze Zone to qualify the documents for processing
Transformation (dbt Projects)

Business logic implementation using Standard models, Semantic Views, Cortex Search, and Cortex Agents. Layered approach to processing data from Bronze to Gold Zone. AI-powered classification and enrichment using AISQL within dbt models AI-powered text chunk extraction from documents for search and retrieval, with deployment of Cortex Search service. AI-powered question extraction from documents for answer retrieval, using dbt Model meta config. Semantic View creation using the latest Snowflake Semantic View dbt package
1. Bronze Zone (Raw Data)
  - Landing zone for raw Slack data, Slack attachments, and for raw document data from sources like Google Drive, Box, etc. (This project doesn’t showcase OpenFlow connectors for document ingestion, but it is possible to use them for document ingestion.)
  - Minimal transformations like file type detection, file size validation, etc.
  - Source of truth for all incidents and attachments, and other raw unstructured documents that may be relevant to the entire incident management process
2. Silver Zone (Processed Data)
  - Unstructured documents processing that may be useful for a 360-degree insights into incident management processes, policies, etc.
  - Two primary mechanisms for processing unstructured documents:
    
    Text chunk extraction for search and retrieval. This is used to create Cortex Search service for hybrid search on large bodies of text from documents of such nature
    
    Question extraction for answer retrieval. Extracted question and answer pairs are stored in a table in Silver Zone later used in Semantic View for this project.
3. Gold Zone (Analytics-Ready)
  - Consumption-ready data models for incidents (open, closed, weekly metrics, quarterly business metrics, etc.)
  - Cortex Agent with access to the Semantic View and the Cortex Search service for use with Snowflake Intelligence
4. Semantic Views (Semantic Views)
  - Semantic Views created from all the processed and conformed (structured) data in Gold Zone for Text2SQL analysis through Cortex Analyst
Consumption (Snowflake Intelligence)
- Snowflake Intelligence for interactive analysis through Cortex Agent and Semantic View deployed previously.
- Additionally (and optionally) a Streamlit dashboard can be deployed to visualize the incidents and their statuses as a more traditional method of reporting pre-calculated metrics and insights.

dbt DAG

Note	dbt Projects allows to view this DAG from within Workspace > Projects > dbt Projects. Or, from Horizon Catalog > Database Explorer > Incident Management > dbt Projects > Project Details

Quick Links for dbt Models for document processing

dbt Projects Component	Documentation
Document Processing	Unstructured document processing for text and question extraction
Cortex Search	Hybrid search service for document retrieval
Semantic Views	Semantic layer for Text2SQL analysis
Cortex Agents	AI agents with access to Semantic Views and Cortex Search

← To README ||| → To Setup Guide

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Architecture Documentation

Overview

System Architecture

Components

dbt DAG

Quick Links for dbt Models for document processing

FilesExpand file tree

ARCHITECTURE.adoc

Latest commit

History

ARCHITECTURE.adoc

File metadata and controls

Architecture Documentation

Overview

System Architecture

Components

dbt DAG

Quick Links for dbt Models for document processing