feat(sql_connector): add support for sql connector (#1543)
* feat(sql_connector): adding support for the create and push of sql connectors
* feat(sql_implementation): add columns of sql in schema file
* feat(dataset): add test cases for the sql connectors
* feat(sql_connectors): add unit tests for sql connectors
* fix(connector): pull dataframe fixed
* fix(sql_connector): update docs
* remove extra schema redeclaration
title: "Semantic Layer"
description: "Turn raw data into semantic-enhanced and clean dataframes"
---

<Note title="Beta Notice">
  Release v3 is currently in beta. This documentation reflects the features and
  functionality in progress and may change before the final release.
</Note>

## What's the Semantic Layer?

The semantic layer allows you to turn raw data into [dataframes](/v3/dataframes) that you can ask questions of and [share with your team](/v3/share-dataframes) as conversational AI dashboards. It serves several important purposes:

1. **Data Configuration**: Define how your data should be loaded and processed
2. **Semantic Information**: Add context and meaning to your data columns
3. **Data Transformation**: Specify how data should be cleaned and transformed
@@ -60,7 +62,9 @@ pai.create(
    ...
)
```

**Type**: `str`

- A string without special characters or spaces
- Using kebab-case naming convention
- Unique within your project
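The string rules above can be checked locally before calling `pai.create()`. The following is a minimal sketch, not part of PandaAI: `is_kebab_case` is a hypothetical helper, and it assumes that "kebab-case" means lowercase letters and digits joined by single hyphens, which the documentation does not spell out. Uniqueness within a project cannot be checked this way.

```python
import re

# Assumption: kebab-case = lowercase alphanumeric words joined by single hyphens.
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_kebab_case(value: str) -> bool:
    """Return True if `value` has no spaces or special characters and is kebab-case."""
    return KEBAB_CASE.fullmatch(value) is not None


print(is_kebab_case("sales-data"))  # True
print(is_kebab_case("Sales Data"))  # False
```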
@@ -80,6 +84,7 @@ pai.create(
```

**Type**: `str`

- Must follow the format: "organization-identifier/dataset-identifier"
- Organization identifier should be unique to your organization
- Dataset identifier should be unique within your organization
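To illustrate the "organization-identifier/dataset-identifier" format, here is a small sketch that splits a path into its two parts and rejects anything else. `split_dataset_path` is a hypothetical helper written for this example; PandaAI may validate paths differently.

```python
def split_dataset_path(path: str) -> tuple[str, str]:
    """Split "organization-identifier/dataset-identifier" into its two parts."""
    parts = path.split("/")
    # Exactly one slash, with a non-empty identifier on each side.
    if len(parts) != 2 or not all(parts):
        raise ValueError(
            f"expected 'organization-identifier/dataset-identifier', got {path!r}"
        )
    organization, dataset = parts
    return organization, dataset


print(split_dataset_path("acme-corp/sales-data"))  # ('acme-corp', 'sales-data')
```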
@@ -101,11 +106,42 @@ pai.create(
```

**Type**: `DataFrame`

- Must be a pandas DataFrame created with `pai.read_csv()`
- Contains the raw data you want to enhance with semantic information
- Required parameter for creating a semantic layer

#### Connectors

The `connector` field lets you connect data sources such as PostgreSQL, MySQL, and SQLite to the semantic layer.
For example, if you're working with a SQL database, you can specify the connection details using the `connector` field.

```python
pai.create(
    path="acme-corp/sales-data",
    connector={
        "type": "postgres",
        "connection": {
            "host": "postgres-host",
            "port": 5432,
            "user": "postgres",
            "password": "*****",
            "database": "postgres",
        },
        "table": "orders",
    },
    ...
)
```

**Type**: `dict`

- Must be a dict describing a SQL connector source, with `type`, `connection`, and `table` keys as in the example above
- Provides the connection details required to create a semantic layer from a SQL database
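Since the connector dict carries credentials, one option is to assemble it from environment variables rather than hard-coding the password. The sketch below is an illustration under assumptions: `postgres_connector_from_env` is a hypothetical helper, and the variable names (`PGHOST`, `PGPORT`, `PGUSER`, `PGPASSWORD`, `PGDATABASE`) are chosen for this example, not read by PandaAI itself. Only the dict shape comes from the example above.

```python
import os
from typing import Mapping


def postgres_connector_from_env(table: str, env: Mapping[str, str] = os.environ) -> dict:
    """Build a connector dict shaped like the documented postgres example.

    The environment variable names are an assumption made for this sketch.
    """
    return {
        "type": "postgres",
        "connection": {
            "host": env["PGHOST"],
            "port": int(env.get("PGPORT", "5432")),  # default postgres port
            "user": env["PGUSER"],
            "password": env["PGPASSWORD"],
            "database": env["PGDATABASE"],
        },
        "table": table,
    }


# Example with an explicit mapping instead of the real environment.
example_env = {
    "PGHOST": "postgres-host",
    "PGUSER": "postgres",
    "PGPASSWORD": "not-a-real-password",
    "PGDATABASE": "postgres",
}
connector = postgres_connector_from_env("orders", example_env)
print(connector["connection"]["port"])  # 5432
```

The resulting dict could then be passed as `connector=` to `pai.create()`, in place of the literal dict.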

#### description

A clear text description that helps others understand the dataset's contents and purpose.

```python
@@ -121,15 +157,17 @@ pai.create(
```

**Type**: `str`

- The purpose of the dataset
- The type of data contained
- Any relevant context about data collection or usage
- Optional but recommended for better data understanding

#### columns

Define the structure and metadata of your dataset's columns to help PandaAI understand your data better.

**Note**: If the `columns` parameter is not provided, all columns from the input dataframe will be included in the semantic layer.
When specified, only the declared columns will be included, allowing you to select specific columns for your semantic layer.

```python
@@ -171,6 +209,7 @@ pai.create(
```

**Type**: `dict[str, dict]`

- Keys: column names as they appear in your DataFrame
- Values: dictionary containing:
  - `type` (str): Data type of the column
@@ -181,22 +220,28 @@ pai.create(
    - "boolean": flags, true/false values
  - `description` (str): Clear explanation of what the column represents
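The `dict[str, dict]` shape described above can be sanity-checked before it is passed to `pai.create()`. This is a rough sketch rather than PandaAI's own validation: `check_columns` is a hypothetical helper, the column names are invented, and treating `type` as required and `description` as optional is an assumption.

```python
def check_columns(columns: dict) -> list[str]:
    """Return a list of problems found in a `columns` mapping (empty if none)."""
    problems = []
    for name, spec in columns.items():
        if not isinstance(name, str):
            problems.append(f"column name {name!r} is not a string")
        if not isinstance(spec, dict):
            problems.append(f"{name!r}: value must be a dict")
            continue
        # Assumption: every column declares a string "type".
        if not isinstance(spec.get("type"), str):
            problems.append(f"{name!r}: missing or non-string 'type'")
        if "description" in spec and not isinstance(spec["description"], str):
            problems.append(f"{name!r}: 'description' must be a string")
    return problems


columns = {
    "is_active": {"type": "boolean", "description": "Whether the account is active"},
    "region": {"description": "Sales region"},  # no "type": reported as a problem
}
print(check_columns(columns))
```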

### For other data sources: YAML configuration

For other data sources (SQL databases, data warehouses, etc.), create a YAML file in your datasets folder:

> Keep in mind that you have to install the sql, cloud data (ee), or yahoo_finance data extension to use this feature.