Commit b537d3d

docs: add guidelines back, some clarifications
1 parent e0cd598 commit b537d3d

1 file changed

+28
-15
lines changed
site/docs/extensions/index.md

Lines changed: 28 additions & 15 deletions
@@ -138,7 +138,7 @@ Advanced extensions provide a way to embed custom functionality that goes beyond
138138

139139
Advanced extensions come in several main forms, discussed below:
140140

141-
1. Embedded extensions: These use the `AdvancedExtension` message for adding custom data to existing Substrait elements
141+
1. Embedded extensions: These use the `AdvancedExtension` message for adding custom data to existing Substrait messages
142142
2. Custom relation types: For defining entirely new relational operations
143143
3. Custom read/write types: For defining new ways to read from or write to data sources
144144

@@ -157,19 +157,27 @@ message AdvancedExtension {
157157
}
158158
```
159159

160+
!!! note "Enhancements vs Optimizations"
161+
162+
Use **optimizations** for performance hints that don't change semantics and can be safely ignored. Use **enhancements** for semantic changes that must be understood by consumers or the plan cannot be executed correctly.
163+
160164
#### Optimizations
161165

162166
- Provide hints to improve performance but don't change the meaning of operations
163167
- Can be safely ignored by consumers that don't understand them
168+
- Multiple optimizations can be attached to a single message
164169
- Examples: memory usage hints, preferred algorithms, caching strategies
165-
- Multiple optimizations can be attached to a single element
166170

167171
#### Enhancements
168172

169173
- Modify the semantic behavior of operations
170174
- Must be understood by consumers or the plan cannot be executed correctly
171-
- Examples: custom aggregation logic, specialized join conditions, new relation types
172-
- Only one enhancement per element
175+
- Only one enhancement per message
176+
- Examples: specialized join conditions (e.g. fuzzy matching, geospatial) or sorting (e.g. clustering)
177+
178+
!!! note "Enhancement Constraints"
179+
180+
Semantics-changing extensions shouldn't change the core characteristics of the underlying relation. For example, they should *not* change the default direct output field ordering, the number of output fields, or the behavior of physical property characteristics. To change any of these behaviors, define a new relation as described below.
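
As a rough illustration of the optimization and enhancement rules above, the following Python sketch shows how a consumer might treat each kind. It is not Substrait's API: `AnyPayload`, `AdvancedExtensionSketch`, `resolve_extension`, and the `example.com/...` type URLs are hypothetical stand-ins, and matching on a type URL is an assumption about how a consumer would recognize an extension.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class AnyPayload:
    """Hypothetical stand-in for a typed, opaque extension payload."""
    type_url: str
    value: bytes = b""


@dataclass
class AdvancedExtensionSketch:
    """Hypothetical model: many optimizations, at most one enhancement."""
    optimizations: List[AnyPayload] = field(default_factory=list)
    enhancement: Optional[AnyPayload] = None


class UnsupportedEnhancementError(Exception):
    """Raised when a plan carries an enhancement the consumer does not know."""


def resolve_extension(
    ext: AdvancedExtensionSketch,
    known_optimizations: Set[str],
    known_enhancements: Set[str],
) -> Tuple[List[AnyPayload], Optional[AnyPayload]]:
    # Optimizations are hints only, so unrecognized ones are silently dropped.
    applied = [o for o in ext.optimizations if o.type_url in known_optimizations]

    # An enhancement changes semantics, so an unrecognized one must stop execution.
    enh = ext.enhancement
    if enh is not None and enh.type_url not in known_enhancements:
        raise UnsupportedEnhancementError(enh.type_url)

    return applied, enh


# Usage: one recognized hint, one unrecognized hint, one recognized enhancement.
ext = AdvancedExtensionSketch(
    optimizations=[
        AnyPayload("example.com/CacheHint"),
        AnyPayload("example.com/UnknownHint"),
    ],
    enhancement=AnyPayload("example.com/FuzzyJoin"),
)
applied, enhancement = resolve_extension(
    ext, {"example.com/CacheHint"}, {"example.com/FuzzyJoin"}
)
```

The asymmetry is the point of the sketch: dropping an unknown optimization leaves the result unchanged, while ignoring an unknown enhancement could silently produce wrong results, so the consumer fails instead.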
173181

174182
#### Where AdvancedExtension Messages Can Be Used
175183

@@ -181,29 +189,34 @@ The `AdvancedExtension` message can be attached to various parts of a Substrait
181189
| **`RelCommon`** | Extensions for any relational operator |
182190
| **Relations** (e.g. `ProjectRel`) | Extensions for a specific relation type |
183191
| **Hints** | Extensions within optimization hints |
184-
| **`ReadRel.NamedTable`** | Add custom metadata to named table references |
185-
| **`WriteRel.NamedObjectWrite`** | Add custom metadata to write targets |
186-
| **`DdlRel.NamedObjectWrite`** | Add custom metadata to DDL targets |
192+
| **`ReadRel.NamedTable`** | Custom metadata for named table references |
193+
| **`ReadRel.LocalFiles`** | Custom metadata for local file sources |
194+
| **`WriteRel.NamedObjectWrite`** | Custom metadata for write targets |
195+
| **`DdlRel.NamedObjectWrite`** | Custom metadata for DDL targets |
187196

188197
### Custom Relations
189198

190199
The second form of advanced extensions provides entirely new relational operations via dedicated extension relation types. These allow you to define custom relations while maintaining proper integration with the type system:
191200

192201
| Relation Type | Description | Examples |
193202
| ---------------------- | ----------------------------------------------- | -------- |
194-
| **`ExtensionLeafRel`** | Custom relations with no inputs | Custom table sources |
195-
| **`ExtensionSingleRel`** | Custom relations with one input | Custom transforms |
196-
| **`ExtensionMultiRel`** | Custom relations with multiple inputs | Custom joins |
203+
| **`ExtensionLeafRel`** | Custom relations with no inputs | Custom table sources |
204+
| **`ExtensionSingleRel`** | Custom relations with one input | Custom transforms |
205+
| **`ExtensionMultiRel`** | Custom relations with multiple inputs | Custom joins |
197206

198207
These extension relations are first-class relation types in Substrait and can be used anywhere a standard relation would be used.
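
The table above keys the choice of extension relation on the number of inputs. A minimal Python sketch of that rule follows; the helper name `extension_rel_kind` is hypothetical, and it returns the message name as a string rather than building a real Substrait message.

```python
def extension_rel_kind(num_inputs: int) -> str:
    """Pick the extension relation type name for a custom relation.

    Hypothetical helper: it only mirrors the table (no inputs, one input,
    multiple inputs) and does not construct any protobuf message.
    """
    if num_inputs < 0:
        raise ValueError("num_inputs must be non-negative")
    if num_inputs == 0:
        return "ExtensionLeafRel"    # e.g. a custom table source
    if num_inputs == 1:
        return "ExtensionSingleRel"  # e.g. a custom transform
    return "ExtensionMultiRel"       # e.g. a custom join


# Usage: a custom join over two inputs.
kind = extension_rel_kind(2)
```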
199208

209+
!!! note "Interoperability Guidance"
210+
211+
Custom relations are the most flexible but least interoperable option. In most cases it is better to use enhancements to existing relations rather than defining new custom relations, because existing code patterns can then be extended to handle the additional properties.
212+
200213
### Custom Read and Write Types
201214

202215
The third form of advanced extensions allows you to define extension data sources and destinations:
203216

204-
| Extension Type | Description | Examples |
205-
| ------------------ | ------------------------------------------- | ---------------------------------------------------- |
206-
| **`ReadRel.ExtensionTable`** | Define entirely new table source types | APIs, specialized formats |
207-
| **`WriteRel.ExtensionObject`** | Define entirely new write destination types | APIs, specialized formats |
208-
| **`DdlRel.ExtensionObject`** | Define entirely new DDL destination types | Catalogs, schema registries |
217+
| Extension Type | Description | Examples |
218+
| ------------------------------ | --------------------------------------------- | ---------------------------- |
219+
| **`ReadRel.ExtensionTable`** | Define entirely new table source types | APIs, specialized formats |
220+
| **`WriteRel.ExtensionObject`** | Define entirely new write destination types | APIs, specialized formats |
221+
| **`DdlRel.ExtensionObject`** | Define entirely new DDL destination types | Catalogs, schema registries |
209222
