Commit 1670d65

chore: update changelog for version 0.1.0-dev0
1 parent 931cde0 commit 1670d65

2 files changed

+88
-84
lines changed

Diff for: CHANGELOG.md

+87-83
Original file line number | Diff line number | Diff line change
@@ -1,48 +1,54 @@
1+
## 0.2.2-dev0
2+
3+
### Fixes
4+
5+
- **Fix Notion Pagination** Iterate on Notion paginated results using the `next_cursor` and `start_cursor` properties.
6+
17
## 0.2.1
28

39
### Enhancements
410

5-
* **File system based indexers return a record display name**
6-
* **Add singlestore source connector**
7-
* **Astra DB V2 Source Connector** Create a v2 version of the Astra DB Source Connector.
11+
- **File system based indexers return a record display name**
12+
- **Add singlestore source connector**
13+
- **Astra DB V2 Source Connector** Create a v2 version of the Astra DB Source Connector.
814

915
### Fixes
1016

11-
* **Fix Databricks Volumes file naming** Add .json to end of upload file.
17+
- **Fix Databricks Volumes file naming** Add .json to end of upload file.
1218

1319
## 0.2.0
1420

1521
### Enhancements
1622

17-
* **Add snowflake source and destination connectors**
18-
* **Migrate Slack Source Connector to V2**
19-
* **Migrate Slack Source Connector to V2**
20-
* **Add Delta Table destination to v2**
21-
* **Migrate Slack Source Connector to V2**
23+
- **Add snowflake source and destination connectors**
24+
- **Migrate Slack Source Connector to V2**
25+
- **Migrate Slack Source Connector to V2**
26+
- **Add Delta Table destination to v2**
27+
- **Migrate Slack Source Connector to V2**
2228

2329
## 0.1.1
2430

2531
### Enhancements
2632

27-
* **Update KDB.AI vectorstore integration to 1.4**
28-
* **Add sqlite and postgres source connectors**
29-
* **Add sampling functionality for indexers in fsspec connectors**
33+
- **Update KDB.AI vectorstore integration to 1.4**
34+
- **Add sqlite and postgres source connectors**
35+
- **Add sampling functionality for indexers in fsspec connectors**
3036

3137
### Fixes
3238

33-
* **Fix Databricks Volumes destination** Fix for filenames to not be hashes.
39+
- **Fix Databricks Volumes destination** Fix for filenames to not be hashes.
3440

3541
## 0.1.0
3642

3743
### Enhancements
3844

39-
* **Move default API URL parameter value to serverless API**
40-
* **Add check that access config always wrapped in Secret**
41-
* **Add togetherai embedder support**
42-
* **Refactor sqlite and postgres to be distinct connectors to support better input validation**
43-
* **Added MongoDB source V2 connector**
44-
* **Support optional access configs on connection configs**
45-
* **Refactor databricks into distinct connectors based on auth type**
45+
- **Move default API URL parameter value to serverless API**
46+
- **Add check that access config always wrapped in Secret**
47+
- **Add togetherai embedder support**
48+
- **Refactor sqlite and postgres to be distinct connectors to support better input validation**
49+
- **Added MongoDB source V2 connector**
50+
- **Support optional access configs on connection configs**
51+
- **Refactor databricks into distinct connectors based on auth type**
4652

4753
### Fixes
4854

@@ -52,223 +58,221 @@
5258

5359
### Enhancements
5460

55-
* **Support pinecone namespace on upload**
56-
* **Migrate Outlook Source Connector to V2**
57-
* **Support for Databricks Volumes source connector**
61+
- **Support pinecone namespace on upload**
62+
- **Migrate Outlook Source Connector to V2**
63+
- **Support for Databricks Volumes source connector**
5864

5965
### Fixes
6066

61-
* **Update Sharepoint Creds and Expected docs**
67+
- **Update Sharepoint Creds and Expected docs**
6268

6369
## 0.0.24
6470

6571
### Enhancements
6672

67-
* **Support dynamic metadata mapping in Pinecone uploader**
73+
- **Support dynamic metadata mapping in Pinecone uploader**
6874

6975
## 0.0.23
7076

7177
### Fixes
7278

73-
* **Remove check for langchain dependency in embedders**
79+
- **Remove check for langchain dependency in embedders**
7480

7581
## 0.0.22
7682

7783
### Enhancements
7884

79-
* **Add documentation for developing sources/destinations**
85+
- **Add documentation for developing sources/destinations**
8086

81-
* **Leverage `uv` for pip compile**
87+
- **Leverage `uv` for pip compile**
8288

83-
* **Use incoming fsspec data to populate metadata** Rather than make additional calls to collect metadata after initial file list, use connector-specific data to populate the metadata.
89+
- **Use incoming fsspec data to populate metadata** Rather than make additional calls to collect metadata after initial file list, use connector-specific data to populate the metadata.
8490

85-
* **Drop langchain as dependency for embedders**
91+
- **Drop langchain as dependency for embedders**
8692

8793
## 0.0.21
8894

8995
### Fixes
9096

91-
* **Fix forward compatibility issues with `unstructured-client==0.26.0`.** Update syntax and create a new SDK util file for reuse in the Partitioner and Chunker
97+
- **Fix forward compatibility issues with `unstructured-client==0.26.0`.** Update syntax and create a new SDK util file for reuse in the Partitioner and Chunker
9298

93-
* **Update Databricks CI Test** Update to use client_id and client_secret auth. Also return files.upload method to one from open source.
99+
- **Update Databricks CI Test** Update to use client_id and client_secret auth. Also return files.upload method to one from open source.
94100

95-
* **Fix astra src bug** V1 source connector was updated to work with astrapy 1.5.0
101+
- **Fix astra src bug** V1 source connector was updated to work with astrapy 1.5.0
96102

97103
## 0.0.20
98104

99105
### Enhancements
100106

101-
* **Support for latest AstraPy API** Add support for the modern AstraPy client interface for the Astra DB Connector.
107+
- **Support for latest AstraPy API** Add support for the modern AstraPy client interface for the Astra DB Connector.
102108

103109
## 0.0.19
104110

105111
### Fixes
106112

107-
* **Use validate_default to instantiate default pydantic secrets**
113+
- **Use validate_default to instantiate default pydantic secrets**
108114

109115
## 0.0.18
110116

111117
### Enhancements
112118

113-
* **Better destination precheck for blob storage** Write an empty file to the destination location when running fsspec-based precheck
119+
- **Better destination precheck for blob storage** Write an empty file to the destination location when running fsspec-based precheck
114120

115121
## 0.0.17
116122

117123
### Fixes
118124

119-
* **Drop use of unstructued in embed** Remove remnant import from unstructured dependency in embed implementations.
120-
125+
- **Drop use of unstructued in embed** Remove remnant import from unstructured dependency in embed implementations.
121126

122127
## 0.0.16
123128

124129
### Fixes
125130

126-
* **Add constraint on pydantic** Make sure the version of pydantic being used with this repo pulls in the earliest version that introduces generic Secret, since this is used heavily.
131+
- **Add constraint on pydantic** Make sure the version of pydantic being used with this repo pulls in the earliest version that introduces generic Secret, since this is used heavily.
127132

128133
## 0.0.15
129134

130135
### Fixes
131136

132-
* **Model serialization with nested models** Logic updated to properly handle serializing pydantic models that have nested configs with secret values.
133-
* **Sharepoint permission config requirement** The sharepoint connector was expecting the permission config, even though it should have been optional.
134-
* **Sharepoint CLI permission params made optional
137+
- **Model serialization with nested models** Logic updated to properly handle serializing pydantic models that have nested configs with secret values.
138+
- **Sharepoint permission config requirement** The sharepoint connector was expecting the permission config, even though it should have been optional.
139+
- \*\*Sharepoint CLI permission params made optional
135140

136141
### Enhancements
137142

138-
* **Migrate airtable connector to v2**
139-
* **Support iteratively deleting cached content** Add a flag to delete cached content once it's no longer needed for systems that are limited in memory.
143+
- **Migrate airtable connector to v2**
144+
- **Support iteratively deleting cached content** Add a flag to delete cached content once it's no longer needed for systems that are limited in memory.
140145

141146
## 0.0.14
142147

143148
### Enhancements
144149

145-
* **Support async batch uploads for pinecone connector**
146-
* **Migrate embedders** Move embedder implementations from the open source unstructured repo into this one.
150+
- **Support async batch uploads for pinecone connector**
151+
- **Migrate embedders** Move embedder implementations from the open source unstructured repo into this one.
147152

148153
### Fixes
149154

150-
* **Misc. Onedrive connector fixes**
155+
- **Misc. Onedrive connector fixes**
151156

152157
## 0.0.13
153158

154159
### Fixes
155160

156-
* **Pinecone payload size fixes** Pinecone destination now has a limited set of properties it will publish as well as dynamically handles batch size to stay under 2MB pinecone payload limit.
161+
- **Pinecone payload size fixes** Pinecone destination now has a limited set of properties it will publish as well as dynamically handles batch size to stay under 2MB pinecone payload limit.
157162

158163
## 0.0.12
159164

160165
### Enhancements
161166

162167
### Fixes
163168

164-
* **Fix invalid `replace()` calls in uncompress** - `replace()` calls meant to be on `str` versions of the path were instead called on `Path` causing errors with parameters.
169+
- **Fix invalid `replace()` calls in uncompress** - `replace()` calls meant to be on `str` versions of the path were instead called on `Path` causing errors with parameters.
165170

166171
## 0.0.11
167172

168173
### Enhancements
169174

170-
* **Fix OpenSearch connector** OpenSearch connector did not work when `http_auth` was not provided
175+
- **Fix OpenSearch connector** OpenSearch connector did not work when `http_auth` was not provided
171176

172177
## 0.0.10
173178

174179
### Enhancements
175180

176-
* "Fix tar extraction" - tar extraction function assumed archive was gzip compressed which isn't true for supported `.tar` archives. Updated to work for both compressed and uncompressed tar archives.
181+
- "Fix tar extraction" - tar extraction function assumed archive was gzip compressed which isn't true for supported `.tar` archives. Updated to work for both compressed and uncompressed tar archives.
177182

178183
## 0.0.9
179184

180185
### Enhancements
181186

182-
* **Chroma dict settings should allow string inputs**
183-
* **Move opensearch non-secret fields out of access config**
184-
* **Support string inputs for dict type model fields** Use the `BeforeValidator` support from pydantic to map a string value to a dict if that's provided.
185-
* **Move opensearch non-secret fields out of access config
187+
- **Chroma dict settings should allow string inputs**
188+
- **Move opensearch non-secret fields out of access config**
189+
- **Support string inputs for dict type model fields** Use the `BeforeValidator` support from pydantic to map a string value to a dict if that's provided.
190+
- \*\*Move opensearch non-secret fields out of access config
186191

187192
### Fixes
188193

189-
**Fix uncompress logic** Use of the uncompress process wasn't being leveraged in the pipeline correctly. Updated to use the new loca download path for where the partitioned looks for the new file.
190-
194+
**Fix uncompress logic** Use of the uncompress process wasn't being leveraged in the pipeline correctly. Updated to use the new loca download path for where the partitioned looks for the new file.
191195

192196
## 0.0.8
193197

194198
### Enhancements
195199

196-
* **Add fields_to_include option for Milvus Stager** Adds support for filtering which fields will remain in the document so user can align document structure to collection schema.
197-
* **Add flatten_metadata option for Milvus Stager** Flattening metadata is now optional (enabled by default) step in processing the document.
200+
- **Add fields_to_include option for Milvus Stager** Adds support for filtering which fields will remain in the document so user can align document structure to collection schema.
201+
- **Add flatten_metadata option for Milvus Stager** Flattening metadata is now optional (enabled by default) step in processing the document.
198202

199203
## 0.0.7
200204

201205
### Enhancements
202206

203-
* **support sharing parent multiprocessing for uploaders** If an uploader needs to fan out it's process using multiprocessing, support that using the parent pipeline approach rather than handling it explicitly by the connector logic.
204-
* **OTEL support** If endpoint supplied, publish all traces to an otel collector.
207+
- **support sharing parent multiprocessing for uploaders** If an uploader needs to fan out it's process using multiprocessing, support that using the parent pipeline approach rather than handling it explicitly by the connector logic.
208+
- **OTEL support** If endpoint supplied, publish all traces to an otel collector.
205209

206210
### Fixes
207211

208-
* **Weaviate access configs access** Weaviate access config uses pydantic Secret and it needs to be resolved to the secret value when being used. This was fixed.
209-
* **unstructured-client compatibility fix** Fix an error when accessing the fields on `PartitionParameters` in the new 0.26.0 Python client.
212+
- **Weaviate access configs access** Weaviate access config uses pydantic Secret and it needs to be resolved to the secret value when being used. This was fixed.
213+
- **unstructured-client compatibility fix** Fix an error when accessing the fields on `PartitionParameters` in the new 0.26.0 Python client.
210214

211215
## 0.0.6
212216

213217
### Fixes
214218

215-
* **unstructured-client compatibility fix** Update the calls to `unstructured_client.general.partition` to avoid a breaking change in the newest version.
219+
- **unstructured-client compatibility fix** Update the calls to `unstructured_client.general.partition` to avoid a breaking change in the newest version.
216220

217221
## 0.0.5
218222

219223
### Enhancements
220224

221-
* **Add Couchbase Source Connector** Adds support for reading artifacts from Couchbase DB for processing in unstructured
222-
* **Drop environment from pinecone as part of v2 migration** environment is no longer required by the pinecone SDK, so that field has been removed from the ingest CLI/SDK/
223-
* **Add KDBAI Destination Connector** Adds support for writing elements and their embeddings to KDBAI DB.
225+
- **Add Couchbase Source Connector** Adds support for reading artifacts from Couchbase DB for processing in unstructured
226+
- **Drop environment from pinecone as part of v2 migration** environment is no longer required by the pinecone SDK, so that field has been removed from the ingest CLI/SDK/
227+
- **Add KDBAI Destination Connector** Adds support for writing elements and their embeddings to KDBAI DB.
224228

225229
### Fixes
226230

227-
* **AstraDB connector configs** Configs had dataclass annotation removed since they're now pydantic data models.
228-
* **Local indexer recursive behavior** Local indexer was indexing directories as well as files. This was filtered out.
231+
- **AstraDB connector configs** Configs had dataclass annotation removed since they're now pydantic data models.
232+
- **Local indexer recursive behavior** Local indexer was indexing directories as well as files. This was filtered out.
229233

230234
## 0.0.4
231235

232236
### Enhancements
233237

234-
* **Add Couchbase Destination Connector** Adds support for storing artifacts in Couchbase DB for Vector Search
235-
* **Leverage pydantic base models** All user-supplied configs are now derived from pydantic base models to leverage better type checking and add built in support for sensitive fields.
236-
* **Autogenerate click options from base models** Leverage the pydantic base models for all configs to autogenerate the cli options exposed when running ingest as a CLI.
237-
* **Drop required Unstructured dependency** Unstructured was moved to an extra dependency to only be imported when needed for functionality such as local partitioning/chunking.
238-
* **Rebrand Astra to Astra DB** The Astra DB integration was re-branded to be consistent with DataStax standard branding.
238+
- **Add Couchbase Destination Connector** Adds support for storing artifacts in Couchbase DB for Vector Search
239+
- **Leverage pydantic base models** All user-supplied configs are now derived from pydantic base models to leverage better type checking and add built in support for sensitive fields.
240+
- **Autogenerate click options from base models** Leverage the pydantic base models for all configs to autogenerate the cli options exposed when running ingest as a CLI.
241+
- **Drop required Unstructured dependency** Unstructured was moved to an extra dependency to only be imported when needed for functionality such as local partitioning/chunking.
242+
- **Rebrand Astra to Astra DB** The Astra DB integration was re-branded to be consistent with DataStax standard branding.
239243

240244
## 0.0.3
241245

242246
### Enhancements
243247

244-
* **Improve documentation** Update the README's.
245-
* **Explicit Opensearch classes** For the connector registry entries for opensearch, use only opensearch specific classes rather than any elasticsearch ones.
246-
* **Add missing fsspec destination precheck** check connection in precheck for all fsspec-based destination connectors
248+
- **Improve documentation** Update the README's.
249+
- **Explicit Opensearch classes** For the connector registry entries for opensearch, use only opensearch specific classes rather than any elasticsearch ones.
250+
- **Add missing fsspec destination precheck** check connection in precheck for all fsspec-based destination connectors
247251

248252
## 0.0.2
249253

250254
### Enhancements
251255

252-
* **Use uuid for s3 identifiers** Update unique id to use uuid derived from file path rather than the filepath itself.
253-
* **V2 connectors precheck support** All steps in the v2 pipeline support an optional precheck call, which encompasses the previous check connection functionality.
254-
* **Filter Step** Support dedicated step as part of the pipeline to filter documents.
256+
- **Use uuid for s3 identifiers** Update unique id to use uuid derived from file path rather than the filepath itself.
257+
- **V2 connectors precheck support** All steps in the v2 pipeline support an optional precheck call, which encompasses the previous check connection functionality.
258+
- **Filter Step** Support dedicated step as part of the pipeline to filter documents.
255259

256260
## 0.0.1
257261

258262
### Enhancements
259263

260264
### Features
261265

262-
* **Add Milvus destination connector** Adds support storing artifacts in Milvus vector database.
266+
- **Add Milvus destination connector** Adds support storing artifacts in Milvus vector database.
263267

264268
### Fixes
265269

266-
* **Remove old repo references** Any mention of the repo this project came from was removed.
270+
- **Remove old repo references** Any mention of the repo this project came from was removed.
267271

268272
## 0.0.0
269273

270274
### Features
271275

272-
* **Initial Migration** Create the structure of this repo from the original code in the [Unstructured](https://github.com/Unstructured-IO/unstructured) project.
276+
- **Initial Migration** Create the structure of this repo from the original code in the [Unstructured](https://github.com/Unstructured-IO/unstructured) project.
273277

274278
### Fixes

Diff for: unstructured_ingest/__version__.py

+1-1
Original file line number | Diff line number | Diff line change
@@ -1 +1 @@
1-
__version__ = "0.2.1" # pragma: no cover
1+
__version__ = "0.2.2-dev0" # pragma: no cover
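
The 0.2.2-dev0 changelog entry describes iterating over Notion paginated results with the `next_cursor` and `start_cursor` properties. The connector code itself is not part of this commit, so the following is only a minimal sketch of that cursor pattern. The names `iter_paginated` and `fake_fetch_page` are hypothetical, and the response shape (`results`, `has_more`, `next_cursor`) is assumed here for illustration rather than taken from the connector.

```python
from typing import Callable, Iterator, Optional


def iter_paginated(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every result across pages by following cursors.

    `fetch_page` is a hypothetical callable that takes a `start_cursor`
    (None for the first page) and returns a dict with `results`,
    `has_more`, and `next_cursor` keys.
    """
    start_cursor: Optional[str] = None
    while True:
        page = fetch_page(start_cursor)
        yield from page.get("results", [])
        next_cursor = page.get("next_cursor")
        # Stop when the server reports no more pages or gives no cursor.
        if not page.get("has_more") or next_cursor is None:
            break
        # The cursor returned by this page becomes the start of the next request.
        start_cursor = next_cursor


def fake_fetch_page(start_cursor: Optional[str] = None) -> dict:
    """Stand-in for a real API call: two fixed pages keyed by cursor."""
    pages = {
        None: {"results": [{"id": 1}, {"id": 2}], "has_more": True, "next_cursor": "c1"},
        "c1": {"results": [{"id": 3}], "has_more": False, "next_cursor": None},
    }
    return pages[start_cursor]


collected = [item["id"] for item in iter_paginated(fake_fetch_page)]
print(collected)
```

Writing the loop as a generator lets callers process results as they arrive instead of holding every page in memory.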
