### Summary
In some environments, such as Google Colab, the root logger has a handler
that does not mask sensitive values. As a result, secrets such as API
keys appeared in the logs. This PR removes root handlers when they exist
to ensure sensitive values are handled properly.
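
The mechanism can be illustrated with a minimal, standard-library-only sketch. This is not the code from this PR: `MaskingFormatter`, `remove_root_handlers`, `demo`, and the logger name `demo.ingest` are hypothetical names chosen for illustration, and the extra root handler stands in for whatever a notebook environment pre-installs.

```python
import io
import logging


class MaskingFormatter(logging.Formatter):
    """Hypothetical formatter that hides one known secret string."""

    def __init__(self, secret: str) -> None:
        super().__init__("%(levelname)s %(message)s")
        self._secret = secret

    def format(self, record: logging.LogRecord) -> str:
        return super().format(record).replace(self._secret, "*******")


def remove_root_handlers() -> None:
    """Detach every handler currently attached to the root logger."""
    root = logging.getLogger()
    for handler in list(root.handlers):  # copy: removeHandler mutates the list
        root.removeHandler(handler)


def demo(strip_root: bool) -> tuple[str, str]:
    """Return (root handler output, masking handler output) for one log call."""
    root_out, masked_out = io.StringIO(), io.StringIO()

    # Stand-in for a handler that the environment attaches to the root logger.
    root = logging.getLogger()
    root_handler = logging.StreamHandler(root_out)
    root.addHandler(root_handler)

    # Named logger with its own handler that masks the secret.
    logger = logging.getLogger("demo.ingest")
    logger.setLevel(logging.INFO)
    masked_handler = logging.StreamHandler(masked_out)
    masked_handler.setFormatter(MaskingFormatter("super secret"))
    logger.addHandler(masked_handler)

    if strip_root:
        remove_root_handlers()

    # The record is handled by the named logger, then propagates to the root.
    logger.info("Config: api_key=super secret")

    # Clean up so the demo can be run repeatedly.
    logger.removeHandler(masked_handler)
    root.removeHandler(root_handler)
    return root_out.getvalue(), masked_out.getvalue()
```

Calling `demo(strip_root=False)` returns the secret in the first string because the record propagates to the unmasked root handler, while `demo(strip_root=True)` leaves the first string empty. Note that this sketch removes all root handlers, which also silences any root-level output other code in the process may rely on.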
### Testing
Run the following in a Colab notebook. You should see two log outputs,
one with the API key masked and one with it exposed.
```
!pip install unstructured
```
```python
import logging

from unstructured.ingest.interfaces import PartitionConfig
from unstructured.ingest.logger import ingest_log_streaming_init

# Config carrying a secret that should be masked in log output.
partition_config = PartitionConfig(
    partition_by_api=True,
    api_key="super secret",
)

# Set up the ingest logger at INFO level.
ingest_log_streaming_init(logging.INFO)
logger = logging.getLogger("unstructured.ingest")
logger.setLevel(logging.INFO)

# Log the serialized config; the API key should appear masked.
logger.info(
    "Running partition node to extract content from json files. "
    f"Config: {partition_config.to_json()}"
)
```
Now replace the first cell with the following and rerun the Python code.
Only the masked logging output should remain.
```
!git clone https://github.com/Unstructured-IO/unstructured.git && cd unstructured && git checkout fix/rm-log-dupes && pip install -e .
```
CHANGELOG.md (+2 -1)

@@ -1,4 +1,4 @@
-## 0.14.4-dev5
+## 0.14.4-dev6
 
 ### Enhancements
 
@@ -12,6 +12,7 @@
 
 ### Fixes
 
+* **Remove root handlers in ingest logger**. Removes root handlers in ingest loggers to ensure secrets aren't accidentally exposed in Colab notebooks.
 * **Fix V2 S3 Destination Connector authentication** Fixes bugs with S3 Destination Connector where the connection config was neither registered nor properly deserialized.
 * **Clarified dependence on particular version of `python-docx`** Pinned `python-docx` version to ensure a particular method `unstructured` uses is included.
 * **Ingest preserves original file extension** Ingest V2 introduced a change that dropped the original extension for upgraded connectors. This reverts that change.