Commit cefa547

Merge branch 'develop' into master
* develop:
  Update CHANGELOG.md
  Use compression.zstd (PEP-784) (#895)
  Drop python 3.8, add python 3.14 (#896)
  [s3] Add range_chunk_size param to read using multiple GET requests (#887)
  Run tests in parallel (#893)
  Optimize forward seeks within buffered data to avoid redundant GET (#892)
  Add macos to CI (#891)
  Simplify CI, use uv (#890)
  [s3] Improve handling of InvalidRange and seek on empty file (#889)
  Protect against hanging tests (#888)
  Bump the github-actions group with 2 updates (#886)
  build: fix invalid `fallback_version` when builing with `uv` (#884)
  Remove travis leftover (#881)
  Disambiguate URI examples in README.rst (#879)
2 parents c17ae23 + 3178827 commit cefa547

18 files changed: +783 -526 lines changed


.github/workflows/python-package.yml

Lines changed: 35 additions & 149 deletions
@@ -4,203 +4,89 @@ on:
   push:
     branches: [master, develop]
   workflow_dispatch: # allows running CI manually from the Actions tab
+
 concurrency: # https://stackoverflow.com/questions/66335225#comment133398800_72408109
   group: ${{ github.workflow }}-${{ github.ref || github.run_id }}
   cancel-in-progress: ${{ github.event_name == 'pull_request' }}
-jobs:
-  linters:
-    runs-on: ubuntu-24.04
-    steps:
-      - uses: actions/checkout@v5
-        with:
-          fetch-depth: 0 # fetch git tags for setuptools_scm (smart_open.__version__)
-
-      - name: Setup up Python 3.11
-        uses: actions/setup-python@v5
-        with:
-          python-version: "3.11"
-
-      - name: Install dependencies
-        run: pip install flake8 -e .[all]
-
-      - name: Run flake8 linter (source)
-        run: flake8 --show-source smart_open
-
-      - name: "Check whether help.txt update was forgotten"
-        if: github.event_name == 'pull_request'
-        run: |
-          python update_helptext.py
-          test ! "$(git diff)" && echo "no changes" || ( git diff && echo 'looks like "python update_helptext.py" was forgotten' && exit 1 )

-  unit_tests:
-    needs: [linters]
+jobs:
+  ci:
     runs-on: ${{ matrix.os }}
+    timeout-minutes: 10
     strategy:
       matrix:
         include:
-          - {python-version: '3.8', os: ubuntu-24.04}
+          # sync with linting steps below
           - {python-version: '3.9', os: ubuntu-24.04}
-          - {python-version: '3.10', os: ubuntu-24.04}
-          - {python-version: '3.11', os: ubuntu-24.04}
-          - {python-version: '3.12', os: ubuntu-24.04}
-          - {python-version: '3.13', os: ubuntu-24.04}
+          - {python-version: '3.14', os: ubuntu-24.04}

-          - {python-version: '3.8', os: windows-2025}
           - {python-version: '3.9', os: windows-2025}
-          - {python-version: '3.10', os: windows-2025}
-          - {python-version: '3.11', os: windows-2025}
-          - {python-version: '3.12', os: windows-2025}
-          - {python-version: '3.13', os: windows-2025}
+          - {python-version: '3.14', os: windows-2025}
+
+          # deprecate macos-15-intel when python 3.9 becomes the minimum supported version
+          # ref https://github.blog/changelog/2025-09-19-github-actions-macos-13-runner-image-is-closing-down/
+          # ref https://github.com/actions/python-versions/blob/d026dedcb/versions-manifest.json#L9969-L10016
+          - {python-version: '3.9', os: macos-15-intel}
+          - {python-version: '3.14', os: macos-15}
+
     steps:
       - uses: actions/checkout@v5
         with:
           fetch-depth: 0 # fetch git tags for setuptools_scm (smart_open.__version__)

-      - uses: actions/setup-python@v5
+      - uses: astral-sh/setup-uv@v6
         with:
           python-version: ${{ matrix.python-version }}
+          activate-environment: true
+          enable-cache: true
+          cache-dependency-glob: "**/pyproject.toml"

       - name: Install smart_open without dependencies
-        run: pip install -e .
+        run: uv pip install -e .

       - name: Check that smart_open imports without dependencies
         run: python -c 'import smart_open'

       - name: Install smart_open and its dependencies
-        run: pip install -e .[test]
+        run: uv pip install -e .[test]

-      - name: Run unit tests
-        run: pytest tests -v -rfxECs --durations=20
-
-  doctest:
-    needs: [linters,unit_tests]
-    runs-on: ${{ matrix.os }}
-    strategy:
-      matrix:
-        include:
-          - {python-version: '3.8', os: ubuntu-24.04}
-          - {python-version: '3.9', os: ubuntu-24.04}
-          - {python-version: '3.10', os: ubuntu-24.04}
-          - {python-version: '3.11', os: ubuntu-24.04}
-          - {python-version: '3.12', os: ubuntu-24.04}
-          - {python-version: '3.13', os: ubuntu-24.04}
-
-          #
-          # Some of the doctests don't pass on Windows because of Windows-specific
-          # character encoding issues.
-          #
-          # - {python-version: '3.8', os: windows-2025}
-          # - {python-version: '3.9', os: windows-2025}
-          # - {python-version: '3.10', os: windows-2025}
-          # - {python-version: '3.11', os: windows-2025}
-          # - {python-version: '3.12', os: windows-2025}
-          # - {python-version: '3.13', os: windows-2025}
-
-    steps:
-      - uses: actions/checkout@v5
-        with:
-          fetch-depth: 0 # fetch git tags for setuptools_scm (smart_open.__version__)
+      - name: Run flake8 linter (source)
+        if: matrix.python-version == '3.9' # sync with matrix above
+        run: flake8 --show-source smart_open

-      - uses: actions/setup-python@v5
-        with:
-          python-version: ${{ matrix.python-version }}
+      - name: "Check whether help.txt update was forgotten"
+        if: matrix.python-version == '3.9' && github.event_name == 'pull_request' # sync with matrix above
+        run: |
+          python update_helptext.py
+          test ! "$(git diff)" && echo "no changes" || ( git diff && echo 'looks like "python update_helptext.py" was forgotten' && exit 1 )

-      - name: Install smart_open and its dependencies
-        run: pip install -e .[test]
+      - name: Run unit tests
+        # configuration in pyproject.toml
+        run: pytest tests -v

       - name: Run doctests
+        if: startsWith(matrix.os, 'ubuntu')
         run: python ci_helpers/doctest.py
         env:
           AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
           AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

-  integration:
-    needs: [linters,unit_tests]
-    runs-on: ${{ matrix.os }}
-    strategy:
-      matrix:
-        include:
-          - {python-version: '3.8', os: ubuntu-24.04}
-          - {python-version: '3.9', os: ubuntu-24.04}
-          - {python-version: '3.10', os: ubuntu-24.04}
-          - {python-version: '3.11', os: ubuntu-24.04}
-          - {python-version: '3.12', os: ubuntu-24.04}
-          - {python-version: '3.13', os: ubuntu-24.04}
-
-          # Not sure why we exclude these, perhaps for historical reasons?
-          #
-          # - {python-version: '3.8', os: windows-2025}
-          # - {python-version: '3.9', os: windows-2025}
-          # - {python-version: '3.10', os: windows-2025}
-          # - {python-version: '3.11', os: windows-2025}
-          # - {python-version: '3.12', os: windows-2025}
-          # - {python-version: '3.13', os: windows-2025}
-
-    steps:
-      - uses: actions/checkout@v5
-        with:
-          fetch-depth: 0 # fetch git tags for setuptools_scm (smart_open.__version__)
-
-      - uses: actions/setup-python@v5
-        with:
-          python-version: ${{ matrix.python-version }}
-
-      - name: Install smart_open and its dependencies
-        run: pip install -e .[test]
-
-      - run: bash ci_helpers/helpers.sh enable_moto_server
-        if: ${{ matrix.moto_server }}
-
       - name: Start vsftpd
+        if: startsWith(matrix.os, 'ubuntu')
         timeout-minutes: 2
         run: |
           sudo apt-get install vsftpd
           sudo bash ci_helpers/helpers.sh create_ftp_ftps_servers

       - name: Run integration tests
+        if: startsWith(matrix.os, 'ubuntu')
         run: python ci_helpers/run_integration_tests.py
         env:
           AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
           AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

-      - run: bash ci_helpers/helpers.sh disable_moto_server
-        if: ${{ matrix.moto_server }}
-
-      - run: sudo bash ci_helpers/helpers.sh delete_ftp_ftps_servers
-
-  benchmarks:
-    needs: [linters,unit_tests]
-    runs-on: ${{ matrix.os }}
-    strategy:
-      matrix:
-        include:
-          - {python-version: '3.8', os: ubuntu-24.04}
-          - {python-version: '3.9', os: ubuntu-24.04}
-          - {python-version: '3.10', os: ubuntu-24.04}
-          - {python-version: '3.11', os: ubuntu-24.04}
-          - {python-version: '3.12', os: ubuntu-24.04}
-          - {python-version: '3.13', os: ubuntu-24.04}
-
-          # - {python-version: '3.8', os: windows-2025}
-          # - {python-version: '3.9', os: windows-2025}
-          # - {python-version: '3.10', os: windows-2025}
-          # - {python-version: '3.11', os: windows-2025}
-          # - {python-version: '3.12', os: windows-2025}
-          # - {python-version: '3.13', os: windows-2025}
-
-    steps:
-      - uses: actions/checkout@v5
-        with:
-          fetch-depth: 0 # fetch git tags for setuptools_scm (smart_open.__version__)
-
-      - uses: actions/setup-python@v5
-        with:
-          python-version: ${{ matrix.python-version }}
-
-      - name: Install smart_open and its dependencies
-        run: pip install -e .[test]
-
       - name: Run benchmarks
+        if: startsWith(matrix.os, 'ubuntu')
         run: python ci_helpers/run_benchmarks.py
         env:
           SO_BUCKET: smart-open
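
Note on the consolidated `ci` job: the five former jobs are folded into one, and the per-step `if:` conditions decide what runs for each matrix entry. The sketch below mirrors those conditions in plain Python so the resulting coverage is easy to read off. It is an illustration only: `steps_for` and the short step labels are hypothetical names, and the real evaluation is done by GitHub Actions, not by this code.

```python
# Matrix entries copied from the `ci` job's `include:` list.
MATRIX = [
    {"python-version": "3.9", "os": "ubuntu-24.04"},
    {"python-version": "3.14", "os": "ubuntu-24.04"},
    {"python-version": "3.9", "os": "windows-2025"},
    {"python-version": "3.14", "os": "windows-2025"},
    {"python-version": "3.9", "os": "macos-15-intel"},
    {"python-version": "3.14", "os": "macos-15"},
]


def steps_for(entry, event_name="pull_request"):
    """Return labels of the gated steps that would run for one matrix entry."""
    lint_python = entry["python-version"] == "3.9"  # matrix.python-version == '3.9'
    on_ubuntu = entry["os"].startswith("ubuntu")    # startsWith(matrix.os, 'ubuntu')

    steps = []
    if lint_python:
        steps.append("flake8")
    if lint_python and event_name == "pull_request":
        steps.append("help.txt check")
    steps.append("unit tests")  # this step has no `if:` condition
    if on_ubuntu:
        steps += ["doctests", "vsftpd", "integration tests", "benchmarks"]
    return steps


for entry in MATRIX:
    print(entry["os"], entry["python-version"], "->", ", ".join(steps_for(entry)))
```

Read this way, linting runs once per operating system (on the 3.9 entries), while doctests, integration tests and benchmarks stay Ubuntu-only, as they were before the merge.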

.github/workflows/release.yml

Lines changed: 2 additions & 2 deletions
@@ -19,7 +19,7 @@ jobs:
           fetch-depth: 0 # fetch git tags for setuptools_scm (smart_open.__version__)

       - name: Set up Python
-        uses: actions/setup-python@v5
+        uses: actions/setup-python@v6
         with:
           python-version: 3.x

@@ -32,7 +32,7 @@ jobs:
           python -m build

       - name: Upload package distributions as release assets
-        uses: softprops/action-gh-release@v2.3.2
+        uses: softprops/action-gh-release@v2.3.3
         with:
           files: dist/*

CHANGELOG.md

Lines changed: 16 additions & 0 deletions
@@ -1,3 +1,19 @@
+# 7.4.0, 2025-10-20
+
+- Disambiguate URI examples in README.rst (PR [#879](https://github.com/piskvorky/smart_open/pull/879), [@ddelange](https://github.com/ddelange))
+- Remove travis leftover (PR [#881](https://github.com/piskvorky/smart_open/pull/881), [@ddelange](https://github.com/ddelange))
+- build: fix invalid `fallback_version` when builing with `uv` (PR [#884](https://github.com/piskvorky/smart_open/pull/884), [@DeflateAwning](https://github.com/DeflateAwning))
+- Bump the github-actions group with 2 updates (PR [#886](https://github.com/piskvorky/smart_open/pull/886), [@dependabot[bot]](https://github.com/apps/dependabot))
+- Protect against hanging tests (PR [#888](https://github.com/piskvorky/smart_open/pull/888), [@ddelange](https://github.com/ddelange))
+- [s3] Improve handling of InvalidRange and seek on empty file (PR [#889](https://github.com/piskvorky/smart_open/pull/889), [@ddelange](https://github.com/ddelange))
+- Simplify CI, use uv (PR [#890](https://github.com/piskvorky/smart_open/pull/890), [@ddelange](https://github.com/ddelange))
+- Add macos to CI (PR [#891](https://github.com/piskvorky/smart_open/pull/891), [@ddelange](https://github.com/ddelange))
+- [s3] Optimize forward seeks within buffered data to avoid redundant GET (PR [#892](https://github.com/piskvorky/smart_open/pull/892), [@ddelange](https://github.com/ddelange))
+- Run tests in parallel (PR [#893](https://github.com/piskvorky/smart_open/pull/893), [@ddelange](https://github.com/ddelange))
+- [s3] Add range_chunk_size param to read using multiple GET requests (PR [#887](https://github.com/piskvorky/smart_open/pull/887), [@ddelange](https://github.com/ddelange))
+- Drop python 3.8, add python 3.14 (PR [#896](https://github.com/piskvorky/smart_open/pull/896), [@ddelange](https://github.com/ddelange))
+- Use compression.zstd (PEP-784) (PR [#895](https://github.com/piskvorky/smart_open/pull/895), [@Rogdham](https://github.com/Rogdham))
+
 # 7.3.1, 2025-09-08

 - Fix release.sh for the final merge back into develop (PR [#872](https://github.com/piskvorky/smart_open/pull/872), [@ddelange](https://github.com/ddelange))

CONTRIBUTING.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ Tests should pass:
 pytest
 ```

-Thats it! When you're done, deactivate the venv:
+That's it! When you're done, deactivate the venv:

 ```sh
 deactivate

README.rst

Lines changed: 8 additions & 8 deletions
@@ -80,19 +80,19 @@ How?

   >>> # stream from HTTP
   >>> for line in open('http://example.com/index.html'):
-  ...     print(repr(line))
+  ...     print(repr(line[:15]))
   ...     break
-  '<!doctype html>\n'
+  '<!doctype html>'

 .. _doctools_after_examples:

-Other examples of URLs that ``smart_open`` accepts::
+Other examples of URIs that ``smart_open`` accepts::

-    s3://my_bucket/my_key
-    s3://my_key:my_secret@my_bucket/my_key
-    s3://my_key:my_secret@my_server:my_port@my_bucket/my_key
-    gs://my_bucket/my_blob
-    azure://my_bucket/my_blob
+    s3://bucket/key
+    s3://access_key_id:secret_access_key@bucket/key
+    s3://access_key_id:secret_access_key@server:port@bucket/key
+    gs://bucket/blob
+    azure://bucket/blob
     hdfs:///path/file
     hdfs://path/file
     webhdfs://host:port/path/file
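
Note on the README doctest change: the example now compares only the first 15 characters of the first line, which is exactly the length of `<!doctype html>`. A minimal sketch of the effect follows. The commit does not state the motivation, so the second sample line is a made-up variant, used only to show that the sliced output no longer depends on what follows the doctype.

```python
DOCTYPE = "<!doctype html>"

# The first value is what the old doctest expected verbatim.
# The second is a hypothetical variant (an assumption, not taken from the commit).
old_style = "<!doctype html>\n"
hypothetical_variant = '<!doctype html><html lang="en"><head>\n'

for line in (old_style, hypothetical_variant):
    # Both iterations print '<!doctype html>'
    print(repr(line[:15]))

# The slice length matches the doctype declaration exactly.
print(len(DOCTYPE))
```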

ci_helpers/run_benchmarks.py

Lines changed: 1 addition & 0 deletions
@@ -32,6 +32,7 @@
         '-v',
         'integration-tests/test_s3.py',
         '--benchmark-save=%s' % commit_hash,
+        '--numprocesses=0',  # disable pytest-xdist
     ]
 )

help.txt

Lines changed: 23 additions & 3 deletions
@@ -199,7 +199,12 @@ FUNCTIONS
         mode: str
             The mode for opening the object. Must be either "rb" or "wb".
         buffer_size: int, optional
-            The buffer size to use when performing I/O.
+            Default: 128KB
+            The buffer size in bytes for reading. Controls memory usage. Data is streamed
+            from a S3 network stream in buffer_size chunks. Forward seeks within
+            the current buffer are satisfied without additional GET requests. Backward
+            seeks always open a new GET request. For forward seek-intensive workloads,
+            increase buffer_size to reduce GET requests at the cost of higher memory usage.
         min_part_size: int, optional
             The minimum part size for multipart uploads, in bytes.
             When the writebuffer contains this many bytes, smart_open will upload
@@ -228,6 +233,21 @@ FUNCTIONS
             If set to `True` on a file opened for reading, GetObject will not be
             called until the first seek() or read().
             Avoids redundant API queries when seeking before reading.
+        range_chunk_size: int, optional
+            Default: `None`
+            Maximum byte range per S3 GET request when reading.
+            When None (default), a single GET request is made for the entire file,
+            and data is streamed from that single botocore.response.StreamingBody
+            in buffer_size chunks.
+            When set to a positive integer, multiple GET requests are made, each
+            limited to at most this many bytes via HTTP Range headers. Each GET
+            returns a new StreamingBody that is streamed in buffer_size chunks.
+            Useful for reading small portions of large files without forcing
+            S3-compatible systems like SeaweedFS/Ceph to load the entire file.
+            Larger values mean fewer billable GET requests but higher load on S3
+            servers. Smaller values mean more GET requests but less server load per request.
+            Values larger than the file size result in a single GET for the whole file.
+            Affects reading only. Does not affect memory usage (controlled by buffer_size).
         client: object, optional
             The S3 client to use when working with boto3.
             If you don't specify this, then smart_open will create a new client for you.
@@ -318,7 +338,7 @@ FUNCTIONS

         >>> # stream from HTTP
         >>> for line in open('http://example.com/index.html'):
-        ...     print(repr(line))
+        ...     print(repr(line[:15]))
         ...     break

         This function also supports transparent compression and decompression
@@ -334,7 +354,7 @@ FUNCTIONS

         See Also
         --------
-        - `Standard library reference <https://docs.python.org/3.13/library/functions.html#open>`__
+        - `Standard library reference <https://docs.python.org/3.14/library/functions.html#open>`__
         - `smart_open README.rst
           <https://github.com/piskvorky/smart_open/blob/master/README.rst>`__
