Commit e040efd

Merge branch 'release-7.1.0'
2 parents 497541c + 3569c17 commit e040efd

15 files changed (+169 -63 lines changed)

.github/workflows/python-package.yml

Lines changed: 14 additions & 15 deletions
@@ -14,9 +14,6 @@ jobs:
         with:
           python-version: "3.11"
 
-      - name: Update pip
-        run: pip install -U pip
-
       - name: Install dependencies
         run: pip install flake8
 
@@ -34,22 +31,21 @@ jobs:
           - {python-version: '3.10', os: ubuntu-20.04}
           - {python-version: '3.11', os: ubuntu-20.04}
           - {python-version: '3.12', os: ubuntu-20.04}
+          - {python-version: '3.13', os: ubuntu-20.04}
 
           - {python-version: '3.8', os: windows-2019}
           - {python-version: '3.9', os: windows-2019}
           - {python-version: '3.10', os: windows-2019}
           - {python-version: '3.11', os: windows-2019}
           - {python-version: '3.12', os: windows-2019}
+          - {python-version: '3.13', os: windows-2019}
     steps:
       - uses: actions/checkout@v2
 
       - uses: actions/setup-python@v2
         with:
           python-version: ${{ matrix.python-version }}
 
-      - name: Update pip
-        run: pip install -U pip
-
       - name: Install smart_open without dependencies
         run: pip install -e .
 
@@ -73,6 +69,7 @@ jobs:
           - {python-version: '3.10', os: ubuntu-20.04}
           - {python-version: '3.11', os: ubuntu-20.04}
           - {python-version: '3.12', os: ubuntu-20.04}
+          - {python-version: '3.13', os: ubuntu-20.04}
 
           #
           # Some of the doctests don't pass on Windows because of Windows-specific
@@ -82,6 +79,9 @@ jobs:
           # - {python-version: '3.8', os: windows-2019}
           # - {python-version: '3.9', os: windows-2019}
           # - {python-version: '3.10', os: windows-2019}
+          # - {python-version: '3.11', os: windows-2019}
+          # - {python-version: '3.12', os: windows-2019}
+          # - {python-version: '3.13', os: windows-2019}
 
     steps:
       - uses: actions/checkout@v2
@@ -90,9 +90,6 @@ jobs:
         with:
           python-version: ${{ matrix.python-version }}
 
-      - name: Update pip
-        run: pip install -U pip
-
       - name: Install smart_open and its dependencies
         run: pip install -e .[test]
 
@@ -113,13 +110,17 @@ jobs:
           - {python-version: '3.10', os: ubuntu-20.04}
           - {python-version: '3.11', os: ubuntu-20.04}
           - {python-version: '3.12', os: ubuntu-20.04}
+          - {python-version: '3.13', os: ubuntu-20.04}
 
           # Not sure why we exclude these, perhaps for historical reasons?
           #
           # - {python-version: '3.7', os: windows-2019}
           # - {python-version: '3.8', os: windows-2019}
           # - {python-version: '3.9', os: windows-2019}
           # - {python-version: '3.10', os: windows-2019}
+          # - {python-version: '3.11', os: windows-2019}
+          # - {python-version: '3.12', os: windows-2019}
+          # - {python-version: '3.13', os: windows-2019}
 
     steps:
       - uses: actions/checkout@v2
@@ -128,9 +129,6 @@ jobs:
         with:
           python-version: ${{ matrix.python-version }}
 
-      - name: Update pip
-        run: pip install -U pip
-
       - name: Install smart_open and its dependencies
         run: pip install -e .[test]
 
@@ -165,11 +163,15 @@ jobs:
           - {python-version: '3.10', os: ubuntu-20.04}
           - {python-version: '3.11', os: ubuntu-20.04}
           - {python-version: '3.12', os: ubuntu-20.04}
+          - {python-version: '3.13', os: ubuntu-20.04}
 
           # - {python-version: '3.7', os: windows-2019}
           # - {python-version: '3.8', os: windows-2019}
           # - {python-version: '3.9', os: windows-2019}
           # - {python-version: '3.10', os: windows-2019}
+          # - {python-version: '3.11', os: windows-2019}
+          # - {python-version: '3.12', os: windows-2019}
+          # - {python-version: '3.13', os: windows-2019}
 
     steps:
       - uses: actions/checkout@v2
@@ -178,9 +180,6 @@ jobs:
         with:
           python-version: ${{ matrix.python-version }}
 
-      - name: Update pip
-        run: pip install -U pip
-
       - name: Install smart_open and its dependencies
         run: pip install -e .[test]
 

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -1,3 +1,8 @@
+# 7.1.0, 2024-12-17
+
+- Add support for python 3.13 (PR [#847](https://github.com/piskvorky/smart_open/pull/847), [@ddelange](https://github.com/ddelange))
+- Propagate uri to compression_wrapper (PR [#842](https://github.com/piskvorky/smart_open/pull/842), [@ddelange](https://github.com/ddelange))
+
 # 7.0.5, 2024-10-04
 
 - Fix zstd compression in ab mode (PR [#833](https://github.com/piskvorky/smart_open/pull/833), [@ddelange](https://github.com/ddelange))

README.rst

Lines changed: 1 addition & 0 deletions
@@ -227,6 +227,7 @@ The supported values for this parameter are:
 - ``disable``
 - ``.gz``
 - ``.bz2``
+- ``.zst``
 
 By default, ``smart_open`` determines the compression algorithm to use based on the file extension.
 
integration-tests/test_ftp.py

Lines changed: 34 additions & 0 deletions
@@ -1,4 +1,5 @@
 from __future__ import unicode_literals
+import gzip
 import pytest
 from smart_open import open
 import ssl
@@ -52,6 +53,39 @@ def test_binary(server_info):
         read_contents = f.read()
     assert read_contents == file_contents + appended_content1
 
+def test_compression(server_info):
+    server_type = server_info[0]
+    port_num = server_info[1]
+    file_contents = "Test Test \n new test \n another tests"
+    appended_content1 = "Added \n to end"
+
+    with open(f"{server_type}://user:123@localhost:{port_num}/file.gz", "w") as f:
+        f.write(file_contents)
+
+    with open(f"{server_type}://user:123@localhost:{port_num}/file.gz", "r") as f:
+        read_contents = f.read()
+    assert read_contents == file_contents
+
+    with open(f"{server_type}://user:123@localhost:{port_num}/file.gz", "a") as f:
+        f.write(appended_content1)
+
+    with open(f"{server_type}://user:123@localhost:{port_num}/file.gz", "r") as f:
+        read_contents = f.read()
+    assert read_contents == file_contents + appended_content1
+
+    # ftp socket makefile returns a file whose name attribute is fileno() which is int
+    # that can't be used to infer compression extension, so the calls above would
+    # silently not use any compression (neither reading nor writing) so they would pass
+    # pytest suppresses the logging.warning('unable to transparently decompress...')
+    # so check here explicitly that the bytes on server are gzip compressed
+    with open(
+        f"{server_type}://user:123@localhost:{port_num}/file.gz",
+        "rb",
+        compression='disable',
+    ) as f:
+        read_contents = gzip.decompress(f.read()).decode()
+    assert read_contents == file_contents + appended_content1
+
 def test_line_endings_non_binary(server_info):
     server_type = server_info[0]
     port_num = server_info[1]
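
The comment in `test_compression` explains why the test reads the raw bytes back with `compression='disable'`: a plain write/read round trip also passes when compression is silently skipped on both sides. A minimal standalone sketch of that check follows, using only the standard library. The in-memory `BytesIO` buffer standing in for the FTP server, the `is_gzip` helper, and the assumption that append mode adds a second gzip member are all illustrative and not taken from smart_open.

```python
import gzip
import io

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip member


def is_gzip(raw: bytes) -> bool:
    """Return True if the raw bytes start with the gzip magic number."""
    return raw[:2] == GZIP_MAGIC


# Hypothetical stand-in for the remote file: an in-memory byte buffer.
remote = io.BytesIO()

file_contents = "Test Test \n new test \n another tests"
appended_content = "Added \n to end"

# Simulated "w" mode: write one gzip member.
remote.write(gzip.compress(file_contents.encode()))
# Simulated "a" mode (assumption): append a second, independent gzip member.
remote.write(gzip.compress(appended_content.encode()))

raw = remote.getvalue()

# If compression had been skipped, the stored bytes would be plain text and
# this check would fail, even though a plain round trip would still pass.
assert is_gzip(raw)

# gzip.decompress handles multi-member data, so both members decode to the
# concatenation of the two writes.
decoded = gzip.decompress(raw).decode()
assert decoded == file_contents + appended_content
```

Checking the stored bytes directly, rather than the round-trip result, is what distinguishes "compressed on the server" from "never compressed at all".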

setup.py

Lines changed: 2 additions & 0 deletions
@@ -102,6 +102,8 @@ def read(fname):
         'Programming Language :: Python :: 3.9',
         'Programming Language :: Python :: 3.10',
         'Programming Language :: Python :: 3.11',
+        'Programming Language :: Python :: 3.12',
+        'Programming Language :: Python :: 3.13',
         'Topic :: System :: Distributed Computing',
         'Topic :: Database :: Front-Ends',
     ],

smart_open/azure.py

Lines changed: 32 additions & 20 deletions
@@ -195,8 +195,9 @@ class Reader(io.BufferedIOBase):
     Implements the io.BufferedIOBase interface of the standard library.
 
     :raises azure.core.exceptions.ResourceNotFoundError: Raised when the blob to read from does not exist.
-
     """
+    _blob = None  # so `closed` property works in case __init__ fails and __del__ is called
+
     def __init__(
         self,
         container,
@@ -207,9 +208,10 @@ def __init__(
         max_concurrency=DEFAULT_MAX_CONCURRENCY,
     ):
         self._container_name = container
+        self._blob_name = blob
 
-        self._blob = _get_blob_client(client, container, blob)
         # type: azure.storage.blob.BlobClient
+        self._blob = _get_blob_client(client, container, blob)
 
         if self._blob is None:
             raise azure.core.exceptions.ResourceNotFoundError(
@@ -236,8 +238,13 @@ def __init__(
     def close(self):
         """Flush and close this stream."""
         logger.debug("close: called")
-        self._blob = None
-        self._raw_reader = None
+        if not self.closed:
+            self._blob = None
+            self._raw_reader = None
+
+    @property
+    def closed(self):
+        return self._blob is None
 
     def readable(self):
         """Return True if the stream can be read from."""
@@ -369,20 +376,26 @@ def __exit__(self, exc_type, exc_val, exc_tb):
         self.close()
 
     def __str__(self):
-        return "(%s, %r, %r)" % (self.__class__.__name__,
-                                 self._container_name,
-                                 self._blob.blob_name)
+        return "(%s, %r, %r)" % (
+            self.__class__.__name__,
+            self._container_name,
+            self._blob_name
+        )
 
     def __repr__(self):
         return "%s(container=%r, blob=%r)" % (
-            self.__class__.__name__, self._container_name, self._blob.blob_name,
+            self.__class__.__name__,
+            self._container_name,
+            self._blob_name,
         )
 
 
 class Writer(io.BufferedIOBase):
     """Writes bytes to Azure Blob Storage.
 
-    Implements the io.BufferedIOBase interface of the standard library."""
+    Implements the io.BufferedIOBase interface of the standard library.
+    """
+    _blob = None  # so `closed` property works in case __init__ fails and __del__ is called
 
     def __init__(
         self,
@@ -392,21 +405,19 @@ def __init__(
         blob_kwargs=None,
         min_part_size=_DEFAULT_MIN_PART_SIZE,
     ):
-        self._is_closed = False
         self._container_name = container
-
-        self._blob = _get_blob_client(client, container, blob)
+        self._blob_name = blob
         self._blob_kwargs = blob_kwargs or {}
-        # type: azure.storage.blob.BlobClient
-
         self._min_part_size = min_part_size
-
         self._total_size = 0
         self._total_parts = 0
         self._bytes_uploaded = 0
         self._current_part = io.BytesIO()
         self._block_list = []
 
+        # type: azure.storage.blob.BlobClient
+        self._blob = _get_blob_client(client, container, blob)
+
         #
         # This member is part of the io.BufferedIOBase interface.
         #
@@ -424,25 +435,26 @@ def terminate(self):
         logger.debug('%s: terminating multipart upload', self)
         if not self.closed:
             self._block_list = []
-            self._is_closed = True
+            self._blob = None
         logger.debug('%s: terminated multipart upload', self)
 
     #
     # Override some methods from io.IOBase.
     #
     def close(self):
+        logger.debug("close: called")
         if not self.closed:
             logger.debug('%s: completing multipart upload', self)
             if self._current_part.tell() > 0:
                 self._upload_part()
             self._blob.commit_block_list(self._block_list, **self._blob_kwargs)
             self._block_list = []
-            self._is_closed = True
+            self._blob = None
             logger.debug('%s: completed multipart upload', self)
 
     @property
     def closed(self):
-        return self._is_closed
+        return self._blob is None
 
     def writable(self):
         """Return True if the stream supports writing."""
@@ -528,13 +540,13 @@ def __str__(self):
         return "(%s, %r, %r)" % (
             self.__class__.__name__,
             self._container_name,
-            self._blob.blob_name
+            self._blob_name
         )
 
     def __repr__(self):
         return "%s(container=%r, blob=%r, min_part_size=%r)" % (
             self.__class__.__name__,
             self._container_name,
-            self._blob.blob_name,
+            self._blob_name,
             self._min_part_size
         )
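
The `azure.py` changes rely on one pattern: a class-level `None` default for the handle, a `closed` property derived from that handle, and a `close()` guarded by `closed`. A minimal sketch of the pattern in isolation follows. The `Resource` class, its `fail` flag, and the `object()` stand-in for a client are hypothetical and exist only for illustration. This is not the smart_open implementation.

```python
import io


class Resource(io.RawIOBase):
    """Hypothetical stream whose constructor may fail before setup completes."""

    # Class-level default: `closed` can be evaluated even if __init__ raised
    # before the instance attribute was assigned.
    _handle = None

    def __init__(self, fail=False):
        if fail:
            raise ValueError("could not open resource")
        self._handle = object()  # stand-in for a blob client or subprocess

    def close(self):
        # Idempotent: a second call, or a call on a half-built instance,
        # does nothing because `closed` is already True.
        if not self.closed:
            self._handle = None

    @property
    def closed(self):
        return self._handle is None


# Normal lifecycle: open, close, close again without error.
stream = Resource()
was_open = not stream.closed
stream.close()
stream.close()

# Simulate an instance whose __init__ never ran to completion.
partial = Resource.__new__(Resource)
partial_closed = partial.closed  # True, thanks to the class-level default
partial.close()  # safe no-op
```

Deriving `closed` from the handle removes the need for a separate flag such as `_is_closed`, so the two cannot drift out of sync.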

smart_open/hdfs.py

Lines changed: 20 additions & 5 deletions
@@ -68,6 +68,7 @@ class CliRawInputBase(io.RawIOBase):
 
     Implements the io.RawIOBase interface of the standard library.
     """
+    _sub = None  # so `closed` property works in case __init__ fails and __del__ is called
 
     def __init__(self, uri):
         self._uri = uri
@@ -84,8 +85,13 @@ def __init__(self, uri):
     def close(self):
         """Flush and close this stream."""
         logger.debug("close: called")
-        self._sub.terminate()
-        self._sub = None
+        if not self.closed:
+            self._sub.terminate()
+            self._sub = None
+
+    @property
+    def closed(self):
+        return self._sub is None
 
     def readable(self):
         """Return True if the stream can be read from."""
@@ -125,6 +131,8 @@ class CliRawOutputBase(io.RawIOBase):
 
     Implements the io.RawIOBase interface of the standard library.
     """
+    _sub = None  # so `closed` property works in case __init__ fails and __del__ is called
+
     def __init__(self, uri):
         self._uri = uri
         self._sub = subprocess.Popen(["hdfs", "dfs", '-put', '-f', '-', self._uri],
@@ -136,9 +144,16 @@ def __init__(self, uri):
         self.raw = None
 
     def close(self):
-        self.flush()
-        self._sub.stdin.close()
-        self._sub.wait()
+        logger.debug("close: called")
+        if not self.closed:
+            self.flush()
+            self._sub.stdin.close()
+            self._sub.wait()
+            self._sub = None
+
+    @property
+    def closed(self):
+        return self._sub is None
 
     def flush(self):
         self._sub.stdin.flush()
