Labels: help wanted (We can't figure this out, if you can, then please help!)
Problem description
I am trying to stream a binary file from Azure Blob Storage.
I expect to be able to iterate over chunks of the data set, but I see an error to do with the Azure readinto function.
I'm using the npTDMS library to read a LabVIEW data file in TDMS format (binary quantitative data files.)
Steps/code to reproduce the problem
The code is something like this:
import azure.storage.blob
import smart_open
import nptdms

CONN_STR = '******************'
BLOB_URI = 'azure://test/my_data_file.tdms'

transport_params = dict(
    client=azure.storage.blob.BlobServiceClient.from_connection_string(conn_str=CONN_STR),
)

with smart_open.open(BLOB_URI, mode='rb', transport_params=transport_params) as file:
    with nptdms.TdmsFile.open(file) as tdms_file:
        for group in tdms_file.groups():
            for channel in group.channels():
                for chunk in channel.data_chunks():
                    pass

and the error I get is:
Traceback (most recent call last):
  File "C:\Users\my_username\my_project\scripts\blob-tdms\smart.py", line 35, in <module>
    main()
  File "C:\Users\my_username\my_project\scripts\blob-tdms\smart.py", line 28, in main
    for chunk in channel.data_chunks():
  File "C:\Users\my_username\Miniconda3\envs\my_project\lib\site-packages\nptdms\tdms.py", line 564, in data_chunks
    for raw_data_chunk in self._read_channel_data_chunks():
  File "C:\Users\my_username\Miniconda3\envs\my_project\lib\site-packages\nptdms\tdms.py", line 758, in _read_channel_data_chunks
    for chunk in self._reader.read_raw_data_for_channel(self.path):
  File "C:\Users\my_username\Miniconda3\envs\my_project\lib\site-packages\nptdms\reader.py", line 191, in read_raw_data_for_channel
    for i, chunk in enumerate(
  File "C:\Users\my_username\Miniconda3\envs\my_project\lib\site-packages\nptdms\tdms_segment.py", line 269, in read_raw_data_for_channel
    for chunk in self._read_channel_data_chunks(f, data_objects, channel_path, chunk_offset, stop_chunk):
  File "C:\Users\my_username\Miniconda3\envs\my_project\lib\site-packages\nptdms\tdms_segment.py", line 367, in _read_channel_data_chunks
    for chunk in reader.read_channel_data_chunks(file, data_objects, channel_path, chunk_offset, stop_chunk):
  File "C:\Users\my_username\Miniconda3\envs\my_project\lib\site-packages\nptdms\base_segment.py", line 64, in read_channel_data_chunks
    yield self._read_channel_data_chunk(file, data_objects, chunk_index, channel_path)
  File "C:\Users\my_username\Miniconda3\envs\my_project\lib\site-packages\nptdms\base_segment.py", line 72, in _read_channel_data_chunk
    data_chunk = self._read_data_chunk(file, data_objects, chunk_index)
  File "C:\Users\my_username\Miniconda3\envs\my_project\lib\site-packages\nptdms\daqmx.py", line 39, in _read_data_chunk
    combined_data = read_interleaved_segment_bytes(file, raw_data_width, chunk_size)
  File "C:\Users\my_username\Miniconda3\envs\my_project\lib\site-packages\nptdms\base_segment.py", line 159, in read_interleaved_segment_bytes
    combined_data = fromfile(f, dtype=np.uint8, count=number_bytes)
  File "C:\Users\my_username\Miniconda3\envs\my_project\lib\site-packages\nptdms\base_segment.py", line 147, in fromfile
    bytes_read = file.readinto(buffer[offset:])
  File "C:\Users\my_username\Miniconda3\envs\my_project\lib\site-packages\smart_open\azure.py", line 322, in readinto
    b[:len(data)] = data
ValueError: invalid literal for int() with base 10: b'\x93\xad\x03\x00k\xf0\xff\xff\xfe\xee\xff\xffm\xfd\xff\xffd\xc1E\x00<\xad\x03\x00O\xf0\xff\xffI\xee\xff\xff\xd1\xfd\xff\xff\xbe\xc2E\x00\xe8\xac\x03\x00\xa6\xef\xff\xff\xe5\xed\xff\xff\x92\xfd\xff\x
It seems like it's expecting a text file? Or it's not calculating the data index correctly to page through the data set?
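One guess, not a confirmed diagnosis: the traceback shows npTDMS passing buffer[offset:] to readinto, and that buffer looks like a NumPy uint8 array rather than a bytearray. If so, the assignment b[:len(data)] = data in smart_open may make NumPy try to parse the whole bytes object as a single integer, which would explain the "invalid literal for int()" message. Below is a minimal sketch of a possible workaround under that assumption. ReadintoAdapter is a hypothetical name, not part of smart_open or npTDMS; it copies through a memoryview so that any writable, contiguous buffer is filled byte for byte.

```python
import io


class ReadintoAdapter(io.RawIOBase):
    """Hypothetical wrapper around a seekable binary reader.

    Assumption: the wrapped object provides read(), seek() and tell().
    readinto() copies through a memoryview, so the target may be a
    bytearray, an array.array, or any other writable contiguous buffer.
    """

    def __init__(self, raw):
        self._raw = raw

    def readable(self):
        return True

    def seekable(self):
        return True

    def seek(self, offset, whence=io.SEEK_SET):
        return self._raw.seek(offset, whence)

    def tell(self):
        return self._raw.tell()

    def readinto(self, b):
        # View the target as a flat sequence of unsigned bytes.
        view = memoryview(b).cast("B")
        data = self._raw.read(len(view))
        # Copy raw bytes; no element-by-element conversion takes place.
        view[:len(data)] = data
        return len(data)
```

In the script above this would mean calling nptdms.TdmsFile.open(ReadintoAdapter(file)) instead of passing file directly. I have not verified this against Azure Blob Storage, and the adapter does not close the wrapped reader, so the outer with block is still needed.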
Versions
>>> import platform, sys, smart_open
>>> print(platform.platform())
Windows-10-10.0.19042-SP0
>>> print("Python", sys.version)
Python 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:15:42) [MSC v.1916 64 bit (AMD64)]
>>> print("smart_open", smart_open.__version__)
smart_open 6.1.0

From pip list:
azure-core 1.23.0
azure-storage-blob 12.10.0
npTDMS 1.4.0
smart-open 6.1.0