Releases: aliyun/aliyun-odps-python-sdk
v0.12.3
Features
- Add support for resources for VolumeFile and VolumeArchive for external volume files.
- Implement a method to write SQL results to a table.
- [Experimental] Add support for auto-partition table.
Enhancements
- Ignore cases for column names in schema and record.
- Remove decimal precision & scale check to allow large decimal scale.
- Add config for logview latency and print final progress.
- Enhance DDL generation for ROW FORMAT SERDE clause.
- Add global compress options for tunnel.
- Add session refresh option for Storage API.
- Add unique identifier ID to instance job model.
- Automatically add default settings in xflow instances.
- Allow equality comparison of columns and records.
- Allow arrow instance tunnel to use multiple processes.
- Add anonymous placeholder for DBAPI SQL parameterized query.
- Allow specifying dict type for structs when `options.struct_as_dict == True`.
- Allow specifying table creation args and column types when calling `write_table` with `create_table == True`.
- Add raw error message to warning when PermissionError encountered in downloading data from instance tunnel.
- Automatically convert all settings to string in MaxFrame task to avoid unexpected behavior from server.
- Add cast for complex types to make sure arrow conversion returns correct result.
- Add `create_partition` argument for tunnel APIs.
Bugfixes
- Fix error when loading ODPS engine spec of superset.
- Fix cast for string column to new decimal type in arrow tunnel.
- Fix size check issue for varchar type with binary data.
- Fix error when None elements exist in complex types.
- Fix huge space between lines of repr(table) when name or type is long.
- Fix inferring table schema from arrow schema when struct type exists.
Documentation
- Add docs of table functions.
- Add docs for diagnosing slow SQL execution.
- Add docs of ODPS types.
- Add links to API references for object docs.
- Add API docs for tunnels and remove Mars docs.
- Mark PyODPS DataFrame as deprecated in documentation.
Compatibility issues
- Since PyODPS 0.12.3, column names with different capitalization are considered the same to make it compatible with behaviors of MaxCompute service. It may break some cases where capitalized column names are used.
- Since PyODPS 0.12.3, `dict` instead of `OrderedDict` is used by default for structs when `options.struct_as_dict == True` for Python>=3.7. Change to legacy behavior by setting `options.struct_as_ordered_dict = True`.
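To gauge whether the case-insensitive column change affects an existing table, one option is to scan its column names for spellings that differ only by capitalization. The following is a minimal sketch in plain Python, not part of PyODPS: `find_case_collisions` is a hypothetical helper, and comparing lowercased names is an assumption about how the names are matched.

```python
from collections import defaultdict


def find_case_collisions(column_names):
    """Group column names that differ only by capitalization.

    Hypothetical helper for illustration; lowercasing is an assumed
    approximation of case-insensitive name matching.
    """
    groups = defaultdict(list)
    for name in column_names:
        groups[name.lower()].append(name)
    # Keep only groups that contain more than one distinct spelling.
    return {key: names for key, names in groups.items() if len(set(names)) > 1}


collisions = find_case_collisions(["id", "ID", "name", "Price"])
print(collisions)
```

An empty result suggests no two columns would be treated as the same name after upgrading.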
v0.12.2.2
v0.12.2.1
Enhancements
- Add session refresh option for storage API.
- Enable timeout when using async mode for table tunnel.
- Enhance DDL generation for ROW FORMAT SERDE clause.
Documentation
- Add docs for basic types, tunnel and table functions.
Bugfixes
- Fix error when loading ODPS engine spec of superset.
v0.12.2
Features
- (Experimental) Add support for MCQAv2 for sqlalchemy.
- Add table alteration utility functions.
- Add support of job insight instead of logviews. Can be turned on by configuring with `options.use_legacy_logview = False`.
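As a configuration sketch, the job insight switch described above could be set once before creating any instances. This assumes PyODPS 0.12.2 or later is installed and uses only the option name given in the note.

```python
from odps import options

# Per the release note, job insight is used in place of logviews
# once the legacy logview option is switched off.
options.use_legacy_logview = False
```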
Enhancements
- Make Superset support compatible with Superset 4.1.0 and later.
- Print usage when the command is incorrect for pyodps-pack.
- Add support for timestamp_ntz for arrow tunnel.
- Add checks for potential None header values before request.
- Add retry when getting credentials from providers.
- Add retry when schema of stream tunnel mismatches with current schema.
- Make `odps.task.wlm.quota` available for MCQAv2 to set quota name.
- Stop using platform.platform to check OS type.
- Escape comment in DDL to allow quotes within comments.
- Enable using field readers and writers in C tunnel implementation.
Bugfixes
- Fix tags header for tunnel readers and writers.
- Fix duplicate param error in to_pandas method in partitions.
Documentation
- Add installation notices for urllib3 version when ssl module is compiled with openssl<1.1.1.
v0.12.1.1
v0.12.1
Features
- (Experimental) Add metrics interface for tunnel.
- (Experimental) Add support for MCQAv2.
- Add support for upsert writer for table object.
Enhancements
- Support hashing of decimal types for primary keys.
- Shift CSV field size limit to table field size limit when reading with legacy result interface.
- Add cythonized decimal, array, map and struct validators to accelerate reading and writing of arrays.
- Add `allow_schema_mismatch` option and CDC info on tables and partitions.
- Enhance `call_with_retry` to support KeyboardInterrupt and ignoring exceptions.
Bugfixes
- Fix mishandling of project name passed to TableTunnel constructor when getting table by name only.
- Fix duplicate param error in to_pandas method in partitions.
- Fix re-obtaining bearer-token when it meets timeout.
Documentation
- Fix dead URL for the guide to run PyODPS DataFrame in cluster.
- Refine description of tunnel download limit.
v0.12.0
Features
- Implement `write_table` to facilitate creating tables or partitions with pandas DataFrames, and `to_pandas` methods to facilitate converting from and to pandas DataFrames.
- Add support for converting table data and instance results to pandas DataFrames with `to_pandas` and `iter_pandas` methods.
- Add separate delete methods for views and materialized views.
- Add support for table freeze command.
- Add support for using computational quotas.
- Add params to allow creating and removing root directory of external volumes.
- Allow direct method to obtain SQL statement from instance objects.
- Supports using AlibabaCloud credentials to access MaxCompute.
- Add support for `append_partitions` argument on tunnel reader and writer.
- Support complex types in UDF debug utility `pyou`.
- Add support of seeking VCS directory roots with `pyodps-pack`.
Enhancements
- Allow recording and reusing MCQA sessions with a local file.
- Move session methods into models package.
- Enable `black` lint for repository and unify quotes with double quotes if possible.
- Optimize handling of arrow tunnel timezones with `pyarrow.compute` if possible.
- Add tags header for tunnel session requests.
- Allow creating ODPS entry via environment variables.
- Add definition of MaxFrame tasks.
- Split task module into multiple modules by task category.
- Move table download retry from table API into base tunnel.
- Enhance support of full resource paths and temp resources.
- Add default `task_name` for instance methods.
- Upgrade tblib to 3.0.0 and refine compatibility for Python 2.7.
- Allow caching object names for SQLAlchemy to accelerate table listing.
- Enable instance waiting at server side and add retries for methods on instance object.
- Set default value of `project_as_schema` given tenant schema support in DBAPI support.
- Add retry and error logging when using multiprocessing to read.
- Remove legacy pandas IO support for simple types.
- Support schema version on stream upload session.
- Allow creating STS account from environment variables and force reload on expiration.
- Allow raising errors when task info responses are empty by passing `raise_empty=True`.
- Allow creating tunnel sessions with full table name.
- Rename `options.always_enable_schema` as `options.enable_schema` to make it consistent with MaxFrame configurations.
- Switch `TableTunnel.create_download_session` to `async_mode` by default to reduce chances of failure when calling tunnel SDK directly.
- Warn when running `pyodps-pack` with sudo under macOS. Show warning and error messages with colors as well.
Bugfixes
- Fix error when uploading multiple batches and multiple blocks with BufferedArrowWriter.
- Fix field size error when it is specified in project settings.
- Fix `odps.merge.txn.table.compact` argument of merge compact command.
- Fix compatibility for NumPy 2.0.
- Upgrade six and setuptools requirements under Python 3.12 to fix installation issue.
- Rewrite do_ping errors when called by Apache Superset to fix misleading error messages.
- Fix compatibility of Apache Superset date functions.
- Fix usage of sqlalchemy with bearer token account.
Documentation
- Add more documentation for tunnel APIs.
- Normalize zh strings in po files by using `jieba` to split words.
Deployment & Tests
- Make tests more stable by using distinct table names and ordered dict.
- Refine tests to make it possible to test with pytest-xdist.
- Fold full requirements inside requirement strings (thanks @dimbleby).
Compatibility issues
- Undocumented APIs `open_pandas_reader` and `open_pandas_writer` are removed from `TableTunnel` class to make wheels simple. Users who call these APIs should use `to_pandas`, `write_table` or arrow tunnel support instead.
- `options.always_enable_schema` is renamed as `options.enable_schema` and the former option is now marked as deprecated.
- `--without-docker` and `--without-merge` options in `pyodps-pack` are renamed as `--no-docker` and `--no-merge` to make them consistent with `pip` style.
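Build scripts that still pass the old `pyodps-pack` flags can be updated mechanically. The sketch below is plain Python and not part of PyODPS: `modernize_pack_args` and `RENAMED_FLAGS` are hypothetical names, and the sample argument list is illustrative only.

```python
# Mapping of legacy pyodps-pack flags to their renamed equivalents,
# taken from the compatibility note above.
RENAMED_FLAGS = {
    "--without-docker": "--no-docker",
    "--without-merge": "--no-merge",
}


def modernize_pack_args(args):
    """Return a copy of args with legacy flags replaced (hypothetical helper)."""
    return [RENAMED_FLAGS.get(arg, arg) for arg in args]


legacy = ["pyodps-pack", "--without-docker", "--without-merge"]
print(modernize_pack_args(legacy))
```

Arguments that are not in the mapping pass through unchanged, so the helper can be applied to argument lists that already use the new names.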