Releases: aliyun/aliyun-odps-python-sdk

v0.12.3

09 May 02:14
e77853e

Features

  • Add support for VolumeFile and VolumeArchive resources for external volume files.
  • Implement a method to write SQL results to a table.
  • [Experimental] Add support for auto-partition table.

Enhancements

  • Ignore case in column names in schemas and records.
  • Remove decimal precision & scale check to allow large decimal scale.
  • Add config for logview latency and print final progress.
  • Enhance DDL generation for ROW FORMAT SERDE clause.
  • Add global compress options for tunnel.
  • Add session refresh option for Storage API.
  • Add unique identifier ID to instance job model.
  • Automatically add default settings in xflow instances.
  • Allow equality comparison of columns and records.
  • Allow arrow instance tunnel to use multiple processes.
  • Add anonymous placeholder for DBAPI SQL parameterized query.
  • Allow specifying dict type for structs when options.struct_as_dict == True.
  • Allow specifying table creation args and column types when calling write_table with create_table == True.
  • Add the raw error message to the warning when a PermissionError is encountered while downloading data from instance tunnel.
  • Automatically convert all settings to string in MaxFrame task to avoid unexpected behavior from server.
  • Add cast for complex types to make sure arrow conversion returns correct result.
  • Add create_partition argument for tunnel APIs.
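
The case-insensitive column names mentioned at the top of this list can be pictured with a small standalone sketch. The CaseInsensitiveRecord class below is hypothetical and is not the PyODPS record implementation; it only assumes that names are normalized to lower case before lookup, so that in this sketch two names differing only in capitalization collide.

```python
class CaseInsensitiveRecord:
    """Hypothetical record whose field names ignore case (not the PyODPS class)."""

    def __init__(self, names, values):
        if len(names) != len(values):
            raise ValueError("names and values must have the same length")
        self._index = {}
        for pos, name in enumerate(names):
            key = name.lower()  # normalize once so every lookup ignores case
            if key in self._index:
                raise ValueError("duplicate column name ignoring case: %r" % name)
            self._index[key] = pos
        self._values = list(values)

    def __getitem__(self, name):
        return self._values[self._index[name.lower()]]

    def __setitem__(self, name, value):
        self._values[self._index[name.lower()]] = value


record = CaseInsensitiveRecord(["UserId", "name"], [1, "alice"])
record["NAME"] = "bob"  # same field as "name"
user_id = record["userid"]  # same field as "UserId"
```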

Bugfixes

  • Fix error when loading the ODPS engine spec of Superset.
  • Fix cast for string column to new decimal type in arrow tunnel.
  • Fix size check issue for varchar type with binary data.
  • Fix error when complex types contain None elements.
  • Fix huge space between lines of repr(table) when name or type is long.
  • Fix inferring table schema from arrow schema when struct type exists.

Documentation

  • Add docs of table functions.
  • Add docs for diagnosing slow SQL execution.
  • Add docs of ODPS types.
  • Add links to API references for object docs.
  • Add API docs for tunnels and remove Mars docs.
  • Mark PyODPS DataFrame as deprecated in documentation.

Compatibility issues

  • Since PyODPS 0.12.3, column names that differ only in capitalization are treated as the same name, matching the behavior of the MaxCompute service. This may break code that relies on capitalized column names.
  • Since PyODPS 0.12.3, dict instead of OrderedDict is used by default for structs when options.struct_as_dict == True for Python>=3.7. Change to legacy behavior by setting options.struct_as_ordered_dict = True.
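
The practical effect of the struct change is mostly about type checks and equality semantics. The standalone snippet below uses only the Python standard library (no PyODPS calls, and the field names are made up). It shows that a plain dict keeps insertion order on Python 3.7 and later but compares equal regardless of key order, while OrderedDict comparison is order-sensitive. Code that depends on either property is the kind that may need options.struct_as_ordered_dict = True.

```python
from collections import OrderedDict

# The same struct value with its fields listed in two different orders.
plain_a = {"name": "alice", "age": 30}
plain_b = {"age": 30, "name": "alice"}
ordered_a = OrderedDict([("name", "alice"), ("age", 30)])
ordered_b = OrderedDict([("age", 30), ("name", "alice")])

# Plain dicts ignore key order when compared; OrderedDicts do not.
same_plain = plain_a == plain_b
same_ordered = ordered_a == ordered_b

# A plain dict still iterates in insertion order on Python >= 3.7.
keys_in_order = list(plain_a)

# isinstance checks against OrderedDict fail for plain dicts.
is_ordered_type = isinstance(plain_a, OrderedDict)
```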

v0.12.2.2

18 Apr 05:14

Bugfixes

  • Fix potential corruption of default global settings.

v0.12.2.1

20 Mar 06:26
82ee2a3

Enhancements

  • Add session refresh option for storage API.
  • Enable timeout when using async mode for table tunnel.
  • Enhance DDL generation for ROW FORMAT SERDE clause.

Documentation

  • Add docs for basic types, tunnel and table functions.

Bugfixes

  • Fix error when loading the ODPS engine spec of Superset.

v0.12.2

03 Jan 02:06
6f5dbfb

Features

  • (Experimental) Add support for MCQAv2 for sqlalchemy.
  • Add table alteration utility functions.
  • Add support for job insight as a replacement for logviews. It can be turned on by setting options.use_legacy_logview = False.

Enhancements

  • Make Superset support compatible with Superset 4.1.0 and later.
  • Print usage when the pyodps-pack command is incorrect.
  • Add support for timestamp_ntz for arrow tunnel.
  • Add checks for potential None header values before request.
  • Add retry when getting credentials from providers.
  • Add retry when schema of stream tunnel mismatches with current schema.
  • Make odps.task.wlm.quota available for MCQAv2 to set quota name.
  • Stop using platform.platform to check OS type.
  • Escape comment in DDL to allow quotes within comments.
  • Enable using field readers and writers in C tunnel implementation.
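
The comment-escaping item above can be illustrated with a minimal sketch. The helper names quote_sql_string and build_create_table_ddl are hypothetical, and the sketch assumes that the SQL dialect accepts backslash escapes inside single-quoted literals and backticks around identifiers; it is not the DDL generator that PyODPS ships.

```python
def quote_sql_string(text):
    """Wrap text in single quotes, backslash-escaping backslashes and quotes.

    Assumption: the target dialect accepts backslash escapes inside
    single-quoted literals; this rule is not taken from the PyODPS source.
    """
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def build_create_table_ddl(table_name, columns, comment=None):
    """Build a minimal CREATE TABLE statement from (name, type, comment) tuples."""
    col_defs = []
    for name, type_name, col_comment in columns:
        col_def = "`%s` %s" % (name, type_name)
        if col_comment:
            col_def += " COMMENT " + quote_sql_string(col_comment)
        col_defs.append(col_def)
    ddl = "CREATE TABLE `%s` (\n  %s\n)" % (table_name, ",\n  ".join(col_defs))
    if comment:
        ddl += "\nCOMMENT " + quote_sql_string(comment)
    return ddl


# Comments containing quotes no longer terminate the literal early.
ddl = build_create_table_ddl(
    "demo_table",
    [("id", "BIGINT", "user's id"), ("name", "STRING", None)],
    comment="it's a demo",
)
```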

Bugfixes

  • Fix tags header for tunnel readers and writers.
  • Fix duplicate param error in to_pandas method in partitions.

Documentation

  • Add installation notices for urllib3 version when ssl module is compiled with openssl<1.1.1.

v0.12.1.1

05 Dec 05:48
469343a

Bugfixes

  • Add an import of requests in odps.lib to resolve a compatibility issue with legacy code.

v0.12.1

22 Nov 02:13
df06c90

Features

  • (Experimental) Add metrics interface for tunnel.
  • (Experimental) Add support for MCQAv2.
  • Add support for upsert writer for table object.

Enhancements

  • Support hashing of decimal types for primary keys.
  • Shift CSV field size limit to table field size limit when reading with legacy result interface.
  • Add cythonized decimal, array, map and struct validators to accelerate reading and writing of arrays.
  • Add allow_schema_mismatch option and CDC info on tables and partitions.
  • Enhance call_with_retry to support KeyboardInterrupt and ignoring exceptions.
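
As a rough picture of what a retry helper with these two properties might look like, here is a minimal standalone sketch. The function retry_call and its parameters are hypothetical and do not reflect the actual signature of call_with_retry in PyODPS. The sketch takes one possible reading of the note: KeyboardInterrupt always propagates immediately, and the caller may opt to ignore the final failure.

```python
import time


def retry_call(func, retry_times=3, delay=0.0, exc_types=(Exception,), ignore=False):
    """Call func until it succeeds or retry_times attempts are used up.

    Hypothetical helper: KeyboardInterrupt is never swallowed, and when
    ignore is True the last failure is suppressed and None is returned.
    """
    if retry_times < 1:
        raise ValueError("retry_times must be at least 1")
    last_exc = None
    for attempt in range(retry_times):
        try:
            return func()
        except KeyboardInterrupt:
            # Let Ctrl-C through even if exc_types includes BaseException.
            raise
        except exc_types as exc:
            last_exc = exc
            # Sleep only between attempts, not after the final one.
            if attempt + 1 < retry_times and delay > 0:
                time.sleep(delay)
    if ignore:
        return None
    raise last_exc


attempts = {"count": 0}


def flaky():
    """Fail twice, then succeed, to exercise the retry loop."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise IOError("temporary failure")
    return "ok"


result = retry_call(flaky, retry_times=5)
```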

Bugfixes

  • Fix mishandling of project name passed to TableTunnel constructor when getting table by name only.
  • Fix duplicate param error in to_pandas method in partitions.
  • Fix re-obtaining the bearer token when it times out.

Documentation

  • Fix dead URL for the guide to run PyODPS DataFrame in cluster.
  • Refine description of tunnel download limit.

v0.12.0

03 Oct 02:22
901b769

Features

  • Implement write_table with pandas support to facilitate creating tables or partitions from pandas DataFrames, and to_pandas methods to facilitate converting to pandas DataFrames.
  • Add support for converting table data and instance results to pandas DataFrames with to_pandas and iter_pandas methods.
  • Add separate delete methods for views and materialized views.
  • Add support for table freeze command.
  • Add support for using computational quotas.
  • Add params to allow creating and removing root directory of external volumes.
  • Add a direct method to obtain the SQL statement from instance objects.
  • Support using Alibaba Cloud credentials to access MaxCompute.
  • Add support for append_partitions argument on tunnel reader and writer.
  • Support complex types in UDF debug utility pyou.
  • Add support of seeking VCS directory roots with pyodps-pack.

Enhancements

  • Allow recording and reusing MCQA sessions with a local file.
  • Move session methods into models package.
  • Enable black lint for repository and unify quotes with double quotes if possible.
  • Optimize handling of arrow tunnel timezones with pyarrow.compute if possible.
  • Add tags header for tunnel session requests.
  • Allow creating ODPS entry via environment variables.
  • Add definition of MaxFrame tasks.
  • Split task module into multiple modules by task category.
  • Move table download retry from table API into base tunnel.
  • Enhance support of full resource paths and temp resources.
  • Add default task_name for instance methods.
  • Upgrade tblib to 3.0.0 and refine compatibility for Python 2.7.
  • Allow caching object names for SQLAlchemy to accelerate table listing.
  • Enable instance waiting at server side and add retries for methods on instance objects.
  • Set the default value of project_as_schema in DBAPI according to tenant schema support.
  • Add retry and error logging when using multiprocessing to read.
  • Remove legacy pandas IO support for simple types.
  • Support schema version on stream upload session.
  • Allow creating STS account from environment variables and force reload on expiration.
  • Allow raising errors when task info responses are empty by passing raise_empty=True.
  • Allow creating tunnel sessions with full table name.
  • Rename options.always_enable_schema as options.enable_schema to make it consistent with MaxFrame configurations.
  • Switch TableTunnel.create_download_session to async_mode by default to reduce chances of failure when calling tunnel SDK directly.
  • Warn when running pyodps-pack with sudo under macOS. Show warning and error messages with colors as well.

Bugfixes

  • Fix error when uploading multiple batches and multiple blocks with BufferedArrowWriter.
  • Fix field size error when it is specified in project settings.
  • Fix odps.merge.txn.table.compact argument of merge compact command.
  • Fix compatibility for Numpy 2.0.
  • Upgrade six and setuptools requirements under Python 3.12 to fix installation issue.
  • Rewrite do_ping errors when called by Apache Superset to fix misleading error messages.
  • Fix compatibility of Apache Superset date functions.
  • Fix usage of sqlalchemy with bearer token account.

Documentation

  • Add more documentation for tunnel APIs.
  • Normalize zh strings in po files by using jieba to split words.

Deployment & Tests

  • Make tests more stable by using distinct table names and ordered dict.
  • Refine tests to make it possible to test with pytest-xdist.
  • Fold full requirements inside requirement strings (thanks @dimbleby).

Compatibility issues

  • Undocumented APIs open_pandas_reader and open_pandas_writer are removed from TableTunnel class to make wheels simple. Users who call these APIs should use to_pandas, write_table or arrow tunnel support instead.
  • options.always_enable_schema is renamed as options.enable_schema and the former option is now marked as deprecated.
  • --without-docker and --without-merge options in pyodps-pack are renamed as --no-docker and --no-merge to make them consistent with pip style.
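
A renamed option that keeps its old name as a deprecated alias is commonly implemented with a forwarding property. The sketch below is a generic illustration of that pattern using only the standard library. The Options class here is hypothetical and is not the PyODPS options object, whose internals may differ.

```python
import warnings


class Options:
    """Hypothetical options holder; not the PyODPS options object."""

    def __init__(self):
        self.enable_schema = False

    @property
    def always_enable_schema(self):
        # Old name: read through to the new attribute and warn the caller.
        warnings.warn(
            "always_enable_schema is deprecated, use enable_schema instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.enable_schema

    @always_enable_schema.setter
    def always_enable_schema(self, value):
        warnings.warn(
            "always_enable_schema is deprecated, use enable_schema instead",
            DeprecationWarning,
            stacklevel=2,
        )
        self.enable_schema = value


opts = Options()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    opts.always_enable_schema = True  # legacy spelling still works, but warns
new_value = opts.enable_schema  # the new name reflects the change
```

With this pattern, existing code keeps working for a transition period while the warning points users at the new name.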

v0.11.6.5

26 Aug 09:09
5e1c7f4

Enhancements

  • Switch TableTunnel.create_download_session to async_mode by default.
  • Support schema version on stream upload session.
  • Allow creating STS account from env and force reload on expiration.

v0.11.6.4

16 Aug 06:15
5eb1025

Bugfixes

  • Fix error when uploading multiple batches with BufferedArrowWriter.

v0.11.6.3

31 Jul 06:06
0f25c64

Bugfixes

  • Fix CRC computation of arrow tunnel interfaces.
  • Fix completeness of upload retry of buffered writers.

Enhancements

  • Allow recording and reusing MCQA sessions with a local file.

Tests

  • Fix test failure of storage API.