Releases: pathwaycom/pathway

v0.8.4

18 Mar 17:52

Fixed

  • Pathway now requires the LiteLLM package only if you use one of the LiteLLM wrappers.
  • Retries are implemented in pw.io.airbyte.read.
  • State processing protocol is updated in pw.io.airbyte.read.
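
The note above does not describe the retry policy used by pw.io.airbyte.read. As a rough illustration of the general pattern only, here is a minimal retry-with-backoff sketch in plain Python. The helper name call_with_retries and its parameters are hypothetical and are not part of Pathway's API.

```python
import time


def call_with_retries(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call `operation` until it succeeds or `max_attempts` is reached.

    Hypothetical helper: illustrates the retry pattern, not Pathway's code.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

The sleep function is injectable so that the backoff schedule can be checked without actually waiting.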

v0.8.3

13 Mar 21:17

Added

  • New parameters of pw.UDF class and pw.udf decorator: return_type, deterministic, propagate_none, executor, cache_strategy.
  • The LLM Xpack now provides integrations with LlamaIndex and LangChain for running the Pathway VectorStore server.

Changed

  • Subclassing UDFSync and UDFAsync is deprecated. UDF should be subclassed to create a new UDF.
  • Passing keyword arguments to pw.apply, pw.apply_with_type, pw.apply_async is deprecated. In the future, they'll be used for configuration, not passing data to the function.

Fixed

  • Fixed a minor bug in the Table.groupby() method that sometimes prevented access to certain columns in the subsequent reduce().
  • Fixed warnings from using OpenAI Async embedding model in the VectorStore in Colab.

v0.8.2

28 Feb 12:56

Added

  • Support for the %:z timezone format code in strptime.
  • Support for Airbyte connectors via pw.io.airbyte.

v0.8.1

15 Feb 13:42

Added

  • Introduced the send_alerts function in the pw.io.slack namespace, enabling users to send messages from a specified column directly to a Slack channel.
  • Enhanced the pw.io.http.rest_connector by introducing an additional argument called request_validator. This feature empowers users to validate payloads and raise an HTTP 400 error if necessary.

Fixed

  • Addressed an issue in pw.io.xpacks.llm.VectorStoreServer where the computation of the last modification timestamp for an indexed document was incorrect.

Changed

  • Improved the behavior of pw.io.kafka.write. It now includes retries when sending data to the output topic encounters failures.

v0.8.0

01 Feb 14:51

Added

  • pw.io.http.rest_connector now supports multiple HTTP request types.
  • pw.io.http.PathwayWebserver now allows Cross-Origin Resource Sharing (CORS) to be enabled on newly added endpoints.
  • Wrappers for LiteLLM and HuggingFace chat services and SentenceTransformers embedding service are now added to Pathway xpack for LLMs.

Changed

  • pw.run now includes an additional parameter runtime_typechecking that enables strict type checking at runtime.
  • Embedders in pathway.xpacks.llm.embedders now correctly process empty strings as queries.
  • BREAKING: pw.run and pw.run_all now only accept keyword arguments.
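
To show what the keyword-only change means for callers, here is a small plain-Python sketch. run_pipeline is a hypothetical stand-in, not pw.run itself, and the debug parameter is included only for illustration. The bare * in the signature is what makes Python reject positional arguments.

```python
def run_pipeline(*, debug=False, runtime_typechecking=False):
    """Hypothetical stand-in for a function that accepts keyword arguments only."""
    return {"debug": debug, "runtime_typechecking": runtime_typechecking}


# Keyword form: accepted.
config = run_pipeline(runtime_typechecking=True)

# Positional form: rejected by Python with a TypeError.
try:
    run_pipeline(True)
    positional_accepted = True
except TypeError:
    positional_accepted = False
```

Code that previously passed arguments by position would need to name each argument explicitly.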

Fixed

  • pw.Duration can now be returned from User-Defined Functions (UDFs) or used as a constant value without resulting in errors.
  • pw.io.debezium.read now correctly handles tables that do not have a primary key.

v0.7.10

26 Jan 16:09

Added

  • pw.io.http.rest_connector can now generate Open API 3.0.3 schema that will be returned by the route /_schema.
  • Wrappers for OpenAI Chat and Embedding services are now added to Pathway xpack for LLMs.
  • A vector indexing pipeline that allows querying for the most similar documents. It is available as class VectorStore as part of Pathway xpack for LLMs.

Fixed

  • pw.debug.table_from_markdown now uses schema parameter (when set) to properly assign simple types (int, bool, float, str, bytes) and optional simple types to columns.

v0.7.9

18 Jan 13:40

Changed

  • pw.io.http.rest_connector now also accepts port as a string for backwards compatibility.

v0.7.8

18 Jan 11:24

Added

  • Support for comparisons of tuples has been added.
  • Standalone versions of methods such as pw.groupby, pw.join, pw.join_inner, pw.join_left, pw.join_right, and pw.join_outer are now available.
  • The abs function from Python can now be used on Pathway expressions.
  • The asof_join method now has configurable temporal behavior. The behavior parameter can be used to pass the configuration.
  • The state of the deduplicate operator can now be persisted.

Changed

  • interval_join can now work with intervals of zero length.
  • The pw.io.http.rest_connector can now open multiple endpoints on the same port using a new pw.io.http.PathwayWebserver class.
  • The pw.xpacks.connectors.sharepoint.read and pw.io.gdrive.read methods now support a size limit for a single object. If set, files exceeding the limit are skipped and not read.
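
For the interval_join change in the list above, a zero-length interval can be pictured as a join on exactly matching offsets. The plain-Python sketch below assumes the matching condition lower <= right_time - left_time <= upper with both bounds inclusive. That condition and the helper name interval_matches are assumptions for illustration, not taken from this changelog.

```python
def interval_matches(left_times, right_times, lower, upper):
    """Return (left, right) pairs whose time difference lies in [lower, upper].

    Assumed semantics: both bounds are inclusive.
    """
    return [
        (left, right)
        for left in left_times
        for right in right_times
        if lower <= right - left <= upper
    ]


# Zero-length interval (lower == upper == 0): only equal times match.
exact = interval_matches([1, 5], [1, 4, 5, 6], 0, 0)

# Wider interval for comparison: differences from -1 to 1 match.
near = interval_matches([1, 5], [1, 4, 5, 6], -1, 1)
```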

v0.7.7

27 Dec 14:18

Added

  • pathway.xpacks.llm.splitter.TokenCountSplitter.

v0.7.6

22 Dec 13:44

New Features

Conversion Methods in pw.Json

  • Introducing new methods for strict conversion of pw.Json to desired types within a UDF body:
    • as_int()
    • as_float()
    • as_str()
    • as_bool()
    • as_list()
    • as_dict()

DateTime Functionality

  • Added table.col.dt.utc_from_timestamp method: Creates DateTimeUtc from timestamps represented as ints or floats.
  • Enhanced the table.col.dt.timestamp method with a new unit argument to specify the unit of the returned timestamp.
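
The role of the unit argument can be illustrated with the standard library alone. The sketch below is a plain-Python analogue, not Pathway code: the function names and the unit labels "s", "ms", and "us" are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Assumed unit labels, used only in this sketch.
UNITS = {
    "s": timedelta(seconds=1),
    "ms": timedelta(milliseconds=1),
    "us": timedelta(microseconds=1),
}


def utc_datetime_from_timestamp(value, unit="s"):
    """Build a UTC datetime from an int or float timestamp in the given unit."""
    return EPOCH + value * UNITS[unit]


def timestamp_in_unit(dt, unit="s"):
    """Return the time elapsed since the epoch, expressed in the given unit."""
    return (dt - EPOCH) / UNITS[unit]


moment = utc_datetime_from_timestamp(1_700_000_000, unit="s")
same_moment = utc_datetime_from_timestamp(1_700_000_000_000, unit="ms")
```

The same instant yields different numbers depending on the unit, which is why stating the unit explicitly avoids ambiguity.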

Experimental Features

  • Introduced an experimental xpack with a Microsoft SharePoint input connector.

Enhancements

Improved JSON Handling

  • Index operator ([]) can now be directly applied to pw.Json within UDFs to access elements of JSON objects, arrays, and strings.

Expanded Timestamp Functionality

  • Enhanced the table.col.dt.from_timestamp method to create DateTimeNaive from timestamps represented as ints or floats.
  • Calling the table.col.dt.timestamp method without the unit argument is now deprecated.

KNNIndex Enhancements

  • KNNIndex now supports returning computed distances.
  • Added support for cosine similarity in KNNIndex.

Deprecated Features

  • The offset argument of pw.stdlib.temporal.sliding and pw.stdlib.temporal.tumbling is deprecated. Use origin instead, as it represents a point in time, not a duration.
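
To make the distinction concrete, the sketch below assigns an event time to a tumbling window aligned to an origin, using only the standard library. It is an illustration of the idea that origin is a point in time, not Pathway's implementation. The helper name tumbling_window is hypothetical, and how Pathway treats events earlier than origin is not covered here.

```python
from datetime import datetime, timedelta


def tumbling_window(t, duration, origin):
    """Return the (start, end) of the fixed-size window containing `t`.

    Windows are aligned so that one of them starts exactly at `origin`.
    """
    index = (t - origin) // duration  # whole windows elapsed since origin (floored)
    start = origin + index * duration
    return start, start + duration


event = datetime(2024, 1, 1, 2, 30)
hour = timedelta(hours=1)

# Windows aligned to midnight: the event falls in [02:00, 03:00).
aligned = tumbling_window(event, hour, origin=datetime(2024, 1, 1, 0, 0))

# Windows aligned to 00:15: the same event falls in [02:15, 03:15).
shifted = tumbling_window(event, hour, origin=datetime(2024, 1, 1, 0, 15))
```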

Bug Fixes

DateTime Fixes

  • Sliding window now works correctly with UTC Datetimes.

asof_join Improvements

  • Temporal column in asof_join no longer has to be named t.
  • asof_join includes rows with equal times for all values of the direction parameter.
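
The equal-times behavior can be pictured with a small lookup over sorted times. This is a plain-Python sketch, not Pathway's asof_join: the two helpers are hypothetical and model only a backward and a forward lookup, on the assumption that those are among the supported directions.

```python
import bisect


def asof_backward(sorted_times, t):
    """Latest time that is <= t, or None if there is none."""
    i = bisect.bisect_right(sorted_times, t)
    return sorted_times[i - 1] if i else None


def asof_forward(sorted_times, t):
    """Earliest time that is >= t, or None if there is none."""
    i = bisect.bisect_left(sorted_times, t)
    return sorted_times[i] if i < len(sorted_times) else None


times = [1, 3, 5]

# An exact match is returned in both directions.
backward_equal = asof_backward(times, 3)
forward_equal = asof_forward(times, 3)
```

Using bisect_right for the backward lookup and bisect_left for the forward lookup is what makes both comparisons inclusive.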

Fixed Issues

  • Fixed an issue with pw.io.gdrive.read: Shared folders support is now working seamlessly.