Releases: y-scope/clp

v0.10.0

09 Mar 13:44
10d78e0

A release that includes multi-dataset query support, log-ingestor fault tolerance, external third-party service configuration, and several bug fixes and improvements.

Note

This release includes some changes that are incompatible with previous releases. If this affects you, reach out and we may be able to help with version migration. These changes are marked with "Breaking".

This release includes two tars:

  • clp-json for compressing and searching JSON logs
  • clp-text for compressing and searching unstructured text logs

Docs

The docs for this release are available here.

New features

  • clp-json: Add support for querying multiple datasets in a single query. (#1992, #2060, #2062)
    • Breaking:
      • The --dataset CLI flag can now be specified multiple times to search multiple datasets.
      • In API server requests the dataset field has been renamed to datasets and changed from a string to an array of strings.
  • log-ingestor: Add fault tolerance support. See the docs for the tolerance guarantees. (#1988, #2001, #2014, #2017, #2035, #2038, #2045, #2053, #2057)
    • Breaking:
      • In the API, job IDs are now numeric (database-generated) instead of UUIDs.
      • In the API, DELETE /jobs/{id} has been replaced with POST /jobs/{id}/terminate.
  • clp-json: Replace the DateString column type with the new Timestamp column type that supports nanosecond precision and is more compressible. (#1385, #1564, #1599, #1626, #1646, #1673, #1708, #1723, #1757, #1788, #1847, #1873, #2013, #2028)
    • Breaking:
      • Archives created with this version (v0.10.0) will not be readable by older CLP versions.
      • Datasets containing non-ISO 8601 timezone formats (e.g., the PostgreSQL sample dataset) are temporarily unsupported.
  • helm/docker-compose: Support configuration of external third-party services (database, queue, Redis, results cache) via the bundled list in clp-config.yaml. (#1648, #1681, #1694, #2056, #2066)
  • api-server: Add DELETE /query/{search_job_id} endpoint to cancel in-progress query jobs. (#1964)
  • presto-clp: Add support for custom S3 endpoint URLs and buckets, enabling use with S3-compatible storage providers (e.g., MinIO). (#1917)
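Two of the breaking changes listed above alter what API clients send: the API server's `dataset` field becomes a `datasets` array, and the log-ingestor's `DELETE /jobs/{id}` becomes `POST /jobs/{id}/terminate`. The following is a minimal client-side sketch in Python, not taken from the CLP docs. The `query_string` field, the base URL, and both helper names are hypothetical and used only for illustration.

```python
from urllib.request import Request


def migrate_query_payload(payload: dict) -> dict:
    """Return a copy of a pre-v0.10.0 payload with `dataset` replaced by `datasets`."""
    migrated = dict(payload)
    if "dataset" in migrated:
        # v0.10.0 renames the field and changes it from a string to an array of strings.
        migrated["datasets"] = [migrated.pop("dataset")]
    return migrated


def build_terminate_request(base_url: str, job_id: int) -> Request:
    """Build (but do not send) the v0.10.0 replacement for DELETE /jobs/{id}."""
    # Job IDs are numeric as of v0.10.0; the base URL is supplied by the caller.
    return Request(f"{base_url.rstrip('/')}/jobs/{job_id}/terminate", method="POST")


# `query_string` is a hypothetical field name; only `dataset`/`datasets` come from the notes.
old_payload = {"dataset": "default", "query_string": "level: ERROR"}
new_payload = migrate_query_payload(old_payload)

# Hypothetical host and port; substitute the address of your log-ingestor.
request = build_terminate_request("http://localhost:3002", 42)
```

The helper only builds the request object, so it can be inspected or tested without a running service. Sending it and handling the response is left to the caller.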

Bug fixes & improvements

  • helm: Remove all host-path volume mounts; use dynamically provisioned PVCs and stdout/stderr logging. (#2023, #2075, #2077)
    • Breaking: The storage section and data_directory/logs_directory/tmp_directory values in values.yaml have been removed. Existing custom Helm values referencing these fields must be updated.
  • log-ingestor: Increase buffer_flush_threshold default from 256 MiB to 4 GiB. (#1965)
    • Breaking: Since the default value has changed, any deployments relying on the default (rather than an explicit value) will experience different buffering behavior.
  • log-ingestor: Allow SQS listener jobs to spawn multiple concurrent tasks for higher throughput, with configurable concurrency and max wait time. (#1989)
  • webui: Cap compression metadata query results to 1000 to prevent OOM crash (fixes #2040). (#2041)
  • clp-json: Fix use-after-free in the SchemaMatch pass that could cause crashes during queries (fixes #1986). (#1990)
  • helm: Remove Job TTL to prevent init container CrashLoopBackOff on pod restarts (fixes #2043). (#2044)
  • clp-text: Generate all possible wildcard subqueries for non-capture schema-based search, preventing unnecessary archive decompression and improving search accuracy. (#1313, #1959, #1972)
  • clp-json/clp-text: Remove unused staged-streams volume mount from the api-server service (fixes #1810). (#1947)
  • clp-json/clp-text: Update results cache health check command to use mongosh --eval (resolves #1742). (#1832)
  • build: Add --pull flag to all Docker build commands to ensure the latest base images are used (fixes #1051). (#1943)
  • webui: Fix scroll.x console warning in VirtualTable (fixes #1892). (#1949)
  • helm: Add --clp-package-image CLI arg to setup scripts for local image testing. (#2020)

View the full changelog for more details.

Thanks to @20001020ycx, @Bill-hbrhbr, @davidlion, @gibber9809, @hoophalab, @jonathan-imanu, @junhaoliao, @LinZhihao-723, @Nathan903, @quinntaylormitchell, and @SharafMohamed for their contributions.

v0.9.0

04 Feb 23:10
8df42ec

A release that includes several bug fixes and improvements.

Note

This release includes some changes that are incompatible with previous releases. If this affects you, reach out and we may be able to help with version migration. These changes are marked with "Breaking".

This release includes two tars:

  • clp-json for compressing and searching JSON logs
  • clp-text for compressing and searching unstructured text logs

Docs

The docs for this release are available here.

Bug fixes & improvements

  • clp-json/clp-text: Reduce jitter when scheduling batches in the query scheduler, significantly improving performance for some queries. (#1899)
  • clp-json/clp-text: Report query job duration correctly for large query jobs (fixes #1874). (#1875)
  • log-ingestor: Reduce job creation delay by avoiding the AWS SDK’s default provider resolution (fixes #1915). (#1931)
  • clp-json/clp-text: Use full UUID for ephemeral container names to prevent collisions. (#1870)
  • api-server: Rename timestamp and results cache parameters for clarity. (#1886)
    • Breaking: Users of the API server should update their query keys to reflect the updated API.
  • clp-json/clp-text: Remove the archive tags feature. (#1842)
    • Breaking: The tags feature has been removed. Users who relied on tags to query different sets of logs can use the datasets feature instead.
  • clp-json: Allow disabling the log ingestor when the API server is enabled (fixes #1911). (#1927)
  • helm: Enable the log ingestor when ingesting logs from S3, to align with the CLP Docker-orchestrated package’s default configuration (fixes #1912). (#1913)
  • webui: Add support for compressing unstructured text logs using clp-json. (#1861)
  • helm: Make charts available through a Helm repo rather than a GitHub repo. (#1891)

View the full changelog for more details.

Thanks to @davidlion, @gibber9809, @hoophalab, @junhaoliao, @LinZhihao-723, and @quinntaylormitchell for their contributions.

v0.8.0

21 Jan 05:25
5e98104

A release focused on enhanced scalability and deployment flexibility: Helm chart support enables multi-node Kubernetes deployments with horizontal scaling, a new log ingestor service enables continuous ingestion from S3, and Spider provides an alternative distributed compression orchestrator. This release also adds support for custom S3 endpoints and many other fixes and improvements.

Note

This release includes some changes that are incompatible with previous releases. If this affects you, reach out and we may be able to help with version migration. These changes are marked with "Breaking".

This release includes two tars:

  • clp-json for compressing and searching JSON logs
  • clp-text for compressing and searching unstructured text logs

Docs

The docs for this release are available here.

New features

  • clp-json/clp-text: Add Helm chart for multi-node Kubernetes deployments, enabling horizontal scaling of compression and query workloads across node pools with configurable pod scheduling. (#1614, #1698, #1700, #1749, #1759, #1771, #1772, #1784, #1814, #1815, #1816, #1817, #1818, #1819, #1822, #1825, #1827, #1829, #1846, #1856, #1882, #1885, #1887)
  • clp-json: Add a log ingestor for continuous log ingestion from S3, with support for periodic bucket scanning or event-driven ingestion via SQS notifications. (#1517, #1523, #1535, #1537, #1538, #1552, #1602, #1629, #1704, #1715, #1729, #1736, #1740, #1741, #1776, #1783, #1789, #1856, #1882)
  • clp-json: Add support for ingesting logs from S3-compatible object storage endpoints other than AWS S3 (e.g., MinIO, Ceph). (#1767, #1776, #1796)
  • clp-json: Replace the experimental date(...) function with timestamp("<value>"[, "<pattern>"]) for comparing values against timestamp strings. (#1757)
    • Breaking: This replaces the experimental date(...) function with the timestamp(...) function in KQL queries.
  • clp-json/clp-text: Add Spider as an alternative compression orchestrator to Celery, providing a fast and scalable distributed task execution framework. (#1318, #1331, #1340, #1353, #1354, #1414, #1416, #1453, #1606, #1647, #1765, #1821, #1828)
    • Breaking: In the CLP Package config, database.name is now database.names, a dictionary with clp and spider keys for separate database instances.
  • webui: Add file browser for selecting local files when submitting compression jobs. (#1292, #1688, #1786)
  • webui: Display dataset name and input paths in the compression jobs table. (#1798)
  • clp-json: Add S3 storage backend for query results, with query workers uploading results to S3 and the API server streaming them to clients. (#1722, #1728)
  • clp-json/clp-text: Add configurable restart policy for CLP Package services when deployed with Docker Compose. (#1754, #1773)
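The `database.name` to `database.names` change above can be applied to an already-parsed config as a small transformation. This is a rough Python sketch, assuming the config has been loaded into a plain dict. The helper name, the `host` key, and the sample database names are hypothetical. Only the `names` dictionary with `clp` and `spider` keys comes from the release notes.

```python
import copy


def migrate_database_names(config: dict, spider_db_name: str) -> dict:
    """Convert `database.name` (str) into `database.names` ({"clp": ..., "spider": ...})."""
    migrated = copy.deepcopy(config)
    database = migrated.get("database", {})
    if "name" in database:
        database["names"] = {
            "clp": database.pop("name"),  # keep the existing database name for CLP
            "spider": spider_db_name,  # caller-supplied; no default is assumed here
        }
    return migrated


# Illustrative values only; they are not CLP defaults.
old_config = {"database": {"host": "localhost", "name": "clp-db"}}
new_config = migrate_database_names(old_config, spider_db_name="spider-db")
```

The function works on a deep copy, so the original mapping can still be compared against the result before anything is written back to disk.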

Bug fixes & improvements

  • clp-json/clp-text: Only calculate job duration for started query jobs (fixes #1806). (#1809)
  • clp-json/clp-text: Improve wildcard query performance by narrowing token type inference during subquery generation. (#1865)
  • webui: Integrate time range presets directly into the date picker with human-readable descriptions for easier date range selection. (#1676, #1724, #1734, #1762)
  • kv-ir: Support matching KQL timestamp literals against integer and float columns and drop support for matching them against strings. (#1848)
    • Breaking: Timestamp literals no longer match against string columns.
  • clp-json/clp-text: Ensure --count and --count-by-time arguments are mutually exclusive when passed to sbin/search.sh. (#1871)
  • webui: Use default dataset name when set to empty string (fixes #1854). (#1855)

View the full changelog for more details.

Thanks to @Bill-hbrhbr, @davemarco, @Eden-D-Zhang, @gibber9809, @hoophalab, @jackluo923, @junhaoliao, @kirkrodrigues, @LinZhihao-723, @quinntaylormitchell, @SharafMohamed, @sitaowang1998, and @sudheergajula for their contributions.

v0.7.0

03 Dec 18:04
68b38f1

A release that adds an API server, support for submitting compression jobs from the UI, improved performance for concurrent compression jobs, and other improvements.

Note

This release includes some changes that are incompatible with previous releases. If this affects you, reach out and we may be able to help with version migration. These changes are marked with "Breaking".

This release includes two tars:

  • clp-json for compressing and searching JSON logs
  • clp-text for compressing and searching unstructured text logs

Docs

The docs for this release are available here.

New features

Bug fixes & improvements

  • clp-json/clp-text: Allow concurrent compression job processing by processing batches of compression tasks per job. (#1637)
  • webui: Visual and functional improvements to the UI for querying archives through Presto. (#1578, #1569, #1631, #1638)
  • kv-ir: Performance improvements for compressing KV-IR files into clp-json/clp-s archives. (#1468, #1544, #1561, #1607)
  • clp-json/clp-text: Configuration renaming for consistency:
    • Rename .yml files to .yaml. (#1617)
      • Breaking: This is a rename of CLP’s config files (etc/credentials.yml to etc/credentials.yaml and etc/clp-config.yml to etc/clp-config.yaml).
    • Rename user to username; update default user to clp-user. (#1610)
      • Breaking: This is a change to CLP’s credential file format (etc/credentials.yaml).
  • clp-json/clp-text: Add support for multiple database user credentials. (#1655, #1718)
    • Breaking: CLP’s credentials file now requires root user credentials for the database.
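The file and key renames above are mechanical, so an upgrade script could apply them. Below is a minimal Python sketch under two assumptions: the config files live in an `etc` directory passed in by the caller, and the credentials file has been parsed into nested dicts. Both helper names are hypothetical, and the sketch does not add the root user credentials that are now required.

```python
from pathlib import Path


def rename_yml_configs(etc_dir: Path) -> list[Path]:
    """Rename credentials.yml and clp-config.yml under `etc_dir` to their .yaml names."""
    renamed = []
    for stem in ("credentials", "clp-config"):
        old_path = etc_dir / f"{stem}.yml"
        new_path = old_path.with_suffix(".yaml")
        # Skip files that are absent, and never overwrite an existing .yaml file.
        if old_path.is_file() and not new_path.exists():
            old_path.rename(new_path)
            renamed.append(new_path)
    return renamed


def rename_user_key(credentials: dict) -> dict:
    """Return a copy of a parsed credentials mapping with every `user` key renamed to `username`."""
    result = {}
    for key, value in credentials.items():
        new_key = "username" if key == "user" else key
        # Recurse into nested mappings; other values are copied by reference.
        result[new_key] = rename_user_key(value) if isinstance(value, dict) else value
    return result
```

Because `rename_yml_configs` refuses to overwrite an existing `.yaml` file, running it twice is harmless: the second call returns an empty list.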

View the full changelog for more details.

Thanks to @20001020ycx, @Bill-hbrhbr, @davemarco, @davidlion, @Eden-D-Zhang, @gibber9809, @hoophalab, @junhaoliao, @kirkrodrigues, @LinZhihao-723, @quinntaylormitchell, and @sitaowang1998 for their contributions.

v0.6.0

10 Nov 12:45

A release that adds an MCP server, a guided query UI for Presto, expands S3 compression options, migrates CLP’s orchestration to use Docker Compose, and much more.

Note

This release includes some changes that are incompatible with previous releases. If this affects you, reach out and we may be able to help with version migration. These changes are marked with "Breaking".

This release includes two tars:

  • clp-json for compressing and searching JSON logs
  • clp-text for compressing and searching unstructured text logs

Docs

The docs for this release are available here.

New features

Bug fixes & improvements

  • webui: Display full dataset names in the dataset selector popup. (#1400)
  • webui: Provide improved calculation of time range for queries. (#1336)
  • webui: Show query speed in search status. (#1429)
  • clp-json/clp-s: Fix incorrect rounding when comparing float literals with integer values (fixes #1375). (#1369)
  • clp-json/clp-text: Write compression task failure errors to a log, and store the log path (instead of the errors) in the job’s status (fixes #716). (#1425)
  • clp-json: Expand ~ in aws_config_directory config (fixes #1257). (#1258)
  • clp-json/clp-text: Ensure at least one worker is created even on single-core machines (fixes #1509).

View the full changelog for more details.

Thanks to @20001020ycx, @All-less, @AVMatthews, @Bill-hbrhbr, @davemarco, @Eden-D-Zhang, @gibber9809, @hoophalab, @junhaoliao, @kirkrodrigues, @LinZhihao-723, @quinntaylormitchell, @rishikeshdevsot, @sitaowang1998, and @wraymo for their contributions.

v0.5.1

24 Sep 18:16

A release that adds support for retaining the format of floating-point numbers from JSON logs, support for CLP’s UI to work with Presto, and some other improvements and bug fixes.

This release includes two tars:

  • clp-json for compressing and searching JSON logs
  • clp-text for compressing and searching unstructured text logs

Docs

The docs for this release are available here.

New features

Bug fixes & improvements

  • clp-json/clp-text: Add support for gracefully shutting down the compression scheduler and compression workers (resolves #1037). (#1169, #1323)
  • core: Unescape variable strings before dictionary lookup in EncodedVariableInterpreter::encode_and_search_dictionary (fixes #590). (#1270)
  • clp-s: Handle pure wildcards and unexpected literal types correctly in EvaluateTimestampIndex (fixes #1096). (#1277)
  • kv-ir: Add support for getting the number of log events read from the deserializer. (#1282)
  • kv-ir: Add support for duplicate columns in projections. (#1245)
  • clp-json/clp-text: Consider all non-loopback IPv4s when selecting listen address to receive search results (fixes #1316). (#1317)
  • clp-json/clp-text: Fail job and report failure to user for compression jobs that encounter at least one invalid input path (fixes #308). (#1125)
  • clp-json: Warn the user if they do not use the --timestamp-key flag when compressing with the clp-s storage engine. (#1283)
  • webui: Expose rate limit configuration (fixes #1019); Increase rate limit to 1000 req/min (fixes #1020). (#1234)
    • NOTE: This adds the configuration key webui.rate_limit to etc/clp-config.yml.

View the full changelog for more details.

Thanks to @anlowee, @AVMatthews, @Bill-hbrhbr, @davemarco, @gibber9809, @haiqi96, @hoophalab, @junhaoliao, @kirkrodrigues, @LinZhihao-723, and @quinntaylormitchell for their contributions.

v0.5.0

22 Aug 02:36

A release that includes support for configuring retention periods for archives, deleting datasets entirely, and some other features and bug fixes.

Note

This release includes some changes that are incompatible with previous releases. If this affects you, reach out and we may be able to help with version migration. These changes are marked with "Breaking".

This release includes two tars:

  • clp-json for compressing and searching JSON logs
  • clp-text for compressing and searching unstructured text logs

Docs

The docs for this release are available here.

New features

  • clp-json/clp-text: Add support for configuring retention periods for archives and search results. (#1035, #1181, #1205, #1231)
  • clp-json: Add dataset-manager tool to support listing datasets, and deleting them entirely. (#1144, #1215, #1225)

Bug fixes & improvements

  • clp-json/clp-text: Fix scheduler freezes by updating celery to 5.5.3 with the redis extra (replacing the direct redis dependency) (fixes #1059). (#1213)
  • clp-json/clp-text: Remove dependency on native libraries for scripts run on the host (fixes #895, #1185). (#1105, #1197)
  • core-clp: Preserve escaped ?-wildcards in queries (fixes #243). (#1070)
  • core: Unify clp-s and clp’s unstructured text parsing and search code. (#1101, #1103, #1112, #1138, #1143, #1163)
    • Breaking: This is a change to the archive format used by clp-json (and the core clp-s binary).
  • webui: Delete old search results after 60min by default. (#1231)
    • Breaking: Previously, the results of the last query executed before refreshing the page would be retained indefinitely.
  • webui: Update query time range on any change in the UI to prevent query submission with out-of-date time range. (#1171)
  • webui: Focus/refocus query input box for ease of use. (#1160)
  • clp-json/clp-text: Only mount stream_output_dir when stream storage type is FS, to allow running the webui on a different node when using S3 storage. (#1129)
  • clp-json/clp-text: Explicitly convert enum to integer to ensure accurate conversions when using the mysql Python library. (#1133)
  • webui: Cast BIGINT values as UNSIGNED to fix MySQL-specific type errors in dashboard stats queries (fixes #1137). (#1136)
  • clp-json/clp-text: Mark incomplete jobs as failed when schedulers restart. (#1208)
  • clp-json/clp-text: Add verbose logging option to archive_manager and simplify output on errors. (#1173)
  • clp-json/clp-text: Add query_engine option to clp-config.yml to support starting only compression and UI components when using the Presto query engine. (#1095)
    • Breaking: This is a change to clp-json/clp-text’s config file format.
  • core: Replace YAML config with CLI args and env vars for metadata DB. (#1148)
    • Breaking: This is a change to how the core clp/clp-s binaries can be configured to use a MySQL-based database (the change is transparent to clp-json/clp-text users).
  • core: Address CVE-2024-3094 and CVE-2025-31115 for xz/lzma dependency (fixes #1093). (#1094)

View the full changelog for more details.

Thanks to @anlowee, @Bill-hbrhbr, @davemarco, @davidlion, @gibber9809, @haiqi96, @hoophalab, @jackluo923, @junhaoliao, @kirkrodrigues, @LinZhihao-723, @quinntaylormitchell, @SharafMohamed, and @wraymo for their contributions.

v0.4.0

10 Jul 14:36

A release that includes support for organizing compressed logs into datasets, a revamped UI, support for a new IR format designed for structured log events, and more advanced metadata storage within clp-json archives. This release also includes some other features and bug fixes.

Warning

This release contains bug #1059 which can cause compression or search to freeze. This has been fixed in v0.5.0, so we recommend using v0.5.0 or a later release.

Note

We are bumping the minor version due to several breaking changes including format changes in clp-text/clp-json's archive formats, changes to clp-json/clp-s' query syntax, changes to clp-json's config file format, changes to the command line interface, and changes to CLP's runtime requirements. If this affects you, reach out and we may be able to help with version migration.

The CLP release includes two tars:

  • clp-json for compressing and searching JSON logs
  • clp-text for compressing and searching unstructured text logs

Docs

The docs for this release are available here.

New features

Bug fixes & improvements

  • clp-s: Improve compression ratio by delta-encoding the log-order column. (#1021)
  • clp-json/clp-text: Add support for configuring the compression level. (#774)
  • clp-json: Add support for more AWS authentication methods (e.g., EC2 instance metadata, environment variables, AWS profiles, etc.). (#743, #788, #852)
  • clp-json/clp-text: Don't remap output paths when mounting them into CLP's execution container (fixes #960). (#998)
  • core: Try to exhaust Zstd's internal buffers when they might contain unconsumed data (fixes #976). (#977)
  • clp-json/clp-text: Use job duration for final compression speed summary. (#823)
  • core: Update DictionaryReader::get_entry_matching_value to handle case-insensitive searches (fixes #648). (#690)
  • webui: Improve tracking of compression jobs submitted in quick succession (fixes #667). (#679)

View the full changelog for more details.

Thanks to @aestriplex, @anlowee, @AVMatthews, @Bill-hbrhbr, @davemarco, @davidlion, @Eden-D-Zhang, @gibber9809, @haiqi96, @Henry8192, @hoophalab, @jackluo923, @junhaoliao, @kirkrodrigues, @LinZhihao-723, @quinntaylormitchell, @SharafMohamed, @sitaowang1998, and @wraymo for their contributions.

v0.3.0

25 Jan 05:11

A release that adds support for using clp-json to both compress logs from object storage and store archives on object storage (docs). This release also includes some other features and bug fixes.

NOTE: We are bumping the minor version due to a breaking format change in clp-text/clp-json’s jobs table format and in clp-json/clp-s’ archive format. If this affects you, reach out and we can help with version migration.

The CLP release includes two tars:

  • clp-json for compressing and searching JSON logs
  • clp-text for compressing and searching unstructured text logs

Docs

The docs for this release are available here.

New features

  • clp-json: Add support for compressing logs from S3. (#651)
  • clp-json: Add support for storing, decompressing, and searching archives from S3. (#634, #674, #683)
  • clp-json: Add support for viewing logs from S3. (#662, #673, #678)
  • clp-s: Add support for compressing logs from S3. (#639)
  • clp-s: Add support for writing single-file archives. (#563)
  • clp-s: Add support for reading and searching single-file archives, including from S3. (#656)
  • Add BoundedReader to prevent out-of-bound reads in segmented input streams. (#624)

Bug fixes & improvements

  • clp-json: Add option to output search results as raw logs. (#641)
  • clp-package: Unify the metadata schema for JSON and IR streams. (#620)
  • clp-package: Enable replica set for the MongoDB results cache and configure it when starting the package. (#632)
  • clp: Advance to the next message when a message has an out-of-range timestamp when searching archives (fixes #659). (#660)
  • clp-s: Unescape string values during ingestion and fix support for search using escape sequences. (#622)
  • clp-s: Improve error reporting for directory-creation failure during compression. (#671, #684)
  • clp-s: Rename tables section to use segment numbering scheme. (#666)
  • Handle 0-byte reads when BufferReader's underlying buffer is fully consumed. (#687)
  • Add missing libcurl4 dependency to clp-core and package execution containers. (#670)
  • Disable file system translation in checkinstall during dependency installation (fixes #642). (#644)

View the full changelog for more details.

Thanks to @AVMatthews, @Bill-hbrhbr, @Eden-D-Zhang, @LinZhihao-723, @gibber9809, @haiqi96, @jackluo923, @junhaoliao, and @kirkrodrigues for their contributions.

v0.2.1

04 Dec 04:23

A release that adds support for viewing JSON search results in context by opening the archive that contains them using the log viewer. This release also includes some features that improve usability.

The CLP release includes two tars:

  • clp-json for compressing and searching JSON logs
  • clp-text for compressing and searching unstructured text logs

Docs

The docs for this release are available here.

New features

  • Support for viewing JSON search results in context by opening the archive that contains them using the log viewer. (#569, #584, #596, #600, #615)
    • Note that the log viewer doesn’t open the archive directly but rather CLP decompresses the archive into chunks of JSONL files that the log viewer opens. In a future release, these chunks will be IR files to lower resource usage.
  • clp-json: Support for querying fields whose keys contain periods by escaping them with a backslash. (#560, #617)
  • Support for deleting archives that are entirely within a time range. (#594)
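When queries are generated programmatically, the backslash escaping for keys that contain periods can be applied by a small helper. This is a minimal Python sketch, assuming that an unescaped period separates nested keys and that the keys contain no backslashes of their own. The helper names and the sample keys are hypothetical, and the full escaping rules should be checked against the docs.

```python
def escape_key(key: str) -> str:
    """Escape literal periods in a single key so they are not read as nesting separators."""
    return key.replace(".", "\\.")


def build_key_path(*keys: str) -> str:
    """Join nested keys with '.', escaping periods that belong to a key itself."""
    return ".".join(escape_key(key) for key in keys)


# Sample keys are illustrative; the last one contains periods that must be escaped.
path = build_key_path("kubernetes", "labels", "app.kubernetes.io/name")
```

Escaping each key before joining keeps the two meanings of the period apart: the separators added by `join` stay unescaped, while periods inside a key are always prefixed with a backslash.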

Bug fixes & improvements

  • Homebrew path detection for mariadb-connector-c to fix macOS build failures. (#582)

View the full changelog for more details.

Thanks to @AVMatthews, @Bill-hbrhbr, @LinZhihao-723, @anlowee, @gibber9809, @haiqi96, @junhaoliao, @kirkrodrigues, and @wraymo for their contributions.