Releases: amikos-tech/pure-tokenizers

Go Module v0.1.5

04 Mar 21:12
02a3420

Go Module Release

This release of the Go tokenizers module is compatible with Rust library version built-locally.

Installation

go get github.com/amikos-tech/pure-tokenizers@v0.1.5

Features

  • CGo-free implementation using purego
  • Automatic library download and caching
  • Support for multiple platforms (Linux, macOS, Windows)
  • Compatible with HuggingFace tokenizers

Requirements

  • Go 1.24 or later
  • Compatible Rust tokenizers library (downloaded automatically)

Environment Variables

  • TOKENIZERS_LIB_PATH: Override library path
  • TOKENIZERS_VERSION: Specific Rust library version to use
  • GITHUB_TOKEN / GH_TOKEN: Optional GitHub API auth for fallback requests
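The variables above could be read along the following lines. This is a minimal sketch, not the module's own code: the LibConfig type and configFromEnv helper are hypothetical names, and checking GITHUB_TOKEN before GH_TOKEN is an assumption about precedence that the release notes do not state.

```go
package main

import (
	"fmt"
	"os"
)

// LibConfig is a hypothetical holder for the settings listed above.
// It is not a type exported by pure-tokenizers.
type LibConfig struct {
	LibPath     string // TOKENIZERS_LIB_PATH; empty means no override
	Version     string // TOKENIZERS_VERSION; empty means no pinned version
	GitHubToken string // GITHUB_TOKEN, else GH_TOKEN (assumed precedence)
}

// configFromEnv reads the variables through the supplied lookup function,
// so the logic can be exercised without touching the real environment.
func configFromEnv(getenv func(string) string) LibConfig {
	token := getenv("GITHUB_TOKEN")
	if token == "" {
		token = getenv("GH_TOKEN")
	}
	return LibConfig{
		LibPath:     getenv("TOKENIZERS_LIB_PATH"),
		Version:     getenv("TOKENIZERS_VERSION"),
		GitHubToken: token,
	}
}

func main() {
	cfg := configFromEnv(os.Getenv)
	fmt.Printf("library path override: %q\n", cfg.LibPath)
	fmt.Printf("pinned Rust version:   %q\n", cfg.Version)
	fmt.Printf("GitHub auth present:   %t\n", cfg.GitHubToken != "")
}
```

Passing the lookup function as a parameter keeps the helper easy to test with a plain map instead of process-wide environment changes.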

Documentation

See the README for detailed usage instructions.

What's Changed

  • chore(rust): bump tokenizers to 0.22.2 by @tazarov in #103

Full Changelog: v0.1.4...v0.1.5

Rust Library rust-v0.1.5

04 Mar 21:06
02a3420

Rust Tokenizers Library Release

This release contains pre-built tokenizers libraries for multiple platforms.

Supported Platforms

  • Linux x86_64 (GNU)
  • Linux aarch64 (GNU)
  • Linux x86_64 (MUSL)
  • Linux aarch64 (MUSL)
  • macOS x86_64
  • macOS aarch64 (Apple Silicon)
  • Windows x86_64

Installation

Download the appropriate archive for your platform and extract the library file.
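To pick the right entry from the platform list above, the Go runtime's GOOS and GOARCH values can be mapped to those names. The sketch below is illustrative only: platformLabel is a hypothetical helper, the archive file names are not given in these notes and are therefore not produced, and the GNU versus MUSL choice is passed in as a flag because the runtime package does not report the libc in use.

```go
package main

import (
	"fmt"
	"runtime"
)

// platformLabel maps GOOS/GOARCH values to the platform names in the
// list above. The second result is false for combinations that the
// list does not cover.
func platformLabel(goos, goarch string, musl bool) (string, bool) {
	arch := map[string]string{"amd64": "x86_64", "arm64": "aarch64"}[goarch]
	if arch == "" {
		return "", false
	}
	switch goos {
	case "linux":
		libc := "GNU"
		if musl {
			libc = "MUSL"
		}
		return fmt.Sprintf("Linux %s (%s)", arch, libc), true
	case "darwin":
		if arch == "aarch64" {
			return "macOS aarch64 (Apple Silicon)", true
		}
		return "macOS x86_64", true
	case "windows":
		// Only x86_64 is listed for Windows.
		if arch == "x86_64" {
			return "Windows x86_64", true
		}
	}
	return "", false
}

func main() {
	label, ok := platformLabel(runtime.GOOS, runtime.GOARCH, false)
	if !ok {
		fmt.Println("no pre-built library is listed for this platform")
		return
	}
	fmt.Println("matching archive platform:", label)
}
```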

Usage with Go bindings

The Go bindings will automatically download these libraries when needed.
You can also set the TOKENIZERS_LIB_PATH environment variable to use a specific library.

R2 Release Endpoint

Release assets are also published to:
https://releases.amikos.tech/pure-tokenizers/rust-v0.1.5/
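
The endpoint above appears to follow a base-URL-plus-tag pattern. A small sketch of building that directory URL for a given tag is shown here. The releaseURL helper is a hypothetical name, and the layout beneath the tag directory (asset names, index files) is not described in these notes, so the sketch stops at the directory.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// r2Base is the base of the endpoint quoted in the release notes.
const r2Base = "https://releases.amikos.tech/pure-tokenizers/"

// releaseURL returns the R2 directory URL for a release tag such as
// "rust-v0.1.5". The tag is path-escaped so unusual characters cannot
// alter the URL structure.
func releaseURL(tag string) (string, error) {
	if tag == "" {
		return "", errors.New("empty release tag")
	}
	return r2Base + url.PathEscape(tag) + "/", nil
}

func main() {
	u, err := releaseURL("rust-v0.1.5")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u) // prints https://releases.amikos.tech/pure-tokenizers/rust-v0.1.5/
}
```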

What's Changed

  • chore(rust): bump tokenizers to 0.22.2 by @tazarov in #103

Full Changelog: v0.1.4...rust-v0.1.5

Go Module v0.1.4

28 Feb 17:38
6db29c3

Go Module Release

This release of the Go tokenizers module is compatible with Rust library version built-locally.

Installation

go get github.com/amikos-tech/pure-tokenizers@v0.1.4

Features

  • CGo-free implementation using purego
  • Automatic library download and caching
  • Support for multiple platforms (Linux, macOS, Windows)
  • Compatible with HuggingFace tokenizers

Requirements

  • Go 1.24 or later
  • Compatible Rust tokenizers library (downloaded automatically)

Environment Variables

  • TOKENIZERS_LIB_PATH: Override library path
  • TOKENIZERS_VERSION: Specific Rust library version to use
  • GITHUB_TOKEN / GH_TOKEN: Optional GitHub API auth for fallback requests

Documentation

See the README for detailed usage instructions.

What's Changed

  • [TST] Eliminate HFHubBaseURL global mutations in tests by @tazarov in #100
  • [REL] Publish signed releases index to R2 by @tazarov in #101
  • release: externalize and locally test releases index generation by @tazarov in #102

Full Changelog: v0.1.3...v0.1.4

Rust Library rust-v0.1.4

28 Feb 17:34
6db29c3

Rust Tokenizers Library Release

This release contains pre-built tokenizers libraries for multiple platforms.

Supported Platforms

  • Linux x86_64 (GNU)
  • Linux aarch64 (GNU)
  • Linux x86_64 (MUSL)
  • Linux aarch64 (MUSL)
  • macOS x86_64
  • macOS aarch64 (Apple Silicon)
  • Windows x86_64

Installation

Download the appropriate archive for your platform and extract the library file.

Usage with Go bindings

The Go bindings will automatically download these libraries when needed.
You can also set the TOKENIZERS_LIB_PATH environment variable to use a specific library.

R2 Release Endpoint

Release assets are also published to:
https://releases.amikos.tech/pure-tokenizers/rust-v0.1.4/

What's Changed

  • [TST] Eliminate HFHubBaseURL global mutations in tests by @tazarov in #100
  • [REL] Publish signed releases index to R2 by @tazarov in #101
  • release: externalize and locally test releases index generation by @tazarov in #102

Full Changelog: v0.1.3...rust-v0.1.4

Go Module v0.1.3

28 Feb 12:34
0968418

Go Module Release

This release of the Go tokenizers module is compatible with Rust library version built-locally.

Installation

go get github.com/amikos-tech/pure-tokenizers@v0.1.3

Features

  • CGo-free implementation using purego
  • Automatic library download and caching
  • Support for multiple platforms (Linux, macOS, Windows)
  • Compatible with HuggingFace tokenizers

Requirements

  • Go 1.24 or later
  • Compatible Rust tokenizers library (downloaded automatically)

Environment Variables

  • TOKENIZERS_LIB_PATH: Override library path
  • TOKENIZERS_VERSION: Specific Rust library version to use
  • GITHUB_TOKEN / GH_TOKEN: Optional GitHub API auth for fallback requests

Documentation

See the README for detailed usage instructions.

What's Changed

  • ci: publish signed Rust releases to R2 by @tazarov in #98
  • Harden release downloader and finalize releases-first model by @tazarov in #99

Full Changelog: v0.1.2...v0.1.3

Rust Library rust-v0.1.3

26 Feb 17:10
c5931d2

Rust Tokenizers Library Release

This release contains pre-built tokenizers libraries for multiple platforms.

Supported Platforms

  • Linux x86_64 (GNU)
  • Linux aarch64 (GNU)
  • Linux x86_64 (MUSL)
  • Linux aarch64 (MUSL)
  • macOS x86_64
  • macOS aarch64 (Apple Silicon)
  • Windows x86_64

Installation

Download the appropriate archive for your platform and extract the library file.

Usage with Go bindings

The Go bindings will automatically download these libraries when needed.
You can also set the TOKENIZERS_LIB_PATH environment variable to use a specific library.

R2 Release Endpoint

Release assets are also published to:
https://releases.amikos.tech/pure-tokenizers/rust-v0.1.3/

What's Changed

  • ci: publish signed Rust releases to R2 by @tazarov in #98

Full Changelog: rust-v0.1.2...rust-v0.1.3

Go Module v0.1.2

07 Nov 14:07
a677c0a

Go Module Release

This release of the Go tokenizers module is compatible with Rust library version built-locally.

Installation

go get github.com/amikos-tech/pure-tokenizers@v0.1.2

Features

  • CGo-free implementation using purego
  • Automatic library download and caching
  • Support for multiple platforms (Linux, macOS, Windows)
  • Compatible with HuggingFace tokenizers

Requirements

  • Go 1.24 or later
  • Compatible Rust tokenizers library (downloaded automatically)

Environment Variables

  • TOKENIZERS_LIB_PATH: Override library path
  • TOKENIZERS_GITHUB_REPO: Custom GitHub repository
  • TOKENIZERS_VERSION: Specific Rust library version to use

Documentation

See the README for detailed usage instructions.

What's Changed

  • Implement EncodePair method for Tokenizer by @tazarov in #96

Full Changelog: v0.1.1...v0.1.2

Rust Library rust-v0.1.2

07 Nov 13:57
a677c0a

Rust Tokenizers Library Release

This release contains pre-built tokenizers libraries for multiple platforms.

Supported Platforms

  • Linux x86_64 (GNU)
  • Linux aarch64 (GNU)
  • Linux x86_64 (MUSL)
  • Linux aarch64 (MUSL)
  • macOS x86_64
  • macOS aarch64 (Apple Silicon)
  • Windows x86_64

Installation

Download the appropriate archive for your platform and extract the library file.

Usage with Go bindings

The Go bindings will automatically download these libraries when needed.
You can also set the TOKENIZERS_LIB_PATH environment variable to use a specific library.

What's Changed

  • Implement EncodePair method for Tokenizer by @tazarov in #96

Full Changelog: rust-v0.1.1...rust-v0.1.2

Go Module v0.1.1

03 Oct 09:42
828678a

Go Module Release

This release of the Go tokenizers module is compatible with Rust library version built-locally.

Installation

go get github.com/amikos-tech/pure-tokenizers@v0.1.1

Features

  • CGo-free implementation using purego
  • Automatic library download and caching
  • Support for multiple platforms (Linux, macOS, Windows)
  • Compatible with HuggingFace tokenizers

Requirements

  • Go 1.24 or later
  • Compatible Rust tokenizers library (downloaded automatically)

Environment Variables

  • TOKENIZERS_LIB_PATH: Override library path
  • TOKENIZERS_GITHUB_REPO: Custom GitHub repository
  • TOKENIZERS_VERSION: Specific Rust library version to use

Documentation

See the README for detailed usage instructions.

What's Changed

  • [ENH] Dual cache system for HuggingFace tokenizers by @tazarov in #71
  • [DOC] Add benchmark comparison table to README by @tazarov in #72
  • [TST] Test suite for HuggingFace integration by @tazarov in #73
  • [ENH] Add debug logging for Retry-After header handling by @tazarov in #74
  • [ENH] Add file size validation for tokenizer downloads by @tazarov in #75
  • [TST] Add concurrent cache access tests by @tazarov in #76
  • [TST] Add failure injection tests for HuggingFace downloads by @tazarov in #77
  • [BUG] Fix TestConcurrentCacheEviction flaky on Windows by @tazarov in #79
  • refactor: extract concurrent error classification to helper function by @tazarov in #82
  • [ENH] Add sentinel error for cache-not-found condition by @tazarov in #83
  • [TST] Handle Windows file locking in concurrent cache eviction test by @tazarov in #85
  • [PERF] Improve resource management in BenchmarkFromHuggingFace by @tazarov in #86
  • [CHORE] Reduce workflow redundancy with composite actions by @tazarov in #87
  • [ENH] Add glob pattern support for cache operations by @tazarov in #88
  • fix: resolve release workflow issues for v0.1.1 by @tazarov in #92
  • fix: remove duplicate containsSubstring function by @tazarov in #94

Full Changelog: v0.1.0...v0.1.1

Rust Library rust-v0.1.1

03 Oct 09:37
828678a

Rust Tokenizers Library Release

This release contains pre-built tokenizers libraries for multiple platforms.

Supported Platforms

  • Linux x86_64 (GNU)
  • Linux aarch64 (GNU)
  • Linux x86_64 (MUSL)
  • Linux aarch64 (MUSL)
  • macOS x86_64
  • macOS aarch64 (Apple Silicon)
  • Windows x86_64

Installation

Download the appropriate archive for your platform and extract the library file.

Usage with Go bindings

The Go bindings will automatically download these libraries when needed.
You can also set the TOKENIZERS_LIB_PATH environment variable to use a specific library.

What's Changed

  • [ENH] Dual cache system for HuggingFace tokenizers by @tazarov in #71
  • [DOC] Add benchmark comparison table to README by @tazarov in #72
  • [TST] Test suite for HuggingFace integration by @tazarov in #73
  • [ENH] Add debug logging for Retry-After header handling by @tazarov in #74
  • [ENH] Add file size validation for tokenizer downloads by @tazarov in #75
  • [TST] Add concurrent cache access tests by @tazarov in #76
  • [TST] Add failure injection tests for HuggingFace downloads by @tazarov in #77
  • [BUG] Fix TestConcurrentCacheEviction flaky on Windows by @tazarov in #79
  • refactor: extract concurrent error classification to helper function by @tazarov in #82
  • [ENH] Add sentinel error for cache-not-found condition by @tazarov in #83
  • [TST] Handle Windows file locking in concurrent cache eviction test by @tazarov in #85
  • [PERF] Improve resource management in BenchmarkFromHuggingFace by @tazarov in #86
  • [CHORE] Reduce workflow redundancy with composite actions by @tazarov in #87
  • [ENH] Add glob pattern support for cache operations by @tazarov in #88
  • fix: resolve release workflow issues for v0.1.1 by @tazarov in #92
  • fix: remove duplicate containsSubstring function by @tazarov in #94

Full Changelog: rust-v0.1.0...rust-v0.1.1