Releases: huggingface/xet-core

[git-xet v0.2.0] Better Windows support, new command and performance improvement

21 Nov 18:28
eeee211

  1. Extends support to Windows and to SSH remote URLs, so all three major platforms are now supported with both HTTP and SSH remote URLs.
  2. Adds a “git xet track” command that replaces “git lfs track” to unify branding.
  3. Improves upload performance: git-xet no longer recomputes the SHA-256 and instead uses the value passed in from git-lfs.

Full Changelog: git-xet-v0.1.0...git-xet-v0.2.0

[hf-xet v1.2.0] New logging system, Free-threaded Python, Performance Improvements

24 Oct 19:03
50c6940

✨ New Features and Improvements

  • New file-based logging system to support enhanced diagnostics and debugging (by @hoytak in #502)
  • SOCKS5 Proxy support (by @SuperKenVery in #474)
  • Support for Free-threaded Python 3.13 and 3.14 (by @rajatarya @seanses in #524)
  • Improved performance by disabling disk-based chunk cache by default (by @rajatarya in #535)
  • Updated Rust edition to 2024 and upgraded rustc to 1.89 (by @seanses in #494)

Full Changelog: v1.1.10...v1.2.0

[git-xet v0.1.0] Git-Xet: "git push" with Xet protocol

01 Oct 23:50
6fbde98

Git-Xet is a Git LFS custom transfer agent that implements upload and download of files using the Xet protocol. Follow your regular workflow (git lfs track ..., git add ..., git commit ..., git push) and your files are uploaded to Hugging Face repos automatically using the Xet protocol. Enjoy the dedupe!

Installation

  1. Make sure you have git and git-lfs installed and configured correctly.
  2. For Linux (amd64 & aarch64) and macOS (amd64 & aarch64), run the following in your terminal to install and configure git-xet (requires curl and unzip):
     curl --proto '=https' --tlsv1.2 -sSf https://raw.githubusercontent.com/huggingface/xet-core/refs/heads/main/git_xet/install.sh | sh
  3. For Windows (amd64), either
     • download git-xet-windows-installer-x86_64.zip, unzip it, and run the msi file, or
     • download git-xet-windows-x86_64.zip, place git-xet.exe in a directory on your PATH, and run git-xet install in a terminal.

How It Works

Git-Xet works by registering itself with Git LFS as a custom transfer agent under the name "xet". On "git push", "git fetch" or "git pull", git-lfs negotiates with the remote server to determine which transfer agent to use. During this process, git-lfs sends all locally registered agent names to the server in the Batch API request, and the server replies with exactly one agent name in the response. If "xet" is picked, git-lfs delegates the upload or download operation to Git-Xet through a sequential protocol.

For more details on the Git LFS custom transfer agent protocol, see https://github.com/git-lfs/git-lfs/blob/main/docs/api/batch.md and https://github.com/git-lfs/git-lfs/blob/main/docs/custom-transfers.md.
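
The sequential protocol mentioned above exchanges one JSON message per line over stdin and stdout. The following Python sketch illustrates the message shapes described in the Git LFS custom transfer documentation. It is not Git-Xet's implementation: `handle_event`, `transfer`, and `run` are hypothetical names, and the actual transfer is stubbed out.

```python
import json


def transfer(event):
    """Hypothetical stand-in for the real upload/download; returns a local path."""
    return event.get("path") or "/tmp/" + event["oid"]


def handle_event(event):
    """Map one git-lfs request to a list of responses (None means exit)."""
    kind = event.get("event")
    if kind == "init":
        return [{}]  # an empty object confirms successful initialization
    if kind in ("upload", "download"):
        try:
            path = transfer(event)
        except OSError as exc:
            error = {"code": 1, "message": str(exc)}
            return [{"event": "complete", "oid": event["oid"], "error": error}]
        done = {"event": "complete", "oid": event["oid"]}
        if kind == "download":
            done["path"] = path  # tells git-lfs where the downloaded file is
        return [done]
    if kind == "terminate":
        return None  # no response is expected; clean up and exit
    return []


def run(stdin, stdout):
    """Read line-delimited JSON requests and write line-delimited responses."""
    for line in stdin:
        if not line.strip():
            continue
        responses = handle_event(json.loads(line))
        if responses is None:
            break
        for message in responses:
            stdout.write(json.dumps(message) + "\n")
            stdout.flush()
```

A script that calls `run(sys.stdin, sys.stdout)` could be registered with git-lfs through the `lfs.customtransfer.<name>.path` Git config key, which is how custom agents are made known to git-lfs in general.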

[v1.1.10] Bug Fixes and diagnostic tooling

12 Sep 20:09
81b0833

🔧 Improvements & Tools:

  • Comprehensive Diagnostic Scripts - New debugging tools for Linux and Windows
  • Network Reliability Enhancements - Better retry logic for I/O errors
  • Simplified DNS resolution to run in Kubernetes environments
  • CAS API Path Modernization - Updated to use plural nouns following REST conventions

🐛 Bug Fixes:

  • Chunker Boundary Triggering Fix - Fixed deduplication consistency issues
  • WASM First Chunk Dedup Handling - Improved client-side control
  • Data Type Safety Enhancements - Standardized on u64 for cross-platform compatibility

What's Changed

  • Add input params to Run name in GH Workflow UI by @rajatarya in #478
  • Thin wasm: do not automatically set is_dedup to true for first chunk by @coyotte508 in #481
  • update api paths to use plural nouns by @assafvayner in #482
  • Rename xet_threadpool to xet_runtime to reflect usage by @hoytak in #484
  • use u64 rather than usize in file hashing paths by @assafvayner in #485
  • Git-Xet: LFS custom transfer agent with Xet protocol by @seanses in #425
  • Drop "GaiResolverWithAbsolute" by @seanses in #486
  • Fix wheel upload for linux for dev/alpha/beta tags by @hoytak in #379
  • Adding retry for unhandled io errors when sending requests by @jgodlew in #468
  • Updated chunker to eliminate spurious boundary triggering. by @hoytak in #487
  • Diagnostic Scripts + README changes by @rajatarya in #489
  • hf_xet 1.1.10 by @assafvayner in #490

Full Changelog: v1.1.9...v1.1.10

[v1.1.9] Bug Fixes: Parallelism optimizations, metadata updates

27 Aug 23:04
7f53907

🚀 Performance Improvements:
• Improve parallelism in parutils by removing async_scoped
• Increase soft file limits for macOS

🐛 Bug Fixes:
• Update hf_xet PyPI metadata

🔧 Reliability & Maintenance:
• Improved debuggability with tokio console support
• Add CI builds for macOS

Full Changelog: v1.1.8...v1.1.9

v1.1.8 Bug Fixes

18 Aug 22:00
48be7b0

🚀 Performance Improvements:
• Client Caching - Reuses reqwest Client across RemoteClient objects to share connection pools
• Connection Limits - Limits idle connections to prevent resource exhaustion

🐛 Bug Fixes:
• Singleflight Fix - Critical fix preventing permanent error caching when owner tasks are dropped
• DataHash Serialization - Ensures consistent little-endian byte order across platforms
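
To illustrate the invariant behind the singleflight fix, here is a minimal thread-based Python sketch. The real fix lives in the Rust async code and concerns owner tasks that are dropped; this `SingleFlight` class is a hypothetical analogue that only shows the key idea: the in-flight entry is always removed when the owner finishes, so a failure is never cached permanently.

```python
import threading
from concurrent.futures import Future


class SingleFlight:
    """Collapse concurrent calls for the same key into one execution."""

    def __init__(self):
        self._lock = threading.Lock()
        self._calls = {}  # key -> Future for the in-flight call

    def do(self, key, fn):
        with self._lock:
            future = self._calls.get(key)
            owner = future is None
            if owner:
                future = Future()
                self._calls[key] = future
        if not owner:
            return future.result()  # wait for the owner's outcome
        try:
            result = fn()
        except BaseException as exc:
            future.set_exception(exc)  # waiters see the same error
            raise
        else:
            future.set_result(result)
            return result
        finally:
            # Always drop the entry, even on failure, so that a later call
            # runs fn again instead of observing a stale error.
            with self._lock:
                del self._calls[key]
```

With this structure, a call that fails for a key can be followed by a fresh call for the same key that runs normally.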

🔧 Reliability & Maintenance:
• Retry Logic Restoration - Restores retry logic accidentally removed in versions 1.1.6 and 1.1.7
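
As a rough picture of what retrying connection setup and request sending involves, the sketch below retries an operation with exponential backoff. The function name, the choice of `OSError` as the retryable error, the attempt count, and the delays are all assumptions made for illustration; they do not describe hf_xet's actual retry policy.

```python
import time


def with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying on OSError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so that the backoff schedule can be checked without actually waiting.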

What's Changed

  • fix: singleflight owner task not removing Call from Group if dropped by @jgodlew in #447
  • Add back retry for connection setup and sending request by @seanses in #455
  • Fix DataHash hex string serde to little endian by @seanses in #445
  • Clean up dependencies (no functionality change) by @seanses in #456
  • Cache and reuse reqwest Client by @seanses in #457
  • Limit number of idle connections by @hoytak in #459
  • update version by @assafvayner in #461

Full Changelog: v1.1.7...v1.1.8

v1.1.7

06 Aug 00:30
9bbc0c6

What's Changed

  • Remove telemetry code; eliminate Mutex on logging setup. by @hoytak in #441
  • Changed default number of parallel downloads from 64 to 48. by @hoytak in #442
  • Updated version to v1.1.7 by @hoytak in #443

Full Changelog: v1.1.6...v1.1.7

[v1.1.6] Bug Fixes: Proxy support, process safety, and more

05 Aug 22:44
7becae3

✨ New Features and Improvements

  • Proxy support, easing use behind corporate networks. (#413 by @hoytak; addresses #400 - thanks @albertodepaola and @goodsonjr for the initial reports)
  • Improvements to hf_xet logging; providing facility to log events to a formatted file (#428 by @hoytak)

🐛 Bug Fixes

  • Process safety: make running after os.fork() safer. (#429 by @hoytak; addresses #415 - thanks @John6666cat for the report)
  • Respect XDG_CACHE_HOME and ~/ when setting cache directories. (#426 by @hoytak; addresses #417 - thanks @half-duplex for the initial report)
  • Lower the default NUM_RANGE_CONCURRENT_GETS value to 64 to better respect file descriptor limits (#438 by @assafvayner; addresses #436 - thanks @djholt and @gary149 for the reports)
  • JWT token handling hardened with a buffer before expiration. (#405 by @jgodlew; addresses #404)
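
The idea of a buffer before expiration can be sketched as follows: treat a token as expired slightly before its actual expiry time, so that a request never leaves with a token that lapses in flight. The function name and the 30-second value are assumptions for illustration, not the values used in hf_xet.

```python
import time

REFRESH_BUFFER_SECS = 30  # assumed value, for illustration only


def needs_refresh(expires_at, now=None, buffer_secs=REFRESH_BUFFER_SECS):
    """Return True once the token is within buffer_secs of its expiry."""
    if now is None:
        now = time.time()
    return now >= expires_at - buffer_secs
```

For example, with an expiry at t=1000 and a 30-second buffer, the token is refreshed from t=970 onward.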

Full Changelog: v1.1.5...v1.1.6

[v1.1.5] Bug Fixes: Cert issue fixes & optimizations

20 Jun 21:47
d55c6a2

This release includes a fix for certificate issues in certain network environments and loading optimizations for dedup lookups.

🧱 Improvements

  • Background shard loading (#384): Loads shard lookup tables in the background to reduce upload_files startup time. Author: Hoyt Koepke

🐛 Bug Fixes

  • Certificate loading (#393): Switched to load_native_certs() for efficiency. Author: Hoyt Koepke

Full Changelog: v1.1.4...v1.1.5

[v1.1.4] Bug Fixes: Network Resilience and Performance Optimizations

16 Jun 21:20
8f7e9c8

📶 DNS Resolution & Network Connectivity

  • Fixed DNS resolution issues: Implemented custom DNS resolver to force absolute DNS name resolution, addressing issues where DNS resolvers struggled with CAS server addresses and fell back to local search domains
  • Enhanced TLS configuration: Updated reqwest to use rustls-tls by default with configurable TLS backends (native-tls, native-tls-vendored options available)
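
In DNS, a name ending in a dot is absolute, so a resolver does not try local search domains for it. The sketch below shows that transformation only; `to_absolute` is a hypothetical helper, and the resolver in xet-core is written in Rust and may differ in detail.

```python
import ipaddress


def to_absolute(host):
    """Append the root label so resolvers skip search-domain expansion."""
    try:
        ipaddress.ip_address(host)
        return host  # IP literals are not DNS names; leave them alone
    except ValueError:
        pass
    return host if host.endswith(".") else host + "."
```

For a placeholder host such as `cas.example.com`, the helper yields `cas.example.com.`, and applying it twice changes nothing.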

🚀 Performance Optimizations

  • Global download concurrency control: Changed download concurrency limit from per-file to global (default: 128 simultaneous connections) to prevent file handle exhaustion on macOS
  • Optimized chunking operations: Converted core Chunk data type from Arc<[u8]> to bytes::Bytes for better memory efficiency and reduced copying. Separated boundary calculation logic from chunk building for future optimization work
  • Updated shard cache size: Increased default shard cache limit to 16GB, effectively allowing deduplication against 16TB of data
  • Streamlined upload payload: Removed footer serialization from upload xorb payload in remote_client for improved efficiency
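
The difference between a per-file and a global download limit can be sketched with a single shared semaphore. This Python example is an illustration under assumed names (`download_all`, `fetch_range`) with the network request replaced by a placeholder; the actual implementation is in Rust.

```python
import asyncio


async def download_all(files, limit):
    """Fetch every range of every file under one shared concurrency limit."""
    semaphore = asyncio.Semaphore(limit)  # shared by all files, not one per file
    in_flight = 0
    peak = 0

    async def fetch_range(name, index):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0)  # placeholder for the real network request
            in_flight -= 1
        return (name, index)

    tasks = [
        fetch_range(name, index)
        for name, count in files.items()
        for index in range(count)
    ]
    results = await asyncio.gather(*tasks)
    return results, peak
```

With a per-file semaphore, the number of open connections would grow with the number of files being downloaded; with the shared one, `peak` never exceeds `limit` regardless of how many files are requested.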

🤗 Developer Experience

  • Issue templates: Added comprehensive GitHub issue templates including bug report forms, feature request forms, and helpful links for better community engagement

What's Changed

  • Update chunker to separate out calculation of next boundary by @hoytak in #368
  • remove footer serialized from upload xorb payload on remote_client by @assafvayner in #372
  • Adding issue templates to repo by @jsulz in #374
  • Small optimizations for chunking / upload path by @hoytak in #371
  • Switch reqwest to rustls-tls from default; use hickory-dns for dns resolution. by @hoytak in #378
  • add ci steps to check cargo.lock is up to date by @assafvayner in #377
  • Update shard cache default size. by @hoytak in #381
  • Remove hickory-dns and use system dns provider by @hoytak in #380
  • Fix/dns resolution by @Hugoch in #383
  • Change download currency limit from local to global. by @hoytak in #385
  • hf_xet Cargo.toml 1.1.4 by @assafvayner in #387

Full Changelog: v1.1.3...v1.1.4