huggingface-hub

Async Rust client library for the Hugging Face Hub API. This is the Rust equivalent of the Python huggingface_hub library.

The primary entry point is the HFClient struct, which wraps an Arc<HFClientInner> for cheap cloning. The previous type names remain as compatibility aliases during the rename. All methods are async and use reqwest as the HTTP client. Paginated endpoints return impl Stream<Item = Result<T>>, built with futures::stream::try_unfold. Methods with many parameters take param structs whose builders are generated by typed-builder.
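
The Arc-wrapped handle pattern described above can be sketched with the standard library alone. This is an illustration, not the crate's actual code: SketchClient, SketchClientInner, and their fields are hypothetical names, and the real HFClientInner presumably also holds the reqwest client.

```rust
use std::sync::Arc;

/// Shared state behind the handle. Field names are illustrative assumptions.
#[derive(Debug)]
struct SketchClientInner {
    endpoint: String,
    token: Option<String>,
}

/// Cheaply cloneable handle: cloning copies one `Arc` pointer, not the state.
#[derive(Clone, Debug)]
struct SketchClient {
    inner: Arc<SketchClientInner>,
}

impl SketchClient {
    fn new(endpoint: impl Into<String>, token: Option<String>) -> Self {
        Self {
            inner: Arc::new(SketchClientInner {
                endpoint: endpoint.into(),
                token,
            }),
        }
    }

    fn endpoint(&self) -> &str {
        &self.inner.endpoint
    }

    fn has_token(&self) -> bool {
        self.inner.token.is_some()
    }

    /// Number of handles currently sharing the same inner state.
    fn handle_count(&self) -> usize {
        Arc::strong_count(&self.inner)
    }
}
```

Because every clone points at the same inner value, a handle can be moved into spawned tasks without copying configuration or connection state.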

Key capabilities:

  • Repository info, listing, creation, deletion, and settings updates
  • File upload, download, listing, and deletion
  • Commit creation, commit history, diffs between revisions
  • Branch and tag management
  • User and organization info
  • Xet high-performance transfers (behind the xet feature flag)

Code Standards

These rules apply to ALL code written or modified in this repo:

Style

  • NO trivial comments — do not add comments that restate what the code does
  • Descriptive variable and function names
  • No wildcard imports (e.g., use foo::*), except pub use re-exports in lib.rs
  • All imports are at the top of the file or top of module
  • Latest stable Rust features are allowed

Error Handling

  • Use Result<T, E> with explicit error handling — never panic
  • Define custom error types using thiserror for domain-specific errors
  • Provide helpful, actionable error messages
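
As a rough illustration of these rules, the sketch below defines a small domain error and returns it instead of panicking. To stay dependency-free it writes the Display and Error impls by hand; in this repo the same shape would be derived with thiserror. SketchError, its variants, and require_token are hypothetical names, not items from error.rs.

```rust
use std::fmt;

/// Hypothetical domain error; each message tells the caller what to do next.
#[derive(Debug)]
enum SketchError {
    MissingToken { env_var: &'static str },
    RepoNotFound { repo_id: String },
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::MissingToken { env_var } => write!(
                f,
                "no access token found; set the {} environment variable",
                env_var
            ),
            Self::RepoNotFound { repo_id } => write!(
                f,
                "repository '{}' was not found; check the spelling and your access rights",
                repo_id
            ),
        }
    }
}

impl std::error::Error for SketchError {}

/// Returns the token if present and non-empty, otherwise a typed error.
fn require_token(token: Option<&str>) -> Result<&str, SketchError> {
    token
        .filter(|value| !value.is_empty())
        .ok_or(SketchError::MissingToken { env_var: "HF_TOKEN" })
}
```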

Performance

  • Be mindful of allocations in hot paths
  • Prefer structured logging (tracing/log macros with fields, not string formatting)

Dependencies

  • Add all dependencies to Cargo.toml (workspace root) or huggingface_hub/Cargo.toml (crate-level)
  • Prefer well-maintained crates from crates.io
  • Shared dependencies belong in the workspace root's [workspace.dependencies] table, not per-crate

Testing

Unit Tests

  • Place in the same file using #[cfg(test)] modules
  • Run: cargo test -p huggingface-hub
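
A same-file test module might look like the following sketch. The helper split_repo_id is a hypothetical function invented for illustration; only the #[cfg(test)] layout is the point.

```rust
/// Hypothetical helper: splits "namespace/name" into its two parts.
fn split_repo_id(repo_id: &str) -> Option<(&str, &str)> {
    let (namespace, name) = repo_id.split_once('/')?;
    if namespace.is_empty() || name.is_empty() || name.contains('/') {
        return None;
    }
    Some((namespace, name))
}

// Compiled only for `cargo test`, so it adds nothing to release builds.
#[cfg(test)]
mod tests {
    use super::split_repo_id;

    #[test]
    fn splits_namespace_and_name() {
        assert_eq!(split_repo_id("org/model"), Some(("org", "model")));
    }

    #[test]
    fn rejects_malformed_ids() {
        assert_eq!(split_repo_id("no-slash"), None);
        assert_eq!(split_repo_id("/name"), None);
        assert_eq!(split_repo_id("a/b/c"), None);
    }
}
```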

Integration Tests

  • Located in huggingface_hub/tests/integration_test.rs
  • Require a valid HF_TOKEN environment variable and internet access
  • Tests skip gracefully when HF_TOKEN is not set (no failures)
  • Run read-only tests: HF_TOKEN=hf_xxx cargo test -p huggingface-hub --test integration_test
  • Write operation tests (create/delete repos, upload files) require HF_TEST_WRITE=1
  • Run all tests including writes: HF_TOKEN=hf_xxx HF_TEST_WRITE=1 cargo test -p huggingface-hub --test integration_test
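
One way the skip-when-unset behavior could be structured is sketched below. Only the HF_TOKEN and HF_TEST_WRITE variable names come from this document; TestMode, test_mode_from, and test_mode are hypothetical, and the actual integration tests may gate themselves differently.

```rust
use std::env;

/// What the current environment permits an integration test to do.
#[derive(Debug, PartialEq, Eq)]
enum TestMode {
    Skip,
    ReadOnly,
    ReadWrite,
}

/// Pure decision function, kept separate from env access so it is easy to test.
fn test_mode_from(token: Option<&str>, write_flag: Option<&str>) -> TestMode {
    match token {
        None | Some("") => TestMode::Skip,
        Some(_) if write_flag == Some("1") => TestMode::ReadWrite,
        Some(_) => TestMode::ReadOnly,
    }
}

/// Reads HF_TOKEN and HF_TEST_WRITE from the process environment.
fn test_mode() -> TestMode {
    let token = env::var("HF_TOKEN").ok();
    let write_flag = env::var("HF_TEST_WRITE").ok();
    test_mode_from(token.as_deref(), write_flag.as_deref())
}
```

A read-only test would return early when test_mode() is TestMode::Skip, and a write test would return early unless it is TestMode::ReadWrite, so a missing variable produces a pass rather than a failure.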

Formatting and Linting

  • Format: cargo +nightly fmt
  • Lint: cargo clippy -p huggingface-hub --all-features -- -D warnings
  • ALWAYS run both after making changes — do not skip this step

Minimal Changes

  • Verify that every change is minimal and necessary — do not include unrelated modifications

Project Layout

Agents MUST update this section when adding new crates or large modules.

huggingface_hub_rust/
├── Cargo.toml                      # Workspace root
├── AGENTS.md                       # This file
├── .gitignore
├── huggingface_hub/                # Main library crate (package: huggingface-hub)
│   ├── Cargo.toml                  # Crate manifest, dependencies, features
│   ├── src/
│   │   ├── lib.rs                  # Public re-exports, crate docs
│   │   ├── client.rs               # HFClient, HFClientBuilder, HFClientInner, auth headers, URL builders
│   │   ├── repository.rs           # HFRepository/HFRepo handle, repo-scoped params, repo-bound methods
│   │   ├── constants.rs            # Env var names, default URLs, repo type helpers
│   │   ├── error.rs                # HFError enum, Result alias, NotFoundContext
│   │   ├── pagination.rs           # Generic paginate<T>() with Link header parsing
│   │   ├── cache.rs                # Cache path computation, locking, ref read/write, symlink, scan, delete
│   │   ├── diff.rs                 # Raw diff parsing (parse_raw_diff, stream_raw_diff), HFFileDiff, GitStatus
│   │   ├── xet.rs                  # Xet high-performance transfer stubs (behind "xet" feature)
│   │   ├── types/
│   │   │   ├── mod.rs              # Module declarations, re-exports
│   │   │   ├── cache.rs            # CachedFileInfo, CachedRepoInfo, HFCacheInfo, DeleteCacheRevision
│   │   │   ├── repo.rs             # RepoType, RepoInfo, ModelInfo, DatasetInfo, SpaceInfo, RepoTreeEntry
│   │   │   ├── user.rs             # User, Organization, OrgMembership
│   │   │   ├── commit.rs           # CommitInfo, GitCommitInfo, GitRefs, CommitOperation, AddSource
│   │   │   ├── params.rs           # All *Params structs with TypedBuilder
│   │   │   └── spaces.rs           # SpaceRuntime, SpaceVariable (behind "spaces" feature)
│   │   └── api/
│   │       ├── mod.rs              # Module declarations
│   │       ├── repo.rs             # Repo info, listing, existence checks, create/delete/update/move
│   │       ├── cache.rs            # scan_cache, delete_cache_revisions
│   │       ├── files.rs            # File listing, download, upload, create_commit, snapshot_download
│   │       ├── commits.rs          # Commit listing, diffs, branch/tag management
│   │       ├── users.rs            # whoami, auth_check, user/org info, followers
│   │       └── spaces.rs           # Space runtime, secrets, variables, hardware, pause/restart
│   └── tests/
│       └── integration_test.rs     # Integration tests against live Hub API
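
The Link header parsing mentioned for pagination.rs can be sketched as below. This is a simplified assumption of how the next-page URL might be extracted, not the crate's implementation: next_page_url is a hypothetical name, and splitting on commas would mis-handle URLs that themselves contain commas.

```rust
/// Returns the URL tagged rel="next" in an HTTP `Link` header value, if any.
/// Simplification: entries are split on ',' without honoring quoted strings.
fn next_page_url(link_header: &str) -> Option<&str> {
    link_header.split(',').find_map(|entry| {
        let (target, params) = entry.split_once(';')?;
        let url = target.trim().strip_prefix('<')?.strip_suffix('>')?;
        let is_next = params
            .split(';')
            .any(|param| param.trim() == "rel=\"next\"");
        if is_next {
            Some(url)
        } else {
            None
        }
    })
}
```

A paginator can call this on each response and stop when it returns None, which is the natural termination condition for a try_unfold-style stream.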

Feature Development

Before writing any code:

  1. Branch: Confirm you are on a feature branch, not main. If on main, create a branch named <username>/<short-description>.
  2. Plan: Write an implementation plan that includes testing strategy (unit tests, integration tests, manual verification steps). Add this plan as a comment on the PR.

When changing public interfaces or adding user-facing capabilities:

  • ALWAYS update the relevant examples in huggingface_hub/examples/ and any affected README snippets so they match the current public API.
  • ALWAYS add at least one example for new functionality unless an existing example already demonstrates that exact workflow clearly.
  • Prefer examples that show the intended high-level interface, not just the lowest-level parameter structs, especially for new ergonomic APIs like repo handles.

Code Review

When reviewing a pull request, follow these rules:

Tone

  • Collegial and constructive — write as a peer, not an authority
  • Use phrases like "consider...", "what do you think about...", "we might want to..."
  • Acknowledge good decisions and clean patterns, not just problems
  • When unsure, ask a clarifying question instead of assuming something is wrong

What to Review

  • Correctness — logic errors, edge cases, off-by-one errors
  • Readability — naming consistency, code clarity, helpful error messages
  • Maintainability — temporary workarounds tracked, types in the right crate, clean abstractions
  • Testability — missing tests for new endpoints/logic, weakened assertions, coverage gaps
  • Performance — unnecessary allocations in hot paths, unbounded response sizes, missing concurrency limits
  • Security — auth checks on new routes, input validation, error message information leakage

How to Structure Feedback

  • Post a summary comment on the PR: overview of the changes, key observations, cross-cutting concerns
  • Add inline comments at specific diff locations for targeted feedback
  • Prefix minor style suggestions with nit: — these are optional and the author may skip them
  • Do NOT prefix substantive feedback (public API changes, correctness issues, missing tests) — these require attention