hubvault.api

Public repository API for the hubvault package.

This module exposes HubVaultApi, a local embedded repository interface whose method names follow the calling conventions of huggingface_hub wherever they make sense for an on-disk repository.

The module contains:

  • HubVaultApi - Public entry point for local embedded repositories

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     repo_dir = Path(tmpdir) / "repo"
...     api = HubVaultApi(repo_dir)
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     api.read_bytes("demo.txt")
b'hello'

HubVaultApi

class hubvault.api.HubVaultApi(repo_path: str | PathLike, revision: str = 'main')

Public entry point for a local hubvault repository.

Parameters:
  • repo_path (Union[str, os.PathLike[str]]) – Filesystem path to the local repository root

  • revision (str) – Default revision used by read APIs

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         revision="main",
...         operations=[
...             CommitOperationAdd("example.txt", b"hello"),
...         ],
...         commit_message="add example",
...     )
...     api.list_repo_files()
['example.txt']
__init__(repo_path: str | PathLike, revision: str = 'main') -> None

Initialize the public API wrapper.

Parameters:
  • repo_path (Union[str, os.PathLike[str]]) – Filesystem path to the local repository root

  • revision (str, optional) – Default revision used by read APIs, defaults to "main"

Returns:

None.

Return type:

None

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo", revision="main")
...     api.create_repo().default_branch
'main'
create_branch(*, branch: str, revision: str | None = None, exist_ok: bool = False) -> None

Create a branch from an existing revision.

Parameters:
  • branch (str) – Branch name to create

  • revision (Optional[str]) – Starting revision, defaults to the API default revision

  • exist_ok (bool, optional) – Whether an existing branch may be reused

Returns:

None.

Return type:

None

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     api.create_branch(branch="dev")
...     [ref.name for ref in api.list_repo_refs().branches]
['dev', 'main']
create_commit(operations: Sequence[object] = (), *, commit_message: str, commit_description: str | None = None, revision: str | None = None, parent_commit: str | None = None) -> CommitInfo

Create a new commit on a branch.

Parameters:
  • operations (Sequence[object]) – Commit operations to apply

  • commit_message (str) – Commit summary/title. When commit_description is omitted, any body text that follows a blank line inside commit_message is preserved and split off as the description, the same way Git and HF commit listings interpret commit text.

  • commit_description (Optional[str]) – Optional commit description/body

  • revision (Optional[str]) – Target branch name, defaults to the API default revision

  • parent_commit (Optional[str]) – Expected parent commit for optimistic concurrency. When omitted, the commit is applied against the current branch head.

Returns:

Metadata for the created commit

Return type:

CommitInfo

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     commit = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     (commit.commit_message, api.read_bytes("demo.txt"))
('seed', b'hello')
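The title/body handling described for commit_message can be pictured with a small standalone sketch. It assumes the first blank line separates the title from the body; split_commit_message is a hypothetical helper written for illustration, not a hubvault function, and the library's exact rules may differ.

```python
from typing import Optional, Tuple


def split_commit_message(
    commit_message: str, commit_description: Optional[str] = None
) -> Tuple[str, str]:
    """Hypothetical helper: split commit text into (title, body)."""
    if commit_description is not None:
        # An explicit description is used as the body; the message is the title.
        return commit_message.strip(), commit_description.strip()
    # Assumption: the first blank line separates the title from the body.
    title, _, body = commit_message.partition("\n\n")
    return title.strip(), body.strip()


print(split_commit_message("seed\n\nbody"))
print(split_commit_message("seed"))
```

The first call mirrors the "seed\n\nbody" message used in the list_repo_commits example, where the title and message fields come back separately.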
create_repo(*, default_branch: str = 'main', exist_ok: bool = False, large_file_threshold: int = 16777216) -> RepoInfo

Create a local repository.

The repository layout is bootstrapped under repo_path and the default branch immediately receives an empty Initial commit so the repo starts with a valid history root.

Parameters:
  • default_branch (str) – Default branch name, defaults to "main"

  • exist_ok (bool) – Whether an existing repository may be reused

  • large_file_threshold (int) – File size threshold in bytes at or above which newly added files switch to chunked storage, defaults to hubvault.repo.LARGE_FILE_THRESHOLD

Returns:

Information about the created repository

Return type:

RepoInfo

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     info = api.create_repo(default_branch="main")
...     (info.default_branch, info.head is not None)
('main', True)
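The large_file_threshold rule is an inclusive size comparison. The sketch below illustrates only that decision; storage_mode and the mode names are hypothetical and are not part of the hubvault API.

```python
LARGE_FILE_THRESHOLD = 16 * 1024 * 1024  # 16777216 bytes, the documented default


def storage_mode(size_in_bytes: int, threshold: int = LARGE_FILE_THRESHOLD) -> str:
    """Hypothetical helper: pick a storage mode for a newly added file."""
    # "At or above" means the comparison is inclusive.
    return "chunked" if size_in_bytes >= threshold else "whole-blob"


for size in (31, 32, 64):
    print(size, storage_mode(size, threshold=32))
```

Several examples in this reference pass large_file_threshold=32 so that a 64-byte file is large enough to exercise chunked storage.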
create_tag(*, tag: str, tag_message: str | None = None, revision: str | None = None, exist_ok: bool = False) -> None

Create a lightweight tag from an existing revision.

Parameters:
  • tag (str) – Tag name to create

  • tag_message (Optional[str]) – Optional tag message recorded in the reflog

  • revision (Optional[str]) – Starting revision, defaults to the API default revision

  • exist_ok (bool, optional) – Whether an existing tag may be reused

Returns:

None.

Return type:

None

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     api.create_tag(tag="v1")
...     [ref.name for ref in api.list_repo_refs().tags]
['v1']
delete_branch(*, branch: str) -> None

Delete a branch from the repository.

Parameters:

branch (str) – Branch name to delete

Returns:

None.

Return type:

None

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     api.create_branch(branch="dev")
...     api.delete_branch(branch="dev")
...     [ref.name for ref in api.list_repo_refs().branches]
['main']
delete_file(path_in_repo: str, *, revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, parent_commit: str | None = None) -> CommitInfo

Delete a single file through the public commit API.

Parameters:
  • path_in_repo (str) – Repo-relative file path

  • revision (Optional[str]) – Target branch name, defaults to the API default revision

  • commit_message (Optional[str]) – Optional commit summary

  • commit_description (Optional[str]) – Optional commit description/body

  • parent_commit (Optional[str]) – Optional optimistic-concurrency parent commit

Returns:

Commit metadata for the created commit

Return type:

CommitInfo

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[
...             CommitOperationAdd("keep.txt", b"keep"),
...             CommitOperationAdd("remove.txt", b"gone"),
...         ],
...         commit_message="seed",
...     )
...     commit = api.delete_file("remove.txt")
...     (commit.commit_message, api.list_repo_files())
('Delete remove.txt with hubvault', ['keep.txt'])
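The parent_commit parameter expresses optimistic concurrency: the caller states which head it expects, and the write is refused if the branch has moved. The toy model below sketches that check only. ToyBranch and StaleParentError are hypothetical names, and the exception hubvault actually raises is not shown in this reference.

```python
from typing import Optional


class StaleParentError(Exception):
    """Hypothetical error for a head that moved since the caller last looked."""


class ToyBranch:
    """Toy stand-in for a branch ref; not a hubvault class."""

    def __init__(self, head: str) -> None:
        self.head = head

    def commit(self, new_commit_id: str, parent_commit: Optional[str] = None) -> str:
        # When parent_commit is given, refuse to commit on top of a different head.
        if parent_commit is not None and parent_commit != self.head:
            raise StaleParentError(
                f"expected head {parent_commit!r}, found {self.head!r}"
            )
        # When parent_commit is omitted, the commit applies to the current head.
        self.head = new_commit_id
        return self.head


branch = ToyBranch(head="c1")
branch.commit("c2", parent_commit="c1")  # head matches, commit is accepted
try:
    branch.commit("c3", parent_commit="c1")  # head is now c2, so this is rejected
except StaleParentError as exc:
    print("rejected:", exc)
print(branch.head)
```

Passing the commit ID returned by an earlier call as parent_commit is the usual way to detect that another writer committed in between.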
delete_folder(path_in_repo: str, *, revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, parent_commit: str | None = None) -> CommitInfo

Delete a folder subtree through the public commit API.

Parameters:
  • path_in_repo (str) – Repo-relative folder path

  • revision (Optional[str]) – Target branch name, defaults to the API default revision

  • commit_message (Optional[str]) – Optional commit summary

  • commit_description (Optional[str]) – Optional commit description/body

  • parent_commit (Optional[str]) – Optional optimistic-concurrency parent commit

Returns:

Commit metadata for the created commit

Return type:

CommitInfo

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[
...             CommitOperationAdd("bundle/model.bin", b"data"),
...             CommitOperationAdd("keep.txt", b"keep"),
...         ],
...         commit_message="seed",
...     )
...     commit = api.delete_folder("bundle")
...     (commit.commit_message, api.list_repo_files())
('Delete folder bundle with hubvault', ['keep.txt'])
delete_tag(*, tag: str) -> None

Delete a tag from the repository.

Parameters:

tag (str) – Tag name to delete

Returns:

None.

Return type:

None

Raises:

hubvault.errors.RevisionNotFoundError – Raised when the tag does not exist.

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     api.create_tag(tag="v1")
...     api.delete_tag(tag="v1")
...     api.list_repo_refs().tags
[]
full_verify() -> VerifyReport

Perform a complete repository verification pass.

Unlike quick_verify(), this method validates all live commit, tree, file, blob, chunk, pack, and manifest relationships reachable from the current refs and also scans the published storage layout for malformed persisted objects.

Returns:

Verification result

Return type:

VerifyReport

Raises:

hubvault.errors.RepositoryNotFoundError – Raised when the repository root does not contain a valid hubvault repository.

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo(large_file_threshold=32)
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("model.bin", b"A" * 64)],
...         commit_message="seed",
...     )
...     report = api.full_verify()
...     (report.ok, report.errors)
(True, [])
gc(*, dry_run: bool = False, prune_cache: bool = True) -> GcReport

Reclaim unreachable repository data and rebuild detachable caches.

The local GC pass keeps all currently reachable refs intact, rewrites chunk storage into a compact live pack/index view, and optionally removes rebuildable detached caches under cache/.

Parameters:
  • dry_run (bool, optional) – Whether to compute the result without mutating storage

  • prune_cache (bool, optional) – Whether rebuildable managed caches should also be removed

Returns:

Garbage-collection report

Return type:

GcReport

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo(large_file_threshold=32)
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("model.bin", b"A" * 64)],
...         commit_message="seed",
...     )
...     _ = api.hf_hub_download("model.bin")
...     report = api.gc(dry_run=True, prune_cache=True)
...     (report.dry_run, report.reclaimed_cache_size > 0)
(True, True)
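The notion of "unreachable repository data" can be illustrated with a generic mark-and-sweep pass over a toy object graph. This is a sketch of the general technique under made-up object IDs, not hubvault's storage format or its actual GC implementation.

```python
from typing import Dict, Iterable, List, Set


def reachable_objects(refs: Iterable[str], edges: Dict[str, List[str]]) -> Set[str]:
    """Return every object ID reachable from the given ref targets."""
    seen: Set[str] = set()
    stack = list(refs)
    while stack:
        object_id = stack.pop()
        if object_id in seen:
            continue
        seen.add(object_id)
        stack.extend(edges.get(object_id, []))
    return seen


# Toy object graph: commits point at trees and parents, trees point at chunks.
edges = {
    "commit-2": ["tree-2", "commit-1"],
    "commit-1": ["tree-1"],
    "tree-2": ["chunk-b"],
    "tree-1": ["chunk-a"],
    "orphan-commit": ["orphan-tree"],
    "orphan-tree": ["chunk-x"],
}
all_objects = set(edges) | {child for kids in edges.values() for child in kids}
live = reachable_objects(["commit-2"], edges)  # the only ref points at commit-2
print(sorted(all_objects - live))  # candidates a GC pass could reclaim
```

In this model a dry run corresponds to computing the unreachable set and reporting it without deleting anything.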
get_paths_info(paths: Sequence[str] | str, *, revision: str | None = None) -> List[RepoFile | RepoFolder]

Return public metadata for selected paths.

Parameters:
  • paths (Union[Sequence[str], str]) – Repo-relative path or paths to inspect

  • revision (Optional[str]) – Revision to resolve, defaults to the API default revision

Returns:

Metadata for the existing requested paths

Return type:

List[Union[RepoFile, RepoFolder]]

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[
...             CommitOperationAdd("demo.txt", b"hello"),
...             CommitOperationAdd("nested/config.json", b"{}"),
...         ],
...         commit_message="seed",
...     )
...     sorted(item.path for item in api.get_paths_info(["demo.txt", "nested", "missing.txt"]))
['demo.txt', 'nested']
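As the example shows, paths that do not exist are left out of the result instead of raising. A standalone sketch of that lookup over a flat file list follows; toy_paths_info is a hypothetical helper that returns plain tuples rather than RepoFile or RepoFolder objects.

```python
from typing import Iterable, List, Sequence, Tuple, Union


def toy_paths_info(
    files: Iterable[str], paths: Union[Sequence[str], str]
) -> List[Tuple[str, str]]:
    """Hypothetical lookup returning (path, kind) for paths that exist."""
    # A single string is treated as a one-element request.
    requested = [paths] if isinstance(paths, str) else list(paths)
    file_set = set(files)
    # Every proper prefix of a file path is a folder.
    folders = set()
    for file_path in file_set:
        parts = file_path.split("/")
        for depth in range(1, len(parts)):
            folders.add("/".join(parts[:depth]))
    found = []
    for path in requested:
        if path in file_set:
            found.append((path, "file"))
        elif path in folders:
            found.append((path, "folder"))
        # Missing paths are skipped rather than reported as errors.
    return found


files = ["demo.txt", "nested/config.json"]
print(toy_paths_info(files, ["demo.txt", "nested", "missing.txt"]))
```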
get_storage_overview() -> StorageOverview

Analyze repository disk usage and safe reclamation options.

The returned model separates space that is immediately reclaimable via gc(), space held only for detached caches, and space retained for rollback/history that would require an explicit rewrite such as squash_history().

Returns:

Repository storage analysis report

Return type:

StorageOverview

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo(large_file_threshold=32)
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("model.bin", b"A" * 64)],
...         commit_message="seed",
...     )
...     _ = api.hf_hub_download("model.bin")
...     overview = api.get_storage_overview()
...     (overview.total_size > 0, overview.reclaimable_cache_size > 0)
(True, True)
hf_hub_download(filename: str, *, revision: str | None = None, local_dir: str | PathLike | None = None) -> str

Materialize a detached user-view path for a file.

Parameters:
  • filename (str) – Repo-relative file path

  • revision (Optional[str]) – Revision to resolve, defaults to the API default revision

  • local_dir (Optional[Union[str, os.PathLike[str]]]) – Optional export directory outside the repository

Returns:

Filesystem path to a detached copy of the file, which can be read safely without affecting the data stored in the repository

Return type:

str

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("nested/demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     download_path = Path(api.hf_hub_download("nested/demo.txt"))
...     (download_path.parts[-2:], download_path.read_bytes())
(('nested', 'demo.txt'), b'hello')
list_repo_commits(*, revision: str | None = None, formatted: bool = False) -> Sequence[GitCommitInfo]

List commits reachable from a revision in HF-style order.

The local repository keeps the public method name and the meaningful parameters from huggingface_hub.HfApi.list_repo_commits while intentionally dropping remote-only parameters such as repo_id, repo_type, and token.

Parameters:
  • revision (Optional[str]) – Revision to resolve, defaults to the API default revision

  • formatted (bool, optional) – Whether HTML-formatted title/message fields should be populated

Returns:

Commit entries ordered from newest to oldest

Return type:

Sequence[GitCommitInfo]

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed\n\nbody",
...     )
...     [(item.title, item.message) for item in api.list_repo_commits()]
[('seed', 'body')]
list_repo_files(*, revision: str | None = None) -> Sequence[str]

List all file paths in a revision.

Parameters:

revision (Optional[str]) – Revision to resolve, defaults to the API default revision

Returns:

Sorted repo-relative file paths

Return type:

Sequence[str]

Raises:

hubvault.errors.RevisionNotFoundError – Raised when the revision cannot be resolved.

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[
...             CommitOperationAdd("demo.txt", b"hello"),
...             CommitOperationAdd("nested/config.json", b"{}"),
...         ],
...         commit_message="seed",
...     )
...     api.list_repo_files()
['demo.txt', 'nested/config.json']
list_repo_reflog(ref_name: str, *, limit: int | None = None) -> Sequence[ReflogEntry]

List reflog entries for a branch or tag.

This is a local repository extension intended for audit and recovery workflows.

Parameters:
  • ref_name (str) – Full ref name or an unambiguous short ref name

  • limit (Optional[int]) – Optional maximum number of newest entries to return

Returns:

Reflog entries ordered from newest to oldest

Return type:

Sequence[ReflogEntry]

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     [entry.message for entry in api.list_repo_reflog("main")]
['seed']
list_repo_refs(*, include_pull_requests: bool = False) -> GitRefs

List visible branch and tag refs in HF-style form.

Parameters:

include_pull_requests (bool, optional) – Whether pull-request refs should be included. A local repository has no pull requests, so the pull-request list is [] when this flag is set and None otherwise.

Returns:

Visible repository refs

Return type:

GitRefs

Raises:

hubvault.errors.RepositoryNotFoundError – Raised when the repository root does not contain a valid hubvault repository.

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     api.create_branch(branch="dev")
...     api.create_tag(tag="v1")
...     refs = api.list_repo_refs()
...     ([ref.name for ref in refs.branches], [ref.name for ref in refs.tags])
(['dev', 'main'], ['v1'])
list_repo_tree(path_in_repo: str | None = None, *, recursive: bool = False, revision: str | None = None) -> List[RepoFile | RepoFolder]

List direct children under a repository directory.

Parameters:
  • path_in_repo (Optional[str]) – Repo-relative directory path, defaults to the root

  • recursive (bool, optional) – Whether to include descendant entries recursively

  • revision (Optional[str]) – Revision to resolve, defaults to the API default revision

Returns:

Metadata for the direct children, or for all descendants when recursive is true

Return type:

List[Union[RepoFile, RepoFolder]]

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[
...             CommitOperationAdd("demo.txt", b"hello"),
...             CommitOperationAdd("nested/config.json", b"{}"),
...         ],
...         commit_message="seed",
...     )
...     [item.path for item in api.list_repo_tree()]
['demo.txt', 'nested']
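The difference between a direct-children listing and a recursive one can be sketched over a flat list of file paths. toy_list_tree is a hypothetical helper that returns plain path strings; the sorted output order is an assumption of this sketch, not a documented guarantee.

```python
from typing import Iterable, List, Optional


def toy_list_tree(
    files: Iterable[str], path_in_repo: Optional[str] = None, recursive: bool = False
) -> List[str]:
    """Hypothetical listing of entries beneath a directory in a flat file list."""
    prefix = f"{path_in_repo.strip('/')}/" if path_in_repo else ""
    entries = set()
    for file_path in files:
        if not file_path.startswith(prefix):
            continue
        parts = file_path[len(prefix):].split("/")
        # Non-recursive: keep only the first component below the directory.
        depth_limit = len(parts) if recursive else 1
        for depth in range(1, depth_limit + 1):
            entries.add(prefix + "/".join(parts[:depth]))
    return sorted(entries)


files = ["demo.txt", "nested/config.json"]
print(toy_list_tree(files))
print(toy_list_tree(files, recursive=True))
print(toy_list_tree(files, "nested"))
```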
merge(source_revision: str, *, target_revision: str | None = None, parent_commit: str | None = None, commit_message: str | None = None, commit_description: str | None = None) -> MergeResult

Merge a source revision into a target branch.

This local API has no direct one-to-one counterpart in huggingface_hub. It follows familiar Git merge semantics instead: conflicts are returned as structured data, while successful merges may resolve as "merged", "fast-forward", or "already-up-to-date".

Parameters:
  • source_revision (str) – Source branch, tag, or commit ID to merge from

  • target_revision (Optional[str]) – Target branch name or full branch ref, defaults to the API default revision

  • parent_commit (Optional[str]) – Optional expected current head for optimistic concurrency on the target branch

  • commit_message (Optional[str]) – Optional merge-commit title. When omitted, a default merge title is generated.

  • commit_description (Optional[str]) – Optional merge-commit body

Returns:

Structured merge result

Return type:

MergeResult

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"base")],
...         commit_message="seed",
...     )
...     api.create_branch(branch="feature")
...     _ = api.create_commit(
...         revision="feature",
...         operations=[CommitOperationAdd("feature.txt", b"hello")],
...         commit_message="feature work",
...     )
...     result = api.merge("feature")
...     (result.status, api.read_bytes("feature.txt"))
('fast-forward', b'hello')
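The three success statuses follow from how the source and target commits relate in the history graph. The sketch below classifies a conflict-free merge in a toy parent map using the usual Git rules; it is an illustration with made-up commit names, it ignores conflicts entirely, and it is not hubvault's merge implementation.

```python
from typing import Dict, List, Set


def ancestors(commit: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return the commit itself plus everything reachable through parent links."""
    seen: Set[str] = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def merge_status(target: str, source: str, parents: Dict[str, List[str]]) -> str:
    """Classify a conflict-free merge of source into target."""
    if source in ancestors(target, parents):
        return "already-up-to-date"  # target already contains source
    if target in ancestors(source, parents):
        return "fast-forward"  # target only needs to move forward to source
    return "merged"  # histories diverged, so a merge commit is needed


# Toy history: seed -> feature_work, and seed -> main_work on another line.
parents = {"seed": [], "feature_work": ["seed"], "main_work": ["seed"]}
print(merge_status("seed", "feature_work", parents))
print(merge_status("feature_work", "seed", parents))
print(merge_status("main_work", "feature_work", parents))
```

The first call matches the example above: the target branch has no commits of its own after the branch point, so merging the feature branch is a fast-forward.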
open_file(path_in_repo: str, *, revision: str | None = None) -> BinaryIO

Open a file as a read-only binary stream.

Parameters:
  • path_in_repo (str) – Repo-relative file path

  • revision (Optional[str]) – Revision to resolve, defaults to the API default revision

Returns:

Read-only binary stream

Return type:

BinaryIO

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     with api.open_file("demo.txt") as fileobj:
...         fileobj.read()
b'hello'
quick_verify() -> VerifyReport

Perform a minimal repository verification pass.

Returns:

Verification result

Return type:

VerifyReport

Raises:

hubvault.errors.RepositoryNotFoundError – Raised when the repository root does not contain a valid hubvault repository.

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     report = api.quick_verify()
...     (report.ok, report.errors)
(True, [])
read_bytes(path_in_repo: str, *, revision: str | None = None) -> bytes

Read the full content of a file.

Parameters:
  • path_in_repo (str) – Repo-relative file path

  • revision (Optional[str]) – Revision to resolve, defaults to the API default revision

Returns:

File content bytes

Return type:

bytes

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     api.read_bytes("demo.txt")
b'hello'
read_range(path_in_repo: str, *, start: int, length: int, revision: str | None = None) -> bytes

Read a byte range from a file.

For whole-blob files the local backend slices the materialized file bytes. For chunked files it resolves only the overlapping chunks and avoids reconstructing unrelated file regions.

Parameters:
  • path_in_repo (str) – Repo-relative file path

  • start (int) – Starting byte offset in the logical file

  • length (int) – Number of bytes to read

  • revision (Optional[str]) – Revision to resolve, defaults to the API default revision

Returns:

Requested byte range, clamped to the file end

Return type:

bytes

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     api.read_range("demo.txt", start=1, length=3)
b'ell'
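For chunked files, the description above says only the overlapping chunks are resolved and the result is clamped to the file end. A minimal sketch of that idea over an in-memory list of chunks follows. read_range_from_chunks is a hypothetical helper, and hubvault's real chunk index and pack layout are not modeled.

```python
from typing import List


def read_range_from_chunks(chunks: List[bytes], start: int, length: int) -> bytes:
    """Hypothetical chunk-aware range read over an in-memory chunk list."""
    if start < 0 or length < 0:
        raise ValueError("start and length must be non-negative")
    end = start + length  # exclusive end of the requested range
    pieces = []
    offset = 0  # logical offset of the current chunk within the file
    for chunk in chunks:
        chunk_end = offset + len(chunk)
        # Only chunks overlapping [start, end) contribute bytes.
        if chunk_end > start and offset < end:
            pieces.append(chunk[max(start - offset, 0):end - offset])
        offset = chunk_end
        if offset >= end:
            break  # later chunks cannot overlap the range
    return b"".join(pieces)


chunks = [b"hel", b"lo ", b"wor", b"ld"]  # logical file: b"hello world"
print(read_range_from_chunks(chunks, start=1, length=3))
print(read_range_from_chunks(chunks, start=9, length=100))
```

The second call asks for more bytes than remain, and the slice is clamped to the end of the logical file instead of failing.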
repo_info(*, revision: str | None = None) -> RepoInfo

Return metadata about the repository.

Parameters:

revision (Optional[str]) – Revision to resolve, defaults to the API default revision

Returns:

Repository metadata

Return type:

RepoInfo

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     commit = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     api.repo_info().head == commit.oid
True
reset_ref(ref_name: str, *, to_revision: str) -> CommitInfo

Reset a branch to another revision.

Parameters:
  • ref_name (str) – Branch name to update

  • to_revision (str) – Revision to resolve as the new head

Returns:

Commit metadata for the target head

Return type:

CommitInfo

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     first = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"v1")],
...         commit_message="seed v1",
...     )
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"v2")],
...         commit_message="seed v2",
...     )
...     _ = api.reset_ref("main", to_revision=first.oid)
...     api.read_bytes("demo.txt")
b'v1'
snapshot_download(*, revision: str | None = None, local_dir: str | PathLike | None = None, allow_patterns: Sequence[str] | str | None = None, ignore_patterns: Sequence[str] | str | None = None) -> str

Materialize a detached snapshot directory for a revision.

Parameters:
  • revision (Optional[str]) – Revision to resolve, defaults to the API default revision

  • local_dir (Optional[Union[str, os.PathLike[str]]]) – Optional external export directory

  • allow_patterns (Optional[Union[Sequence[str], str]]) – Optional allowlist for repo-relative paths

  • ignore_patterns (Optional[Union[Sequence[str], str]]) – Optional denylist for repo-relative paths

Returns:

Filesystem path to the detached snapshot directory

Return type:

str

Raises:

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[
...             CommitOperationAdd("demo.txt", b"hello"),
...             CommitOperationAdd("nested/extra.txt", b"world"),
...         ],
...         commit_message="seed",
...     )
...     snapshot_dir = Path(api.snapshot_download(allow_patterns="nested/*"))
...     Path(snapshot_dir, "nested", "extra.txt").read_bytes()
b'world'
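The allow_patterns and ignore_patterns parameters act as an allowlist and a denylist over repo-relative paths. The sketch below shows one plausible reading using shell-style wildcards from the standard fnmatch module. filter_paths is a hypothetical helper, and the use of fnmatch-style matching is an assumption rather than something this reference states.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Optional, Sequence, Union

Patterns = Optional[Union[Sequence[str], str]]


def filter_paths(
    paths: Iterable[str],
    allow_patterns: Patterns = None,
    ignore_patterns: Patterns = None,
) -> List[str]:
    """Hypothetical filter: keep allowed paths, then drop ignored ones."""
    # A single string is treated as a one-element pattern list.
    if isinstance(allow_patterns, str):
        allow_patterns = [allow_patterns]
    if isinstance(ignore_patterns, str):
        ignore_patterns = [ignore_patterns]
    kept = []
    for path in paths:
        if allow_patterns is not None and not any(
            fnmatchcase(path, pattern) for pattern in allow_patterns
        ):
            continue  # not on the allowlist
        if ignore_patterns is not None and any(
            fnmatchcase(path, pattern) for pattern in ignore_patterns
        ):
            continue  # on the denylist
        kept.append(path)
    return kept


files = ["demo.txt", "model.bin", "nested/extra.txt"]
print(filter_paths(files, allow_patterns="nested/*"))
print(filter_paths(files, ignore_patterns=["*.txt"]))
```

Note that with fnmatch-style matching the `*` wildcard also matches `/`, so `*.txt` excludes `nested/extra.txt` as well as `demo.txt`.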
squash_history(ref_name: str, *, root_revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, run_gc: bool = True, prune_cache: bool = False) -> SquashReport

Rewrite a branch so older history becomes reclaimable.

The selected branch keeps the same visible file contents at its tip, but commits older than the rewritten root become unreachable from that ref. When run_gc is enabled, the method immediately follows the rewrite with a maintenance GC pass so now-unreachable data can be reclaimed.

Parameters:
  • ref_name (str) – Branch name or full branch ref to rewrite

  • root_revision (Optional[str]) – Oldest commit to preserve on the rewritten branch. When omitted, the current branch head is collapsed into a single new root commit.

  • commit_message (Optional[str]) – Optional replacement title for the rewritten root commit

  • commit_description (Optional[str]) – Optional replacement description/body for the rewritten root commit

  • run_gc (bool, optional) – Whether to run gc() immediately after rewriting

  • prune_cache (bool, optional) – Whether the follow-up GC pass should also prune managed caches

Returns:

History-squash report

Return type:

SquashReport

Raises:

Note

The object rewrite, ref update, and optional follow-up GC are implemented by hubvault.repo.backend.RepositoryBackend. The example below stays at the public API level.

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo(large_file_threshold=32)
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("model.bin", b"A" * 64)],
...         commit_message="seed v1",
...     )
...     second = api.create_commit(
...         operations=[CommitOperationAdd("model.bin", b"B" * 64)],
...         commit_message="seed v2",
...     )
...     report = api.squash_history("main", root_revision=second.oid, run_gc=False)
...     (report.rewritten_commit_count, len(api.list_repo_commits()))
(1, 1)
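The effect of root_revision can be pictured on a single-parent history: commits from the chosen root up to the head stay on the branch, and everything older becomes unreachable from it. The toy below models only that partition. squash_linear_history is a hypothetical helper, and unlike a real rewrite it keeps the original commit names instead of producing new commit IDs.

```python
from typing import Dict, List, Optional, Tuple


def squash_linear_history(
    parents: Dict[str, Optional[str]], head: str, root: Optional[str] = None
) -> Tuple[List[str], List[str]]:
    """Hypothetical squash over a single-parent history.

    Returns (kept, dropped), both ordered newest to oldest.
    """
    # Walk from the head back to the first commit.
    chain = []
    current: Optional[str] = head
    while current is not None:
        chain.append(current)
        current = parents[current]
    # Default: collapse everything into the head, which becomes the new root.
    root = head if root is None else root
    if root not in chain:
        raise ValueError(f"{root!r} is not an ancestor of {head!r}")
    cut = chain.index(root) + 1
    return chain[:cut], chain[cut:]


parents = {"initial": None, "v1": "initial", "v2": "v1", "v3": "v2"}
print(squash_linear_history(parents, head="v3", root="v2"))
print(squash_linear_history(parents, head="v3"))
```

The dropped commits are what a later GC pass could reclaim, provided no other branch or tag still reaches them.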
upload_file(*, path_or_fileobj: str | PathLike | bytes | BinaryIO, path_in_repo: str, revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, parent_commit: str | None = None) -> CommitInfo

Upload a single file through the public commit API.

Parameters:
  • path_or_fileobj (Union[str, os.PathLike[str], bytes, BinaryIO]) – File content source

  • path_in_repo (str) – Target repo-relative path

  • revision (Optional[str]) – Target branch name, defaults to the API default revision

  • commit_message (Optional[str]) – Optional commit summary

  • commit_description (Optional[str]) – Optional commit description/body

  • parent_commit (Optional[str]) – Optional optimistic-concurrency parent commit

Returns:

Commit metadata for the created commit

Return type:

CommitInfo

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     commit = api.upload_file(path_or_fileobj=b"hello", path_in_repo="demo.txt")
...     (commit.commit_message, api.read_bytes("demo.txt"))
('Upload demo.txt with hubvault', b'hello')
upload_folder(*, folder_path: str | PathLike, path_in_repo: str | None = None, commit_message: str | None = None, commit_description: str | None = None, revision: str | None = None, parent_commit: str | None = None, allow_patterns: Sequence[str] | str | None = None, ignore_patterns: Sequence[str] | str | None = None, delete_patterns: Sequence[str] | str | None = None) -> CommitInfo

Upload a local folder while preserving its relative layout.

Parameters:
  • folder_path (Union[str, os.PathLike[str]]) – Local folder to upload

  • path_in_repo (Optional[str]) – Optional target directory in the repo

  • commit_message (Optional[str]) – Optional commit summary

  • commit_description (Optional[str]) – Optional commit description/body

  • revision (Optional[str]) – Target branch name, defaults to the API default revision

  • parent_commit (Optional[str]) – Optional optimistic-concurrency parent commit

  • allow_patterns (Optional[Union[Sequence[str], str]]) – Optional allowlist for local relative paths

  • ignore_patterns (Optional[Union[Sequence[str], str]]) – Optional denylist for local relative paths

  • delete_patterns (Optional[Union[Sequence[str], str]]) – Optional patterns matched against files already stored in the repo beneath path_in_repo; matching files are deleted as part of the same commit

Returns:

Commit metadata for the created commit

Return type:

CommitInfo

Note

The low-level staging, publish, and recovery sequence is implemented by hubvault.repo.backend.RepositoryBackend. This API example focuses only on the public workflow.

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     root = Path(tmpdir)
...     repo_dir = root / "repo"
...     source_dir = root / "source"
...     source_dir.mkdir()
...     _ = source_dir.joinpath("config.json").write_text(
...         '{"dtype":"float16"}',
...         encoding="utf-8",
...     )
...     api = HubVaultApi(repo_dir)
...     _ = api.create_repo()
...     commit = api.upload_folder(folder_path=source_dir, path_in_repo="bundle")
...     (commit.commit_message, api.list_repo_files())
('Upload folder using hubvault', ['bundle/config.json'])
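How the local files, path_in_repo, and delete_patterns could combine into one commit can be sketched as a planning step. plan_folder_upload is a hypothetical helper. The fnmatch-style matching and the rule that a file about to be re-uploaded is not deleted are assumptions modeled on the behavior huggingface_hub documents for its own upload_folder, not confirmed hubvault behavior.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence, Tuple


def plan_folder_upload(
    local_files: Iterable[str],
    repo_files: Iterable[str],
    path_in_repo: str,
    delete_patterns: Sequence[str] = (),
) -> Tuple[List[str], List[str]]:
    """Hypothetical planner returning (paths_to_add, paths_to_delete)."""
    prefix = f"{path_in_repo.strip('/')}/" if path_in_repo else ""
    # Local relative paths keep their layout beneath path_in_repo.
    to_add = sorted(prefix + relative for relative in local_files)
    to_delete = []
    for repo_path in sorted(repo_files):
        if not repo_path.startswith(prefix):
            continue  # only files beneath path_in_repo are candidates
        relative = repo_path[len(prefix):]
        matches = any(fnmatchcase(relative, pattern) for pattern in delete_patterns)
        # Assumption: a file that is about to be re-uploaded is not deleted.
        if matches and repo_path not in to_add:
            to_delete.append(repo_path)
    return to_add, to_delete


adds, deletes = plan_folder_upload(
    local_files=["config.json"],
    repo_files=["bundle/config.json", "bundle/old.bin", "keep.bin"],
    path_in_repo="bundle",
    delete_patterns=["*.bin", "*.json"],
)
print(adds, deletes)
```

In this sketch `keep.bin` survives because it lies outside path_in_repo, and `bundle/config.json` survives because the upload replaces it.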
upload_large_folder(*, folder_path: str | PathLike, revision: str | None = None, allow_patterns: Sequence[str] | str | None = None, ignore_patterns: Sequence[str] | str | None = None) -> CommitInfo

Upload a large folder through a single atomic local commit.

The method name follows huggingface_hub.HfApi.upload_large_folder(). Unlike the remote API, the local backend keeps the whole operation atomic and therefore returns a single CommitInfo.

Parameters:
  • folder_path (Union[str, os.PathLike[str]]) – Local folder to upload

  • revision (Optional[str]) – Target branch name, defaults to the API default revision

  • allow_patterns (Optional[Union[Sequence[str], str]]) – Optional allowlist for local relative paths

  • ignore_patterns (Optional[Union[Sequence[str], str]]) – Optional denylist for local relative paths

Returns:

Commit metadata for the created commit

Return type:

CommitInfo

Raises:

ValueError – Raised when folder_path is not a local directory.

Note

The underlying chunk planning and atomic publish logic lives in hubvault.repo.backend.RepositoryBackend. The public example below shows the API-level behavior only.

Example:

>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     root = Path(tmpdir)
...     repo_dir = root / "repo"
...     source_dir = root / "source"
...     source_dir.mkdir()
...     _ = source_dir.joinpath("model.bin").write_bytes(b"A" * 64)
...     api = HubVaultApi(repo_dir)
...     _ = api.create_repo(large_file_threshold=32)
...     commit = api.upload_large_folder(folder_path=source_dir)
...     (commit.commit_message, api.read_range("model.bin", start=0, length=4))
('Upload large folder using hubvault', b'AAAA')