hubvault.api
Public repository API for the hubvault package.
This module exposes HubVaultApi, a local embedded repository interface
with method names intentionally aligned with the broad calling style of
huggingface_hub where it makes sense for an on-disk repository.
The module contains:
HubVaultApi - Public entry point for local embedded repositories
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     repo_dir = Path(tmpdir) / "repo"
...     api = HubVaultApi(repo_dir)
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     api.read_bytes("demo.txt")
b'hello'
HubVaultApi
- class hubvault.api.HubVaultApi(repo_path: str | PathLike, revision: str = 'main')[source]
Public entry point for a local hubvault repository.
- Parameters:
repo_path (Union[str, os.PathLike[str]]) – Filesystem path to the local repository root
revision (str) – Default revision used by read APIs
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         revision="main",
...         operations=[
...             CommitOperationAdd("example.txt", b"hello"),
...         ],
...         commit_message="add example",
...     )
...     api.list_repo_files()
['example.txt']
- __init__(repo_path: str | PathLike, revision: str = 'main') None[source]
Initialize the public API wrapper.
- Parameters:
repo_path (Union[str, os.PathLike[str]]) – Filesystem path to the local repository root
revision (str, optional) – Default revision used by read APIs, defaults to "main"
- Returns:
None.
- Return type:
None
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo", revision="main")
...     api.create_repo().default_branch
'main'
- create_branch(*, branch: str, revision: str | None = None, exist_ok: bool = False) None[source]
Create a branch from an existing revision.
- Parameters:
branch (str) – Branch name to create
revision (Optional[str]) – Starting revision, defaults to the API default revision
exist_ok (bool, optional) – Whether an existing branch may be reused
- Returns:
None.
- Return type:
None
- Raises:
hubvault.errors.ConflictError – Raised when the branch already exists and exist_ok is False.
hubvault.errors.RevisionNotFoundError – Raised when the selected revision cannot be resolved.
hubvault.errors.UnsupportedPathError – Raised when branch is invalid.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     api.create_branch(branch="dev")
...     [ref.name for ref in api.list_repo_refs().branches]
['dev', 'main']
- create_commit(operations: Sequence[object] = (), *, commit_message: str, commit_description: str | None = None, revision: str | None = None, parent_commit: str | None = None) CommitInfo[source]
Create a new commit on a branch.
- Parameters:
operations (Sequence[object]) – Commit operations to apply
commit_message (str) – Commit summary/title. When commit_description is omitted, embedded body text after a blank line is preserved and split the same way Git and HF commit listings interpret commit text.
commit_description (Optional[str]) – Optional commit description/body
revision (Optional[str]) – Target branch name, defaults to the API default revision
parent_commit (Optional[str]) – Expected parent commit for optimistic concurrency. When omitted, the commit is applied against the current branch head.
- Returns:
Metadata for the created commit
- Return type:
CommitInfo
- Raises:
hubvault.errors.ConflictError – Raised when the operation set is empty, unsupported, or optimistic concurrency checks fail.
hubvault.errors.EntryNotFoundError – Raised when delete/copy operations refer to missing paths.
hubvault.errors.RevisionNotFoundError – Raised when the target revision cannot be resolved.
hubvault.errors.UnsupportedPathError – Raised when revision or path inputs are invalid.
ValueError – Raised when commit_message is empty.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     commit = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     (commit.commit_message, api.read_bytes("demo.txt"))
('seed', b'hello')
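The parent_commit check described above behaves like a compare-and-swap on the branch head. The following is a minimal sketch of that idea, not hubvault's implementation: BranchRef and StaleParentError are hypothetical names, and the real method raises hubvault.errors.ConflictError.

```python
class StaleParentError(Exception):
    """Hypothetical stand-in for a conflict error."""


class BranchRef:
    """Toy in-memory branch head; not hubvault's storage model."""

    def __init__(self, head):
        self.head = head

    def commit(self, new_commit_id, parent_commit=None):
        # When parent_commit is omitted, the commit applies to the current head.
        expected = self.head if parent_commit is None else parent_commit
        if expected != self.head:
            raise StaleParentError(
                "expected %r but branch is at %r" % (expected, self.head)
            )
        self.head = new_commit_id
        return new_commit_id


ref = BranchRef("c1")
ref.commit("c2", parent_commit="c1")  # head matches, so the commit lands

try:
    ref.commit("c3", parent_commit="c1")  # head has moved to c2 in the meantime
    stale_rejected = False
except StaleParentError:
    stale_rejected = True
```

A caller that hits the conflict would typically re-read the branch head, rebuild its operations, and retry.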
- create_repo(*, default_branch: str = 'main', exist_ok: bool = False, large_file_threshold: int = 16777216) RepoInfo[source]
Create a local repository.
The repository layout is bootstrapped under repo_path and the default branch immediately receives an empty Initial commit so the repo starts with a valid history root.
- Parameters:
default_branch (str) – Default branch name, defaults to "main"
exist_ok (bool) – Whether an existing repository may be reused
large_file_threshold (int) – File size threshold in bytes at or above which newly added files switch to chunked storage, defaults to hubvault.repo.LARGE_FILE_THRESHOLD
- Returns:
Information about the created repository
- Return type:
RepoInfo
- Raises:
hubvault.errors.RepositoryAlreadyExistsError – Raised when the target path already contains a repository or non-empty directory.
hubvault.errors.UnsupportedPathError – Raised when the default branch name is invalid.
ValueError – Raised when large_file_threshold is not positive.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     info = api.create_repo(default_branch="main")
...     (info.default_branch, info.head is not None)
('main', True)
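The large_file_threshold rule can be read as a simple size comparison. The sketch below only illustrates the documented "at or above" boundary and the default of 16777216 bytes (16 MiB); storage_mode is a hypothetical helper, and the labels it returns are illustrative rather than values exposed by hubvault.

```python
DEFAULT_THRESHOLD = 16 * 1024 * 1024  # 16777216 bytes, the documented default


def storage_mode(size_in_bytes, threshold=DEFAULT_THRESHOLD):
    """Return an illustrative label for how a new file would be stored."""
    if threshold <= 0:
        # Mirrors the documented ValueError for non-positive thresholds.
        raise ValueError("large_file_threshold must be positive")
    # Files at or above the threshold switch to chunked storage.
    return "chunked" if size_in_bytes >= threshold else "whole-blob"
```

With large_file_threshold=32, as used in several examples in this reference, a 64-byte file would fall on the chunked side of the boundary.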
- create_tag(*, tag: str, tag_message: str | None = None, revision: str | None = None, exist_ok: bool = False) None[source]
Create a lightweight tag from an existing revision.
- Parameters:
tag (str) – Tag name to create
tag_message (Optional[str]) – Optional tag message recorded in the reflog
revision (Optional[str]) – Starting revision, defaults to the API default revision
exist_ok (bool, optional) – Whether an existing tag may be reused
- Returns:
None.
- Return type:
None
- Raises:
hubvault.errors.ConflictError – Raised when the tag already exists and exist_ok is False.
hubvault.errors.RevisionNotFoundError – Raised when the selected revision cannot be resolved to a commit.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     api.create_tag(tag="v1")
...     [ref.name for ref in api.list_repo_refs().tags]
['v1']
- delete_branch(*, branch: str) None[source]
Delete a branch from the repository.
- Parameters:
branch (str) – Branch name to delete
- Returns:
None.
- Return type:
None
- Raises:
hubvault.errors.ConflictError – Raised when attempting to delete the default branch.
hubvault.errors.RevisionNotFoundError – Raised when the branch does not exist.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     api.create_branch(branch="dev")
...     api.delete_branch(branch="dev")
...     [ref.name for ref in api.list_repo_refs().branches]
['main']
- delete_file(path_in_repo: str, *, revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, parent_commit: str | None = None) CommitInfo[source]
Delete a single file through the public commit API.
- Parameters:
path_in_repo (str) – Repo-relative file path
revision (Optional[str]) – Target branch name, defaults to the API default revision
commit_message (Optional[str]) – Optional commit summary
commit_description (Optional[str]) – Optional commit description/body
parent_commit (Optional[str]) – Optional optimistic-concurrency parent commit
- Returns:
Commit metadata for the created commit
- Return type:
CommitInfo
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[
...             CommitOperationAdd("keep.txt", b"keep"),
...             CommitOperationAdd("remove.txt", b"gone"),
...         ],
...         commit_message="seed",
...     )
...     commit = api.delete_file("remove.txt")
...     (commit.commit_message, api.list_repo_files())
('Delete remove.txt with hubvault', ['keep.txt'])
- delete_folder(path_in_repo: str, *, revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, parent_commit: str | None = None) CommitInfo[source]
Delete a folder subtree through the public commit API.
- Parameters:
path_in_repo (str) – Repo-relative folder path
revision (Optional[str]) – Target branch name, defaults to the API default revision
commit_message (Optional[str]) – Optional commit summary
commit_description (Optional[str]) – Optional commit description/body
parent_commit (Optional[str]) – Optional optimistic-concurrency parent commit
- Returns:
Commit metadata for the created commit
- Return type:
CommitInfo
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[
...             CommitOperationAdd("bundle/model.bin", b"data"),
...             CommitOperationAdd("keep.txt", b"keep"),
...         ],
...         commit_message="seed",
...     )
...     commit = api.delete_folder("bundle")
...     (commit.commit_message, api.list_repo_files())
('Delete folder bundle with hubvault', ['keep.txt'])
- delete_tag(*, tag: str) None[source]
Delete a tag from the repository.
- Parameters:
tag (str) – Tag name to delete
- Returns:
None.
- Return type:
None
- Raises:
hubvault.errors.RevisionNotFoundError – Raised when the tag does not exist.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     api.create_tag(tag="v1")
...     api.delete_tag(tag="v1")
...     api.list_repo_refs().tags
[]
- full_verify() VerifyReport[source]
Perform a complete repository verification pass.
Unlike quick_verify(), this method validates all live commit, tree, file, blob, chunk, pack, and manifest relationships reachable from the current refs and also scans the published storage layout for malformed persisted objects.
- Returns:
Verification result
- Return type:
VerifyReport
- Raises:
hubvault.errors.RepositoryNotFoundError – Raised when the repository root does not contain a valid hubvault repository.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo(large_file_threshold=32)
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("model.bin", b"A" * 64)],
...         commit_message="seed",
...     )
...     report = api.full_verify()
...     (report.ok, report.errors)
(True, [])
- gc(*, dry_run: bool = False, prune_cache: bool = True) GcReport[source]
Reclaim unreachable repository data and rebuild detachable caches.
The local GC pass keeps all currently reachable refs intact, rewrites chunk storage into a compact live pack/index view, and optionally removes rebuildable detached caches under cache/.
- Parameters:
dry_run (bool, optional) – Whether to compute the result without mutating storage
prune_cache (bool, optional) – Whether rebuildable managed caches should also be removed
- Returns:
Garbage-collection report
- Return type:
GcReport
- Raises:
hubvault.errors.RepositoryNotFoundError – Raised when the repository root does not contain a valid hubvault repository.
hubvault.errors.IntegrityError – Raised when persisted storage is inconsistent and cannot be reclaimed safely.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo(large_file_threshold=32)
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("model.bin", b"A" * 64)],
...         commit_message="seed",
...     )
...     _ = api.hf_hub_download("model.bin")
...     report = api.gc(dry_run=True, prune_cache=True)
...     (report.dry_run, report.reclaimed_cache_size > 0)
(True, True)
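"Unreachable" data in the description above can be understood as anything not found by walking parent links from the current refs. The sketch below shows that reachability walk on a toy commit graph. It is an assumption-level illustration, not hubvault's GC: reachable_commits and the dict-based graph are hypothetical, and the real pass also covers trees, blobs, chunks, and packs.

```python
def reachable_commits(parents, ref_heads):
    """Collect every commit reachable from the given ref heads."""
    seen = set()
    stack = list(ref_heads)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        # Follow parent links; unknown commits are treated as roots.
        stack.extend(parents.get(commit, ()))
    return seen


# Toy history: c1 <- c2 <- c3 is on a branch, x1 is left over from a reset.
parents = {"c3": ["c2"], "c2": ["c1"], "c1": [], "x1": ["c1"]}
live = reachable_commits(parents, ["c3"])
unreachable = set(parents) - live
```

Only the commits in unreachable would be candidates for reclamation; everything in live stays intact.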
- get_paths_info(paths: Sequence[str] | str, *, revision: str | None = None) List[RepoFile | RepoFolder][source]
Return public metadata for selected paths.
- Parameters:
paths (Union[Sequence[str], str]) – Repo-relative path or paths to inspect
revision (Optional[str]) – Revision to resolve, defaults to the API default revision
- Returns:
Metadata for the existing requested paths
- Return type:
List[Union[RepoFile, RepoFolder]]
- Raises:
hubvault.errors.RevisionNotFoundError – Raised when the revision cannot be resolved.
hubvault.errors.UnsupportedPathError – Raised when a requested path is invalid.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[
...             CommitOperationAdd("demo.txt", b"hello"),
...             CommitOperationAdd("nested/config.json", b"{}"),
...         ],
...         commit_message="seed",
...     )
...     sorted(item.path for item in api.get_paths_info(["demo.txt", "nested", "missing.txt"]))
['demo.txt', 'nested']
- get_storage_overview() StorageOverview[source]
Analyze repository disk usage and safe reclamation options.
The returned model separates space that is immediately reclaimable via gc(), space held only for detached caches, and space retained for rollback/history that would require an explicit rewrite such as squash_history().
- Returns:
Repository storage analysis report
- Return type:
StorageOverview
- Raises:
hubvault.errors.RepositoryNotFoundError – Raised when the repository root does not contain a valid hubvault repository.
hubvault.errors.IntegrityError – Raised when persisted storage is too inconsistent to analyze safely.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo(large_file_threshold=32)
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("model.bin", b"A" * 64)],
...         commit_message="seed",
...     )
...     _ = api.hf_hub_download("model.bin")
...     overview = api.get_storage_overview()
...     (overview.total_size > 0, overview.reclaimable_cache_size > 0)
(True, True)
- hf_hub_download(filename: str, *, revision: str | None = None, local_dir: str | PathLike | None = None) str[source]
Materialize a detached user-view path for a file.
- Parameters:
filename (str) – Repo-relative file path
revision (Optional[str]) – Revision to resolve, defaults to the API default revision
local_dir (Optional[Union[str, os.PathLike[str]]]) – Optional export directory outside the repository
- Returns:
A filesystem path that can be read safely without mutating repo truth
- Return type:
str
- Raises:
hubvault.errors.EntryNotFoundError – Raised when the requested file does not exist in the selected revision.
hubvault.errors.RevisionNotFoundError – Raised when the revision cannot be resolved.
hubvault.errors.UnsupportedPathError – Raised when filename is invalid.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("nested/demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     download_path = Path(api.hf_hub_download("nested/demo.txt"))
...     (download_path.parts[-2:], download_path.read_bytes())
(('nested', 'demo.txt'), b'hello')
- list_repo_commits(*, revision: str | None = None, formatted: bool = False) Sequence[GitCommitInfo][source]
List commits reachable from a revision in HF-style order.
The local repository keeps the public method name and the meaningful parameters from huggingface_hub.HfApi.list_repo_commits while intentionally dropping remote-only parameters such as repo_id, repo_type, and token.
- Parameters:
revision (Optional[str]) – Revision to resolve, defaults to the API default revision
formatted (bool, optional) – Whether HTML-formatted title/message fields should be populated
- Returns:
Commit entries ordered from newest to oldest
- Return type:
Sequence[GitCommitInfo]
- Raises:
hubvault.errors.RepositoryNotFoundError – Raised when the repository root does not contain a valid hubvault repository.
hubvault.errors.RevisionNotFoundError – Raised when the revision cannot be resolved.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed\n\nbody",
...     )
...     [(item.title, item.message) for item in api.list_repo_commits()]
[('seed', 'body')]
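The title/message pair in the example above comes from splitting the commit text at the first blank line. A minimal sketch of that convention follows; split_commit_text is a hypothetical helper written for illustration, and hubvault's own parsing may handle edge cases differently.

```python
def split_commit_text(text):
    """Split commit text into (title, body) at the first blank line."""
    title, separator, body = text.partition("\n\n")
    if not separator:
        # No blank line: the whole text is the title and the body is empty.
        return text.strip(), ""
    return title.strip(), body.strip()


title, message = split_commit_text("seed\n\nbody")
```

Under this convention, passing commit_message="seed\n\nbody" without a commit_description yields the same title and body as passing the two parts separately.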
- list_repo_files(*, revision: str | None = None) Sequence[str][source]
List all file paths in a revision.
- Parameters:
revision (Optional[str]) – Revision to resolve, defaults to the API default revision
- Returns:
Sorted repo-relative file paths
- Return type:
Sequence[str]
- Raises:
hubvault.errors.RevisionNotFoundError – Raised when the revision cannot be resolved.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[
...             CommitOperationAdd("demo.txt", b"hello"),
...             CommitOperationAdd("nested/config.json", b"{}"),
...         ],
...         commit_message="seed",
...     )
...     api.list_repo_files()
['demo.txt', 'nested/config.json']
- list_repo_reflog(ref_name: str, *, limit: int | None = None) Sequence[ReflogEntry][source]
List reflog entries for a branch or tag.
This is a local repository extension intended for audit and recovery workflows.
- Parameters:
ref_name (str) – Full ref name or an unambiguous short ref name
limit (Optional[int]) – Optional maximum number of newest entries to return
- Returns:
Reflog entries ordered from newest to oldest
- Return type:
Sequence[ReflogEntry]
- Raises:
hubvault.errors.ConflictError – Raised when a short ref name is ambiguous across branches and tags.
hubvault.errors.RevisionNotFoundError – Raised when the ref or reflog does not exist.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     [entry.message for entry in api.list_repo_reflog("main")]
['seed']
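The ref_name rule (a full ref name, or a short name that is unambiguous across branches and tags) can be sketched as follows. This is an illustration under assumptions: resolve_ref_name and AmbiguousRefError are hypothetical, and the Git-style refs/heads/ and refs/tags/ prefixes are assumed rather than confirmed by this reference.

```python
class AmbiguousRefError(Exception):
    """Hypothetical stand-in for the documented ConflictError."""


def resolve_ref_name(ref_name, branches, tags):
    """Expand a short ref name to a full one, rejecting ambiguous names."""
    if ref_name.startswith("refs/"):
        return ref_name  # already a full ref name
    candidates = []
    if ref_name in branches:
        candidates.append("refs/heads/" + ref_name)  # assumed branch prefix
    if ref_name in tags:
        candidates.append("refs/tags/" + ref_name)  # assumed tag prefix
    if len(candidates) > 1:
        raise AmbiguousRefError(ref_name)
    if not candidates:
        raise LookupError(ref_name)
    return candidates[0]
```

When a branch and a tag share a short name, passing the full ref name is the way to select one of them explicitly.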
- list_repo_refs(*, include_pull_requests: bool = False) GitRefs[source]
List visible branch and tag refs in HF-style form.
- Parameters:
include_pull_requests (bool, optional) – Whether pull-request refs should be included. The local repository returns [] when requested and None otherwise.
- Returns:
Visible repository refs
- Return type:
GitRefs
- Raises:
hubvault.errors.RepositoryNotFoundError – Raised when the repository root does not contain a valid hubvault repository.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     api.create_branch(branch="dev")
...     api.create_tag(tag="v1")
...     refs = api.list_repo_refs()
...     ([ref.name for ref in refs.branches], [ref.name for ref in refs.tags])
(['dev', 'main'], ['v1'])
- list_repo_tree(path_in_repo: str | None = None, *, recursive: bool = False, revision: str | None = None) List[RepoFile | RepoFolder][source]
List direct children under a repository directory.
- Parameters:
path_in_repo (Optional[str]) – Repo-relative directory path, defaults to the root
recursive (bool, optional) – Whether to include descendant entries recursively
revision (Optional[str]) – Revision to resolve, defaults to the API default revision
- Returns:
Direct child path metadata
- Return type:
List[Union[RepoFile, RepoFolder]]
- Raises:
hubvault.errors.EntryNotFoundError – Raised when the directory is missing from the selected revision.
hubvault.errors.RevisionNotFoundError – Raised when the revision cannot be resolved.
hubvault.errors.UnsupportedPathError – Raised when path_in_repo refers to a file or is invalid.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[
...             CommitOperationAdd("demo.txt", b"hello"),
...             CommitOperationAdd("nested/config.json", b"{}"),
...         ],
...         commit_message="seed",
...     )
...     [item.path for item in api.list_repo_tree()]
['demo.txt', 'nested']
- merge(source_revision: str, *, target_revision: str | None = None, parent_commit: str | None = None, commit_message: str | None = None, commit_description: str | None = None) MergeResult[source]
Merge a source revision into a target branch.
This local API has no direct one-to-one counterpart in huggingface_hub. It follows familiar Git merge semantics instead: conflicts are returned as structured data, while successful merges may resolve as "merged", "fast-forward", or "already-up-to-date".
- Parameters:
source_revision (str) – Source branch, tag, or commit ID to merge from
target_revision (Optional[str]) – Target branch name or full branch ref, defaults to the API default revision
parent_commit (Optional[str]) – Optional expected current head for optimistic concurrency on the target branch
commit_message (Optional[str]) – Optional merge-commit title. When omitted, a default merge title is generated.
commit_description (Optional[str]) – Optional merge-commit body
- Returns:
Structured merge result
- Return type:
MergeResult
- Raises:
hubvault.errors.ConflictError – Raised when parent_commit does not match the current target head.
hubvault.errors.RevisionNotFoundError – Raised when the source revision or target branch does not exist.
hubvault.errors.UnsupportedPathError – Raised when the target revision is not a valid branch ref.
ValueError – Raised when commit_message is explicitly empty.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"base")],
...         commit_message="seed",
...     )
...     api.create_branch(branch="feature")
...     _ = api.create_commit(
...         revision="feature",
...         operations=[CommitOperationAdd("feature.txt", b"hello")],
...         commit_message="feature work",
...     )
...     result = api.merge("feature")
...     (result.status, api.read_bytes("feature.txt"))
('fast-forward', b'hello')
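The three success statuses follow the usual Git-style ancestry rules. The sketch below classifies a merge on a toy commit graph under that assumption; is_ancestor and classify_merge are hypothetical helpers, and the "merged" branch assumes the three-way merge finds no conflicts, which this sketch does not model.

```python
def is_ancestor(parents, ancestor, descendant):
    """Return True if `ancestor` is reachable from `descendant` (or equal)."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return False


def classify_merge(parents, source, target):
    """Classify a conflict-free merge of `source` into `target`."""
    if is_ancestor(parents, source, target):
        return "already-up-to-date"  # target already contains source
    if is_ancestor(parents, target, source):
        return "fast-forward"  # target can simply move to source
    return "merged"  # histories diverged; a merge commit is needed


# Toy graph: "feature" and "hotfix" both branch off "seed".
parents = {"seed": [], "feature": ["seed"], "hotfix": ["seed"]}
```

In the example above, main still points at the seed commit when feature is merged, which is why the result is a fast-forward rather than a merge commit.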
- open_file(path_in_repo: str, *, revision: str | None = None) BinaryIO[source]
Open a file as a read-only binary stream.
- Parameters:
path_in_repo (str) – Repo-relative file path
revision (Optional[str]) – Revision to resolve, defaults to the API default revision
- Returns:
Read-only binary stream
- Return type:
BinaryIO
- Raises:
hubvault.errors.EntryNotFoundError – Raised when the file is not present in the selected revision.
hubvault.errors.RevisionNotFoundError – Raised when the revision cannot be resolved.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     with api.open_file("demo.txt") as fileobj:
...         fileobj.read()
b'hello'
- quick_verify() VerifyReport[source]
Perform a minimal repository verification pass.
- Returns:
Verification result
- Return type:
VerifyReport
- Raises:
hubvault.errors.RepositoryNotFoundError – Raised when the repository root does not contain a valid hubvault repository.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     report = api.quick_verify()
...     (report.ok, report.errors)
(True, [])
- read_bytes(path_in_repo: str, *, revision: str | None = None) bytes[source]
Read the full content of a file.
- Parameters:
path_in_repo (str) – Repo-relative file path
revision (Optional[str]) – Revision to resolve, defaults to the API default revision
- Returns:
File content bytes
- Return type:
bytes
- Raises:
hubvault.errors.IntegrityError – Raised when stored blob content fails validation checks.
hubvault.errors.EntryNotFoundError – Raised when the file is not present in the selected revision.
hubvault.errors.RevisionNotFoundError – Raised when the revision cannot be resolved.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     api.read_bytes("demo.txt")
b'hello'
- read_range(path_in_repo: str, *, start: int, length: int, revision: str | None = None) bytes[source]
Read a byte range from a file.
For whole-blob files the local backend slices the materialized file bytes. For chunked files it resolves only the overlapping chunks and avoids reconstructing unrelated file regions.
- Parameters:
path_in_repo (str) – Repo-relative file path
start (int) – Starting byte offset in the logical file
length (int) – Number of bytes to read
revision (Optional[str]) – Revision to resolve, defaults to the API default revision
- Returns:
Requested byte range, clamped to the file end
- Return type:
bytes
- Raises:
hubvault.errors.IntegrityError – Raised when chunk or blob storage fails verification checks.
hubvault.errors.EntryNotFoundError – Raised when the file is not present in the selected revision.
hubvault.errors.RevisionNotFoundError – Raised when the revision cannot be resolved.
ValueError – Raised when start or length is negative.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     api.read_range("demo.txt", start=1, length=3)
b'ell'
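For chunked files, the description above says only the overlapping chunks are resolved and the result is clamped to the file end. The following sketch shows one way such a range read can work over an in-memory list of chunks. It is an assumption-level illustration: read_range_from_chunks is a hypothetical helper, and hubvault's chunk index and on-disk layout are not modeled.

```python
def read_range_from_chunks(chunks, start, length):
    """Return (data, touched_chunk_indexes) for a clamped byte-range read."""
    if start < 0 or length < 0:
        raise ValueError("start and length must be non-negative")
    if length == 0:
        return b"", []
    end = start + length
    data = bytearray()
    touched = []
    offset = 0  # logical file offset of the current chunk
    for index, chunk in enumerate(chunks):
        chunk_start, chunk_end = offset, offset + len(chunk)
        offset = chunk_end
        if chunk_end <= start:
            continue  # chunk lies entirely before the requested range
        if chunk_start >= end:
            break  # chunk lies entirely after the requested range
        lo = max(start, chunk_start) - chunk_start
        hi = min(end, chunk_end) - chunk_start
        data += chunk[lo:hi]
        touched.append(index)
    return bytes(data), touched


chunks = [b"hell", b"o wo", b"rld"]  # logical content: b"hello world"
```

A request that runs past the end of the file simply returns fewer bytes, matching the documented clamping behaviour.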
- repo_info(*, revision: str | None = None) RepoInfo[source]
Return metadata about the repository.
- Parameters:
revision (Optional[str]) – Revision to resolve, defaults to the API default revision
- Returns:
Repository metadata
- Return type:
RepoInfo
- Raises:
hubvault.errors.RepositoryNotFoundError – Raised when the repository root does not contain a valid hubvault repository.
hubvault.errors.RevisionNotFoundError – Raised when the selected revision does not exist.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     commit = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"hello")],
...         commit_message="seed",
...     )
...     api.repo_info().head == commit.oid
True
- reset_ref(ref_name: str, *, to_revision: str) CommitInfo[source]
Reset a branch to another revision.
- Parameters:
ref_name (str) – Branch name to update
to_revision (str) – Revision to resolve as the new head
- Returns:
Commit metadata for the target head
- Return type:
CommitInfo
- Raises:
hubvault.errors.RevisionNotFoundError – Raised when the target revision or branch cannot be resolved.
hubvault.errors.UnsupportedPathError – Raised when the branch name is invalid.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     first = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"v1")],
...         commit_message="seed v1",
...     )
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("demo.txt", b"v2")],
...         commit_message="seed v2",
...     )
...     _ = api.reset_ref("main", to_revision=first.oid)
...     api.read_bytes("demo.txt")
b'v1'
- snapshot_download(*, revision: str | None = None, local_dir: str | PathLike | None = None, allow_patterns: Sequence[str] | str | None = None, ignore_patterns: Sequence[str] | str | None = None) str[source]
Materialize a detached snapshot directory for a revision.
- Parameters:
revision (Optional[str]) – Revision to resolve, defaults to the API default revision
local_dir (Optional[Union[str, os.PathLike[str]]]) – Optional external export directory
allow_patterns (Optional[Union[Sequence[str], str]]) – Optional allowlist for repo-relative paths
ignore_patterns (Optional[Union[Sequence[str], str]]) – Optional denylist for repo-relative paths
- Returns:
Filesystem path to the detached snapshot directory
- Return type:
str
- Raises:
hubvault.errors.RevisionNotFoundError – Raised when the revision cannot be resolved.
hubvault.errors.UnsupportedPathError – Raised when local_dir points into the repository root.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd, HubVaultApi
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     _ = api.create_commit(
...         operations=[
...             CommitOperationAdd("demo.txt", b"hello"),
...             CommitOperationAdd("nested/extra.txt", b"world"),
...         ],
...         commit_message="seed",
...     )
...     snapshot_dir = Path(api.snapshot_download(allow_patterns="nested/*"))
...     Path(snapshot_dir, "nested", "extra.txt").read_bytes()
b'world'
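The allowlist/denylist behaviour of allow_patterns and ignore_patterns can be sketched with shell-style glob matching. The use of fnmatch-style semantics here is an assumption made for illustration, and filter_paths is a hypothetical helper rather than part of the hubvault API.

```python
from fnmatch import fnmatchcase


def filter_paths(paths, allow_patterns=None, ignore_patterns=None):
    """Keep paths matching the allowlist (if any) and not the denylist."""
    # Accept a single pattern string as well as a sequence of patterns.
    if isinstance(allow_patterns, str):
        allow_patterns = [allow_patterns]
    if isinstance(ignore_patterns, str):
        ignore_patterns = [ignore_patterns]
    selected = []
    for path in paths:
        if allow_patterns is not None and not any(
            fnmatchcase(path, pattern) for pattern in allow_patterns
        ):
            continue  # not covered by the allowlist
        if ignore_patterns is not None and any(
            fnmatchcase(path, pattern) for pattern in ignore_patterns
        ):
            continue  # explicitly excluded by the denylist
        selected.append(path)
    return selected


repo_files = ["demo.txt", "nested/extra.txt"]
```

In this sketch the denylist is applied after the allowlist, so a path matched by both is excluded.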
- squash_history(ref_name: str, *, root_revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, run_gc: bool = True, prune_cache: bool = False) SquashReport[source]
Rewrite a branch so older history becomes reclaimable.
The selected branch keeps the same visible file contents at its tip, but commits older than the rewritten root become unreachable from that ref. When run_gc is enabled, the method immediately follows the rewrite with a maintenance GC pass so now-unreachable data can be reclaimed.
- Parameters:
ref_name (str) – Branch name or full branch ref to rewrite
root_revision (Optional[str]) – Oldest commit to preserve on the rewritten branch. When omitted, the current branch head is collapsed into a single new root commit.
commit_message (Optional[str]) – Optional replacement title for the rewritten root commit
commit_description (Optional[str]) – Optional replacement description/body for the rewritten root commit
run_gc (bool, optional) – Whether to run gc() immediately after rewriting
prune_cache (bool, optional) – Whether the follow-up GC pass should also prune managed caches
- Returns:
History-squash report
- Return type:
SquashReport
- Raises:
hubvault.errors.ConflictError – Raised when root_revision is not an ancestor of the selected branch head.
hubvault.errors.RevisionNotFoundError – Raised when the branch or selected revision does not exist.
hubvault.errors.UnsupportedPathError – Raised when ref_name is not a valid branch name.
Note
The object rewrite, ref update, and optional follow-up GC are implemented by hubvault.repo.backend.RepositoryBackend. The example below stays at the public API level.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> from hubvault import CommitOperationAdd
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo(large_file_threshold=32)
...     _ = api.create_commit(
...         operations=[CommitOperationAdd("model.bin", b"A" * 64)],
...         commit_message="seed v1",
...     )
...     second = api.create_commit(
...         operations=[CommitOperationAdd("model.bin", b"B" * 64)],
...         commit_message="seed v2",
...     )
...     report = api.squash_history("main", root_revision=second.oid, run_gc=False)
...     (report.rewritten_commit_count, len(api.list_repo_commits()))
(1, 1)
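To make the reachability argument behind squash_history concrete, here is a toy model of a branch rewrite. Commits are plain dataclasses linked by a parent id. The names Commit, history, and squash are hypothetical and say nothing about hubvault's object format; in particular, giving rewritten commits fresh ids is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Commit:
    oid: str
    parent: Optional[str]
    tree: str  # stand-in for the file contents visible at this commit


def history(store: Dict[str, Commit], head: str) -> List[str]:
    """Walk parent links from head and return the reachable commit ids."""
    reachable = []
    current: Optional[str] = head
    while current is not None:
        reachable.append(current)
        current = store[current].parent
    return reachable


def squash(store: Dict[str, Commit], head: str, root: str) -> str:
    """Copy the chain root..head so the copy of root has no parent; return the new head id."""
    chain = history(store, head)  # head first, oldest commit last
    if root not in chain:
        raise ValueError("root is not an ancestor of head")
    keep = chain[: chain.index(root) + 1]
    new_parent: Optional[str] = None
    for oid in reversed(keep):  # rewrite from the new root up to the head
        new_oid = oid + "'"
        store[new_oid] = Commit(new_oid, new_parent, store[oid].tree)
        new_parent = new_oid
    assert new_parent is not None  # keep always contains at least the root
    return new_parent


store = {
    "c1": Commit("c1", None, "v1"),
    "c2": Commit("c2", "c1", "v2"),
    "c3": Commit("c3", "c2", "v3"),
}
new_head = squash(store, head="c3", root="c2")
reachable = history(store, new_head)
unreachable = sorted(set(store) - set(reachable))
```

After the rewrite the branch tip still shows tree "v3", while the original commits are no longer reachable from the new head. That unreachable set is what a garbage-collection pass could then reclaim.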
- upload_file(*, path_or_fileobj: str | PathLike | bytes | BinaryIO, path_in_repo: str, revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, parent_commit: str | None = None) CommitInfo[source]
Upload a single file through the public commit API.
- Parameters:
path_or_fileobj (Union[str, os.PathLike[str], bytes, BinaryIO]) – File content source
path_in_repo (str) – Target repo-relative path
revision (Optional[str]) – Target branch name, defaults to the API default revision
commit_message (Optional[str]) – Optional commit summary
commit_description (Optional[str]) – Optional commit description/body
parent_commit (Optional[str]) – Optional optimistic-concurrency parent commit
- Returns:
Commit metadata for the created commit
- Return type:
CommitInfo
Example:
>>> import tempfile
>>> from pathlib import Path
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     api = HubVaultApi(Path(tmpdir) / "repo")
...     _ = api.create_repo()
...     commit = api.upload_file(path_or_fileobj=b"hello", path_in_repo="demo.txt")
...     (commit.commit_message, api.read_bytes("demo.txt"))
('Upload demo.txt with hubvault', b'hello')
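The parent_commit parameter is described as an optimistic-concurrency parent. The usual pattern behind that term is a compare-and-swap on the branch head: the write is accepted only if the head is still the commit the caller last saw. The sketch below models that idea with a toy in-memory branch. Branch and StaleParentError are hypothetical names, and the error hubvault actually raises in this situation is not stated in this section.

```python
from typing import List, Optional


class StaleParentError(Exception):
    """Hypothetical error for a parent_commit that is no longer the branch head."""


class Branch:
    """Toy in-memory branch that records commit ids in order."""

    def __init__(self) -> None:
        self.commits: List[str] = []

    @property
    def head(self) -> Optional[str]:
        return self.commits[-1] if self.commits else None

    def commit(self, oid: str, parent_commit: Optional[str] = None) -> str:
        # When a parent is given, refuse the write unless it is still the head.
        if parent_commit is not None and parent_commit != self.head:
            raise StaleParentError(
                "expected head %r, found %r" % (parent_commit, self.head)
            )
        self.commits.append(oid)
        return oid


branch = Branch()
first = branch.commit("a1")
second = branch.commit("b2", parent_commit=first)  # head is still a1: accepted

try:
    branch.commit("c3", parent_commit=first)  # head has moved to b2: rejected
    rejected = False
except StaleParentError:
    rejected = True
```

A caller that hits the rejection would typically re-read the branch head, rebase its change on it, and retry.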
- upload_folder(*, folder_path: str | PathLike, path_in_repo: str | None = None, commit_message: str | None = None, commit_description: str | None = None, revision: str | None = None, parent_commit: str | None = None, allow_patterns: Sequence[str] | str | None = None, ignore_patterns: Sequence[str] | str | None = None, delete_patterns: Sequence[str] | str | None = None) CommitInfo[source]
Upload a local folder while preserving its relative layout.
- Parameters:
folder_path (Union[str, os.PathLike[str]]) – Local folder to upload
path_in_repo (Optional[str]) – Optional target directory in the repo
commit_message (Optional[str]) – Optional commit summary
commit_description (Optional[str]) – Optional commit description/body
revision (Optional[str]) – Target branch name, defaults to the API default revision
parent_commit (Optional[str]) – Optional optimistic-concurrency parent commit
allow_patterns (Optional[Union[Sequence[str], str]]) – Optional allowlist for local relative paths
ignore_patterns (Optional[Union[Sequence[str], str]]) – Optional denylist for local relative paths
delete_patterns (Optional[Union[Sequence[str], str]]) – Optional denylist applied to already uploaded repo files beneath path_in_repo
- Returns:
Commit metadata for the created commit
- Return type:
CommitInfo
Note
The low-level staging, publish, and recovery sequence is implemented by hubvault.repo.backend.RepositoryBackend. This API example focuses only on the public workflow.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     root = Path(tmpdir)
...     repo_dir = root / "repo"
...     source_dir = root / "source"
...     source_dir.mkdir()
...     _ = source_dir.joinpath("config.json").write_text(
...         '{"dtype":"float16"}',
...         encoding="utf-8",
...     )
...     api = HubVaultApi(repo_dir)
...     _ = api.create_repo()
...     commit = api.upload_folder(folder_path=source_dir, path_in_repo="bundle")
...     (commit.commit_message, api.list_repo_files())
('Upload folder using hubvault', ['bundle/config.json'])
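Preserving the relative layout means that each local file's path relative to folder_path is reused in the repository, optionally nested under path_in_repo. A minimal sketch of that mapping using only pathlib follows. plan_repo_paths is a hypothetical helper for illustration; it ignores the pattern arguments and is not how hubvault stages files.

```python
import tempfile
from pathlib import Path, PurePosixPath
from typing import Dict, Optional


def plan_repo_paths(folder_path: Path, path_in_repo: Optional[str] = None) -> Dict[str, Path]:
    """Map repo-relative POSIX paths to the local files found under folder_path."""
    prefix = PurePosixPath(path_in_repo) if path_in_repo else PurePosixPath()
    plan: Dict[str, Path] = {}
    for local in sorted(folder_path.rglob("*")):
        if local.is_file():
            # Reuse the path relative to the uploaded folder, with "/" separators.
            relative = PurePosixPath(*local.relative_to(folder_path).parts)
            plan[(prefix / relative).as_posix()] = local
    return plan


with tempfile.TemporaryDirectory() as tmpdir:
    source_dir = Path(tmpdir) / "source"
    (source_dir / "weights").mkdir(parents=True)
    (source_dir / "config.json").write_text("{}", encoding="utf-8")
    (source_dir / "weights" / "model.bin").write_bytes(b"A" * 4)
    bundled_paths = sorted(plan_repo_paths(source_dir, "bundle"))
    root_paths = sorted(plan_repo_paths(source_dir))
```

With path_in_repo="bundle" the nested file lands at bundle/weights/model.bin; without it, the same file maps to weights/model.bin at the repository root.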
- upload_large_folder(*, folder_path: str | PathLike, revision: str | None = None, allow_patterns: Sequence[str] | str | None = None, ignore_patterns: Sequence[str] | str | None = None) CommitInfo[source]
Upload a large folder through a single atomic local commit.
The method name follows huggingface_hub.HfApi.upload_large_folder(). Unlike the remote API, the local backend keeps the whole operation atomic and therefore returns a single CommitInfo.
- Parameters:
folder_path (Union[str, os.PathLike[str]]) – Local folder to upload
revision (Optional[str]) – Target branch name, defaults to the API default revision
allow_patterns (Optional[Union[Sequence[str], str]]) – Optional allowlist for local relative paths
ignore_patterns (Optional[Union[Sequence[str], str]]) – Optional denylist for local relative paths
- Returns:
Commit metadata for the created commit
- Return type:
CommitInfo
- Raises:
ValueError – Raised when folder_path is not a local directory.
Note
The underlying chunk planning and atomic publish logic lives in hubvault.repo.backend.RepositoryBackend. The public example below shows the API-level behavior only.
Example:
>>> import tempfile
>>> from pathlib import Path
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     root = Path(tmpdir)
...     repo_dir = root / "repo"
...     source_dir = root / "source"
...     source_dir.mkdir()
...     _ = source_dir.joinpath("model.bin").write_bytes(b"A" * 64)
...     api = HubVaultApi(repo_dir)
...     _ = api.create_repo(large_file_threshold=32)
...     commit = api.upload_large_folder(folder_path=source_dir)
...     (commit.commit_message, api.read_range("model.bin", start=0, length=4))
('Upload large folder using hubvault', b'AAAA')