hubvault.models

Public data models for the hubvault package.

This module defines the stable dataclasses returned by the public API. The models intentionally expose repository-facing metadata without leaking the layout of internal storage objects.

The module contains the models documented below.

RepoInfo

class hubvault.models.RepoInfo(repo_path: str, format_version: int, default_branch: str, head: str | None, refs: List[str] = <factory>)

Describe the current state of a local repository.

Parameters:
  • repo_path (str) – Filesystem path to the repository root

  • format_version (int) – Repository format version

  • default_branch (str) – Name of the default branch

  • head (Optional[str]) – Resolved Git-compatible head commit OID for the selected revision, or None only for malformed or recovery-era empty refs

  • refs (List[str]) – Visible refs in the repository

Example:

>>> info = RepoInfo("/tmp/repo", 1, "main", "1111111111111111111111111111111111111111")
>>> info.default_branch
'main'

CommitInfo

class hubvault.models.CommitInfo(*args, commit_url: str, _url: str | None = None, **kwargs)

Describe the result of a commit-creating operation.

This model intentionally follows the public shape of huggingface_hub.hf_api.CommitInfo. It is returned by write-like public APIs such as hubvault.api.HubVaultApi.create_commit() and hubvault.api.HubVaultApi.reset_ref(), while GitCommitInfo remains the public model for history listings.

Parameters:
  • commit_url (str) – Local commit URL string aligned with HF naming

  • commit_message (str) – Commit summary aligned with HF naming

  • commit_description (str) – Commit description/body aligned with HF naming

  • oid (str) – Git-compatible commit OID aligned with HF naming

  • pr_url (Optional[str]) – Pull-request URL placeholder. Always None for the local repository flow.

  • _url (Optional[str]) – Legacy string payload used for HF-style str compatibility. Defaults to commit_url.

Variables:
  • repo_url (str) – Local repository URL string aligned with HF naming

  • pr_revision (Optional[str]) – Pull-request revision placeholder. Always None for the local repository flow.

  • pr_num (Optional[str]) – Pull-request number placeholder. Always None for the local repository flow.

Example:

>>> info = CommitInfo(
...     commit_url="file:///tmp/repo#commit=1111111111111111111111111111111111111111",
...     commit_message="seed",
...     commit_description="body",
...     oid="1111111111111111111111111111111111111111",
... )
>>> info.oid
'1111111111111111111111111111111111111111'

static __new__(cls, *args, commit_url: str, _url: str | None = None, **kwargs)

Build the legacy string payload used by HF-style commit info objects.

Parameters:
  • commit_url (str) – Public commit URL

  • _url (Optional[str]) – Optional legacy URL override

Returns:

String-compatible commit info instance

Return type:

CommitInfo

__post_init__() → None

Populate computed HF-style attributes after initialization.

Returns:

None.

Return type:

None
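
The string-compatible construction described above can be illustrated with a stand-in class. This is a minimal sketch, not hubvault's implementation: SketchCommitInfo is a hypothetical name, and the sketch assumes only the documented constructor shape, where __new__ chooses the string payload from _url or commit_url and __post_init__ fills the pull-request placeholders.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SketchCommitInfo(str):
    """Hypothetical stand-in that mimics the documented CommitInfo shape."""

    commit_url: str
    commit_message: str
    commit_description: str
    oid: str
    pr_url: Optional[str] = None
    # Computed in __post_init__; documented as always None for local repos.
    pr_revision: Optional[str] = field(init=False, default=None)
    pr_num: Optional[str] = field(init=False, default=None)
    # Legacy string payload; falls back to commit_url when omitted.
    _url: Optional[str] = field(repr=False, default=None)

    def __new__(cls, *args, commit_url: str, _url: Optional[str] = None, **kwargs):
        # str is immutable, so the payload has to be chosen in __new__.
        return str.__new__(cls, _url or commit_url)

    def __post_init__(self) -> None:
        self.pr_revision = None
        self.pr_num = None
        if self._url is None:
            self._url = self.commit_url


info = SketchCommitInfo(
    commit_url="file:///tmp/repo#commit=" + "1" * 40,
    commit_message="seed",
    commit_description="body",
    oid="1" * 40,
)
```

Because the instance is itself a str, callers that expect a plain URL string (for example info.startswith("file://")) keep working, while newer callers can read structured fields such as info.oid.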

MergeConflict

class hubvault.models.MergeConflict(path: str, conflict_type: str, message: str, base_oid: str | None, target_oid: str | None, source_oid: str | None, related_path: str | None = None)

Describe one structured conflict detected during a merge attempt.

Parameters:
  • path (str) – Primary repo-relative path involved in the conflict

  • conflict_type (str) – Stable conflict kind such as "modify/modify", "add/add", "delete/modify", "file/directory", or "case-fold"

  • message (str) – Human-readable conflict summary

  • base_oid (Optional[str]) – Base-side logical file OID, if a file version exists

  • target_oid (Optional[str]) – Target-branch logical file OID, if a file version exists

  • source_oid (Optional[str]) – Source-side logical file OID, if a file version exists

  • related_path (Optional[str]) – Secondary repo-relative path for structural conflicts, or None when the conflict concerns only path

Example:

>>> conflict = MergeConflict(
...     path="demo.txt",
...     conflict_type="modify/modify",
...     message="Both sides changed demo.txt differently.",
...     base_oid="abc",
...     target_oid="def",
...     source_oid="ghi",
... )
>>> conflict.conflict_type
'modify/modify'

MergeResult

class hubvault.models.MergeResult(status: str, target_revision: str, source_revision: str, base_commit: str | None, target_head_before: str | None, source_head: str | None, head_after: str | None, commit: CommitInfo | None, conflicts: List[MergeConflict] = <factory>, fast_forward: bool = False, created_commit: bool = False)

Describe the result of a public branch merge operation.

Parameters:
  • status (str) – Stable merge status string. Current values are "merged", "fast-forward", "already-up-to-date", and "conflict"

  • target_revision (str) – Target branch that received or would receive the merge

  • source_revision (str) – Source revision requested by the caller

  • base_commit (Optional[str]) – Resolved Git-compatible merge base commit OID, or None when no common ancestor exists

  • target_head_before (Optional[str]) – Git-compatible target branch head OID before the merge attempt

  • source_head (Optional[str]) – Resolved Git-compatible source commit OID

  • head_after (Optional[str]) – Git-compatible target branch head OID after the merge attempt

  • commit (Optional[CommitInfo]) – Commit metadata for the resulting head when the merge did not conflict, or None for conflict results

  • conflicts (List[MergeConflict]) – Structured conflicts detected during the merge attempt

  • fast_forward (bool) – Whether the merge resolved as a fast-forward ref move

  • created_commit (bool) – Whether the merge created a brand-new merge commit

Example:

>>> commit = CommitInfo(
...     commit_url="file:///tmp/repo#commit=1111111111111111111111111111111111111111",
...     commit_message="seed",
...     commit_description="",
...     oid="1111111111111111111111111111111111111111",
... )
>>> result = MergeResult(
...     status="fast-forward",
...     target_revision="main",
...     source_revision="feature",
...     base_commit="0000000000000000000000000000000000000000",
...     target_head_before="0000000000000000000000000000000000000000",
...     source_head="1111111111111111111111111111111111111111",
...     head_after="1111111111111111111111111111111111111111",
...     commit=commit,
...     conflicts=[],
...     fast_forward=True,
...     created_commit=False,
... )
>>> result.fast_forward
True
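
Callers usually branch on the status string. The following is a minimal sketch of such a handler, assuming only the attributes and status values documented above; describe_merge is a hypothetical helper, and SimpleNamespace objects stand in for MergeResult and MergeConflict so the snippet runs without a repository.

```python
from types import SimpleNamespace


def describe_merge(result) -> str:
    """Summarize a merge result using only the documented attributes."""
    if result.status == "conflict":
        details = ", ".join(
            f"{c.conflict_type} at {c.path}" for c in result.conflicts
        )
        return f"conflict: {details}"
    if result.status == "already-up-to-date":
        return "nothing to merge"
    # "merged" and "fast-forward" both report the new target head in head_after.
    return f"{result.status}: {result.target_revision} now at {result.head_after}"


# Stand-in objects; real code would pass a MergeResult instead.
conflicted = SimpleNamespace(
    status="conflict",
    target_revision="main",
    conflicts=[SimpleNamespace(path="demo.txt", conflict_type="modify/modify")],
)
fast_forwarded = SimpleNamespace(
    status="fast-forward",
    target_revision="main",
    head_after="1" * 40,
    conflicts=[],
)
conflict_summary = describe_merge(conflicted)
fast_forward_summary = describe_merge(fast_forwarded)
```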

GitCommitInfo

class hubvault.models.GitCommitInfo(commit_id: str, authors: List[str], created_at: datetime, title: str, message: str, formatted_title: str | None, formatted_message: str | None)

Describe a commit entry returned by hubvault.api.HubVaultApi.list_repo_commits().

This model follows the main public shape of huggingface_hub.hf_api.GitCommitInfo while staying grounded in the local repository semantics of hubvault.

Parameters:
  • commit_id (str) – Git-compatible commit OID

  • authors (List[str]) – Authors associated with the commit

  • created_at (datetime.datetime) – Commit creation time in UTC

  • title (str) – Commit title

  • message (str) – Commit body message

  • formatted_title (Optional[str]) – HTML-formatted commit title, or None when not requested

  • formatted_message (Optional[str]) – HTML-formatted commit message, or None when not requested

Example:

>>> from datetime import datetime
>>> info = GitCommitInfo(
...     commit_id="1111111111111111111111111111111111111111",
...     authors=[],
...     created_at=datetime(2024, 1, 1, 0, 0, 0),
...     title="seed",
...     message="",
...     formatted_title=None,
...     formatted_message=None,
... )
>>> info.title
'seed'

CommitFileVersionInfo

class hubvault.models.CommitFileVersionInfo(path: str, size: int, oid: str, blob_id: str, sha256: str)

Describe one file version used by a commit-diff entry.

This model keeps the same public identity vocabulary as RepoFile while omitting UI-only fields such as HTTP download URLs. Callers can use the surrounding commit IDs to build local or remote download links.

Parameters:
  • path (str) – Repo-relative path for this file version

  • size (int) – Logical file size in bytes

  • oid (str) – HF-style public object identifier

  • blob_id (str) – HF-style blob identifier

  • sha256 (str) – Bare hexadecimal SHA-256 digest for the logical payload

Example:

>>> info = CommitFileVersionInfo("demo.txt", 5, "oid", "blob", "abc")
>>> info.path
'demo.txt'

CommitChangeInfo

class hubvault.models.CommitChangeInfo(path: str, change_type: str, old_file: CommitFileVersionInfo | None, new_file: CommitFileVersionInfo | None, is_binary: bool, unified_diff: str | None = None)

Describe one file-level change in a commit diff.

Text changes may carry a Git-style unified diff. Binary or too-large files still carry old/new metadata so UIs can compare size and content identity without downloading the payload automatically.

Parameters:
  • path (str) – Repo-relative path being described

  • change_type (str) – One of "added", "deleted", or "modified"

  • old_file (Optional[CommitFileVersionInfo]) – File metadata before the commit, or None for additions

  • new_file (Optional[CommitFileVersionInfo]) – File metadata after the commit, or None for deletions

  • is_binary (bool) – Whether the change should be treated as non-text

  • unified_diff (Optional[str]) – Git-style unified diff text for renderable text files

Example:

>>> new_file = CommitFileVersionInfo("demo.txt", 5, "oid", "blob", "abc")
>>> change = CommitChangeInfo("demo.txt", "added", None, new_file, True, None)
>>> change.change_type
'added'
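
A UI can use these fields to decide between showing a diff and showing metadata only. The following is a minimal sketch under that reading; render_change is a hypothetical helper, and SimpleNamespace objects stand in for CommitChangeInfo and CommitFileVersionInfo.

```python
from types import SimpleNamespace


def render_change(change) -> str:
    """Return the diff text when available, else a size-based summary."""
    if not change.is_binary and change.unified_diff is not None:
        return change.unified_diff
    # Fall back to old/new metadata; a missing side counts as zero bytes.
    old_size = change.old_file.size if change.old_file is not None else 0
    new_size = change.new_file.size if change.new_file is not None else 0
    return f"{change.change_type} {change.path}: {old_size} -> {new_size} bytes"


# Stand-in for a binary modification that carries no unified diff.
binary_change = SimpleNamespace(
    path="model.bin",
    change_type="modified",
    is_binary=True,
    unified_diff=None,
    old_file=SimpleNamespace(size=100),
    new_file=SimpleNamespace(size=160),
)
summary = render_change(binary_change)
```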

CommitDetailInfo

class hubvault.models.CommitDetailInfo(commit: GitCommitInfo, parent_commit_ids: List[str], compare_parent_commit_id: str | None, changes: List[CommitChangeInfo] = <factory>)

Describe one commit together with its first-parent file changes.

The initial (Phase 9) implementation intentionally compares each commit with its first parent only, matching common repository-browser defaults. Comparison views for multi-parent merge commits can be layered on later without changing the commit listing model.

Parameters:
  • commit (GitCommitInfo) – Public commit-list metadata

  • parent_commit_ids (List[str]) – Public commit IDs of all parents

  • compare_parent_commit_id (Optional[str]) – Parent used as the diff base, or None for root commits

  • changes (List[CommitChangeInfo]) – File-level changes from the selected parent to commit

Example:

>>> from datetime import datetime
>>> commit = GitCommitInfo("c", [], datetime(2024, 1, 1), "seed", "", None, None)
>>> detail = CommitDetailInfo(commit, [], None, [])
>>> detail.commit.title
'seed'
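
The first-parent rule can be written down in one line. This sketch only restates the documented behaviour (first parent as diff base, None for root commits); pick_compare_parent is a hypothetical helper name, not part of hubvault.

```python
from typing import List, Optional


def pick_compare_parent(parent_commit_ids: List[str]) -> Optional[str]:
    """Return the first parent as the diff base, or None for a root commit."""
    return parent_commit_ids[0] if parent_commit_ids else None


root_base = pick_compare_parent([])
merge_base = pick_compare_parent(["aaaa", "bbbb"])
```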

GitRefInfo

class hubvault.models.GitRefInfo(name: str, ref: str, target_commit: str | None)

Describe a git reference in HF-style form.

This model follows the public role of huggingface_hub.hf_api.GitRefInfo. Normal repositories now create an initial empty-tree commit during hubvault.api.HubVaultApi.create_repo(), but target_commit remains optional so recovery tooling can still report malformed or legacy empty refs.

Parameters:
  • name (str) – Short branch or tag name

  • ref (str) – Full ref name such as refs/heads/main

  • target_commit (Optional[str]) – Git-compatible target commit OID, or None for a malformed or legacy empty local ref

Example:

>>> info = GitRefInfo("main", "refs/heads/main", "1111111111111111111111111111111111111111")
>>> info.ref
'refs/heads/main'

GitRefs

class hubvault.models.GitRefs(branches: List[GitRefInfo], converts: List[GitRefInfo], tags: List[GitRefInfo], pull_requests: List[GitRefInfo] | None = None)

Describe the visible git references for a repository.

This model follows the public role of huggingface_hub.hf_api.GitRefs. The local repository does not support convert refs or pull requests, but keeps the same top-level structure for compatibility.

Parameters:
  • branches (List[GitRefInfo]) – Visible branch references

  • converts (List[GitRefInfo]) – Convert refs. Always empty for the local repository.

  • tags (List[GitRefInfo]) – Visible tag references

  • pull_requests (Optional[List[GitRefInfo]]) – Pull-request refs when explicitly requested. The local repository returns [] if requested and None otherwise.

Example:

>>> refs = GitRefs(branches=[], converts=[], tags=[], pull_requests=None)
>>> refs.tags
[]
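
Because pull_requests may be None while the other groups are always lists, code that walks every group has to normalize it first. The following is a minimal sketch of that pattern; index_refs is a hypothetical helper, and SimpleNamespace objects stand in for GitRefs and GitRefInfo.

```python
from types import SimpleNamespace
from typing import Dict, Optional


def index_refs(refs) -> Dict[str, Optional[str]]:
    """Map full ref names to target commits across all ref groups."""
    index: Dict[str, Optional[str]] = {}
    # pull_requests is None unless requested, so treat None as an empty list.
    groups = [refs.branches, refs.tags, refs.converts, refs.pull_requests or []]
    for group in groups:
        for ref in group:
            index[ref.ref] = ref.target_commit
    return index


# Stand-in data; real code would pass a GitRefs instance.
sample = SimpleNamespace(
    branches=[SimpleNamespace(ref="refs/heads/main", target_commit="1" * 40)],
    converts=[],
    tags=[SimpleNamespace(ref="refs/tags/v1", target_commit=None)],
    pull_requests=None,
)
index = index_refs(sample)
```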

ReflogEntry

class hubvault.models.ReflogEntry(timestamp: datetime, ref_name: str, old_head: str | None, new_head: str | None, message: str, checksum: str)

Describe a single reflog record for a branch or tag.

Parameters:
  • timestamp (datetime.datetime) – UTC time recorded for the reflog entry

  • ref_name (str) – Full ref name such as refs/heads/main

  • old_head (Optional[str]) – Previous target commit, if any

  • new_head (Optional[str]) – New target commit, if any

  • message (str) – Short reflog message

  • checksum (str) – Integrity checksum for the reflog record

Example:

>>> from datetime import datetime
>>> entry = ReflogEntry(
...     datetime(2024, 1, 1),
...     "refs/heads/main",
...     None,
...     "1111111111111111111111111111111111111111",
...     "seed",
...     "sha256:x",
... )
>>> entry.message
'seed'

LastCommitInfo

class hubvault.models.LastCommitInfo(oid: str, title: str, date: datetime)

Describe last-commit metadata for a repo path.

Parameters:
  • oid (str) – Git-compatible commit OID

  • title (str) – Commit title

  • date (datetime.datetime) – Commit creation time in UTC

Example:

>>> from datetime import datetime
>>> info = LastCommitInfo("oid", "seed", datetime(2024, 1, 1, 0, 0, 0))
>>> info.title
'seed'

BlobSecurityInfo

class hubvault.models.BlobSecurityInfo(safe: bool, status: str, av_scan: Dict[str, object] | None, pickle_import_scan: Dict[str, object] | None)

Describe security metadata for a repo file.

Parameters:
  • safe (bool) – Whether the file is considered safe

  • status (str) – Security scan status string

  • av_scan (Optional[Dict[str, object]]) – Antivirus scan metadata, if any

  • pickle_import_scan (Optional[Dict[str, object]]) – Pickle-import scan metadata, if any

Example:

>>> info = BlobSecurityInfo(True, "safe", None, None)
>>> info.safe
True

RepoFile

class hubvault.models.RepoFile(path: str, size: int, blob_id: str, lfs: BlobLfsInfo | None = None, last_commit: LastCommitInfo | None = None, security: BlobSecurityInfo | None = None, oid: str | None = None, sha256: str | None = None, etag: str | None = None)

Describe a file entry in HF-style repo listings.

This model follows the main public field layout of huggingface_hub.hf_api.RepoFile. Local-only convenience fields oid, sha256, and etag are retained because the local-path design exposes them directly to callers.

Parameters:
  • path (str) – Repo-relative path

  • size (int) – File size in bytes

  • blob_id (str) – Git blob OID

  • lfs (Optional[BlobLfsInfo]) – LFS-style checksum metadata, if available

  • last_commit (Optional[LastCommitInfo]) – Last-commit metadata, if available

  • security (Optional[BlobSecurityInfo]) – Security metadata, if available

  • oid (Optional[str]) – Local convenience alias for the blob OID

  • sha256 (Optional[str]) – Raw hexadecimal SHA-256 digest of the logical file content

  • etag (Optional[str]) – Public ETag value for download-facing APIs

Example:

>>> info = RepoFile("demo.txt", 4, "oid", None)
>>> info.blob_id
'oid'

property lastCommit: LastCommitInfo | None

Return the backward-compatible HF camelCase alias.

Returns:

Last-commit metadata, if available

Return type:

Optional[LastCommitInfo]

property rfilename: str

Return the backward-compatible HF filename alias.

Returns:

Repo-relative path

Return type:

str
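
The two alias properties above follow a common read-only alias pattern. The following is a minimal sketch of that pattern, not hubvault's code: SketchRepoFile is a hypothetical stand-in that carries only a subset of the documented fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchRepoFile:
    """Hypothetical stand-in with a subset of the documented RepoFile fields."""

    path: str
    size: int
    blob_id: str
    last_commit: Optional[object] = None

    @property
    def lastCommit(self) -> Optional[object]:
        # camelCase alias that reads the snake_case field.
        return self.last_commit

    @property
    def rfilename(self) -> str:
        # Filename alias that reads the path field.
        return self.path


entry = SketchRepoFile("demo.txt", 4, "oid")
```

Since the aliases are properties rather than extra fields, they do not appear in the constructor signature and cannot drift out of sync with path or last_commit.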

RepoFolder

class hubvault.models.RepoFolder(path: str, tree_id: str, last_commit: LastCommitInfo | None = None)

Describe a folder entry in HF-style repo listings.

Parameters:
  • path (str) – Repo-relative folder path

  • tree_id (str) – Git-compatible tree OID

  • last_commit (Optional[LastCommitInfo]) – Last-commit metadata, if available

Example:

>>> info = RepoFolder("configs", "tree-oid")
>>> info.tree_id
'tree-oid'

property lastCommit: LastCommitInfo | None

Return the backward-compatible HF camelCase alias.

Returns:

Last-commit metadata, if available

Return type:

Optional[LastCommitInfo]

BlobLfsInfo

class hubvault.models.BlobLfsInfo(size: int, sha256: str, pointer_size: int)

Describe large-file metadata for future LFS-compatible modes.

Parameters:
  • size (int) – Logical file size in bytes

  • sha256 (str) – Raw hexadecimal SHA-256 digest of the file content, matching the public huggingface_hub lfs.sha256 style without an algorithm prefix

  • pointer_size (int) – Size of the canonical pointer content

Example:

>>> info = BlobLfsInfo(1024, "abc", 128)
>>> info.pointer_size
128

VerifyReport

class hubvault.models.VerifyReport(ok: bool, checked_refs: List[str] = <factory>, warnings: List[str] = <factory>, errors: List[str] = <factory>)

Report the result of repository verification.

Parameters:
  • ok (bool) – Whether verification completed without errors

  • checked_refs (List[str]) – Refs inspected during verification

  • warnings (List[str]) – Non-fatal diagnostics

  • errors (List[str]) – Fatal verification errors

Example:

>>> report = VerifyReport(True)
>>> report.ok
True
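
The <factory> marker in the signature is how a dataclass field declared with default_factory is displayed: each instance receives its own fresh list. The following is a minimal sketch of that behaviour using a hypothetical stand-in class, SketchVerifyReport, that mirrors the documented fields.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchVerifyReport:
    """Hypothetical stand-in mirroring the documented VerifyReport fields."""

    ok: bool
    # default_factory builds a new list per instance instead of sharing one.
    checked_refs: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)


first = SketchVerifyReport(True)
second = SketchVerifyReport(True)
first.warnings.append("example warning")
```

After the append, second.warnings is still empty, which is why list-valued fields use a factory rather than a literal [] default.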

StorageSectionInfo

class hubvault.models.StorageSectionInfo(name: str, path: str, total_size: int, file_count: int, reclaimable_size: int, reclaim_strategy: str, notes: str)

Describe disk usage for one repository storage section.

Parameters:
  • name (str) – Stable section name such as "objects.blobs.data"

  • path (str) – Repo-relative path or descriptive label for the section

  • total_size (int) – Current bytes occupied by the section

  • file_count (int) – Number of files currently present in the section

  • reclaimable_size (int) – Bytes that can be safely reclaimed now through the recommended action

  • reclaim_strategy (str) – Recommended safe action such as "gc", "prune-cache", "keep", or "manual-review"

  • notes (str) – Practical explanation of what the section stores and how to release its space safely

Example:

>>> section = StorageSectionInfo(
...     name="cache",
...     path="cache/",
...     total_size=1024,
...     file_count=3,
...     reclaimable_size=1024,
...     reclaim_strategy="prune-cache",
...     notes="Detached views can be rebuilt.",
... )
>>> section.reclaim_strategy
'prune-cache'
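
An operator tool might group sections by their recommended action. The following is a minimal sketch of that aggregation, assuming only the documented reclaimable_size and reclaim_strategy fields; reclaimable_by_strategy is a hypothetical helper, and SimpleNamespace objects stand in for StorageSectionInfo.

```python
from types import SimpleNamespace
from typing import Dict, Iterable


def reclaimable_by_strategy(sections: Iterable) -> Dict[str, int]:
    """Sum reclaimable bytes per recommended strategy, skipping empty sections."""
    totals: Dict[str, int] = {}
    for section in sections:
        if section.reclaimable_size > 0:
            key = section.reclaim_strategy
            totals[key] = totals.get(key, 0) + section.reclaimable_size
    return totals


# Stand-in sections with made-up sizes for illustration.
sections = [
    SimpleNamespace(reclaim_strategy="prune-cache", reclaimable_size=1024),
    SimpleNamespace(reclaim_strategy="gc", reclaimable_size=256),
    SimpleNamespace(reclaim_strategy="gc", reclaimable_size=128),
    SimpleNamespace(reclaim_strategy="keep", reclaimable_size=0),
]
totals = reclaimable_by_strategy(sections)
```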

StorageOverview

class hubvault.models.StorageOverview(total_size: int, reachable_size: int, historical_retained_size: int, reclaimable_gc_size: int, reclaimable_cache_size: int, reclaimable_temporary_size: int, sections: List[StorageSectionInfo] = <factory>, recommendations: List[str] = <factory>)

Describe repository-wide storage usage and safe reclamation options.

Parameters:
  • total_size (int) – Total bytes currently occupied by the repository root

  • reachable_size (int) – Bytes currently required to preserve all live refs and their reachable storage after a normal GC pass

  • historical_retained_size (int) – Bytes currently kept only for rollback or historical retention and therefore releasable after explicit history rewriting such as hubvault.api.HubVaultApi.squash_history()

  • reclaimable_gc_size (int) – Bytes that hubvault.api.HubVaultApi.gc() can safely reclaim immediately without rewriting history

  • reclaimable_cache_size (int) – Bytes in rebuildable detached caches

  • reclaimable_temporary_size (int) – Bytes in temporary or quarantine areas that can be cleaned without changing visible repository history

  • sections (List[StorageSectionInfo]) – Per-section usage breakdown

  • recommendations (List[str]) – Ordered safe-action recommendations for operators

Example:

>>> overview = StorageOverview(
...     total_size=4096,
...     reachable_size=2048,
...     historical_retained_size=1024,
...     reclaimable_gc_size=256,
...     reclaimable_cache_size=512,
...     reclaimable_temporary_size=256,
...     sections=[],
...     recommendations=["Run gc()."],
... )
>>> overview.reclaimable_gc_size
256
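
The three reclaimable fields can be combined into one figure for space that is releasable without rewriting history. This is a sketch under an assumption the source does not state: that the GC, cache, and temporary categories do not overlap. reclaimable_without_rewrite is a hypothetical helper, and the SimpleNamespace object reuses the numbers from the example above.

```python
from types import SimpleNamespace


def reclaimable_without_rewrite(overview) -> int:
    """Sum the bytes releasable without history rewriting.

    Assumes the three categories are disjoint (not confirmed by the docs).
    """
    return (
        overview.reclaimable_gc_size
        + overview.reclaimable_cache_size
        + overview.reclaimable_temporary_size
    )


# Same values as the StorageOverview example; a stand-in, not the real model.
overview = SimpleNamespace(
    total_size=4096,
    reachable_size=2048,
    historical_retained_size=1024,
    reclaimable_gc_size=256,
    reclaimable_cache_size=512,
    reclaimable_temporary_size=256,
)
releasable = reclaimable_without_rewrite(overview)
```

With the example values, releasable is 1024 bytes, and adding reachable_size and historical_retained_size gives 4096, which matches total_size in that example.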

GcReport

class hubvault.models.GcReport(dry_run: bool, checked_refs: List[str], reclaimed_size: int, reclaimed_object_size: int, reclaimed_chunk_size: int, reclaimed_cache_size: int, reclaimed_temporary_size: int, removed_file_count: int, notes: List[str] = <factory>)

Describe the result of one storage reclamation pass.

Parameters:
  • dry_run (bool) – Whether the operation only computed the result without mutating repository storage

  • checked_refs (List[str]) – Refs treated as GC roots during the pass

  • reclaimed_size (int) – Total bytes reclaimed or reclaimable in dry-run mode

  • reclaimed_object_size (int) – Bytes reclaimed from JSON/blob object stores

  • reclaimed_chunk_size (int) – Bytes reclaimed from pack/index storage

  • reclaimed_cache_size (int) – Bytes reclaimed from rebuildable cache areas

  • reclaimed_temporary_size (int) – Bytes reclaimed from quarantine or other temporary maintenance areas

  • removed_file_count (int) – Number of files deleted or deletable in dry-run mode

  • notes (List[str]) – Additional human-readable notes about blockers or actions

Example:

>>> report = GcReport(
...     dry_run=True,
...     checked_refs=["refs/heads/main"],
...     reclaimed_size=1024,
...     reclaimed_object_size=512,
...     reclaimed_chunk_size=256,
...     reclaimed_cache_size=128,
...     reclaimed_temporary_size=128,
...     removed_file_count=4,
...     notes=["dry-run"],
... )
>>> report.dry_run
True
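
Since the size and count fields mean "reclaimable" in dry-run mode and "reclaimed" otherwise, any message shown to operators should depend on dry_run. The following is a minimal sketch of that wording switch; summarize_gc is a hypothetical helper, and a SimpleNamespace object stands in for GcReport.

```python
from types import SimpleNamespace


def summarize_gc(report) -> str:
    """Phrase the totals according to whether storage was actually mutated."""
    verb = "would reclaim" if report.dry_run else "reclaimed"
    return (
        f"{verb} {report.reclaimed_size} bytes "
        f"across {report.removed_file_count} files"
    )


# Stand-in carrying the values from the GcReport example above.
dry_run_report = SimpleNamespace(
    dry_run=True, reclaimed_size=1024, removed_file_count=4
)
message = summarize_gc(dry_run_report)
```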

SquashReport

class hubvault.models.SquashReport(ref_name: str, old_head: str, new_head: str, root_commit_before: str, rewritten_commit_count: int, dropped_ancestor_count: int, blocking_refs: List[str] = <factory>, gc_report: GcReport | None = None)

Describe the result of a history-squash operation.

Parameters:
  • ref_name (str) – Full ref name updated by the squash operation

  • old_head (str) – Previous Git-compatible ref target before the rewrite

  • new_head (str) – New Git-compatible ref target after the rewrite

  • root_commit_before (str) – Git-compatible commit OID selected as the oldest preserved commit before rewriting, or the previous head when the whole branch history was collapsed into one new root commit

  • rewritten_commit_count (int) – Number of commits rewritten onto the new synthetic history chain

  • dropped_ancestor_count (int) – Number of older ancestor commits made unreachable from the rewritten ref

  • blocking_refs (List[str]) – Other refs whose retained history still points into the pre-squash lineage and may therefore limit immediate reclamation

  • gc_report (Optional[GcReport]) – Optional GC result when the squash operation also ran a reclamation pass

Example:

>>> report = SquashReport(
...     ref_name="refs/heads/main",
...     old_head="1111111111111111111111111111111111111111",
...     new_head="2222222222222222222222222222222222222222",
...     root_commit_before="0000000000000000000000000000000000000000",
...     rewritten_commit_count=2,
...     dropped_ancestor_count=3,
...     blocking_refs=[],
...     gc_report=None,
... )
>>> report.new_head
'2222222222222222222222222222222222222222'