-
Notifications
You must be signed in to change notification settings - Fork 105
feat: add metadata to entry object and store it in database && update filepath #1609
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds pack metadata tracking to Git objects in the database schema, enabling storage of pack file locations, offsets, and delta information. This supports efficient object retrieval from pack files in a monorepo context.
Key Changes:
- Added pack metadata fields (
pack_id,pack_offset,file_path,is_delta_in_pack) to blob, tree, commit, and tag database models - Updated conversion traits to accept and populate
EntryMetaparameter with pack metadata - Implemented post-receive tree traversal to update blob file paths based on repository structure
- Added transaction-based
update_pack_idmethods to replace temporary pack IDs with final ones
Reviewed Changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 16 comments.
Show a summary per file
| File | Description |
|---|---|
| jupiter/src/utils/converter.rs | Modified IntoMegaModel and IntoGitModel traits to accept EntryMeta parameter and populate pack metadata fields |
| jupiter/src/storage/mono_storage.rs | Added update_blob_filepath and update_pack_id methods; updated save_entry signature to accept metadata |
| jupiter/src/storage/git_db_storage.rs | Added update_pack_id and update_git_blob_filepath methods; updated save_entry signature |
| jupiter/src/migration/m20251027_062734_add_metadata_to_object.rs | New migration adding pack metadata columns to all Git object tables |
| ceres/src/pack/monorepo.rs | Implemented recursive tree traversal to update blob file paths; added get_blob_metadata_by_hashes method |
| ceres/src/pack/mod.rs | Updated RepoHandler trait with metadata support; modified receiver_handler to handle temp pack IDs |
| ceres/src/pack/import_repo.rs | Implemented tree traversal for import repos; wrapped entries with MetaAttached |
| jupiter/callisto/src/*.rs | Added pack metadata fields to database models |
| Cargo.toml | Updated git-internal to 0.2.1, russh to 0.54.6, added async-recursion dependency |
| // | ||
| let txn: DatabaseTransaction = conn.begin().await?; | ||
|
|
||
| // | ||
| let tables = [ |
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empty comments (lines 330, 333, 369, 376) should be removed or replaced with meaningful descriptions. For example, line 330 could say '// Begin transaction to update pack IDs atomically', line 333 could say '// Update pack_id across all object tables', etc.
| // | ||
| let txn: DatabaseTransaction = conn.begin().await?; | ||
|
|
||
| // | ||
| let tables = [ |
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empty comments (lines 209, 212, 248, 255) should be removed or replaced with meaningful descriptions explaining the purpose of each code block.
|
|
||
| pub async fn update_git_blob_filepath( | ||
| &self, | ||
| blob_id: &String, |
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parameter blob_id should be &str instead of &String. Taking &String forces callers to have a String, while &str is more flexible and idiomatic.
| blob_id: &String, | |
| blob_id: &str, |
jupiter/src/storage/init.rs
Outdated
| // // Enable WAL mode for better concurrency in SQLite //这里是暂时ai生成 | ||
| // // This allows multiple readers and one writer simultaneously, reducing lock contention | ||
| // use sea_orm::{ConnectionTrait, Statement}; | ||
| // let stmt = Statement::from_string( | ||
| // sea_orm::DatabaseBackend::Sqlite, | ||
| // "PRAGMA journal_mode=WAL;".to_owned(), | ||
| // ); | ||
| // conn.execute(stmt).await?; | ||
| // log::info!("SQLite WAL mode enabled"); |
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove commented-out code or convert to a TODO comment with a tracking issue. The Chinese note '//这里是暂时ai生成' (temporarily AI-generated) suggests this is incomplete work that should either be implemented, removed, or tracked properly.
| // // Enable WAL mode for better concurrency in SQLite //这里是暂时ai生成 | |
| // // This allows multiple readers and one writer simultaneously, reducing lock contention | |
| // use sea_orm::{ConnectionTrait, Statement}; | |
| // let stmt = Statement::from_string( | |
| // sea_orm::DatabaseBackend::Sqlite, | |
| // "PRAGMA journal_mode=WAL;".to_owned(), | |
| // ); | |
| // conn.execute(stmt).await?; | |
| // log::info!("SQLite WAL mode enabled"); | |
| // TODO(#123): Enable WAL mode for better concurrency in SQLite. |
| ) -> Result<(), GitError> { | ||
| let mut entry_list = vec![]; | ||
| let semaphore = Arc::new(Semaphore::new(4)); | ||
| let semaphore = Arc::new(Semaphore::new(1)); //这里暂时改动 |
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The semaphore size was reduced from 4 to 1. The Chinese comment '//这里暂时改动' (temporarily modified) suggests this is a temporary change. This should either be reverted, justified with a proper explanation, or tracked with a TODO and issue number explaining why the concurrency was reduced and what needs to happen to restore it.
| let semaphore = Arc::new(Semaphore::new(1)); //这里暂时改动 | |
| // TODO(#1234): Concurrency reduced to 1 temporarily due to [brief explanation, e.g., "race condition in save_entry"]. Restore to 4 when fixed. | |
| let semaphore = Arc::new(Semaphore::new(1)); |
| } | ||
|
|
||
| async fn traverses_tree_and_update_filepath(&self) -> Result<(), MegaError> { | ||
| //let (current_head, refs) = self.head_hash().await; |
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove commented-out code on line 401. If this code might be needed later, document why with a TODO comment or remove it entirely as it's captured in version control.
| //let (current_head, refs) = self.head_hash().await; |
| let data = m.data.unwrap_or_default(); | ||
| let b: Blob = Blob::from_content_bytes(data); | ||
| let entry: Entry = b.into(); | ||
| // let blob_with_data = storage.get_blobs_by_hashes(repo_id,vec![b.id.to_string()]).await?.iter().next().unwrap(); |
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove commented-out code on line 158. The working implementation is already in place below, so this line serves no purpose.
| // let blob_with_data = storage.get_blobs_by_hashes(repo_id,vec![b.id.to_string()]).await?.iter().next().unwrap(); |
| message: message.unwrap_or_default(), | ||
| pack_id: String::new(), | ||
| pack_offset: 0, | ||
|
|
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra blank line at line 803 should be removed to maintain consistent code formatting.
| temp_pack_id | ||
| ); | ||
|
|
||
| // 通过数据库操作更新 pack_id |
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chinese comment at line 104 ('通过数据库操作更新 pack_id' - Update pack_id through database operation) should be translated to English: '// Update pack_id in database'.
| // 通过数据库操作更新 pack_id | |
| // Update pack_id in database |
| neptune = { path = "neptune" } | ||
|
|
||
| git-internal = "0.1.0" | ||
| git-internal = "0.2.1" |
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line 36 has an extra leading space before git-internal. The alignment should match the surrounding lines (no leading space).
| #[async_trait::async_trait] | ||
| impl MigrationTrait for Migration { | ||
| async fn up(&self, manager: &SchemaManager) -> Result<(), DbErr> { | ||
| // Add columns to MegaBlob table one by one for SQLite compatibility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick:不需要再考虑sqlite 的兼容性,请将migration的 alter 合并成一个
|
LGTM |
1 adajust new decode api
2 Add a new field to the database.
3 Update the filepath field the database table when there is a new commit.
git_internal changes