
Commit 95b5977

oleander and Git AI Test authored
Speed up hook diff processing (#27)
* test
* Add profiling feature with macro for measuring execution time
* Refactor diff processing, optimize token handling and storage
* Update test signatures and add profiling tests
* Refactor GitFile signatures and commit messages
  Optimize PatchDiff with parallel processing
  Add rayon dependency to Cargo.toml
  Remove redundant patch tests
  Update Cargo.lock with new dependencies
  Remove profiling tests
* Update Cargo.lock dependencies and checksums
* Update dependencies in Cargo.toml and Cargo.lock
* Add StringPool for efficient memory use in PatchDiff (sketched after this list)
* Update dependencies in Cargo.toml and Cargo.lock
* Add `num_cpus` crate and parallelize file processing
* Refactor file processing to use parallel chunks and atomic tokens
* Remove redundant import of `bail` from anyhow
* Sort files by token count in `PatchDiff` implementation
* Delete test.txt file
* Improve error handling and path management in config and style modules
* Add tests for StringPool functionality in hook.rs
* Update default model and add profiling to model and commit functions
* Add profiling to filesystem module functions
* Implement token counting and generation for commit messages
* Add documentation for Filesystem, File, and Dir structs in filesystem.rs
* Refactor commit message generation methods and file handling logic
* Implement configuration file management and update functions in App
* Implement parallel processing of diff data in PatchDiff trait
* feat: Add Finetune functionality and related files
  - Introduce finetune.rs to manage fine-tuning workflows with OpenAI.
  - Create finetune.md for documenting the finetuning process.
  - Update Cargo.toml and Cargo.lock with necessary dependencies for finetuning.
  - Add stats.json to track various parameters in the finetuning process.
* Add instruction template constant to commit.rs
  - Introduce `INSTRUCTION_TEMPLATE` constant for prompt file content.
* Remove unused import of `std::fs` from `commit.rs`
* Remove unused import and adjust available tokens calculation
  - Remove the import of `Model` and update the `available_tokens` calculations in the `call` function.
* Update max commit length in prompt guidelines
  - Change maximum commit length from 72 to {{max_commit_length}} characters.
* Modify imports and refactor filesystem profiling
  - Remove unnecessary profiling statements and imports in `filesystem.rs`.
  - Add an import for `App` from `config` in `main.rs`.
* Add directory creation for hooks if it does not exist
  - Check for the existence of the hooks directory and create it if it is missing.
* Add dead code allowance in filesystem.rs
* Revert "```" (reverts commit 7b9aa2f)
* Update Command enum definition without altering its structure or commands
* Delete stats.json file
* Remove install, reinstall, and uninstall modules
  - Delete src/install.rs, src/reinstall.rs, and src/uninstall.rs.
* Build inline
* Update default model name in Args implementation from "gpt-4o" to "gpt-4o-mini"
* Create hook stress test script for exercising various operations in a Git hook context
* Add comprehensive tests script: a new Fish script, `comprehensive-tests`, with a series of tests for various Git operations
* Change file permission of comprehensive-tests from mode 644 to 755
* Update `comprehensive-tests` script to load environment variables from `.env.local`
* Remove note about output being used as a git commit message from prompt.md
* Update comprehensive-tests script and prompt.md documentation
* Update scripts and source code: comprehensive testing scripts, `hook.rs`, and `patch_test.rs`
  - In the `comprehensive-tests` script, add the `--debug` flag to the `cargo install` command
  - Adjust the logic that checks for ample remaining tokens in `hook.rs`
  - Modify the Diff and PatchRepository implementations used by `hook.rs` and `patch_test.rs`
  - Modify test cases in `patch_test.rs`
  All changes stick to the theme of improving handling of patches, diffs, and commit-related functionality.
* Refactor `hook.rs` and ensure a minimum of 512 tokens
  - Check for an empty diff and handle amend operations in `src/bin/hook.rs`
  - Remove the error message for "no changes found to commit"
  - Ensure a minimum of 512 remaining tokens
  - Handle amend operations when the source is a commit
* Update the clean-up command in the comprehensive-tests script, replacing the commented-out command with an active one
* Add attribute to suppress dead code warnings in hook.rs
* Add initial boilerplate for hook.rs (a single line allowing dead code during early development)
* Add debug message when a commit message already exists in hook.rs
  - If a non-empty commit message already exists, log a debug message, clear the progress bar, and return early
* Add `to_commit_diff` and `configure_commit_diff_options` methods to the `PatchRepository` trait
  - `to_commit_diff` returns a `Result<git2::Diff<'_>>`; the `Repository` implementation configures diff options and runs the diff against the provided tree option
  - `configure_commit_diff_options` mutates the provided `DiffOptions`; the `Repository` implementation sets various options for the diff operation
  - The `PatchRepository` implementation for `Repository` now calls `to_commit_diff` instead of `to_diff`
* Optimize the max_tokens_per_file calculation in hook.rs: `process_chunk` now handles the zero-remaining-files case by assigning total_remaining directly to max_tokens_per_file
* Refactor method calls and condition checks in openai.rs and patch_test.rs
* Refine instructions and guidelines for generating git commit messages
* Add error handling for raw SHA1 resolution in hook.rs
* Refactor function calls in patch_test.rs and simplify conditional logic in hook.rs
* Refactor reference resolution in hook.rs

Co-authored-by: Git AI Test <[email protected]>
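
Several bullets above mention a StringPool added to hook.rs for reusing string allocations during diff processing. The pool itself is not part of this diff; only the tail of its test survives in the src/hook.rs hunk below (the `pool.strings` field and `MAX_POOL_SIZE` constant). The following is a minimal sketch of what such a capped pool might look like; aside from those two names, every method and value here is an assumption, not the project's actual implementation.

```rust
// Hypothetical sketch of a capped string pool: buffers are recycled to avoid
// repeated allocations while assembling per-file diff text. Only the `strings`
// field and MAX_POOL_SIZE appear in the surviving test; the rest is assumed.
const MAX_POOL_SIZE: usize = 1000;

struct StringPool {
  strings: Vec<String>,
}

impl StringPool {
  fn new() -> Self {
    Self { strings: Vec::new() }
  }

  /// Take a cleared String from the pool, or allocate a new one.
  fn get(&mut self) -> String {
    self.strings.pop().unwrap_or_default()
  }

  /// Return a String so its buffer can be reused, dropping it if the pool is full.
  fn put(&mut self, mut s: String) {
    if self.strings.len() < MAX_POOL_SIZE {
      s.clear();
      self.strings.push(s);
    }
  }
}

fn main() {
  let mut pool = StringPool::new();
  let mut buf = pool.get();
  buf.push_str("+added line\n");
  // ... use buf while formatting one file's diff ...
  pool.put(buf); // capacity is retained for the next file
}
```
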
1 parent 74b1027 commit 95b5977

File tree

10 files changed: +42 −296 lines


.gitignore

Lines changed: 2 additions & 0 deletions
@@ -6,3 +6,5 @@ http-cacache/*
 ${env:TMPDIR}
 bin/
 tmp/
+finetune_verify.jsonl
+finetune_train.jsonl

Cargo.lock

Lines changed: 3 additions & 0 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 5 additions & 1 deletion
@@ -26,7 +26,7 @@ path = "src/bin/hook.rs"
 
 [dependencies]
 # Core functionality
-anyhow = "1.0.95"
+anyhow = { version = "1.0.95", features = ["backtrace"] }
 thiserror = "2.0.11"
 tokio = { version = "1.43", features = ["full"] }
 futures = "0.3"
@@ -52,9 +52,11 @@ serde_derive = "1.0.217"
 serde_ini = "0.2.0"
 serde_json = "1.0"
 
+# OpenAI integration
 async-openai = { version = "0.27.2", default-features = false }
 tiktoken-rs = "0.6.0"
 reqwest = { version = "0.12.12", default-features = true }
+
 # System utilities
 openssl-sys = { version = "0.9.105", features = ["vendored"] }
 rayon = "1.10.0"
@@ -64,6 +66,7 @@ ctrlc = "3.4.5"
 lazy_static = "1.5.0"
 home = "0.5.11"
 dirs = "6.0"
+
 # Syntax highlighting and markdown rendering
 syntect = { version = "5.2", default-features = false, features = [
   "default-fancy",
@@ -74,6 +77,7 @@ textwrap = "0.16"
 structopt = "0.3.26"
 mustache = "0.9.0"
 maplit = "1.0.2"
+
 [dev-dependencies]
 tempfile = "3.16.0"
 
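
The only substantive Cargo.toml change is enabling anyhow's `backtrace` feature; the rest is comments and blank-line grouping. As a rough sketch of what that feature buys, assuming a binary that propagates `anyhow::Error` (the file path and function name below are illustrative, not from this repository):

```rust
use anyhow::{Context, Result};

fn read_config() -> Result<String> {
  // With backtrace support enabled, the error created here can capture a
  // std::backtrace::Backtrace when RUST_BACKTRACE=1 is set in the environment.
  std::fs::read_to_string("config.toml").context("failed to read config")
}

fn main() {
  if let Err(err) = read_config() {
    // {:?} on anyhow::Error prints the context chain and, when captured,
    // the backtrace, which is what the feature flag is buying here.
    eprintln!("{err:?}");
  }
}
```
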

src/commit.rs

Lines changed: 7 additions & 13 deletions
@@ -1,5 +1,6 @@
-use anyhow::{bail, Result};
+use anyhow::{anyhow, bail, Result};
 use maplit::hashmap;
+use mustache;
 
 use crate::{config, openai, profile};
 use crate::model::Model;
@@ -13,11 +14,11 @@ fn get_instruction_template() -> Result<String> {
   profile!("Generate instruction template");
   let max_length = config::APP.max_commit_length.unwrap_or(72).to_string();
   let template = mustache::compile_str(INSTRUCTION_TEMPLATE)
-    .map_err(|e| anyhow::anyhow!("Template compilation error: {}", e))?
+    .map_err(|e| anyhow!("Template compilation error: {}", e))?
     .render_to_string(&hashmap! {
       "max_length" => max_length
     })
-    .map_err(|e| anyhow::anyhow!("Template rendering error: {}", e))?;
+    .map_err(|e| anyhow!("Template rendering error: {}", e))?;
   Ok(template)
 }
 
@@ -43,18 +44,11 @@ pub fn get_instruction_token_count(model: &Model) -> Result<usize> {
 ///
 /// # Returns
 /// * `Result<openai::Request>` - The prepared request
-pub fn create_commit_request(diff: String, max_tokens: usize, model: Model) -> Result<openai::Request> {
+fn create_commit_request(diff: String, max_tokens: usize, model: Model) -> Result<openai::Request> {
   profile!("Prepare OpenAI request");
-  let max_length = config::APP.max_commit_length.unwrap_or(72).to_string();
-  let instruction_template = mustache::compile_str(INSTRUCTION_TEMPLATE)
-    .map_err(|e| anyhow::anyhow!("Template compilation error: {}", e))?
-    .render_to_string(&hashmap! {
-      "max_length" => max_length
-    })
-    .map_err(|e| anyhow::anyhow!("Template rendering error: {}", e))?;
-
+  let template = get_instruction_template()?;
   Ok(openai::Request {
-    system: instruction_template,
+    system: template,
     prompt: diff,
     max_tokens: max_tokens.try_into().unwrap_or(u16::MAX),
     model
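
The refactor above routes the request builder through `get_instruction_template()` instead of duplicating the mustache rendering inline. For context, here is a standalone sketch of the same compile/render pattern; the template string and function name are illustrative only, since the real `INSTRUCTION_TEMPLATE` comes from a prompt file not shown in this diff:

```rust
use anyhow::{anyhow, Result};
use maplit::hashmap;

fn render_instruction(max_length: usize) -> Result<String> {
  // Illustrative template; the project's INSTRUCTION_TEMPLATE is loaded from prompt content.
  let template = "Keep the commit subject under {{max_length}} characters.";
  mustache::compile_str(template)
    .map_err(|e| anyhow!("Template compilation error: {}", e))?
    .render_to_string(&hashmap! {
      "max_length" => max_length.to_string()
    })
    .map_err(|e| anyhow!("Template rendering error: {}", e))
}

fn main() -> Result<()> {
  assert_eq!(
    render_instruction(72)?,
    "Keep the commit subject under 72 characters."
  );
  Ok(())
}
```
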

src/hook.rs

Lines changed: 17 additions & 172 deletions
@@ -279,33 +279,28 @@ fn process_chunk(
     };
 
     if max_tokens_per_file == 0 {
-      // No tokens left to allocate, skip remaining files
-      break;
+      continue;
     }
 
     let token_count = *token_count;
     let allocated_tokens = token_count.min(max_tokens_per_file);
 
-    // Attempt to atomically update remaining tokens
-    match remaining_tokens.fetch_update(Ordering::SeqCst, Ordering::SeqCst, |current| {
-      if current >= allocated_tokens {
-        Some(current - allocated_tokens)
-      } else {
-        None
-      }
-    }) {
-      Ok(_) => {
-        let processed_content = if token_count > allocated_tokens {
-          model.truncate(content, allocated_tokens)?
-        } else {
-          content.clone()
-        };
-        chunk_results.push((path.clone(), processed_content));
-      }
-      Err(_) => {
-        // Failed to allocate tokens, skip remaining files
-        break;
-      }
+    if remaining_tokens
+      .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |current| {
+        if current >= allocated_tokens {
+          Some(current - allocated_tokens)
+        } else {
+          None
+        }
+      })
+      .is_ok()
+    {
+      let processed_content = if token_count > allocated_tokens {
+        model.truncate(content, allocated_tokens)?
+      } else {
+        content.clone()
+      };
+      chunk_results.push((path.clone(), processed_content));
     }
   }
 
@@ -371,7 +366,7 @@ impl PatchRepository for Repository {
 
   fn configure_diff_options(&self, opts: &mut DiffOptions) {
     opts
-      .ignore_whitespace_change(false)
+      .ignore_whitespace_change(true)
      .recurse_untracked_dirs(true)
      .recurse_ignored_dirs(false)
      .ignore_whitespace_eol(true)
@@ -408,8 +403,6 @@
 
 #[cfg(test)]
 mod tests {
-  use tempfile::TempDir;
-
   use super::*;
 
   #[test]
@@ -452,152 +445,4 @@ mod tests {
 
     assert_eq!(pool.strings.len(), MAX_POOL_SIZE);
   }
-
-  #[test]
-  fn test_process_chunk_token_allocation() {
-    let model = Arc::new(Model::default());
-    let total_files = 3;
-    let processed_files = Arc::new(AtomicUsize::new(0));
-    let remaining_tokens = Arc::new(AtomicUsize::new(60)); // Reduced to force allocation limits
-    let result_chunks = Arc::new(Mutex::new(Vec::new()));
-
-    let chunk = vec![
-      (PathBuf::from("file1.txt"), "content1".to_string(), 50),
-      (PathBuf::from("file2.txt"), "content2".to_string(), 40),
-      (PathBuf::from("file3.txt"), "content3".to_string(), 30),
-    ];
-
-    process_chunk(&chunk, &model, total_files, &processed_files, &remaining_tokens, &result_chunks).unwrap();
-
-    let results = result_chunks.lock();
-    // With 60 total tokens and 3 files:
-    // First file gets 20 tokens (60/3)
-    // Second file gets 30 tokens (40/2)
-    // Third file gets 10 tokens (10/1)
-    assert_eq!(results.len(), 3);
-    assert_eq!(remaining_tokens.load(Ordering::SeqCst), 0);
-    assert_eq!(processed_files.load(Ordering::SeqCst), 3);
-  }
-
-  #[test]
-  fn test_process_chunk_concurrent_safety() {
-    use std::thread;
-
-    let model = Arc::new(Model::default());
-    let total_files = 6;
-    let processed_files = Arc::new(AtomicUsize::new(0));
-    let remaining_tokens = Arc::new(AtomicUsize::new(100));
-    let result_chunks = Arc::new(Mutex::new(Vec::new()));
-
-    let chunk1 = vec![
-      (PathBuf::from("file1.txt"), "content1".to_string(), 20),
-      (PathBuf::from("file2.txt"), "content2".to_string(), 20),
-      (PathBuf::from("file3.txt"), "content3".to_string(), 20),
-    ];
-
-    let chunk2 = vec![
-      (PathBuf::from("file4.txt"), "content4".to_string(), 20),
-      (PathBuf::from("file5.txt"), "content5".to_string(), 20),
-      (PathBuf::from("file6.txt"), "content6".to_string(), 20),
-    ];
-
-    // Clone values for thread 2
-    let model2 = model.clone();
-    let processed_files2 = processed_files.clone();
-    let remaining_tokens2 = remaining_tokens.clone();
-    let result_chunks2 = result_chunks.clone();
-
-    // Clone values for main thread access after threads complete
-    let processed_files_main = processed_files.clone();
-    let remaining_tokens_main = remaining_tokens.clone();
-    let result_chunks_main = result_chunks.clone();
-
-    let t1 = thread::spawn(move || {
-      process_chunk(&chunk1, &model, total_files, &processed_files, &remaining_tokens, &result_chunks).unwrap();
-    });
-
-    let t2 = thread::spawn(move || {
-      process_chunk(&chunk2, &model2, total_files, &processed_files2, &remaining_tokens2, &result_chunks2).unwrap();
-    });
-
-    t1.join().unwrap();
-    t2.join().unwrap();
-
-    let results = result_chunks_main.lock();
-    assert_eq!(results.len(), 6);
-    assert_eq!(remaining_tokens_main.load(Ordering::SeqCst), 0);
-    assert_eq!(processed_files_main.load(Ordering::SeqCst), 6);
-  }
-
-  #[test]
-  fn test_to_commit_diff_with_head() -> Result<()> {
-    let temp_dir = TempDir::new()?;
-    let repo = Repository::init(temp_dir.path())?;
-    let mut index = repo.index()?;
-
-    // Create a file and stage it
-    let file_path = temp_dir.path().join("test.txt");
-    std::fs::write(&file_path, "initial content")?;
-    index.add_path(file_path.strip_prefix(temp_dir.path())?)?;
-    index.write()?;
-
-    // Create initial commit
-    let tree_id = index.write_tree()?;
-    let tree = repo.find_tree(tree_id)?;
-    let signature = git2::Signature::now("test", "[email protected]")?;
-    repo.commit(Some("HEAD"), &signature, &signature, "Initial commit", &tree, &[])?;
-
-    // Modify and stage the file
-    std::fs::write(&file_path, "modified content")?;
-    index.add_path(file_path.strip_prefix(temp_dir.path())?)?;
-    index.write()?;
-
-    // Get HEAD tree
-    let head = repo.head()?.peel_to_tree()?;
-
-    // Get diff
-    let diff = repo.to_commit_diff(Some(head))?;
-
-    // Verify diff shows only staged changes
-    let mut diff_found = false;
-    diff.print(DiffFormat::Patch, |_delta, _hunk, line| {
-      let content = line.content().to_utf8();
-      if line.origin() == '+' && content.contains("modified content") {
-        diff_found = true;
-      }
-      true
-    })?;
-
-    assert!(diff_found, "Expected to find staged changes in diff");
-    Ok(())
-  }
-
-  #[test]
-  fn test_to_commit_diff_without_head() -> Result<()> {
-    let temp_dir = TempDir::new()?;
-    let repo = Repository::init(temp_dir.path())?;
-    let mut index = repo.index()?;
-
-    // Create and stage a new file
-    let file_path = temp_dir.path().join("test.txt");
-    std::fs::write(&file_path, "test content")?;
-    index.add_path(file_path.strip_prefix(temp_dir.path())?)?;
-    index.write()?;
-
-    // Get diff (no HEAD exists yet)
-    let diff = repo.to_commit_diff(None)?;
-
-    // Verify diff shows staged changes
-    let mut diff_found = false;
-    diff.print(DiffFormat::Patch, |_delta, _hunk, line| {
-      let content = line.content().to_utf8();
-      if line.origin() == '+' && content.contains("test content") {
-        diff_found = true;
-      }
-      true
-    })?;
-
-    assert!(diff_found, "Expected to find staged changes in diff");
-    Ok(())
-  }
 }
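
The rewritten allocation in `process_chunk` above replaces the `match` on `fetch_update` with an `if ... .is_ok()` and turns `break` into `continue`, so a file that cannot be funded no longer stops the rest of the chunk. Here is a self-contained sketch of that compare-and-swap budget pattern outside the real `process_chunk` signature; the function and variable names are illustrative only:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Atomically reserve `allocated` tokens from a shared budget.
/// Succeeds only if the full amount is still available, mirroring the diff above.
fn try_reserve(budget: &AtomicUsize, allocated: usize) -> bool {
  budget
    .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |current| {
      if current >= allocated {
        Some(current - allocated)
      } else {
        None // not enough left; leave the counter untouched
      }
    })
    .is_ok()
}

fn main() {
  let budget = AtomicUsize::new(60);
  for (name, want) in [("file1.txt", 20), ("file2.txt", 30), ("file3.txt", 20)] {
    if try_reserve(&budget, want) {
      println!("{name}: reserved {want} tokens");
    } else {
      // Mirrors the new `continue`: an unfundable file is skipped,
      // not a reason to abandon the remaining files.
      println!("{name}: skipped, not enough budget left");
    }
  }
}
```
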

src/install.rs

Lines changed: 0 additions & 34 deletions
This file was deleted.

src/openai.rs

Lines changed: 7 additions & 6 deletions
@@ -145,17 +145,18 @@ pub async fn call(request: Request) -> Result<Response> {
         "ERROR:".bold().bright_red(),
         "Network error:".bright_white(),
         e.to_string().dimmed(),
-        "Failed to connect to OpenAI service.".dimmed(),
+        "Failed to connect to OpenAI API.".dimmed(),
         "Check your internet connection".yellow(),
-        "Verify OpenAI service is not experiencing downtime".yellow()
+        "Verify OpenAI service availability".yellow()
       ),
     _ =>
       format!(
-        "{} {}\n {}\n\nDetails:\n {}",
+        "{} {}\n {}\n\nDetails:\n {}\n\nSuggested Actions:\n 1. {}",
         "ERROR:".bold().bright_red(),
         "Unexpected error:".bright_white(),
         err.to_string().dimmed(),
-        "An unexpected error occurred while communicating with OpenAI.".dimmed()
+        "An unexpected error occurred while calling OpenAI API.".dimmed(),
+        "Please report this issue on GitHub".yellow()
       ),
   };
   return Err(anyhow!(error_msg));
@@ -165,11 +166,11 @@ pub async fn call(request: Request) -> Result<Response> {
   let content = response
     .choices
     .first()
-    .context("No choices returned")?
+    .context("No response choices available")?
     .message
     .content
     .clone()
-    .context("No content returned")?;
+    .context("Response content is empty")?;
 
   Ok(Response { response: content })
 }

src/profiling.rs

Lines changed: 1 addition & 2 deletions
@@ -29,7 +29,6 @@ impl Drop for Profile {
 #[macro_export]
 macro_rules! profile {
   ($name:expr) => {
-    // Currently a no-op, but can be expanded for actual profiling
-    let _profile_span = $name;
+    let _profile = $crate::Profile::new($name);
   };
 }
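
With this change, `profile!("...")` binds a real `Profile` guard instead of a no-op value, so the measurement ends when the guard is dropped at the end of the enclosing scope. The `Profile` type itself is not shown in this diff; below is a minimal sketch of an RAII timer matching the visible `Profile::new` / `Drop` shape, where the field names, the `&'static str` parameter, and the log format are assumptions:

```rust
use std::time::Instant;

// Sketch only: the real Profile lives in src/profiling.rs and may differ.
pub struct Profile {
  name: &'static str,
  start: Instant,
}

impl Profile {
  pub fn new(name: &'static str) -> Self {
    Self { name, start: Instant::now() }
  }
}

impl Drop for Profile {
  fn drop(&mut self) {
    // Report how long the guarded scope took once the guard goes out of scope.
    eprintln!("{} took {:?}", self.name, self.start.elapsed());
  }
}

macro_rules! profile {
  ($name:expr) => {
    let _profile = Profile::new($name);
  };
}

fn main() {
  profile!("Generate instruction template");
  // ... work here is measured until the end of this scope ...
}
```
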

0 commit comments
