fix: Resolve template compilation errors in crates.io installation #85
Merged
Conversation
refactor: Rename Java-style getters to Rust naming conventions (I2 pattern)

- Renamed 22 get_*() methods to Rust-idiomatic names (removed the get_ prefix)
- Updated all call sites across the codebase
- Fixed broken tests that relied on non-existent methods
- Updated copilot-instructions.md with the py command and bash ! escaping

Changed methods:
- get_tool() → tool()
- get_gpu_layers() → gpu_layers()
- get_backend_info() → backend_info()
- get_metrics() → metrics()
- get_model() → model()
- get_usage_stats() → usage_stats()
- get_preload_stats() → preload_stats()
- get_model_info() → model_info()
- get_allocated_ports() → allocated_ports()
- get_mlx_info() → mlx_info()
- get_stats() → stats()
- get_checked_invariants() → checked_invariants()
- get_failed_invariants() → failed_invariants()
- get_memory_usage() → memory_usage()
- get_cpu_usage() → cpu_usage()
- get_disk_usage() → disk_usage()

Fixes: I2 audit pattern (Java-style getters)
Test: cargo test --lib (295/295 passing)
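The rename follows the Rust API Guidelines getter convention (C-GETTER): a getter is named after the field it exposes, with no get_ prefix. A minimal sketch of the pattern — the struct and field below are illustrative, not Shimmy's actual types:

```rust
struct Metrics {
    request_count: u64,
}

impl Metrics {
    /// Rust-idiomatic getter: named after the field, no `get_` prefix.
    /// (Before the refactor this would have been `get_request_count()`.)
    fn request_count(&self) -> u64 {
        self.request_count
    }
}

fn main() {
    let m = Metrics { request_count: 3 };
    // Call sites change from m.get_request_count() to m.request_count().
    assert_eq!(m.request_count(), 3);
}
```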
refactor: Replace all production unwraps with proper error handling (N5 pattern)

Phase 2 of systematic audit cleanup - replaced 14 production unwraps:

src/metrics.rs (5 unwraps):
- config.as_ref().unwrap() → match with early return
- Mutex locks (request_times, endpoints_used, models_used) → unwrap_or_else with panic

src/openai_compat.rs (3 unwraps):
- JSON serialization unwraps → unwrap_or_else with error logging + fallback

src/preloading.rs (2 unwraps):
- stats.get().unwrap() → unwrap_or(&default)

src/model_manager.rs (1 unwrap):
- partial_cmp().unwrap() → unwrap_or(Ordering::Equal)

src/workflow.rs (1 unwrap):
- strip_prefix().unwrap() → unwrap_or(fallback)

src/engine/llama.rs (1 unwrap):
- Mutex lock (no-op function, kept unwrap_or_else with panic)

src/observability/mod.rs (1 unwrap):
- partial_cmp().unwrap() → unwrap_or(Ordering::Equal)

Note: 226+ unwraps remain in test code (acceptable - tests should panic). All 295 unit tests passing.
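As one example of the N5 fixes, the partial_cmp().unwrap() replacement used in src/model_manager.rs and src/observability/mod.rs follows this shape (the helper function and data here are illustrative, not Shimmy's actual code):

```rust
use std::cmp::Ordering;

// Illustrative helper: find the best score without panicking on NaN.
fn max_score(scores: &[f64]) -> Option<f64> {
    scores
        .iter()
        .copied()
        // Before: a.partial_cmp(b).unwrap() — panics the moment a NaN appears,
        // because floats are only partially ordered.
        // After: treat incomparable values as equal instead of panicking.
        .max_by(|a, b| a.partial_cmp(b).unwrap_or(Ordering::Equal))
}

fn main() {
    assert_eq!(max_score(&[0.1, 0.9, 0.5]), Some(0.9));
    assert_eq!(max_score(&[]), None);
}
```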
refactor: Add typed errors for workflow and safetensors modules (A3_stringly pattern - Part 1)

Phase 3 of systematic audit cleanup - replaced string-based errors with typed ShimmyError variants.

New error variants added to src/error.rs:
- WorkflowStepNotFound
- WorkflowVariableNotFound
- WorkflowCircularDependency
- UnsupportedOperation
- ToolExecutionFailed
- InvalidPath
- FileNotFound
- ScriptExecutionFailed
- ProcessFailed
- SafeTensorsConversionNeeded
- PortAllocationFailed
- DiscoveryFailed
- ToolNotFound

Converted call sites:
- src/workflow.rs: 7 string errors → typed
- src/safetensors_adapter.rs: 4 string errors → typed

All 295 unit tests passing.
refactor: Add typed errors for workflow, safetensors, tools, preloading (A3_stringly pattern - Part 2)

Phase 3 Part 2 of systematic audit cleanup - replaced string-based errors with typed ShimmyError.

New error variants added to src/error.rs:
- MissingParameter (for tool arguments)
- MlxNotAvailable, MlxIncompatible, NotImplemented
- UnsupportedBackend
- PythonDependenciesMissing, ModelVerificationFailed

Files converted to typed errors:
- src/workflow.rs: 7 errors → ShimmyError variants
- src/safetensors_adapter.rs: 4 errors → ShimmyError variants
- src/tools.rs: 3 parameter errors + parse errors → ShimmyError
- src/preloading.rs: 2 model-not-found errors → ShimmyError::ModelNotFound

Note: The engine layer (llama, mlx, huggingface, adapter, safetensors_native, universal) keeps anyhow::Result to avoid deep refactoring of third-party error conversions. The engine provides a clean boundary - higher-level code uses ShimmyError.

All 295 unit tests passing.
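The A3_stringly fix trades anyhow-style string errors for an enum callers can match on. A minimal sketch of the pattern, using std only — the variant names come from the commits above, but the payload types and the load_model function are assumptions, not Shimmy's actual definitions:

```rust
use std::fmt;

// Sketch of a typed error enum replacing stringly errors.
// Variant names are from the commit; field types here are assumed.
#[derive(Debug)]
enum ShimmyError {
    ModelNotFound { name: String },
    MissingParameter { name: String },
}

impl fmt::Display for ShimmyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ShimmyError::ModelNotFound { name } => write!(f, "model not found: {name}"),
            ShimmyError::MissingParameter { name } => write!(f, "missing parameter: {name}"),
        }
    }
}

impl std::error::Error for ShimmyError {}

// Before: Err(anyhow::anyhow!("model not found: {}", name)) — callers can only
// string-match the message. After: a typed variant callers can match on.
fn load_model(name: &str) -> Result<(), ShimmyError> {
    Err(ShimmyError::ModelNotFound { name: name.into() })
}

fn main() {
    assert!(matches!(
        load_model("phi-3"),
        Err(ShimmyError::ModelNotFound { .. })
    ));
}
```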
fix: Formatting and clippy warnings from refactoring

- Fixed backend_info() call site in main.rs (still used the old get_backend_info name)
- Removed unused import GLOBAL_PORT_ALLOCATOR from cli.rs
- Fixed trailing whitespace in safetensors_adapter.rs
- Ran cargo fmt to fix all formatting issues
- Removed unnecessary .into() conversions (clippy::useless_conversion)
- Prefixed unused test variables with an underscore

All regression tests should now pass.
docs: Update copilot instructions with Phase 1-3 cleanup progress

Documented completed work:
- Phase 1: I2 (Java getters) - 22 methods renamed
- Phase 2: N5 (unwraps) - 14 production unwraps fixed
- Phase 3: A3_stringly (typed errors) - 16+ string errors converted
- Formatting & clippy fixes

Updated status to reflect:
- 5 commits ahead of origin/main
- Ready to create feature branch for PR workflow
- Next steps: branch creation, regression tests, issue queue review
feat: Add MoE CPU offloading support (--cpu-moe, --n-cpu-moe)

- Add global CLI flags: --cpu-moe and --n-cpu-moe N
- Integrate MoE configuration through the engine adapter
- Use local llama-cpp-rs fork with MoE support (feat/moe-cpu-offload branch)
- Fix ANSI color output (respects NO_COLOR and TERM env vars)

This enables running large MoE models like GPT-OSS 20B on consumer GPUs by offloading expert tensors to CPU memory, reducing VRAM requirements.

Related: Issue #81, llama-cpp-rs PR pending
fix: Apply MoE config to serve command's enhanced engine

The serve command was creating a new LlamaEngine without the MoE configuration, causing the --cpu-moe and --n-cpu-moe flags to be ignored when auto-registering discovered models. The serve command now creates enhanced_engine with the same MoE config as the initial engine, ensuring expert tensor offloading works in serve mode.

Verified: 144 expert tensors offloaded to CPU with the GPT-OSS 20B model.
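The shape of this fix — build the second engine from the first engine's configuration rather than from defaults — can be sketched as follows. All types and fields here are hypothetical stand-ins, not Shimmy's real LlamaEngine API:

```rust
// Hypothetical types illustrating the serve-command fix: the enhanced
// engine must inherit the MoE config instead of using defaults.
#[derive(Clone, Debug, PartialEq)]
struct MoeConfig {
    cpu_moe: bool,
    n_cpu_moe: Option<u32>,
}

struct LlamaEngine {
    moe: MoeConfig,
}

impl LlamaEngine {
    fn new(moe: MoeConfig) -> Self {
        Self { moe }
    }
}

fn main() {
    let moe = MoeConfig { cpu_moe: true, n_cpu_moe: None };
    let engine = LlamaEngine::new(moe.clone());
    // Bug: serve rebuilt the engine with a default config, silently
    // dropping --cpu-moe / --n-cpu-moe.
    // Fix: reuse the initial engine's MoE config for the enhanced engine.
    let enhanced_engine = LlamaEngine::new(engine.moe.clone());
    assert_eq!(engine.moe, enhanced_engine.moe);
}
```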
fix: Resolve template compilation errors in crates.io installation

- Remove async from template generation functions (they were synchronous)
- Eliminate the nested tokio runtime that caused panics during template generation
- Fix issue #83, where cargo install shimmy failed due to missing template files
- Template files are properly included in the package; the runtime issue was the blocker

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
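The underlying hazard: a function needlessly marked async forces synchronous callers to spin up a runtime with block_on, and tokio panics if that happens inside an already-running runtime. A sketch of the after state — the function name and template content are illustrative, not Shimmy's actual code:

```rust
// Before (sketch): the generator was declared `async` even though it did no
// awaiting, so callers inside the server's tokio runtime spun up a second
// runtime via block_on — which panics with
// "Cannot start a runtime from within a runtime".
//
//   async fn generate_template(name: &str) -> String { ... }
//
// After: a plain synchronous function; templates are embedded at compile
// time, so no runtime is involved at all.
fn generate_template(name: &str) -> &'static str {
    match name {
        // include_str! embeds the file into the binary at compile time, which
        // is why the templates survive `cargo install` from crates.io.
        // (Stand-in literal; the real code would use something like
        //  include_str! on a path in the crate.)
        "docker" => "FROM rust:slim\n",
        _ => "",
    }
}

fn main() {
    assert!(generate_template("docker").starts_with("FROM"));
    assert_eq!(generate_template("unknown"), "");
}
```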
Michael-A-Kuykendall added a commit that referenced this pull request on Oct 13, 2025
Summary

Fixes #83 - resolves the "Cannot install" error when running:

cargo install shimmy --features huggingface

Root Cause

The issue had two parts: the template generation functions were declared async even though they did no asynchronous work, and invoking them spun up a nested tokio runtime that panicked during template generation.

Changes Made

- Removed async from all template generation functions

Testing

- cargo build --no-default-features --features huggingface
- shimmy init --template docker --output ./test-templates
- Confirmed template files are embedded in the package via include_str!()

Impact

🤖 Generated with Claude Code