Commit 37ff40a

Merge pull request #28 from lablup/feature/ssh-client-compatible-config
feat: Add SSH configuration file parsing (-F option)
2 parents 9c6be63 + ea20b32 commit 37ff40a

18 files changed

Lines changed: 4592 additions & 81 deletions

ARCHITECTURE.md

Lines changed: 120 additions & 1 deletion
@@ -230,7 +230,87 @@ Focus on more impactful optimizations like:
 - Early termination on critical failures
 - Parallel DNS resolution
 
-### 6. Node Management (`node.rs`)
+### 6. SSH Configuration Caching (`ssh/config_cache.rs`)
+
+**Status:** Implemented (Phase 4, 2025-08-28)
+
+**Design Motivation:**
+SSH configuration files are frequently accessed and parsed during bssh operations, especially for multi-node commands. Caching eliminates redundant file I/O and parsing overhead, providing significant performance improvements for repeated operations.
+
+**Implementation Details:**
+- **LRU Cache:** Uses `lru` crate with configurable size (default: 100 entries)
+- **TTL Support:** Time-to-live expiration (default: 5 minutes)
+- **File Modification Detection:** Automatic cache invalidation via file mtime comparison
+- **Thread Safety:** `Arc<RwLock<LruCache>>` for concurrent access
+- **Global Instance:** Lazy-initialized singleton via `once_cell`
+
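Reviewer's sketch (not part of this commit): the bullets above describe a shared LRU cache behind `Arc<RwLock<...>>` with a lazily initialized global instance. The block below illustrates that shape using only the standard library, so it substitutes a hand-rolled `TinyLru` for the `lru` crate and `std::sync::OnceLock` for `once_cell`. `ParsedConfig`, `TinyLru`, and `global_cache` are hypothetical names chosen for illustration.

```rust
use std::collections::{HashMap, VecDeque};
use std::path::{Path, PathBuf};
use std::sync::{Arc, OnceLock, RwLock};

/// Hypothetical stand-in for the parsed configuration type.
#[derive(Clone, Debug, PartialEq)]
struct ParsedConfig(String);

/// Minimal LRU: `order` lists keys from least to most recently used.
struct TinyLru {
    capacity: usize,
    map: HashMap<PathBuf, ParsedConfig>,
    order: VecDeque<PathBuf>,
}

impl TinyLru {
    fn new(capacity: usize) -> Self {
        Self { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    /// Move `key` to the most-recently-used end of the queue.
    fn touch(&mut self, key: &Path) {
        self.order.retain(|k| k.as_path() != key);
        self.order.push_back(key.to_path_buf());
    }

    /// A hit updates recency, so lookups need mutable access.
    fn get(&mut self, key: &Path) -> Option<ParsedConfig> {
        let value = self.map.get(key).cloned()?;
        self.touch(key);
        Some(value)
    }

    /// Insert or overwrite; evict the least recently used key when full.
    fn put(&mut self, key: PathBuf, value: ParsedConfig) {
        if !self.map.contains_key(&key) && self.map.len() >= self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.map.remove(&oldest);
            }
        }
        self.touch(&key);
        self.map.insert(key, value);
    }
}

type SharedCache = Arc<RwLock<TinyLru>>;

/// Lazily initialized process-wide cache (100 entries, matching the documented default).
fn global_cache() -> &'static SharedCache {
    static CACHE: OnceLock<SharedCache> = OnceLock::new();
    CACHE.get_or_init(|| Arc::new(RwLock::new(TinyLru::new(100))))
}
```

Because a hit reorders the queue, callers in this sketch take the write lock even for lookups, for example `global_cache().write().unwrap().get(path)`. Whether the commit does the same is not visible in this diff.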
+**Cache Entry Structure:**
+```rust
+struct CacheEntry {
+    config: SshConfig, // Parsed SSH configuration
+    cached_at: Instant, // Creation timestamp
+    file_mtime: SystemTime, // File modification time
+    access_count: u64, // Number of accesses
+    last_accessed: Instant, // Last access timestamp
+}
+```
+
+**Cache Invalidation Strategy:**
+1. **TTL Expiration:** Remove entries older than configured TTL
+2. **File Modification:** Detect changes via mtime comparison
+3. **LRU Eviction:** Remove least recently used entries when full
+4. **Manual Maintenance:** Periodic cleanup of expired entries
+
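Reviewer's sketch (not part of this commit): the first two invalidation rules above can be expressed as a single validity check on a cache entry. The code is a minimal illustration under stated assumptions. `EntryMeta`, `Validity`, and `check_entry` are hypothetical names, and the order of the checks (TTL first, then mtime) is a guess rather than something the diff confirms.

```rust
use std::time::{Duration, Instant, SystemTime};

/// Trimmed-down entry holding only the fields the check needs.
struct EntryMeta {
    cached_at: Instant,
    file_mtime: SystemTime,
}

#[derive(Debug, PartialEq)]
enum Validity {
    Fresh,
    Expired,     // entry is at least as old as the TTL
    FileChanged, // on-disk mtime differs from the one recorded at cache time
}

/// `now` and `current_mtime` are passed in so the check stays deterministic.
fn check_entry(
    entry: &EntryMeta,
    ttl: Duration,
    now: Instant,
    current_mtime: SystemTime,
) -> Validity {
    if now.duration_since(entry.cached_at) >= ttl {
        Validity::Expired
    } else if current_mtime != entry.file_mtime {
        Validity::FileChanged
    } else {
        Validity::Fresh
    }
}
```

In real use `current_mtime` would come from something like `std::fs::metadata(path)?.modified()?`. Comparing for inequality rather than "newer than" also catches files restored from a backup with an older timestamp.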
+**API Design:**
+```rust
+// Cached versions (recommended)
+SshConfig::load_from_file_cached(path)?;
+SshConfig::load_default_cached()?;
+
+// Original versions (still supported)
+SshConfig::load_from_file(path)?;
+SshConfig::load_default()?;
+
+// Direct cache access
+GLOBAL_CACHE.stats();
+GLOBAL_CACHE.clear();
+GLOBAL_CACHE.maintain();
+```
+
+**Configuration (Environment Variables):**
+- `BSSH_CACHE_ENABLED=true/false` - Enable/disable caching (default: true)
+- `BSSH_CACHE_SIZE=100` - Maximum entries (default: 100)
+- `BSSH_CACHE_TTL=300` - TTL in seconds (default: 300)
+
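Reviewer's sketch (not part of this commit): one way the three variables above could be turned into settings with the documented defaults. `CacheSettings` and `settings_from` are hypothetical names. Two behaviours here are assumptions, not confirmed by the diff: unparsable values fall back to the default, and any `BSSH_CACHE_ENABLED` value other than `false` counts as enabled.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq)]
struct CacheSettings {
    enabled: bool,
    max_entries: usize,
    ttl: Duration,
}

/// `lookup` abstracts the environment so the function can be tested;
/// in real use pass `|k| std::env::var(k).ok()`.
fn settings_from(lookup: impl Fn(&str) -> Option<String>) -> CacheSettings {
    // Assumption: only a literal "false" (any case) disables the cache.
    let enabled = lookup("BSSH_CACHE_ENABLED")
        .map(|v| !v.trim().eq_ignore_ascii_case("false"))
        .unwrap_or(true);
    // Assumption: invalid numbers silently fall back to the documented default.
    let max_entries = lookup("BSSH_CACHE_SIZE")
        .and_then(|v| v.trim().parse::<usize>().ok())
        .unwrap_or(100);
    let ttl_secs = lookup("BSSH_CACHE_TTL")
        .and_then(|v| v.trim().parse::<u64>().ok())
        .unwrap_or(300);
    CacheSettings { enabled, max_entries, ttl: Duration::from_secs(ttl_secs) }
}
```

Injecting the lookup function keeps tests from mutating the process environment, which matters when tests run in parallel.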
+**Performance Impact:**
+- **Cache Hits:** 10-100x faster than file access
+- **Reduced I/O:** Eliminates repeated file reads
+- **Lower CPU:** Avoids re-parsing SSH config syntax
+- **Memory Overhead:** ~1KB per cached config entry
+
+**CLI Integration:**
+New `cache-stats` command provides comprehensive monitoring:
+```bash
+bssh cache-stats # Basic statistics
+bssh cache-stats --detailed # Per-entry information
+bssh cache-stats --clear # Clear cache
+bssh cache-stats --maintain # Remove expired entries
+```
+
+**Security Considerations:**
+- Path canonicalization prevents traversal attacks
+- No sensitive data cached (only configuration)
+- Atomic cache operations prevent corruption
+- Safe defaults for security-critical environments
+
+**Test Coverage:**
+- 10 comprehensive test cases covering all scenarios
+- Cache hit/miss behavior validation
+- File modification detection testing
+- TTL expiration and LRU eviction testing
+- Thread safety and concurrent access testing
+
+### 7. Node Management (`node.rs`)
 
 **Design Decisions:**
 - Flexible parsing supporting multiple formats
@@ -285,6 +365,45 @@ Focus on more impactful optimizations like:
 2. **Pipelining:** Send multiple commands in single session
 3. **Compression:** Enable SSH compression for large outputs
 4. **Caching:** Cache host keys and authentication
+5. **Environment Variable Caching:** Cache safe environment variables for path expansion
+
+### Environment Variable Caching (Added 2025-01-28)
+
+To improve performance during SSH configuration path expansion, bssh implements a comprehensive environment variable cache:
+
+**Implementation:** `src/ssh/ssh_config/env_cache.rs`
+- Thread-safe LRU cache with configurable TTL (default: 30 seconds)
+- Whitelisted safe variables only (HOME, USER, SSH_AUTH_SOCK, etc.)
+- O(1) lookups using HashMap storage
+- Automatic expiration and size-based eviction
+
+**Performance Impact:**
+- 6x faster path expansion (387µs → 60µs in benchmarks)
+- 99%+ cache hit rate in typical usage
+- Reduces system calls from repeated `std::env::var()` calls
+- Memory overhead: ~50 environment variables max (configurable)
+
+**Security Features:**
+- Only whitelisted safe variables are cached
+- Dangerous variables (PATH, LD_PRELOAD, etc.) are blocked
+- Defense-in-depth: both cache and path expansion validate safety
+- TTL prevents stale values from persisting
+
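Reviewer's sketch (not part of this commit): the allowlist and TTL behaviour described above, reduced to a single-threaded illustration. `EnvCache`, `SAFE_VARS`, and the `get` signature are hypothetical. The real allowlist is not shown in this diff, so the three names below are only the ones the text mentions. Locking and size-based eviction are left out.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Illustrative allowlist; the commit's actual list is not visible in this diff.
const SAFE_VARS: &[&str] = &["HOME", "USER", "SSH_AUTH_SOCK"];

struct EnvCache {
    ttl: Duration,
    /// Variable name -> (value at read time, when it was read).
    entries: HashMap<String, (Option<String>, Instant)>,
}

impl EnvCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Returns `Err` for names outside the allowlist and `Ok(None)` when unset.
    /// `read` is the underlying source; pass `|k| std::env::var(k).ok()` in real use.
    fn get(
        &mut self,
        name: &str,
        now: Instant,
        read: impl Fn(&str) -> Option<String>,
    ) -> Result<Option<String>, String> {
        if !SAFE_VARS.contains(&name) {
            return Err(format!("variable {name} is not on the allowlist"));
        }
        // Serve from the cache while the stored value is younger than the TTL.
        if let Some((value, stored_at)) = self.entries.get(name) {
            if now.duration_since(*stored_at) < self.ttl {
                return Ok(value.clone());
            }
        }
        // Miss or expired: re-read and remember the result, including "unset".
        let value = read(name);
        self.entries.insert(name.to_string(), (value.clone(), now));
        Ok(value)
    }
}
```

Rejecting disallowed names before touching the cache means a variable such as `LD_PRELOAD` is never read or stored, which matches the defense-in-depth intent stated above.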
+**Configuration:**
+- `BSSH_ENV_CACHE_TTL`: Cache TTL in seconds (default: 30)
+- `BSSH_ENV_CACHE_SIZE`: Max cache entries (default: 50)
+- `BSSH_ENV_CACHE_ENABLED`: Enable/disable caching (default: true)
+
+**Usage Pattern:**
+```rust
+// Automatic caching during path expansion
+let expanded = expand_path_internal("${HOME}/.ssh/config");
+
+// Direct cache access (for advanced use)
+if let Ok(Some(home)) = GLOBAL_ENV_CACHE.get_env_var("HOME") {
+    // Use cached HOME value
+}
+```
 
 ## Interactive Mode Architecture
 
Cargo.lock

Lines changed: 74 additions & 0 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 2 additions & 0 deletions
@@ -42,8 +42,10 @@ signal-hook = "0.3.18"
 atty = "0.2.14"
 arrayvec = "0.7.6"
 smallvec = "1.13.2"
+lru = "0.12"
 
 [dev-dependencies]
 tempfile = "3"
 mockito = "1"
 once_cell = "1.21.3"
+tokio-test = "0.4"

README.md

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@ A high-performance SSH client with **SSH-compatible syntax** for both single-hos
 - **Cross-Platform**: Works on Linux and macOS
 - **Output Management**: Save command outputs to files per node with detailed logging
 - **Interactive Mode**: Interactive shell sessions with single-node or multiplexed multi-node support
+- **SSH Config Caching**: High-performance caching of SSH configurations with TTL and file modification detection
 - **Configurable Timeouts**: Set command execution timeouts with support for unlimited execution (timeout=0)
 
 ## Installation

src/cli.rs

Lines changed: 19 additions & 3 deletions
@@ -21,8 +21,8 @@ use std::path::PathBuf;
     version,
     before_help = "\n\nBackend.AI SSH - Parallel command execution across cluster nodes",
     about = "Backend.AI SSH - SSH-compatible parallel command execution tool",
-    long_about = "bssh is a high-performance SSH client with parallel execution capabilities.\nIt can be used as a drop-in replacement for SSH (single host) or as a powerful cluster management tool (multiple hosts).\n\nThe tool provides secure file transfer using SFTP and supports SSH keys, SSH agent, and password authentication.\nIt automatically detects Backend.AI multi-node session environments.",
-    after_help = "EXAMPLES:\n SSH Mode:\n bssh user@host # Interactive shell\n bssh admin@server.com \"uptime\" # Execute command\n bssh -p 2222 -i ~/.ssh/key user@host # Custom port and key\n\n Multi-Server Mode:\n bssh -C production \"systemctl status\" # Use cluster config\n bssh -H \"web1,web2,web3\" \"df -h\" # Direct hosts\n\n File Operations:\n bssh -C staging upload file.txt /tmp/ # Upload to cluster\n bssh -H host1,host2 download /etc/hosts ./backups/\n\n Other Commands:\n bssh list # List configured clusters\n bssh -C production ping # Test connectivity\n\nDeveloped and maintained as part of the Backend.AI project.\nFor more information: https://github.com/lablup/bssh"
+    long_about = "bssh is a high-performance SSH client with parallel execution capabilities.\nIt can be used as a drop-in replacement for SSH (single host) or as a powerful cluster management tool (multiple hosts).\n\nThe tool provides secure file transfer using SFTP and supports SSH keys, SSH agent, and password authentication.\nIt automatically detects Backend.AI multi-node session environments.\n\nSSH Configuration Support:\n- Reads standard SSH config files (defaulting to ~/.ssh/config)\n- Supports Host patterns, HostName, User, Port, IdentityFile, StrictHostKeyChecking\n- ProxyJump, and many other SSH configuration directives\n- CLI arguments override SSH config values following SSH precedence rules",
+    after_help = "EXAMPLES:\n SSH Mode:\n bssh user@host # Interactive shell\n bssh admin@server.com \"uptime\" # Execute command\n bssh -p 2222 -i ~/.ssh/key user@host # Custom port and key\n bssh -F ~/.ssh/myconfig webserver # Use custom SSH config\n\n Multi-Server Mode:\n bssh -C production \"systemctl status\" # Use cluster config\n bssh -H \"web1,web2,web3\" \"df -h\" # Direct hosts\n bssh -F /etc/ssh/ssh_config -H web* # SSH config with wildcards\n\n File Operations:\n bssh -C staging upload file.txt /tmp/ # Upload to cluster\n bssh -H host1,host2 download /etc/hosts ./backups/\n\n Other Commands:\n bssh list # List configured clusters\n bssh -C production ping # Test connectivity\n\n SSH Config Example (~/.ssh/config):\n Host web*\n HostName web.example.com\n User webuser\n Port 2222\n IdentityFile ~/.ssh/web_key\n StrictHostKeyChecking yes\n\nDeveloped and maintained as part of the Backend.AI project.\nFor more information: https://github.com/lablup/bssh"
 )]
 pub struct Cli {
     /// SSH destination in format: [user@]hostname[:port] or ssh://[user@]hostname[:port]
@@ -137,7 +137,7 @@ pub struct Cli {
         short = 'F',
         long = "ssh-config",
         value_name = "configfile",
-        help = "Specifies an alternative SSH configuration file"
+        help = "Specifies an alternative SSH configuration file\nSupports standard SSH config format with Host, HostName, User, Port, IdentityFile, etc.\nDefaults to ~/.ssh/config if not specified and file exists"
     )]
     pub ssh_config: Option<PathBuf>,
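
Reviewer's sketch (not part of this commit): the new help text says `-F` names an alternative file and that `~/.ssh/config` is used when `-F` is absent and the file exists. A minimal illustration of that selection rule follows. `resolve_ssh_config` and its parameters are hypothetical, and how bssh reacts to a missing explicit `-F` file is not shown in this diff, so the sketch simply returns the explicit path unchanged.

```rust
use std::path::{Path, PathBuf};

/// Returns the config file to load, or `None` when no file should be read.
/// `exists` is injected so the rule can be exercised without touching the disk;
/// in real use pass `|p| p.is_file()`.
fn resolve_ssh_config(
    cli_path: Option<PathBuf>,
    home: Option<&Path>,
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    // An explicit -F value always takes precedence over the default location.
    if let Some(explicit) = cli_path {
        return Some(explicit);
    }
    // Otherwise fall back to ~/.ssh/config, but only if it is present.
    let default = home?.join(".ssh").join("config");
    exists(&default).then_some(default)
}
```

Returning `Option<PathBuf>` lets the caller treat "no config file" as a normal case rather than an error, which fits the "if not specified and file exists" wording.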

@@ -299,6 +299,22 @@ pub enum Commands {
         )]
         work_dir: Option<String>,
     },
+
+    #[command(
+        about = "Display SSH config cache statistics",
+        long_about = "Shows detailed statistics and debug information about the SSH configuration cache.\nIncludes hit rates, cache size, eviction counts, and entry details.\nUseful for performance monitoring and cache tuning.\n\nCache can be configured via environment variables:\n BSSH_CACHE_ENABLED=true/false - Enable/disable caching\n BSSH_CACHE_SIZE=100 - Maximum cache entries\n BSSH_CACHE_TTL=300 - TTL in seconds",
+        after_help = "Examples:\n bssh cache-stats # Show basic statistics\n bssh cache-stats --detailed # Show per-entry information\n bssh cache-stats --clear # Clear cache and show stats"
+    )]
+    CacheStats {
+        #[arg(long, help = "Show detailed per-entry information")]
+        detailed: bool,
+
+        #[arg(long, help = "Clear the cache before showing statistics")]
+        clear: bool,
+
+        #[arg(long, help = "Perform cache maintenance (remove expired entries)")]
+        maintain: bool,
+    },
 }
 
 impl Cli {
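
Reviewer's sketch (not part of this commit): the three `cache-stats` flags imply an order of operations, since `--clear` is documented as running before the statistics are shown. The block below shows one plausible ordering, plus the hit-rate figure the help text mentions. `Action`, `plan`, and `CacheCounters` are hypothetical, and running maintenance after a clear is an assumption.

```rust
/// Hypothetical counters; the real statistics type is not shown in this diff.
struct CacheCounters {
    hits: u64,
    misses: u64,
}

impl CacheCounters {
    /// Fraction of lookups served from the cache; 0.0 when nothing was looked up.
    fn hit_rate(&self) -> f64 {
        let total = self.hits + self.misses;
        if total == 0 {
            0.0
        } else {
            self.hits as f64 / total as f64
        }
    }
}

#[derive(Debug, PartialEq)]
enum Action {
    Clear,
    Maintain,
    ShowStats { detailed: bool },
}

/// Turn the three CLI flags into an ordered list of steps.
/// Mutating steps run first so the printed statistics reflect their effect.
fn plan(detailed: bool, clear: bool, maintain: bool) -> Vec<Action> {
    let mut steps = Vec::new();
    if clear {
        steps.push(Action::Clear);
    }
    if maintain {
        steps.push(Action::Maintain);
    }
    steps.push(Action::ShowStats { detailed });
    steps
}
```

Separating the plan from its execution keeps the flag handling testable without a live cache.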
