AiDotNet is a comprehensive .NET machine learning library (currently in preview) designed to:
- Bring the latest AI/ML algorithms to the .NET community
- Support both beginners and expert users, with extensive customization options
- Provide unified interfaces for diverse ML algorithms
- Target .NET 8.0 and .NET Framework 4.6.2+
Current Version: 0.0.5-preview
Dependencies: Newtonsoft.Json, Azure Search, Elasticsearch, Pinecone, Redis, PostgreSQL EF Core
Architecture: Modular namespace-based organization with extensive interface patterns
AiDotNet/
├── src/ # Main library source code
├── tests/ # Unit and integration tests
├── testconsole/ # Example console applications
├── AiDotNetBenchmarkTests/ # Performance benchmarks
├── data/ # Data files for examples
├── TemplateEngineHost/ # Template utilities
├── AiDotNet.sln # Main solution file
└── [config/docs files] # CI/CD, documentation, etc.
Core ML Components:
- `NeuralNetworks/` - 44 different NN implementations (CNN, LSTM, GAN, RNN, Transformer, Vision Transformer, etc.)
- `Optimizers/` - 40+ optimizer algorithms (Adam, SGD, Genetic Algorithm, Particle Swarm, etc.)
- `LossFunctions/` - Loss function implementations
- `ActivationFunctions/` - Activation functions for neural networks
Training & Meta-Learning:
- `MetaLearning/` - Meta-learning algorithms (Reptile trainer config/implementation)
  - `Config/` - Configuration classes for meta-learners
  - `Trainers/` - Trainer implementations
- `AutoML/` - Automated ML and hyperparameter search
  - `Architecture.cs`, `SuperNet.cs`, `NeuralArchitectureSearch.cs` - Supports trial-based search with constraints and result tracking
Model Management:
- `Models/` - Model definitions and metadata
  - `NeuralNetworkModel.cs` - Wrapper for neural networks
  - `ModelMetadata.cs` - Model metadata and properties
  - `Options/` - 170+ configuration option classes (AdamOptimizerOptions, various model options)
  - `Results/` - Result classes (OptimizationResult, MetaTrainingResult, etc.)
  - `Inputs/` - Input model classes
Data & Processing:
- `Data/` - Data abstractions and loaders
  - `Abstractions/` - MetaLearningTask
  - `Loaders/` - Episodic data loaders (Balanced, Curriculum, Stratified, Uniform)
- `DataProcessor/` - Data preprocessing utilities
- `Normalizers/` - Data normalization strategies
Evaluation & Validation:
- `Evaluation/` - Model evaluation
  - `DefaultModelEvaluator.cs` - Main evaluator implementation
- `CrossValidators/` - Cross-validation implementations
- `Validation/` - Validators for architectures, serialization, etc.
- `FitDetectors/` - Model fit detection algorithms
Feature Engineering:
- `FeatureSelectors/` - Feature selection algorithms
- `DecompositionMethods/` - Matrix and time series decomposition
- `Interpretability/` - Model interpretation tools
Advanced Techniques:
- `LoRA/` - Low-Rank Adaptation for fine-tuning
- `TransferLearning/` - Transfer learning algorithms and domain adaptation
- `RetrievalAugmentedGeneration/` - RAG implementation (17 subdirectories!)
- `GaussianProcesses/` - GP implementations
- `TimeSeries/` - Time series models
- `Regression/` - Various regression algorithms
Utilities:
- `Serialization/` - Model serialization (JSON converters)
- `Interfaces/` - 75+ interface definitions (critical for architecture)
- `Enums/` - 60+ enum types (AlgorithmTypes, ActivationFunction, MetricType, etc.)
- `Exceptions/` - Custom exception types
- `Factories/` - Factory pattern implementations
- `Helpers/` - Utility methods
- `LinearAlgebra/` - Linear algebra operations
- `NumericOperations/` - Numeric computation utilities
- `Caching/` - Caching implementations
- `Extensions/` - Extension methods
- `Regularization/` - Regularization techniques
- `Kernels/` - Kernel functions
- `OutlierRemoval/` - Outlier detection/removal
- `Statistics/` - Statistical utilities
- `Interpolation/` - Interpolation methods
- `RadialBasisFunctions/` - RBF implementations
- `WindowFunctions/` - Window functions
- `WaveletFunctions/` - Wavelet implementations
- `Images/` - Image processing utilities
| Pattern | Purpose | Examples |
|---|---|---|
| `{Algorithm}Optimizer` | Optimization algorithms | AdamOptimizer, GeneticAlgorithmOptimizer, LionOptimizer |
| `{Algorithm}OptimizerOptions` | Configuration for optimizers | AdamOptimizerOptions, LBFGSOptimizerOptions |
| `{NeuralNet}NeuralNetwork` | Different neural network types | LSTMNeuralNetwork, ConvolutionalNeuralNetwork, AttentionNetwork |
| `{Type}Trainer` / `{Type}TrainerConfig` | Meta-learning trainers and their configs | ReptileTrainer, ReptileTrainerConfig |
| `{Name}FitDetector` | Fit detection algorithms | ConfusionMatrixFitDetector, BootstrapFitDetector |
| `{Type}Model` | Model wrapper classes | NeuralNetworkModel, VectorModel |
| `{Operation}Result` | Result objects from operations | OptimizationResult, MetaTrainingResult, CrossValidationResult |
| `I{Interface}` | Interface definitions | IOptimizer, IModel, INeuralNetwork, IModelEvaluator |
| `{Type}Exception` | Custom exceptions | ModelTrainingException, TensorDimensionException |
| `{Concept}Loader` | Data loaders | BalancedEpisodicDataLoader, StratifiedEpisodicDataLoader |
- Generic Interfaces: `IModel<TInput, TOutput, TMetadata>`, `IOptimizer<T, TInput, TOutput>`
- Feature Interfaces: `IActivationFunction`, `ILossFunction`, `ILayer`
- Container Interfaces: `IModelCache<T, TInput, TOutput>`, `IAutoMLModel<T, TInput, TOutput>`
- Evaluation Interfaces: `IModelEvaluator<T, TInput, TOutput>`, `IFitDetector<T, TInput, TOutput>`
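The generic interface pattern can be illustrated with a minimal sketch. The interface and member names below are simplified approximations for illustration only, not AiDotNet's actual signatures:

```csharp
using System.Linq;

// Simplified approximations of the generic interface pattern;
// AiDotNet's real interfaces carry more members and constraints.
public interface IModelSketch<TInput, TOutput>
{
    TOutput Predict(TInput input);
}

public interface IModelEvaluatorSketch<T, TInput, TOutput>
{
    // Scores a model against one input/expected-output pair.
    T Evaluate(IModelSketch<TInput, TOutput> model, TInput input, TOutput expected);
}

// A trivial model: predicts the sum of its inputs.
public class SumModel : IModelSketch<double[], double>
{
    public double Predict(double[] input) => input.Sum();
}

// Absolute-error evaluator written against the generic model interface,
// so it works with any IModelSketch<double[], double> implementation.
public class AbsErrorEvaluator : IModelEvaluatorSketch<double, double[], double>
{
    public double Evaluate(IModelSketch<double[], double> model, double[] input, double expected)
        => System.Math.Abs(model.Predict(input) - expected);
}
```

The payoff of this shape is that evaluators, optimizers, and caches can be written once against the interfaces and reused across every model type.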
AiDotNet.{FeatureArea}[.{SubArea}]
Examples:
- AiDotNet.Optimizers
- AiDotNet.NeuralNetworks
- AiDotNet.MetaLearning.Trainers
- AiDotNet.Data.Loaders
- AiDotNet.DecompositionMethods.MatrixDecomposition
- AiDotNet.RetrievalAugmentedGeneration.Configuration
- Base config files at the algorithm level
- `{Algorithm}Options` classes in `Models/Options/`
- All inherit from common patterns with validation methods
- Support for parameter ranges and constraints (via AutoML classes)
- `{Operation}Result<T>` pattern for outcomes
- Located in `Models/Results/`
- Include metrics, history, and detailed information
- Support for nested results (e.g., `OptimizationResult` contains `DatasetResult`)
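A hedged sketch of the shape this pattern implies; the property names here are illustrative, not the library's actual members:

```csharp
using System.Collections.Generic;

// Illustrative shapes for the {Operation}Result pattern; the real
// OptimizationResult/DatasetResult members may differ.
public class DatasetResultSketch
{
    public double Loss { get; set; }
    public int SampleCount { get; set; }
}

public class OptimizationResultSketch<T>
{
    public T BestFitness { get; set; }
    public List<T> FitnessHistory { get; } = new List<T>();
    // Nested results, mirroring "OptimizationResult contains DatasetResult".
    public DatasetResultSketch TrainingResult { get; set; }
    public DatasetResultSketch ValidationResult { get; set; }
}
```

Because the result object carries metrics, history, and nested per-dataset detail together, callers get one value to inspect, serialize, or log.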
1. Optimizers (40+ implementations)
- Location: `/src/Optimizers/`
- Base class: `OptimizerBase<T, TInput, TOutput>`
- Features:
- Fitness calculation and caching
- Iteration history tracking
- Early stopping capability
- Adaptive parameters (learning rate, momentum)
- Model evaluation and fit detection
- Supports: Gradient-based, evolutionary, swarm-based, and hybrid optimization
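The history tracking and early stopping features above follow a standard shape. A minimal sketch of that loop, with patience-based early stopping; this is not `OptimizerBase`'s actual implementation:

```csharp
using System;
using System.Collections.Generic;

// Minimal iteration loop with history tracking and patience-based
// early stopping, in the spirit of the optimizer features listed above.
public static class EarlyStoppingLoop
{
    // Minimizes `evaluate` over iterations; stops when the fitness has
    // not improved for `patience` consecutive iterations.
    public static (double Best, int IterationsRun) Run(
        Func<int, double> evaluate, int maxIterations, int patience)
    {
        var history = new List<double>();
        double best = double.MaxValue;
        int sinceImprovement = 0;
        for (int i = 0; i < maxIterations; i++)
        {
            double fitness = evaluate(i);
            history.Add(fitness);
            if (fitness < best) { best = fitness; sinceImprovement = 0; }
            else if (++sinceImprovement >= patience) break;
        }
        return (best, history.Count);
    }
}
```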
2. Meta-Learning Framework
- Location: `/src/MetaLearning/`
- Current implementation: `ReptileTrainer` and `ReptileTrainerConfig`; `ReptileTrainerBase` for extension
- Configuration validation and parameter management
- Interfaces: `IMetaLearner<T, TInput, TOutput>`, `IMetaLearnerConfig<T>`
- Supports task-based learning and few-shot scenarios
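The core of the Reptile algorithm is its meta-update: after adapting a copy `phi` of the meta-parameters `theta` on a sampled task, `theta` moves a small step toward `phi` (`theta <- theta + epsilon * (phi - theta)`). A sketch of that step on a flat parameter vector; `ReptileTrainer`'s actual API is not shown here:

```csharp
using System;

// The Reptile meta-update on a flat parameter vector:
//     theta <- theta + epsilon * (phi - theta)
// where phi is the task-adapted copy of the meta-parameters theta.
public static class ReptileStep
{
    public static double[] MetaUpdate(double[] theta, double[] phi, double epsilon)
    {
        if (theta.Length != phi.Length)
            throw new ArgumentException("theta and phi must have the same length");
        var updated = new double[theta.Length];
        for (int i = 0; i < theta.Length; i++)
            updated[i] = theta[i] + epsilon * (phi[i] - theta[i]);
        return updated;
    }
}
```

Repeating this over many sampled tasks is what gives the few-shot behavior: `theta` drifts toward a point from which each task is reachable in a few gradient steps.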
3. AutoML Infrastructure
- Location: `/src/AutoML/`
- Key classes:
  - `AutoMLModelBase<T, TInput, TOutput>` - Base class for AutoML
  - `SuperNet.cs` - Supernetwork for architecture search
  - `NeuralArchitectureSearch.cs` - NAS implementation
  - `TrialResult` - Tracks individual trial outcomes
  - `SearchConstraint` and `ParameterRange` - Search space definition
- Features:
- Trial history management
- Early stopping with patience
- Metric-based optimization (maximize/minimize)
- Time and trial limits
- Result tracking: `TrialResult` class
4. Model Evaluation
- Location: `/src/Evaluation/`
- `DefaultModelEvaluator<T, TInput, TOutput>` - Main evaluator
- Supports various metrics and performance assessment
5. Fit Detection
- Location: `/src/FitDetectors/`
- Multiple strategies for model fit assessment
- Used in optimization and AutoML
6. Fitness Calculation
- Location: `/src/FitnessCalculators/`
- Genetic algorithm fitness and evolutionary strategies
7. Cross-Validation
- Location: `/src/CrossValidators/`
- Model validation strategies
8. Data Versioning & Episodic Loading
- Location: `/src/Data/Loaders/`
- Episodic data loaders for meta-learning:
  - `BalancedEpisodicDataLoader`
  - `CurriculumEpisodicDataLoader`
  - `StratifiedEpisodicDataLoader`
  - `UniformEpisodicDataLoader`
- Support for task-based data management
9. Model Serialization
- Location: `/src/Serialization/`
- JSON converters for matrices, tensors, vectors
- Basic model persistence
10. Caching Infrastructure
- Location: `/src/Caching/`
- `IModelCache<T, TInput, TOutput>` interface
- Used by optimizers to avoid redundant evaluations
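The idea behind this cache can be sketched as keying each evaluation by a deterministic fingerprint of the candidate's parameters, so an optimizer that revisits the same candidate skips re-evaluation. Everything below is illustrative, not the `IModelCache` implementation:

```csharp
using System;
using System.Collections.Generic;

// Illustrative fitness cache: optimizers ask the cache before paying
// for a (potentially expensive) model evaluation.
public class FitnessCache
{
    private readonly Dictionary<string, double> _cache = new();
    public int Hits { get; private set; }

    public double GetOrEvaluate(double[] parameters, Func<double[], double> evaluate)
    {
        string key = string.Join(",", parameters); // simple parameter fingerprint
        if (_cache.TryGetValue(key, out double cached)) { Hits++; return cached; }
        double fitness = evaluate(parameters);
        _cache[key] = fitness;
        return fitness;
    }
}
```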
- ❌ No Experiment Tracking: No centralized experiment management
- ❌ No Checkpoint Management: No model checkpoint/savepoint system
- ❌ No Training Monitoring/Logging: Limited real-time training metrics
- ❌ No Model Registry: No centralized model storage/versioning
- ❌ Limited Data Versioning: No version control for datasets
- ❌ No Distributed Training: No multi-GPU/distributed support
- ❌ No Hyperparameter Tuning Framework: Limited to AutoML integration
- ❌ No Training Callbacks: No hook system for training events
- ❌ No Advanced Logging: Basic logging only
The project heavily uses interfaces for flexibility:
Key Core Interfaces:
- `IModel<TInput, TOutput, TMetadata>` - All models implement
- `IOptimizer<T, TInput, TOutput>` - Common interface all optimizers inherit
- `IAutoMLModel<T, TInput, TOutput>` - AutoML implementations
- `IModelEvaluator<T, TInput, TOutput>` - Evaluation strategies
- `IMetaLearner<T, TInput, TOutput>` - Meta-learning algorithms
- `ILayer` - Neural network layers
- `INeuralNetwork` - Neural network base
- `IModelCache<T, TInput, TOutput>` - Caching strategies
- `IModelSerializer` - Serialization interface
- `IParameterizable` - Parameterizable components
Based on the analysis, here are recommendations for placing the 6 new components:
Recommended Location: /src/ExperimentTracking/
Structure:
ExperimentTracking/
├── Abstractions/
│ ├── IExperimentTracker.cs
│ ├── IExperimentRun.cs
│ ├── IMetricLogger.cs
│ └── IArtifactStorage.cs
├── Models/
│ ├── ExperimentMetadata.cs
│ ├── RunMetrics.cs
│ ├── TrainingEvent.cs
│ └── ExperimentConfiguration.cs
├── DefaultExperimentTracker.cs
├── ExperimentRunResult.cs
└── MetricAggregator.cs
Follows Pattern:
- Similar to `Models/Options/` and `Models/Results/` pattern
- Generic type parameters: `ExperimentTracker<T, TInput, TOutput>`
- Namespace: `AiDotNet.ExperimentTracking`
Integration Points:
- Register with `AiModelBuilder`
- Hook into `OptimizerBase` iteration loops
- Use existing `OptimizationIterationInfo` structure
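A minimal sketch of what the proposed `IExperimentTracker` surface could look like: start a run, log metrics per step, read them back. The names mirror the proposed files above, but everything here is hypothetical:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the proposed experiment-tracking surface.
public interface IExperimentTrackerSketch
{
    Guid StartRun(string experimentName);
    void LogMetric(Guid runId, string name, int step, double value);
    IReadOnlyList<(int Step, double Value)> GetMetric(Guid runId, string name);
}

// In-memory reference implementation, useful for tests and as the
// simplest possible backend.
public class InMemoryExperimentTracker : IExperimentTrackerSketch
{
    private readonly Dictionary<Guid, Dictionary<string, List<(int, double)>>> _runs = new();

    public Guid StartRun(string experimentName)
    {
        var id = Guid.NewGuid();
        _runs[id] = new Dictionary<string, List<(int, double)>>();
        return id;
    }

    public void LogMetric(Guid runId, string name, int step, double value)
    {
        var metrics = _runs[runId];
        if (!metrics.ContainsKey(name)) metrics[name] = new List<(int, double)>();
        metrics[name].Add((step, value));
    }

    public IReadOnlyList<(int Step, double Value)> GetMetric(Guid runId, string name)
        => _runs[runId][name];
}
```

An optimizer hook would then call `LogMetric` once per iteration, which is why the integration point with `OptimizerBase` loops matters.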
Recommended Location: /src/HyperparameterOptimization/
Structure:
HyperparameterOptimization/
├── Abstractions/
│ ├── IHyperparameterOptimizer.cs
│ ├── IParameterSpace.cs
│ └── ITrialScheduler.cs
├── Strategies/
│ ├── GridSearchStrategy.cs
│ ├── RandomSearchStrategy.cs
│ ├── BayesianOptimizationStrategy.cs
│ └── PopulationBasedTrainingStrategy.cs
├── Models/
│ ├── HyperparameterSearchConfig.cs
│ ├── HyperparameterTrial.cs
│ └── HyperparameterOptimizationResult.cs
├── ParameterDistribution.cs
├── TrialScheduler.cs
└── ConvergenceAnalyzer.cs
Follows Pattern:
- Extends existing `AutoML/` infrastructure
- Builds on `ParameterRange` and `SearchConstraint`
- Can reuse `TrialResult` from AutoML
- Namespace: `AiDotNet.HyperparameterOptimization`
Integration Points:
- Works with existing optimizer options
- Integrates with AutoML trial framework
- Uses `IModelEvaluator` for trial evaluation
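The simplest of the proposed strategies, `GridSearchStrategy`, can be sketched as a cartesian product over the parameter space with an objective evaluated per combination. The class name is the proposed one; the implementation is a plausible stand-in, not library code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative grid search: enumerate every combination of parameter
// values and keep the one with the lowest objective score.
public static class GridSearchSketch
{
    public static Dictionary<string, double> Search(
        Dictionary<string, double[]> space,
        Func<Dictionary<string, double>, double> objective)
    {
        var keys = space.Keys.ToArray();
        Dictionary<string, double> best = null;
        double bestScore = double.MaxValue;
        foreach (var combo in Combinations(space, keys, 0, new Dictionary<string, double>()))
        {
            double score = objective(combo);
            if (score < bestScore) { bestScore = score; best = combo; }
        }
        return best;
    }

    // Recursively walks the cartesian product, yielding a copy of each
    // complete parameter assignment.
    private static IEnumerable<Dictionary<string, double>> Combinations(
        Dictionary<string, double[]> space, string[] keys, int depth,
        Dictionary<string, double> current)
    {
        if (depth == keys.Length)
        {
            yield return new Dictionary<string, double>(current);
            yield break;
        }
        foreach (var value in space[keys[depth]])
        {
            current[keys[depth]] = value;
            foreach (var combo in Combinations(space, keys, depth + 1, current))
                yield return combo;
        }
    }
}
```

In the proposed design, the objective would wrap an `IModelEvaluator` call and each combination would be recorded as a trial.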
Recommended Location: /src/CheckpointManagement/
Structure:
CheckpointManagement/
├── Abstractions/
│ ├── ICheckpointManager.cs
│ ├── ICheckpointStorage.cs
│ └── ICheckpointRestoration.cs
├── Models/
│ ├── CheckpointMetadata.cs
│ ├── CheckpointInfo.cs
│ └── CheckpointValidation.cs
├── Strategies/
│ ├── BestMetricCheckpointStrategy.cs
│ ├── LatestCheckpointStrategy.cs
│ └── PeriodicCheckpointStrategy.cs
├── DefaultCheckpointManager.cs
├── CheckpointSerializer.cs
└── CheckpointRepository.cs
Follows Pattern:
- Similar to existing `Serialization/` module
- Uses existing `ModelSerializer` and `ModelMetadata`
- Namespace: `AiDotNet.CheckpointManagement`
- Generic: `CheckpointManager<T, TInput, TOutput>`
Integration Points:
- Leverage existing JSON serialization infrastructure
- Extend `IModelSerializer` interface
- Store checkpoint metadata in enhanced `ModelMetadata`
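The decision logic of the proposed `BestMetricCheckpointStrategy` is small enough to sketch in full: write a checkpoint only when the watched metric improves. The class name is the proposed one; the shape is hypothetical:

```csharp
using System;

// Illustrative best-metric checkpoint policy: ShouldCheckpoint returns
// true only when the watched metric improves on the best seen so far.
public class BestMetricCheckpointStrategy
{
    private double _best;
    private readonly bool _maximize;

    public BestMetricCheckpointStrategy(bool maximize)
    {
        _maximize = maximize;
        _best = maximize ? double.MinValue : double.MaxValue;
    }

    public bool ShouldCheckpoint(double metric)
    {
        bool improved = _maximize ? metric > _best : metric < _best;
        if (improved) _best = metric;
        return improved;
    }
}
```

The `Latest` and `Periodic` strategies in the proposed structure differ only in this predicate, which is why a strategy interface is the natural seam.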
Recommended Location: /src/TrainingMonitoring/
Structure:
TrainingMonitoring/
├── Abstractions/
│ ├── ITrainingMonitor.cs
│ ├── IMetricsCollector.cs
│ ├── ITrainingLogger.cs
│ └── IAlertSystem.cs
├── Models/
│ ├── TrainingMetrics.cs
│ ├── MetricSnapshot.cs
│ ├── PerformanceAlert.cs
│ └── TrainingStatistics.cs
├── Collectors/
│ ├── IterationMetricsCollector.cs
│ ├── EpochMetricsCollector.cs
│ └── ResourceMetricsCollector.cs
├── Loggers/
│ ├── ConsoleTrainingLogger.cs
│ ├── FileTrainingLogger.cs
│ └── RemoteTrainingLogger.cs
├── Alerts/
│ ├── ConvergenceAlert.cs
│ ├── OverfittingAlert.cs
│ └── DivergenceAlert.cs
├── DefaultTrainingMonitor.cs
└── MetricsAggregator.cs
Follows Pattern:
- Similar to `Evaluation/` module
- Reuses existing `OptimizationIterationInfo`
- Namespace: `AiDotNet.TrainingMonitoring`
Integration Points:
- Hook into `OptimizerBase` iteration loops
- Consume metrics from `DefaultModelEvaluator`
- Integrate with `ExperimentTracking` module
Recommended Location: /src/ModelRegistry/
Structure:
ModelRegistry/
├── Abstractions/
│ ├── IModelRegistry.cs
│ ├── IModelRepository.cs
│ ├── IModelCatalog.cs
│ └── IModelVersioning.cs
├── Models/
│ ├── RegisteredModel.cs
│ ├── ModelVersion.cs
│ ├── ModelArtifact.cs
│ └── RegistryQueryResult.cs
├── Backends/
│ ├── FileSystemRegistry.cs
│ ├── DatabaseRegistry.cs
│ └── RemoteRegistry.cs
├── Search/
│ ├── ModelSearchCriteria.cs
│ ├── ModelSearchEngine.cs
│ └── ModelIndexer.cs
├── DefaultModelRegistry.cs
├── ModelRegistrationConfig.cs
└── ModelComparisonEngine.cs
Follows Pattern:
- Extends existing `ModelMetadata` and serialization
- Similar to RAG document store architecture
- Namespace: `AiDotNet.ModelRegistry`
- Generic: `ModelRegistry<T, TInput, TOutput>`
Integration Points:
- Leverage existing `IModelSerializer` interface
- Store enhanced `ModelMetadata`
- Integrate with checkpoint management
- Work with experiment tracking for lineage
Recommended Location: /src/DataVersioning/
Structure:
DataVersioning/
├── Abstractions/
│ ├── IDataVersionManager.cs
│ ├── IDatasetVersion.cs
│ ├── IDataHash.cs
│ └── IDataChangeTracker.cs
├── Models/
│ ├── DatasetVersionInfo.cs
│ ├── DataVersionMetadata.cs
│ ├── DataStatistics.cs
│ └── VersionComparisonResult.cs
├── Tracking/
│ ├── DataHashCalculator.cs
│ ├── DataChangeDetector.cs
│ └── DataIntegrityValidator.cs
├── Storage/
│ ├── VersionedDataStore.cs
│ ├── DataSnapshotStorage.cs
│ └── DeltaStorage.cs
├── DefaultDataVersionManager.cs
├── DatasetRegistry.cs
└── DataProvenanceTracker.cs
Follows Pattern:
- Extends existing `Data/Loaders/` infrastructure
- Reuses `MetaLearningTask` structure
- Namespace: `AiDotNet.DataVersioning`
- Works with episodic data loaders
Integration Points:
- Extend `EpisodicDataLoaderBase`
- Track data used in experiments
- Link with experiment tracking and model registry
- Integrate with existing data loaders
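The idea behind the proposed `DataHashCalculator` can be sketched directly: a content hash over the serialized rows yields a stable fingerprint, so two datasets with identical content map to the same version id and any change produces a new one. The serialization scheme below is illustrative:

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Illustrative dataset fingerprinting: hash the serialized rows so the
// fingerprint can serve as a deterministic version id.
public static class DataHashSketch
{
    public static string HashDataset(double[][] rows)
    {
        string serialized = string.Join(";", rows.Select(r => string.Join(",", r)));
        using var sha = SHA256.Create();
        byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(serialized));
        return Convert.ToHexString(digest); // fingerprint used as version id
    }
}
```

Experiment tracking would then record this fingerprint per run, giving the data lineage link mentioned in the integration points.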
1. Configuration Management
- Follow existing pattern: Create `{Component}Config` classes
- Location: `Models/Options/` for options
- Add enums to `Enums/` folder if needed
2. Result Classes
- Create `{Component}Result` classes
- Location: `Models/Results/`
- Example: `ExperimentTrackingResult`, `CheckpointRestoreResult`
3. Exceptions
- Extend existing exceptions in `Exceptions/`
- Follow naming: `{Component}Exception`
- Examples: `CheckpointException`, `DataVersionException`
4. Integration with AiModelBuilder
- Add builder methods to configure each component
- Follow existing pattern: `Configure{Component}()`
- Example: `ConfigureCheckpointManager()`, `ConfigureExperimentTracker()`
5. Serialization
- Create JSON converters in `Serialization/` if needed
- Extend `IModelSerializer` for new types
- Ensure round-trip serialization works
6. Caching Strategy
- Leverage existing `IModelCache` for caching
- Consider caching frequently accessed data (experiments, models)
- Example: Cache recent model versions, experiment metadata
7. Interfaces First
- Always define interfaces first
- Follow existing I-prefix pattern
- Generic type parameters: `<T, TInput, TOutput>` where applicable
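The guidelines above can be tied together with a sketch of how the `Configure{Component}()` builder methods might chain. `AiModelBuilder` is the existing builder named earlier; the type and members below are hypothetical stand-ins, not its real API:

```csharp
// Hypothetical fluent builder sketch showing the Configure{Component}()
// convention; not the actual AiModelBuilder API.
public class ModelBuilderSketch
{
    public bool CheckpointsEnabled { get; private set; }
    public bool TrackingEnabled { get; private set; }

    public ModelBuilderSketch ConfigureCheckpointManager()
    {
        CheckpointsEnabled = true;
        return this; // fluent chaining
    }

    public ModelBuilderSketch ConfigureExperimentTracker()
    {
        TrackingEnabled = true;
        return this;
    }
}
```

Returning `this` from each `Configure*` method is what lets users opt into components one call at a time without a combinatorial set of constructors.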
1. Data Versioning - Foundational for tracking data lineage
2. Experiment Tracking - Core infrastructure for recording runs
3. Training Monitoring - Metrics collection during training
4. Checkpoint Management - Save/restore during training
5. Hyperparameter Optimization - Build on experiment tracking
6. Model Registry - Central storage of trained models
- Phase 1 provides the foundation for all other components
- Phase 2 builds on Phase 1 infrastructure
- Each component has clear dependencies and integration points
| Existing Component | How to Leverage |
|---|---|
| `OptimizationIterationInfo<T>` | Base for training metrics |
| `OptimizationResult<T, TInput, TOutput>` | Base for experiment results |
| `ModelMetadata<T>` | Extend for registry and experiments |
| `IModelEvaluator<T, TInput, TOutput>` | Use for metric calculation |
| `OptimizerBase` iteration patterns | Hook points for monitoring |
| `AutoML/TrialResult` | Foundation for hyperparameter trials |
| `Serialization/` JSON converters | Serialize new data types |
| `Data/Loaders/` episodic pattern | Data versioning base |
| `IModelCache<T, TInput, TOutput>` | Cache experiments and models |
| `Models/Options/` pattern | Configuration management |
- Generic Type Parameters: Always use `<T>` for numeric type, `<TInput, TOutput>` for model data
- Interface-First Design: Define interfaces before implementations
- Builder Pattern: Use for complex configurations
- Result Objects: Encapsulate outcomes in Result classes
- Namespace Organization: Group by feature area
- Configuration Pattern: Use `{Feature}Config` or `{Feature}Options`
- Separation of Concerns: Abstractions separate from implementations
- Documentation: Rich XML comments for beginners and experts
- Validation: Always validate configuration in `IsValid()` methods
- Extensibility: Enable custom implementations via interfaces
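The validation convention is worth one closing sketch: each config exposes an `IsValid()` method so builders can fail fast before training starts. The config type and property names here are illustrative:

```csharp
// Illustrative {Feature}Config following the IsValid() convention;
// property names are hypothetical.
public class CheckpointConfigSketch
{
    public int MaxCheckpoints { get; set; } = 5;
    public string StoragePath { get; set; } = "checkpoints";

    // Builders call this before wiring the component, so bad settings
    // surface at configuration time rather than mid-training.
    public bool IsValid() =>
        MaxCheckpoints > 0 && !string.IsNullOrWhiteSpace(StoragePath);
}
```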