Thank you for your interest in contributing to SciGo! We welcome contributions from everyone, whether you're fixing a typo, adding a test, implementing a new algorithm, or improving documentation.
Prerequisites:
- Go 1.21 or later
- Git
- Make (optional but recommended)
- Set up your development environment:
    git clone https://github.com/YuminosukeSato/scigo.git
    cd scigo
    make setup-dev # Installs all tools and dependencies
- Find an issue to work on:
  - Look for issues labeled "good first issue"
  - Or fix a typo in documentation
  - Or add a missing test
- Create your branch:
    git checkout -b fix/issue-description
    # or
    git checkout -b feature/new-algorithm
- Make your changes and test:
    make test      # Run tests
    make lint-full # Check code style
- Submit a Pull Request:
  - Push your branch to your fork
  - Open a PR with a clear description
  - Wait for review and address feedback
That's it!
SciGo aims to provide a high-performance, production-ready machine learning library for Go with scikit-learn compatible APIs. We prioritize:
- API Compatibility: Following scikit-learn's proven interface patterns
- Performance: Leveraging Go's concurrency and efficiency
- Reliability: Comprehensive testing and error handling
- Simplicity: Clear, idiomatic Go code
scigo/
├── core/           # Core abstractions and utilities
│   ├── model/      # Base estimator and interfaces
│   ├── tensor/     # Tensor operations
│   └── parallel/   # Parallel processing utilities
├── linear/         # Linear models (regression, classification)
├── preprocessing/  # Data preprocessing (scalers, encoders)
├── metrics/        # Evaluation metrics
├── sklearn/        # Advanced scikit-learn compatible models
├── pkg/            # Shared packages
│   ├── errors/     # Error handling utilities
│   └── log/        # Structured logging
└── examples/       # Usage examples
- Format your code:
    make fmt       # or: go fmt ./...
    goimports -w . # Organize imports
- Follow Go conventions:
  - Use camelCase for unexported identifiers
  - Use PascalCase for exported identifiers
  - Keep line length under 100 characters when possible
  - Write clear, concise comments
- Run linters:
    make lint-full # Runs comprehensive linting
All ML models in SciGo follow the scikit-learn estimator pattern:
// Basic Estimator Pattern
type Estimator interface {
	Fit(X, y mat.Matrix) error
	Predict(X mat.Matrix) (mat.Matrix, error)
	Score(X, y mat.Matrix) (float64, error)
}
// Transformer Pattern
type Transformer interface {
	Fit(X mat.Matrix) error
	Transform(X mat.Matrix) (mat.Matrix, error)
	FitTransform(X mat.Matrix) (mat.Matrix, error)
}
Key Principles:
- State Management: Use BaseEstimator for consistent fitted state tracking
    type MyModel struct {
        model.BaseEstimator
        // model-specific fields
    }
- Error Handling: Use structured errors from pkg/errors
    if !m.IsFitted() {
        return nil, errors.NewNotFittedError("MyModel", "Predict")
    }
- Logging: Use structured logging for ML operations
    m.LogInfo("Training started",
        log.OperationKey, log.OperationFit,
        log.SamplesKey, nSamples,
    )
- Numerical Precision: Always use float64 for numerical computations
- Matrix Operations: Use gonum.org/v1/gonum/mat for matrix operations
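The principles above can be illustrated with a small transformer. This is only a sketch under stated assumptions: it uses plain `[][]float64` slices in place of `mat.Matrix` so that it runs with the standard library alone, a `fitted` flag in place of `model.BaseEstimator`, and a hypothetical `ErrNotFitted` value in place of the structured error from `pkg/errors`. None of these names come from SciGo itself.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// ErrNotFitted is a hypothetical stand-in for SciGo's structured
// not-fitted error.
var ErrNotFitted = errors.New("scaler is not fitted")

// StandardScaler is an illustrative transformer. The fitted flag stands in
// for the state that model.BaseEstimator would track.
type StandardScaler struct {
	fitted bool
	mean   []float64
	std    []float64
}

// Fit learns the per-column mean and population standard deviation of X.
// X is row-major: X[i][j] is feature j of sample i.
func (s *StandardScaler) Fit(X [][]float64) error {
	if len(X) == 0 || len(X[0]) == 0 {
		return errors.New("fit: X must have at least one row and one column")
	}
	nSamples, nFeatures := len(X), len(X[0])
	mean := make([]float64, nFeatures)
	std := make([]float64, nFeatures)
	for _, row := range X {
		if len(row) != nFeatures {
			return errors.New("fit: rows of X have inconsistent lengths")
		}
		for j, v := range row {
			mean[j] += v
		}
	}
	for j := range mean {
		mean[j] /= float64(nSamples)
	}
	for _, row := range X {
		for j, v := range row {
			d := v - mean[j]
			std[j] += d * d
		}
	}
	for j := range std {
		std[j] = math.Sqrt(std[j] / float64(nSamples))
		if std[j] == 0 {
			std[j] = 1 // constant column: avoid division by zero
		}
	}
	s.mean, s.std, s.fitted = mean, std, true
	return nil
}

// Transform returns a standardized copy of X. It fails if Fit has not run.
func (s *StandardScaler) Transform(X [][]float64) ([][]float64, error) {
	if !s.fitted {
		return nil, ErrNotFitted
	}
	out := make([][]float64, len(X))
	for i, row := range X {
		if len(row) != len(s.mean) {
			return nil, fmt.Errorf("transform: row %d has %d features, want %d",
				i, len(row), len(s.mean))
		}
		out[i] = make([]float64, len(row))
		for j, v := range row {
			out[i][j] = (v - s.mean[j]) / s.std[j]
		}
	}
	return out, nil
}

// FitTransform fits on X and then transforms the same data.
func (s *StandardScaler) FitTransform(X [][]float64) ([][]float64, error) {
	if err := s.Fit(X); err != nil {
		return nil, err
	}
	return s.Transform(X)
}

func main() {
	s := &StandardScaler{}
	if _, err := s.Transform([][]float64{{1}}); err != nil {
		fmt.Println("before Fit:", err)
	}
	out, err := s.FitTransform([][]float64{{1, 5}, {3, 5}})
	fmt.Println(out, err)
}
```

Calling `Transform` before `Fit` returns an error instead of panicking, which is the behavior the fitted-state and error-handling principles aim for.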
- Coverage: Aim for >80% test coverage for new code
- Types: Write unit tests, integration tests, and benchmarks
- Naming: Use descriptive test names that explain what is being tested
- Unit Tests: Test individual functions/methods
    func TestLinearRegression_Fit(t *testing.T) {
        tests := []struct {
            name    string
            X, y    mat.Matrix
            wantErr bool
        }{
            // test cases
        }
        // test implementation
    }
- Example Tests: Provide usage examples
    func ExampleLinearRegression() {
        // Create and train model
        lr := linear.NewLinearRegression()
        _ = lr.Fit(X, y)
        // Output: expected output
    }
- Benchmarks: Measure performance
    func BenchmarkLinearRegression_Fit(b *testing.B) {
        // benchmark implementation
    }
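As a rough illustration of the test types listed above, the sketch below fills in a table-driven test and a benchmark for a hypothetical `sumSquares` helper (not part of SciGo). In a real package both functions would live in a `_test.go` file and run through `go test`; here `main` drives the benchmark with `testing.Benchmark` so the file can run on its own.

```go
package main

import (
	"fmt"
	"testing"
)

// sumSquares is a hypothetical function under test.
func sumSquares(xs []float64) float64 {
	total := 0.0
	for _, x := range xs {
		total += x * x
	}
	return total
}

// TestSumSquares shows the table-driven layout: one struct per case,
// one subtest per struct. It is executed by `go test` when placed in a
// _test.go file.
func TestSumSquares(t *testing.T) {
	tests := []struct {
		name string
		in   []float64
		want float64
	}{
		{name: "empty input", in: nil, want: 0},
		{name: "two values", in: []float64{3, 4}, want: 25},
		{name: "negative value", in: []float64{-2}, want: 4},
	}
	for _, tc := range tests {
		t.Run(tc.name, func(t *testing.T) {
			if got := sumSquares(tc.in); got != tc.want {
				t.Errorf("sumSquares(%v) = %v, want %v", tc.in, got, tc.want)
			}
		})
	}
}

// sink keeps the result alive so the compiler cannot discard the call.
var sink float64

// BenchmarkSumSquares follows the standard b.N loop.
func BenchmarkSumSquares(b *testing.B) {
	xs := make([]float64, 1024)
	for i := range xs {
		xs[i] = float64(i)
	}
	b.ReportAllocs()
	b.ResetTimer() // exclude the setup above from the measurement
	for i := 0; i < b.N; i++ {
		sink = sumSquares(xs)
	}
}

func main() {
	testing.Init() // registers testing flags; needed outside `go test`
	res := testing.Benchmark(BenchmarkSumSquares)
	fmt.Println("iterations:", res.N, "allocs/op:", res.AllocsPerOp())
}
```

Resetting the timer after setup and assigning the result to a package-level variable are common ways to keep the measurement focused on the code under test.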
make test # Run all tests
make test-short # Run short tests only
make coverage # Generate coverage report
make bench # Run benchmarks
Every exported type, function, and method must have a godoc comment:
// LinearRegression implements ordinary least squares regression.
//
// The model minimizes the residual sum of squares between observed
// targets and predictions made by linear approximation.
//
// Example:
//
//	lr := linear.NewLinearRegression()
//	err := lr.Fit(X, y)
//	predictions, err := lr.Predict(X_test)
type LinearRegression struct {
	// ...
}
Each package should have a doc.go file or a package comment explaining:
- Package purpose
- Main types and functions
- Usage examples
- Related packages
- Before submitting:
  - Ensure all tests pass: make test
  - Run linters: make lint-full
  - Update documentation if needed
  - Add tests for new functionality
  - Update CHANGELOG.md if applicable
- PR Description should include:
  - What problem does this solve?
  - How does it solve it?
  - Any breaking changes?
  - Related issues (use "Fixes #123" to auto-close)
- Review process:
  - CI must pass (tests, linting, coverage)
  - At least one maintainer approval required
  - Address review feedback promptly
  - Squash commits if requested
# Set up development environment
make setup-dev
# Run tests
make test
# Check code coverage
make coverage
# Run linters
make lint-full
# Format code
make fmt
# Run benchmarks
make bench
# Clean build artifacts
make clean
# See all available commands
make help
- Create the implementation:
    // mypackage/algorithm.go
    package mypackage
    type MyAlgorithm struct {
        model.BaseEstimator
        // fields
    }
    func (m *MyAlgorithm) Fit(X, y mat.Matrix) error {
        // implementation
        m.SetFitted()
        return nil
    }
- Add comprehensive tests:
    // mypackage/algorithm_test.go
    func TestMyAlgorithm_Fit(t *testing.T) {
        // test implementation
    }
- Add an example:
    // mypackage/example_test.go
    func ExampleMyAlgorithm() {
        // example usage
    }
- Update documentation:
  - Add package documentation if new package
  - Update README.md if significant feature
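To make the steps above concrete, here is a minimal sketch of what a new estimator might look like end to end. `MeanRegressor` is a hypothetical baseline model, not an existing SciGo type. It uses slices instead of `mat.Matrix` and a plain `fitted` field instead of `model.BaseEstimator`, so it compiles without any dependencies.

```go
package main

import (
	"errors"
	"fmt"
)

// MeanRegressor is a hypothetical baseline estimator that always predicts
// the mean of the training targets.
type MeanRegressor struct {
	// fitted stands in for the state tracked by model.BaseEstimator.
	fitted bool
	mean   float64
}

// Fit stores the mean of y. X is only checked for a matching sample count.
func (m *MeanRegressor) Fit(X [][]float64, y []float64) error {
	if len(y) == 0 || len(X) != len(y) {
		return fmt.Errorf("fit: got %d samples and %d targets", len(X), len(y))
	}
	sum := 0.0
	for _, v := range y {
		sum += v
	}
	m.mean = sum / float64(len(y))
	m.fitted = true // SciGo code would call m.SetFitted() here
	return nil
}

// Predict returns one prediction per row of X.
func (m *MeanRegressor) Predict(X [][]float64) ([]float64, error) {
	if !m.fitted {
		return nil, errors.New("predict: MeanRegressor is not fitted")
	}
	out := make([]float64, len(X))
	for i := range out {
		out[i] = m.mean
	}
	return out, nil
}

// Score returns the coefficient of determination R^2 on (X, y).
func (m *MeanRegressor) Score(X [][]float64, y []float64) (float64, error) {
	pred, err := m.Predict(X)
	if err != nil {
		return 0, err
	}
	if len(y) == 0 || len(pred) != len(y) {
		return 0, fmt.Errorf("score: got %d samples and %d targets", len(pred), len(y))
	}
	yMean := 0.0
	for _, v := range y {
		yMean += v
	}
	yMean /= float64(len(y))
	ssRes, ssTot := 0.0, 0.0
	for i, v := range y {
		ssRes += (v - pred[i]) * (v - pred[i])
		ssTot += (v - yMean) * (v - yMean)
	}
	if ssTot == 0 {
		return 0, errors.New("score: y is constant, R^2 is undefined")
	}
	return 1 - ssRes/ssTot, nil
}

func main() {
	X := [][]float64{{1}, {2}, {3}}
	y := []float64{1, 2, 3}
	m := &MeanRegressor{}
	if err := m.Fit(X, y); err != nil {
		fmt.Println("fit failed:", err)
		return
	}
	pred, _ := m.Predict([][]float64{{10}, {20}})
	score, _ := m.Score(X, y)
	fmt.Println(pred, score)
}
```

A baseline like this scores an R^2 of 0 on its own training data, which also makes it a convenient reference point when testing real models.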
SciGo uses structured errors for better debugging:
// Use predefined error types
errors.NewNotFittedError("ModelName", "Method")
errors.NewDimensionError("Method", expected, got, axis)
errors.NewValueError("Method", "description")
// Wrap errors with context
fmt.Errorf("failed to train model: %w", err)
// Use panic recovery for public APIs
func (m *MyModel) Fit(X, y mat.Matrix) (err error) {
	defer errors.Recover(&err, "MyModel.Fit")
	// implementation
	return nil
}
- Memory Efficiency:
  - Reuse allocated memory when possible
  - Use in-place operations for large matrices
  - Clear references to allow garbage collection
- Parallelization:
  - Use core/parallel utilities for concurrent operations
  - Set appropriate thresholds for parallel vs sequential processing
  - Benchmark to verify performance improvements
- Numerical Stability:
  - Use stable algorithms (e.g., QR decomposition over matrix inversion)
  - Check for numerical edge cases (division by zero, overflow)
  - Use appropriate epsilon values for floating-point comparisons
Before reporting a bug:
- Check if the issue already exists
- Try with the latest version
- Ensure it's not a usage problem (check examples/documentation)
Include:
- Go version and OS
- Minimal reproducible example
- Expected vs actual behavior
- Error messages and stack traces
- Relevant logs (use log.SetLevel(log.LevelDebug))
- Check existing issues/PRs for similar proposals
- Open a discussion for significant changes
- Provide use cases and example API
- Consider backward compatibility
- Be respectful and inclusive
- Welcome newcomers and help them get started
- Focus on constructive criticism
- Respect differing viewpoints and experiences
By contributing to SciGo, you agree that your contributions will be licensed under the MIT License.
Contributors are recognized in:
- Git history
- CONTRIBUTORS.md file
- Release notes for significant contributions
- Documentation: Check the README and package documentation
- Examples: Look at the examples directory
- Issues: Search existing issues or create a new one
- Discussions: Join GitHub Discussions for questions and ideas
Thank you for contributing to SciGo!