Skip to content

Conversation

@stevenstetzler
Copy link
Owner

Address #1

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR integrates MLflow for experiment tracking and result management, addressing issue #1. It refactors the main search function to be callable programmatically and adds database storage capabilities for search results using SQLAlchemy.

Key changes:

  • Extracted core search logic into run_search() function for programmatic access
  • Added MLflow integration via run_search_mlflow() for experiment tracking
  • Implemented database models and result compilation utilities for persistent storage

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
tests/test_search.py Added tests for the new run_search and run_search_mlflow functions
src/find_asteroids/search.py Refactored main() to extract run_search() and added run_search_mlflow() wrapper with MLflow tracking
src/find_asteroids/results.py New module for reading and compiling results from local directories or MLflow, with database storage support
src/find_asteroids/models.py New SQLAlchemy models for storing experiments, runs, and search results in a database
pyproject.toml Added sqlalchemy dependency and mlflow as optional pipeline dependency

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

with mlflow.start_run() as run:
run_id = run.info.run_id
print(f"MLflow Run ID: {run_id}")
for k, v in tags:
Copy link

Copilot AI Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop iterates over tags without validation. If tags is an empty list (the default), this will work, but if a tag is not a two-element tuple, this will fail. While the CLI parsing at line 353 ensures tuples, when called directly via run_search_mlflow(), invalid tag formats could cause an unpacking error.

Suggested change
for k, v in tags:
for tag in tags:
if not (isinstance(tag, (tuple, list)) and len(tag) == 2):
raise ValueError(f"Each tag must be a tuple or list of length 2, got: {tag!r}")
k, v = tag

Copilot uses AI. Check for mistakes.
Comment on lines +2 to +8

def read_results(results_dir : Path, name : str, output_format='ecsv'):
import astropy.table
if type(results_dir) is str:
results_dir = Path(results_dir)
for p in sorted(results_dir.glob("*/"+f"{name}.{output_format}"), key=lambda x: int(x.parent.name)):
print("reading", p)
Copy link

Copilot AI Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the logging module instead of print statements for consistency with the rest of the codebase. Import and use log.info() or log.debug() instead.

Suggested change
def read_results(results_dir : Path, name : str, output_format='ecsv'):
import astropy.table
if type(results_dir) is str:
results_dir = Path(results_dir)
for p in sorted(results_dir.glob("*/"+f"{name}.{output_format}"), key=lambda x: int(x.parent.name)):
print("reading", p)
import logging
log = logging.getLogger(__name__)
def read_results(results_dir : Path, name : str, output_format='ecsv'):
import astropy.table
if type(results_dir) is str:
results_dir = Path(results_dir)
for p in sorted(results_dir.glob("*/"+f"{name}.{output_format}"), key=lambda x: int(x.parent.name)):
log.info(f"reading {p}")

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants