Results database with MLFlow #10

stevenstetzler · 2025-10-28T02:06:23Z

Address #1

Copilot

Pull Request Overview

This PR integrates MLflow for experiment tracking and result management, addressing issue #1. It refactors the main search function to be callable programmatically and adds database storage capabilities for search results using SQLAlchemy.

Key changes:

Extracted core search logic into run_search() function for programmatic access
Added MLflow integration via run_search_mlflow() for experiment tracking
Implemented database models and result compilation utilities for persistent storage

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
tests/test_search.py	Added tests for the new `run_search` and `run_search_mlflow` functions
src/find_asteroids/search.py	Refactored `main()` to extract `run_search()` and added `run_search_mlflow()` wrapper with MLflow tracking
src/find_asteroids/results.py	New module for reading and compiling results from local directories or MLflow, with database storage support
src/find_asteroids/models.py	New SQLAlchemy models for storing experiments, runs, and search results in a database
pyproject.toml	Added sqlalchemy dependency and mlflow as optional pipeline dependency

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

src/find_asteroids/search.py

Copilot · 2025-10-28T02:07:16Z

src/find_asteroids/search.py

+    with mlflow.start_run() as run:
+        run_id = run.info.run_id
+        print(f"MLflow Run ID: {run_id}")
+        for k, v in tags:


The loop iterates over tags without validation. If tags is an empty list (the default), this will work, but if a tag is not a two-element tuple, this will fail. While the CLI parsing at line 353 ensures tuples, when called directly via run_search_mlflow(), invalid tag formats could cause an unpacking error.

Suggested change

for k, v in tags:

for tag in tags:

if not (isinstance(tag, (tuple, list)) and len(tag) == 2):

raise ValueError(f"Each tag must be a tuple or list of length 2, got: {tag!r}")

k, v = tag

Copilot · 2025-10-28T02:07:16Z

src/find_asteroids/results.py

+
+def read_results(results_dir : Path, name : str, output_format='ecsv'):
+    import astropy.table
+    if type(results_dir) is str:
+        results_dir = Path(results_dir)
+    for p in sorted(results_dir.glob("*/"+f"{name}.{output_format}"), key=lambda x: int(x.parent.name)):
+        print("reading", p)


Use the logging module instead of print statements for consistency with the rest of the codebase. Import and use log.info() or log.debug() instead.

Suggested change

def read_results(results_dir : Path, name : str, output_format='ecsv'):

import astropy.table

if type(results_dir) is str:

results_dir = Path(results_dir)

for p in sorted(results_dir.glob("*/"+f"{name}.{output_format}"), key=lambda x: int(x.parent.name)):

print("reading", p)

import logging

log = logging.getLogger(__name__)

def read_results(results_dir : Path, name : str, output_format='ecsv'):

import astropy.table

if type(results_dir) is str:

results_dir = Path(results_dir)

for p in sorted(results_dir.glob("*/"+f"{name}.{output_format}"), key=lambda x: int(x.parent.name)):

log.info(f"reading {p}")

try mlflow for runs; compile results

e01172a

stevenstetzler requested a review from Copilot October 28, 2025 02:06

Copilot AI reviewed Oct 28, 2025

View reviewed changes

stevenstetzler added 2 commits October 27, 2025 19:08

upgrade workflows

95c3000

fix print

cba6eea

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Results database with MLFlow #10

Results database with MLFlow #10

Uh oh!

stevenstetzler commented Oct 28, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Copilot AI Oct 28, 2025

Uh oh!

Copilot AI Oct 28, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

-        for k, v in tags:
+        for tag in tags:
+            if not (isinstance(tag, (tuple, list)) and len(tag) == 2):
+                raise ValueError(f"Each tag must be a tuple or list of length 2, got: {tag!r}")
+            k, v = tag

Results database with MLFlow #10

Are you sure you want to change the base?

Results database with MLFlow #10

Uh oh!

Conversation

stevenstetzler commented Oct 28, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Uh oh!

Copilot AI Oct 28, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 28, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants