Owner

@legout legout commented Apr 28, 2025

No description provided.

legout added 30 commits April 10, 2025 03:32
- Replaced YAML handling in BaseConfig with msgspec for serialization and deserialization.
- Updated methods to use msgspec for converting to/from dictionaries and YAML.
- Removed unused DataStore, EventBroker, and Trigger classes along with their associated logic.
- Adjusted imports in pipeline.py to reflect new structure.
- Added cloudpickle and msgspec as dependencies in uv.lock.
- Cleaned up ProjectWorkerConfig to align with new BaseConfig structure.
- Added `utils.py` for crontab humanization and schedule display functions.
- Created `base.py` defining abstract classes for backend types, triggers, and schedulers.
- Developed `rq_backend.py` implementing RQ and RQ-Scheduler for job scheduling and management.
- Introduced methods for job result storage, event publishing, and schedule management in RQ.
- Enhanced display functions for schedules and jobs using Rich library for better visualization.
- Added RQBackend class for job result storage using Redis or in-memory storage.
- Introduced RQTrigger class to adapt trigger logic for RQ worker backend.
- Created utility functions to display scheduled jobs and job statuses.
- Developed RQWorker class to manage job scheduling and execution with RQ and rq-scheduler.
- Removed legacy rq_backend.py file as functionality is now modularized.
- Updated uv.lock to reflect new dependencies and platform markers.
…onality

- Introduced APSDataStore and APSEventBroker classes for APScheduler backend.
- Enhanced RQWorker to support multiple queues and improved job scheduling.
- Updated configuration handling in worker classes to streamline backend setup.
- Added context manager support to BaseWorker for better resource management.
- Improved logging for better traceability during worker operations.
- Cleaned up imports and formatting across various modules for consistency.
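The context manager support mentioned for `BaseWorker` in the list above can be pictured with a minimal sketch. `SketchWorker`, `start`, and `stop` are hypothetical names used only for illustration; the real `BaseWorker` interface in flowerpower may look different.

```python
class SketchWorker:
    """Hypothetical stand-in for a worker; not the flowerpower BaseWorker."""

    def __init__(self) -> None:
        self.running = False

    def start(self) -> None:
        # Acquire resources (connections, schedulers, ...) here.
        self.running = True

    def stop(self) -> None:
        # Release resources here; safe to call more than once.
        self.running = False

    def __enter__(self) -> "SketchWorker":
        self.start()
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> bool:
        self.stop()
        # Returning False lets any exception from the with-body propagate.
        return False


with SketchWorker() as worker:
    state_inside = worker.running  # True while the block is active
state_after = worker.running  # False once the block has exited
```

The point of the pattern is that `stop()` runs even when the body of the `with` block raises, so cleanup does not depend on the caller remembering it.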
…ndling

- Deleted the TUI implementation in `src/flowerpower/tui.py`.
- Renamed `backend_type` parameter to `type` in `Worker` and `Backend` classes for consistency.
- Updated conditional checks in `Worker` and `Backend` classes to use the new `type` parameter.
- Added logic to set the multiprocessing start method to 'fork' for macOS ARM in `RQWorker`.
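The macOS ARM start-method change in the last bullet above might look roughly like the following. This is a sketch under assumptions, not the code from `RQWorker`: the helper names `is_macos_arm` and `ensure_fork_on_macos_arm` are hypothetical, and the platform check is one plausible way to detect Apple Silicon.

```python
import multiprocessing
import platform
from typing import Optional


def is_macos_arm(system: Optional[str] = None, machine: Optional[str] = None) -> bool:
    """Return True on macOS running on an ARM (Apple Silicon) CPU.

    The optional arguments exist only so the check can be exercised
    without running on that platform.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


def ensure_fork_on_macos_arm() -> Optional[str]:
    """Switch to the 'fork' start method on macOS ARM; no-op elsewhere."""
    if is_macos_arm():
        # force=True overrides a start method that was already set.
        multiprocessing.set_start_method("fork", force=True)
    # Report whichever start method is currently configured (may be None).
    return multiprocessing.get_start_method(allow_none=True)
```

Since Python 3.8 the default start method on macOS is `spawn`, which is why an explicit switch is needed there if the worker relies on `fork` semantics.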
… classes

- Added a new configuration file `pwc.yml` for APScheduler backend settings.
- Refactored `worker.py` to implement `APSDataStore` and `APSEventBroker` classes.
- Updated `APSBackend` to utilize the new data store and event broker classes.
- Enhanced `ProjectWorkerConfig` to support dynamic backend configuration.
- Implemented utility functions for updating configurations from dictionaries.
- Refactored `APSWorker` and `RQWorker` classes to streamline configuration loading and job management.
- Improved error handling and logging for backend setup processes.
- Added support for SSL configuration in various backend types.
- Cleaned up redundant code and improved method signatures for clarity.
- Commented out the `store_job_result` and `get_job_result` methods in RQBackend to indicate they are not currently in use.
- Updated the `get_job_result` method in RQWorker to raise a NotImplementedError, signaling that the functionality is not yet implemented.
…line management

- Introduced PipelineRunner class to handle execution of pipelines with support for telemetry, tracking, and progress visualization.
- Implemented PipelineScheduler class to manage scheduling of pipeline runs, including job queuing and scheduling configurations.
- Created PipelineVisualizer class for visualizing pipeline DAGs, with methods to save and display graphs.
- Enhanced logging throughout the new classes for better traceability and debugging.
- Added necessary imports and type hints for improved code clarity and maintainability.
- Updated formatting in visualizer.py for consistency in comments.
- Refactored update_config_from_dict and update_nested_dict functions in misc.py for better readability.
- Rearranged import statements in worker/__init__.py for clarity.
- Cleaned up whitespace and comments in base.py for better organization.
- Removed unnecessary blank lines in huey/__init__.py and trigger.py.
- Enhanced error messages in huey/worker.py for clarity and consistency.
- Improved formatting in RQWorker class in rq/worker.py for better readability.
- Updated `PipelineRunConfig` and `PipelineTrackerConfig` to allow for None types in certain fields.
- Introduced `ProjectConfig` to encapsulate project-related configurations.
- Refactored `PipelineIOManager` to streamline import/export methods, replacing path parameters with base directory and source filesystem options.
- Enhanced `PipelineManager` to delegate pipeline management tasks to `PipelineIOManager`, including importing and exporting pipelines.
- Updated `PipelineRegistry` to improve pipeline creation logic and error handling.
- Adjusted `PipelineRunner`, `PipelineScheduler`, and `PipelineVisualizer` to accommodate new callable type hints and refactored method signatures.
- Removed deprecated code and improved overall code readability and maintainability.
- Added a new hosts file for local development in Docker configuration.
- Commented out port mappings in docker-compose.yml for various services.
- Refactored pipeline configuration classes to use more concise naming conventions (e.g., PipelineRunConfig to RunConfig).
- Updated imports and adjusted class definitions in the pipeline configuration files.
- Improved code readability by organizing import statements and removing unnecessary comments.
- Enhanced worker classes to support new backend configurations and improved error handling.
- Cleaned up unused code and ensured consistent formatting across multiple files.
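For readers unfamiliar with the `update_nested_dict` helper mentioned in the list above, a recursive dictionary update of that kind can be sketched as follows. The name `update_nested`, its signature, and the sample configuration keys are assumptions made for illustration; the function in `misc.py` may behave differently (for example, it may mutate its argument in place).

```python
from typing import Any, Dict


def update_nested(base: Dict[str, Any], updates: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict with `updates` merged recursively into `base`."""
    merged = dict(base)  # shallow copy so `base` itself is not modified
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge one level deeper.
            merged[key] = update_nested(merged[key], value)
        else:
            # Scalars, lists, and new keys simply replace the old value.
            merged[key] = value
    return merged


# Hypothetical configuration values, used only to show the merge.
defaults = {"backend": {"type": "rq", "port": 6379}, "log_level": "INFO"}
overrides = {"backend": {"port": 6380}}
result = update_nested(defaults, overrides)
```

Here `result` keeps `backend.type` and `log_level` from `defaults` and takes `backend.port` from `overrides`, while `defaults` is left unchanged.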
legout added 14 commits April 22, 2025 01:31
- Introduced a centralized logging setup in `src/flowerpower/utils/logging.py` to standardize log formatting and levels across the application.
- Updated `src/flowerpower/pipeline/scheduler.py`, `src/flowerpower/worker/__init__.py`, and other worker modules to utilize the new logging setup.
- Created a new settings module `src/flowerpower/settings.py` to manage configuration variables, including logging levels and executor settings.
- Removed unused imports and commented-out code in various files for improved code clarity.
- Enhanced timezone handling in `src/flowerpower/utils/sql.py` to ensure proper error handling and timezone awareness.
- Updated dependency markers in `uv.lock` for better compatibility across platforms.
- Added support for `openlineage-python` package in the dependency management.
- Added context manager support to PipelineRunner for better resource management.
- Integrated logging level configuration from settings into the setup_logging function.
- Enhanced adapter setup to allow additional Hamilton adapters and improved logging of enabled adapters.
- Updated pipeline execution logging to include execution time using the humanize library.
- Refactored settings to include Hamilton-specific configurations for telemetry and data capture limits.
- Added humanize library as a dependency for better time formatting.
…uration handling

- Updated PipelineScheduler to accept ProjectConfig directly, simplifying initialization and worker type determination.
- Refactored job scheduling methods in PipelineScheduler to streamline argument handling and improve logging.
- Enhanced PipelineVisualizer to utilize ProjectConfig for loading pipeline configurations and modules.
- Removed unnecessary parameters and imports across various modules to clean up the codebase.
- Improved logging setup consistency across worker modules.
- Adjusted settings loading for better readability and maintainability.
- Cleaned up imports and ensured proper module loading practices.
…and configuration handling

- Added log_level parameter to PipelineManager for customizable logging.
- Updated logging setup to use loguru and configured it with settings.
- Refactored PipelineRegistry, PipelineScheduler, and PipelineVisualizer to accept filesystem and directory parameters directly.
- Enhanced error handling and logging throughout the pipeline management process.
- Streamlined pipeline configuration loading and caching mechanisms.
- Updated worker classes to support logging configuration.
- Improved job scheduling in RQWorker with options for delayed execution.
- Cleaned up unused code and comments for better readability.
- Added cron-descriptor version 1.4.5 and croniter version 6.0.0 to the project.
- Included new dependencies for cron-descriptor and croniter in the development environment.
- Updated the requirements for gevent, raven, rq-dashboard, and added redis-sentinel-url package.
- Added new package definitions for gevent, raven, redis-sentinel-url, and zope libraries with their respective versions.
- Updated the wheels and source URLs for the new packages.
…w features

- Added comprehensive docstrings for all storage option classes and methods, improving code readability and usability.
- Implemented `from_env` and `to_env` methods for Azure, GCS, GitHub, GitLab, and AWS storage options to facilitate environment variable configuration.
- Introduced `infer_protocol_from_uri` and `storage_options_from_uri` functions to infer storage protocol and create options from URI strings.
- Enhanced `merge_storage_options` function to combine multiple storage options with control over precedence.
- Improved type hints and added examples in docstrings for better clarity.
- Refactored existing methods to ensure consistency and maintainability across storage option classes.
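The protocol inference and option merging described in the list above can be illustrated with a small standard-library sketch. `infer_protocol` and `merge_options` are simplified, hypothetical stand-ins, not the `infer_protocol_from_uri` and `merge_storage_options` functions from this PR, whose signatures and edge-case handling are not shown here.

```python
from typing import Any, Dict
from urllib.parse import urlparse


def infer_protocol(uri: str) -> str:
    """Return the URI scheme, falling back to 'file' for plain paths."""
    scheme = urlparse(uri).scheme
    return scheme if scheme else "file"


def merge_options(*options: Dict[str, Any], overwrite: bool = True) -> Dict[str, Any]:
    """Combine option dicts from left to right.

    With overwrite=True later dicts win on conflicting keys;
    with overwrite=False the first value seen for a key is kept.
    """
    merged: Dict[str, Any] = {}
    for opts in options:
        for key, value in opts.items():
            if overwrite or key not in merged:
                merged[key] = value
    return merged
```

A call such as `merge_options({"region": "a"}, {"region": "b"}, overwrite=False)` keeps `"a"`, which is the "control over precedence" idea in miniature. Note that this sketch does not special-case Windows drive letters, which `urlparse` would report as a scheme.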
Copilot AI review requested due to automatic review settings April 28, 2025 11:20
Copilot AI left a comment

Pull Request Overview

This PR integrates new pipeline examples and updates configuration and deployment files for the Rq2 development integration. Key changes include:

  • Addition of new pipeline modules (test_mqtt.py and hello_world.py) with associated configuration files.
  • Updates and enhancements to docker-compose and related deployment files.
  • Removal of an unused file (abc.py) and an updated changelog.

Reviewed Changes

Copilot reviewed 99 out of 107 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| examples/apscheduler/hello-world/pipelines/test_mqtt.py | New test pipeline for MQTT functionality |
| examples/apscheduler/hello-world/pipelines/hello_world.py | New pipeline with functions for spend and signups processing |
| examples/apscheduler/hello-world/conf/*.yml | New configuration files for pipelines and project settings |
| examples/apscheduler/hello-world/README.md | README added for the hello-world example |
| docker/python-worker/pyproject.toml | Python worker project configuration |
| docker/python-worker/hello.py | Simple entry-point script for the python worker |
| docker/docker-compose.yml | Updated compose file with several service definitions and network setups |
| abc.py | File removal for cleanup |
| CHANGELOG.md | Updated changelog with version 0.9.13.1 changes |
Files not reviewed (8)
  • docker/Caddyfile: Language not supported
  • docker/Caddyfile.bak: Language not supported
  • docker/Dockerfile: Language not supported
  • docker/conf/Caddyfile: Language not supported
  • docker/conf/etc/hosts: Language not supported
  • docker/conf/nginx.conf: Language not supported
  • docker/python-worker/.dockerignore: Language not supported
  • docker/python-worker/Dockerfile.dev: Language not supported
Comments suppressed due to low confidence (1)

docker/docker-compose.yml:240

  • [nitpick] The service name 'dockge' may be ambiguous. Consider renaming it to a clearer identifier to avoid confusion.
dockge:

def spend__1000() -> pd.Series:
    """Returns a series of spend data."""
    # time.sleep(2)
    return pd.Series(range(10_000)) * 10
Copilot AI Apr 28, 2025

In function spend__1000, the series is generated using range(10_000) despite the decorator @config.when(range=1_000). Consider using range(1_000) to ensure consistent data sizes.

Suggested change
-    return pd.Series(range(10_000)) * 10
+    return pd.Series(range(1_000)) * 10

Copilot uses AI. Check for mistakes.
def signups__1000() -> pd.Series:
    """Returns a series of signups data."""
    time.sleep(1)
    return pd.Series(range(10_000))
Copilot AI Apr 28, 2025

In function signups__1000, the series is created with range(10_000) even though it is annotated with @config.when(range=1_000). Consider correcting it to range(1_000) to match the expected data size.

Suggested change
-    return pd.Series(range(10_000))
+    return pd.Series(range(1_000))
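
Both review comments above turn on the relationship between the `__1000` suffix, the `@config.when(range=1_000)` condition, and the size of the returned data. The toy below imitates that selection mechanism with the standard library only, so the mismatch is easy to see. It is not Hamilton's implementation: `when`, `resolve`, `_VARIANTS`, and the `spend__10000` variant are hypothetical, and plain lists stand in for `pd.Series`.

```python
from typing import Any, Callable, Dict, List, Tuple

# Registry: node name -> list of (conditions, function) variants.
_VARIANTS: Dict[str, List[Tuple[Dict[str, Any], Callable[[], Any]]]] = {}


def when(**conditions: Any) -> Callable[[Callable[[], Any]], Callable[[], Any]]:
    """Toy stand-in for a config-conditional decorator."""

    def decorator(fn: Callable[[], Any]) -> Callable[[], Any]:
        # The part before the double underscore is the shared node name.
        node_name = fn.__name__.split("__")[0]
        _VARIANTS.setdefault(node_name, []).append((conditions, fn))
        return fn

    return decorator


def resolve(node_name: str, config: Dict[str, Any]) -> Callable[[], Any]:
    """Pick the variant whose conditions all match the given config."""
    for conditions, fn in _VARIANTS.get(node_name, []):
        if all(config.get(key) == value for key, value in conditions.items()):
            return fn
    raise KeyError(f"no variant of {node_name!r} matches {config!r}")


@when(range=1_000)
def spend__1000() -> List[int]:
    # Data size matches the condition, as the review suggests.
    return [i * 10 for i in range(1_000)]


@when(range=10_000)
def spend__10000() -> List[int]:
    # Hypothetical second variant, included only to show the selection.
    return [i * 10 for i in range(10_000)]
```

With `config = {"range": 1_000}`, `resolve("spend", config)` returns `spend__1000`, so a body that builds 10,000 rows under that condition would silently contradict the configuration, which is the inconsistency the comments point out.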

Owner Author

@legout legout left a comment

Looks good so far :)

@legout legout merged commit 00d27a9 into main Apr 28, 2025
1 check failed
legout added a commit that referenced this pull request May 3, 2025

2 participants