
Conversation

@RydlrCS commented Oct 24, 2025

Description of Change
Added the MotionBlend connector example, which ingests motion-capture (MoCap) metadata from Google Cloud Storage (GCS) and delivers it to BigQuery for downstream transformation and analytics. The connector scans folders containing .bvh and .fbx files, extracts key metadata (motion type, frame rate, skeleton ID, timestamps), and loads structured information into BigQuery.
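
For context, here is a minimal sketch of the connector's overall shape, assuming the standard fivetran_connector_sdk entry points; the table, columns, and the GCS stub are illustrative, not the exact code in this PR:

```python
# Sketch only: assumes the standard Fivetran Connector SDK entry points.
from fivetran_connector_sdk import Connector
from fivetran_connector_sdk import Operations as op


def _scan_gcs_metadata(configuration: dict, since):
    # Stand-in for the PR's list_gcs_files(): scan GCS for .bvh/.fbx
    # files and yield one metadata record per file.
    yield {
        "record_id": "demo0000demo0000",
        "motion_type": "walk",
        "frame_rate": 30.0,
        "skeleton_id": "skel_01",
        "updated_at": "2025-10-24T00:00:00Z",
    }


def schema(configuration: dict):
    # Declare a destination table and its primary key (illustrative).
    return [
        {
            "table": "seed_motion",
            "primary_key": ["record_id"],
            "columns": {
                "record_id": "STRING",
                "motion_type": "STRING",
                "frame_rate": "FLOAT",
                "skeleton_id": "STRING",
                "updated_at": "UTC_DATETIME",
            },
        }
    ]


def update(configuration: dict, state: dict):
    # Extract metadata, upsert each record, then checkpoint state.
    for record in _scan_gcs_metadata(configuration, state.get("last_sync_time")):
        yield op.upsert(table="seed_motion", data=record)
    yield op.checkpoint(state)


connector = Connector(update=update, schema=schema)
```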

Testing

  • Ran fivetran debug on the connector; no errors found in connector.py, README.md, or configuration.json.
  • README follows the template and describes usage, configuration, and schema.
  • Example configuration uses placeholders and no real secrets.
  • Python code is PEP 8 compliant and passes linting.

Checklist

  • Tested the connector with fivetran debug command (no errors found).
  • Added/Updated the example-specific README.md file, following the repository's README template.
  • Followed the repository's Python Coding Standards.
  • All automated checks passed. If you need screenshots or further validation, let me know!

Project Write Up
Rydlr _ Moverse_ High-Throughput MoCap AI Pipeline (2).pdf

- Add METRICS.md with detailed metric specifications:
  * L2 Velocity/Acceleration formulas (see the sketch after this note)
  * FID and Coverage metrics
  * Diversity metrics (Global/Local/Inter/Intra)
  * Transition smoothness scoring
  * Quality categories and scoring weights

- Document key joints tracked for discontinuity detection
- Include BigQuery schema for metrics columns
- Add academic references (Tselepi 2025, Guo 2020, Petrovich 2021)
- Provide usage instructions for full metrics pipeline

Complements connector.py with research-backed quality evaluation.
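
As a companion to METRICS.md, here is a sketch of one common formulation of the L2 velocity/acceleration statistics; the exact formulas are specified in METRICS.md, and the stat names below are assumptions:

```python
# Sketch only: one common formulation of L2 velocity/acceleration stats.
import numpy as np


def l2_motion_stats(positions: np.ndarray, fps: float) -> dict:
    """positions: (frames, joints, 3) array of joint positions; frames >= 3."""
    # First difference along time, scaled from units/frame to units/second.
    velocity = np.linalg.norm(np.diff(positions, axis=0), axis=-1) * fps
    # Second difference gives per-frame, per-joint acceleration magnitudes.
    acceleration = np.abs(np.diff(velocity, axis=0)) * fps
    return {
        "l2_velocity_mean": float(velocity.mean()),
        "l2_velocity_std": float(velocity.std()),
        "l2_velocity_max": float(velocity.max()),
        "l2_acceleration_mean": float(acceleration.mean()),
        "l2_acceleration_std": float(acceleration.std()),
        "l2_acceleration_max": float(acceleration.max()),
    }
```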
- Add 19 new columns for motion blend quality metrics:
  * FID and Coverage (distribution metrics)
  * Global/Local/Inter/Intra Diversity
  * L2 Velocity stats (mean, std, max, transition)
  * L2 Acceleration stats (mean, std, max, transition)
  * Transition smoothness, velocity/acceleration ratios
  * Overall quality_score and quality_category

- Update transform_blend_record() to include metric placeholders (sketched below)
- Metrics populated as NULL initially, filled by post-processing pipeline
- Enables complete quality evaluation workflow: sync → compute → update

Aligns with research metrics from Tselepi (2025), Guo (2020), Petrovich (2021)
See METRICS.md for detailed specifications
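
A sketch of the NULL-placeholder pattern described above; the column and field names are assumptions, and the key point is that metrics stay None (NULL in BigQuery) at sync time and are filled in later by the post-processing pipeline:

```python
# Sketch only: metric columns are populated as NULL during sync.
METRIC_COLUMNS = [
    "fid", "coverage", "global_diversity", "local_diversity",
    "inter_diversity", "intra_diversity", "transition_smoothness",
    "quality_score", "quality_category",
]


def transform_blend_record(file_meta: dict, blend_meta: dict | None = None) -> dict:
    # Base record from file metadata, optionally merged with pre-computed
    # blend metadata from the pairing step.
    record = {"blend_id": file_meta["blend_id"], **(blend_meta or {})}
    # Every metric column starts as None; the post-processing pipeline
    # fills them in afterwards (sync -> compute -> update).
    record.update({column: None for column in METRIC_COLUMNS})
    return record
```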
- Complete setup instructions (Docker, Cloud, config)
- Full index schema with all quality metric fields
- 6 search examples (quality filtering, smoothness, diversity, similarity, aggregations, issue detection)
- Python client code samples (one sketched below)
- Incremental sync strategy with Cloud Run job
- Performance tuning (sharding, batch size, query optimization)
- Monitoring and health checks
- Security best practices (API keys, HTTPS, VPC)
- Cost optimization strategies
- Complete pipeline architecture (GCS → Fivetran → BQ → dbt → ES → API)

Enables fast, scalable search across motion blend quality metrics
for animation studios and ML pipelines.
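
A sketch of a quality-filtering search in the spirit of the examples in ELASTICSEARCH.md, using the official elasticsearch-py client; the endpoint, index name, and field names are assumptions:

```python
# Sketch only: filter blends by quality metrics in Elasticsearch.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<YOUR_API_KEY>")

response = es.search(
    index="motion_blends",
    query={
        "bool": {
            "filter": [
                {"range": {"quality_score": {"gte": 0.8}}},
                {"range": {"transition_smoothness": {"gte": 0.9}}},
            ]
        }
    },
    size=20,
)
for hit in response["hits"]["hits"]:
    print(hit["_source"]["blend_id"], hit["_source"]["quality_score"])
```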
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR introduces a new MotionBlend connector for syncing motion-capture (MoCap) metadata from Google Cloud Storage (GCS) to BigQuery. The connector scans GCS folders for .bvh and .fbx files, extracts metadata, and loads it into three BigQuery tables representing different motion categories (seed, build, and blend motions).

Key changes:

  • Implements a new connector with GCS-to-BigQuery data pipeline
  • Supports three motion categories with different schemas and transformation logic
  • Includes comprehensive documentation for Elasticsearch integration and quality metrics

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 24 comments.

| File | Description |
| --- | --- |
| connectors/motionblend/connector.py | Core connector implementation with schema, update, and transformation logic |
| connectors/motionblend/configuration.json | Configuration parameters for GCS bucket, prefixes, and BigQuery dataset |
| connectors/motionblend/requirements.txt | Third-party dependencies (structlog, GCP libraries, testing frameworks) |
| connectors/motionblend/README.md | Comprehensive documentation covering setup, authentication, and usage |
| connectors/motionblend/METRICS.md | Quality metrics documentation for motion blend evaluation |
| connectors/motionblend/ELASTICSEARCH.md | Elasticsearch integration guide with search examples |

Comments suppressed due to low confidence (4)

connectors/motionblend/connector.py:18

  • Import of 'bigquery' is not used.
from google.cloud import storage, bigquery

connectors/motionblend/connector.py:19

  • Import of 'retry' is not used.
from google.api_core import retry

connectors/motionblend/connector.py:21

  • Import of 'time' is not used.
import time

connectors/motionblend/connector.py:326

  • Import of 'sys' is not used.
    import sys

- Remove excessive inline comments from table definitions
- Move skeleton hierarchy documentation to schema() docstring
- Improve formatting consistency with other connectors
- Maintain clean, readable structure matching SDK patterns
github-actions bot added the size/XXL label (PR size: extra extra large) on Nov 20, 2025
@RydlrCS (Author) left a comment

Pushed successfully! The schema formatting cleanup has been committed and pushed to the main branch.

RydlrCS closed this Nov 20, 2025
RydlrCS reopened this Nov 20, 2025
RydlrCS (Author) commented Nov 20, 2025

Production-ready code with comprehensive documentation and proper error handling.

Copilot AI (Contributor) left a comment

Pull Request Overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 22 comments.

- Add last_sync_time parameter to list_gcs_files() function
- Filter GCS blobs by updated timestamp for incremental processing
- Read state before processing each prefix
- Only process files updated after last sync time
- Log incremental vs full sync mode for observability
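
A sketch of the incremental filter described above, using the google-cloud-storage client; the helper's exact signature in the PR may differ:

```python
# Sketch only: skip GCS blobs not updated since the last successful sync.
from datetime import datetime
from typing import Iterator, Optional

from google.cloud import storage


def list_gcs_files(
    bucket_name: str, prefix: str, last_sync_time: Optional[datetime]
) -> Iterator[storage.Blob]:
    client = storage.Client()
    for blob in client.list_blobs(bucket_name, prefix=prefix):
        # Incremental mode: only yield files updated after the last sync.
        if last_sync_time is not None and blob.updated <= last_sync_time:
            continue
        yield blob
```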
- Remove confusing __RETRY_DELAY_SECONDS constant
- Use standard pattern: sleep_time = min(60, 2 ** attempt)
- Improves code readability and follows Python best practices
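
The pattern in context, as a minimal retry loop (the retried operation is a stand-in):

```python
# Sketch only: capped exponential backoff around a retriable operation.
import time


def with_backoff(operation, max_attempts: int = 5):
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Standard pattern from this commit: exponential growth
            # capped at 60 seconds.
            time.sleep(min(60, 2 ** attempt))
```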
- Update schema() to lines 43-128
- Update list_gcs_files() to lines 131-222
- Update generate_record_id() to lines 249-262
- Update transform functions to lines 265-373
- Update update() to lines 375-469

Line numbers changed after incremental sync implementation
- Change google-cloud-storage>=2.0.0 to ==2.18.2
- Remove requests dependency (already available in SDK runtime)
- Ensures reproducible builds and prevents breaking changes
- Move json import from __main__ block to top with other imports
- Follow PEP 8 standard: all imports at top of file
- Maintain proper import grouping: stdlib after SDK, before third-party
Copilot AI (Contributor) left a comment

Pull Request Overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.

Critical fix: Track global_max_timestamp incrementally at each
checkpoint instead of only at prefix end. This prevents state
inconsistency where intermediate checkpoints would use different
timestamps than the final state update.

Problem: State was updated with last_file_timestamp at checkpoints
(line 651) but then overwritten with global_max_timestamp at the end
(line 953). If connector failed between last checkpoint and final
update, state would have intermediate timestamp, causing reprocessing.

Solution: Use shared global_max_timestamp_tracker dict that is updated
for each file and used consistently at all checkpoints. State now
always contains the true global maximum timestamp of processed files,
ensuring monotonic progression and preventing data loss/duplication.
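
A sketch of the shared-tracker fix: a mutable dict threaded through processing so every checkpoint reads the same global maximum; the function and key names are assumptions:

```python
# Sketch only: keep the global max timestamp consistent at every checkpoint.
def track_and_checkpoint(file_record: dict, state: dict, tracker: dict) -> None:
    file_ts = file_record["updated_at"]
    # Update the shared tracker for every file, not just at prefix end.
    current = tracker.get("global_max_timestamp")
    if current is None or file_ts > current:
        tracker["global_max_timestamp"] = file_ts
    # Any checkpoint taken now uses the true global maximum.
    state["last_sync_time"] = tracker["global_max_timestamp"]
```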
Fix inconsistency where generate_blend_id() returned 40-character hash
while generate_record_id() returned 16 characters. This caused blend
motion IDs to be 40 chars while seed/build motion IDs were 16 chars.

Standardized to 16 characters ([:16]) for all ID generation:
- Seed motion IDs: 16 chars (already consistent)
- Build motion IDs: 16 chars (already consistent)
- Blend motion IDs: 16 chars (fixed from 40)

This ensures consistent ID format across all tables and prevents
confusion in downstream analytics.
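
A sketch of the standardized scheme, assuming a SHA-256 digest truncated with the [:16] slice mentioned above:

```python
# Sketch only: one 16-character hex ID format for all three tables.
import hashlib


def generate_record_id(*parts: str) -> str:
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
    return digest[:16]  # same length for seed, build, and blend IDs


# e.g. generate_record_id("gs://bucket/motions/walk_01.bvh", "seed")
```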
Refactored to reuse transform_blend_record() function instead of
duplicating 32 lines of transformation logic in blend generation loop.

Changes:
- Added optional 'blend_meta' parameter to transform_blend_record()
- Function now accepts pre-computed blend metadata from generate_blend_pairs()
- Removed duplicated transformation logic (lines 749-781)
- Replaced with single call to transform_blend_record()

Benefits:
- Single source of truth for blend record transformation
- Schema changes only need to be applied in one place
- Reduced maintenance burden and risk of inconsistencies
- 32 lines of code eliminated
Reworded framework reference for better readability:
- Combined documentation references on single line
- Changed 'See https://...' to 'Framework implementation: https://...'
- More consistent and professional formatting
Extracted three helper functions to reduce cognitive complexity from >15 to <10:
- _calculate_blend_batch_limit(): Calculates batch limit based on remaining budget
- _upsert_blend_records(): Handles record transformation, upserting, and checkpointing
- _clear_motion_buffers(): Clears buffers to free memory

Benefits:
- Main function now has single-level conditional logic (complexity ~8)
- Each helper has single responsibility and low complexity (<5)
- Improved testability - helpers can be unit tested independently
- Better code organization and readability
- Maintained all error handling and logging behavior
Updated README configuration table to match exact placeholders from configuration.json:
- google_cloud_storage_prefixes: <COMMA_SEPARATED_GCS_PREFIXES> → <YOUR_COMMA_SEPARATED_PREFIXES>
- batch_limit: <OPTIONAL_BATCH_LIMIT_FOR_TESTING_ONLY> → <YOUR_OPTIONAL_BATCH_LIMIT>
- include_extensions: Changed Default column from '.bvh,.fbx' to '<YOUR_OPTIONAL_FILE_EXTENSIONS>'
- max_blend_pairs: Changed Default column from '100' to '<YOUR_OPTIONAL_MAX_BLEND_PAIRS>'
- max_motion_buffer_size: Changed Default column from '1000' to '<YOUR_OPTIONAL_MAX_BUFFER_SIZE>'

This ensures documentation accurately reflects the configuration template users will see.
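
For reference, the resulting configuration.json template would look like this, showing only the five keys discussed above (any further keys, such as the bucket name, are omitted here):

```json
{
  "google_cloud_storage_prefixes": "<YOUR_COMMA_SEPARATED_PREFIXES>",
  "batch_limit": "<YOUR_OPTIONAL_BATCH_LIMIT>",
  "include_extensions": "<YOUR_OPTIONAL_FILE_EXTENSIONS>",
  "max_blend_pairs": "<YOUR_OPTIONAL_MAX_BLEND_PAIRS>",
  "max_motion_buffer_size": "<YOUR_OPTIONAL_MAX_BUFFER_SIZE>"
}
```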
Replaced conditional assignment pattern with early return for clearer intent:
- Before: if min_frames > 0 ... else actual_window_ratio = 0.0
- After: if min_frames == 0: return None (early return)

Benefits:
- Prevents potential ZeroDivisionError more explicitly
- Clearer intent - zero-length motions cannot have quality scores
- Simpler control flow without conditional assignment
- Consistent with existing early return at line 121
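
A sketch of the early-return guard; the ratio computation itself is illustrative:

```python
# Sketch only: early return instead of conditional assignment.
def calculate_window_ratio(min_frames: int, window_frames: int) -> float | None:
    if min_frames == 0:
        # Zero-length motions cannot have quality scores.
        return None
    return window_frames / min_frames
```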
fivetran-sahilkhirwal added the accel Google hack 2025 label (PR related to the Accelerate AI 2025 Google-sponsored Hackathon) on Dec 2, 2025
@fivetran-sahilkhirwal (Contributor) left a comment

The logical flow is correct.
Since these examples will be available for users to adapt to their own use cases, it is important that they are readable and maintainable.
Suggested some changes to improve readability and to refactor some long methods into smaller functions.
Thanks for contributing to the repo :)

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.

fivetran-sahilkhirwal (Contributor) commented

Please add this to the main repository README as well.

Python formatting check failed. See formatting.diff artifact for details.
To fix the error(s) run the command below from the root of the repo and commit the changes:
./.github/scripts/fix-python-formatting.sh

@fivetran-pranavtotala (Contributor) left a comment

Pending: please respond to the comments above, and add the required testing details to the description.

fivetran deleted a comment from the cla-assistant bot on Jan 2, 2026
cla-assistant bot commented Jan 2, 2026

CLA assistant check
All committers have signed the CLA.

…omprehensive testing docs

Major improvements:
- Refactored connector.py: broke down long methods into focused helper functions
  * list_gcs_files() -> 4 helpers (connection, blob filtering, timestamp check, record creation)
  * validate_configuration() -> 4 validators (string, prefixes, extensions, integers)
  * update() -> 6 module-level functions (batch processing, prefix processing, blend generation)
  * Reduced update() from ~300 lines to ~50 lines for better maintainability

Critical bug fixes:
- Fixed state tracking: changed from global maximum to batch-local sequential tracking
  * Prevents data loss when files arrive out of order across different batches
  * State now reflects actual processing progress within current batch
- Fixed validation edge cases:
  * Empty string handling ("0" now correctly validates as integer)
  * Division by zero in blend_utils
  * batch_limit handling (testing-only parameter)

Code quality improvements:
- Eliminated nested functions
- Simplified zero-division guards in calculate_blend_ratio()
- Explicit is not None checks instead of truthiness for optional parameters
- Applied Black formatting (99-char line length)

Documentation:
- Comprehensive testing docs covering:
  * Local testing setup and validation checklist
  * 6 detailed test scenarios (first sync, incremental sync, recovery, etc.)
  * Log analysis patterns and known limitations
  * Production deployment checklist
- Updated Features and Error handling sections

Contributor: Ted Iro, Rydlr Cloud Services Ltd
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

- Fix _validate_optional_integer_parameter to use min_value parameter instead of hardcoded <= 0 check
- Update schema() docstring to match required template format (moved skeleton hierarchy context to module-level docstring)

These changes ensure proper parameter validation and maintain consistency with coding guidelines.
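
A sketch of what the parameterized validator could look like; the helper name matches the commit, but the body and the min_value default are assumptions:

```python
# Sketch only: min_value replaces the hardcoded <= 0 check.
def _validate_optional_integer_parameter(
    configuration: dict, key: str, min_value: int = 0
):
    raw = configuration.get(key)
    if raw is None or raw == "":
        return None  # optional parameter not supplied
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"'{key}' must be an integer, got {raw!r}")
    if value < min_value:
        raise ValueError(f"'{key}' must be >= {min_value}, got {value}")
    return value
```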
- Apply Black formatting to connector.py
- Add motionblend entry to README.md in alphabetical order between microsoft_intune and n8n

Labels

accel Google hack 2025 (PR related to the Accelerate AI 2025 Google-sponsored Hackathon) · size/XXL (PR size: extra extra large)


4 participants