Flytekit: Rename map_task to map, replace min_successes and min_success_ratio with tolerance, rename max_parallelism to concurrency #3107

Open
wants to merge 17 commits into
base: master
from

Conversation

@ChihTsungLu ChihTsungLu commented Feb 4, 2025

Tracking issue

Related to flyteorg/flyte#6139

Why are the changes needed?

The current Flytekit has several areas that could be improved for a better developer experience:

  1. The map_task name is unnecessarily verbose when imported via the recommended import flytekit as fl
  2. The failure tolerance parameters (min_successes and min_success_ratio) are powerful but overly verbose
  3. The max_parallelism parameter naming in workflow and LaunchPlan needs to be aligned with map_task's concurrency parameter

What changes were proposed in this pull request?

  1. Rename map_task to map

    • While this shadows Python's built-in map when imported by name, that is acceptable because the recommended usage is import flytekit as fl, which keeps the call namespaced as fl.map
    • All changes will maintain backwards compatibility
  2. Simplify failure tolerance parameters

    • Deprecate min_successes and min_success_ratio
    • Introduce new tolerance parameter that accepts both float and int types
    • Maintain backwards compatibility with existing parameters
  3. Standardize parallelism parameter

    • Deprecate max_parallelism argument in workflow and LaunchPlan
    • Introduce new concurrency parameter to match map_task's parameter
    • Maintain backwards compatibility with existing parameter
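
The description does not spell out how a single tolerance value stands in for the two older parameters. One plausible reading, sketched here as an assumption and not as the PR's actual code, is that a float is treated as a ratio and an int as an absolute count, carried over in place of the old value. The helper name resolve_tolerance and its return shape are hypothetical.

```python
import warnings
from typing import Optional, Tuple, Union


def resolve_tolerance(
    tolerance: Optional[Union[float, int]] = None,
    min_successes: Optional[int] = None,
    min_success_ratio: Optional[float] = None,
) -> Tuple[Optional[int], Optional[float]]:
    """Hypothetical helper: return (min_successes, min_success_ratio).

    Assumption: a float tolerance is a ratio, an int tolerance is a count.
    """
    if min_successes is not None or min_success_ratio is not None:
        warnings.warn(
            "min_successes and min_success_ratio are deprecated; use tolerance",
            DeprecationWarning,
            stacklevel=2,
        )
    if tolerance is None:
        # Old-style call: pass the legacy values through unchanged.
        return min_successes, min_success_ratio
    # bool is a subclass of int, so reject it before the numeric checks.
    if isinstance(tolerance, bool) or not isinstance(tolerance, (int, float)):
        raise TypeError("tolerance must be an int or a float")
    if isinstance(tolerance, float):
        if not 0.0 <= tolerance <= 1.0:
            raise ValueError("a float tolerance must lie in [0.0, 1.0]")
        return min_successes, tolerance
    if tolerance < 0:
        raise ValueError("an int tolerance must be non-negative")
    return tolerance, min_success_ratio
```

Dispatching on type keeps one keyword at the call site, at the cost that 1 and 1.0 mean different things, which is worth documenting wherever the parameter is exposed.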

How was this patch tested?

Ran tests with the command: make test

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

Summary by Bito

This PR standardizes API naming by changing 'map_task' to 'map' and 'max_parallelism' to 'concurrency', while consolidating failure tolerance parameters. It transitions from legacy 'agent' nomenclature to a 'connector' paradigm while maintaining backward compatibility with deprecation warnings. The changes enhance execution metric retrieval, asynchronous task handling, resource validation, and file serialization for a more intuitive developer experience. It also improves exception handling and introduces geospatial support.

Unit tests added: False

Estimated effort to review (1-5, lower is better): 5

@flyte-bot
Contributor

flyte-bot commented Feb 4, 2025

Code Review Agent Run #d47fe6

Actionable Suggestions - 13
  • tests/flytekit/unit/types/directory/test_listdir.py - 2
    • Consider implications of map vs map_task · Line 4-4
    • Consider using map_task for workflow operations · Line 29-29
  • plugins/flytekit-papermill/tests/test_task.py - 1
    • Consider using map_task for notebook tasks · Line 417-417
  • flytekit/__init__.py - 1
    • Consider maintaining backward compatibility for imports · Line 222-222
  • flytekit/core/array_node_map_task.py - 1
    • Consider keeping descriptive function name · Line 373-373
  • tests/flytekit/unit/core/test_array_node_map_task.py - 8
Additional Suggestions - 10
  • flytekit/core/options.py - 3
    • Consider adding concurrency parameter validation · Line 26-27
    • Consider adding validation for concurrency parameter · Line 38-38
    • Consider using @deprecated decorator instead · Line 43-66
  • tests/flytekit/unit/core/test_array_node_map_task.py - 2
  • tests/flytekit/integration/remote/workflows/basic/array_map.py - 1
    • Consider potential naming confusion with map · Line 4-4
  • tests/flytekit/unit/core/test_array_node.py - 1
  • flytekit/models/launch_plan.py - 2
    • Consider validating concurrency value before use · Line 277-277
    • Consider simplifying concurrency handling logic · Line 301-318
  • flytekit/tools/translator.py - 1
    • Consider consolidating duplicate warning logic · Line 355-382
Review Details
  • Files reviewed - 24 · Commit Range: 87dfe2f..d8e5d4b
    • flytekit/__init__.py
    • flytekit/clis/sdk_in_container/run.py
    • flytekit/core/array_node_map_task.py
    • flytekit/core/launch_plan.py
    • flytekit/core/options.py
    • flytekit/models/execution.py
    • flytekit/models/launch_plan.py
    • flytekit/remote/entities.py
    • flytekit/remote/remote.py
    • flytekit/tools/translator.py
    • plugins/flytekit-k8s-pod/tests/test_pod.py
    • plugins/flytekit-papermill/tests/test_task.py
    • tests/flytekit/integration/remote/workflows/basic/array_map.py
    • tests/flytekit/integration/remote/workflows/basic/pydantic_wf.py
    • tests/flytekit/unit/core/test_array_node.py
    • tests/flytekit/unit/core/test_array_node_map_task.py
    • tests/flytekit/unit/core/test_artifacts.py
    • tests/flytekit/unit/core/test_interface.py
    • tests/flytekit/unit/core/test_launch_plan.py
    • tests/flytekit/unit/core/test_node_creation.py
    • tests/flytekit/unit/core/test_partials.py
    • tests/flytekit/unit/core/test_type_hints.py
    • tests/flytekit/unit/remote/test_remote.py
    • tests/flytekit/unit/types/directory/test_listdir.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito

@flyte-bot
Contributor

flyte-bot commented Feb 4, 2025

Changelist by Bito

This pull request implements the following key changes.

Key Change Files Impacted
Feature Improvement - API Refactoring and Deprecation Overhaul

__init__.py - Renamed 'map_task' to 'map' and added warnings import for improved API clarity.

run.py - Replaced 'max_parallelism' with 'concurrency' in CLI options to streamline parameter usage.

array_node_map_task.py - Renamed function to 'map', consolidated failure tolerance parameters into a unified 'tolerance' parameter, and added deprecation warnings.

launch_plan.py - Updated workflow parameters by switching to 'concurrency' while preserving backward compatibility via deprecation warnings.

options.py - Revised option definitions to substitute 'max_parallelism' with 'concurrency'.

execution.py - Modified execution model to prioritize 'concurrency', issuing deprecation warnings for 'max_parallelism'.

launch_plan.py - Aligned launch plan parameter naming by replacing 'max_parallelism' with 'concurrency'.

entities.py - Adapted remote entity logic to adopt 'concurrency' and added warnings for deprecated usage.

remote.py - Shifted parameter passing from 'max_parallelism' to 'concurrency' for consistency in remote calls.

translator.py - Enhanced launch plan serialization logic by prioritizing the new 'concurrency' parameter.

Testing - Test Suite Updates for API Migration

test_pod.py - Updated import statements to use 'map' instead of 'map_task', reflecting API changes.

test_task.py - Replaced 'map_task' with 'map' in task definitions to align with new naming.

array_map.py - Modified workflow integration tests to use the new 'map' function.

pydantic_wf.py - Updated workflow definitions by replacing 'map_task' with 'map'.

test_package.py - Applied minor test adjustments in line with updated CLI parameter names.

test_array_node.py - Altered test imports to replace 'map_task' with the new 'map'.

test_array_node_map_task.py - Replaced multiple occurrences of 'map_task' with 'map' to reflect API renaming.

test_artifacts.py - Updated import and usage of 'map' in place of 'map_task' for artifact tests.

test_interface.py - Refactored import statements and function calls from 'map_task' to 'map'.

test_launch_plan.py - Renamed instances of 'max_parallelism' to 'concurrency' and updated related variable names in test cases.

test_node_creation.py - Replaced 'map_task' with 'map' for node creation and testing consistency.

test_partials.py - Updated aliasing of 'map_task' to 'map' in partial function tests.

test_type_hints.py - Replaced 'map_task' with 'map' to update type hints and metadata usage.

test_remote.py - Updated remote task invocations to use 'map' instead of 'map_task'.

test_listdir.py - Replaced 'map_task' with 'map' in directory listing tests.

Documentation - Documentation and Baseline Lint Updates

pydoclint-errors-baseline.txt - Removed outdated lint baseline errors related to class attribute documentation inconsistencies.

@@ -1,7 +1,7 @@
 import tempfile
 from pathlib import Path

-from flytekit import FlyteDirectory, FlyteFile, map_task, task, workflow
+from flytekit import FlyteDirectory, FlyteFile, map, task, workflow

Consider implications of map vs map_task

Consider if replacing map_task with map is intentional as they might have different functionality in the Flyte framework. map_task is typically used for task parallelization while map might have different semantics.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-from flytekit import FlyteDirectory, FlyteFile, map, task, workflow
+from flytekit import FlyteDirectory, FlyteFile, map_task, task, workflow

Code Review Run #d47fe6


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

@@ -26,6 +26,6 @@ def list_dir(dir: FlyteDirectory) -> list[FlyteFile]:
 def wf() -> list[str]:
     tmpdir = setup()
     files = list_dir(dir=tmpdir)
-    return map_task(read_file)(file=files)
+    return map(read_file)(file=files)

Consider using map_task for workflow operations

Consider using map_task instead of map for task mapping operations in Flytekit workflows. The map function may not provide the same task-level parallelization and execution guarantees as map_task.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-    return map(read_file)(file=files)
+    return map_task(read_file)(file=files)

Code Review Run #d47fe6


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

@@ -414,7 +414,7 @@ def create_sd() -> StructuredDataset:
 def test_map_over_notebook_task():
     @workflow
     def wf(a: float) -> typing.List[float]:
-        return map_task(nb_sub_task)(a=[a, a])
+        return map(nb_sub_task)(a=[a, a])

Consider using map_task for notebook tasks

Consider using map_task instead of map for mapping over notebook tasks. The map function may not handle notebook task specific requirements correctly.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-        return map(nb_sub_task)(a=[a, a])
+        return map_task(nb_sub_task)(a=[a, a])

Code Review Run #d47fe6


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

 from flytekit._version import __version__
 from flytekit.configuration import Config
-from flytekit.core.array_node_map_task import map_task
+from flytekit.core.array_node_map_task import map

Consider maintaining backward compatibility for imports

Consider keeping both map_task and map imports to maintain backward compatibility. The alias is defined later but importing directly as map may break existing code that uses map_task.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-from flytekit.core.array_node_map_task import map
+from flytekit.core.array_node_map_task import map_task

Code Review Run #d47fe6


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged
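
The comment above notes that an alias for the old name is defined later in the module. A generic way to keep an old name importable while steering callers to the new one is a thin wrapper that emits a DeprecationWarning. The sketch below is an assumption about the pattern, not the PR's code: array_map is a stand-in for the renamed function, deprecated_alias is a hypothetical helper, and the source does not say whether the real alias warns.

```python
import functools
import warnings
from typing import Any, Callable, List


def array_map(target: Callable[[Any], Any]) -> Callable[[List[Any]], List[Any]]:
    """Stand-in for the renamed function; it only mimics the map(task)(inputs) call shape."""
    return lambda items: [target(item) for item in items]


def deprecated_alias(new_func: Callable[..., Any], old_name: str) -> Callable[..., Any]:
    """Hypothetical helper: expose new_func under old_name with a warning."""

    @functools.wraps(new_func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        warnings.warn(
            f"{old_name} is deprecated; use {new_func.__name__} instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller, not the wrapper
        )
        return new_func(*args, **kwargs)

    wrapper.__name__ = old_name
    return wrapper


# Old imports keep working; each call points users at the new name.
map_task = deprecated_alias(array_map, "map_task")
```

A plain assignment (map_task = array_map) would also preserve compatibility, only without the warning.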

@@ -369,11 +370,12 @@ def _raw_execute(self, **kwargs) -> Any:
         return outputs


-def map_task(
+def map(

Consider keeping descriptive function name

Consider keeping the original function name map_task instead of renaming to map as it could conflict with Python's built-in map function and cause confusion. The original name was more descriptive of the function's purpose.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-def map(
+def map_task(

Code Review Run #d47fe6


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged
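
The naming concern raised above comes down to scope. A namespaced call such as fl.map leaves the built-in untouched, while importing the name directly shadows it in that scope. The sketch below illustrates this with a hypothetical stand-in object named fl; it is not flytekit, and it only mimics the map(task)(inputs) call shape seen in the diffs.

```python
import builtins
import types

# Hypothetical stand-in for a package that exports its own `map`.
fl = types.SimpleNamespace(map=lambda fn: (lambda items: [fn(i) for i in items]))


def double(x: int) -> int:
    return x * 2


# Namespaced access: the built-in and the library function coexist.
via_builtin = list(map(double, [1, 2, 3]))
via_library = fl.map(double)([1, 2, 3])


def with_shadowing() -> str:
    # Equivalent in effect to `from some_package import map` inside this scope.
    map = fl.map
    try:
        map(double, [1, 2, 3])  # the built-in call shape no longer works here
    except TypeError:
        # The built-in stays reachable through the builtins module.
        return "shadowed; fallback gives " + str(list(builtins.map(double, [1, 2, 3])))
    return "not shadowed"
```

This is why the PR description ties the rename to the import flytekit as fl convention.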

@@ -63,7 +63,7 @@ def say_hello(name: str) -> str:

     @workflow
     def wf() -> List[str]:
-        return map_task(say_hello)(name=["abc", "def"])
+        return map(say_hello)(name=["abc", "def"])

Map task function call change

Consider if using map() instead of map_task() is intentional as it changes the behavior from using Flyte's map task functionality to Python's built-in map().

Code suggestion
Check the AI-generated fix before applying
Suggested change
-        return map(say_hello)(name=["abc", "def"])
+        return map_task(say_hello)(name=["abc", "def"])

Code Review Run #d47fe6


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

@@ -575,7 +575,7 @@ def say_hello(name: str) -> str:
     for index, map_input_str in enumerate(list_strs):
         monkeypatch.setenv("BATCH_JOB_ARRAY_INDEX_VAR_NAME", "name")
         monkeypatch.setenv("name", str(index))
-        t = map_task(say_hello)
+        t = map(say_hello)

Potential task mapping behavior change

Consider if using map() instead of map_task() is intentional as this could change the behavior of task mapping functionality.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-        t = map(say_hello)
+        t = map_task(say_hello)

Code Review Run #d47fe6


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

@@ -410,7 +410,7 @@ def test_serialization_metadata(serialization_settings):
     def t1(a: int) -> int:
         return a + 1

-    arraynode_maptask = map_task(t1, metadata=TaskMetadata(retries=2))
+    arraynode_maptask = map(t1, metadata=TaskMetadata(retries=2))

Function rename may affect compatibility

Consider if changing from map_task to map could impact backward compatibility. The function name change from map_task to map may affect existing code that imports and uses the original function name.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-    arraynode_maptask = map(t1, metadata=TaskMetadata(retries=2))
+    # Maintain both for backward compatibility
+    arraynode_maptask = map_task(t1, metadata=TaskMetadata(retries=2))

Code Review Run #d47fe6


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

t1 = map(say_hello, **kwargs1)
t2 = map(say_hello, **kwargs2)

Verify intended function call change

Consider if replacing map_task with map is intentional as this changes the function being called which could affect functionality. The map_task decorator appears to be imported but not used after this change.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-    t1 = map(say_hello, **kwargs1)
-    t2 = map(say_hello, **kwargs2)
+    t1 = map_task(say_hello, **kwargs1)
+    t2 = map_task(say_hello, **kwargs2)

Code Review Run #d47fe6


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

@@ -316,7 +316,7 @@ def test_bounded_inputs_vars_order(serialization_settings):
     def task1(a: int, b: float, c: str) -> str:
         return f"{a} - {b} - {c}"

-    mt = map_task(functools.partial(task1, c=1.0, b="hello", a=1))
+    mt = map(functools.partial(task1, c=1.0, b="hello", a=1))

Consider using map_task instead of map

Consider using map_task() instead of map() as it appears to be the intended function based on the test context and imports. Using map() could lead to unexpected behavior since it's a built-in Python function.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-    mt = map(functools.partial(task1, c=1.0, b="hello", a=1))
+    mt = map_task(functools.partial(task1, c=1.0, b="hello", a=1))

Code Review Run #d47fe6


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

@@ -492,7 +492,7 @@ def test_supported_node_type():
     def test_task():
         ...

-    map_task(test_task)
+    map(test_task)

Consider using map_task instead of map

The function call has been changed from map_task(test_task) to map(test_task). This could potentially cause confusion with Python's built-in map() function. Consider using the imported map_task decorator/function to maintain clarity and avoid potential naming conflicts.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-    map(test_task)
+    map_task(test_task)

Code Review Run #d47fe6


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

@@ -533,7 +533,7 @@ def consume_directories(dirs: List[FlyteDirectory]):
         for path_info, other_info in d.crawl():
             print(path_info)

-    mt = map_task(generate_directory, min_success_ratio=0.1)
+    mt = map(generate_directory, min_success_ratio=0.1)

Verify map function usage intention

Consider if using map() instead of map_task() is intentional as it may change the expected behavior. The map_task() function is typically used for array node map tasks in Flytekit.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-    mt = map(generate_directory, min_success_ratio=0.1)
+    mt = map_task(generate_directory, min_success_ratio=0.1)

Code Review Run #d47fe6


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

@@ -575,7 +575,7 @@ def say_hello(name: str) -> str:
     for index, map_input_str in enumerate(list_strs):
         monkeypatch.setenv("BATCH_JOB_ARRAY_INDEX_VAR_NAME", "name")
         monkeypatch.setenv("name", str(index))
-        t = map_task(say_hello)
+        t = map(say_hello)

Consider using map_task instead of map

Consider using map_task instead of map as it appears to be the intended decorator based on the imports and test context. The map function could be confused with Python's built-in map function.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-        t = map(say_hello)
+        t = map_task(say_hello)

Code Review Run #d47fe6


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

@flyte-bot
Contributor

flyte-bot commented Feb 5, 2025

Code Review Agent Run #99b31d

Actionable Suggestions - 8
  • tests/flytekit/unit/core/test_array_node_map_task.py - 4
  • flytekit/remote/remote.py - 1
  • tests/flytekit/unit/core/test_node_creation.py - 1
    • Consider using map_task for workflow testing · Line 276-276
  • tests/flytekit/unit/remote/test_remote.py - 1
  • flytekit/core/launch_plan.py - 1
Additional Suggestions - 10
  • flytekit/core/options.py - 4
    • Consider adding concurrency parameter validation · Line 26-27
    • Consider adding validation for concurrency parameter · Line 38-38
    • Consider using standard deprecation decorator pattern · Line 43-66
    • Consider consolidating duplicate warning message · Line 48-64
  • flytekit/models/execution.py - 1
    • Consider adding property setter for deprecation · Line 290-302
  • flytekit/clis/sdk_in_container/run.py - 1
    • Consider updating deprecated parameter name · Line 529-529
  • tests/flytekit/unit/core/test_node_creation.py - 1
  • tests/flytekit/unit/core/test_array_node_map_task.py - 3
Review Details
  • Files reviewed - 24 · Commit Range: 87dfe2f..09755a2
    • flytekit/__init__.py
    • flytekit/clis/sdk_in_container/run.py
    • flytekit/core/array_node_map_task.py
    • flytekit/core/launch_plan.py
    • flytekit/core/options.py
    • flytekit/models/execution.py
    • flytekit/models/launch_plan.py
    • flytekit/remote/entities.py
    • flytekit/remote/remote.py
    • flytekit/tools/translator.py
    • plugins/flytekit-k8s-pod/tests/test_pod.py
    • plugins/flytekit-papermill/tests/test_task.py
    • tests/flytekit/integration/remote/workflows/basic/array_map.py
    • tests/flytekit/integration/remote/workflows/basic/pydantic_wf.py
    • tests/flytekit/unit/core/test_array_node.py
    • tests/flytekit/unit/core/test_array_node_map_task.py
    • tests/flytekit/unit/core/test_artifacts.py
    • tests/flytekit/unit/core/test_interface.py
    • tests/flytekit/unit/core/test_launch_plan.py
    • tests/flytekit/unit/core/test_node_creation.py
    • tests/flytekit/unit/core/test_partials.py
    • tests/flytekit/unit/core/test_type_hints.py
    • tests/flytekit/unit/remote/test_remote.py
    • tests/flytekit/unit/types/directory/test_listdir.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

@@ -315,7 +316,7 @@ def test_bounded_inputs_vars_order(serialization_settings):
     def task1(a: int, b: float, c: str) -> str:
         return f"{a} - {b} - {c}"

-    mt = map_task(functools.partial(task1, c=1.0, b="hello", a=1))
+    mt = map(functools.partial(task1, c=1.0, b="hello", a=1))

Parameter type mismatch in task call

The function call parameters c=1.0, b="hello", a=1 appear to have mismatched types with the task definition. The task expects a: int, b: float, c: str but receives c as float, b as string, and a as int. Consider adjusting the parameter types to match the task signature.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-    mt = map(functools.partial(task1, c=1.0, b="hello", a=1))
+    mt = map(functools.partial(task1, c="1.0", b=1.0, a=1))

Code Review Run #99b31d


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

@@ -1551,7 +1551,7 @@ def _execute(
             annotations=options.annotations,
             raw_output_data_config=options.raw_output_data_config,
             auth_role=None,
-            max_parallelism=options.max_parallelism,
+            concurrency=options.concurrency,

Parameter rename may break compatibility

Consider verifying if renaming max_parallelism to concurrency maintains backward compatibility. This change could potentially break existing code that relies on the max_parallelism parameter.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-concurrency=options.concurrency,
+concurrency=options.max_parallelism if hasattr(options, 'max_parallelism')
+else options.concurrency,
+# TODO: Remove max_parallelism support in next major version
+# Deprecated in favor of concurrency parameter

Code Review Run #99b31d


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged
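
One caveat about the hasattr-based suggestion above: if the options object always defines max_parallelism as an attribute, hasattr is always true and the old value is chosen even when it is None. An alternative is to resolve the two spellings once, when the options object is built. The sketch below is a hypothetical, trimmed-down Options class written for illustration; it is not flytekit's real class, and the conflict rule (raise if both are set and differ) is an assumption.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class Options:
    """Hypothetical, trimmed-down options holder; not flytekit's real class."""

    concurrency: Optional[int] = None
    max_parallelism: Optional[int] = None  # deprecated spelling, kept for old callers

    def __post_init__(self) -> None:
        if self.max_parallelism is not None:
            warnings.warn(
                "max_parallelism is deprecated; use concurrency",
                DeprecationWarning,
                stacklevel=2,
            )
            if self.concurrency is not None and self.concurrency != self.max_parallelism:
                raise ValueError("concurrency and max_parallelism disagree")
            self.concurrency = self.max_parallelism
        # Keep the old attribute readable so existing code does not break.
        self.max_parallelism = self.concurrency
```

With the values reconciled up front, call sites such as the one in the hunk can pass options.concurrency without any conditional.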

@@ -273,7 +273,7 @@ def t1(a: str) -> str:

     @workflow
     def my_wf(a: typing.List[str]) -> typing.List[str]:
-        mappy = map_task(t1)
+        mappy = map(t1)

Consider using map_task for workflow testing

Consider using map_task instead of map as it appears to be the intended function based on the test context. The map function may not provide the same task mapping functionality needed for workflow testing.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-        mappy = map(t1)
+        mappy = map_task(t1)

Code Review Run #99b31d


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

@@ -726,7 +726,7 @@ def t1(x: int, y: int) -> int:

     @workflow
     def w() -> int:
-        return map_task(partial(t1, y=2))(x=[1, 2, 3])
+        return map(partial(t1, y=2))(x=[1, 2, 3])

Consider using map_task for consistency

Consider using map_task instead of map as it appears to be testing map task functionality based on the test name and context.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-        return map(partial(t1, y=2))(x=[1, 2, 3])
+        return map_task(partial(t1, y=2))(x=[1, 2, 3])

Code Review Run #99b31d


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

Comment on lines +307 to +308
m1 = map(functools.partial(task1, c=param_c))(a=param_a, b=param_b)
m2 = map(functools.partial(task2, c=param_c))(a=param_a, b=param_b)
m3 = map(functools.partial(task3, c=param_c))(a=param_a, b=param_b)

Consider using map_task for array testing

Consider using map_task instead of map for consistency with the test name and module being tested (test_array_node_map_task.py). The test appears to be validating array node map task functionality.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-    m1 = map(functools.partial(task1, c=param_c))(a=param_a, b=param_b)
-    m2 = map(functools.partial(task2, c=param_c))(a=param_a, b=param_b)
-    m3 = map(functools.partial(task3, c=param_c))(a=param_a, b=param_b)
+    m1 = map_task(functools.partial(task1, c=param_c))(a=param_a, b=param_b)
+    m2 = map_task(functools.partial(task2, c=param_c))(a=param_a, b=param_b)
+    m3 = map_task(functools.partial(task3, c=param_c))(a=param_a, b=param_b)

Code Review Run #99b31d


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

Comment on lines 378 to 379
<<<<<<< Updated upstream
=======

Unresolved merge conflict markers

There appears to be a merge conflict marker in the code. The lines <<<<<<< Updated upstream and ======= are Git merge conflict markers that should be resolved before committing.

Code suggestion
Check the AI-generated fix before applying
Suggested change
<<<<<<< Updated upstream
=======

Code Review Run #9482f9


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

def test_map_task_override(serialization_settings):
    @task
    def my_mappable_task(a: int) -> typing.Optional[str]:
        return str(a)

>>>>>>> Stashed changes

Unresolved Git merge conflict marker

There appears to be a Git merge conflict marker (>>>>>>> Stashed changes) in the code. This needs to be resolved before the code can be properly merged.

Code suggestion
Check the AI-generated fix before applying
 -<<<<<<< Updated upstream
 -=======
 -def test_map_task_override(serialization_settings):
 -    @task
 -    def my_mappable_task(a: int) -> typing.Optional[str]:
 -        return str(a)
 -
 ->>>>>>> Stashed changes

Code Review Run #9482f9


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

- Rename map_task to map for simpler API
- Replace min_successes/min_success_ratio with tolerance parameter
- Rename max_parallelism to concurrency for consistency

Signed-off-by: Chih Tsung Lu <[email protected]>
Signed-off-by: Chih Tsung Lu <[email protected]>
Signed-off-by: Chih Tsung Lu <[email protected]>
@flyte-bot
Contributor

flyte-bot commented Mar 3, 2025

Code Review Agent Run #58bfcc

Actionable Suggestions - 2
  • flytekit/remote/entities.py - 1
    • Inconsistent handling of concurrency parameter · Line 811-812
  • flytekit/clis/sdk_in_container/run.py - 1
    • Possible conflict between concurrency parameters · Line 269-276
Review Details
  • Files reviewed - 26 · Commit Range: 9213010..afa8096
    • flytekit/__init__.py
    • flytekit/clis/sdk_in_container/run.py
    • flytekit/core/array_node_map_task.py
    • flytekit/core/launch_plan.py
    • flytekit/core/options.py
    • flytekit/models/execution.py
    • flytekit/models/launch_plan.py
    • flytekit/remote/entities.py
    • flytekit/remote/remote.py
    • flytekit/tools/translator.py
    • plugins/flytekit-k8s-pod/tests/test_pod.py
    • plugins/flytekit-papermill/tests/test_task.py
    • pydoclint-errors-baseline.txt
    • tests/flytekit/integration/remote/workflows/basic/array_map.py
    • tests/flytekit/integration/remote/workflows/basic/pydantic_wf.py
    • tests/flytekit/unit/cli/pyflyte/test_package.py
    • tests/flytekit/unit/core/test_array_node.py
    • tests/flytekit/unit/core/test_array_node_map_task.py
    • tests/flytekit/unit/core/test_artifacts.py
    • tests/flytekit/unit/core/test_interface.py
    • tests/flytekit/unit/core/test_launch_plan.py
    • tests/flytekit/unit/core/test_node_creation.py
    • tests/flytekit/unit/core/test_partials.py
    • tests/flytekit/unit/core/test_type_hints.py
    • tests/flytekit/unit/remote/test_remote.py
    • tests/flytekit/unit/types/directory/test_listdir.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Signed-off-by: Chih Tsung Lu <[email protected]>
Signed-off-by: Chih Tsung Lu <[email protected]>
(cherry picked from commit a62560a9235f2591a5530cd1834b8e3f5fa9c492)
Signed-off-by: Chih Tsung Lu <[email protected]>
Comment on lines +811 to +812
if "concurrency" in kwargs:
    kwargs["max_parallelism"] = kwargs.pop("concurrency")

Inconsistent handling of concurrency parameter

The code sets self._max_parallelism directly from kwargs["max_parallelism"] but doesn't handle the case where concurrency is used. This creates inconsistent behavior between the two parameter options.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-if "concurrency" in kwargs:
-    kwargs["max_parallelism"] = kwargs.pop("concurrency")
+if "concurrency" in kwargs:
+    kwargs["max_parallelism"] = kwargs.pop("concurrency")
+self._max_parallelism = kwargs["max_parallelism"]

Code Review Run #58bfcc
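
For illustration, the alias handling discussed in this comment could be sketched as follows. This is a minimal sketch, not flytekit's code: the class name `ExecutionSettings` is hypothetical, and both rejecting calls that pass both names and defaulting to `None` when neither is passed are assumptions.

```python
from typing import Any, Optional


class ExecutionSettings:
    """Hypothetical stand-in for the class under review (not flytekit's actual code)."""

    def __init__(self, **kwargs: Any) -> None:
        # Map the new name onto the legacy key, as in the lines under review.
        if "concurrency" in kwargs:
            # Assumption: passing both names at once is treated as an error.
            if kwargs.get("max_parallelism") is not None:
                raise ValueError("pass either concurrency or max_parallelism, not both")
            kwargs["max_parallelism"] = kwargs.pop("concurrency")
        # .get() avoids a KeyError when neither name is supplied.
        self._max_parallelism: Optional[int] = kwargs.get("max_parallelism")

    @property
    def max_parallelism(self) -> Optional[int]:
        return self._max_parallelism


new_style = ExecutionSettings(concurrency=4)
old_style = ExecutionSettings(max_parallelism=4)
unset = ExecutionSettings()
```

Reading the value with `kwargs.get` rather than `kwargs["max_parallelism"]` keeps construction from failing when the caller supplies neither name.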



Comment on lines +269 to +276
max_parallelism: int = make_click_option_field(
    click.Option(
        param_decls=["--max-parallelism"],
        required=False,
        type=int,
        show_default=True,
        help="[Deprecated] Use --concurrency instead",
    )

Possible conflict between concurrency parameters

The parameter name has been changed from --max-parallelism to --concurrency, but there's still a max_parallelism parameter defined at line 269 that uses the old flag name. This creates a situation where both flags are available but they likely control the same functionality. Consider updating the code to handle both parameters appropriately, perhaps by having the deprecated max_parallelism parameter set the concurrency value when provided.

Code suggestion
Check the AI-generated fix before applying
 @@ -268,12 +268,19 @@
      )
      max_parallelism: int = make_click_option_field(
          click.Option(
              param_decls=["--max-parallelism"],
              required=False,
              type=int,
              show_default=True,
              help="[Deprecated] Use --concurrency instead",
 +            callback=lambda ctx, param, value: _handle_max_parallelism(ctx, value),
          )
      )
 +
 +def _handle_max_parallelism(ctx, value):
 +    """Handle the deprecated max_parallelism parameter by setting concurrency if provided."""
 +    if value is not None and 'concurrency' not in ctx.params:
 +        ctx.params['concurrency'] = value
 +    return value
 +

Code Review Run #58bfcc
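
An alternative to a per-option callback is to reconcile the two flags in one place after parsing, so the result does not depend on the order in which options are processed. The sketch below uses only the standard library; `resolve_concurrency` is a hypothetical helper, and treating conflicting values as an error is an assumption.

```python
import warnings
from typing import Optional


def resolve_concurrency(concurrency: Optional[int], max_parallelism: Optional[int]) -> Optional[int]:
    """Hypothetical helper: pick the effective value from the new and the deprecated flag."""
    if max_parallelism is not None:
        warnings.warn(
            "--max-parallelism is deprecated; use --concurrency instead",
            DeprecationWarning,
            stacklevel=2,
        )
        if concurrency is None:
            # Only the deprecated flag was given: honor it.
            return max_parallelism
        if concurrency != max_parallelism:
            # Assumption: conflicting values are rejected rather than silently resolved.
            raise ValueError("--concurrency and --max-parallelism were given different values")
    return concurrency
```

Calling `resolve_concurrency(None, 8)` returns the deprecated value and emits a `DeprecationWarning`, while `resolve_concurrency(4, None)` returns the new value without a warning.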



@flyte-bot
Contributor

flyte-bot commented Mar 3, 2025

Code Review Agent Run #184856

Actionable Suggestions - 2
  • flytekit/core/array_node_map_task.py - 1
  • flytekit/core/launch_plan.py - 1
    • Possible missing migration for renamed attribute · Line 382-382
Review Details
  • Files reviewed - 26 · Commit Range: 9213010..7b2e45e
    • flytekit/__init__.py
    • flytekit/clis/sdk_in_container/run.py
    • flytekit/core/array_node_map_task.py
    • flytekit/core/launch_plan.py
    • flytekit/core/options.py
    • flytekit/models/execution.py
    • flytekit/models/launch_plan.py
    • flytekit/remote/entities.py
    • flytekit/remote/remote.py
    • flytekit/tools/translator.py
    • plugins/flytekit-k8s-pod/tests/test_pod.py
    • plugins/flytekit-papermill/tests/test_task.py
    • pydoclint-errors-baseline.txt
    • tests/flytekit/integration/remote/workflows/basic/array_map.py
    • tests/flytekit/integration/remote/workflows/basic/pydantic_wf.py
    • tests/flytekit/unit/cli/pyflyte/test_package.py
    • tests/flytekit/unit/core/test_array_node.py
    • tests/flytekit/unit/core/test_array_node_map_task.py
    • tests/flytekit/unit/core/test_artifacts.py
    • tests/flytekit/unit/core/test_interface.py
    • tests/flytekit/unit/core/test_launch_plan.py
    • tests/flytekit/unit/core/test_node_creation.py
    • tests/flytekit/unit/core/test_partials.py
    • tests/flytekit/unit/core/test_type_hints.py
    • tests/flytekit/unit/remote/test_remote.py
    • tests/flytekit/unit/types/directory/test_listdir.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful


else:
    raise TypeError("tolerance must be float or int")

final_min_ratio = computed_min_ratio if min_success_ratio is None else min_success_ratio

Potential None value assignment issue

The variable final_min_ratio is assigned min_success_ratio even when it's None. This could lead to unexpected behavior since the default value for min_success_ratio has been changed from 1.0 to None. Consider ensuring a non-None value is always assigned.

Code suggestion
Check the AI-generated fix before applying
Suggested change
- final_min_ratio = computed_min_ratio if min_success_ratio is None else min_success_ratio
+ final_min_ratio = min_success_ratio if min_success_ratio is not None else computed_min_ratio

Code Review Run #184856
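
Read side by side, the two conditional expressions above appear to select the same value in every case: the explicit `min_success_ratio` when it is set, otherwise `computed_min_ratio`. The sketch below checks this over a few sample values and shows one way to guarantee a non-None result. The function names are hypothetical, and the final fallback of 1.0 is an assumption based on the previous default mentioned in the comment.

```python
from itertools import product
from typing import Optional


def original_form(min_success_ratio: Optional[float], computed_min_ratio: Optional[float]) -> Optional[float]:
    return computed_min_ratio if min_success_ratio is None else min_success_ratio


def suggested_form(min_success_ratio: Optional[float], computed_min_ratio: Optional[float]) -> Optional[float]:
    return min_success_ratio if min_success_ratio is not None else computed_min_ratio


def resolve_min_ratio(
    min_success_ratio: Optional[float],
    computed_min_ratio: Optional[float],
    default: float = 1.0,  # assumption: fall back to the previous default
) -> float:
    """Hypothetical helper: explicit value first, then the computed one, then the default."""
    for candidate in (min_success_ratio, computed_min_ratio):
        if candidate is not None:
            return candidate
    return default


# Compare both forms over every pairing of the sample values, including None.
samples = [None, 0.0, 0.5, 1.0]
forms_agree = all(
    original_form(a, b) == suggested_form(a, b) for a, b in product(samples, repeat=2)
)
```

Testing with `is not None` rather than truthiness matters here, because an explicit ratio of 0.0 is a valid value and should not be replaced by the fallback.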



@@ -367,7 +379,14 @@ def __init__(
         self._labels = labels
         self._annotations = annotations
         self._raw_output_data_config = raw_output_data_config
-        self._max_parallelism = max_parallelism
+        self._concurrency = concurrency

Possible missing migration for renamed attribute

The max_parallelism property is now returning self._concurrency instead of the original self._max_parallelism. This suggests that the attribute name has changed, but there might be a missing initialization or migration of the old attribute value. Consider ensuring that existing _max_parallelism values are properly migrated to _concurrency during initialization.

Code suggestion
Check the AI-generated fix before applying
Suggested change
- self._concurrency = concurrency
+ # Ensure backward compatibility by using max_parallelism if provided
+ if hasattr(self, '_max_parallelism') and self._max_parallelism is not None:
+     self._concurrency = self._max_parallelism
+ else:
+     self._concurrency = concurrency

Code Review Run #184856
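
One way to keep the old name working after such a rename is to store a single attribute and expose the deprecated name as an alias. The sketch below is an assumption-laden illustration, not flytekit's `LaunchPlan`: the class name `LaunchPlanSketch` is hypothetical, and it reads the constructor arguments directly instead of probing `self` with `hasattr`.

```python
import warnings
from typing import Optional


class LaunchPlanSketch:
    """Hypothetical stand-in showing a renamed attribute with a deprecated alias."""

    def __init__(
        self,
        concurrency: Optional[int] = None,
        max_parallelism: Optional[int] = None,  # deprecated name
    ) -> None:
        if max_parallelism is not None:
            warnings.warn(
                "max_parallelism is deprecated; use concurrency instead",
                DeprecationWarning,
                stacklevel=2,
            )
        # Assumption: the new argument wins; the deprecated one is only a fallback.
        self._concurrency = concurrency if concurrency is not None else max_parallelism

    @property
    def concurrency(self) -> Optional[int]:
        return self._concurrency

    @property
    def max_parallelism(self) -> Optional[int]:
        # Deprecated alias so existing callers keep working.
        return self._concurrency
```

Note that this sketch gives precedence to `concurrency` when both are passed, which is the opposite of the suggested change above; either order can work as long as it is documented.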



@flyte-bot
Contributor

flyte-bot commented Mar 8, 2025

Code Review Agent Run #032d70

Actionable Suggestions - 0
Review Details
  • Files reviewed - 39 · Commit Range: 7b2e45e..5926fa8
    • Dockerfile.agent
    • flytekit/clients/friendly.py
    • flytekit/clis/sdk_in_container/metrics.py
    • flytekit/core/array_node.py
    • flytekit/core/constants.py
    • flytekit/core/context_manager.py
    • flytekit/core/interface.py
    • flytekit/core/node_creation.py
    • flytekit/core/python_function_task.py
    • flytekit/core/worker_queue.py
    • flytekit/core/workflow.py
    • flytekit/image_spec/__init__.py
    • flytekit/image_spec/default_builder.py
    • flytekit/image_spec/image_spec.py
    • flytekit/models/filters.py
    • flytekit/models/interface.py
    • flytekit/remote/executions.py
    • flytekit/remote/metrics.py
    • flytekit/remote/remote.py
    • flytekit/tools/serialize_helpers.py
    • flytekit/tools/translator.py
    • flytekit/types/file/file.py
    • plugins/flytekit-slurm/flytekitplugins/slurm/__init__.py
    • plugins/flytekit-slurm/flytekitplugins/slurm/function/task.py
    • plugins/flytekit-slurm/flytekitplugins/slurm/script/agent.py
    • plugins/flytekit-slurm/flytekitplugins/slurm/script/task.py
    • plugins/flytekit-slurm/flytekitplugins/slurm/ssh_utils.py
    • plugins/flytekit-slurm/setup.py
    • plugins/flytekit-slurm/tests/test_slurm_shell_task.py
    • plugins/flytekit-spark/flytekitplugins/spark/task.py
    • tests/flytekit/integration/remote/workflows/basic/flytefile.py
    • tests/flytekit/unit/core/image_spec/test_image_spec.py
    • tests/flytekit/unit/core/test_async.py
    • tests/flytekit/unit/core/test_context_manager.py
    • tests/flytekit/unit/core/test_eager_cleanup.py
    • tests/flytekit/unit/core/test_imperative.py
    • tests/flytekit/unit/core/test_serialization.py
    • tests/flytekit/unit/models/test_interface.py
    • tests/flytekit/unit/types/file/test_file.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful


@flyte-bot
Contributor

flyte-bot commented Mar 13, 2025

Code Review Agent Run #1a2fb7

Actionable Suggestions - 0
Review Details
  • Files reviewed - 38 · Commit Range: 5926fa8..2dc189b
    • flytekit/__init__.py
    • flytekit/core/array_node.py
    • flytekit/core/array_node_map_task.py
    • flytekit/core/container_task.py
    • flytekit/core/context_manager.py
    • flytekit/core/interface.py
    • flytekit/core/legacy_map_task.py
    • flytekit/core/node.py
    • flytekit/core/python_auto_container.py
    • flytekit/core/python_customized_container_task.py
    • flytekit/core/resources.py
    • flytekit/core/task.py
    • flytekit/core/utils.py
    • flytekit/core/worker_queue.py
    • flytekit/models/interface.py
    • flytekit/models/launch_plan.py
    • flytekit/remote/remote.py
    • flytekit/tools/script_mode.py
    • flytekit/tools/translator.py
    • flytekit/types/pickle/pickle.py
    • plugins/flytekit-kf-pytorch/flytekitplugins/kfpytorch/pod_template.py
    • plugins/flytekit-kf-pytorch/flytekitplugins/kfpytorch/task.py
    • plugins/flytekit-kf-pytorch/tests/test_shared.py
    • plugins/flytekit-spark/setup.py
    • tests/flytekit/integration/remote/test_remote.py
    • tests/flytekit/integration/remote/workflows/basic/pickle_wf.py
    • tests/flytekit/unit/cli/pyflyte/test_package.py
    • tests/flytekit/unit/core/test_array_node.py
    • tests/flytekit/unit/core/test_array_node_map_task.py
    • tests/flytekit/unit/core/test_context_manager.py
    • tests/flytekit/unit/core/test_interface.py
    • tests/flytekit/unit/core/test_local_raw_container.py
    • tests/flytekit/unit/core/test_node_creation.py
    • tests/flytekit/unit/core/test_resources.py
    • tests/flytekit/unit/core/test_task.py
    • tests/flytekit/unit/core/test_type_hints.py
    • tests/flytekit/unit/core/test_worker_queue.py
    • tests/flytekit/unit/models/test_interface.py
  • Files skipped - 1
    • .gitignore - Reason: Filter setting
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

Refer to the documentation for additional commands.

Configuration

This repository uses code_review_bito. You can customize the agent settings here or contact your Bito workspace admin at [email protected].

Documentation & Help


@flyte-bot
Contributor

flyte-bot commented Mar 19, 2025

Code Review Agent Run #31323f

Actionable Suggestions - 0
Review Details
  • Files reviewed - 73 · Commit Range: 2dc189b..3292aab
    • flytekit/clis/sdk_in_container/serve.py
    • flytekit/core/array_node_map_task.py
    • flytekit/core/data_persistence.py
    • flytekit/core/python_function_task.py
    • flytekit/core/type_engine.py
    • flytekit/exceptions/system.py
    • flytekit/exceptions/user.py
    • flytekit/extend/backend/base_agent.py
    • flytekit/extend/backend/base_connector.py
    • flytekit/extend/backend/utils.py
    • flytekit/extras/webhook/__init__.py
    • flytekit/extras/webhook/task.py
    • flytekit/image_spec/default_builder.py
    • flytekit/image_spec/image_spec.py
    • flytekit/models/task.py
    • flytekit/sensor/base_sensor.py
    • flytekit/sensor/file_sensor.py
    • flytekit/sensor/sensor_engine.py
    • plugins/flytekit-airflow/flytekitplugins/airflow/__init__.py
    • plugins/flytekit-airflow/flytekitplugins/airflow/task.py
    • plugins/flytekit-aws-sagemaker/flytekitplugins/awssagemaker_inference/__init__.py
    • plugins/flytekit-aws-sagemaker/flytekitplugins/awssagemaker_inference/boto3_mixin.py
    • plugins/flytekit-aws-sagemaker/flytekitplugins/awssagemaker_inference/boto3_task.py
    • plugins/flytekit-aws-sagemaker/flytekitplugins/awssagemaker_inference/task.py
    • plugins/flytekit-aws-sagemaker/flytekitplugins/awssagemaker_inference/workflow.py
    • plugins/flytekit-aws-sagemaker/tests/test_boto3_mixin.py
    • plugins/flytekit-aws-sagemaker/tests/test_inference_task.py
    • plugins/flytekit-aws-sagemaker/tests/test_inference_workflow.py
    • plugins/flytekit-bigquery/flytekitplugins/bigquery/__init__.py
    • plugins/flytekit-bigquery/flytekitplugins/bigquery/task.py
    • plugins/flytekit-geopandas/flytekitplugins/geopandas/__init__.py
    • plugins/flytekit-geopandas/flytekitplugins/geopandas/gdf_transformers.py
    • plugins/flytekit-geopandas/setup.py
    • plugins/flytekit-geopandas/tests/test_geopandas_plugin.py
    • plugins/flytekit-k8sdataservice/flytekitplugins/k8sdataservice/__init__.py
    • plugins/flytekit-k8sdataservice/flytekitplugins/k8sdataservice/k8s/kube_config.py
    • plugins/flytekit-k8sdataservice/flytekitplugins/k8sdataservice/sensor.py
    • plugins/flytekit-k8sdataservice/flytekitplugins/k8sdataservice/task.py
    • plugins/flytekit-k8sdataservice/tests/k8sdataservice/test_task.py
    • plugins/flytekit-mmcloud/flytekitplugins/mmcloud/__init__.py
    • plugins/flytekit-mmcloud/tests/test_mmcloud.py
    • plugins/flytekit-openai/flytekitplugins/openai/__init__.py
    • plugins/flytekit-openai/flytekitplugins/openai/batch/task.py
    • plugins/flytekit-openai/flytekitplugins/openai/chatgpt/task.py
    • plugins/flytekit-openai/tests/chatgpt/test_chatgpt.py
    • plugins/flytekit-perian/flytekitplugins/perian_job/__init__.py
    • plugins/flytekit-perian/flytekitplugins/perian_job/task.py
    • plugins/flytekit-perian/setup.py
    • plugins/flytekit-slurm/flytekitplugins/slurm/__init__.py
    • plugins/flytekit-slurm/flytekitplugins/slurm/function/task.py
    • plugins/flytekit-slurm/flytekitplugins/slurm/script/task.py
    • plugins/flytekit-slurm/flytekitplugins/slurm/ssh_utils.py
    • plugins/flytekit-snowflake/flytekitplugins/snowflake/__init__.py
    • plugins/flytekit-snowflake/flytekitplugins/snowflake/task.py
    • plugins/flytekit-snowflake/tests/test_snowflake.py
    • plugins/flytekit-spark/flytekitplugins/spark/__init__.py
    • plugins/flytekit-spark/flytekitplugins/spark/models.py
    • plugins/flytekit-spark/flytekitplugins/spark/task.py
    • plugins/flytekit-spark/tests/test_spark_task.py
    • plugins/setup.py
    • pydoclint-errors-baseline.txt
    • pyproject.toml
    • tests/flytekit/clis/sdk_in_container/test_serve.py
    • tests/flytekit/unit/bin/test_python_entrypoint.py
    • tests/flytekit/unit/cli/pyflyte/test_serve.py
    • tests/flytekit/unit/core/image_spec/test_image_spec.py
    • tests/flytekit/unit/core/test_array_node_map_task.py
    • tests/flytekit/unit/core/test_eager_cleanup.py
    • tests/flytekit/unit/core/test_partials.py
    • tests/flytekit/unit/extras/webhook/test_end_to_end.py
    • tests/flytekit/unit/sensor/test_file_sensor.py
    • tests/flytekit/unit/sensor/test_sensor_engine.py
    • tests/flytekit/unit/types/structured_dataset/test_snowflake.py
  • Files skipped - 9
    • .github/workflows/build_image.yml - Reason: Filter setting
    • .github/workflows/pythonbuild.yml - Reason: Filter setting
    • .github/workflows/pythonpublish.yml - Reason: Filter setting
    • plugins/flytekit-aws-sagemaker/README.md - Reason: Filter setting
    • plugins/flytekit-geopandas/README.md - Reason: Filter setting
    • plugins/flytekit-mmcloud/README.md - Reason: Filter setting
    • plugins/flytekit-openai/README.md - Reason: Filter setting
    • plugins/flytekit-perian/README.md - Reason: Filter setting
    • plugins/flytekit-slurm/README.md - Reason: Filter setting
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

