refactor: (codeflash) ⚡️ Speed up function get_successors by 20% #5390

Open: wants to merge 2 commits into base: main

misrasaurabh1
Contributor

📄 get_successors in src/backend/base/langflow/graph/graph/utils.py

✨ Performance Summary:

  • Speed Increase: 📈 20% (0.20x faster)
  • Runtime Reduction: ⏱️ From 107 microseconds down to 88.5 microseconds (best of 44 runs)

📝 Explanation and details

The original implementation performs redundant checks while processing nodes. The optimized version removes them to cut unnecessary operations.

Key optimizations:

  1. Avoid unnecessary checks: the stack is initialized with the direct successors of the given vertex_id, so the initial vertex never needs to be rechecked.
  2. Use set operations efficiently: a set tracks the visited nodes, so both the membership check and the add are O(1) on average.
  3. Reduce redundant operations: a successor is pushed onto the stack only if it has not been visited yet, which avoids re-checking it inside the loop.
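
The three points above can be sketched as follows. This is an illustrative reconstruction, not the diff from this PR: the name `get_successors_sketch` is hypothetical, the graph shape is taken from the generated tests, and the sketch assumes the start vertex is never reported even when a cycle leads back to it. The traversal order may differ from the real `get_successors`.

```python
def get_successors_sketch(graph, vertex_id):
    """Collect every vertex reachable from vertex_id, excluding vertex_id itself."""
    # Assumption: the start vertex is never reported, even if a cycle returns to it.
    visited = {vertex_id}
    result = []
    stack = []

    def push_unvisited(successors):
        # Point 3: mark and push a successor only the first time it is seen.
        for successor in successors:
            if successor not in visited:  # Point 2: O(1) average set lookup
                visited.add(successor)
                stack.append(successor)

    # Point 1: seed the stack with the direct successors, not the start vertex.
    push_unvisited(graph[vertex_id]["successors"])

    while stack:
        current = stack.pop()
        result.append(current)
        push_unvisited(graph[current]["successors"])
    return result


cyclic = {"A": {"successors": ["B"]}, "B": {"successors": ["C"]}, "C": {"successors": ["A"]}}
print(get_successors_sketch(cyclic, "A"))  # ['B', 'C']
```

Marking a vertex as visited when it is pushed, rather than when it is popped, keeps duplicates off the stack, but it can change the order in which vertices are reported compared with a pop-time check.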

Further runtime improvements would depend on the characteristics of the graph and on how the function is called.


Correctness verification

The new optimized code was tested for correctness. The results are listed below:

| Test | Status | Details |
|---|---|---|
| ⚙️ Existing Unit Tests | 51 Passed | See below |
| 🌀 Generated Regression Tests | 25 Passed | See below |
| ⏪ Replay Tests | 🔘 None Found | |
| 🔎 Concolic Coverage Tests | 🔘 None Found | |
| 📊 Coverage | 100.0% | |

⚙️ Existing Unit Tests Details

- graph/graph/test_utils.py

🌀 Generated Regression Tests Details

import pytest  # used for our unit tests
from langflow.graph.graph.utils import get_successors

# unit tests

def test_single_vertex_no_successors():
    graph = {'A': {'successors': []}}
    codeflash_output = get_successors(graph, 'A')

def test_single_vertex_one_successor():
    graph = {'A': {'successors': ['B']}, 'B': {'successors': []}}
    codeflash_output = get_successors(graph, 'A')

def test_multiple_vertices_linear_successors():
    graph = {'A': {'successors': ['B']}, 'B': {'successors': ['C']}, 'C': {'successors': []}}
    codeflash_output = get_successors(graph, 'A')

def test_branching_graph():
    graph = {'A': {'successors': ['B', 'C']}, 'B': {'successors': ['D']}, 'C': {'successors': []}, 'D': {'successors': []}}
    codeflash_output = get_successors(graph, 'A')

def test_graph_with_cycles():
    graph = {'A': {'successors': ['B']}, 'B': {'successors': ['C']}, 'C': {'successors': ['A']}}
    codeflash_output = get_successors(graph, 'A')

def test_disconnected_graph():
    graph = {'A': {'successors': ['B']}, 'B': {'successors': []}, 'C': {'successors': ['D']}, 'D': {'successors': []}}
    codeflash_output = get_successors(graph, 'A')


def test_vertex_with_self_loop():
    graph = {'A': {'successors': ['A']}}
    codeflash_output = get_successors(graph, 'A')

def test_vertex_with_multiple_successors():
    graph = {'A': {'successors': ['B', 'C', 'D']}, 'B': {'successors': []}, 'C': {'successors': []}, 'D': {'successors': []}}
    codeflash_output = get_successors(graph, 'A')

def test_large_linear_graph():
    graph = {chr(i): {'successors': [chr(i+1)]} for i in range(65, 90)}
    graph['Z'] = {'successors': []}
    expected_result = [chr(i) for i in range(66, 91)]
    codeflash_output = get_successors(graph, 'A')

def test_large_branching_graph():
    graph = {'A': {'successors': [chr(i) for i in range(66, 91)]}}
    for i in range(66, 91):
        graph[chr(i)] = {'successors': []}
    codeflash_output = get_successors(graph, 'A')
    expected_result = [chr(i) for i in range(66, 91)]

def test_deeply_nested_graph():
    graph = {chr(i): {'successors': [chr(i+1)]} for i in range(65, 90)}
    graph['Z'] = {'successors': ['A']}
    expected_result = [chr(i) for i in range(66, 91)]
    codeflash_output = get_successors(graph, 'A')

def test_high_fan_out_graph():
    graph = {'A': {'successors': [chr(i) for i in range(66, 91)]}}
    for i in range(66, 91):
        graph[chr(i)] = {'successors': []}
    codeflash_output = get_successors(graph, 'A')
    expected_result = [chr(i) for i in range(66, 91)]
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest  # used for our unit tests
from langflow.graph.graph.utils import get_successors

# unit tests

def test_single_vertex_no_successors():
    graph = {"A": {"successors": []}}
    codeflash_output = get_successors(graph, "A")

def test_single_vertex_one_successor():
    graph = {"A": {"successors": ["B"]}, "B": {"successors": []}}
    codeflash_output = get_successors(graph, "A")

def test_multiple_vertices_linear_successors():
    graph = {"A": {"successors": ["B"]}, "B": {"successors": ["C"]}, "C": {"successors": []}}
    codeflash_output = get_successors(graph, "A")

def test_single_vertex_self_loop():
    graph = {"A": {"successors": ["A"]}}
    codeflash_output = get_successors(graph, "A")

def test_multiple_vertices_with_cycle():
    graph = {"A": {"successors": ["B"]}, "B": {"successors": ["C"]}, "C": {"successors": ["A"]}}
    codeflash_output = get_successors(graph, "A")



def test_vertex_multiple_successors():
    graph = {"A": {"successors": ["B", "C"]}, "B": {"successors": []}, "C": {"successors": []}}
    codeflash_output = get_successors(graph, "A")

def test_complex_graph_multiple_successors():
    graph = {"A": {"successors": ["B", "C"]}, "B": {"successors": ["D"]}, "C": {"successors": ["E"]}, "D": {"successors": []}, "E": {"successors": []}}
    codeflash_output = get_successors(graph, "A")


def test_vertex_not_in_graph():
    graph = {"A": {"successors": ["B"]}}
    with pytest.raises(KeyError):
        get_successors(graph, "C")

def test_graph_missing_successors_key():
    graph = {"A": {"successors": ["B"]}, "B": {}}
    with pytest.raises(KeyError):
        get_successors(graph, "A")

def test_large_graph_many_vertices_edges():
    graph = {"A": {"successors": ["B", "C", "D"]}, "B": {"successors": ["E", "F"]}, "C": {"successors": ["G", "H"]}, "D": {"successors": ["I", "J"]}, "E": {"successors": []}, "F": {"successors": []}, "G": {"successors": []}, "H": {"successors": []}, "I": {"successors": []}, "J": {"successors": []}}
    codeflash_output = get_successors(graph, "A")


def test_graph_with_integer_vertex_ids():
    graph = {1: {"successors": [2]}, 2: {"successors": [3]}, 3: {"successors": []}}
    codeflash_output = get_successors(graph, 1)

def test_graph_with_mixed_type_vertex_ids():
    graph = {1: {"successors": ["A"]}, "A": {"successors": [2]}, 2: {"successors": []}}
    codeflash_output = get_successors(graph, 1)

# Run the tests
if __name__ == "__main__":
    pytest.main()
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
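
The generated tests above contain no `assert` on the return value; per the closing comment, `codeflash_output` is compared between the original and the optimized code. A minimal sketch of what such a comparison could look like is given here. `compare_outputs` and the two toy functions are hypothetical stand-ins, not Codeflash's API and not the real `get_successors`, and the sketch assumes exact equality, so a difference in ordering counts as a mismatch.

```python
def compare_outputs(original, optimized, cases):
    """Return (graph, vertex_id, expected, actual) for every case where outputs differ."""
    mismatches = []
    for graph, vertex_id in cases:
        expected = original(graph, vertex_id)
        actual = optimized(graph, vertex_id)
        if expected != actual:  # exact equality, so ordering differences count
            mismatches.append((graph, vertex_id, expected, actual))
    return mismatches


# Toy stand-ins for the two implementations; neither is the real get_successors.
def direct_successors(graph, vertex_id):
    return list(graph[vertex_id]["successors"])


def direct_successors_reversed(graph, vertex_id):
    return list(reversed(graph[vertex_id]["successors"]))


cases = [
    ({"A": {"successors": []}}, "A"),
    ({"A": {"successors": ["B", "C"]}, "B": {"successors": []}, "C": {"successors": []}}, "A"),
]
print(len(compare_outputs(direct_successors, direct_successors, cases)))           # 0
print(len(compare_outputs(direct_successors, direct_successors_reversed, cases)))  # 1
```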

📣 **Feedback**

If you have any feedback or need assistance, feel free to join our Discord community:

Discord

dosubot (bot) added the size:S label (this PR changes 10-29 lines, ignoring generated files) on Dec 20, 2024
dosubot (bot) added the enhancement label (New feature or request) on Dec 20, 2024
github-actions (bot) added the refactor label (Maintenance tasks and housekeeping) and removed the enhancement label on Dec 20, 2024

codspeed-hq bot commented Dec 20, 2024

CodSpeed Performance Report

Merging #5390 will degrade performance by 18.2%

Comparing codeflash-ai:codeflash/optimize-get_successors-2024-12-11T14.11.20 (de6f2b3) with main (243055e)

Summary

⚡ 1 improvement
❌ 2 regressions
✅ 12 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Benchmark | main | codeflash-ai:codeflash/optimize-get_successors-2024-12-11T14.11.20 | Change |
|---|---|---|---|
| test_setup_llm_caching | 2.1 ms | 1.2 ms | +76.5% |
| test_successful_run_with_output_type_any | 203 ms | 248.2 ms | -18.2% |
| test_successful_run_with_output_type_debug | 225.1 ms | 257.9 ms | -12.72% |
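
The Change column appears consistent with computing `main / branch - 1`. That formula is inferred from the numbers in this report, not taken from CodSpeed's documentation, and `relative_change_percent` is a hypothetical helper. A short sketch that reproduces the arithmetic from the displayed times:

```python
def relative_change_percent(main_time, branch_time):
    """Percent change of the branch relative to main; negative means the branch is slower."""
    # Assumed formula, inferred from the reported figures.
    return (main_time / branch_time - 1) * 100


# Times in milliseconds, copied from the benchmark breakdown.
rows = {
    "test_setup_llm_caching": (2.1, 1.2),
    "test_successful_run_with_output_type_any": (203, 248.2),
    "test_successful_run_with_output_type_debug": (225.1, 257.9),
}

for name, (main_ms, branch_ms) in rows.items():
    print(f"{name}: {relative_change_percent(main_ms, branch_ms):+.2f}%")
```

The two regressions come out at about -18.2% and -12.72%, matching the report. The first row gives +75% from the displayed values rather than the reported +76.5%, presumably because the displayed times are rounded.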
