
feat(BA-75): Introduce Raftify and refactor DistributedGlobalTimer with Raft #2105

Draft
wants to merge 7 commits into
base: main

Conversation

jopemachine
Member

@jopemachine jopemachine commented May 3, 2024

Resolves #2969 (BA-75).
Prerequisite of #415 (BA-269).

This PR aims to resolve the distributed locking issues by integrating Raftify with the Backend.AI manager (based on a GlobalTimer operated by the Raft algorithm).

Any kind of feedback is welcome.

Note: This PR also partially resolves #1634.

How to set up the test environment

  • Set num-proc in manager.toml to any number other than 1, and configure the raft section when running the Backend.AI manager.
  • Create raft-cluster-config.toml and set initial_peers there. An example follows.
[[peers.myself]]
host = "192.168.0.1"
port = 60151
node-id = 1
role = "voter"

[[peers.other]]
host = "192.168.0.2"
port = 60151
node-id = 2
role = "voter"

[[peers.other]]
host = "192.168.0.3"
port = 60151
node-id = 3
role = "voter"

Testing and debugging

For example, the following command puts a new log entry into the cluster.

# Append { "1": "test" }
curl -XGET http://localhost:60251/put/1/test

To print all persisted logs, use the following command.

./backend.ai mgr raft debug persisted-all ./logs
---- Persisted entries ----
Key: 1, "Entry { context: [], data: [], entry_type: EntryNormal, index: 1, sync_log: false, term: 1 }"
Key: 2, "Entry { context: [], data: ConfChange { change_type: AddNode, node_id: 2, context: [127.0.0.1:60062], id: 0 }, entry_type: EntryConfChange, index: 2, sync_log: false, term: 1 }"
Key: 3, "Entry { context: [], data: ConfChange { change_type: AddNode, node_id: 3, context: [127.0.0.1:60063], id: 0 }, entry_type: EntryConfChange, index: 3, sync_log: false, term: 1 }"

---- Metadata ----
HardState { term: 1, vote: 1, commit: 3 }
ConfState { voters: [1, 2, 3], learners: [], voters_outgoing: [], learners_next: [], auto_leave: false }
Snapshot { data: HashStore(RwLock { data: {}, poisoned: false, .. }), metadata: Some(SnapshotMetadata { conf_state: Some(ConfState { voters: [1, 2, 3], learners: [], voters_outgoing: [], learners_next: [], auto_leave: false }), index: 1, term: 1 }) }
Last index: 3

For more details, please refer to the Raftify documentation.


Checklist: (if applicable)

  • Milestone metadata specifying the target backport version
  • Installer updates including:
    • Fixtures for db schema changes
    • New mandatory config options
  • Test case(s) to:
    • Demonstrate the difference of before/after
    • Demonstrate the flow of abstract/conceptual models with a concrete implementation

📚 Documentation preview 📚: https://sorna--2105.org.readthedocs.build/en/2105/


📚 Documentation preview 📚: https://sorna-ko--2105.org.readthedocs.build/ko/2105/

@jopemachine jopemachine marked this pull request as ready for review May 3, 2024 08:26
@github-actions github-actions bot added area:docs Documentations comp:manager Related to Manager component comp:common Related to Common component comp:cli Related to CLI component size:XL 500~ LoC labels May 3, 2024
Member Author

jopemachine commented May 3, 2024

graphite-app bot commented May 3, 2024

Your org has enabled the Graphite merge queue for merging into main

Add the label “flow:merge-queue” to the PR and Graphite will automatically add it to the merge queue when it’s ready to merge. Or use the label “flow:hotfix” to add to the merge queue as a hot fix.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

@github-actions github-actions bot added type:feature Add new features urgency:3 Must be finished within a certain time frame. labels May 3, 2024
@github-actions github-actions bot added this to the 24.03 milestone May 3, 2024
@jopemachine jopemachine changed the title feat: Introduce Raftify and RaftGlobalTimer to replace DistributedGlobalTimer feat: Introduce Raftify and RaftGlobalTimer to replace DistributedGlobalTimer May 3, 2024
@jopemachine jopemachine force-pushed the topic/05-03-feat_introduce_raftify_and_raftglobaltimer_to_replace_distributedglobaltimer branch 2 times, most recently from 07cc613 to 5be82bb Compare May 3, 2024 08:45
@jopemachine jopemachine modified the milestones: 24.03, 24.09 May 3, 2024
@achimnol achimnol requested review from achimnol and kyujin-cho May 3, 2024 09:10
@jopemachine jopemachine force-pushed the topic/05-03-feat_introduce_raftify_and_raftglobaltimer_to_replace_distributedglobaltimer branch from 5be82bb to fbe8733 Compare May 3, 2024 09:23
@github-actions github-actions bot modified the milestones: 24.09, 24.03 May 3, 2024
@jopemachine jopemachine force-pushed the topic/05-03-feat_introduce_raftify_and_raftglobaltimer_to_replace_distributedglobaltimer branch from fbe8733 to e5705e7 Compare May 3, 2024 09:24
@kyujin-cho kyujin-cho force-pushed the topic/05-03-feat_introduce_raftify_and_raftglobaltimer_to_replace_distributedglobaltimer branch from e5705e7 to b9b8a24 Compare May 9, 2024 07:36
requirements.txt Outdated Show resolved Hide resolved
src/ai/backend/common/distributed.py Outdated Show resolved Hide resolved
Comment on lines 77 to 88
async def join(self) -> None:
    self._tick_task = asyncio.create_task(self.generate_tick())

async def leave(self) -> None:
    self._stopped = True
    await asyncio.sleep(0)
    if not self._tick_task.done():
        try:
            self._tick_task.cancel()
            await self._tick_task
        except asyncio.CancelledError:
            pass
Member

Definitions here are copied and pasted from DistributedLockGlobalTimer. I suggest inheriting directly from DistributedLockGlobalTimer instead of AbstractGlobalTimer so that RaftGlobalTimer can reuse the already defined join() and leave() functions.

Member Author

@jopemachine jopemachine May 10, 2024

To have RaftGlobalTimer inherit from DistributedLockGlobalTimer instead of AbstractGlobalTimer, it is necessary to define RaftDistributedLock. This seems to conflict with the intent to replace the existing DistributedLock, and to implement this, it is essential to decide how the RaftDistributedLock should work.

IMHO, if our purpose here is simply to reuse join() and leave(), it would be appropriate to move the implementation of these two functions to AbstractGlobalTimer.
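
To make the alternative concrete, here is a minimal sketch of what moving join() and leave() into the base class could look like. It is not the code in this PR: the constructor, the `CountingTimer` subclass, and `demo()` are hypothetical and exist only to exercise the shared lifecycle.

```python
import asyncio
from abc import ABC, abstractmethod


class AbstractGlobalTimer(ABC):
    """Sketch: the base class owns join()/leave(); subclasses supply the tick loop."""

    def __init__(self) -> None:
        self._stopped = False
        self._tick_task: asyncio.Task | None = None

    @abstractmethod
    async def generate_tick(self) -> None:
        """Backend-specific tick loop (lock-based, Raft-based, ...)."""

    async def join(self) -> None:
        self._tick_task = asyncio.create_task(self.generate_tick())

    async def leave(self) -> None:
        self._stopped = True
        await asyncio.sleep(0)
        if self._tick_task is not None and not self._tick_task.done():
            self._tick_task.cancel()
            try:
                await self._tick_task
            except asyncio.CancelledError:
                pass


class CountingTimer(AbstractGlobalTimer):
    """Hypothetical subclass: counts ticks instead of producing events."""

    def __init__(self, interval: float) -> None:
        super().__init__()
        self.interval = interval
        self.ticks = 0

    async def generate_tick(self) -> None:
        while not self._stopped:
            self.ticks += 1
            await asyncio.sleep(self.interval)


async def demo() -> int:
    timer = CountingTimer(0.01)
    await timer.join()
    await asyncio.sleep(0.05)
    await timer.leave()
    return timer.ticks
```

With this shape, a Raft-based timer and a lock-based timer would each implement only generate_tick(), without needing a Raft-flavored lock type.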

Comment on lines 253 to 270

if root_ctx.raft_ctx.use_raft():
    app_ctx.log_cleanup_timer = RaftGlobalTimer(
        root_ctx.raft_ctx.raft_node,
        root_ctx.event_producer,
        lambda: DoLogCleanupEvent(),
        20.0,
        initial_delay=17.0,
    )
else:
    app_ctx.log_cleanup_timer = DistributedLockGlobalTimer(
        root_ctx.distributed_lock_factory(LockID.LOCKID_LOG_CLEANUP_TIMER, 20.0),
        root_ctx.event_producer,
        lambda: DoLogCleanupEvent(),
        20.0,
        initial_delay=17.0,
        task_name="log_cleanup_task",
    )
Member

This kind of flag-like approach can become burdensome when adopting a third lock backend; please refactor this to match our match-case selection convention (ref).

Member Author

@jopemachine jopemachine May 10, 2024

I refactored the related code in 5eb1e9f to reflect the feedback, and this issue is resolved in that commit.

In this commit, the use_raft() checks are replaced with match-case statements.

Comment on lines +38 to +44
[raft]
heartbeat-tick = 3
election-tick = 10
log-dir = "./logs"

Member

From my understanding, the raft backend is enabled only when this directive is specified in the configuration file. This approach can be too implicit from a system manager's perspective; please refactor the activation mechanism so that the config writer can explicitly state which timer backend to use.

Member Author

@jopemachine jopemachine May 10, 2024

I refactored the related code in 5eb1e9f to reflect the feedback, and this issue is resolved in that commit.

In this commit, I added a global-timer option to manager.toml (the manager local config).

I think the global-timer option makes it possible to specify more explicitly whether to use raft or distributed-lock.

Could you review the commit?

from typing import Any


class Logger:
Member

Is there any reason to retain a separate Logger class for raft APIs only?

Member Author

@jopemachine jopemachine May 10, 2024

The logs printed on the Rust side should be passed to the Python logger instead of being directly output to stdout or a file.

In other words, the methods of this class are called from Rust. That's why the logs output by raft-rs are formatted to match the Backend.AI logger format.

You can see the details of this class in the code below.
Ref: https://github.com/lablup/raftify/blob/main/binding/python/src/bindings/logger.rs
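
As an illustration of the idea, here is a minimal sketch of an adapter that forwards calls to a standard Python logger. The method names (`debug`, `info`, `warn`, `error`) are assumptions and may not match what the Raftify binding actually calls; `RaftLogger`, `ListHandler`, and the sample messages are hypothetical.

```python
import logging
from typing import Any


class RaftLogger:
    """Sketch of an adapter whose methods a native extension could call."""

    def __init__(self, logger: logging.Logger) -> None:
        self._logger = logger

    def debug(self, message: Any) -> None:
        self._logger.debug(message)

    def info(self, message: Any) -> None:
        self._logger.info(message)

    def warn(self, message: Any) -> None:
        self._logger.warning(message)

    def error(self, message: Any) -> None:
        self._logger.error(message)


class ListHandler(logging.Handler):
    """Collects (level, message) pairs so the forwarding can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[tuple[str, str]] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append((record.levelname, record.getMessage()))


handler = ListHandler()
base = logging.getLogger("raft-sketch")
base.setLevel(logging.DEBUG)
base.addHandler(handler)
base.propagate = False

raft_logger = RaftLogger(base)
raft_logger.info("sample message from the native side")
raft_logger.warn("sample warning from the native side")
```

Because every call ends up in a regular `logging.Logger`, the messages pick up whatever formatters and handlers the host application has already configured.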

Comment on lines +6 to +22
class SetCommand:
    """
    Represent simple key-value command.
    Use pickle to serialize the data.
    """

    def __init__(self, key: str, value: str) -> None:
        self.key = key
        self.value = value

    def encode(self) -> bytes:
        return pickle.dumps(self.__dict__)

    @classmethod
    def decode(cls, packed: bytes) -> "SetCommand":
        unpacked = pickle.loads(packed)
        return cls(unpacked["key"], unpacked["value"])
Member

Most of what this class does can be achieved by just using the dataclasses library.

Member Author

Although I haven't yet found a proper way to express the association with Rust types, this class is used on the Rust side as the LogEntry type. (Refer to the PyLogEntry implementation below)

https://github.com/lablup/raftify/blob/main/binding/python/src/bindings/state_machine.rs#L27-L31

This means that the encode and decode methods should be callable from Rust.

If we try to refactor SetCommand like below,

@dataclass
class SetCommand:
    key: str
    value: str

we will encounter the following error when encode is called on the object.

2024-05-10 11:48:16,101 - ERROR    - Error handling request
Traceback (most recent call last):
  File "/Users/jopemachine/.pyenv/versions/3.12.2/lib/python3.12/site-packages/aiohttp/web_protocol.py", line 452, in _handle_request
    resp = await request_handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jopemachine/.pyenv/versions/3.12.2/lib/python3.12/site-packages/aiohttp/web_app.py", line 543, in _handle
    resp = await handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jopemachine/Desktop/raftify/binding/python/examples/web_server_api.py", line 52, in put
    await raft_node.propose(message.encode())
                            ^^^^^^^^^^^^^^
AttributeError: 'SetCommand' object has no attribute 'encode'

If there is still a way to improve the SetCommand using dataclass, I would appreciate your advice. If there is room for improvement, I will reflect it immediately.
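
One possibility, sketched below, is that the AttributeError comes only from the bare dataclass having no encode method; a dataclass can still define methods. This version keeps encode and decode while letting the decorator generate `__init__`, `__repr__`, and `__eq__`. Whether the Rust side accepts it is untested here and is an assumption.

```python
import pickle
from dataclasses import asdict, dataclass


@dataclass
class SetCommand:
    """Simple key-value command; pickle is used for serialization as before."""

    key: str
    value: str

    def encode(self) -> bytes:
        # Serialize the fields as a plain dict, mirroring the original __dict__ dump.
        return pickle.dumps(asdict(self))

    @classmethod
    def decode(cls, packed: bytes) -> "SetCommand":
        return cls(**pickle.loads(packed))


command = SetCommand("1", "test")
restored = SetCommand.decode(command.encode())
```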

Comment on lines +8 to +15
set_confchange_context_deserializer,
set_confchangev2_context_deserializer,
set_entry_context_deserializer,
set_entry_data_deserializer,
set_fsm_deserializer,
set_log_entry_deserializer,
set_message_context_deserializer,
set_snapshot_data_deserializer,
Member

This is more of an upstream problem, but since we also hold the keys to the raftify project: why does every context have its own _deserializer() API function? Wouldn't it be better to merge these into one gateway function?

Member Author

I agree with your point. It would be more helpful for code readability to register callbacks all at once, rather than the current method.

I will make improvements in the PR below.
lablup/raftify#101

src/ai/backend/manager/scheduler/dispatcher.py Outdated Show resolved Hide resolved
src/ai/backend/manager/scheduler/dispatcher.py Outdated Show resolved Hide resolved
Comment on lines 46 to 49
class RaftNodeInitialRole(str, enum.Enum):
    LEADER = "leader"
    VOTER = "voter"
    LEARNER = "learner"
Member

If raftify or raft-rs define the behaviors for each role, how about having those roles in raftify?

Member Author

@jopemachine jopemachine May 10, 2024

In fact, in Raftify, the InitialRole type is already defined on the Rust side, and this type exists in the Python bindings as well.

The reason I define and use a separate Enum with the same contents here is that the binding type is not an actual Enum in Python.

For instance, the type-validating code below produces the following error.

from raftify import InitialRole
...
	t.Key("role", default=InitialRole.VOTER): tx.Enum(InitialRole),
  File "/home/jopemachine/backend.ai/dist/export/python/virtualenvs/python-default/3.12.2/lib/python3.12/site-packages/trafaret/base.py", line 1143, in transform
    for k, v, names in key(value, context=context):
  File "/home/jopemachine/backend.ai/dist/export/python/virtualenvs/python-default/3.12.2/lib/python3.12/site-packages/trafaret/base.py", line 972, in __call__
    result = self.trafaret(self.get_data(data, default), context=context)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jopemachine/backend.ai/dist/export/python/virtualenvs/python-default/3.12.2/lib/python3.12/site-packages/trafaret/base.py", line 152, in __call__
    return self.check(val, context=context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jopemachine/backend.ai/dist/export/python/virtualenvs/python-default/3.12.2/lib/python3.12/site-packages/trafaret/base.py", line 115, in check
    return self.check_and_return(value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jopemachine/backend.ai/src/ai/backend/common/validators.py", line 216, in check_and_return
    return self.enum_cls(value)
           ^^^^^^^^^^^^^^^^^^^^
TypeError: No constructor defined

There might be a better way to handle Enum types when writing PyO3 bindings.
The Python bindings of Raftify currently have several areas like this that could be improved.

Ref: lablup/raftify#91
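
Until the bindings change, one way to bridge the two types is to validate with the pure-Python enum and look up the binding constant by member name. This is a sketch under assumptions: `FakeInitialRole` is a stand-in for the binding type (only `InitialRole.VOTER` appears in the snippet above, so the other constants are assumed), and `to_binding_role` is a hypothetical helper.

```python
import enum


class RaftNodeInitialRole(str, enum.Enum):
    LEADER = "leader"
    VOTER = "voter"
    LEARNER = "learner"


class FakeInitialRole:
    """Stand-in for a binding type that exposes constants but no value constructor."""

    LEADER = "<binding LEADER>"
    VOTER = "<binding VOTER>"
    LEARNER = "<binding LEARNER>"


def to_binding_role(role: RaftNodeInitialRole, binding_cls: type = FakeInitialRole) -> object:
    # Look up the constant by member name instead of calling the type with a value.
    return getattr(binding_cls, role.name)


# Validation works on the pure-Python enum because it can be constructed from a value.
parsed = RaftNodeInitialRole("voter")
binding_role = to_binding_role(parsed)
```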

@jopemachine jopemachine modified the milestones: 24.03, 24.09 May 10, 2024
@github-actions github-actions bot modified the milestones: 24.09, 24.03 May 10, 2024
@jopemachine
Member Author

jopemachine commented May 10, 2024

The CI test is failing while downloading the raftify package; this should be fixed after setting up the raftify PyPI package deployment workflow.

@jopemachine jopemachine force-pushed the topic/05-03-feat_introduce_raftify_and_raftglobaltimer_to_replace_distributedglobaltimer branch from 1fec8fa to a585601 Compare September 2, 2024 07:27
@jopemachine jopemachine changed the title feat: Introduce Raftify and RaftGlobalTimer to replace DistributedGlobalTimer feat: Introduce Raftify and refactor DistributedGlobalTimer with Raft Sep 2, 2024
@jopemachine jopemachine changed the title feat: Introduce Raftify and refactor DistributedGlobalTimer with Raft feat: Introduce Raftify and refactor DistributedGlobalTimer with Raft Sep 2, 2024
@jopemachine jopemachine marked this pull request as draft November 27, 2024 04:11
@jopemachine jopemachine modified the milestones: 24.03, 25Q1 Nov 27, 2024
@jopemachine jopemachine changed the title feat: Introduce Raftify and refactor DistributedGlobalTimer with Raft feat(BA-75): Introduce Raftify and refactor DistributedGlobalTimer with Raft Jan 8, 2025
Labels
area:docs Documentations comp:cli Related to CLI component comp:common Related to Common component comp:manager Related to Manager component size:XL 500~ LoC type:feature Add new features urgency:3 Must be finished within a certain time frame.
Projects
None yet
3 participants