Commit 614f48f

Merge pull request #8 from smartcomputer-ai/dist
Dist
2 parents dcfa1fc + 0f5bdc7 commit 614f48f

5 files changed

+158
-6
lines changed

src/grit/object_model_v2.py

+12-4
Original file line numberDiff line numberDiff line change
@@ -21,12 +21,20 @@
2121
MessageId = ObjectId
2222
Message = NamedTuple("Message",
2323
[('previous', MessageId | None), #if none, it's a signal, otherwise, a queue
24+
('prune', MessageId | None),
25+
#NEW: if set, previous is not allowed to be set; instead, the previous message has to be set here,
26+
# which might be pruned by grit (i.e., not available anymore)
2427
('headers', Headers | None),
25-
#('type', str), #NEW aka, "message_type"/"mt" -- is this a good idea, or should it remain part of the headers?
26-
# the pro is that the message types could be made more explicit in the object model here since the runtime inspects the message types substiantly (e.g., "genesis", "update", and, in the future "gc/garbage/disconnect")
28+
('type', str),
29+
#NEW aka, "message_type"/"mt" -- is this a good idea, or should it remain part of the headers?
30+
# the pro is that the message types could be made more explicit in the object model here since the runtime inspects the message types substantially (e.g., "genesis", "update", and, in the future "gc/garbage/disconnect")
2731
('content', BlobId | TreeId | ListId | None)]) #NEW with None option, because many messages are just a signal or a ping, and have no content
2832
MailboxId = ObjectId
29-
Mailbox = dict[ActorId, MessageId]
33+
34+
Mailbox = dict[tuple[ActorId, str | None], MessageId]
35+
#NEW: Channel name (str), to allow to send on multiple channels to an actor
36+
# if channel name is None then it is the "default channel"
37+
# ActorId can be either sender or receiver
3038

3139
StepId = ObjectId
3240
Step = NamedTuple("Step",
@@ -39,4 +47,4 @@
3947
Object = Blob | Tree | List | Message | Mailbox | Step
4048

4149

42-
50+
# TODO: in serialization, add grit/object model version header

tests/perf/perf_grid.py

+3-2
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@
1717
VERBOSE = False
1818
N_COLUMNS = 50
1919
N_ROWS = 50
20-
N_TEST_MESSAGES = 50
20+
N_TEST_MESSAGES = 200
2121

2222
count_init_messages = 0
2323
count_grid_messages = 0
@@ -150,6 +150,7 @@ async def perf_grid_run(store:ObjectStore, refs:References) -> None:
150150

151151
print(f"count_init_messages: {count_init_messages}")
152152
print(f"count_grid_messages: {count_grid_messages}")
153-
print(f"total processed in the grid: {N_COLUMNS * N_ROWS * N_TEST_MESSAGES}")
153+
total_grid_messages = N_COLUMNS * N_ROWS * N_TEST_MESSAGES
154+
print(f"total processed in the grid: {total_grid_messages}")
154155
assert count_init_messages == N_TEST_MESSAGES
155156
assert count_grid_messages == N_COLUMNS * N_TEST_MESSAGES

thinking/distributed.md

+63
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,63 @@
1+
2+
## Grit
3+
4+
- send objects
5+
- retrieve object
6+
- set refs (should that be here?, maybe just for runtime use)
7+
- multi-tenant support
8+
- stream larger objects
9+
10+
11+
## Worker
12+
13+
- figure out what manifests are supported on this worker
14+
- request work from orch. -> which actors to run
15+
- run (some) OS level actors (? or should they only exist once?, but that would mean they don't have an actor id... that would be ok)
16+
- timers (should the worker do that?)
17+
- send messages to orch. ... ideally send messages straight to other worker where actor is p2p. or route to local ones (could be future work)
18+
- route messages internally
19+
- backoff of delivering messages if actor does not accept (stop trying at some point? but then how to persist that mismatch)
20+
- listen to messages for its own actors
21+
- set step head for actors (when messages are sent off? or before?)
22+
- keep track of performance of actors and general resource usage, to message orch when re-balancing is needed
23+
- keep a local grit cache
24+
- run queries
25+
- ask the orch. for which Grit to use (if Grit is sharded)
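
The "backoff of delivering messages" item in the list above could be sketched as capped exponential backoff with a fixed number of attempts. This is only an illustration under assumptions: the function names, the default delays, and the idea of returning `False` so the caller can persist the undelivered message are all hypothetical.

```python
import time
from typing import Callable


def backoff_delays(base: float = 0.1, factor: float = 2.0,
                   cap: float = 5.0, max_attempts: int = 6) -> list[float]:
    """Delays grow exponentially from `base` and never exceed `cap`."""
    return [min(cap, base * factor ** attempt) for attempt in range(max_attempts)]


def deliver(try_deliver: Callable[[], bool], delays: list[float],
            sleep: Callable[[float], object] = time.sleep) -> bool:
    """Try once per delay entry; wait after each failed attempt."""
    for delay in delays:
        if try_deliver():
            return True
        sleep(delay)
    # Giving up: the caller has to decide how to persist the undelivered message.
    return False


# Usage with a fake sleep so the example does not actually wait.
outcomes = iter([False, False, True])  # the actor accepts on the third attempt
slept: list[float] = []
delivered = deliver(lambda: next(outcomes),
                    backoff_delays(1.0, 2.0, 5.0, 4),
                    sleep=slept.append)
```

Where the mismatch is recorded once delivery is abandoned is left open here, as it is in the notes.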
26+
27+
28+
## Orchestrator
29+
30+
- aware of all actors
31+
- aware of manifests of all actors
32+
- decide which actor to run on which worker
33+
- route messages between workers, as long as there is no p2p solution yet
34+
- re-route if worker goes down/offline
35+
- warn if no worker exists for manifest
36+
- restart actor messaging after complete shutdown (tradeoff: allow the worker to either set the step head before sending all messages, or make it wait. With the former, we can persist undelivered messages in the orch. and then start from there without re-analyzing the entire message state.)
37+
- initiate pruning
38+
- snapshot all heads for an agent (before updates), revert to certain snapshots
39+
- initiate updates
40+
- host web server (could be different service) and route queries and messages to workers
41+
42+
43+
## Structure
44+
45+
We'll implement the first version as a monolith that can be started with different settings.
46+
47+
All of it will be in python.
48+
49+
- protos
50+
- src
51+
- shared
52+
- protos
53+
- grit (interfaces, object model, serialization)
54+
- wit (interfaces, inner wit runner)
55+
- runtime (?, interfaces)
56+
- web
57+
- grit (grit server)
58+
- apex (orchestrator)
59+
- worker (runs actors)
60+
- inproc (in process runtime) (or "play", "reference", "inproc")
61+
- web (webserver)
62+
- cli (connects to apex and grit, or simple runtime)
63+

thinking/manifests.md

+15
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,15 @@
1+
# Manifests
2+
3+
## How to know what a wit needs to run?
4+
5+
We need a way to indicate to the runtime or worker what the wit/actor needs to run.
6+
7+
## Indicators
8+
- Language or language runtime
9+
- Library versions (semantically versioned)
10+
11+
I think that's it.
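
A manifest with these two indicators, and the check a worker might run against it, could look roughly like this. This is a sketch under assumptions: the `Manifest` shape, the function names, and the compatibility rule (same major version, installed version at least the required one) are illustrative and not part of the design above.

```python
from dataclasses import dataclass, field

Version = tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse a plain 'MAJOR.MINOR.PATCH' string (no pre-release tags handled)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


@dataclass(frozen=True)
class Manifest:
    runtime: str                                             # e.g. "python"
    libraries: dict[str, str] = field(default_factory=dict)  # name -> minimum version


def worker_supports(manifest: Manifest, worker_runtimes: set[str],
                    worker_libraries: dict[str, str]) -> bool:
    """True if the worker has the runtime and a compatible version of each library."""
    if manifest.runtime not in worker_runtimes:
        return False
    for name, required in manifest.libraries.items():
        installed = worker_libraries.get(name)
        if installed is None:
            return False
        need, have = parse_version(required), parse_version(installed)
        # Assumed rule: same major version, and not older than what is required.
        if have[0] != need[0] or have < need:
            return False
    return True


manifest = Manifest(runtime="python", libraries={"numpy": "1.24.0"})
```

A worker could run this check at startup to figure out which manifests it supports.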
12+
13+
## Structure
14+
15+
Where to save this information? We could re-use the wit structure for this. But it should probably be a separate file.

thinking/pruning.md

+65
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,65 @@
1+
# How to Prune (GC) Grit?
2+
3+
How do we prune the Grit DAG? This requires a garbage-collection-style system.
4+
5+
The main idea is pretty simple: grit needs to be extended so that the "previous" fields in messages or steps can be set to "None" while maintaining a "soft-link" to the history.
6+
7+
Here is what this could look like:
8+
```
Message = NamedTuple("Message",
    [('previous', MessageId | None),
     ('previous_pruned', MessageId | None), # only one of the two "previous..." fields may be set
     ('headers', Headers | None),
     ('type', str),
     ('content', BlobId | TreeId | None)])
```

If `previous_pruned` is set, `previous` is not allowed to be set. Although this maintains a historical link to the object ids that came before, they can be discarded by the garbage collector.
18+
19+
## Messages
20+
Maintaining a link to the history is important for the message object type because it allows an actor to send a pruned message list, giving the recipient a chance to also process the previous, now pruned messages before it accepts the message with the prune marker.
21+
22+
## Lifecycle
23+
The runtime would probably send "prune" signals to each actor when it is time to prune. But actors could also decide to prune messages or step histories on their own initiative. The mechanics would be the same:
24+
25+
1) Runtime sends "prune signal" via normal message
26+
2) Actor sends a pruned message to all or most of its outbox
27+
3) Actor also incorporates a prune marker in the new step
28+
4) (later and independently) receiver accepts the pruned messages in its inbox, completing the cycle for that message channel.
29+
5) Grit can now garbage collect the messages and steps that are not needed anymore
30+
31+
## Maintaining Some History
32+
33+
It would be nice if an actor could retain *some* history of what happened to it. That is, a prune request would not prune all the way to the present moment, but only up to a point a little further back.
34+
35+
How far back could be configurable (or part of the prune request signal).
36+
37+
How to do this?
38+
```
Message = NamedTuple("Message",
    [('previous', MessageId | None),
     ('prune_from', MessageId | None), # if set, prunes back from the message id specified here
     ('headers', Headers | None),
     ('type', str),
     ('content', BlobId | TreeId | None)])
```
In this case, `previous` would still always be set (if it is a message queue), and `prune_from` would indicate the message id in the history of previous messages from which to prune...
47+
48+
However, it is not certain that this is the best design. It requires a lot on the part of the actor. Alternatively, pruning could happen often, which would result in many pruning markers throughout the history, *and then the runtime or Grit decides what to actually prune.*
49+
50+
In the second design, the prune messages could also contain some sort of timestamp which allows grit to decide, but grit could maintain that timestamp too.
51+
52+
In the sleekest design, the message could just have a flag indicating whether pruning previous messages is allowed; everything else would stay the same:
53+
54+
```
Message = NamedTuple("Message",
    [('previous', MessageId | None),
     ('prune', bool), # if set, the messages before this one are allowed to be pruned
     ('headers', Headers | None),
     ('type', str),
     ('content', BlobId | TreeId | None)])
```
62+
63+
I'm not sure if there is a better way to indicate the prune marker... I think somehow the first design at the very top is better, because it makes the prune action much more explicit than a flag (branching mechanisms have to be introduced anyhow).
64+
65+
Finally, it could also be that we simply have Grit track the time and prune without any markers and/or involvement of the actors. But that would make it difficult to guess whether data or history is available or not, especially if the wit logic relies on the history (such as comparing how two objects changed over time). The actor would have no way to know why data is not available in Grit (although we could return a "pruned" object if it doesn't exist anymore, but then that would work like an additional null, which is bad; better to make it explicit).
