[#221] non-distributed optimization algorithm #296

Merged Mar 18, 2025 (31 commits; diff shown from 28 of them)

Commits
- 3098601: optimization agente draft (marcocapozzoli, Feb 14, 2025)
- 9736392: optimizer v1 (marcocapozzoli, Feb 19, 2025)
- 5f110f1: change AttentionBroker service (marcocapozzoli, Feb 25, 2025)
- 210638a: Add ThreadPoolExecutor and refactor (marcocapozzoli, Feb 25, 2025)
- dd79bd2: Remove comments (marcocapozzoli, Feb 25, 2025)
- 1182098: Refactor and create main file (marcocapozzoli, Feb 26, 2025)
- 1e43286: change (marcocapozzoli, Feb 26, 2025)
- 8f472a1: delete file (marcocapozzoli, Feb 26, 2025)
- f0b652b: Merge branch 'master' into masc/221-non-distributed-optimization-algo… (marcocapozzoli, Feb 26, 2025)
- e8c43e2: distributed algorithm (marcocapozzoli, Mar 6, 2025)
- 62ef3a2: update config file (marcocapozzoli, Mar 6, 2025)
- d189f91: refactor methods (marcocapozzoli, Mar 6, 2025)
- 75dc588: refactor non distribuited version (marcocapozzoli, Mar 6, 2025)
- b34ff29: Create main file for distributed version (marcocapozzoli, Mar 6, 2025)
- d39876b: Code improvement (marcocapozzoli, Mar 6, 2025)
- 87a4ba5: Add docs (marcocapozzoli, Mar 6, 2025)
- c027252: rename directory (marcocapozzoli, Mar 7, 2025)
- ab05dd6: Apply some adjustments (marcocapozzoli, Mar 7, 2025)
- f5dfe70: Add unit tests V1 (marcocapozzoli, Mar 7, 2025)
- 903e927: Fix imports (marcocapozzoli, Mar 10, 2025)
- c5c4b43: improvement tests (marcocapozzoli, Mar 10, 2025)
- 2aa09ca: Merge branch 'master' into masc/221-non-distributed-optimization-algo… (marcocapozzoli, Mar 10, 2025)
- a60f417: Add BUILD files (marcocapozzoli, Mar 10, 2025)
- 9ce7d65: Fix service (marcocapozzoli, Mar 10, 2025)
- cb99fa8: Fix Readme (marcocapozzoli, Mar 10, 2025)
- 21054d7: Add Dockerfile to build the binary file (marcocapozzoli, Mar 11, 2025)
- 1c10fc2: Add bin file to project build flow (marcocapozzoli, Mar 11, 2025)
- 3c8de66: Improve the README (marcocapozzoli, Mar 11, 2025)
- 2bcf588: Adjustments applied based on PR feedback (marcocapozzoli, Mar 17, 2025)
- 1eeb972: Adjust dependencies (marcocapozzoli, Mar 17, 2025)
- 8159bb9: Roulette does not allow the same individual more than once (marcocapozzoli, Mar 17, 2025)
3 changes: 3 additions & 0 deletions .gitignore

```diff
@@ -196,3 +196,6 @@ tags
 
 src/bin
 src/bazel**
+
+.nogit/
+.vscode/
```
3 changes: 3 additions & 0 deletions Makefile

```diff
@@ -49,6 +49,9 @@ run-inference-agent:
 run-inference-agent-client:
 	@bash -x src/scripts/run.sh inference_agent_client $(OPTIONS)
 
+run-evolution:
+	src/bin/evolution $(OPTIONS)
+
 setup-nunet-dms:
 	@bash -x src/scripts/setup-nunet-dms.sh
 
```
57 changes: 57 additions & 0 deletions src/evolution/BUILD

```
load("@rules_python//python:defs.bzl", "py_library")
load("@pypi//:requirements.bzl", "requirement")

package(default_visibility = ["//visibility:public"])

filegroup(
    name = "py_files",
    srcs = glob(["*.py"]),
)

filegroup(
    name = "py_all_files",
    srcs = glob(["**/*.py"]),
)

py_library(
    name = "evolution",
    srcs = [":py_all_files"],
    deps = [
        ":fitness_functions",
        ":selection_methods",
        ":utils",
        ":optimizer",
        "//evolution/das_node",
    ],
)

py_library(
    name = "fitness_functions",
    srcs = ["fitness_functions.py"],
    deps = [],
)

py_library(
    name = "selection_methods",
    srcs = ["selection_methods.py"],
    deps = [],
)

py_library(
    name = "utils",
    srcs = ["utils.py"],
    deps = [],
)

py_library(
    name = "optimizer",
    srcs = ["optimizer.py"],
    deps = [
        ":fitness_functions",
        ":selection_methods",
        ":utils",
        "//evolution/das_node:das_node",
        "//evolution/das_node:query_answer",
    ],
)
```
33 changes: 33 additions & 0 deletions src/evolution/Dockerfile

```
FROM ubuntu:22.04

ARG DAS_QUERY_ENGINE_BRANCH="masc/change-service-in-grpc"

RUN apt-get update && apt-get install -y \
    python3.10 \
    python3.10-distutils \
    python3.10-venv \
    python3-pip \
    binutils \
    && rm -rf /var/lib/apt/lists/*

RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.10 1

WORKDIR /app

ENV PYTHONPATH=/app

RUN apt-get update && apt-get install -y git

RUN git clone https://github.com/singnet/das-query-engine.git
RUN cd das-query-engine && git checkout ${DAS_QUERY_ENGINE_BRANCH} && mv hyperon_das/ ../hyperon_das
RUN pip3 install --no-cache --upgrade pip hyperon_das_atomdb requests grpcio google protobuf pyinstaller

COPY asset/hyperon_das_node-0.0.1-cp310-abi3-manylinux_2_28_x86_64.whl /app/
RUN pip3 install /app/hyperon_das_node-0.0.1-cp310-abi3-manylinux_2_28_x86_64.whl

COPY . /app/evolution

COPY build.sh .
RUN chmod +x build.sh

CMD ["./build.sh"]
```

Review comment (Collaborator) on the `binutils \` line, suggested change (install git in the same apt-get layer):

```diff
     binutils \
+    git \
```

Review comment (Collaborator) on the `RUN apt-get update && apt-get install -y git` line, suggested change (drop the extra apt-get layer):

```diff
-RUN apt-get update && apt-get install -y git
```
56 changes: 56 additions & 0 deletions src/evolution/README.md

# Non Distributed Query Optimizer

## How to build

To build, run the following command from the project root:

```bash
make build-all
```

This will generate the binary file in the `das/src/bin` directory.

## Usage

Run the optimizer using the main script. The command-line arguments specify the configuration file and the query tokens to optimize.

#### Config file example

```
mongo_hostname = "localhost"
mongo_port = 27017
mongo_username = "root"
mongo_password = "root"
redis_hostname = "localhost"
redis_port = 6379
redis_cluster = False
redis_ssl = False
query_agent_node_id = "localhost:31701"
query_agent_server_id = "localhost:31700"
attention_broker_server_id = "localhost:37007"
max_generations = 2
max_query_answers = 5
population_size = 500
qtd_selected_for_attention_update = 100
selection_method = "roulette"
fitness_function = "multiply_strengths"
```
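
For background on the `selection_method` and `fitness_function` values above: "roulette" is fitness-proportionate selection, and one of the PR's commits notes that the roulette does not allow the same individual more than once. The sketch below is illustrative only; the function names, the shape of the data, and the reading of "multiply_strengths" are assumptions, not the code added by this PR.

```python
import random

def roulette_select(population: list, fitnesses: list[float], k: int) -> list:
    """Fitness-proportionate ("roulette") selection of k distinct individuals.

    Illustrative sketch only: it mirrors the rule mentioned in this PR that the
    roulette cannot pick the same individual more than once.
    """
    selected = []
    candidates = list(zip(population, fitnesses))
    for _ in range(min(k, len(candidates))):
        total = sum(fitness for _, fitness in candidates)
        pick = random.uniform(0, total)
        cumulative = 0.0
        for index, (individual, fitness) in enumerate(candidates):
            cumulative += fitness
            if cumulative >= pick:
                selected.append(individual)
                del candidates[index]  # without replacement: cannot be picked again
                break
    return selected

def multiply_strengths(strengths: list[float]) -> float:
    """Hypothetical reading of the "multiply_strengths" fitness: product of strengths."""
    fitness = 1.0
    for strength in strengths:
        fitness *= strength
    return fitness
```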

#### Running the client

```bash
make run-evolution OPTIONS='--config-file /path/to/config.cfg --query-tokens "LINK_TEMPLATE Evaluation 2 NODE Type Name VARIABLE V1"'
```

**Parameters:**

- `--query-tokens`: The query string that will be optimized.
- `--config-file`: Path to the configuration file (default is `config.cfg`).

If successful, you should see a message like this:

```
Starting optimizer
Processing generation 1/n
Results:
['QueryAnswer<1,1> [296bfeeb2ce5148d78f371d0ddf395b2] {(V2: f9a98ccf36f4ba3dcfe2fc99243546fa)} 0.0029751847', 'QueryAnswer<1,1> [2f21f35e3936307c29367adf41aec59e] {(V2: a7d045ace9ea9f9ecbc9094a770cae50)} 0.0025941788', 'QueryAnswer<1,1> [9d6fe9c68e5b29a1b4616ef582e075a3] {(V2: c04cafa6ca7f157321547f4c9ff4bb39)} 0.0025350566', 'QueryAnswer<1,1> [15e8247142c5a46b6079d9df9ea61833] {(V2: f54acd64cd8541c2125588e84da17288)} 0.0024081622', 'QueryAnswer<1,1> [a9335106b2ab5e652749769b72b9e29c] {(V2: 074bd74b5b8c2c87777bf57696ec3edd)} 0.0023660637']
```
Empty file added src/evolution/__init__.py
Binary file added (contents not shown)
7 changes: 7 additions & 0 deletions src/evolution/build.sh

```bash
#!/bin/bash

pyinstaller --onefile ./evolution/main.py
mkdir -p bin
mv ./dist/main ./dist/evolution
mv ./dist/evolution ./bin/evolution
rm -rf build/ __pycache__
```
Comment on lines +3 to +7

Collaborator:

Is this creating a wheel file?

Please check how the other modules are created using a bazel target:

```
py_wheel(
    name = "hyperon_das_wheel",
    abi = "none",
    author = "Andre Senna",
    author_email = "[email protected]",
    classifiers = [
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.10",
        "Programming Language :: Python :: 3.11",
        "Programming Language :: Python :: 3.12",
        "Programming Language :: Python :: 3.13",
    ],
    description_content_type = "text/markdown",
    description_file = "README.md",
    distribution = "hyperon_das",
    platform = "any",
    python_requires = ">=3.10",
    python_tag = "py3",
    requires_file = "//deps:requirements_lock.txt",
    stamp = 1,
    summary = "Query Engine API for Distributed AtomSpace",
    version = "$(DAS_VERSION)",  # must be defined when calling `bazel build` with `--define=DAS_VERSION=<version>`
    deps = [":hyperon_das"],
)
```

Collaborator Author:

This does not create a wheel file.

Collaborator (@angeloprobst), Mar 17, 2025:

Oh, got it. It's a way of creating a kind of a binary which embeds a python application along with its dependencies, allowing this (das/Makefile, lines 52 to 53 in 2bcf588):

```
run-evolution:
	src/bin/evolution $(OPTIONS)
```

I suppose this is temporary and will be replaced by a wheel file once it is considered ready/done, right?

Collaborator Author:

> It's a way of creating a kind of a binary which embeds a python application along with its dependencies
> I suppose this is temporary and will be replaced by a wheel file once it is considered ready/done, right?

It's a way to test the algorithm in a simpler way. This binary probably won't exist, but I don't know if we would have a wheel file up front. In short, yes, this is temporary.

19 changes: 19 additions & 0 deletions src/evolution/config.example.cfg

```
context = "das-poc"
mongo_hostname = "localhost"
mongo_port = 27017
mongo_username = "root"
mongo_password = "root"
redis_hostname = "localhost"
redis_port = 6379
redis_cluster = False
redis_ssl = False
query_agent_node_id = "localhost:31701"
query_agent_server_id = "localhost:31700"
attention_broker_server_id = "localhost:37007"
number_nodes = 5
max_generations = 2
max_query_answers = 30
population_size = 500
qtd_selected_for_attention_update = 100
selection_method = "roulette"
fitness_function = "multiply_strengths"
```
69 changes: 69 additions & 0 deletions src/evolution/das_node/BUILD

```
load("@rules_python//python:defs.bzl", "py_library")
load("@pypi//:requirements.bzl", "requirement")

package(default_visibility = ["//visibility:public"])

filegroup(
    name = "py_files",
    srcs = glob(["*.py"]),
)

py_library(
    name = "das_node",
    srcs = ["das_node.py"],
    deps = [
        ":remote_iterator",
        ":star_node",
    ],
)

py_library(
    name = "query_answer",
    srcs = ["query_answer.py"],
    deps = [],
)

py_library(
    name = "query_element",
    srcs = ["query_element.py"],
    deps = [],
)

py_library(
    name = "query_node",
    srcs = ["query_node.py"],
    deps = [
        ":query_answer",
        ":shared_queue"
    ],
)

py_library(
    name = "remote_iterator",
    srcs = ["remote_iterator.py"],
    deps = [
        ":query_answer",
        ":query_element",
        ":query_node",
    ],
)

py_library(
    name = "shared_queue",
    srcs = ["shared_queue.py"],
    deps = [],
)

py_library(
    name = "star_node",
    srcs = ["star_node.py"],
    deps = [],
)
```
Empty file.
57 changes: 57 additions & 0 deletions src/evolution/das_node/das_node.py

```python
from hyperon_das_node import Message

from evolution.das_node.remote_iterator import RemoteIterator
from evolution.das_node.star_node import StarNode


class DASNode(StarNode):
    local_host: str
    next_query_port: int
    first_query_port: int
    last_query_port: int
    PATTERN_MATCHING_QUERY = "pattern_matching_query"

    def __init__(self, node_id: str = None, server_id: str = None):
        super().__init__(node_id, server_id)
        self.initialize()

    def pattern_matcher_query(
        self, tokens: list, context: str = "", update_attention_broker: bool = False
    ):
        if self.is_server:
            raise ValueError("pattern_matcher_query() is not available in DASNode server.")
        query_id = self.next_query_id()
        args = [query_id, context, "true" if update_attention_broker else "false"] + tokens
        self.send(DASNode.PATTERN_MATCHING_QUERY, args, self.server_id)
        return RemoteIterator(query_id)

    def next_query_id(self) -> str:
        port = self.next_query_port
        limit = 0
        if self.is_server:
            limit = (self.first_query_port + self.last_query_port) // 2 - 1
            if self.next_query_port > limit:
                self.next_query_port = self.first_query_port
        else:
            limit = self.last_query_port
            if self.next_query_port > limit:
                self.next_query_port = (self.first_query_port + self.last_query_port) // 2

        query_id = f"{self.local_host}:{port}"
        self.next_query_port += 1
        return query_id

    def message_factory(self, command: str, args: list) -> Message:
        message = super().message_factory(command, args)
        if message:
            return message
        return None

    def initialize(self):
        self.first_query_port = 60000
        self.last_query_port = 61999
        self.local_host = self.node_id().split(":")[0]  # Extracting the host part of node_id
        if self.is_server:
            self.next_query_port = self.first_query_port
        else:
            self.next_query_port = (self.first_query_port + self.last_query_port) // 2
```
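
For a sense of how this client class is used, here is a minimal usage sketch. It is illustrative only: the addresses mirror the config example above, the token string comes from the README, and consuming the returned iterator depends on the RemoteIterator API, which is not shown in this diff.

```python
# Minimal usage sketch (not part of this PR); addresses and tokens are placeholders.
from evolution.das_node.das_node import DASNode

tokens = "LINK_TEMPLATE Evaluation 2 NODE Type Name VARIABLE V1".split()

# Client node id and query agent server id, as in config.example.cfg.
client = DASNode(node_id="localhost:31701", server_id="localhost:31700")

# Sends a PATTERN_MATCHING_QUERY message to the server and returns a RemoteIterator
# bound to a freshly allocated "host:port" query id taken from the client half of
# the 60000-61999 port range (see next_query_id()).
results = client.pattern_matcher_query(tokens, context="das-poc")

# Iterating over `results` depends on RemoteIterator, which is not shown in this diff.
```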