-
Notifications
You must be signed in to change notification settings - Fork 1k
feat: implement get chat completions APIs #2200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
ac9f40f
to
096cd0c
Compare
@@ -76,6 +80,9 @@ async def get_auto_router_impl(api: Api, routing_table: RoutingTable, deps: dict | |||
if dep_api in deps: | |||
api_to_dep_impl[dep_name] = deps[dep_api] | |||
|
|||
if api.value == "inference" and run_config.inference_store: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if api.value == "inference" and run_config.inference_store: | |
if api.value == Api.inference and run_config.inference_store: |
if self.store: | ||
return await self.store.list_chat_completions(after, limit, model, order) | ||
else: | ||
raise NotImplementedError("List chat completions not implemented") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it an error, or just that the store is not configured? The config is optional, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated the message.
if self.store: | ||
return await self.store.get_chat_completion(completion_id) | ||
else: | ||
raise NotImplementedError("Get chat completion not implemented") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
|
||
impl = SqliteInferenceStore(config.db_path) | ||
else: | ||
raise ValueError(f"Unknown inference store type {config.type}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Print the supported types as well?
# Create directory if it doesn't exist | ||
os.makedirs(os.path.dirname(self.conn_string), exist_ok=True) | ||
|
||
async with aiosqlite.connect(self.conn_string) as conn: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like most of the code in this file already exists and is duplicated from other vector db implementations. Is there any chance to reuse some of those instead of adding another instance?
At a quick glance, only the table name would vary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sql queries and the surrounding code is the bulk of the logic; I'm not sure how much of the other code makes sense to re-use, as it's minimal. Let me know if you think otherwise.
llama_stack/templates/template.py
Outdated
@@ -117,6 +118,9 @@ def run_config( | |||
__distro_dir__=f"~/.llama/distributions/{name}", | |||
db_name="registry.db", | |||
), | |||
inference_store=SqliteInferenceStoreConfig.sample_run_config( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it something we always want to enable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is good table-stake functionality that a users of Stack would reasonably expect and comes for free with OpenAI platform. Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have a separate inference_store? Cant we just reuse the kvstore that's already in the distro?
My understanding is that kvstore doesn't support SQL-like queries? Are you suggesting to augment those interfaces? |
yeah agreed. I don't think we should introduce a specific inference store. We should add a "relational db store" or a "sql_store" (or some appopriate name) which allows us to target Relational DBs like Postgres in general. that said, we should also see whether we truly need SQL queries here and how much can be done via decently designed schema around a KV store with ranges. (It is possible we truly need SQL support here.) |
Introduced a simple SqlStore API: llama_stack/providers/utils/sqlstore/api.py. PTAL |
A protocol for a SQL store. | ||
""" | ||
|
||
async def select(self, sql: str, params: tuple | None = None) -> list[dict]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be useful to add a tables()
for introspection also.
Is RelationalStore
a better name? (I don't think so but putting it out there.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think SQL as an interface is way too opaque and later, way too difficult to optimize should the need be. We shouldn't need to parse the SQL query (and / or find that one needs to sanitize it and such). Instead I think we should do something like this:
def insert(self, table: str, row: Mapping[str, Any]) -> Any: ...
def fetch_one(self, table: str, primary_key: Any) -> Mapping[str, Any] | None: ...
def fetch_all(
self,
table: str,
where: Mapping[str, Any] | str | None = None,
*,
limit: int | None = None,
offset: int = 0,
) -> list[Mapping[str, Any]]: ...
def update(
self, table: str, where: Mapping[str, Any] | str, values: Mapping[str, Any]
) -> int: ...
def delete(self, table: str, where: Mapping[str, Any] | str) -> int: ...
The most important aspect is the first argument -- a table. Every query should be within the context of a table. We can keep or add the execute
power only if we start needing joins.
Alternately, what happened with adopting the sqlalchemy
API for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitely think opaque sql
should be a no-no.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to keep the expressivity, but this makes more sense. Updated the commit.
There's now a sqlalchemy sql store that's configurable by engine type.
Is RelationalStore a better name? (I don't think so but putting it out there.)
Don't have a strong opinion here either, can change if preferred.
c2d90bc
to
9fb8299
Compare
@@ -41,6 +41,7 @@ dependencies = [ | |||
"pillow", | |||
"h11>=0.16.0", | |||
"kubernetes", | |||
"sqlalchemy[asyncio]>=2.0.41", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we only add this dependency if a distro actually includes a sqlalchemy store?
@@ -61,6 +62,7 @@ rsa==4.9 | |||
setuptools==75.8.0 | |||
six==1.17.0 | |||
sniffio==1.3.1 | |||
sqlalchemy==2.0.41 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I need to move this to be part of build
script?
What does this PR do?
Test Plan