feat: implement get chat completions APIs #2200

Open · wants to merge 1 commit into main from pr2200
Conversation

@ehhuang (Contributor) commented May 19, 2025:

What does this PR do?

Test Plan

    INFERENCE_MODEL="llama3.2:3b-instruct-fp16" llama stack build --template ollama --image-type conda --run
    LLAMA_STACK_CONFIG=http://localhost:5001 INFERENCE_MODEL="llama3.2:3b-instruct-fp16" python -m pytest -v tests/integration/inference/test_openai_completion.py --text-model "llama3.2:3b-instruct-fp16" -k 'inference_store and openai'

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label May 19, 2025
@ehhuang ehhuang force-pushed the pr2200 branch 2 times, most recently from ac9f40f to 096cd0c Compare May 19, 2025 05:24
@ehhuang ehhuang changed the title impl feat: implement get chat completions APIs May 19, 2025
@ehhuang ehhuang marked this pull request as ready for review May 19, 2025 05:38
@@ -76,6 +80,9 @@ async def get_auto_router_impl(api: Api, routing_table: RoutingTable, deps: dict
if dep_api in deps:
api_to_dep_impl[dep_name] = deps[dep_api]

if api.value == "inference" and run_config.inference_store:
Collaborator:
Suggested change
if api.value == "inference" and run_config.inference_store:
if api.value == Api.inference and run_config.inference_store:

if self.store:
return await self.store.list_chat_completions(after, limit, model, order)
else:
raise NotImplementedError("List chat completions not implemented")
Collaborator:
Is it an error, or just that the store is not configured? The config is optional, right?

ehhuang (Contributor, Author):

updated the message.

if self.store:
return await self.store.get_chat_completion(completion_id)
else:
raise NotImplementedError("Get chat completion not implemented")
Collaborator:

ditto


impl = SqliteInferenceStore(config.db_path)
else:
raise ValueError(f"Unknown inference store type {config.type}")
Collaborator:

Print the supported types as well?
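For illustration, the reviewer's suggestion could look roughly like this (a hedged sketch — the helper name and the `SUPPORTED_STORE_TYPES` registry are hypothetical; the diff only shows a sqlite-backed store):

```python
# Hypothetical sketch: include the supported store types in the error message
# so a misconfigured run config fails with an actionable hint. Only "sqlite"
# appears in the PR's diff; everything else here is an assumption.
SUPPORTED_STORE_TYPES = ["sqlite"]

def unknown_store_error(store_type: str) -> ValueError:
    supported = ", ".join(sorted(SUPPORTED_STORE_TYPES))
    return ValueError(
        f"Unknown inference store type {store_type}. Supported types: {supported}"
    )
```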

# Create directory if it doesn't exist
os.makedirs(os.path.dirname(self.conn_string), exist_ok=True)

async with aiosqlite.connect(self.conn_string) as conn:
Collaborator:

It seems like most of the code in this file already exists and is duplicated from other vector db implementations. Is there any chance to reuse some of those instead of adding another instance?

At a quick glance, only the table name would vary.

ehhuang (Contributor, Author):

The SQL queries and the surrounding code are the bulk of the logic; I'm not sure how much of the other code makes sense to reuse, as it's minimal. Let me know if you think otherwise.

@@ -117,6 +118,9 @@ def run_config(
__distro_dir__=f"~/.llama/distributions/{name}",
db_name="registry.db",
),
inference_store=SqliteInferenceStoreConfig.sample_run_config(
Collaborator:

Is it something we always want to enable?

ehhuang (Contributor, Author):

I think this is table-stakes functionality that users of Stack would reasonably expect, and it comes for free with the OpenAI platform. Wdyt?

@raghotham (Contributor) left a comment:

Why have a separate inference_store? Can't we just reuse the kvstore that's already in the distro?

@ehhuang (Contributor, Author) commented May 19, 2025:

> Why have a separate inference_store? Can't we just reuse the kvstore that's already in the distro?

My understanding is that kvstore doesn't support SQL-like queries? Are you suggesting augmenting those interfaces?

@ashwinb (Contributor) commented May 19, 2025:

> Why have a separate inference_store? Can't we just reuse the kvstore that's already in the distro?

yeah, agreed. I don't think we should introduce a specific inference store. We should add a "relational db store" or a "sql_store" (or some appropriate name) which allows us to target relational DBs like Postgres in general.

that said, we should also see whether we truly need SQL queries here and how much can be done via a decently designed schema around a KV store with ranges. (It is possible we truly need SQL support here.)

@ehhuang (Contributor, Author) commented May 19, 2025:

> Why have a separate inference_store? Can't we just reuse the kvstore that's already in the distro?

> yeah, agreed. I don't think we should introduce a specific inference store. We should add a "relational db store" or a "sql_store" (or some appropriate name) which allows us to target relational DBs like Postgres in general.
>
> that said, we should also see whether we truly need SQL queries here and how much can be done via a decently designed schema around a KV store with ranges. (It is possible we truly need SQL support here.)

Introduced a simple SqlStore API: llama_stack/providers/utils/sqlstore/api.py. PTAL

A protocol for a SQL store.
"""

async def select(self, sql: str, params: tuple | None = None) -> list[dict]:
Contributor:

I think it would be useful to add a tables() for introspection also.

Is RelationalStore a better name? (I don't think so but putting it out there.)

Contributor:

I think SQL as an interface is way too opaque and, later, way too difficult to optimize should the need arise. We shouldn't need to parse the SQL query (and/or find that one needs to sanitize it, and such). Instead I think we should do something like this:

    def insert(self, table: str, row: Mapping[str, Any]) -> Any: ...

    def fetch_one(self, table: str, primary_key: Any) -> Mapping[str, Any] | None: ...

    def fetch_all(
        self,
        table: str,
        where: Mapping[str, Any] | str | None = None,
        *,
        limit: int | None = None,
        offset: int = 0,
    ) -> list[Mapping[str, Any]]: ...

    def update(
        self, table: str, where: Mapping[str, Any] | str, values: Mapping[str, Any]
    ) -> int: ...

    def delete(self, table: str, where: Mapping[str, Any] | str) -> int: ...

The most important aspect is the first argument -- a table. Every query should be within the context of a table. We can keep or add the execute power only if we start needing joins.

Alternately, what happened with adopting the sqlalchemy API for this?

Contributor:

I definitely think opaque sql should be a no-no.

@ehhuang (Contributor, Author) commented May 20, 2025:

I was trying to keep the expressivity, but this makes more sense. Updated the commit.

There's now a sqlalchemy sql store that's configurable by engine type.

> Is RelationalStore a better name? (I don't think so but putting it out there.)

Don't have a strong opinion here either, can change if preferred.
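"Configurable by engine type" can reduce to choosing a SQLAlchemy connection URL. A hedged sketch of that idea (the function and the config shape are assumptions; `sqlite+aiosqlite` and `postgresql+asyncpg` are real SQLAlchemy async dialect names):

```python
# Hypothetical sketch: map an engine-type config value to a SQLAlchemy async
# URL. The helper name and supported engines are assumptions, not the PR's
# actual schema.
def engine_url(engine: str, db_path: str) -> str:
    if engine == "sqlite":
        return f"sqlite+aiosqlite:///{db_path}"
    if engine == "postgres":
        # Connection details (user, host, database) intentionally left as
        # placeholders here.
        return "postgresql+asyncpg://<user>@<host>/<db>"
    raise ValueError(f"Unknown engine type {engine}. Supported: sqlite, postgres")
```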

@ehhuang ehhuang requested a review from bbrowning as a code owner May 20, 2025 19:28
@ehhuang ehhuang force-pushed the pr2200 branch 3 times, most recently from c2d90bc to 9fb8299 Compare May 20, 2025 21:47
@@ -41,6 +41,7 @@ dependencies = [
"pillow",
"h11>=0.16.0",
"kubernetes",
"sqlalchemy[asyncio]>=2.0.41",
Contributor:

Should we only add this dependency if a distro actually includes a SQLAlchemy store?

@@ -61,6 +62,7 @@ rsa==4.9
setuptools==75.8.0
six==1.17.0
sniffio==1.3.1
sqlalchemy==2.0.41
ehhuang (Contributor, Author):

I think I need to move this to be part of the build script?
