[QUO-899][QUO-898] Support document metadata in Q-SDK by crekhari · Pull Request #81 · quotient-ai/quotient-python

crekhari · 2025-03-27T15:39:40Z

Example code with document metadata:

quotient_logger.log(
    user_query="How do I cook a goose?",
    model_output="The capital of France is Paris",
    documents=[
        "Here is an excellent goose recipe...",
        {"page_content": "Sample document", "metadata": {"source": "google"}},
        {"page_content": "Sample document2"}
    ]
)

Validation errors are shown via logger.

linear · 2025-03-27T15:39:43Z

QUO-899 Modify SDK document arg to accept a union of List of string and list of doc types

List of Doc type - should include page_content, and metadata properties

or a list of strings - what it currently is

class Document:
   content: str
   metadata: dict

List[Document]
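
A minimal sketch of the union type described in QUO-899, using only the standard library. The `Document` dataclass, the `DocumentInput` alias, and the `to_document` helper are hypothetical names for illustration, not the SDK's actual types. The text field is named `page_content` here to match the example payload in the PR description.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Union


@dataclass
class Document:
    """Hypothetical document type: text plus optional metadata."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


# The documents argument accepts plain strings, Document objects, or a mix.
DocumentInput = Union[str, Document]


def to_document(item: DocumentInput) -> Document:
    """Wrap a bare string in a Document so downstream code sees one shape."""
    if isinstance(item, Document):
        return item
    return Document(page_content=item)


docs: List[DocumentInput] = [
    "Here is an excellent goose recipe...",
    Document(page_content="Sample document", metadata={"source": "google"}),
]
normalized = [to_document(d) for d in docs]
```

Normalizing to one shape early keeps the existing list-of-strings call style working while letting the rest of the code assume every item carries a `metadata` dictionary.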

QUO-898 Add documents metadata to detections endpoint

Allow users to pass metadata about their documents (e.g. as a dictionary). For example - Tavily mentioned they want to report where documents are being retrieved from.

freddiev4

@crekhari I have a suggestion down below. Also can you make the equivalent PR in quotient-typescript since we have that now? Or ask @waldnzwrld for help

quotientai/client.py

freddiev4 · 2025-03-27T19:03:41Z

quotientai/client.py

+                        self.logger.error(f"Documents must be a list of strings or dictionaries with 'page_content' and optional 'metadata' keys. Metadata keys must be strings")
+                        return None
+                else:
+                    self.logger.error(f"Documents must be a list of strings or dictionaries with 'page_content' and optional 'metadata' keys, got {type(doc)}")


Something we need to keep in mind: we should not log these errors if there is nothing that the user can do about them, as that will be very confusing and noisy for people to see.

My understanding is these log if we get back basically invalid data from the API and there's a mismatch with what the client expects (?)

I would suggest we don't log anything here, and instead raise an exception with information about what was returned from the API and what the client expected, and suggest what the user should do about it.

Talked in office, and user errors should be surfaced as quickly and "loudly" as possible. Logging to stdout is not as loud as raising exceptions. We want this to be loud because it would suck if the user doesn't see the stdout error message because it's cluttered with all their other logs, and then wonders why logs are not being saved.
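
The "raise instead of log" approach discussed here could look roughly like the sketch below. `QuotientAIError` is defined locally as a stand-in for the SDK's exception class, and `validate_documents` is a hypothetical helper, not the code that was merged. The checks mirror the rules stated in the error message: strings, or dictionaries with `page_content` and an optional `metadata` dictionary whose keys are strings.

```python
from typing import Any, List


class QuotientAIError(Exception):
    """Local stand-in for the SDK's exception type (assumption)."""


def validate_documents(documents: List[Any]) -> None:
    """Raise on the first invalid document instead of logging and returning None."""
    for doc in documents:
        if isinstance(doc, str):
            continue
        if not isinstance(doc, dict):
            raise QuotientAIError(
                "Documents must be a list of strings or dictionaries with "
                f"'page_content' and optional 'metadata' keys, got {type(doc)}"
            )
        if not isinstance(doc.get("page_content"), str):
            raise QuotientAIError("Each document dictionary needs a string 'page_content'")
        metadata = doc.get("metadata")
        if metadata is not None and not (
            isinstance(metadata, dict) and all(isinstance(k, str) for k in metadata)
        ):
            raise QuotientAIError("'metadata' must be a dictionary with string keys")


# Valid input passes silently.
validate_documents(["a recipe", {"page_content": "Sample document2"}])

# Invalid input stops the caller immediately, so it cannot be missed in log noise.
try:
    validate_documents([42])
except QuotientAIError as exc:
    print(f"caught: {exc}")
```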

waldnzwrld · 2025-03-27T19:06:04Z

quotientai/resources/logs.py

 from dataclasses import dataclass
 from datetime import datetime
 import time
+from pydantic import BaseModel


Why are we importing pydantic here? All other classes in the SDK are instantiated without it.

@waldnzwrld had a quick chat in the office: it'd be good to use pydantic for client-side validation over the dataclasses. Dataclasses are useful but don't do type validation.
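
To illustrate the point about dataclasses, here is a small standard-library sketch. `LooseDocument` and `CheckedDocument` are hypothetical names. The first shows that dataclass annotations are not enforced at runtime; the second adds manual checks in `__post_init__`, which is roughly the kind of work pydantic derives from the annotations (pydantic may also coerce some values rather than reject them, depending on version and settings).

```python
from dataclasses import dataclass, field


@dataclass
class LooseDocument:
    """Plain dataclass: annotations are not enforced at runtime."""
    page_content: str
    metadata: dict = field(default_factory=dict)


@dataclass
class CheckedDocument:
    """Same fields, with hand-written type checks."""
    page_content: str
    metadata: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        if not isinstance(self.page_content, str):
            raise TypeError("page_content must be a string")
        if not isinstance(self.metadata, dict):
            raise TypeError("metadata must be a dictionary")


# The wrong type is accepted without complaint.
loose = LooseDocument(page_content=123)

# The checked variant rejects it at construction time.
try:
    CheckedDocument(page_content=123)
except TypeError as exc:
    print(f"rejected: {exc}")
```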

@crekhari can you make a separate PR to replace stuff with pydantic entirely?

It'd be better for us to replace everything in one go rather than doing it piecemeal.

Wholeheartedly agree that if we're going to do it, we should just go for it.

If nobody has opened an issue for it, I will

https://linear.app/quotient-ai/issue/QUO-971/use-pydantic-for-all-typing-in-python-sdk

Thx for opening the issue, I'll assign it to myself for now, but it's a lowish priority for me rn

waldnzwrld

Could you please update the tests in tests/resources/logs and tests/client to make sure your changes are behaving as expected?

crekhari · 2025-03-27T19:38:35Z

Could you please update the tests in tests/resources/logs and tests/client to make sure your changes are behaving as expected?

Will do!

freddiev4 · 2025-03-31T14:28:08Z

pytest.ini

@@ -1,2 +1,3 @@
 [pytest]
 addopts = -s
+asyncio_default_fixture_loop_scope = function


What's this for?

I get a deprecation warning when running tests (poetry run pytest --cov=quotientai tests/ --cov-report term-missing):

PytestDeprecationWarning: The configuration option "asyncio_default_fixture_loop_scope" is unset. The event loop scope for asynchronous fixtures will default to the fixture caching scope. Future versions of pytest-asyncio will default the loop scope for asynchronous fixtures to function scope. Set the default fixture loop scope explicitly in order to avoid unexpected behavior in the future. Valid fixture loop scopes are: "function", "class", "module", "package", "session"

Setting it explicitly makes it go away. Setting the event loop scope to function means a new event loop will be created for each test function that uses an async fixture.
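
As a rough illustration of what "function" scope means, the sketch below uses plain `asyncio` to give each simulated test its own event loop. This is not how pytest-asyncio is implemented; `current_loop` and `run_with_fresh_loop` are hypothetical helpers that only demonstrate the isolation between runs.

```python
import asyncio


async def current_loop() -> asyncio.AbstractEventLoop:
    """Return the loop this coroutine is running on."""
    return asyncio.get_running_loop()


def run_with_fresh_loop() -> asyncio.AbstractEventLoop:
    """Mimic function scope: create a loop, run one 'test' on it, then close it."""
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(current_loop())
    finally:
        loop.close()


first = run_with_fresh_loop()
second = run_with_fresh_loop()

# Each run got its own loop object, so no loop state leaks between runs.
print(first is second)
```

With a wider scope such as "session", async fixtures would instead share one loop across many tests.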

freddiev4 · 2025-03-31T14:29:08Z

quotientai/client.py

+                    except Exception as _:
+                        self._logger.error(f"Documents must be a list of strings or dictionaries with 'page_content' and optional 'metadata' keys. Metadata keys must be strings")
+                        return None
+                else:
+                    self._logger.error(f"Documents must be a list of strings or dictionaries with 'page_content' and optional 'metadata' keys, got {type(doc)}")


I think this still needs to be addressed? #81 (comment)

tests/test_async_client.py

Co-authored-by: Freddie Vargus <freddie@quotientai.co>

Tests are in place and looking good

waldnzwrld

Non-blocking, but it would be good to make sure that we have a happy-path case in the tests as well

waldnzwrld · 2025-03-31T15:15:08Z

tests/test_client.py

            mock_random.return_value = 0.6
            assert logger._should_sample() is False
+
+    def test_log_with_invalid_document_dict(self):


We should add a happy-path test (a test where the document data is valid) for both the sync and async clients
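
A happy-path pair of tests might be shaped like the sketch below. `build_payload` and `build_payload_async` are hypothetical stand-ins for the sync and async clients' log calls, and the sketch uses the standard library's `unittest` so it runs on its own; the repository's tests use pytest, so the real tests would look different.

```python
import asyncio
import unittest
from typing import Any, Dict, List


def build_payload(documents: List[Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for the sync client's log call."""
    for doc in documents:
        if not isinstance(doc, (str, dict)):
            raise ValueError(f"unsupported document type: {type(doc)}")
    return {"documents": documents}


async def build_payload_async(documents: List[Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for the async client's log call."""
    await asyncio.sleep(0)  # yield once, as a real network call would
    return build_payload(documents)


# Valid input: a bare string, a dict with metadata, and a dict without it.
VALID_DOCS = [
    "Here is an excellent goose recipe...",
    {"page_content": "Sample document", "metadata": {"source": "google"}},
    {"page_content": "Sample document2"},
]


class TestSyncHappyPath(unittest.TestCase):
    def test_valid_documents_are_accepted(self):
        self.assertEqual(build_payload(VALID_DOCS), {"documents": VALID_DOCS})


class TestAsyncHappyPath(unittest.IsolatedAsyncioTestCase):
    async def test_valid_documents_are_accepted(self):
        result = await build_payload_async(VALID_DOCS)
        self.assertEqual(result, {"documents": VALID_DOCS})
```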

freddiev4

LGTM, one request below, and then I think this is good to 🚢

freddiev4 · 2025-03-31T18:02:34Z

quotientai/async_client.py

+                        raise QuotientAIError("Documents must be a list of strings or dictionaries with 'page_content' and optional 'metadata' keys. Metadata keys must be strings")
+                else:
+                    raise QuotientAIError(f"Documents must be a list of strings or dictionaries with 'page_content' and optional 'metadata' keys, got {type(doc)}")


Can we make these more actionable? Like in TS
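
One way to read "more actionable" is a message that says where the problem is, what was received, and what to pass instead. The sketch below is only an illustration of that idea; `describe_invalid_document` is a hypothetical helper and the wording is not what was merged.

```python
from typing import Any


def describe_invalid_document(index: int, doc: Any) -> str:
    """Build a message that says what was wrong, where, and how to fix it."""
    return (
        f"Invalid document at index {index}: expected a string or a dictionary, "
        f"got {type(doc).__name__}. "
        "Pass each document as plain text, or as "
        "{'page_content': '...', 'metadata': {'key': 'value'}} "
        "where 'metadata' is optional."
    )


print(describe_invalid_document(2, 42))
```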

freddiev4 · 2025-03-31T18:02:59Z

After this gets merged, can you cut a release @crekhari? Do you know what the steps are?

crekhari added 2 commits March 26, 2025 23:26

suppport metas in sdk

12e1781

optional metas + type checking

c379737

crekhari added 5 commits March 27, 2025 14:28

pydantic

8074159

base_url

e3b3976

error msg to logger

c6a825d

update error msg

7245a9d

typos

0379bdf

crekhari requested review from freddiev4 and mike-goitia March 27, 2025 18:58

freddiev4 reviewed Mar 27, 2025

waldnzwrld reviewed Mar 27, 2025

waldnzwrld previously requested changes Mar 27, 2025

logger fixes

91975c0

pytest warning, async_client, code cov tests

ab6ebb0

crekhari requested review from freddiev4 and waldnzwrld March 31, 2025 13:54

freddiev4 reviewed Mar 31, 2025

crekhari and others added 2 commits March 31, 2025 10:34

Apply suggestions from code review

90f3d75

Co-authored-by: Freddie Vargus <freddie@quotientai.co>

raise exceptions

10f6a2a

waldnzwrld approved these changes Mar 31, 2025

happy paths

4da3a8e

crekhari requested a review from freddiev4 March 31, 2025 16:54

freddiev4 approved these changes Mar 31, 2025

crekhari added 2 commits March 31, 2025 14:13

actionable error msgs

d61a1d1

unused imports

bbcbbb1

crekhari merged commit 5646dc4 into main Mar 31, 2025
2 checks passed

3 participants
