httpx from scratch #1823

technillogue · 2024-07-24T18:40:13Z

as a prelude to #1813, we would like to cut an intermediate release that can be tested so we can discover bugs in a smaller surface area. here, we're postponing all of the async worker changes to only bring async runner and httpx, without supporting any kind of concurrency. this is slightly artificial but hopefully will reduce how many times we need to roll back #1813

this is perhaps an opportunity to fix the webhook and files related tests as well

async runner (async runner #1352)
replace requests with httpx and factor out clients (replace requests with httpx and factor out clients #1574, Fix upload logging #1707, fix upload redirect handling #1714)

nickstenning · 2024-07-25T09:55:16Z

pyproject.toml

 dependencies = [
  # intentionally loose. perhaps these should be vendored to not collide with user code?
  "attrs>=20.1,<24",
  "fastapi>=0.75.2,<0.99.0",
+  # we may not need http2


We almost certainly don't. What's the cost of including it vs not?

I think the only real motivation for this when it was written is using http2 for file uploads which did not go through localhost at the time. h2 is not a big library so I don't think this is terribly consequential either way

nickstenning · 2024-07-25T09:56:09Z

python/cog/logging.py

@@ -86,4 +86,5 @@ def setup_logging(*, log_level: int = logging.NOTSET) -> None:

    # Reconfigure log levels for some overly chatty libraries
    logging.getLogger("uvicorn.access").setLevel(logging.WARNING)
+    # FIXME: no more urllib3(?)


Does httpx log to a system logger for normal requests? If not, you can just remove this.

looks like it https://www.python-httpx.org/logging/

nickstenning · 2024-07-25T09:56:34Z

python/cog/predictor.py

    weights_url = os.environ.get("COG_WEIGHTS")
-    weights_path = "weights"
+    weights_path = "weights"  # this is the source of a bug isn't it?


What does this comment refer to, and does it need to be in this PR?

https://www.notion.so/replicate/Cog-Setup-Path-Problem-2fc41d40bcaf47579ccd8b2f4c71ee24

I think the rationale for including this here was that the behavior of CogPath.validate was changed, in that URLPath aka URLTempFile's validate method is now async and takes httpx.AsyncClient as an argument.

probably more problematically for this PR, it changes the behavior of setup paths altogether

that said, most of the changes included here are running in the child worker, which is supposed to be out of scope for this PR

besides downloading the weights twice as mentioned in that notion doc, there's a lot of code that refers to a bug that I don't quite understand

https://github.com/replicate/cog-flan-models/blob/f35f162f03b2559a755c5ffbe47a5f17009ab867/predict.py#L168-L170

https://github.com/replicate/cog-mpt-7b-storywriter-65k/blob/main/predict.py#L41-L43

https://github.com/replicate/cog-llama/blob/main/predict.py#L18-L20

https://github.com/replicate/cog-sdxl/blob/main/predict.py#L184-L185

https://github.com/replicate/cog-llama-template/blob/main/predict.py#L44-L46

python/cog/schema.py

nickstenning · 2024-07-25T10:01:30Z

python/cog/server/clients.py

+
+def webhook_headers() -> "dict[str, str]":
+    headers = common_headers()
+    auth_token = os.environ.get("WEBHOOK_AUTH_TOKEN")


I don't think this is needed any more.

it seems like it's still in main? https://github.com/replicate/cog/blob/main/python/cog/server/webhook.py#L73

python/cog/server/clients.py

nickstenning · 2024-07-25T10:05:29Z

python/cog/server/clients.py

+        return urlparse(final_url)._replace(query="").geturl()
+
+    # this previously lived in json.upload_files, but it's clearer here
+    # this is a great pattern that should be adopted for input files


What pattern is being referred to here?

recursively iterating through the entire object rather than only the top level

nickstenning · 2024-07-25T10:06:12Z

python/cog/server/clients.py

+        # # it would be kind of cleaner to make the default file_url
+        # # instead of skipping entirely, we need to convert to datauri
+        # if url is None:
+        #     return obj


What did this commented-out code do and why is it commented out?

nickstenning · 2024-07-25T10:09:13Z

python/cog/server/clients.py

+            yield chunk
+
+
+# there's a case for splitting this apart or inlining parts of it


I don't think HTTP clients and data structure manipulations should be smooshed together like this. We should extract these back into separate business logic and client management pieces.

The current approach makes it hard to test, hard to iterate on (e.g. TODO: upload concurrently would be far easier to achieve if we separated the processes of working out what we have to upload from the process of actually uploading it), and hard to understand.

I'm not entirely convinced, this is identical to json.upload_files. I am sympathetic to extracting some helper functions for data structure manipulations.

you could have a helper function yield each file that has to be uploaded, but then those have to go back into the data structure anyway, it doesn't seem like less complexity to me.

the previous setup had about five layers of indirection across four files

that said if you have a narrow proposal I would be open to that

python/cog/server/eventtypes.py

nickstenning · 2024-07-25T10:12:15Z

python/cog/server/runner.py

-            error_callback=handle_error,
-        )
-
+                raise  # we don't actually want to raise anymore but w/e


What does this mean?

python/cog/server/runner.py

nickstenning

I don't think this is really reviewable. The tests aren't passing, and I don't think the changeset is any fit state for us to merge to main in good conscience.

I appreciate that this probably isn't the review you'd like. As a reviewer, anything more than a few hundred lines is quite hard to review effectively unless a substantial portion of the changes are trivial. Reviewing a diff five times that size that's full of half-finished code (stuff commented out, comments suggesting that the code isn't ready for primetime) is just not something anyone is likely to be able to do effectively.

Let's take a step back here and rather than trying to adapt what we have on the async branch by cutting stuff away from it, let's take what we have learned from that branch and make a series of smaller and more incremental changes -- possibly starting afresh from current main -- to get us to where we want to be.

mattt · 2024-07-25T10:21:58Z

python/cog/server/clients.py

+        self, url: str, response: Dict[str, Any], event: WebhookEvent
+    ) -> None:
+        if Status.is_terminal(response["status"]):


event is unused. Should we drop it from the method signature entirely?

ah, right, this was moved up to make_webhook_sender

python/cog/server/clients.py

* have runner return asyncio.Task instead of AsyncFuture * make tests async and fix them * delete remaining runner thread code :) * review changes to tests and server (reverts commit 828eee9) Signed-off-by: technillogue <[email protected]>

* input downloads, output uploads, and webhooks are now handled by ClientManager, which persists for the lifetime of runner, allowing us to reuse connections, which may significantly help with large uploads. * although I was originally going to drop output_file_prefix, it's not actually hard to maintain. the behavior is changed now and objects are uploaded as soon as they're outputted rather than after the prediction is completed. * there's an ugly hack with uploading an empty body to get the redirect instead of making api time out from trying to upload an 140GB file. that can be fixed by implemented an MPU endpoint and/or a "fetch upload url" endpoint. * the behavior of the non-indempotent endpoint is changed; the id is now randomly generated if it's not provided in the body. this isn't strictly required for this change alone, but is hard to carve out. * the behavior of Path is changed significantly. see https://www.notion.so/replicate/Cog-Setup-Path-Problem-2fc41d40bcaf47579ccd8b2f4c71ee24 Co-authored-by: Mattt <[email protected]> * format * stick a %s on line 190 clients.py (#1707) * local upload server can be called cluster.local in addition to .internal (#1714) Signed-off-by: technillogue <[email protected]>

Based on the implementation in #1698 for sync cog. If the request to /predict contains headers `traceparent` and `tracestate` defined by w3c Trace Context[^1] then these headers are forwarded on to the webhook and upload calls. This allows observability systems to link requests passing through cog. [^1]: https://www.w3.org/TR/trace-context/ Signed-off-by: technillogue <[email protected]>

* Cast TraceContext into Mapping[str, str] to fix linter * Include prediction id upload request Based on #1667 This PR introduces two small changes to the file upload interface. 1. We now allow downstream services to include the destination of the asset in a `Location` header, rather than assuming that it's the same as the final upload url (either the one passed via `--upload-url` or the result of a 307 redirect response. 2. We now include the `X-Prediction-Id` header in upload request, this allows the downstream client to potentially do configuration/routing based on the prediction ID. This ID should be considered unsafe and needs to be validated by the downstream service. * Extract ChunkFileReader into top-level class --------- Co-authored-by: Mattt Zmuda <[email protected]>

Co-authored-by: Mattt <[email protected]> Signed-off-by: technillogue <[email protected]>

* optimize webhook serialization and logging * optimize logging by binding structlog proxies * fix tests --------- Signed-off-by: technillogue <[email protected]>

yorickvP · 2024-07-30T14:31:42Z

pyproject.toml

@@ -10,11 +10,13 @@ authors = [{ name = "Replicate", email = "[email protected]" }]
 license.file = "LICENSE"
 urls."Source" = "https://github.com/replicate/cog"

-requires-python = ">=3.7"
+requires-python = ">=3.8"


does this officially drop 3.7?

aron · 2024-10-16T09:22:23Z

@technillogue I think this is stale now right? Shall we close it and re-open when we pick it up again?

technillogue force-pushed the syl/httpx-from-scratch branch from 6fa0af9 to b1d4d18 Compare July 24, 2024 18:55

technillogue requested review from a team, aron and mattt July 24, 2024 19:01