Description
There are several issues that affect endpoints that can (optionally) return a `Response` object:
- If they are annotated to return a `Response`, on a cache hit there is an exception:

      @app.get("/cache_response_obj")
      @cache(namespace="test", expire=5)
      async def cache_response_obj() -> JSONResponse:
          return JSONResponse({"a": 1})

  triggers a `RuntimeError: no validator found for <class 'starlette.responses.JSONResponse'>, see arbitrary_types_allowed in Config` exception as Pydantic can't handle response objects.

- Even when not annotated, only `JSONResponse` objects are handled, but badly. They are explicitly unwrapped by `JsonCoder.encode()`, and so on a cache hit their `Content-Type: application/json` header is lost, as is any status code other than 200 and any other headers added.

- The `PickleCoder.encode` method special-cases `Jinja2Templates.TemplateResponse` objects, unwrapping those and so losing headers and the status code, too. The unwrapping was done because the class has a `template` and a `context` attribute and these can contain unpickleable objects. This could be handled better by replacing the object with a regular response object.

There are two options here:
- we could disable the cache if any `Response` object is returned from the decorated endpoint. This is not a popular option given that JSON and template responses are going to be common, at the very least.
- we could special-case responses, replacing them with a serialisable wrapper. This wrapper can store the headers, status code and the contained body so that on a cache hit, we can reconstruct the response.

For the second option we should cover the other response types as well, which basically means storing their status code, headers and (encoded) body.
However, `StreamingResponse` and `FileResponse` can't be cached in this project, full stop, because they represent dynamic content. These can be distinguished by their lack of a `body` attribute. We also can't support caching the `background` attribute: cached responses won't trigger new background tasks. The attribute should be retained on fresh responses, however.
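
As a rough illustration of the `body`-attribute check described above, here is a minimal sketch. The two stand-in classes and the `is_cacheable` helper are hypothetical names invented for this example. They only mimic the one property that matters here (presence or absence of `body`) and are not the real Starlette classes.

```python
from typing import Any, List


class BufferedStandIn:
    """Hypothetical stand-in for a regular response: the rendered body is held in memory."""

    def __init__(self, body: bytes) -> None:
        self.body = body


class StreamingStandIn:
    """Hypothetical stand-in for a streaming or file response: there is no body attribute."""

    def __init__(self, chunks: List[bytes]) -> None:
        self.body_iterator = iter(chunks)


def is_cacheable(resp: Any) -> bool:
    # Hypothetical helper: treat a response as cacheable only if it exposes `body`.
    return hasattr(resp, "body")
```

A cache decorator could call such a check before wrapping a response and simply pass uncacheable responses through untouched.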
My current thinking is to process responses separately, replacing them with a custom `CacheableResponseWrapper` dataclass:

    from dataclasses import dataclass
    from typing import List, Tuple

    from starlette.responses import Response
    from typing_extensions import Self  # typing.Self on Python 3.11+


    # The generated __init__ is required: from_response() calls cls(...) with positional arguments.
    @dataclass
    class CacheableResponseWrapper:
        # str to facilitate clean JSON encoding support, decoded with UTF-8 + surrogate escapes
        str_body: str
        status_code: int
        # str to facilitate clean JSON encoding support, decoded as Latin-1.
        raw_str_headers: List[Tuple[str, str]]

        @classmethod
        def from_response(cls, resp: Response) -> Self:
            try:
                body = resp.body
            except AttributeError:
                # StreamingResponse and FileResponse have no body attribute
                raise TypeError(f"Unsupported dynamic Response type: {type(resp)}") from None
            headers = [(name.decode('latin1'), value.decode('latin1')) for name, value in resp.raw_headers]
            return cls(body.decode('utf8', 'surrogateescape'), resp.status_code, headers)

        @property
        def response(self) -> Response:
            # Rebuild a plain Response, then restore the cached headers verbatim
            result = Response(self.str_body.encode('utf8', 'surrogateescape'), self.status_code)
            result.raw_headers = [(name.encode('latin1'), value.encode('latin1')) for name, value in self.raw_str_headers]
            return result

- Using a dataclass makes this both JSON-encodable and picklable.
- Decoding the header values as Latin-1 means they can be stored as strings, which is important for the `JsonCoder` path, which would otherwise treat bytes values as UTF-8. Latin-1 is a codec that always succeeds, is reversible, and is already the codec used by the `Response` implementation.
- I've picked decoding the body as UTF-8 with the `surrogateescape` error handler because that would be the more efficient choice for the majority of response values, which I would expect to be text (JSON, HTML templates, etc). The `surrogateescape` handler allows you to 'smuggle' any byte sequence that is not UTF-8 into the resulting string as surrogate codepoints, which are codepoints not normally found in UTF-8 text (they are reserved for UTF-16 encodings). These codepoints can be re-encoded to their original bytes by using the same error handler when encoding.
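
The round-trip properties described above can be checked with the standard library alone. This is a small sketch; the sample body and header values are made up for illustration and are not taken from the project.

```python
import json
import pickle

# Made-up body: valid JSON text followed by two bytes that can never occur in UTF-8.
body = b'{"a": 1}\xff\xfe'

# The invalid bytes become lone surrogates (U+DCFF, U+DCFE) instead of raising an error.
text = body.decode("utf8", "surrogateescape")

# Encoding with the same error handler restores the original bytes exactly.
assert text.encode("utf8", "surrogateescape") == body

# The resulting str survives a JSON round trip (default ensure_ascii=True) and a pickle round trip.
assert json.loads(json.dumps(text)) == text
assert pickle.loads(pickle.dumps(text)) == text

# Latin-1 maps each byte 0-255 to the code point with the same value,
# so decoding never fails and encoding reverses it.
raw_header = (b"x-demo", bytes(range(256)))
name, value = (part.decode("latin1") for part in raw_header)
assert (name.encode("latin1"), value.encode("latin1")) == raw_header
```

Note that the JSON round trip relies on the surrogates being written as `\uXXXX` escapes. An encoder configured to emit raw UTF-8 would need to handle lone surrogates separately.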