-
Notifications
You must be signed in to change notification settings - Fork 24
docs: ADR for an upload API targeted towards the browser #1554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really care about the state of any specific sbom ? I would think the only time we care is if/when there was a problem which might remove the need for an upload API ... so gets a LGTM in terms of design but I wonder if its worth the effort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I just added a minor observation about the state cleanup definition.
* When the backend finished processing the upload | ||
* It sets the final `state` (`failed` or `succeeded`) and the `result` | ||
* It stops updating the `updated` column | ||
* The backend cleans up (deletes) all entries with a "stale" `updated` timestamp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the clean up should be done by the client and not by the server. E.g.
- Client uploads file
- Client keeps monitoring
/api/v2/upload/{id}
every 5 seconds. - Once the client gets a valid response at
/api/v2/upload/{id}
then the client stop requesting/api/v2/upload/{id}
and deletes it
If the server decides to delete the upload state then the client might keep trying to fetch /api/v2/upload/{id}
and all of a sudden get a 404 because the server deleted it without the client knowing about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the client should be in charge of cleaning it up. There's a bunch of cases where the client won't be able to. So we'd either have a growing table of stale entries. Or we need to implement it anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recall one of the primary motivations for moving from V1 to V2 was that it was very confusing for the user to tell whether a document they uploaded ever got processed successfully.
This smells like backsliding into that same territory.
I'm not wholly against it, but I would like it to be designed from the user's perspective, and I would like real measurements and bug reports indicating that it's actually a problem we need to address.
For example, what's the size threshold beneath which the user has a perfectly satisfying experience uploading an SBOM today? How common is it that users attempt to upload files bigger than that threshold?
If we can determine that size, maybe the UX simply returns a helpful "exceeds size" message that instructs them how to use the dataset api instead.
Maybe all the suggestions in this ADR be applied to the dataset api, making it more robust and more user friendly?
I like our simple upload UX feature as it is. It's a 1000x better than V1, so I'd prefer to keep it simple for the common case, if possible.
And that's still the case. We are able to track this. And this API allows you doing that.
We got a bunch of JIRAs specifically for this already. That's the motivation.
I don't think the user should be involved in that. The UI should deal with this. There would be no difference from a user's perspective. For the user, it just works. (Compared to right now, where it just fails).
The ADR defines a new API. That could including uploading datasets as well. But that would just be another format type.
And it creates a lot of JIRAs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it! We also have a REST proposal, which makes easier to understand the final output of data.
I see there is a proposed endpoint DELETE /api/v2/upload/{id}
. Am I right interpreting it as the client being responsible of deleting Uploads? I think it makes sense but previously in another comment @ctron you gave me the impression that you didn't like that idea as it would generate a growing table of stale entries.
### REST API | ||
|
||
* `GET /api/v2/upload/{id}`: Get information about the upload | ||
|
||
Response (`200 OK`): | ||
|
||
```json5 | ||
{ | ||
"id": "opaque-unique-id", | ||
"state": "processing", // or failed, succeeded | ||
"updated": "2025-05-07T10:13:27Z", // always UTC, | ||
"result": {} // or absent for `processing`, `failed` | ||
} | ||
``` | ||
|
||
* `DELETE /api/v2/upload/{id}`: Delete the state record, will not receive further updates | ||
|
||
Response (`204 No Content`): Sent if found or if not found. | ||
|
||
* `POST /api/v2/upload`: Start an upload | ||
Request: | ||
* `format`: Format of the document, defaults to "auto-detect". Can also be `sbom` or `advisory`. | ||
|
||
Response (`202 Accepted`): | ||
|
||
```json5 | ||
{ | ||
"id": "opaque-unique-id", | ||
"format": "concrete-format" // e.g. "spdx" | ||
} | ||
``` | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the flow will be:
POST /api/v2/upload
. Generates response (202 - Accepted):
{
"id": "opaque-unique-id",
}
-
Then the client need to watch continuously the upload using.
GET /api/v2/upload/{id}
whereid
is theid
generated in the previous step. The response will be:{ "id": "opaque-unique-id", "state": "processing", // or failed, succeeded }
-
Finally, once the client wants to stop monitoring the upload the endpoint
DELETE /api/v2/upload/{id}
should be called.
I think that should work and cover all issues reported by QE.
On a side note
A crazy idea came to me while reading this ADR:
Would it be crazy to have an endpoint GET /api/v2/upload
that list all uploads (with pagination in place)?
Given the fact that we have the endpoint DELET /api/v2/upload/{id}
I guess the client is in charge of deleting Uploads. Then having a list of all Existing uploads would help to know which are the uploads that are pending to be cleared
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, that's the idea.
The downside with the enumeration endpoint is, that we'd need to somehow tie in authorization. Right now, we lack proper stuff anyway. The question is: why do we need it? Clearing up is a responsibility of the backend. I don't want to make the API more complex than we really need. If we do, ok. But let's wait for this use case.
I still don't like it 😁 However, I think it makes sense allowing the client to perform this action anyway. Kind of a "best effort'. But not as a "responsibility", but more as an optimization. Should the frontend/client be willing to clean up. Fine. But the backend would clean up in any case. Where the timeout for that should be configurable, and in the area of like 15mins or more. |
One other idea: when we initiate an upload, it can be optional/intentional the creation of an Upload instance
|
Also see: https://issues.redhat.com/browse/TC-2298