Snippets orphan removal #2146

Conversation

kamil-bielecki-bosch

No description provided.

Contributor

@oheger-bosch oheger-bosch left a comment

One general remark: Scan results produced by ScanCode are shared between multiple ORT runs, so it makes sense to handle them by the orphan service.
For snippet results, this is not the case; they are exclusively assigned to a single ORT run. (Or in other words: For every ORT run that has snippet scanning enabled, another scan is started, and the results are stored.) So, would it be better to directly delete snippet results (if they exist) when the owning ORT run is deleted?
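The suggestion above can be illustrated with a minimal in-memory sketch. It is not ORT Server code: `SnippetResult`, `InMemoryStore`, and `deleteRun` are hypothetical names, and plain collections stand in for database tables. It only shows the idea that rows owned by exactly one run can be removed together with that run, so no later orphan sweep is needed for them.

```kotlin
// Hypothetical in-memory model; these names do not come from the ORT Server schema.
data class SnippetResult(val id: Int, val ortRunId: Int)

class InMemoryStore {
    val ortRuns = mutableSetOf<Int>()
    val snippetResults = mutableListOf<SnippetResult>()

    // Deletes the run and, in the same step, every snippet result it owns.
    // Returns the number of snippet results that were removed.
    fun deleteRun(runId: Int): Int {
        ortRuns.remove(runId)
        val before = snippetResults.size
        snippetResults.removeAll { it.ortRunId == runId }
        return before - snippetResults.size
    }
}

fun main() {
    val store = InMemoryStore()
    store.ortRuns += listOf(1, 2)
    store.snippetResults += listOf(
        SnippetResult(id = 10, ortRunId = 1),
        SnippetResult(id = 11, ortRunId = 1),
        SnippetResult(id = 12, ortRunId = 2)
    )

    val removed = store.deleteRun(1)
    println("removed=$removed remaining=${store.snippetResults.map { it.id }}")
}
```

In a real database the same effect would typically come from a foreign key with a cascading delete or from an explicit delete in the run-deletion code path; which of the two fits this project is not settled in this thread.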

@@ -23,75 +23,58 @@ import io.kotest.core.spec.style.WordSpec
import io.kotest.matchers.shouldBe
Contributor

Commit message: Please provide a rationale why this extraction is done.

Added.

)

// More orphan entries - should be deleted by removal process
createVcsInfoTableEntry(url = "to.delete3")

- VcsInfoTable.selectAll().count() shouldBe 11
+ VcsInfoTable.selectAll().count() shouldBe 17
Contributor

Why do some assertions have to be changed? This should not be the case if the fixtures were just moved to a separate class, right?

I've extended the commit message for this commit.
In short: the fixtures were moved to a separate class and aligned so that every fixture function accepts the full list of fields of the table it covers. The tests had to be adjusted accordingly.

* Default, unique values are generated for every field, to provide entry uniqueness.
* REMARK: The functions are not creating it's own transactions, so should be used in db.dbQuery { ... } blocks.
*/
internal object TableEntryTestFixtures {
Contributor

The name TableEntryTestFixtures is rather unspecific. Should it be named something like OrphanRemovalTestFixtures?

Name changed.

* Set of Database fixtures for testing purposes.
* Every function creates a new entry in the corresponding table and returns the id of the created entry.
* Default, unique values are generated for every field, to provide entry uniqueness.
* REMARK: The functions are not creating it's own transactions, so should be used in db.dbQuery { ... } blocks.
Contributor

"... their own transactions..."
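For readers unfamiliar with the fixture style described in the KDoc under review, here is a minimal sketch of the "unique default for every field" idea. It is an assumption-laden illustration, not the PR's code: `VcsRow`, `ExampleFixtures`, and `createVcsRow` are hypothetical, the column set is invented, and an in-memory list replaces the database table (so no transaction handling is shown).

```kotlin
import java.util.concurrent.atomic.AtomicLong

// Hypothetical row type; the real table's columns are not shown in this thread.
data class VcsRow(val id: Long, val type: String, val url: String, val revision: String)

object ExampleFixtures {
    private val sequence = AtomicLong()
    val vcsRows = mutableListOf<VcsRow>()

    // Produces a value that differs on every call, so default values never collide.
    private fun unique(prefix: String): String = "$prefix-${sequence.incrementAndGet()}"

    // Every field has a unique default; a test overrides only the fields it cares about.
    // Returns the id of the created entry, as the reviewed KDoc describes.
    fun createVcsRow(
        type: String = unique("type"),
        url: String = unique("url"),
        revision: String = unique("revision")
    ): Long {
        val id = sequence.incrementAndGet()
        vcsRows += VcsRow(id, type, url, revision)
        return id
    }
}

fun main() {
    val explicit = ExampleFixtures.createVcsRow(url = "to.delete3")
    val defaults = ExampleFixtures.createVcsRow()
    println(ExampleFixtures.vcsRows.filter { it.id == explicit || it.id == defaults })
}
```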

id notInSubQuery (
SnippetFindingsSnippetsTable
.select(SnippetFindingsSnippetsTable.snippetId.alias("id"))
.where(SnippetFindingsSnippetsTable.snippetId.isNotNull())
Contributor

This condition looks strange. Are you sure it is correct? Are snippets not handled by cascade rules?

SnippetFindingsSnippetsTable is a join table between the snippet_findings and snippets tables. Since a snippet can be referenced by more than one scan (at least from the DB perspective), it is not safe to delete snippets by cascade delete.
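The orphan check discussed here has the shape of an anti-join: keep only the snippets whose id does not appear in the join table. The following in-memory sketch shows that logic; `Snippet`, `FindingSnippetLink`, and `findOrphanSnippets` are hypothetical stand-ins, not the project's types. One possible reason for the `isNotNull()` filter, which the thread does not confirm, is that in SQL a `NULL` inside a `NOT IN` subquery result prevents the predicate from ever evaluating to true, so no rows would be selected at all.

```kotlin
// Hypothetical in-memory stand-ins for the snippets table and the join table.
data class Snippet(val id: Int)
data class FindingSnippetLink(val snippetFindingId: Int, val snippetId: Int?)

// A snippet counts as an orphan only if no link row references it.
fun findOrphanSnippets(snippets: List<Snippet>, links: List<FindingSnippetLink>): List<Snippet> {
    // Drop null ids first, analogous to the "is not null" condition in the subquery.
    val referenced = links.mapNotNull { it.snippetId }.toSet()
    return snippets.filter { it.id !in referenced }
}

fun main() {
    val snippets = listOf(Snippet(1), Snippet(2), Snippet(3))
    val links = listOf(
        FindingSnippetLink(snippetFindingId = 100, snippetId = 1),
        FindingSnippetLink(snippetFindingId = 101, snippetId = 1), // same snippet, second finding
        FindingSnippetLink(snippetFindingId = 102, snippetId = null)
    )
    println(findOrphanSnippets(snippets, links).map { it.id })
}
```

Snippet 1 is referenced by two findings in this example, which is the situation that makes a plain cascade from a single finding unsafe: deleting one finding must not remove a snippet that another finding still uses.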

@nnobelis
Contributor

nnobelis commented Mar 3, 2025

For snippet results, this is not the case; they are exclusively assigned to a single ORT run.

@oheger-bosch Is that true? I remember that in the past, snippets were assigned to the scan results, and an ORT run could have several scan results because they were duplicated.
Are you really saying that an ORT run can now have only one set of snippets? If so, is this because you removed the aforementioned duplication?

@oheger-bosch
Contributor

For snippet results, this is not the case; they are exclusively assigned to a single ORT run.

@oheger-bosch Is that true? I remember that in the past, snippets were assigned to the scan results, and an ORT run could have several scan results because they were duplicated. Are you really saying that an ORT run can now have only one set of snippets? If so, is this because you removed the aforementioned duplication?

No, this is not what I am saying. What I am saying is that a snippet result belongs to only a single ORT run; it is not shared between multiple runs. This ORT run might have multiple snippet results, though.

@kamil-bielecki-bosch kamil-bielecki-bosch force-pushed the snippets-orphan-removal branch 7 times, most recently from 7be405e to ec2e51c Compare March 6, 2025 21:58
@kamil-bielecki-bosch
Author

@oheger-bosch So, in that case, the cascade deletes should simply be extended from ort-runs up to at least the snippet_findings_snippets table, including all related entities like findings?
If so, what should be done with the orphaned entities that no longer relate to any ort-run (those whose runs were already deleted by the maintenance task)? A migration script with a delete?
Also, what is the point of the snippet_findings_snippets table if a snippet can never be found by more than one scan run?

@kamil-bielecki-bosch kamil-bielecki-bosch force-pushed the snippets-orphan-removal branch 2 times, most recently from 9542b49 to e67c030 Compare March 17, 2025 13:14
Kamil Bielecki added 2 commits March 26, 2025 11:29
To prevent the test class from growing further, the table fixtures
are extracted to a separate class. For flexibility, the method
parameter lists are extended to cover the full list of fields of
every table. Tests using the fixtures are aligned with the new
parameter lists.

Signed-off-by: Kamil Bielecki <[email protected]>
During deletion of old / outdated ORT runs, some child database
entities were left in the DB. To prevent this situation, which leads
to database performance issues, orphaned records are deleted.
Deleted record types:
- Scan results
- Scan summaries
- Snippet findings
- Snippets
- License findings
- Copyright findings

Signed-off-by: Kamil Bielecki <[email protected]>
@kamil-bielecki-bosch
Author

Obsolete, due to design changes

3 participants