🐛 linguist-backend: cleanEntities iterates over every entity #6091

@scott-kausler

Workspace

linguist

📜 Description

As part of processing entities, the Linguist plugin calls a cleanEntities method which iterates over ALL entities instead of respecting the batchSize.

When there are hundreds or even thousands of entities to process, this results in one REST call per entity on every run, which is a lot of unnecessary traffic, particularly when the schedule frequency is the recommended 2 minutes.

👍 Expected behavior

Either:

  1. cleanEntities respects the batchSize like the other methods and only fetches that many entities
  2. cleanEntities makes a single call to get all entities
  3. cleanEntities operates on a different schedule that is configurable
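To make options 1 and 2 concrete, here is a minimal sketch of a cleanup pass that checks refs in chunks with one bulk lookup per chunk, so the request count is ceil(N / chunkSize) rather than N. The `BulkCatalog` and `RefStore` interfaces and all function names are hypothetical stand-ins, not the plugin's actual code. I believe the catalog client has a bulk `getEntitiesByRefs` call that could fill the lookup role, but I have not checked how it would fit into the plugin.

```typescript
// Hypothetical stand-in for a catalog client with a bulk lookup.
// Results are positional: found[i] corresponds to refs[i], undefined if missing.
interface BulkCatalog {
  getEntitiesByRefs(refs: string[]): Promise<(object | undefined)[]>;
}

// Hypothetical stand-in for the store that tracks processed entities.
interface RefStore {
  getAllEntityRefs(): Promise<string[]>;
  deleteEntity(ref: string): Promise<void>;
}

// Split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error('size must be at least 1');
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Return the refs whose positional lookup result is undefined.
function staleRefs(refs: string[], found: (object | undefined)[]): string[] {
  return refs.filter((_, i) => found[i] === undefined);
}

// Delete stored refs that are no longer in the catalog, issuing one
// catalog request per chunk instead of one per entity.
async function cleanEntitiesChunked(
  catalog: BulkCatalog,
  store: RefStore,
  chunkSize: number,
): Promise<string[]> {
  const removed: string[] = [];
  for (const refs of chunk(await store.getAllEntityRefs(), chunkSize)) {
    const found = await catalog.getEntitiesByRefs(refs);
    for (const ref of staleRefs(refs, found)) {
      await store.deleteEntity(ref);
      removed.push(ref);
    }
  }
  return removed;
}
```

With 500 stored entities and a chunk size of 100 this would make 5 catalog requests per run instead of 500. Strictly capping the work per run at batchSize (option 1) would additionally need some cursor so that successive runs do not keep checking the same entities.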

👎 Actual Behavior with Screenshots

cleanEntities iterates over every entity and makes a REST request to the catalog backend for each entity.

👟 Reproduction steps

  • Set up the linguist plugin
  • Set the batch size lower than the number of entities you are operating on
  • Observe through logs or tracing that linguist is still making a request on every entity
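If logs or tracing are not readily available, one rough way to confirm the request count is to wrap the per-entity lookup in a counter in a local test setup. This is only a sketch: the `Lookup` type and `countCalls` helper are hypothetical and not part of the plugin or the catalog client.

```typescript
// Hypothetical single-entity lookup signature, not the real catalog client type.
type Lookup = (entityRef: string) => Promise<unknown>;

// Wrap a lookup so every call is counted; count() reports the calls made so far.
function countCalls(lookup: Lookup): { lookup: Lookup; count: () => number } {
  let calls = 0;
  return {
    lookup: (entityRef: string) => {
      calls += 1;
      return lookup(entityRef);
    },
    count: () => calls,
  };
}
```

If the count after one scheduled run equals the total number of tracked entities rather than the configured batch size, that matches the behavior described in this issue.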

📃 Provide the context for the Bug.

In my case I have hundreds of entities for Linguist to process. I can see a REST request for each entity that matches the Linguist filter, even though the batch size is 20.

I haven't proven this, but I believe the large number of requests being made every 2 minutes is causing my health checks to fail, ultimately causing Backstage to restart.

👀 Have you spent some time to check if this bug has been raised before?

  • I checked and didn't find a similar issue

🏢 Have you read the Code of Conduct?

Are you willing to submit PR?

None
