Skip to content

Crawler only and prioritization #207

Draft
oliviagolden0 wants to merge 23 commits intomasterfrom
crawl-rule
Draft

Crawler only and prioritization #207
oliviagolden0 wants to merge 23 commits intomasterfrom
crawl-rule

Conversation

@oliviagolden0
Copy link
Collaborator

@oliviagolden0 oliviagolden0 commented May 6, 2025

Issue https://github.ibm.com/alchemy-containers/armada-ballast/issues/3593

ENSURE THE FOLLOWING ARE MET:

Below is a brief summary of PR requirements (full list).

  • Your PR title summarizes the changes at a high level (not simply "updated X" or "changes Y")
  • Your PR description links to the issue/design. If not available, includes a full description for the change.
  • Your code contains relevant unit tests.

By opening this PR for review, the author has agreed that these criteria must be met.

By approving this PR, the reviewers have also agreed these criteria have been met and it is ready to be merged.

rules/options.go Outdated

// HighPriority places priority on fields
// that are associated with a rule during
// a crawler run
Copy link
Collaborator

@kramvan1 kramvan1 May 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to mention that normal watcher processing is also done for these rules.
Do we want to consider an uint here, and let the use specific a specific priority order? 1-n, 1 being the highest? Would allow them to "group" priority 1 rules or which ever way they want. getPrioritizedPrefixes could just do a value sort. Rules that overlap, highest priority wins.

}
}
return out
return out, sortRulesByPriority(prioritized)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Humm, I worry about the overhead of doing this for every call when the result is always the same. Every field value, which can be large, 20 million each crawl bootstrap.
Maybe we need some debug logging or enable Go profiling to see how much of a hit this is going to be.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants