Fix: exclude commercial-existing fields that violates pydantic #428
base: main
Summary of Changes
Hello @haggit-eliyahu, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request refines the data validation mechanisms within the
Highlights
508960e to 02aa5f2
Code Review
This pull request introduces several fixes to handle validation issues with existing commercial integrations. This includes excluding certain parameter descriptions from length validation, adding numerous exceptions to regex patterns for script and parameter names, increasing the maximum word count for parameter names, and providing a default value for a missing field. The approach of adding exceptions is pragmatic for dealing with legacy data. The implementation of the new validator and its application across various data models is well-executed. I have identified a potential bug in one of the new regex patterns and some redundant entries that should be cleaned up for better maintainability.
I am having trouble creating individual review comments. Click here to see my feedback.
packages/mp/src/mp/core/constants.py (476)
This regex pattern appears to contain a typo and might be the result of a copy-paste error.
- The part `Google Rapid Response \(GR$` seems incomplete and is likely intended to be `Google Rapid Response \(GRR\)`. However, an exclusion for `Google Rapid Response (GRR)` already exists on line 468.
- The second part of the alternation, `Tenable\.io - List Endpoint Vulnerabilities$`, is missing a `^` at the beginning, which means it would match any string ending with that text, not the exact string. This is inconsistent with other patterns in this list.
This looks like it should just be an exclusion for the Tenable script name.
r"|^Tenable\.io - List Endpoint Vulnerabilities$"
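The effect of the missing `^` can be checked with a short sketch. It assumes the combined pattern is evaluated with search semantics (as with `re.search`); the `Some Script` alternative and the sample strings are hypothetical stand-ins, not entries from constants.py.

```python
import re

# Hypothetical two-branch alternations; only the second branch differs.
UNANCHORED = re.compile(r"^Some Script$|Tenable\.io - List Endpoint Vulnerabilities$")
ANCHORED = re.compile(r"^Some Script$|^Tenable\.io - List Endpoint Vulnerabilities$")

exact = "Tenable.io - List Endpoint Vulnerabilities"
prefixed = "Anything Tenable.io - List Endpoint Vulnerabilities"

# Without the leading ^, the second branch matches any string that merely
# ends with the script name; with it, only the exact name matches.
unanchored_hits = [bool(UNANCHORED.search(s)) for s in (exact, prefixed)]
anchored_hits = [bool(ANCHORED.search(s)) for s in (exact, prefixed)]

print(unanchored_hits)
print(anchored_hits)
```

If the pattern is instead applied with `re.match` or `re.fullmatch`, the unanchored branch is less permissive, so the practical impact depends on how the regex is consumed.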
packages/mp/src/mp/core/constants.py (560-606)
There are several duplicated patterns in this list of exceptions for PARAM_DISPLAY_NAME_REGEX. While this doesn't break the functionality of the regular expression, it adds unnecessary clutter and makes the list harder to read and maintain.
For example:
- `r"|^Fetch\ Backwards\ Time\ Interval\ \(minutes\)$\"` appears 5 times.
- `r"|^Extract\ urls\ from\ HTML\ email\ part\?\$"` appears 3 times.
- `r"|^Create\ a\ Separate\ Siemplify\ Alert\ per\ Attached\ Mail\ File\?\$"` appears 3 times.
Please remove the duplicate entries to improve code clarity.
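One way to catch such duplicates is to keep the exceptions as a list of strings and build the alternation from a de-duplicated copy. The following is a minimal sketch under that assumption; the helper names and the sample entries are hypothetical and do not reflect how constants.py is actually organized.

```python
from collections import Counter


def find_duplicates(patterns: list[str]) -> dict[str, int]:
    """Return each pattern that occurs more than once, with its count."""
    return {p: n for p, n in Counter(patterns).items() if n > 1}


def dedupe_preserving_order(patterns: list[str]) -> list[str]:
    """Drop repeated patterns while keeping the first occurrence in place."""
    return list(dict.fromkeys(patterns))


# Hypothetical sample of exception patterns with repeats.
sample = [
    r"^Fetch\ Backwards\ Time\ Interval\ \(minutes\)$",
    r"^Extract\ urls\ from\ HTML\ email\ part\?$",
    r"^Fetch\ Backwards\ Time\ Interval\ \(minutes\)$",
    r"^Fetch\ Backwards\ Time\ Interval\ \(minutes\)$",
]

duplicates = find_duplicates(sample)
unique = dedupe_preserving_order(sample)
combined = "|".join(unique)  # alternation built from unique entries only
```

A check like `find_duplicates` could also run in a unit test so that repeated entries are flagged before review.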
# Conflicts:
#   packages/mp/pyproject.toml
#   packages/mp/uv.lock
02aa5f2 to 51d4d16
/gemini review
Code Review
This pull request aims to fix validation issues for commercial integrations by adding exclusion lists and a new validator for parameter descriptions. The changes correctly apply the new validator for various parameter types. However, there's a critical issue where this new parameter-specific validator is incorrectly applied to action descriptions, which have a different length constraint, potentially breaking validation for valid data. Additionally, the growing exclusion lists added to the constants file are a maintainability concern; I've suggested moving them to separate data files to keep the codebase cleaner.
EXCLUDED_LONG_PARAM_DESCRIPTION_PREFIXES: set[str] = {
    "\t\n\nIf provided, the connector will use this value for Siemplify Rule Generator. Please re",
    "A comma separated list CSV encoding types used for decoding your CSV files, e.g. utf-8, lati",
    "A comma-separated string of email headers to add to Google SecOps events, such as “DKIM-Siga",
    "A custom alert name. You can provide placeholders in the following format: [name of the fiel",
    "A custom case name. When you configure this parameter, the connector adds a new key called c",
    "A custom rule generator.\nYou can use placeholders in the format [field_name], for example: ",
    "A custom rule generator. You can provide placeholders in the following format: [name of the ",
    "A filter condition that specifies the email labels to search for. This parameter accepts mul",
    "A regular expression pattern to run on the value found in the Environment Field Name field. ",
    'A regular expression pattern to run on the value found in the "Environment Field Name" field',
    "By default, the search will be executed in the default mailbox specified in the integration ",
    "Client email of your service account. You can configure either this parameter or the User Se",
    "Comma separated. e.g. customer.combo_name,category.sym,status.sym,priority.sym,active,log_ag",
    "End date of the search. Search will return only records equal or before this point in time.",
    "Grouping mechanism that will be used to create Siemplify Alerts. Possible values: Host, ",
    "If defined - connector will extract the environment from the specified event field. You can ",
    "If provided, connector will use this value for Alert Name. Please refer to the documentation",
    "If provided, connector will use this value for Siemplify Alert Name. Please refer to the doc",
    "If provided, connector will use this value for Google Secops Alert Name. Please refer to the",
    "If provided, the connector uses this value for Chronicle SOAR",
    "If specified, connector will use this value from the Microsoft Azure Sentinel API response f",
    "Number of days before the first connector iteration to retrieve vulnerabilities from. This p",
    "Optional. Specify custom query parameter you want to add to the list users search call. For ",
    "Provide a delimiter character, with which the action will split the input it gets into a num",
    "Search field for free text queries (When query doesn't specify a field name).",
    "Search pattern for a elastic index.\r\nIn elastic, index is like a DatabaseName, and data is",
    "Specify a comma separated list of alert attributes that should be used as a fallback for the",
    "Specify a comma separated list of incident or alert attributes that should be used as a fall",
    "Specify a comma-separated list of engines that should be used to retrieve information, wheth",
    "Specify a comma-separated list of fields to return. Example of values:assetType,project,fold",
    "Specify a comma-separated list of the event types that need to be returned. If nothing is pr",
    "Specify a limit for how many events for a single offense connector should query from Qradar ",
    'Specify a time frame for the results. If "Alert Time Till Now" is selected, action will use ',
    'Specify a time frame for the results. If "Custom" is selected, you also need to provide "Sta',
    "Specify a time frame for the results. If “Alert Time Till Now” is selected, action will use ",
    "Specify the amount of time in minutes to pass before the connector will try to fetch events ",
    "Specify the filter to fetch the recommendations for. Parameter expects a string of a format ",
    "Specify the query that needs to be executed. Note: the query should follow a strict pattern ",
    "Specify the query that needs to be executed. Note: this query should follow a strict pattern",
    "Specify the time frame for the search. Only hours and days are supported. Note: end time wil",
    'Specify the wait mode for the action. If "Until Timeout" is selected, action will wait until',
    "Specify what attributes need to be used, when the action is to search for similar alerts. If",
    'Specify what selection should be used for users. If "From Entities & User Identifiers" is se',
    "Start date of the search. Search will return only records equal or after this point in time.",
    "The client email address of your workload identity. You can configure either this parameter ",
    "The conditions that are required for the custom fields for the action to resume running a pl",
    "The content of the service account key JSON file. You can configure either this parameter or",
    "The number of days for the action to wait before refreshing the entity summary. The action g",
    'The search query to perform. It is in Lucene syntax.\r\nIE1: "*" (this is a wildcard that wi',
    'When provided, connector will add a new key called "custom_case_name" to the',
}
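The review summary above notes that parameter descriptions and action descriptions have different length limits, so a single hard-coded validator should not be shared between them. A minimal sketch of a parameterized alternative follows. The factory name, both limits (`PARAM_MAX` and `ACTION_MAX`), and the shortened prefix set are assumptions for illustration only; the real limits and the wiring into the Pydantic models are not shown in this conversation.

```python
from collections.abc import Callable, Iterable

# Hypothetical limits; the project's real values may differ.
PARAM_MAX = 50
ACTION_MAX = 80

# Shortened stand-in for EXCLUDED_LONG_PARAM_DESCRIPTION_PREFIXES.
LEGACY_PREFIXES = {"If provided, the connector uses this value for Chronicle SOAR"}


def make_description_validator(
    max_length: int, excluded_prefixes: Iterable[str] = ()
) -> Callable[[str], str]:
    """Build a validator bound to one length limit and one exclusion set."""
    prefixes = tuple(excluded_prefixes)  # str.startswith accepts a tuple

    def validate(description: str) -> str:
        # Legacy descriptions are accepted regardless of their length.
        if prefixes and description.startswith(prefixes):
            return description
        if len(description) > max_length:
            raise ValueError(f"description exceeds {max_length} characters")
        return description

    return validate


validate_param = make_description_validator(PARAM_MAX, LEGACY_PREFIXES)
validate_action = make_description_validator(ACTION_MAX)  # no exclusions
```

Keeping the limit as an argument means each model field gets its own validator instance, so an action description is never checked against the parameter limit by accident.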
These large exclusion lists (EXCLUDED_LONG_PARAM_DESCRIPTION_PREFIXES, and the additions to SCRIPT_DISPLAY_NAME_REGEX and PARAM_DISPLAY_NAME_REGEX) make the constants.py file difficult to read and maintain.
To improve this, I recommend moving these lists to separate data files (e.g., YAML or JSON) and loading them at runtime. This will keep the code cleaner and make the exclusion lists easier to manage.
For example, you could have a file long_description_exclusions.json:
[
"\t\n\nIf provided, the connector will use this value for Siemplify Rule Generator. Please re",
"A comma separated list CSV encoding types used for decoding your CSV files, e.g. utf-8, lati",
"..."
]
@TalShafir1 What do you say? Is it necessary?
It will help with maintaining the constants module. Loading all of them as constants all the time is also a waste. We can definitely store these strings in text or YAML files instead and use them only when needed, or at least move them to a different module like exclusions.py or something
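The "use them only when needed" idea could look roughly like the sketch below: the prefixes live in a JSON file and are read on first use, then cached. The function name `load_exclusions` and the temporary file used for the demonstration are hypothetical; the file name follows the long_description_exclusions.json example suggested earlier in the thread.

```python
import json
import tempfile
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def load_exclusions(path: str) -> frozenset[str]:
    """Read a JSON array of strings once and cache the result per path."""
    with Path(path).open(encoding="utf-8") as handle:
        return frozenset(json.load(handle))


# Demonstration with a temporary data file standing in for the real one.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "long_description_exclusions.json"
    sample = ["\t\n\nIf provided, the connector will", "A comma separated list"]
    data_file.write_text(json.dumps(sample), encoding="utf-8")

    first = load_exclusions(str(data_file))   # reads the file
    second = load_exclusions(str(data_file))  # served from the cache
```

Nothing is loaded at import time, so modules that never validate descriptions do not pay for the list. Moving the set into a separate module such as exclusions.py, as also proposed above, would be the simpler option if a data file is not wanted.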