🐛 bug + enhancement - prevent false positives findings for user errors#6428
🐛 bug + enhancement - prevent false positives findings for user errors#6428LittleSalkin1806 wants to merge 4 commits intomondoohq:mainfrom
Conversation
56d1445 to
efe9a1c
Compare
|
I could find one similar issue where this problem was described even when it is very specific for one function |
|
Heya @LittleSalkin1806 , yeah this was a design decision we made early on when dealing with JSON structs in all shapes, but it comes with the downsides you highlighted. I'd love to dive into the analysis a bit deeper using a lot of the policies we currently have before moving forward on a decision. Since this is changing core syntax, we'll have to target a major release, which luckily is every 6 month and one is coming up. |
|
Hey sure! My most important message is:
Fuzzy matching is missing in here but i believe this is a quick win. I have already seen that if the check gives an error back that the risk is even higher than the impact should influence the risk. I already experienced some false positivies and just took me time to catch it and to be honest find it better to write simple queries. Take your time and hopefully it helps you. If not feel free to close it :) |
|
+1, I definitely agree on the pitfalls that this can cause and also agree with you that we need to handle it better, because it sucks when you expect it to do something only to find it silently turns it into a false positive. I'll make the discussion public, and your perspective really helps with this as well. Definitely something I want MQL to do better :) |
|
I forgot to mention, the next major release will open it's pre-release at the start of next month (v13-pre). We then have 2 weeks to get all major changes into the pre-release, before it turns into a release-candidate (v13-rc) and locks down for 2 more weeks before the final release (v13). Can't guarantee we can solve it in time, but I'll try! |
|
Interesting thanks for that info. I just started a few weeks ago working with Mondoo but as we need the tool and the idears are great :) |
When accessing a dict/map key that doesn't exist but a case-insensitive
match is found, provide a helpful error message like:
"key 'Hello' not found, did you mean 'hello'? (keys are case-sensitive)"
This prevents silent false positives in queries like:
gcp.storage.buckets { iamConfiguration['UniformBucketLevelAccess'] }
when the actual key is 'uniformBucketLevelAccess'.
For keys that truly don't exist (no similar key found), returns nil
for backward compatibility.
Block operations like `where`, `all`, `any`, `one`, `none` were silently absorbing errors from their predicate execution. This caused case-mismatch errors to be swallowed when using queries like: items.where(_['WrongCase'] == value) Now errors are propagated via multierr.Errors so users see helpful messages. Note: `filterList` intentionally doesn't propagate errors because it's used by `recurse` which iterates through heterogeneous nested structures where type-related errors are expected.
Tests cover: - Non-existent keys returning nil (backward compatibility) - Case-mismatch detection with "did you mean" suggestions - Error propagation through chained function calls - Error propagation through block operations (where, all, any, one, none)
efe9a1c to
ce67f6d
Compare
This introduces optional chaining in MQL. It also changes the behavior so that by default we do not blindly chain elements anymore but instead print errors when an element in the chain is `null`. **Background** Since MQL and cnspec are often used for security and compliance policies, we have to create a careful balance between user-comfort and precision. This means that e.g. a comparison like `user.id == 123` will work perfectly well, even if we compare strings and numbers, because the users intention is understood. However, when chaining calls, we have so far sided on the side of comfort - maybe a bit too much. The current behavior - which is optional chaining by default - can lead to cases where the user intent is misunderstood. It's ok if an AI hallucinates (not really, but you get the point), but it's not ok for a query/policy engine to lead to such negative unexpected behaviors. You can see a full example on display here: #6428 **Action** The system makes all chaining mandatory non-null by default. This means that if you have json like this: ```json { "a": { "b": "c" } } ``` and called it with: ``` > parse.json('example.json').params.d.b ``` it would now produce an error: ``` > parse.json('example.json').params.d.b == "c" [failed] parse.json.params.d.b == "c" error: cannot access field "b", parent element is null ``` This errors is new. It's becase the field `d` doesn't exist in this JSON. This is where **optional chaining** comes back into the picture: It allows you to say "please don't print an error, just see if you can get the value" and thus use the previous behavior of MQL: ``` > parse.json('example.json').params.d?.b == "c" [failed] parse.json.params.d.[]? == "c" expected: == "c" actual: _ ``` This is also in line with behaviors in JavaScript and TypeScript.
This introduces optional chaining in MQL. It also changes the behavior so that by default we do not blindly chain elements anymore but instead print errors when an element in the chain is `null`. **Background** Since MQL and cnspec are often used for security and compliance policies, we have to create a careful balance between user-comfort and precision. This means that e.g. a comparison like `user.id == 123` will work perfectly well, even if we compare strings and numbers, because the users intention is understood. However, when chaining calls, we have so far sided on the side of comfort - maybe a bit too much. The current behavior - which is optional chaining by default - can lead to cases where the user intent is misunderstood. It's ok if an AI hallucinates (not really, but you get the point), but it's not ok for a query/policy engine to lead to such negative unexpected behaviors. You can see a full example on display here: #6428 **Action** The system makes all chaining mandatory non-null by default. This means that if you have json like this: ```json { "a": { "b": "c" } } ``` and called it with: ``` > parse.json('example.json').params.d.b ``` it would now produce an error: ``` > parse.json('example.json').params.d.b == "c" [failed] parse.json.params.d.b == "c" error: cannot access field "b", parent element is null ``` This errors is new. It's becase the field `d` doesn't exist in this JSON. This is where **optional chaining** comes back into the picture: It allows you to say "please don't print an error, just see if you can get the value" and thus use the previous behavior of MQL: ``` > parse.json('example.json').params.d?.b == "c" [failed] parse.json.params.d.[]? == "c" expected: == "c" actual: _ ``` This is also in line with behaviors in JavaScript and TypeScript.
This introduces optional chaining in MQL. It also changes the behavior so that by default we do not blindly chain elements anymore but instead print errors when an element in the chain is `null`. **Background** Since MQL and cnspec are often used for security and compliance policies, we have to create a careful balance between user-comfort and precision. This means that e.g. a comparison like `user.id == 123` will work perfectly well, even if we compare strings and numbers, because the users intention is understood. However, when chaining calls, we have so far sided on the side of comfort - maybe a bit too much. The current behavior - which is optional chaining by default - can lead to cases where the user intent is misunderstood. It's ok if an AI hallucinates (not really, but you get the point), but it's not ok for a query/policy engine to lead to such negative unexpected behaviors. You can see a full example on display here: #6428 **Action** The system makes all chaining mandatory non-null by default. This means that if you have json like this: ```json { "a": { "b": "c" } } ``` and called it with: ``` > parse.json('example.json').params.d.b ``` it would now produce an error: ``` > parse.json('example.json').params.d.b == "c" [failed] parse.json.params.d.b == "c" error: cannot access field "b", parent element is null ``` This errors is new. It's becase the field `d` doesn't exist in this JSON. This is where **optional chaining** comes back into the picture: It allows you to say "please don't print an error, just see if you can get the value" and thus use the previous behavior of MQL: ``` > parse.json('example.json').params.d?.b == "c" [failed] parse.json.params.d.[]? == "c" expected: == "c" actual: _ ``` This is also in line with behaviors in JavaScript and TypeScript.
This introduces optional chaining in MQL. It also changes the behavior so that by default we do not blindly chain elements anymore but instead print errors when an element in the chain is `null`. **Background** Since MQL and cnspec are often used for security and compliance policies, we have to create a careful balance between user-comfort and precision. This means that e.g. a comparison like `user.id == 123` will work perfectly well, even if we compare strings and numbers, because the users intention is understood. However, when chaining calls, we have so far sided on the side of comfort - maybe a bit too much. The current behavior - which is optional chaining by default - can lead to cases where the user intent is misunderstood. It's ok if an AI hallucinates (not really, but you get the point), but it's not ok for a query/policy engine to lead to such negative unexpected behaviors. You can see a full example on display here: #6428 **Action** The system makes all chaining mandatory non-null by default. This means that if you have json like this: ```json { "a": { "b": "c" } } ``` and called it with: ``` > parse.json('example.json').params.d.b ``` it would now produce an error: ``` > parse.json('example.json').params.d.b == "c" [failed] parse.json.params.d.b == "c" error: cannot access field "b", parent element is null ``` This errors is new. It's becase the field `d` doesn't exist in this JSON. This is where **optional chaining** comes back into the picture: It allows you to say "please don't print an error, just see if you can get the value" and thus use the previous behavior of MQL: ``` > parse.json('example.json').params.d?.b == "c" [failed] parse.json.params.d.[]? == "c" expected: == "c" actual: _ ``` This is also in line with behaviors in JavaScript and TypeScript.
|
@arlimus i think the fuzzy search is theoretically the only point not introduced but anway do you see any value to let this PR open or should i close it ? |
Problem
While fixing cnspec#2037, I discovered that MQL silently returns
nilwhen accessing dict/map keys that don't exist due to case mismatch.Example:
gcloud.storage.bucket.iamConfiguration.UniformBucketLevelAccess.enabledThe key
UniformBucketLevelAccessdoesn't exist — it should beuniformBucketLevelAccess(lowercaseu). Instead of an error, the query silently returnsnil, causing false positives.For me personally, false positives in security queries should be avoided. They aren't as bad as false negatives, but they still sit there silently, occupy resources, and need to be handled. At worst, they trigger incidents for issues that aren't actually happening.
Solution
Commit 1: Case-mismatch detection
nil(backward compatible)Commit 2: Error propagation in block operations
where,all,any,one,nonenow propagate errors instead of silently absorbing themrunBlockalready works (line 639-640 of llx.go)filterList(used byrecurse) remains error-absorbing sincerecurseintentionally traverses heterogeneous structuresCommit 3: Tests
Before / After
Before:
After:
Why propagate ALL errors (not just case-mismatch)?
runBlockalready propagates errors viaerr.Deduplicate()Learnings & Open Questions
After implementing these changes, I learned that we only deal with the actual API response of each resource and do not use any provider schema. Therefore,
nilreturns are mandatory to keep the current logic working.I'm aware that safety can be achieved with better-written MQL queries (e.g., using
has), but I wanted to address this at a lower level (in the code). So I assume the current behavior is intentional / a feature.Potential improvements:
With case-insensitive checks and fuzzy matching, we could likely reduce many false positives.
Conclusion
There are still some open questions, but at minimum this is a finding: the current code can produce many false positives without additional checks. Safety can be achieved today using explicit MQL query functions (like
has), but there may be room for improvement at the engine level.Especially fuzzy matching and case insensitive is the case for nested dictionaries where no schema from .lr can be used inside the interactive shell and scans.