Support for more string operations#60
Conversation
…h such a function
…e second argument of trim functions
… type inference of string operations
…sic(t *testing.T) {
There was a problem hiding this comment.
Pull request overview
This PR extends SMT translation and type handling for additional Rego string operations, including trim variants, case conversion, and substring negative-length behavior, while removing direct concat handling.
Changes:
- Adds SMT mappings/declarations and constraints for
trim,trim_left,trim_right,substring,upper, andlower. - Adds
-case-unicodeCLI support for Unicode-aware case conversion. - Adds end-to-end tests for new string operation behavior and updates related type/inference tests.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
cmd/smt/main.go |
Adds CLI flag for Unicode case handling. |
pkg/analysis/string_operations.go |
Tracks trim-left/right as string operations. |
pkg/infer/infer_test.go |
Updates builtin hint expectations after concat removal. |
pkg/smt/e2e_inline_test.go |
Adds end-to-end tests for string operations. |
pkg/smt/exprs.go |
Routes builtin calls through constraint-aware handling. |
pkg/smt/functions.go |
Adds SMT function mappings, declarations, and constraints for string operations. |
pkg/smt/representation.go |
Updates SMT string literal escaping for Unicode. |
pkg/smt/run_policy_test.go |
Adds test helper for Unicode case mode. |
pkg/smt/translator.go |
Initializes translators with builtin SMT declarations. |
pkg/smt/translator_test.go |
Adjusts test setup for builtin declarations. |
pkg/types/predef_functions.go |
Updates predefined function typing for string builtins. |
pkg/types/type_analysis_test.go |
Removes obsolete concat-related type test. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // UseUnicodeCase controls whether to_lower/to_upper should operate on full Unicode | ||
| // or only ASCII. Default is false (ASCII-only behavior). | ||
| var UseUnicodeCase bool = false | ||
|
|
||
| // SetUseUnicodeCase sets the package-wide flag controlling case handling. | ||
| func SetUseUnicodeCase(v bool) { | ||
| UseUnicodeCase = v |
There was a problem hiding this comment.
I will fix it after #54 so I can use the translator options.
chore: typo Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
hohran
left a comment
There was a problem hiding this comment.
This PR refines the handling of multiple string operations. Most notably, i find the lower and upper functions to work as expected in all tests I tried.
For the other functions, specifically trim variants and substring, I found them working only when having all arguments constant. Introducing a variable argument most usually resulted in them behaving as uninterpreted functions.
I know you somehow mentioned this, I just wanted to bring it up, because I did not see any hints at that in the code (so that we do not forget about it later).
| value SmtValue | ||
| } | ||
|
|
||
| func (et *ExprTranslator) callBuiltin(op *Function, params []*SmtValue) (*SmtValue, error) { |
There was a problem hiding this comment.
From what i see, this function also possibly handles user-defined functions. Is there any reason to have it named this way? Do you plan on handling user-defined functions differently?
| lenS := fmt.Sprintf("(str.len %s)", s.String()) | ||
| correctNegative := fmt.Sprintf("(str.substr %s %s %s)", s.String(), i.String(), lenS) | ||
| correctNonNeg := fmt.Sprintf("(str.substr %s %s %s)", s.String(), i.String(), j.String()) | ||
| iteExpr := fmt.Sprintf("(ite (< %s 0) %s %s)", j.String(), correctNegative, correctNonNeg) |
There was a problem hiding this comment.
I would prefer using the SmtValue functions, such as Ite, or Equals
Other than the second argument of |
|
|
That is a bug related to #61, This is a problem, the local variables have to be handled trough |
This PR adds support for the string functions
trim,trim_left,trim_right,upper,lower. It also fixes the behaviour ofsubstringfor the case that length arg is negative and removes the handling ofconcatas it behaves like join which we currently cannot handle.To be able to handle the new functions, I had to extend the structure
Functionby a member that takes the resulting "call" of the SMT function and returns extra axioms (asserts, fresh vars declarations). Furthermore, I added functionGetBuiltinDecls()that returns the declarations/definitions of the mapped SMT functions.I also added a new option for smt converter,
-case-unicode, if passed to the converter, it will handleupper/lowerfunction for the whole Unicode, otherwise only for ASCII.