Conversation
…string_view support - Added constructor and assignment operator for std::u16string_view. - Implemented conversion operator to std::u16string_view. - Introduced methods to append std::u16string_view. - Added Copy and assignment operator overloads for char16_t pointers. - Fixed variable name inconsistencies (myLength vs mylength). - Added new string manipulation methods: LeftAdjust, RightAdjust, LeftJustify, RightJustify, Center, Capitalize, Prepend, FirstLocationInSet, FirstLocationNotInSet, IntegerValue, IsIntegerValue, RealValue, IsRealValue, IsSameString.
…and methods for better handling of empty strings and parameter validation
…r and real number strings in TCollection_ExtendedString constructors
…Collection_ExtendedString
…xtendedString - Updated constructor and method documentation for clarity and consistency. - Renamed parameters in constructors and methods to follow a consistent naming convention (e.g., `astring` to `theString`, `isMultiByte` to `theIsMultiByte`). - Enhanced examples in documentation to illustrate usage more effectively. - Improved comments for methods to provide better context and usage scenarios.
…proved methods for string manipulation
…iver::Paste method
…String.cxx Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Pasukhin Dmitry <pasuhinsvzn@gmail.com>
…ndedString to use char16_t for ASCII characters
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
WalkthroughThis pull request refactors the 🚥 Pre-merge checks | ❌ 3❌ Failed checks (2 warnings, 1 inconclusive)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@src/FoundationClasses/TKernel/TCollection/TCollection_ExtendedString.cxx`:
- Around line 227-233: The TCollection_ExtendedString constructor that takes
(const int theLength, const char16_t theFiller) must validate theLength is
non-negative before calling allocate to avoid underflow/oversized allocations;
add a guard at the start of
TCollection_ExtendedString::TCollection_ExtendedString to check theLength >= 0
and handle invalid values (e.g., throw an appropriate exception or set length to
0) and only call allocate(theLength) and fill myString when theLength is valid.
- Around line 495-590: Several pointer+length string methods (IsEqual,
IsDifferent, IsLess, IsGreater, StartsWith, EndsWith) perform memcmp/indexing
without validating theOther or theLength, causing UB for nullptr or negative
lengths; add guarded early checks: if theLength < 0 return false (or false/true
consistent with semantics), treat theOther == nullptr && theLength == 0 as empty
(continue), otherwise if theOther == nullptr with positive length return false
(or appropriate comparison result), and only call memcmp or index myString when
theOther != nullptr and theLength > 0; apply the same guard pattern to each
named method (IsEqual, IsDifferent, IsLess, IsGreater, StartsWith, EndsWith).
- Around line 1200-1214: TCollection_ExtendedString::Prepend currently captures
theOther before reallocate which can dangle if theOther aliases into the current
buffer (self‑prepend); fix by detecting aliasing first: save the old buffer
pointer (oldPtr = myString) and old length (anOldLength), check if theOther lies
in [oldPtr, oldPtr + anOldLength) and if so compute an index offset = theOther -
oldPtr, then call reallocate(myLength + theLength), use memmove(myString +
theLength, myString, anOldLength * sizeof(char16_t)) to shift old contents, and
finally copy the prefix using memmove(myString, myString + theLength + index,
theLength * sizeof(char16_t)) when aliased (otherwise copy directly from
theOther with memcpy or memmove); reference functions/vars:
TCollection_ExtendedString::Prepend, reallocate, myString, myLength,
anOldLength, theOther, memmove/memcpy.
- Around line 305-341: The AssignCat overloads use memcpy after reallocate which
causes UB for self-append or when the source pointer lies inside the current
buffer (dangling after reallocate); change the logic in AssignCat(const
TCollection_ExtendedString&), AssignCat(const char16_t*,...), and the
pointer-based path to detect overlap or self-append (e.g. if &theOther == this
or if theString lies between myString and myString+myLength) and in those cases
first copy the source range into a temporary buffer (stack array or
std::vector<char16_t>), then call reallocate and memcpy from the temporary; for
non-overlapping cases you can reallocate and memcpy directly (or use memmove
where appropriate). Ensure references to reallocate, myString, myLength,
AssignCat and theOther are updated accordingly.
🧹 Nitpick comments (1)
src/FoundationClasses/TKernel/TCollection/TCollection_ExtendedString.hxx (1)
69-122: Use Standard_ typedefs for new public APIs.*Several new signatures use raw
int/double/bool(e.g., Line 69+, Line 893+, and the internal length helper/member). OCCT style prefersStandard_Integer,Standard_Real, andStandard_Boolean. Consider aligning these declarations (and the .cxx definitions) for consistency.As per learnings, prefer Standard_Real/Standard_Integer/Standard_Boolean in src/**/*.@(cxx|hxx).♻️ Example alignment (apply broadly)
- Standard_EXPORT TCollection_ExtendedString(const int theValue); - Standard_EXPORT TCollection_ExtendedString(const double theValue); + Standard_EXPORT TCollection_ExtendedString(const Standard_Integer theValue); + Standard_EXPORT TCollection_ExtendedString(const Standard_Real theValue); - Standard_EXPORT int IntegerValue() const; - Standard_EXPORT bool IsIntegerValue() const; - Standard_EXPORT double RealValue() const; - Standard_EXPORT bool IsRealValue(bool theToCheckFull = false) const; + Standard_EXPORT Standard_Integer IntegerValue() const; + Standard_EXPORT Standard_Boolean IsIntegerValue() const; + Standard_EXPORT Standard_Real RealValue() const; + Standard_EXPORT Standard_Boolean IsRealValue(Standard_Boolean theToCheckFull = Standard_False) const;Also applies to: 893-934, 949-969
| TCollection_ExtendedString::TCollection_ExtendedString(const int theLength, | ||
| const char16_t theFiller) | ||
| { | ||
| allocate(length); | ||
| for (int i = 0; i < length; i++) | ||
| mystring[i] = filler; | ||
| allocate(theLength); | ||
| for (int i = 0; i < theLength; i++) | ||
| myString[i] = theFiller; | ||
| } |
There was a problem hiding this comment.
Validate negative length in filler constructor.
Line 227 currently accepts negative lengths, which can underflow allocation sizing and trigger oversized allocations.
🛠 Proposed fix
TCollection_ExtendedString::TCollection_ExtendedString(const int theLength,
const char16_t theFiller)
{
+ if (theLength < 0)
+ {
+ throw Standard_OutOfRange("TCollection_ExtendedString : negative length");
+ }
allocate(theLength);
for (int i = 0; i < theLength; i++)
myString[i] = theFiller;
}🤖 Prompt for AI Agents
In `@src/FoundationClasses/TKernel/TCollection/TCollection_ExtendedString.cxx`
around lines 227 - 233, The TCollection_ExtendedString constructor that takes
(const int theLength, const char16_t theFiller) must validate theLength is
non-negative before calling allocate to avoid underflow/oversized allocations;
add a guard at the start of
TCollection_ExtendedString::TCollection_ExtendedString to check theLength >= 0
and handle invalid values (e.g., throw an appropriate exception or set length to
0) and only call allocate(theLength) and fill myString when theLength is valid.
| void TCollection_ExtendedString::AssignCat(const TCollection_ExtendedString& theOther) | ||
| { | ||
| if (theOther.mylength == 0) | ||
| if (theOther.myLength == 0) | ||
| { | ||
| return; | ||
| } | ||
|
|
||
| const int anOtherLength = theOther.mylength; | ||
| const int anOldLength = mylength; | ||
| reallocate(mylength + anOtherLength); | ||
| memcpy(mystring + anOldLength, theOther.mystring, anOtherLength * sizeof(char16_t)); | ||
| const int anOtherLength = theOther.myLength; | ||
| const int anOldLength = myLength; | ||
| reallocate(myLength + anOtherLength); | ||
| memcpy(myString + anOldLength, theOther.myString, anOtherLength * sizeof(char16_t)); | ||
| } | ||
|
|
||
| //================================================================================================= | ||
| //================================================================================================== | ||
|
|
||
| void TCollection_ExtendedString::AssignCat(const char16_t theChar) | ||
| { | ||
| if (theChar != '\0') | ||
| if (theChar != u'\0') | ||
| { | ||
| reallocate(mylength + 1); | ||
| mystring[mylength - 1] = theChar; | ||
| reallocate(myLength + 1); | ||
| myString[myLength - 1] = theChar; | ||
| } | ||
| } | ||
|
|
||
| //================================================================================================= | ||
| //================================================================================================== | ||
|
|
||
| TCollection_ExtendedString TCollection_ExtendedString::Cat( | ||
| const TCollection_ExtendedString& other) const | ||
| void TCollection_ExtendedString::AssignCat(const char16_t* theString, const int theLength) | ||
| { | ||
| TCollection_ExtendedString res(mylength + other.mylength, 0); | ||
| if (mylength > 0) | ||
| memcpy(res.mystring, mystring, mylength * sizeof(char16_t)); | ||
| if (other.mylength > 0) | ||
| memcpy(res.mystring + mylength, other.mystring, other.mylength * sizeof(char16_t)); | ||
| return res; | ||
| if (theString == nullptr || theLength <= 0) | ||
| { | ||
| return; | ||
| } | ||
|
|
||
| const int anOldLength = myLength; | ||
| reallocate(myLength + theLength); | ||
| memcpy(myString + anOldLength, theString, theLength * sizeof(char16_t)); | ||
| } |
There was a problem hiding this comment.
Fix self‑append / overlapping source handling in AssignCat.
Line 315 uses memcpy after reallocate, so s += s or overlapping sources are UB; the pointer+length overload can also dangle if the source points into the current buffer.
🛠 Proposed fix
void TCollection_ExtendedString::AssignCat(const TCollection_ExtendedString& theOther)
{
if (theOther.myLength == 0)
{
return;
}
const int anOtherLength = theOther.myLength;
const int anOldLength = myLength;
reallocate(myLength + anOtherLength);
- memcpy(myString + anOldLength, theOther.myString, anOtherLength * sizeof(char16_t));
+ // memmove to allow self-append (overlapping ranges)
+ memmove(myString + anOldLength, theOther.myString, anOtherLength * sizeof(char16_t));
}
void TCollection_ExtendedString::AssignCat(const char16_t* theString, const int theLength)
{
if (theString == nullptr || theLength <= 0)
{
return;
}
+ // Guard against aliasing with this->myString (reallocate may move the buffer)
+ if (theString >= myString && theString < myString + myLength)
+ {
+ TCollection_ExtendedString aTmp(theString, theLength);
+ AssignCat(aTmp);
+ return;
+ }
const int anOldLength = myLength;
reallocate(myLength + theLength);
memcpy(myString + anOldLength, theString, theLength * sizeof(char16_t));
}🤖 Prompt for AI Agents
In `@src/FoundationClasses/TKernel/TCollection/TCollection_ExtendedString.cxx`
around lines 305 - 341, The AssignCat overloads use memcpy after reallocate
which causes UB for self-append or when the source pointer lies inside the
current buffer (dangling after reallocate); change the logic in AssignCat(const
TCollection_ExtendedString&), AssignCat(const char16_t*,...), and the
pointer-based path to detect overlap or self-append (e.g. if &theOther == this
or if theString lies between myString and myString+myLength) and in those cases
first copy the source range into a temporary buffer (stack array or
std::vector<char16_t>), then call reallocate and memcpy from the temporary; for
non-overlapping cases you can reallocate and memcpy directly (or use memmove
where appropriate). Ensure references to reallocate, myString, myLength,
AssignCat and theOther are updated accordingly.
| bool TCollection_ExtendedString::IsEqual(const char16_t* theOther, const int theLength) const | ||
| { | ||
| return ExtStrCmp(mystring, other) < 0; | ||
| if (myLength != theLength) | ||
| { | ||
| return false; | ||
| } | ||
| if (theLength == 0) | ||
| { | ||
| return true; | ||
| } | ||
| return memcmp(myString, theOther, theLength * sizeof(char16_t)) == 0; | ||
| } | ||
|
|
||
| //================================================================================================= | ||
| //================================================================================================== | ||
|
|
||
| bool TCollection_ExtendedString::IsLess(const TCollection_ExtendedString& other) const | ||
| bool TCollection_ExtendedString::IsDifferent(const char16_t* theOther, const int theLength) const | ||
| { | ||
| return ExtStrCmp(mystring, other.mystring) < 0; | ||
| if (myLength != theLength) | ||
| { | ||
| return true; | ||
| } | ||
| if (theLength == 0) | ||
| { | ||
| return false; | ||
| } | ||
| return memcmp(myString, theOther, theLength * sizeof(char16_t)) != 0; | ||
| } | ||
|
|
||
| //================================================================================================= | ||
| //================================================================================================== | ||
|
|
||
| bool TCollection_ExtendedString::IsGreater(const char16_t* other) const | ||
| bool TCollection_ExtendedString::IsLess(const char16_t* theOther, const int theLength) const | ||
| { | ||
| return ExtStrCmp(mystring, other) > 0; | ||
| const int aMinLen = (myLength < theLength) ? myLength : theLength; | ||
| for (int i = 0; i < aMinLen; ++i) | ||
| { | ||
| if (myString[i] < theOther[i]) | ||
| { | ||
| return true; | ||
| } | ||
| if (myString[i] > theOther[i]) | ||
| { | ||
| return false; | ||
| } | ||
| } | ||
| return myLength < theLength; | ||
| } | ||
|
|
||
| //================================================================================================= | ||
| //================================================================================================== | ||
|
|
||
| bool TCollection_ExtendedString::IsGreater(const TCollection_ExtendedString& other) const | ||
| bool TCollection_ExtendedString::IsGreater(const char16_t* theOther, const int theLength) const | ||
| { | ||
| return ExtStrCmp(mystring, other.mystring) > 0; | ||
| const int aMinLen = (myLength < theLength) ? myLength : theLength; | ||
| for (int i = 0; i < aMinLen; ++i) | ||
| { | ||
| if (myString[i] > theOther[i]) | ||
| { | ||
| return true; | ||
| } | ||
| if (myString[i] < theOther[i]) | ||
| { | ||
| return false; | ||
| } | ||
| } | ||
| return myLength > theLength; | ||
| } | ||
|
|
||
| //================================================================================================= | ||
| //================================================================================================== | ||
|
|
||
| bool TCollection_ExtendedString::StartsWith(const TCollection_ExtendedString& theStartString) const | ||
| bool TCollection_ExtendedString::StartsWith(const char16_t* theStartString, | ||
| const int theLength) const | ||
| { | ||
| if (this == &theStartString) | ||
| if (theLength == 0) | ||
| { | ||
| return true; | ||
| } | ||
|
|
||
| return mylength >= theStartString.mylength | ||
| && memcmp(theStartString.mystring, mystring, theStartString.mylength * sizeof(char16_t)) | ||
| == 0; | ||
| if (myLength < theLength) | ||
| { | ||
| return false; | ||
| } | ||
| return memcmp(myString, theStartString, theLength * sizeof(char16_t)) == 0; | ||
| } | ||
|
|
||
| //================================================================================================= | ||
| //================================================================================================== | ||
|
|
||
| bool TCollection_ExtendedString::EndsWith(const TCollection_ExtendedString& theEndString) const | ||
| bool TCollection_ExtendedString::EndsWith(const char16_t* theEndString, const int theLength) const | ||
| { | ||
| if (this == &theEndString) | ||
| if (theLength == 0) | ||
| { | ||
| return true; | ||
| } | ||
|
|
||
| return mylength >= theEndString.mylength | ||
| && memcmp(theEndString.mystring, | ||
| mystring + mylength - theEndString.mylength, | ||
| theEndString.mylength * sizeof(char16_t)) | ||
| == 0; | ||
| if (myLength < theLength) | ||
| { | ||
| return false; | ||
| } | ||
| return memcmp(myString + myLength - theLength, theEndString, theLength * sizeof(char16_t)) == 0; | ||
| } |
There was a problem hiding this comment.
Add nullptr/negative‑length guards for pointer+length comparisons.
Line 495+ uses memcmp/indexing without validating theOther or theLength; a nullptr with positive length (or a negative length) is UB. Consider early validation (e.g., treat nullptr+0 as empty, otherwise return false/throw) across IsEqual, IsDifferent, IsLess, IsGreater, StartsWith, and EndsWith.
🛠 Proposed fix (apply similarly to the other pointer+length overloads)
bool TCollection_ExtendedString::IsEqual(const char16_t* theOther, const int theLength) const
{
+ if (theLength < 0)
+ {
+ throw Standard_OutOfRange("TCollection_ExtendedString::IsEqual : negative length");
+ }
+ if (theOther == nullptr)
+ {
+ return (myLength == 0 && theLength == 0);
+ }
if (myLength != theLength)
{
return false;
}
if (theLength == 0)
{
return true;
}
return memcmp(myString, theOther, theLength * sizeof(char16_t)) == 0;
}🤖 Prompt for AI Agents
In `@src/FoundationClasses/TKernel/TCollection/TCollection_ExtendedString.cxx`
around lines 495 - 590, Several pointer+length string methods (IsEqual,
IsDifferent, IsLess, IsGreater, StartsWith, EndsWith) perform memcmp/indexing
without validating theOther or theLength, causing UB for nullptr or negative
lengths; add guarded early checks: if theLength < 0 return false (or false/true
consistent with semantics), treat theOther == nullptr && theLength == 0 as empty
(continue), otherwise if theOther == nullptr with positive length return false
(or appropriate comparison result), and only call memcmp or index myString when
theOther != nullptr and theLength > 0; apply the same guard pattern to each
named method (IsEqual, IsDifferent, IsLess, IsGreater, StartsWith, EndsWith).
| void TCollection_ExtendedString::Prepend(const char16_t* theOther, const int theLength) | ||
| { | ||
| if (theOther == nullptr || theLength <= 0) | ||
| { | ||
| return; | ||
| } | ||
|
|
||
| const int anOldLength = myLength; | ||
| reallocate(myLength + theLength); | ||
|
|
||
| // Move existing characters to the end | ||
| memmove(myString + theLength, myString, anOldLength * sizeof(char16_t)); | ||
| // Copy theOther to the beginning | ||
| memcpy(myString, theOther, theLength * sizeof(char16_t)); | ||
| } |
There was a problem hiding this comment.
Prepend needs overlap‑safe handling (self‑prepend / aliasing).
Line 1208 captures the source pointer before reallocate; if theOther points into this string (including self‑prepend), that pointer can dangle after reallocation.
🛠 Proposed fix
void TCollection_ExtendedString::Prepend(const char16_t* theOther, const int theLength)
{
if (theOther == nullptr || theLength <= 0)
{
return;
}
+ // Protect against aliasing (e.g., self-prepend)
+ TCollection_ExtendedString aTmp;
+ if (theOther >= myString && theOther < myString + myLength)
+ {
+ aTmp = TCollection_ExtendedString(theOther, theLength);
+ theOther = aTmp.myString;
+ }
+
const int anOldLength = myLength;
reallocate(myLength + theLength);
// Move existing characters to the end
memmove(myString + theLength, myString, anOldLength * sizeof(char16_t));
// Copy theOther to the beginning
memcpy(myString, theOther, theLength * sizeof(char16_t));
}🤖 Prompt for AI Agents
In `@src/FoundationClasses/TKernel/TCollection/TCollection_ExtendedString.cxx`
around lines 1200 - 1214, TCollection_ExtendedString::Prepend currently captures
theOther before reallocate which can dangle if theOther aliases into the current
buffer (self‑prepend); fix by detecting aliasing first: save the old buffer
pointer (oldPtr = myString) and old length (anOldLength), check if theOther lies
in [oldPtr, oldPtr + anOldLength) and if so compute an index offset = theOther -
oldPtr, then call reallocate(myLength + theLength), use memmove(myString +
theLength, myString, anOldLength * sizeof(char16_t)) to shift old contents, and
finally copy the prefix using memmove(myString, myString + theLength + index,
theLength * sizeof(char16_t)) when aliased (otherwise copy directly from
theOther with memcpy or memmove); reference functions/vars:
TCollection_ExtendedString::Prepend, reallocate, myString, myLength,
anOldLength, theOther, memmove/memcpy.
No description provided.