-
Notifications
You must be signed in to change notification settings - Fork 396
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
impl(spanner): add uuid type #15023
base: main
Are you sure you want to change the base?
impl(spanner): add uuid type #15023
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #15023 +/- ##
=========================================
+ Coverage 0 92.90% +92.90%
=========================================
Files 0 2354 +2354
Lines 0 210355 +210355
=========================================
+ Hits 0 195435 +195435
- Misses 0 14920 +14920 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 6 files reviewed, 1 unresolved discussion (waiting on @cuiy0006)
872b789
to
2fb1b37
Compare
/gcbrun |
google/cloud/spanner/uuid.cc
Outdated
kMaxUuidNumberOfHexDigits, str), | ||
GCP_ERROR_INFO()); | ||
} | ||
std::string original_str = std::string(str); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this could/should be done without copying the string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's necessary to copy the original string for error message contents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, copying the string isn't necessary. Nothing here mutates the characters of the argument string_view
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few of the operations performed do mutate the string_view. I added additional expectation criteria to the tests that verified that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You appear to be confusing "mutate the string_view" with "mutate the characters of the string_view". Nothing does the latter. Indeed, string_view
prohibits it. So, there is no need to copy the string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copying the string_view instead.
google/cloud/spanner/uuid.cc
Outdated
str.remove_suffix(1); | ||
} | ||
|
||
// Check for leading hyphen after braces. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"hyphen after braces" implies that this only applies to a hyphen after a brace. So, perhaps "hyphen after stripping any surrounding braces" would sound better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
google/cloud/spanner/uuid.cc
Outdated
} | ||
std::string original_str = std::string(str); | ||
// Check and remove optional braces | ||
bool has_braces = (str[0] == '{'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has_braces == true
seems wrong for something we only know starts with a brace. Given that the variable is only used once, I'd just get rid of it. For example, ...
if (absl::ConsumePrefix(&str, "{")) {
if (!absl::ConsumeSuffix(&str, "}") {
return internal::InvalidArgumentError(...);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
google/cloud/spanner/uuid.cc
Outdated
absl::StrFormat("UUID must be at least %d characters long: %s", | ||
kMaxUuidNumberOfHexDigits, original_str), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider a string of 31 hex digits, each separated by a hyphen, for a total length of 61 chars. Complaining that the string must be at least 32 chars long, when it is, seems wrong. Something about "not enough hex digits" would be more appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated this error message and some others to specify hex digits as appropriate.
google/cloud/spanner/uuid.cc
Outdated
// Helper function to parse a single hexadecimal block of a UUID. | ||
// A hexadecimal block is a 16-digit hexadecimal number, which is represented | ||
// as 8 bytes. | ||
StatusOr<uint64_t> ParseHexBlock(absl::string_view& str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/uint64_t/std::uint64_t/
et seq.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
google/cloud/spanner/uuid.cc
Outdated
if (str[0] == '-') { | ||
return internal::InvalidArgumentError( | ||
absl::StrFormat("UUID cannot contain consecutive hyphens: %s", | ||
original_str), | ||
GCP_ERROR_INFO()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would probably be clearer to move this into the if (it == char_to_hex->end()
block, if only to highlight that we're just special-casing the error message for a particular invalid character.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
google/cloud/spanner/uuid.cc
Outdated
to_hex(high_bits, 15, 8, target); | ||
*(target + kHyphenPos[0]) = '-'; | ||
to_hex(high_bits, 7, 4, target + kHyphenPos[0] + 1); | ||
*(target + kHyphenPos[1]) = '-'; | ||
to_hex(high_bits, 3, 0, target + kHyphenPos[1] + 1); | ||
*(target + kHyphenPos[2]) = '-'; | ||
to_hex(low_bits, 15, 12, target + kHyphenPos[2] + 1); | ||
*(target + kHyphenPos[3]) = '-'; | ||
to_hex(low_bits, 11, 0, target + kHyphenPos[3] + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're going to generalize using kHyphenPos
, you should also compute the start/end_index
pairs using it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I began computing start/end pairs per call to to_hex
, looked at the results and thought "I keep recomputing the values of start/end, this could be a loop". Refactored it into a loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider the operator std::string()
snippet above, which has a single loop, and doesn't do any index computations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #15043 for future improvements as I'm out of time budget to work on this.
google/cloud/spanner/uuid.h
Outdated
return !(lhs < rhs); | ||
} | ||
friend bool operator>(Uuid const& lhs, Uuid const& rhs) { | ||
return (rhs < lhs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove redundant parentheses?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
google/cloud/spanner/uuid.cc
Outdated
|
||
StatusOr<Uuid> MakeUuid(absl::string_view str) { | ||
// Early checks for invalid length or leading hyphen | ||
if (str.size() < kMaxUuidNumberOfHexDigits) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test seems best omitted, and then handled by the actual failure to find enough hex digits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved this check as recommended, and also added a guard for an empty string.
2fb1b37
to
b2e52cc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review, Brad. I appreciate you donating your time.
Reviewable status: 0 of 6 files reviewed, 10 unresolved discussions (waiting on @cuiy0006 and @devbww)
google/cloud/spanner/uuid.h
Outdated
return !(lhs < rhs); | ||
} | ||
friend bool operator>(Uuid const& lhs, Uuid const& rhs) { | ||
return (rhs < lhs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
google/cloud/spanner/uuid.cc
Outdated
// Helper function to parse a single hexadecimal block of a UUID. | ||
// A hexadecimal block is a 16-digit hexadecimal number, which is represented | ||
// as 8 bytes. | ||
StatusOr<uint64_t> ParseHexBlock(absl::string_view& str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
google/cloud/spanner/uuid.cc
Outdated
absl::StrFormat("UUID must be at least %d characters long: %s", | ||
kMaxUuidNumberOfHexDigits, original_str), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated this error message and some others to specify hex digits as appropriate.
google/cloud/spanner/uuid.cc
Outdated
if (str[0] == '-') { | ||
return internal::InvalidArgumentError( | ||
absl::StrFormat("UUID cannot contain consecutive hyphens: %s", | ||
original_str), | ||
GCP_ERROR_INFO()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
google/cloud/spanner/uuid.cc
Outdated
to_hex(high_bits, 15, 8, target); | ||
*(target + kHyphenPos[0]) = '-'; | ||
to_hex(high_bits, 7, 4, target + kHyphenPos[0] + 1); | ||
*(target + kHyphenPos[1]) = '-'; | ||
to_hex(high_bits, 3, 0, target + kHyphenPos[1] + 1); | ||
*(target + kHyphenPos[2]) = '-'; | ||
to_hex(low_bits, 15, 12, target + kHyphenPos[2] + 1); | ||
*(target + kHyphenPos[3]) = '-'; | ||
to_hex(low_bits, 11, 0, target + kHyphenPos[3] + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I began computing start/end pairs per call to to_hex
, looked at the results and thought "I keep recomputing the values of start/end, this could be a loop". Refactored it into a loop.
google/cloud/spanner/uuid.cc
Outdated
|
||
StatusOr<Uuid> MakeUuid(absl::string_view str) { | ||
// Early checks for invalid length or leading hyphen | ||
if (str.size() < kMaxUuidNumberOfHexDigits) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved this check as recommended, and also added a guard for an empty string.
google/cloud/spanner/uuid.cc
Outdated
kMaxUuidNumberOfHexDigits, str), | ||
GCP_ERROR_INFO()); | ||
} | ||
std::string original_str = std::string(str); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's necessary to copy the original string for error message contents.
google/cloud/spanner/uuid.cc
Outdated
} | ||
std::string original_str = std::string(str); | ||
// Check and remove optional braces | ||
bool has_braces = (str[0] == '{'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
google/cloud/spanner/uuid.cc
Outdated
str.remove_suffix(1); | ||
} | ||
|
||
// Check for leading hyphen after braces. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
google/cloud/spanner/uuid.h
Outdated
* Parses a textual representation a `Uuid` from a string of hexadecimal digits. | ||
* Returns an error if unable to parse the given input. | ||
* | ||
* Acceptable input strings must consist of [0-f] digits with formatting such: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[0-f]
is not good definition of a character class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expanded definition. If this still isn't a "good" definition, please suggest one.
google/cloud/spanner/uuid.h
Outdated
/// Default construction yields a zero value UUID. | ||
Uuid() = default; | ||
|
||
/// Construct a UUID from a packed integer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what "packed" is supposed to convey.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reworded
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, although we now have "128 bit unsigned" and "unsigned 64 bit" in consecutive comments. Consistency would be nice. And "N-bit" should be hyphenated when used as an adjective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reworded across usages for consistency.
google/cloud/spanner/uuid.h
Outdated
/// Construct a UUID from two 64 bit pieces. | ||
Uuid(std::uint64_t high_bits, std::uint64_t low_bits); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd get rid of this constructor. If the caller has two std::uint64_t
s, they can build the absl::uint128
by themselves. It shouldn't be part of this class (just like there shouldn't be an accessor that returns a pair of std::uint64_t
s).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to give users a options that uses std
integer types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then, there should be an accessor that returns a pair of std::uint64_t
s.
And, you need to #include <cstdint>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
google/cloud/spanner/uuid.h
Outdated
/// @name Conversion to packed integer representation. | ||
explicit operator absl::uint128() const { return uuid_; } | ||
|
||
/// @name Conversion to a lower case string using formatted: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"using formatted:" doesn't parse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reworded
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/ using//
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
google/cloud/spanner/uuid.cc
Outdated
|
||
std::string output; | ||
output.resize(kUuidStringLen); | ||
char* target = &((output)[output.size() - kUuidStringLen]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
output.size() - kUuidStringLen
is zero. Why obfuscate that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
De-obfuscated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const_cast
ing isn't good either. Why not just ...
char* target = &output[0];
char* const last = &output[output.size()];
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
&output[0]
turns into a clang-tidy diagnostic about using .data()
instead which until c++17 only returns a const char*
. I have multiple non-ideal choices.
auto high_bits = ParseHexBlock(str, original_str); | ||
if (!high_bits.ok()) return std::move(high_bits).status(); | ||
auto low_bits = ParseHexBlock(str, original_str); | ||
if (!low_bits.ok()) return std::move(low_bits).status(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, I would suggest not introducing two 64-bit values here. (Otherwise, why not four 32-bit values, for example?) A single 128-bit value should suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lacking a std::uint128_t
and absl::uint128
not behaving exactly like a std::uint128_t
, std::uint64_t
is the largest integer currently available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I don't know what absl::uint128
behavior you think is lacking. For example,
#include <cctype>
#include <cstring>
constexpr char kHexDigits[] = "0123456789abcdef";
StatusOr<Uuid> MakeUuid(absl::string_view str) {
absl::uint128 uuid = 0;
auto const original_str = str;
if (absl::ConsumePrefix(&str, "{")) {
if (!absl::ConsumeSuffix(&str, "}")) {
return internal::InvalidArgumentError(
absl::StrFormat("UUID missing closing '}': %s", original_str),
GCP_ERROR_INFO());
}
}
if (absl::StartsWith(str, "-")) {
return internal::InvalidArgumentError(
absl::StrFormat("UUID cannot begin with '-': %s", original_str),
GCP_ERROR_INFO());
}
constexpr int kUuidNumberOfHexDigits = 32;
for (int j = 0; j != kUuidNumberOfHexDigits; ++j) {
absl::ConsumePrefix(&str, "-");
if (str.empty()) {
return internal::InvalidArgumentError(
absl::StrFormat("UUID must contain %d hexadecimal digits: %s",
kUuidNumberOfHexDigits, original_str),
GCP_ERROR_INFO());
}
auto const* dp = std::strchr(
kHexDigits, std::tolower(static_cast<unsigned char>(str[0])));
if (dp == nullptr) {
if (str[0] == '-') {
return internal::InvalidArgumentError(
absl::StrFormat("UUID cannot contain consecutive hyphens: %s",
original_str),
GCP_ERROR_INFO());
}
return internal::InvalidArgumentError(
absl::StrFormat("UUID contains invalid character (%c): %s", str[0],
original_str),
GCP_ERROR_INFO());
}
str.remove_prefix(1);
uuid <<= 4;
uuid += dp - kHexDigits;
}
if (!str.empty()) {
return internal::InvalidArgumentError(
absl::StrFormat("Extra characters found after parsing UUID: %s", str),
GCP_ERROR_INFO());
}
return Uuid{uuid};
}
No "blocks" (or hash maps). Just a simple per-nibble loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #15043 for future improvements as I'm out of time budget to work on this.
google/cloud/spanner/uuid.cc
Outdated
absl::StrFormat("Extra characters (%d) found after parsing UUID: %s", | ||
str.size(), original_str), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not provide the actual extra characters rather than the number of them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
google/cloud/spanner/uuid.cc
Outdated
} | ||
|
||
StatusOr<Uuid> MakeUuid(absl::string_view str) { | ||
if (str.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved this check as recommended, and also added a guard for an empty string.
I wouldn't make a special case of the empty string either. Just change the
if (str[0] == '-') {
test to
if (absl::StartsWith(str, "-")) {
and the empty string will be treated like any other illegal argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer the early exit and specific error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Early exit" is only a preferred strategy when it makes the following code simpler. Here it remains the same.
Making "UUID cannot be empty" a special case over "UUID must contain 32 hexadecimal digits" is not worth the redundancy. It makes the code more cluttered and more difficult to understand. And why shouldn't MakeUuid("{}")
behave the same way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this check and added a test for "{}". Both now result in a satisfactory error message.
google/cloud/spanner/uuid.cc
Outdated
GOOGLE_CLOUD_CPP_INLINE_NAMESPACE_BEGIN | ||
namespace { | ||
|
||
constexpr int kMaxUuidNumberOfHexDigits = 32; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't this localized to ParseHexBlock()
just like kMaxUuidBlockLength
is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
google/cloud/spanner/uuid.h
Outdated
friend bool operator==(Uuid const& lhs, Uuid const& rhs); | ||
friend bool operator!=(Uuid const& lhs, Uuid const& rhs) { | ||
return !(lhs == rhs); | ||
} | ||
friend bool operator<(Uuid const& lhs, Uuid const& rhs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that ==
and <
are simple operations on the representation, I don't know why we wouldn't choose to inline them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
d091b8a
to
7c4a4ca
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 6 files reviewed, 21 unresolved discussions (waiting on @cuiy0006 and @devbww)
google/cloud/spanner/uuid.h
Outdated
/// Default construction yields a zero value UUID. | ||
Uuid() = default; | ||
|
||
/// Construct a UUID from a packed integer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reworded
google/cloud/spanner/uuid.h
Outdated
/// Construct a UUID from two 64 bit pieces. | ||
Uuid(std::uint64_t high_bits, std::uint64_t low_bits); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to give users a options that uses std
integer types.
google/cloud/spanner/uuid.h
Outdated
friend bool operator==(Uuid const& lhs, Uuid const& rhs); | ||
friend bool operator!=(Uuid const& lhs, Uuid const& rhs) { | ||
return !(lhs == rhs); | ||
} | ||
friend bool operator<(Uuid const& lhs, Uuid const& rhs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
google/cloud/spanner/uuid.h
Outdated
/// @name Conversion to packed integer representation. | ||
explicit operator absl::uint128() const { return uuid_; } | ||
|
||
/// @name Conversion to a lower case string using formatted: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reworded
google/cloud/spanner/uuid.h
Outdated
* Parses a textual representation a `Uuid` from a string of hexadecimal digits. | ||
* Returns an error if unable to parse the given input. | ||
* | ||
* Acceptable input strings must consist of [0-f] digits with formatting such: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expanded definition. If this still isn't a "good" definition, please suggest one.
google/cloud/spanner/uuid.cc
Outdated
|
||
std::string output; | ||
output.resize(kUuidStringLen); | ||
char* target = &((output)[output.size() - kUuidStringLen]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
De-obfuscated
output.resize(kUuidStringLen); | ||
char* target = &((output)[output.size() - kUuidStringLen]); | ||
char* const last = &((output)[output.size()]); | ||
auto bits = Uint128High64(uuid_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
absl::uint128
isn't a drop in replacement in the existing code (I tried). When we get a std::uint128_t
I'm happy to revisit.
google/cloud/spanner/uuid.cc
Outdated
} | ||
|
||
StatusOr<Uuid> MakeUuid(absl::string_view str) { | ||
if (str.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer the early exit and specific error message.
auto high_bits = ParseHexBlock(str, original_str); | ||
if (!high_bits.ok()) return std::move(high_bits).status(); | ||
auto low_bits = ParseHexBlock(str, original_str); | ||
if (!low_bits.ok()) return std::move(low_bits).status(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lacking a std::uint128_t
and absl::uint128
not behaving exactly like a std::uint128_t
, std::uint64_t
is the largest integer currently available.
google/cloud/spanner/uuid.cc
Outdated
absl::StrFormat("Extra characters (%d) found after parsing UUID: %s", | ||
str.size(), original_str), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're welcome.
Aside: "switch from absl::flat_hash_map to std::unordered_map for 32 bit compatibility" sounds wacky.
Reviewed 3 of 6 files at r1, 3 of 3 files at r6, all commit messages.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @cuiy0006 and @scotthart)
google/cloud/spanner/uuid.cc
line 31 at r6 (raw file):
StatusOr<std::uint64_t> ParseHexBlock(absl::string_view& str, absl::string_view original_str) { constexpr int kMaxUuidNumberOfHexDigits = 32;
In what sense is this a "Max" value? That implies that we could have fewer hex digits, but that is not the case.
google/cloud/spanner/uuid.cc
line 33 at r6 (raw file):
constexpr int kMaxUuidNumberOfHexDigits = 32; constexpr int kMaxUuidBlockLength = 16; static auto const* char_to_hex = new std::unordered_map<char, std::uint8_t>(
#include <unordered_map>
google/cloud/spanner/uuid.h
Outdated
/// Default construction yields a zero value UUID. | ||
Uuid() = default; | ||
|
||
/// Construct a UUID from a packed integer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, although we now have "128 bit unsigned" and "unsigned 64 bit" in consecutive comments. Consistency would be nice. And "N-bit" should be hyphenated when used as an adjective.
google/cloud/spanner/uuid.h
Outdated
/// Construct a UUID from two 64 bit pieces. | ||
Uuid(std::uint64_t high_bits, std::uint64_t low_bits); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then, there should be an accessor that returns a pair of std::uint64_t
s.
And, you need to #include <cstdint>
.
google/cloud/spanner/uuid.h
Outdated
/// @name Conversion to packed integer representation. | ||
explicit operator absl::uint128() const { return uuid_; } | ||
|
||
/// @name Conversion to a lower case string using formatted: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/ using//
google/cloud/spanner/uuid.cc
Outdated
to_hex(high_bits, 15, 8, target); | ||
*(target + kHyphenPos[0]) = '-'; | ||
to_hex(high_bits, 7, 4, target + kHyphenPos[0] + 1); | ||
*(target + kHyphenPos[1]) = '-'; | ||
to_hex(high_bits, 3, 0, target + kHyphenPos[1] + 1); | ||
*(target + kHyphenPos[2]) = '-'; | ||
to_hex(low_bits, 15, 12, target + kHyphenPos[2] + 1); | ||
*(target + kHyphenPos[3]) = '-'; | ||
to_hex(low_bits, 11, 0, target + kHyphenPos[3] + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider the operator std::string()
snippet above, which has a single loop, and doesn't do any index computations.
google/cloud/spanner/uuid.cc
Outdated
kMaxUuidNumberOfHexDigits, str), | ||
GCP_ERROR_INFO()); | ||
} | ||
std::string original_str = std::string(str); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You appear to be confusing "mutate the string_view" with "mutate the characters of the string_view". Nothing does the latter. Indeed, string_view
prohibits it. So, there is no need to copy the string.
google/cloud/spanner/uuid.cc
Outdated
|
||
std::string output; | ||
output.resize(kUuidStringLen); | ||
char* target = &((output)[output.size() - kUuidStringLen]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const_cast
ing isn't good either. Why not just ...
char* target = &output[0];
char* const last = &output[output.size()];
output.resize(kUuidStringLen); | ||
char* target = &((output)[output.size() - kUuidStringLen]); | ||
char* const last = &((output)[output.size()]); | ||
auto bits = Uint128High64(uuid_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know what you tried, but if absl::uint128
didn't work as an integral type then that would be an Abseil bug.
In any case, I'll assert that it does work.
constexpr char kHexDigits[] = "0123456789abcdef";
Uuid::operator std::string() const {
constexpr char kTemplate[] = "00000000-0000-0000-0000-000000000000";
char buf[sizeof kTemplate];
auto uuid = uuid_;
for (auto j = sizeof buf; j-- != 0;) {
if (kTemplate[j] != '0') {
buf[j] = kTemplate[j];
} else {
buf[j] = kHexDigits[static_cast<int>(uuid & 0xf)];
uuid >>= 4;
}
}
return buf;
}
google/cloud/spanner/uuid.cc
Outdated
} | ||
|
||
StatusOr<Uuid> MakeUuid(absl::string_view str) { | ||
if (str.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Early exit" is only a preferred strategy when it makes the following code simpler. Here it remains the same.
Making "UUID cannot be empty" a special case over "UUID must contain 32 hexadecimal digits" is not worth the redundancy. It makes the code more cluttered and more difficult to understand. And why shouldn't MakeUuid("{}")
behave the same way?
auto high_bits = ParseHexBlock(str, original_str); | ||
if (!high_bits.ok()) return std::move(high_bits).status(); | ||
auto low_bits = ParseHexBlock(str, original_str); | ||
if (!low_bits.ok()) return std::move(low_bits).status(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I don't know what absl::uint128
behavior you think is lacking. For example,
#include <cctype>
#include <cstring>
constexpr char kHexDigits[] = "0123456789abcdef";
StatusOr<Uuid> MakeUuid(absl::string_view str) {
absl::uint128 uuid = 0;
auto const original_str = str;
if (absl::ConsumePrefix(&str, "{")) {
if (!absl::ConsumeSuffix(&str, "}")) {
return internal::InvalidArgumentError(
absl::StrFormat("UUID missing closing '}': %s", original_str),
GCP_ERROR_INFO());
}
}
if (absl::StartsWith(str, "-")) {
return internal::InvalidArgumentError(
absl::StrFormat("UUID cannot begin with '-': %s", original_str),
GCP_ERROR_INFO());
}
constexpr int kUuidNumberOfHexDigits = 32;
for (int j = 0; j != kUuidNumberOfHexDigits; ++j) {
absl::ConsumePrefix(&str, "-");
if (str.empty()) {
return internal::InvalidArgumentError(
absl::StrFormat("UUID must contain %d hexadecimal digits: %s",
kUuidNumberOfHexDigits, original_str),
GCP_ERROR_INFO());
}
auto const* dp = std::strchr(
kHexDigits, std::tolower(static_cast<unsigned char>(str[0])));
if (dp == nullptr) {
if (str[0] == '-') {
return internal::InvalidArgumentError(
absl::StrFormat("UUID cannot contain consecutive hyphens: %s",
original_str),
GCP_ERROR_INFO());
}
return internal::InvalidArgumentError(
absl::StrFormat("UUID contains invalid character (%c): %s", str[0],
original_str),
GCP_ERROR_INFO());
}
str.remove_prefix(1);
uuid <<= 4;
uuid += dp - kHexDigits;
}
if (!str.empty()) {
return internal::InvalidArgumentError(
absl::StrFormat("Extra characters found after parsing UUID: %s", str),
GCP_ERROR_INFO());
}
return Uuid{uuid};
}
No "blocks" (or hash maps). Just a simple per-nibble loop.
e32df85
to
8593205
Compare
bccda6d
to
5fd97bc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the m32-pr build was failing with an assert from absl::flat_hash_map about the constructed from value not being the same as the key type. As inefficient as it is, std::unordered_map did not have the same issue.
Reviewable status: 3 of 6 files reviewed, 12 unresolved discussions (waiting on @cuiy0006 and @devbww)
google/cloud/spanner/uuid.cc
line 31 at r6 (raw file):
Previously, devbww (Bradley White) wrote…
In what sense is this a "Max" value? That implies that we could have fewer hex digits, but that is not the case.
Renamed
google/cloud/spanner/uuid.cc
line 33 at r6 (raw file):
Previously, devbww (Bradley White) wrote…
#include <unordered_map>
Added.
google/cloud/spanner/uuid.h
Outdated
/// Default construction yields a zero value UUID. | ||
Uuid() = default; | ||
|
||
/// Construct a UUID from a packed integer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reworded across usages for consistency.
google/cloud/spanner/uuid.h
Outdated
/// Construct a UUID from two 64 bit pieces. | ||
Uuid(std::uint64_t high_bits, std::uint64_t low_bits); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
google/cloud/spanner/uuid.h
Outdated
/// @name Conversion to packed integer representation. | ||
explicit operator absl::uint128() const { return uuid_; } | ||
|
||
/// @name Conversion to a lower case string using formatted: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
google/cloud/spanner/uuid.cc
Outdated
to_hex(high_bits, 15, 8, target); | ||
*(target + kHyphenPos[0]) = '-'; | ||
to_hex(high_bits, 7, 4, target + kHyphenPos[0] + 1); | ||
*(target + kHyphenPos[1]) = '-'; | ||
to_hex(high_bits, 3, 0, target + kHyphenPos[1] + 1); | ||
*(target + kHyphenPos[2]) = '-'; | ||
to_hex(low_bits, 15, 12, target + kHyphenPos[2] + 1); | ||
*(target + kHyphenPos[3]) = '-'; | ||
to_hex(low_bits, 11, 0, target + kHyphenPos[3] + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #15043 for future improvements as I'm out of time budget to work on this.
google/cloud/spanner/uuid.cc
Outdated
kMaxUuidNumberOfHexDigits, str), | ||
GCP_ERROR_INFO()); | ||
} | ||
std::string original_str = std::string(str); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copying the string_view instead.
google/cloud/spanner/uuid.cc
Outdated
|
||
std::string output; | ||
output.resize(kUuidStringLen); | ||
char* target = &((output)[output.size() - kUuidStringLen]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
&output[0]
turns into a clang-tidy diagnostic about using .data()
instead which until c++17 only returns a const char*
. I have multiple non-ideal choices.
output.resize(kUuidStringLen); | ||
char* target = &((output)[output.size() - kUuidStringLen]); | ||
char* const last = &((output)[output.size()]); | ||
auto bits = Uint128High64(uuid_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't try much. I changed to_hex
first parameter to an absl::uint128
and got compile errors trying to index into kHexChar
Created #15043 for future improvements as I'm out of time budget to work on this.
google/cloud/spanner/uuid.cc
Outdated
} | ||
|
||
StatusOr<Uuid> MakeUuid(absl::string_view str) { | ||
if (str.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this check and added a test for "{}". Both now result in a satisfactory error message.
auto high_bits = ParseHexBlock(str, original_str); | ||
if (!high_bits.ok()) return std::move(high_bits).status(); | ||
auto low_bits = ParseHexBlock(str, original_str); | ||
if (!low_bits.ok()) return std::move(low_bits).status(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #15043 for future improvements as I'm out of time budget to work on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to understand what "constructed value does not match the lookup key" means rather than switching to a container that doesn't make a similar check.
Reviewed 3 of 3 files at r7, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @cuiy0006)
This change is