-
Notifications
You must be signed in to change notification settings - Fork 172
Reimplement GetWindowsLocalTimeZone without icu.h
#329
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
derekmauro
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for contributing this. I only have some minor comments.
4af4c81 to
ffa61d1
Compare
a193211 to
d47ce9c
Compare
| // This function is intended to be lock free to avoid potential deadlocks | ||
| // with loader-lock taken inside LoadLibraryW. As LoadLibraryW and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intended to be lock-free so this is a kind of feature. Added a comment at the beginning of the function.
I don't know why "lock-free" is a goal here. What deadlocks are possible if we only call LoadLibraryW() at most once?
That is, for my money, GetWindowsLocalTimeZone() should just say something like ...
std::string local_time_zone;
static const auto getTimeZoneIDForWindowsID = LoadIcuGetTimeZoneIDForWindowsID();
if (getTimeZoneIDForWindowsID != nullptr) {
// Set local_time_zone.
...
}
return local_time_zone;
No need for atomic anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why "lock-free" is a goal here. What deadlocks are possible if we only call
LoadLibraryW()at most once?
Win32 is an interesting environment, where things like LoadLibraryW(), GetProcAddress(), and DllMain() share the same lock.
As explained in #316 (comment), I guess deadlocks are still possible in the following scenario.
Suppose there are two running thread X and thread Y.
- Thread X enters
DllMain, where the operating system implicitly acquires Loader lock. - Thread Y acquires the lock to initialize
static const auto getTimeZoneIDForWindowsIDvariable. It then get blocked atLoadLibraryWbecause the thread X still holds the Loader Lock. - Thread X also tries to read
static const auto getTimeZoneIDForWindowsIDthen gets blocked because it is not yet fully initialized.
The current approach with std::atomic_* doesn't have the above dead lock issue, as it is just an optimistic optimization that does not try to enforce the exactly-once initialization.
Let me know if I misunderstood your question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devbww Let me know if there is any remaining concerns on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess my question boils down to situation 3. Why/how would a thread inside DllMain be calling CCTZ's GetWindowsLocalTimeZone()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why/how would a thread inside DllMain be calling CCTZ's GetWindowsLocalTimeZone()?
In my observation, people do not directly call cctz::local_time_zone() but end up calling it through higher level libraries that internally use CCTZ. A most notable example would be abseil and protobuf.
For instance, just calling absl::InitializeLog() ends up internally calling GetWindowsLocalTimeZone().
void InitializeLog() { InitializeLogImpl(absl::LocalTimeZone()); }Other commonly-used methods that internally trigger GetWindowsLocalTimeZone() would be:
absl::AbslStringify(absl::Time)absl::FormatTime(absl::Time)
In the real world DllMain implementations, it becomes extremely difficult for us to see if methods like absl::InitializeLog() and absl::FormatTime(absl::Time) are indirectly called or not. Here is an excerpt from Chromium code base.
BOOL CGaiaCredentialProviderModule::DllMain(HINSTANCE /*hinstance*/,
DWORD reason,
LPVOID reserved) {
switch (reason) {
case DLL_PROCESS_ATTACH: {
exit_manager_ = std::make_unique<base::AtExitManager>();
_set_invalid_parameter_handler(InvalidParameterHandler);
// Initialize base. Command line will be set from GetCommandLineW().
base::CommandLine::Init(0, nullptr);
// Initialize logging.
logging::LoggingSettings settings;
settings.logging_dest = logging::LOG_NONE;
std::wstring log_file_path =
GetGlobalFlagOrDefault(kRegLogFilePath, std::wstring{});
if (not log_file_path.empty()) {
settings.logging_dest = logging::LOG_TO_FILE;
bool append_log = GetGlobalFlagOrDefault(kRegLogFileAppend, 0);
settings.delete_old = append_log ? logging::APPEND_TO_OLD_LOG_FILE
: logging::DELETE_OLD_LOG_FILE;
settings.log_file_path = log_file_path;
}
logging::InitLogging(settings);
logging::SetLogItems(true, // Enable process id.
true, // Enable thread id.
true, // Enable timestamp.
false); // Enable tickcount.
logging::SetEventSource("GCPW", GCPW_CATEGORY, MSG_LOG_MESSAGE);
if (GetGlobalFlagOrDefault(kRegEnableVerboseLogging, 0))
logging::SetMinLogLevel(logging::LOGGING_VERBOSE);
break;
}
case DLL_PROCESS_DETACH:
LOGFN(VERBOSE) << "DllMain(DLL_PROCESS_DETACH)";
// When this DLL is loaded for testing, don't reset the command line
// since it causes tests to crash.
if (!is_testing_)
base::CommandLine::Reset();
_set_invalid_parameter_handler(nullptr);
exit_manager_.reset();
crash_reporter::DestroyCrashpadClient();
break;
default:
break;
}
return ATL::CAtlDllModuleT<CGaiaCredentialProviderModule>::DllMain(reason,
reserved);
}I actually don't know whether methods like absl::FormatTime(absl::Time) and absl::FormatTime(absl::Time) are called or not in the above code, but the fact it's hard to see is already a good reason to make it safe in the first place, I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I'm going to defer to your expertise here.
Still, it seems to me that it is a fundamental flaw having interfaces that take a lock, while having others that can call out to arbitrary code while holding the same lock. If that didn't happen everyone would be better off, and we wouldn't have to perform these dances.
e9571db to
1e8d3bb
Compare
derekmauro
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to get the presubmit to run on this.
1e8d3bb to
af9947a
Compare
af9947a to
c8bc1fb
Compare
src/time_zone_name_win.cc
Outdated
| // UChar is defined as char16_t in ICU by default, but it is also safe to assume | ||
| // wchar_t and char16_t are equivalent on Windows. | ||
| using UChar = wchar_t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So why not just define it as char16_t and remove the need to comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main motivation is to avoid reinterpret_cast around Win32 APIs, which receive wchar_t* instead of char16_t*. Rephrased the comment to clarify it.
src/time_zone_name_win.cc
Outdated
| } | ||
|
|
||
| std::wstring result; | ||
| // TODO: Remove std::max() when we require C++17 or higher. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this referring to &result[0] being undefined when result.empty()? If so, then I believe anything after C++11 is sufficient. Either way it is worth clarifying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, you are right. Removed TODO comment for simplicity.
src/time_zone_name_win.cc
Outdated
| while (true) { | ||
| UErrorCode status = U_ZERO_ERROR; | ||
| result.resize(len); | ||
| // TODO: Switch to std::string::data() when we require C++17 or higher. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this referring to data() returning a pointer to non-const? It is worth clarifying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed TODO as &result[0] remains to be sufficient.
| if (U_SUCCESS(status)) { | ||
| return Utf16ToUtf8(result.data(), len); | ||
| } | ||
| if (status != U_BUFFER_OVERFLOW_ERROR) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably also fail if len != static_cast<std::int32_t>(len), as that would cause us to pass the wrong length on the next loop.
Indeed, that test should come before the getTimeZoneIDForWindowsID() call in case result.capacity() is initially beyond std::int32_t limits. That's highly unlikely, I know, but why not get it exactly right anyway.
Alternatively, the initial len could be capped at the maximum std::int32_t.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Initialized len with std::int32_t as the maximum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that using &result[0] (here and elsewhere) while claiming to support C++11 still requires us to bound the initial len from below at 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/time_zone_name_win.cc
Outdated
| } | ||
| const int chars_len = static_cast<int>(size); | ||
| std::string result; | ||
| // TODO: Remove std::max() when we require C++17 or higher. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the other comment on excluding the empty initial capacity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Initialized len with std::int32_t as the maximum.
src/time_zone_name_win.cc
Outdated
| result.resize(len); | ||
| // TODO: Switch to std::string::data() when we require C++17 or higher. | ||
| len = ::WideCharToMultiByte(CP_UTF8, WC_ERR_INVALID_CHARS, ptr, chars_len, | ||
| &result[0], static_cast<int>(result.size()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the other comment on checking for int overflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Initialized len with std::int32_t as the maximum.
src/time_zone_name_win.cc
Outdated
| do { | ||
| result.resize(len); | ||
| // TODO: Switch to std::string::data() when we require C++17 or higher. | ||
| len = ::GetSystemDirectoryW(&result[0], static_cast<UINT>(result.size())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the other comment on checking for cast truncation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Initialized len with std::uint32_t as the maximum.
| // This function is intended to be lock free to avoid potential deadlocks | ||
| // with loader-lock taken inside LoadLibraryW. As LoadLibraryW and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess my question boils down to situation 3. Why/how would a thread inside DllMain be calling CCTZ's GetWindowsLocalTimeZone()?
| if (U_SUCCESS(status)) { | ||
| return Utf16ToUtf8(result.data(), len); | ||
| } | ||
| if (status != U_BUFFER_OVERFLOW_ERROR) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that using &result[0] (here and elsewhere) while claiming to support C++11 still requires us to bound the initial len from below at 1.
| // This function is intended to be lock free to avoid potential deadlocks | ||
| // with loader-lock taken inside LoadLibraryW. As LoadLibraryW and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I'm going to defer to your expertise here.
Still, it seems to me that it is a fundamental flaw having interfaces that take a lock, while having others that can call out to arbitrary code while holding the same lock. If that didn't happen everyone would be better off, and we wouldn't have to perform these dances.
This commit reimplements my previous commits [1][2], which removed the dependency on WinRT on Windows by using system "icu.dll" (google#315) when <icu.h> is available in the build environment. Here are major improvements: * The relevant code is extracted into separate source files "src/time_zone_name_win.{cc,h}". This is also a preparation for implementing TimeZoneIf with Windows time APIs (google#328), where one more ICU API needs to be dynamically imported. * The dependency on <icu.h> is removed by locally defining required constants and API prototypes. This should make it possible to build CCTZ with the MinGW toolchain, where <icu.h> is not yet available. * LoadLibraryExW(L"icu.dll", nullptr, LOAD_LIBRARY_SEARCH_SYSTEM32) is replaced with LoadLibraryW with the full path to "icu.dll" in the System32 directory, because LOAD_LIBRARY_SEARCH_SYSTEM32 can be ignored when "icu.dll" is already loaded from a non-System32 directory. Let's stick to the system "icu.dll" in favor of predictability. Other than the above improvements, the overall logic remains unchanged. [1]: 2ebbe0d [2]: f8e494f
c900357 to
401182f
Compare
devbww
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks. I'll leave it to @derekmauro to run the checks and give final approval.
This commit reimplements my previous commits (2ebbe0d)(f8e494f), which removed the dependency on WinRT on Windows by using system "icu.dll" (#315) when
<icu.h>is available in the build environment.Here are major improvements:
The relevant code is extracted into separate source files
src/time_zone_name_win.{cc,h}. This is also a preparation for implementingTimeZoneIfwith Windows time APIs, where one more ICU API needs to be dynamically imported.cctz::TimeZoneIfimplementation that is compatible with Windows time APIs #328The dependency on
<icu.h>is removed by locally defining required constants and API prototypes. This should make it possible to build CCTZ with the MinGW toolchain, where<icu.h>is not yet available.LoadLibraryExW(L"icu.dll", nullptr, LOAD_LIBRARY_SEARCH_SYSTEM32)is replaced withLoadLibraryWwith the full path toicu.dllin the System32 directory, becauseLOAD_LIBRARY_SEARCH_SYSTEM32can be ignored whenicu.dllis already loaded from a non-System32 directory. Let's stick to the systemicu.dllin favor of predictability.Other than the above improvements, the overall logic remains unchanged.