Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dev build locales issue #868

Closed
andychu opened this issue Dec 5, 2020 · 10 comments
Closed

dev build locales issue #868

andychu opened this issue Dec 5, 2020 · 10 comments
Labels

Comments

@andychu
Copy link
Contributor

andychu commented Dec 5, 2020

https://oilshell.zulipchat.com/#narrow/stream/266977-shell-gui/topic/dev.20build.20locales.20issue.20(Lobste.2Ers.20followup)

I could use some help from a knowledgable C programmer!

@abathur
Copy link
Collaborator

abathur commented Feb 26, 2021

I'm looking at moving resholve up to a more recent rev of oil and noticed this same break trying to build against 0.8.7 on macOS (and arrived at the same fix, before I found this issue...).

error messages
======================================================================
ERROR: testFnmatch (__main__.LibcTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "native/libc_test.py", line 98, in testFnmatch
    actual = libc.fnmatch(pat, s, False)
SystemError: Invalid locale for LC_CTYPE

======================================================================
ERROR: testFnmatchExtglob (__main__.LibcTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "native/libc_test.py", line 125, in testFnmatchExtglob
    actual = libc.fnmatch(pat, s, True)
SystemError: Invalid locale for LC_CTYPE

======================================================================
ERROR: testRegexFirstGroupMatch (__main__.LibcTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "native/libc_test.py", line 173, in testRegexFirstGroupMatch
    libc.regex_first_group_match('(X.)', s, 0))
SystemError: Invalid locale for LC_CTYPE

======================================================================
ERROR: testSpecialCharsInCharClass (__main__.LibcTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "native/libc_test.py", line 223, in testSpecialCharsInCharClass
    result = libc.regex_first_group_match(pat, s, 0)
SystemError: Invalid locale for LC_CTYPE

======================================================================
ERROR: testWcsWidth (__main__.LibcTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "native/libc_test.py", line 255, in testWcsWidth
    self.assertEqual(2, libc.wcswidth("→ "))
SystemError: C.UTF-8 is not a valid locale

----------------------------------------------------------------------
Ran 13 tests in 0.003s

FAILED (errors=5)

I'm flummoxed by locales myself, but I'll fish around a bit on Nix IRC.

@andychu
Copy link
Contributor Author

andychu commented Feb 26, 2021

Thanks for the heads up... This is something we need to fix, but reading over that thread, my understanding has not improved any :-( I havren't had time to dig deeper.

The core issue is that many libc string processing functions respect locales (regex, glob). But this is a smelly set global variables that leads to bugs. (It's quite easy to get bash to do weird things because of unicode and locales)

So Oil is UTF-8 only, and that is somewhat inspired by the fact that OpenBSD did it in their shell? I don't have the link handy right now. But the UTF-8 only model conflicts with the libc locale model.

So we either have to fix this interaction, or rewrite all our own string processing to not use libc. This is actually not as unreasonable as it sounds, because Oil is one of the only shells that uses libc's glob(). All other shells do it themselves (to support things like extended globs, etc.)

@andychu
Copy link
Contributor Author

andychu commented Mar 23, 2021

This should be fixed since I fixed #912 . That one is user-visible and this is for "devtools", so I'm closing it.

So if you git pull from master you shouldn't get this anymore. Let me know if anything doesn't work!

@andychu andychu closed this as completed Mar 23, 2021
@abathur
Copy link
Collaborator

abathur commented Mar 23, 2021

I'm not sure how quickly I'll free some mental space to play with it, but I did go add code comments in a few spots so that I hopefully won't forget :)

Feel free to poke me if this is something you'd like affirmatively tested before a release or somesuch, though--I'll make time.

@andychu
Copy link
Contributor Author

andychu commented Mar 23, 2021

I think everything will work, but the only thing that might need to change is this list of hard-coded UTF-8 locales:

https://github.com/oilshell/oil/blob/master/native/libc.c#L342

It first tries C.UTF-8 and then en_US.UTF-8. If it doesn't work you'll get a noisy error.

This only affects the Python build because of CPython internals, not oil-native (and not even the app bundle because I patched CPython). So it will affect resholve since I assume you're using a regular Python interpreter.

That list should usually work but distros are configured more differently than I thought. Apparently there is not a single string you can put for a locale to make it work everywhere!

@andychu
Copy link
Contributor Author

andychu commented Mar 23, 2021

Also note the change affects bin/oil.py way up front:

https://github.com/oilshell/oil/blob/master/bin/oil.py#L291

So depending on how you're importing Oil, you might not get this? I think it will be fine; this only affects thing like whether the glob ? matches a byte or character, or the regex . matches a byte or a character. resholve may or may not even depend on that.

Either way you will get cleaner behavior -- Oil no longer messes with locales as much. It's possible you may want to do the same thing in main() and set libc.cpython_reset_locale() to undo what CPython is doing.

@abathur
Copy link
Collaborator

abathur commented Mar 23, 2021

I do use a regular Python interpreter, at least for now. I also do some ~weird monkeypatching to force an oil.* namespace on oil's subpackages.

I guess it makes sense to go ahead and try this while the knowledge is fresh. I'll try to find time tonight. :)

@abathur
Copy link
Collaborator

abathur commented Mar 23, 2021

I gave this a quick try and after a few fixes ended up with:

...
fixing libtool script ./Python-2.7.13/Modules/_ctypes/libffi/ltmain.sh
fixing libtool script ./benchmarks/testdata/ltmain.sh
configure flags: --prefix=/nix/store/bbm8ww51k8qzywlihxf7b3gak8ykk8hy-python2.7-oildev-unstable-2021-02-26
./configure: Wrote _build/detected-config.sh and _build/detected-config.h
building
Executing setuptoolsBuildPhase
Removing _devbuild/gen/*
asdl/hnode.asdl -> (asdl/tool) -> _devbuild/gen/hnode_asdl.py
frontend/types.asdl -> (asdl/tool) -> _devbuild/gen/types_asdl.py
core/runtime.asdl -> (asdl/tool) -> _devbuild/gen/runtime_asdl.py
tools/find/find.asdl -> (asdl/tool) -> _devbuild/gen/find_asdl.py
  (frontend/consts_gen) -> _devbuild/gen/id_kind_asdl.py
  (frontend/consts_gen) -> _devbuild/gen/id_kind.py
  (frontend/option_gen) -> _devbuild/gen/option_asdl.py
  (frontend/flag_gen) -> _devbuild/gen/arg_types.py
frontend/syntax.asdl -> (asdl/tool) -> _devbuild/gen/syntax_asdl.py
asdl/demo_lib.asdl -> (asdl/tool) -> _devbuild/gen/demo_lib_asdl.py
asdl/typed_demo.asdl -> (asdl/tool) -> _devbuild/gen/typed_demo_asdl.py
asdl/typed_arith.asdl -> (asdl/tool) -> _devbuild/gen/typed_arith_asdl.py
asdl/shared_variant.asdl -> (asdl/tool) -> _devbuild/gen/shared_variant_asdl.py
oil_lang/grammar.pgen2 -> (oil_lang/grammar_gen) -> _devbuild/gen/grammar{.marshal,_nt.py,_nt.h}
tools/find/find.pgen2 -> (oil_lang/grammar_gen) -> _devbuild/gen/find{.marshal,_nt.py,_nt.h}

build/setup.py -> libc.so
native/libc.c:349:1: warning: control may reach end of non-void function [-Wreturn-type]
}
^
1 warning generated.
Traceback (most recent call last):
  File "native/libc_test.py", line 266, in <module>
    libc.cpython_reset_locale()
SystemError: error return without exception set
error: builder for '/nix/store/cpm76hp1yhgggghb0d0k8m0vrwwi2bxn-python2.7-oildev-unstable-2021-02-26.drv' failed with exit code 1
error: 1 dependencies of derivation '/nix/store/717c1gi6m3hcxqcpkbafsdnkxpfc5s1b-resholve-0.5.1.drv' failed to build
error: 1 dependencies of derivation '/nix/store/ki32rwrjivgwnsaa48hrv40gjpana1mp-resholve-ci.drv' failed to build

I'm not at all familiar with the python C api, but after a little googling I did come up with the following fix that at least got it to build and run through my test suite on macOS:

diff --git a/native/libc.c b/native/libc.c
index 9ba49e94..663bcbdb 100644
--- a/native/libc.c
+++ b/native/libc.c
@@ -346,6 +346,7 @@ func_cpython_reset_locale(PyObject *self, PyObject *unused)
   	  return NULL;
     }
   }
+  Py_RETURN_NONE;
 }
 
 #ifdef OVM_MAIN

@andychu
Copy link
Contributor Author

andychu commented Mar 24, 2021

Great catch! Somehow I didn't get a crash on my machine, although I should have heeded the compiler warning. Fixed on master:

270a93a

@abathur
Copy link
Collaborator

abathur commented Mar 24, 2021

270a93a

Cool. I retargeted a branch of resholve against this and it passed tests on local macOS + NixOS, and Ubuntu + macOS in CI. I'll probably wait a bit to move up, but it also looks like it'll let me drop a patch and the explicit LOCALE_ARCHIVE env I've been setting for one or more of the linuxes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants