Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix POSIX compliance issues uncovered by smoosh (Run smoosh test suite and make a report) #331

Open
andychu opened this issue Jun 10, 2019 · 10 comments

Comments

@andychu
Copy link
Contributor

andychu commented Jun 10, 2019

From Zulip, the way to run them against OSH is:

TEST_SHELL=osh TEST_DEBUG=1 make

I did this and got 89/153 on OSH 0.6.pre21, while @mgree got 86/153 (probably due to some environment differences).

shell_tests.sh: shell/sh.ps1.override: expected STDERR shell/sh.ps1.override.err differs from logged STDERR /home/andy/git/languages/smoosh/tests/log/2019-06-10_16:31/shell/sh.ps1.override.err
shell_tests.sh: shell/sh.ps1.override failed
shell_tests.sh: shell/sh.set.ifs passed
shell_tests.sh: 89/153 tests passed
Makefile:11: recipe for target 'shell' failed
make: *** [shell] Error 1

The format that is very compatible with our spec tests (code snippet, stdout, stderr, exit status).

https://github.com/mgree/smoosh/tree/master/tests

It should be easy to adapt test/sh_spec.py to run them and produce an HTML report.

Conversely, I will also try running smoosh in our spec tests.


@mgree FYI I just read through the repo and test harness and realized that these tests should be pretty easy to adapt (without any reformatting).

Then we will get a report like this: http://www.oilshell.org/git-branch/dev/andy-14/6da1da1a/spec/

which I like because you can scan across a row to see where shells agree and disagree.

I'm not sure when I'll get to this but I think it will be useful to run on a regular basis.

As mentioned I would also accept PRs to add annotations to the Oil test suite for smoosh. But I think it makes sense to try to import your 153 test cases as well, and doesn't look too difficult.

@andychu
Copy link
Contributor Author

andychu commented Jun 10, 2019

notes:

  • from Zulip, dash passes 113 / 153 tests
  • POSIX has 494 tests but the suite isn't freely available

@mgree
Copy link

mgree commented Jun 10, 2019

I love the idea of a public leaderboard of shell characteristics. I think incorporating some of modernish's FTL_ and BUG_ and QRK_ tests along with modernish --test would be a good move, too.

Please incorporate whatever tests you like! They're under MIT license; a URL and a copy of my license somewhere near those tests would be great.

What do you need from me in order to have smoosh show up on that leaderboard? I know the build process is complicated, but I could probably package up something nicer for release (e.g., drop the Lem dependency).

@andychu
Copy link
Contributor Author

andychu commented Jun 11, 2019

I think the first cut can just read the testdata data as is from your repo, but we'll see. The format is very similar to what I use.

I didn't try building smoosh yet, but I'll let you know if I have any issues.

If we find another bug or two in OSH I'll consider it worth it :)

@andychu
Copy link
Contributor Author

andychu commented Jun 11, 2019

@mgree Also I remembered I linked to a ksh test suite here, which was forked for mksh:

http://www.oilshell.org/blog/2017/06/22.html

According to the README there are some cases that are considered POSIX. Not sure exactly how they are labeled.

I never tried running these tests, probably because OSH was too immature back then. They also have a very similar format, although it requires some parsing (done in an 1000-line Perl script).

name: xxx-set-option-1
description:
	Check option parsing in set
stdin:
	set -vsA foo -- A 1 3 2
	echo ${foo[*]}
expected-stderr:
	echo ${foo[*]}
expected-stdout:
1 2 3 A

@andychu
Copy link
Contributor Author

andychu commented Jul 27, 2019

Just ran them with 0.7.pre2, got 97/157.

A couple issues:

  • the tests sometimes assert stderr has a specific value, but I think those are trivial textual changes
    • disabling stderr checks makes it 102/157
  • I'm running my oshrc, which makes things slower, and causes timeouts!!!
    • timeouts cause exit status 143
    • I disabled oshrc and still got 97 failures, so hm that wasn't a correctness issue
+  shopt -s globstar  # long overdue, try out wc -l **/*.py
+  ^~~~~
+/home/andy/git/dotfiles//interactive.bash:23: 'shopt' got invalid option 'globstar'
+Terminated
shell_tests.sh: shell/sh.ps1.override: expected STDERR shell/sh.ps1.override.err differs from logged STDERR /home/andy/git/languages/smoosh/tests/log/2019-07-26_20:53/shell/sh.ps1.override.err
shell_tests.sh: shell/sh.ps1.override failed
shell_tests.sh: shell/sh.set.ifs passed
shell_tests.sh: 97/157 tests passed
Makefile:11: recipe for target 'shell' failed
make: *** [shell] Error 1

@andychu
Copy link
Contributor Author

andychu commented Jul 27, 2019

Known differences - https://www.oilshell.org/release/0.7.pre2/doc/known-differences.html

  • semantics.escaping.quote.err fails, I believe this is the backtick issue documented

Special builtin rule:

  • parse.eval.error.test -- this is because source doesn't fail as a special builtin, like bash but unlike POSIX shells
  • semantics.special-assign.visible.nonposix -- bindings before : are visible

Real bugs:

  • semantics.var.star.emptyifs.test -- seems like a legit IFS difference
  • shell/builtin.alias.empty.test -- empty alias gives parse error, other shells accept
  • builtin.dot.path.test -- differs from bash? But similar to dash? check it again.
  • builtin.history.nonposix.test -- for some reason OSH doesn't exit, but other shells do
  • several tilde expansion failures
    • semantics.tilde.sep.test -- seems like a legit difference, but not all shells agree
      • several tilde tests fail, look into these
  • a whole bunch of trap failures. This could also be unimplemented stuff.
    • builtin.trap.supershell.test --
      • osh -c "(trap 'echo so long' EXIT)" doesn't do anything, but it does in other shells
    • semantics.kill.traps.err -- maybe a broken invariant: osh: PID 32142 stopped, but osh didn't start it

Fixed:

  • builtin.test.bigint.test -- OverflowError from Python !!!

Minor:

  • shell/builtin.eval.break.test gives a misleading error message in OSH. eval break thinks it's in a different process. TODO: If we change it to a builtin, it might work?
  • sh.interactive.ps1.* fail due to prompt being written to stdout rather than stderr, which I noticed too (echo exit | osh --rcfile /dev/null -i)
  • semantics.backtick.ppid.test -- I think OSH fails because the shell wrapper starts another
    process

Unimplemented:

  • shell/builtin.times.ioerror -- times builtin not implemented
    • ditto for benchmark.fact5, benchmark.while
  • shell/builtin.export.unset.test -- export -p not implemented
  • shell/builtin.command.ec.test -- command -V not implemented
  • implement printf %b to do backslash escaping #357 Implementing printf %b would fix a couple
  • sh.interactive.ps1.test, sh.ps1.override.test -- because prompt goes to stdout and not stderr,

Probably don't care:

  • {break,continue}.nonlexical
  • semantics.interactive.expansion.exit -- OSH and bash fail earlier

This was referenced Jul 27, 2019
andychu pushed a commit that referenced this issue Jul 27, 2019
Caught by smoosh test suite.

Related to #331.
@andychu
Copy link
Contributor Author

andychu commented Jul 27, 2019

This one was also caught by the test suite:

dd42678

andychu pushed a commit that referenced this issue Jul 28, 2019
- Shell paths are now absolute
- Add a bunch of flags to sh_spec.py to support smoosh
- Don't import a few tests that can't be run even with 'timeout 1s'.
  TODO: Debug this.

Running 149 cases out of 156 or so.

Output:
FATAL: 249 tests failed (72 osh failures)

(note: 249 counts each shell failure separately.)

Addresses issue #331.
andychu pushed a commit that referenced this issue Jul 28, 2019
- Adjust the environment
- Add support for a timeout result, show in purple
- Tweak summary text

Addresses issue #331.
andychu pushed a commit that referenced this issue Jul 28, 2019
- smoosh.test.sh runs with the coreutils timeout binary
- smoosh-hang.test.sh runs with the smoosh timeout shell script

Also:

- Add --rm-tmp to clear the test dir.
- Handle stdout without trailing newline

Addresses issue #331.
@andychu
Copy link
Contributor Author

andychu commented Jul 29, 2019

Latest results:

http://www.oilshell.org/git-branch/dev/andy-14/b15b66a8/spec/smoosh.html

http://www.oilshell.org/git-branch/dev/andy-14/b15b66a8/spec/smoosh-hang.html (tests that seem to require SIGKILL)

After fixing a bunch of issues, this seems pretty accurate.

  • osh on smoosh: 150 tests, 98 pass, 42 fail, 10 timeout
  • osh on smoosh-hang: 6/6 fail (or maybe timeout)

andychu pushed a commit that referenced this issue Jul 29, 2019
@andychu andychu changed the title Run smoosh test suite and make a report Fix POSIX compliance issues uncovered by smoosh (Run smoosh test suite and make a report) Nov 12, 2019
@andychu
Copy link
Contributor Author

andychu commented Feb 3, 2022

Looking at trap tests now

https://www.oilshell.org/release/0.9.7/test/spec.wwz/survey/smoosh.html

It also disagrees on case 53, but that's intended since OSH is stricter about handling invalid KILL and 9 signals.

builtin.trap.kill.undef.test

Other trap cases don't seem to have that much consensus from shells

@andychu
Copy link
Contributor Author

andychu commented Feb 3, 2022

Also OSH disagrees with 3 shells on

semantics.var.star.emptyifs.test

which might be related to #707

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants