ClangTool::run chdir call corrupts internal Clang file path cache (?) #515

@mattmccutchen-cci

Description

As part of the change to expand macros before running 3C, I tried to change convert_project so that instead of (1) passing an adjusted version of the union of all compiler options seen in the compilation database to 3C via -extra-arg-before, it (2) lets 3C read the options directly from the compilation database. This is because approach (1) may be wrong if different translation units have different compiler options, and I was more concerned about this as convert_project started to have more direct interaction with the preprocessor. Importantly, the adjustment in (1) included expanding relative paths in -I options to absolute paths based on the working directory of the translation unit carrying the options.
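The adjustment described in approach (1) can be sketched roughly as follows. This is a minimal illustration, not the actual convert_project code: the function name `absolutize_include_args` is hypothetical, and it assumes each compilation database entry is a dict with a `directory` key plus either an `arguments` list or a `command` string, as in the usual `compile_commands.json` layout.

```python
import os
import shlex


def absolutize_include_args(entry):
    """Return the -I options of one compilation database entry, with each
    directory made absolute relative to that entry's working directory.

    Hypothetical helper for illustration; not taken from convert_project.
    """
    # An entry carries either an "arguments" list or a "command" string.
    args = entry.get("arguments") or shlex.split(entry["command"])
    workdir = entry["directory"]
    result = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "-I" and i + 1 < len(args):
            # Separated form: -I PATH
            path = args[i + 1]
            i += 2
        elif arg.startswith("-I") and len(arg) > 2:
            # Joined form: -IPATH
            path = arg[2:]
            i += 1
        else:
            i += 1
            continue
        # os.path.join leaves an already-absolute PATH unchanged.
        # normpath collapses ".." textually, which can differ from the real
        # target if symlinks are involved (an accepted simplification here).
        result.append("-I" + os.path.normpath(os.path.join(workdir, path)))
    return result
```

For example, an entry with working directory `/proj/src/avl` and the option `-I..` would yield `-I/proj/src` on a POSIX system.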

Unfortunately, this change seemed to cause our icecast benchmark to trigger a bug in Clang LibTooling. The symptom looks like this:

2021-03-26 14:52:27.200 INFO generate_ccommands - run3C: Running:/home/matt/3c-3.wt/build/bin/3c -dump-stats -p /home/matt/benchmarks/icecast-2.4.4/compile_commands.json -extra-arg=-w -base-dir="/home/matt/benchmarks/icecast-2.4.4" -output-dir="/home/matt/benchmarks/icecast-2.4.4/out.checked" /home/matt/benchmarks/icecast-2.4.4/src/format_flac.c /home/matt/benchmarks/icecast-2.4.4/src/format_ogg.c /home/matt/benchmarks/icecast-2.4.4/src/format_kate.c /home/matt/benchmarks/icecast-2.4.4/src/main.c /home/matt/benchmarks/icecast-2.4.4/src/format_mp3.c /home/matt/benchmarks/icecast-2.4.4/src/sighandler.c /home/matt/benchmarks/icecast-2.4.4/src/global.c /home/matt/benchmarks/icecast-2.4.4/src/cfgfile.c /home/matt/benchmarks/icecast-2.4.4/src/format_ebml.c /home/matt/benchmarks/icecast-2.4.4/src/event.c /home/matt/benchmarks/icecast-2.4.4/src/auth_htpasswd.c /home/matt/benchmarks/icecast-2.4.4/src/refbuf.c /home/matt/benchmarks/icecast-2.4.4/src/avl/avl.c /home/matt/benchmarks/icecast-2.4.4/src/format_vorbis.c /home/matt/benchmarks/icecast-2.4.4/src/connection.c /home/matt/benchmarks/icecast-2.4.4/src/util.c /home/matt/benchmarks/icecast-2.4.4/src/admin.c /home/matt/benchmarks/icecast-2.4.4/src/log/log.c /home/matt/benchmarks/icecast-2.4.4/src/format_opus.c /home/matt/benchmarks/icecast-2.4.4/src/thread/thread.c /home/matt/benchmarks/icecast-2.4.4/src/client.c /home/matt/benchmarks/icecast-2.4.4/src/timing/timing.c /home/matt/benchmarks/icecast-2.4.4/src/net/resolver.c /home/matt/benchmarks/icecast-2.4.4/src/stats.c /home/matt/benchmarks/icecast-2.4.4/src/net/sock.c /home/matt/benchmarks/icecast-2.4.4/src/source.c /home/matt/benchmarks/icecast-2.4.4/src/slave.c /home/matt/benchmarks/icecast-2.4.4/src/format_skeleton.c /home/matt/benchmarks/icecast-2.4.4/src/logging.c /home/matt/benchmarks/icecast-2.4.4/src/fserve.c /home/matt/benchmarks/icecast-2.4.4/src/auth.c /home/matt/benchmarks/icecast-2.4.4/src/format_midi.c /home/matt/benchmarks/icecast-2.4.4/src/md5.c 
/home/matt/benchmarks/icecast-2.4.4/src/format.c /home/matt/benchmarks/icecast-2.4.4/src/xslt.c /home/matt/benchmarks/icecast-2.4.4/src/httpp/httpp.c
avl.c:33:11: fatal error: cannot open file '../config.h': No such file or directory
 #include <config.h>
          ^
avl.c:33:11: fatal error: cannot open file '../config.h': No such file or directory
 #include <config.h>
          ^
[...more similar errors...]

My rough theory is as follows. Clang has a cache: the first time it sees #include STR (where STR is of the form <PATH> or "PATH"), it searches the include path for the first matching file and caches the path at which it found the file (to a first approximation, the concatenation of the -I directory with PATH). If Clang later sees #include STR again, it tries to open the cached path directly and raises a fatal error (seen above) if that fails. The problem arises when the cached path is relative, which can occur if the directory specified via -I was relative. ClangTool::run iterates over the specified translation units, and for each one, it does a chdir to the working directory specified in the compilation database but (apparently) does not invalidate the cache. Consequently, if different translation units have different working directories, the preprocessor may try and fail to open a cached relative path because the working directory has changed since the path was added to the cache, when it should instead repeat the include search. Surprisingly, #488 did not fix the problem because ClangTool::buildASTs still calls ClangTool::run internally (!).
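To make the theory concrete, here is a toy model of the suspected behavior. It is not Clang's actual data structure or code: `ToyIncludeCache` and `demo` are hypothetical names, and the directory layout (a `config.h` at the project root, with translation units in `src/` and `src/avl/`) is an assumption inferred from the error message above.

```python
import os
import tempfile


class ToyIncludeCache:
    """Toy model of the theorized cache; NOT Clang's real implementation."""

    def __init__(self):
        self._cache = {}  # include name -> path where it was first found

    def resolve(self, name, include_dirs):
        if name in self._cache:
            # Theorized behavior: reuse the cached path without searching again.
            cached = self._cache[name]
            if not os.path.exists(cached):
                raise FileNotFoundError(f"cannot open file '{cached}'")
            return cached
        for d in include_dirs:
            candidate = os.path.join(d, name)
            if os.path.exists(candidate):
                self._cache[name] = candidate  # may be a relative path
                return candidate
        raise FileNotFoundError(name)


def demo():
    """Resolve config.h from two working directories with one shared cache."""
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "src", "avl"))
        open(os.path.join(root, "config.h"), "w").close()
        cache = ToyIncludeCache()
        try:
            # First translation unit: working directory src/, option -I..
            os.chdir(os.path.join(root, "src"))
            first = cache.resolve("config.h", [".."])
            # Second translation unit: working directory src/avl/, option -I../..
            os.chdir(os.path.join(root, "src", "avl"))
            try:
                cache.resolve("config.h", ["../.."])
                second = "ok"
            except FileNotFoundError as err:
                second = str(err)
        finally:
            os.chdir(original_cwd)
    return first, second
```

In this model, the second lookup fails even though `../..` would find the file, because the stale relative path `../config.h` is tried first from the new working directory. Whether Clang's real cache behaves this way is exactly what remains to be confirmed.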

Here is the original benchmark workflow run in which the problem appeared (though the logs will probably expire from GitHub soon). It should be possible to reproduce the problem by re-running that revision of the preprocess-before-conversion workflow (mwhicks1/3c-actions@7651529) on the corresponding revision of the preprocess-before-conversion branch of this repository (c113b1d). We could probably construct a smaller test case with a compilation database with two entries (and presumably that's what we would do if we wanted to add a regression test for the problem to 3C), but I don't want to take the time to do that now.
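A smaller test case along those lines might be generated as sketched below. This is an untested assumption about what would suffice: the function name `write_two_entry_project`, the file names, and the file contents are all hypothetical, and I have not verified that this layout actually triggers the problem in 3C.

```python
import json
import os


def write_two_entry_project(root):
    """Create a tiny project whose two translation units have different
    working directories and reach the same header via relative -I options.

    Hypothetical layout for a candidate regression test.
    """
    sources = {
        "config.h": "#define ANSWER 42\n",
        os.path.join("src", "a.c"): "#include <config.h>\nint a = ANSWER;\n",
        os.path.join("src", "sub", "b.c"): "#include <config.h>\nint b = ANSWER;\n",
    }
    for rel, text in sources.items():
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(text)

    # Two entries with different "directory" values and relative -I options.
    entries = [
        {
            "directory": os.path.join(root, "src"),
            "file": "a.c",
            "arguments": ["cc", "-I..", "-c", "a.c"],
        },
        {
            "directory": os.path.join(root, "src", "sub"),
            "file": "b.c",
            "arguments": ["cc", "-I../..", "-c", "b.c"],
        },
    ]
    with open(os.path.join(root, "compile_commands.json"), "w") as f:
        json.dump(entries, f, indent=2)
    return entries
```

Running 3C with `-p` pointing at the generated `compile_commands.json` and both source files on the command line would then be the experiment to try.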

In a web search, I found a few reports of similar-looking problems (1, 2), but it doesn't appear that anyone has tracked down the details and formally reported the bug in the Clang bug database. We could do so if we wish, assuming (as Mike pointed out) that it still reproduces on the latest main branch of the LLVM monorepo; to test that, we'd probably want to write a trivial LibTooling-based app rather than try to upgrade the whole checkedc-clang codebase to Clang main just for this test.

Ultimately, we'll probably want to fix or work around this problem somehow so that end users get correct behavior when running 3C on a compilation database like that of icecast. For now, I'm planning to work around the problem in convert_project by restoring the legacy behavior of passing -extra-arg-before to 3C, but only for the absolute versions of -I options. Since we use -extra-arg-before, this will ensure that every included file is found via an absolute -I directory before we reach the relative ones in the compilation database, so the cached path will be absolute, avoiding the problem. In principle, this could be wrong if different translation units have different sets of resolved -I directories: if we apply the union of the -I directories to all translation units, then a translation unit could use a file from an -I directory that was not supposed to be active for that translation unit, when it was intended to use a file from a later -I directory instead. However, I don't believe this happens in any of our current benchmarks.
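The workaround and its caveat can be sketched as follows. This is an illustration rather than the convert_project implementation: the function names are hypothetical, the input is assumed to be one list of already-resolved absolute -I directories per translation unit, and the first-seen ordering of the union is an arbitrary choice.

```python
def union_extra_args(per_tu_dirs):
    """Build the union of absolute -I directories across translation units
    and the matching -extra-arg-before options.

    per_tu_dirs: one list of absolute include directories per translation
    unit (assumed to be resolved already).
    """
    union = []
    for dirs in per_tu_dirs:
        for d in dirs:
            if d not in union:  # keep first-seen order, drop duplicates
                union.append(d)
    return union, ["-extra-arg-before=-I" + d for d in union]


def foreign_dirs(tu_dirs, union):
    """Directories that the union adds to a translation unit that did not
    originally have them. A non-empty result marks a unit where the
    wrong-file risk described above could in principle arise."""
    return [d for d in union if d not in tu_dirs]
```

For instance, with `per_tu_dirs = [["/proj", "/proj/src/include"], ["/proj"]]`, the second translation unit gains `/proj/src/include`, which `foreign_dirs` reports. A check like this could flag benchmarks where the union is not harmless.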

Labels: Upstream (Work item that has to deal with making changes to upstream.), bug (Something isn't working), clang preprocessor, command-line
