Description
Allow the user of 3C to specify files and directories as "open world" instead of the default "closed world". For closed world files, 3C would continue using its current assumption that it has a complete program with all dependencies and clients available. Open world files drop this assumption. Dependencies might not be available (manifesting as undefined functions), and there might be more clients than are in the available source code (so inferred types must accommodate unchecked callers without casting).
Much of the unique behavior enabled for open world files would be the same as can already be controlled with the -itypes-for-extern and -infer-types-for-undefs flags. It might be possible to redefine them in terms of applying open or closed world behavior to the entire project in order to avoid duplicating similar logic. Alternatively, they could be deprecated in favor of whatever new mechanism is added to specify open world files.
Closed world
We assume 3C has access to all callers of all functions, and all uses of all structures, global variables, and typedefs. This matches 3C's current assumptions, so behavior should not change in this mode.
- A function can be rewritten to a checked type if it is internally checked, regardless of its callers, because we have access to all callers and can automatically insert casts. Functions with an unsafe definition are rewritten using itypes. Undefined functions are treated as unchecked because we can assume that no definition exists. Functions must be defined for linking to succeed, so a missing definition means something is clearly wrong, and the function should be given an unchecked type instead of a fully checked type or an itype. This will manifest as a warning in the root cause analysis if the function is called.
- Structure fields and global variables can be checked if all uses of the field are safe. Any unsafe use forces a pointer to use an unchecked type (or an itype if we implement that).
- Typedefs follow similar rules to structures and globals. They can be checked if all uses are checked. Checked C does not support any notion of itypes on typedefs, but 3C does have limited support for isolating unsafe uses.
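As a rough illustration of the function rule above, the plain C below compiles as is, and the comments show the Checked C signatures that the closed world rules would be expected to produce. The function names are hypothetical, and treating the pointer cast as the unsafe use is an assumption about what 3C considers unsafe, not a statement about its exact output.

```c
/* Internally safe: the pointer is only dereferenced.
 * Under the closed world rules this could become
 *     int read_val(_Ptr<int> p)
 * because a cast can be inserted at any unchecked caller. */
int read_val(int *p) {
    return *p;
}

/* Internally unsafe (assumed): the cast to an incompatible pointer type
 * keeps p unchecked inside the body. The signature could still advertise
 * a checked type to callers through an itype:
 *     int first_byte(int *p : itype(_Ptr<int>)) */
int first_byte(int *p) {
    unsigned char *bytes = (unsigned char *)p;
    return bytes[0];
}

/* Declared but defined nowhere in the program. Under the closed world
 * rules the declaration would stay unchecked, exactly as written. */
int *lookup(int key);
```

Because `lookup` is never called here, the missing definition does not affect linking; in a real project a call to it is what would surface in the root cause analysis.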
Open world
We cannot assume that 3C can see all function callers, function definitions, etc., so the analysis must be adjusted to permit arbitrary types in the missing code. This will act like some combination of the current -itypes-for-extern and -infer-types-for-undefs flags.
- A function may never be rewritten to a fully checked type because there may be unsafe calls that we are not aware of and into which we cannot insert casts. An internally safe function must be rewritten using an itype instead (as is the case for internally unsafe functions). This matches the behavior with -itypes-for-extern. Undefined functions are handled as they are with the current -infer-types-for-undefs flag. Since the function definition is not visible, we must conservatively treat the definition as unchecked. Undefined functions will then be internally unsafe, and will be rewritten to use itypes. An open question here is how 3C should go about forcing a function parameter to be an itype. Currently, -itypes-for-extern does this by moving checked types into itypes only during rewriting, but this is known to cause invalid rewriting in some cases. Instead, the internal constraint variable for the parameter might be constrained to WILD. This avoids potential Checked C type errors, but limits conversion of local variables inside the function. Similar questions exist for structure fields, global variables, and typedefs.
- Structure fields and global variables cannot be fully checked for the same reason. The current version of 3C cannot infer itypes here, so if they were constrained to WILD, they would solve to fully unchecked types. Improvements to 3C should add support for itypes on structure fields and global variables with unsafe uses. They would then rewrite to itypes, matching the current behavior of -itypes-for-extern.
- Typedefs again follow similar rules, but with adjustments due to the lack of itypes. All typedefs must be unchecked to accommodate any potential unchecked declaration using the typedef. Function parameter declarations using the unchecked typedef use the current itype workaround, where the typedef is used as the unchecked portion of an itype with the typedef expanded in the checked portion. Again, this matches the current behavior of -itypes-for-extern. The workaround seems unsatisfying. If every typedef is unchecked, then every local variable using the typedef would have to remain unchecked as well. Other ideas have been discussed for duplicating typedefs into checked and unchecked versions.
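The closed and open world rules can be summarized as a small decision table. The sketch below is only a model of the rules as stated in this proposal, not 3C's implementation; every type and function name in it is hypothetical, and the open world result for fields and globals assumes the proposed itype support has been added.

```c
#include <stdbool.h>

/* Hypothetical names, introduced only for this sketch. */
typedef enum { CLOSED_WORLD, OPEN_WORLD } World;
typedef enum { REWRITE_CHECKED, REWRITE_ITYPE, REWRITE_UNCHECKED } Rewrite;

/* Function signatures. */
Rewrite function_rewrite(World world, bool is_defined, bool internally_safe) {
    if (world == OPEN_WORLD) {
        /* Unknown callers may be unchecked, and an undefined body is
         * conservatively treated as unsafe, so the answer is always an itype. */
        return REWRITE_ITYPE;
    }
    if (!is_defined) {
        /* Closed world: a missing definition signals a problem. */
        return REWRITE_UNCHECKED;
    }
    return internally_safe ? REWRITE_CHECKED : REWRITE_ITYPE;
}

/* Structure fields and global variables. */
Rewrite field_rewrite(World world, bool all_uses_safe) {
    if (world == OPEN_WORLD) {
        /* Assumes itypes on fields and globals are supported. */
        return REWRITE_ITYPE;
    }
    return all_uses_safe ? REWRITE_CHECKED : REWRITE_UNCHECKED;
}

/* Typedefs: Checked C has no itypes on typedefs. */
Rewrite typedef_rewrite(World world, bool all_uses_checked) {
    if (world == OPEN_WORLD) {
        return REWRITE_UNCHECKED;
    }
    return all_uses_checked ? REWRITE_CHECKED : REWRITE_UNCHECKED;
}
```

For example, `function_rewrite(CLOSED_WORLD, true, true)` yields `REWRITE_CHECKED`, while the same function in an open world file yields `REWRITE_ITYPE`.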
Example Use: converting a library header
In the libjpeg tutorial, the jpeglib.h header file was copied into a local include directory. 3C was then re-run with -infer-types-for-undefs to enable solving for and inserting checked types into the local copy of the header file even though the functions in the header were not defined.
After the changes proposed here, the flag passed to 3C would then be -open-world=./include to specify that files in ./include use the open world assumptions. All other files use the (default) closed world assumptions. When 3C is re-run, the open world assumption allows the undefined functions in jpeglib.h to solve to itypes as before. The changes are:
- Any undefined functions outside of ./include are still unchecked.
- Structure fields and global variables rewrite to itypes instead of fully checked types. This should allow the converted version of the header to more easily be used to compile the unconverted libjpeg source code. Previously these itypes needed to be added manually for this to be possible.
- Typedefs are completely unchecked. This is a problem and will substantially limit conversion in to_ppm.c due to local variables using the typedef remaining unchecked.
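The proposal does not say how a file would be matched against the directory given to -open-world. One possible reading is a textual check on path components, sketched below. The helper name is hypothetical, and the sketch deliberately does no path normalization (no handling of "..", symlinks, or absolute versus relative forms), which a real implementation would likely need.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if `path` is `dir` itself or lies beneath it.
 * The comparison is purely textual and respects component boundaries, so
 * "./include_old/x.h" is not treated as being inside "./include". */
bool is_open_world(const char *path, const char *dir) {
    size_t n = strlen(dir);

    /* Ignore trailing slashes so "./include/" behaves like "./include". */
    while (n > 1 && dir[n - 1] == '/') {
        n--;
    }
    if (n == 0 || strncmp(path, dir, n) != 0) {
        return false;
    }
    if (dir[n - 1] == '/') {
        return true; /* dir is the root directory "/" */
    }
    /* The match must end exactly at a component boundary. */
    return path[n] == '\0' || path[n] == '/';
}
```

With this reading, `./include/jpeglib.h` is open world under `-open-world=./include`, while `./to_ppm.c` keeps the default closed world assumptions.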
Example Use: converting a single file in a project
The current approach is to enable -itypes-for-extern, convert with 3C, and then keep only the converted header files and the single source file you want to convert. Instead, the whole project could be specified as open world with -open-world=. (the current directory). This would make all functions solve to itypes, all typedefs be unchecked, and all structure fields and global variables use itypes instead of checked types. The changes from current behavior are:
- Any undefined functions will also solve to itypes. Since -itypes-for-extern is only expected to be enabled in phase two of porting, it might be reasonable to assume that there are no undefined functions outside of any previously specified open world files, since these could be warnings or errors as discussed earlier.