
Conversation

@dvyukov
Collaborator

@dvyukov dvyukov commented Dec 9, 2024

  • executor: query globs in the test program context
  • executor: mount gadgetfs
  • pkg/compiler: add automatic meta
  • tools/syz-declextract: rewrite
  • pkg/declextract: rename generated names for consistency
  • pkg/declextract: refactor netlink generation
  • pkg/declextract: refine more networking types
  • pkg/declextract: emit more netlink families
  • tools/syz-declextract: extract file_operations descriptions
  • tools/syz-declextract: generate file_operations descriptions

@dvyukov dvyukov changed the title from "dvyukov declextract rewrite" to "pkg/declextract: various improvements" Dec 9, 2024
@a-nogikh
Collaborator

Please also update #5410 if/as necessary.

@dvyukov dvyukov force-pushed the dvyukov-declextract-rewrite branch 3 times, most recently from 7c3aa3d to 2e17321 Compare December 10, 2024 16:21
@dvyukov dvyukov requested a review from a-nogikh December 10, 2024 17:36
a-nogikh previously approved these changes Dec 10, 2024
a-nogikh previously approved these changes Dec 11, 2024
a-nogikh previously approved these changes Dec 11, 2024
We query globs for 2 reasons:
1. Expand glob types in syscall descriptions.
2. Dynamic file probing for automatic descriptions generation.

In both of these contexts we are interested in files
that will be present during test program execution
(rather than normal unsandboxed execution).
For example, some files may not be accessible to test programs
after pivot root. On the other hand, we create and link
some additional files for the test program that don't
normally exist.

Add a new request type for querying of globs that are
executed in the test program context.
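
To illustrate why the context matters, here is a minimal Go sketch. It is not the executor's actual implementation: the function name `expandGlob` and the idea of passing the visible files as a plain list are assumptions made for illustration. It expands a glob against whatever set of files is visible in a given context, so the same pattern can produce different results on the host and inside the test program sandbox.

```go
package main

import (
	"path"
	"sort"
)

// expandGlob returns the entries of visible that match pattern.
// visible stands for the files that exist in one particular context
// (hypothetical input: e.g. the host, or the test program after pivot root).
// Matching follows path.Match rules, so "*" does not cross "/" boundaries.
func expandGlob(visible []string, pattern string) ([]string, error) {
	var res []string
	for _, file := range visible {
		ok, err := path.Match(pattern, file)
		if err != nil {
			// The pattern itself is malformed.
			return nil, err
		}
		if ok {
			res = append(res, file)
		}
	}
	// Sort for deterministic output regardless of input order.
	sort.Strings(res)
	return res, nil
}
```

Calling `expandGlob` once with a host file list and once with a sandbox file list shows the difference: a file that disappears after pivot root drops out of the result, while a file created specifically for the test program shows up only in the sandbox expansion.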
We can reach it at least with automatic descriptions.
Mark the whole file with "meta automatic" instead of marking each syscall.
This reduces the size of descriptions and allows doing special things
with the whole file (e.g. we already treat auto consts specially).
syz-declextract accumulated a bunch of code health problems,
so now it's hard to change/extend it; lots of new features
can only be added in hacky ways and cause lots of code duplication.
It's also completely untested. Rewrite the tool to:
 - move as much code as possible to Go (working with the clang tool
   is painful for a number of reasons)
 - allow testing and add unit tests (first layer of tests test
   what information is produced by the clang tool, second layer
   of tests test how that information is transformed to descriptions)
 - allow extending the clang tool output to export arbitrary info
   in non-hacky way (now it produces arbitrary JSON instead of a mix
   of incomplete descriptions and interfaces)
 - remove code duplication in the clang tool and provide common
   infrastructure to add new analysis w/o causing more duplication
 - provide more convenient primitives in the clang tool
 - improve code style consistency and stick to the LLVM code style
   (in particular, variable names must start with a capital letter,
   single-statement blocks are not surrounded with {})
 - remove intermixing of code that works on different levels
   (currently we have AST analysis + business logic + printfs
   all intermixed with each other)
 - provide several helper Go packages for better code structuring
   (e.g. pkg/clangtool just runs the tool on source files in parallel
   and returns results, this already separates a bunch of low-level
   logic from the rest of the code under a simple abstraction)
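
As a rough sketch of the "run the tool on source files in parallel and return results" idea from the last item, the Go code below fans file names out to a fixed number of workers and collects one result per file. It does not reproduce the real pkg/clangtool API: `runParallel`, the `result` type and the `analyze` callback are hypothetical names, and the callback stands in for invoking the clang tool and reading its JSON output.

```go
package main

import "sync"

// result holds the outcome of analyzing one source file (hypothetical type).
type result struct {
	File string
	Out  string
	Err  error
}

// runParallel calls analyze for every file using up to workers goroutines
// and returns the results in the same order as files.
func runParallel(files []string, workers int, analyze func(file string) (string, error)) []result {
	if workers < 1 {
		workers = 1
	}
	results := make([]result, len(files))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each index is handed to exactly one worker,
			// so writes to results[i] never race.
			for i := range jobs {
				out, err := analyze(files[i])
				results[i] = result{File: files[i], Out: out, Err: err}
			}
		}()
	}
	for i := range files {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}
```

Keeping the process management behind one function like this means the rest of the code only sees per-file outputs and errors, which is also what makes the second layer of tests (output to descriptions) possible without running clang at all.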

I've tried to make the output match the current output as much as possible
so that the diff is manageable (in some cases at the cost of code quality,
this should be fixed in future commits). There are still some differences,
but hopefully they are manageable for review (more includes/defines,
reordered some netlink attributes).

Minor bugs are fixed along the way, but the change is mostly NFC:
1. Some unions were incorrectly emitted as [varlen]
   (C unions are never varlen).
2. Only one of the [packed] and [align[N]] attributes was emitted
   for a struct (both couldn't be emitted at once).
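
The second fix can be pictured with a small Go sketch that builds the attribute list for a struct so that `packed` and `align[N]` can appear together. The helper name `structAttrs` and its parameters are hypothetical; the actual generation code in pkg/declextract may be organized differently.

```go
package main

import (
	"fmt"
	"strings"
)

// structAttrs formats syzlang struct attributes.
// packed requests the "packed" attribute; align > 0 requests "align[N]".
// Both can be present at once; an empty string means no attributes.
func structAttrs(packed bool, align int) string {
	var attrs []string
	if packed {
		attrs = append(attrs, "packed")
	}
	if align > 0 {
		attrs = append(attrs, fmt.Sprintf("align[%d]", align))
	}
	if len(attrs) == 0 {
		return ""
	}
	return "[" + strings.Join(attrs, ", ") + "]"
}
```

Collecting attributes into a slice first, instead of choosing one branch, is what avoids the "only one of the two" behavior.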
Currently we append "$auto", or "$auto_record", or prepend "auto_",
or insert "auto" somewhere in the middle.
Use more consistent naming: always append "$auto".
Emit all information related to a single netlink family close to each other.
Previously we emitted them scattered and grouped by info type.
That was both inconvenient to emit and inconvenient to read.
NFC.
Emit families w/o policy, emit duplicate commands.
Extend the clang tool to locate file_operations variables and arrays
and dump open/read/write/mmap/ioctl callbacks for each.
It also tries to extract the set of ioctl commands and argument types
for them in a simple best-effort way (for now). It just locates the switch
in the ioctl callback and extracts each case as a command.
Emit descriptions for special files in /dev, /sys, /proc, and ./.

pkg/declextract combines file_operations info produced by the clang tool
with the dynamic probing info produced by pkg/ifaceprobe in order
to produce complete descriptions for special files.
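
A minimal Go sketch of this combination step follows. All type and field names (`FileOps`, `Probe`, `SpecialFile`, `combine`) are hypothetical, and it assumes that dynamic probing can associate each file with the name of a file_operations variable; the real data produced by the clang tool and pkg/ifaceprobe may look different.

```go
package main

// FileOps is static info about one file_operations variable
// (hypothetical shape of the clang tool output).
type FileOps struct {
	Name      string   // name of the file_operations variable
	Open      string   // open callback, empty if absent
	Read      string   // read callback, empty if absent
	Write     string   // write callback, empty if absent
	Mmap      string   // mmap callback, empty if absent
	Ioctl     string   // ioctl callback, empty if absent
	IoctlCmds []string // commands found as switch cases, best-effort
}

// Probe is dynamic info about one special file
// (assumption: probing yields the file_operations name for a file).
type Probe struct {
	File    string
	FileOps string
}

// SpecialFile is a file together with the operations it supports.
type SpecialFile struct {
	File string
	Ops  FileOps
}

// combine joins static and dynamic info by file_operations name.
// Probed files without matching static info are skipped.
func combine(ops []FileOps, probes []Probe) []SpecialFile {
	byName := make(map[string]FileOps, len(ops))
	for _, op := range ops {
		byName[op.Name] = op
	}
	var res []SpecialFile
	for _, p := range probes {
		if op, ok := byName[p.FileOps]; ok {
			res = append(res, SpecialFile{File: p.File, Ops: op})
		}
	}
	return res
}
```

Neither input is sufficient alone in this model: the static side knows which callbacks and ioctl commands exist but not which path reaches them, and the dynamic side knows the paths but not the operations.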
@dvyukov dvyukov force-pushed the dvyukov-declextract-rewrite branch from 9587f3b to b6a2c92 Compare December 11, 2024 14:46
Linter points to very large cyclomatic complexity/length of some functions.
Fix that.
This is a side-effect of making auto descriptions use sockaddr
(it contains nfc_dev_id).
@dvyukov dvyukov force-pushed the dvyukov-declextract-rewrite branch from b6a2c92 to 806e25c Compare December 11, 2024 14:59
@dvyukov dvyukov requested a review from a-nogikh December 11, 2024 15:20
@dvyukov dvyukov added this pull request to the merge queue Dec 11, 2024
Merged via the queue into google:master with commit 6f1b0fa Dec 11, 2024
17 checks passed