| 1 | +# Introduction |
| 2 | + |
| 3 | +We have a rudimentary call tracing system in place which can record native, Java and managed (sort of) |
| 4 | +stack traces when necessary. However, this system was quickly slapped together in order to try |
| 5 | +to find out why marshal methods break Blazor apps. The system doesn't optimize collection and delivery, |
| 6 | +which makes it heavy (e.g. it relies on logcat to log large stack traces) and its results hard to read. |
| 7 | + |
| 8 | +This branch is an attempt to fix the above downsides and implement a generic (as far as we are concerned - |
| 9 | +not for arbitrary use in other products) framework to store timed events in a thread-safe manner and with |
| 10 | +as little work at runtime as possible, but with options to do more when necessary. |
| 11 | + |
| 12 | +# Design goals and notes |
| 13 | + |
| 14 | +## Collection |
| 15 | + |
| 16 | +Records are stored in pre-allocated buffers, one per thread. A buffer is allocated when a thread |
| 17 | +attaches to the runtime; its pointer is stored in TLS as well as in some central structure for further |
| 18 | +processing. At runtime, the pointer from TLS is used to store events, thus enabling lockless operation. |
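
The per-thread buffer scheme described above could look roughly like the following. This is a minimal sketch, not the branch's implementation: `TraceEvent`, `ThreadBuffer`, `BufferRegistry`, `tls_buffer` and `trace_event` are hypothetical names, and dropping events when the buffer is full is just one possible policy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical event record; the real layout is not fixed by this document.
struct TraceEvent {
    uint64_t timestamp_ns;
    uint32_t kind;
};

// Pre-allocated, fixed-capacity buffer owned by exactly one thread.
class ThreadBuffer {
public:
    explicit ThreadBuffer(std::size_t capacity) : events_(capacity) {
        assert(capacity > 0);
    }

    // Only the owning thread calls this, so no lock is taken.
    bool store(const TraceEvent& event) {
        if (used_ == events_.size()) {
            return false; // buffer full: drop the event (assumed policy)
        }
        events_[used_++] = event;
        return true;
    }

    std::size_t size() const { return used_; }

private:
    std::vector<TraceEvent> events_;
    std::size_t used_ = 0;
};

// Per-thread pointer, set once when the thread attaches.
thread_local ThreadBuffer* tls_buffer = nullptr;

// Central structure that keeps every buffer reachable for the dump step.
class BufferRegistry {
public:
    // The lock is taken once per thread, at attach time, never per event.
    ThreadBuffer* attach_current_thread(std::size_t capacity) {
        std::unique_ptr<ThreadBuffer> buffer(new ThreadBuffer(capacity));
        ThreadBuffer* raw = buffer.get();
        {
            std::lock_guard<std::mutex> guard(mutex_);
            buffers_.push_back(std::move(buffer));
        }
        tls_buffer = raw;
        return raw;
    }

    std::size_t buffer_count() {
        std::lock_guard<std::mutex> guard(mutex_);
        return buffers_.size();
    }

private:
    std::mutex mutex_;
    std::vector<std::unique_ptr<ThreadBuffer>> buffers_;
};

// Hot path: store an event through the TLS pointer, without locking.
inline bool trace_event(const TraceEvent& event) {
    return tls_buffer != nullptr && tls_buffer->store(event);
}
```

The only lock in this sketch guards registration, which happens once per thread; the event path touches nothing but the thread's own buffer.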
| 19 | + |
| 20 | +The collection process can be started at the following points: |
| 21 | + |
| 22 | + * startup of the application |
| 23 | + * after an initial delay |
| 24 | + * by a p/invoke at the app's discretion |
| 25 | + * by an external signal/intent (signal might be faster) |
| 26 | + |
| 27 | +The collection process can be stopped at the following points: |
| 28 | + |
| 29 | + * exit from the application |
| 30 | + * after a designated delay from the start |
| 31 | + * by a p/invoke at the app's discretion |
| 32 | + * by an external signal/intent |
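
As an illustration of the start and stop triggers listed above, the sketch below uses a single atomic switch that a p/invoke entry point or a signal handler can flip. All names (`g_collecting`, `tracing_start`, `tracing_stop`, `tracing_toggle_handler`, `install_toggle_signal`) are hypothetical, and the sketch assumes `std::atomic<bool>` is lock-free on the target so that it is safe to touch from a signal handler.

```cpp
#include <atomic>
#include <csignal>

// Global switch checked by every trace point (hypothetical name).
static std::atomic<bool> g_collecting{false};

// Hypothetical entry points an app could reach via p/invoke.
extern "C" void tracing_start() {
    g_collecting.store(true, std::memory_order_release);
}

extern "C" void tracing_stop() {
    g_collecting.store(false, std::memory_order_release);
}

// Cheap check for trace points to call before recording anything.
inline bool tracing_enabled() {
    return g_collecting.load(std::memory_order_acquire);
}

// Signal handler: flips the switch and does nothing else.
extern "C" void tracing_toggle_handler(int) {
    g_collecting.store(!g_collecting.load(std::memory_order_acquire),
                       std::memory_order_release);
}

// Hooks the given signal; returns false if the handler could not be installed.
inline bool install_toggle_signal(int signo) {
    return std::signal(signo, tracing_toggle_handler) != SIG_ERR;
}
```

Delayed start or stop could reuse the same switch from a timer; that part is left out here.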
| 33 | + |
| 34 | +`Buffer` is used loosely here: the collected data may be kept in a linked list or some other form of |
| 35 | +container. |
| 36 | + |
| 37 | +Each trace point may indicate that it wants to store one or more stack traces (native, Java and managed). |
| 38 | + |
| 39 | +### Native call traces |
| 40 | + |
| 41 | +If a call stack trace is to be collected, gather: |
| 42 | + |
| 43 | + 1. Shared library name, or its address if the name is not available (we might be able to get away with using just |
| 44 | + the address, if we can glean the address from the memory map post-mortem) |
| 45 | + 2. Entry point address |
| 46 | + |
| 47 | +### Java call traces |
| 48 | + |
| 49 | +These will require some JNI work as it might not be possible to map stack frame addresses to Java methods |
| 50 | +post-mortem. |
| 51 | + |
| 52 | +### Managed call traces |
| 53 | + |
| 54 | +Might be heavy, may require collecting all the info at run time. |
| 55 | + |
| 56 | +## Delivery |
| 57 | + |
| 58 | +At the point where collection is stopped, the results are dumped to a location on the device, into a single |
| 59 | +structured file that is optionally compressed (probably with `lz4`). |
| 60 | + |
| 61 | +The dumped data should contain enough information to identify native frames, most likely the process |
| 62 | +memory map(s). |
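
To show how a dumped memory map could identify native frames post-mortem, here is a small sketch that parses lines in the Linux `/proc/<pid>/maps` layout (`start-end perms offset dev inode [path]`) and resolves an address to a mapping. `MapEntry`, `parse_maps_line`, `find_mapping` and `file_offset` are hypothetical helpers, and the sketch assumes the dump stores the map in that textual form.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// One mapping taken from a /proc/<pid>/maps style line.
struct MapEntry {
    uint64_t start;
    uint64_t end;      // exclusive
    uint64_t offset;   // file offset at which the mapping starts
    std::string path;  // empty for anonymous mappings
};

// Parses "start-end perms offset dev inode [path]"; returns false on malformed input.
// Paths containing spaces are cut at the first space in this sketch.
inline bool parse_maps_line(const std::string& line, MapEntry& out) {
    unsigned long long start = 0, end = 0, offset = 0;
    char perms[5] = {0};
    char path[256] = {0};
    int fields = std::sscanf(line.c_str(), "%llx-%llx %4s %llx %*s %*s %255s",
                             &start, &end, perms, &offset, path);
    if (fields < 4 || end <= start) {
        return false;
    }
    out.start = static_cast<uint64_t>(start);
    out.end = static_cast<uint64_t>(end);
    out.offset = static_cast<uint64_t>(offset);
    out.path = (fields == 5) ? path : "";
    return true;
}

// Finds the mapping that contains `address`, or nullptr if none does.
inline const MapEntry* find_mapping(const std::vector<MapEntry>& maps, uint64_t address) {
    for (const MapEntry& entry : maps) {
        if (address >= entry.start && address < entry.end) {
            return &entry;
        }
    }
    return nullptr;
}

// Converts a runtime address into an offset within the mapped file.
inline uint64_t file_offset(const MapEntry& entry, uint64_t address) {
    return address - entry.start + entry.offset;
}
```

With this kind of lookup done offline, the runtime would only need to record raw frame addresses.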
| 63 | + |
| 64 | +Events collected from threads are stored in separate "streams" in the output file; no ordering or processing |
| 65 | +is done at this point. |
| 66 | + |
| 67 | +## Time stamping |
| 68 | + |
| 69 | +Each event is time-stamped using the highest resolution clock available. No ordering of events is attempted |
| 70 | +at run time. |
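
A minimal sketch of the time stamping follows. It assumes `std::chrono::steady_clock` is an acceptable stand-in for "the highest resolution clock available" (the actual clock choice is not settled here), and it shows one way ordering could be done offline instead of at run time. `now_ns`, `StampedEvent` and `merge_streams` are hypothetical names.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <vector>

// Monotonic timestamp in nanoseconds (steady_clock is an assumption).
inline uint64_t now_ns() {
    using namespace std::chrono;
    return static_cast<uint64_t>(
        duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count());
}

// Hypothetical event shape: a timestamp plus the stream it came from.
struct StampedEvent {
    uint64_t timestamp_ns;
    uint32_t thread_index;
};

// Offline step: merge the per-thread streams into one time-ordered list.
inline std::vector<StampedEvent> merge_streams(
        const std::vector<std::vector<StampedEvent>>& streams) {
    std::vector<StampedEvent> merged;
    for (const std::vector<StampedEvent>& stream : streams) {
        merged.insert(merged.end(), stream.begin(), stream.end());
    }
    // stable_sort keeps the per-thread order of events with equal timestamps.
    std::stable_sort(merged.begin(), merged.end(),
                     [](const StampedEvent& a, const StampedEvent& b) {
                         return a.timestamp_ns < b.timestamp_ns;
                     });
    return merged;
}
```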
| 71 | + |
| 72 | +## Multi-threaded collection |
| 73 | + |
| 74 | +No locks should be used: a pointer to the buffer is stored in the thread's TLS and in some central structure for use when |
| 75 | +dumping. |
| 76 | + |
| 77 | +## Managed call tracing |
| 78 | + |
| 79 | +While the MonoVM runtime can let us know when every method is called, we don't want to employ this technique here |
| 80 | +because it would require filtering out the methods we're not interested in at run time, and that means string |
| 81 | +allocation and comparison, which wastes too much time. Instead we will have a system in place that makes the whole |
| 82 | +tracing nimbler. |
| 83 | + |
| 84 | +### Assembly and type tracing |
| 85 | + |
| 86 | +Each trace record must identify the assembly the method is in, the class token and the method token. Assemblies are |
| 87 | +not identified by their MVID, but rather by an ordinal number assigned to them at build time and stored somewhere |
| 88 | +in `obj/` for future reference. Each instrumented method has hard-coded class and method tokens within that |
| 89 | +assembly. |
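
One possible shape for such a trace record is sketched below. The struct layout and the names `ManagedTraceRecord`, `token_table`, `token_row` and `is_well_formed` are hypothetical. The only fact relied on is that ECMA-335 metadata tokens carry the table id in the top byte (`0x02` for TypeDef, `0x06` for MethodDef) and the row index in the low 24 bits.

```cpp
#include <cstdint>

// Hypothetical fixed-size record for one instrumented managed call.
struct ManagedTraceRecord {
    uint64_t timestamp_ns;
    uint32_t assembly_ordinal; // ordinal assigned to the assembly at build time
    uint32_t class_token;      // TypeDef metadata token, e.g. 0x02000005
    uint32_t method_token;     // MethodDef metadata token, e.g. 0x06000012
};

// ECMA-335 table ids stored in the top byte of a metadata token.
constexpr uint32_t kTypeDefTable = 0x02;
constexpr uint32_t kMethodDefTable = 0x06;

// Table id of a token (top 8 bits).
constexpr uint32_t token_table(uint32_t token) { return token >> 24; }

// Row index of a token (low 24 bits).
constexpr uint32_t token_row(uint32_t token) { return token & 0x00FFFFFFu; }

// Sanity check a post-processing tool could run on each record.
inline bool is_well_formed(const ManagedTraceRecord& record) {
    return token_table(record.class_token) == kTypeDefTable &&
           token_table(record.method_token) == kMethodDefTable &&
           token_row(record.class_token) != 0 &&
           token_row(record.method_token) != 0;
}
```

Because every field is a fixed-width integer, storing a record involves no string handling at run time.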
| 90 | + |
| 91 | +### Filters |
| 92 | + |
| 93 | +We'll provide a way to specify a filter so that only the desired methods/classes are instrumented. Filters apply to assemblies |
| 94 | +after linking but **before** AOT since we must instrument the methods before AOT processes them. There are three kinds |
| 95 | +of targets: |
| 96 | + |
| 97 | + * `Methods in types`. |
| 98 | + A regex might be a good idea here. |
| 99 | + * `Full class name`. |
| 100 | + By default all methods of the type are instrumented; it should be possible to exclude some (regex). |
| 101 | + * `Marshal method wrappers`. |
| 102 | + They are **not** covered by the above two kinds and have to be enabled explicitly. Once enabled, all are instrumented |
| 103 | + by default, but it is possible to filter them with a regex. Both inclusion and exclusion must be supported. |
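
The include/exclude semantics of such a filter can be illustrated as follows. This is only a sketch of the matching logic: the real filter would run in the build pipeline, before AOT, and its configuration format is not defined here. `InstrumentationFilter`, `matches_any` and the `Type::Method` naming scheme are assumptions made for the example.

```cpp
#include <regex>
#include <string>
#include <vector>

// True if `name` fully matches at least one of the patterns.
inline bool matches_any(const std::vector<std::regex>& patterns, const std::string& name) {
    for (const std::regex& pattern : patterns) {
        if (std::regex_match(name, pattern)) {
            return true;
        }
    }
    return false;
}

// Hypothetical filter: a method is instrumented if it matches an inclusion
// pattern and no exclusion pattern.
struct InstrumentationFilter {
    std::vector<std::regex> include;
    std::vector<std::regex> exclude;

    bool selects(const std::string& full_method_name) const {
        return matches_any(include, full_method_name) &&
               !matches_any(exclude, full_method_name);
    }
};
```

For example, including `MyApp\.Views\..*` and excluding `.*::get_.*` would select `MyApp.Views.MainPage::OnClick` but skip the `get_Title` accessor of the same type.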