<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<link rel="stylesheet" href="index.css" />
<title>Womangling</title>
</head>
<body>
<main>
<div class="content" id="content-area">
<h1>
<a class="root-link" href=".">Learn C++ Itanium Symbol Mangling</a>
</h1>
<h2>Lesson 1: Basics</h2>
<noscript>
<p>
Warning: You have JavaScript disabled. While the content is still
viewable, interactive exercises will not work. Consider enabling
JavaScript for this website.
</p>
</noscript>
<section data-step="0" class="step">
<p>
Now that you understand how this guide works and have learned that
C identifiers are not mangled, we are ready to dive into
C++.
</p>
<p>
Every mangled C++ symbol is prefixed with the string
<code>_Z</code>, which marks it as a mangled C++ symbol.
<code>_Z</code> starts with an underscore followed by an uppercase
letter. All identifiers of that form are reserved by the C
standard and cannot be used by programs. This ensures that there are
no name collisions between normal C functions and mangled C++
functions.
</p>
<p>
After that, the name of the entity is stored. For now, we will only
look at functions. For functions, the function type is appended to
the name to get the full symbol.
</p>
<pre class="code">
void f() {}
</pre>
<p>
This empty function will be mangled to <code>_Z1fv</code>. The
<code>1f</code> signifies the name (we will look at this in more
detail later in this lesson) and the <code>v</code> signifies the
function type.
</p>
<p>
We will see the <code>v</code> function type a lot in the rest of
this guide. It stands for a function that takes no arguments.
</p>
<quiz-section>
<p>
Which of these symbols cannot possibly be a mangled C++ symbol?
Answer with the name of the symbol.
</p>
<ul>
<li><code>_ZN3FooIA4_iE3barE</code></li>
<li><code>_ZN6System5Sound4beepEv</code></li>
<li><code>_RN3FooIA4_iE3barE</code></li>
</ul>
<form
data-challenge="1"
data-answer="_RN3FooIA4_iE3barE"
data-hint="Look at the prefix"
>
<input class="quiz-input" />
<button
data-challenge-submit="1"
class="submit-challenge"
type="submit"
>
Answer
</button>
<div class="error"></div>
</form>
</quiz-section>
</section>
<section data-step="1" class="step">
<p>
For names, there are two cases to consider for now. Either the name
is in the global scope, or it is in a namespace.
</p>
<p>For global names, we just prefix the name with its length.</p>
<pre class="code">
void hello_world() {}
</pre>
<p>
This will therefore get mangled as <code>_Z11hello_worldv</code>.
The length of <code>hello_world</code> is 11, so we concatenate
<code>11</code> and <code>hello_world</code>. The result is
appended to the previously mentioned prefix <code>_Z</code>, and
the type, which is just <code>v</code> here, goes at the end.
</p>
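<p>
As a minimal sketch of this rule, the C++ helper below builds the
symbol for a global function that takes no arguments. The name
<code>mangle_global</code> is made up for illustration and is not part
of any compiler or library. It only covers the case described so far.
</p>
<pre class="code">
#include &lt;string&gt;

// Hypothetical helper, only valid for global functions without arguments.
std::string mangle_global(const std::string&amp; name) {
    // Prefix, then the length of the name, then the name, then the type "v".
    return "_Z" + std::to_string(name.size()) + name + "v";
}
</pre>
<p>
For example, <code>mangle_global("hello_world")</code> yields
<code>_Z11hello_worldv</code>.
</p>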
<quiz-section>
<p>What is the mangling of the following identifier?</p>
<pre class="code">
void meow() {}
</pre>
<form
data-challenge="2"
data-answer="_Z4meowv"
data-hint="Remember the prefix and function type"
>
<input class="quiz-input" />
<button
data-challenge-submit="2"
class="submit-challenge"
type="submit"
>
Answer
</button>
<div class="error"></div>
</form>
</quiz-section>
</section>
<section data-step="2" class="step">
<p>
Functions that are declared in a namespace get a bit more
complicated. They are referred to as <i>nested names</i>, because
they are <i>nested</i> in a namespace. They can also be nested in
multiple namespaces; the encoding is the same.
</p>
<p>
Nested names start with an <code>N</code> and end with an
<code>E</code> (the <code>E</code> stands for "end" and is commonly
used to end sequences). Between those two letters, the hierarchy of
the namespaces is represented by putting one namespace name after
another, with the function name last. Every name has the leading
length and then the name itself, just like with global names.
</p>
<pre class="code">
namespace outer {
void inner() {}
}
</pre>
<p>
That means that this function will be mangled as
<code>_ZN5outer5innerEv</code>. We can decode this into the
following parts:
</p>
<ul>
<li><code>_Z</code>: Prefix</li>
<li><code>N</code>: Start of nested name</li>
<li>
<code>5outer</code>: Outer namespace, name prefixed by length
</li>
<li>
<code>5inner</code>: Inner function, name prefixed by length
</li>
<li><code>E</code>: End of nested name</li>
<li><code>v</code>: Function type</li>
</ul>
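<p>
The decoding steps above can be sketched in C++ as follows. The
function <code>split_nested_name</code> is a hypothetical helper, not a
real demangler: it assumes the symbol consists only of the prefix, an
<code>N</code>, length-prefixed names and a closing <code>E</code>, and
it returns an empty list for anything else.
</p>
<pre class="code">
#include &lt;cctype&gt;
#include &lt;cstddef&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

// Hypothetical helper: extracts the names between "_ZN" and "E".
std::vector&lt;std::string&gt; split_nested_name(const std::string&amp; symbol) {
    std::vector&lt;std::string&gt; names;
    if (symbol.compare(0, 3, "_ZN") != 0) {
        return names;  // Not a nested name in the form covered here.
    }
    std::size_t pos = 3;
    while (pos &lt; symbol.size() &amp;&amp; symbol[pos] != 'E') {
        // Read the decimal length in front of the name.
        std::size_t length = 0;
        std::size_t digits = 0;
        while (pos &lt; symbol.size() &amp;&amp;
               std::isdigit(static_cast&lt;unsigned char&gt;(symbol[pos]))) {
            length = length * 10 + static_cast&lt;std::size_t&gt;(symbol[pos] - '0');
            ++pos;
            ++digits;
        }
        if (digits == 0 || pos + length &gt; symbol.size()) {
            return {};  // Something this sketch does not understand.
        }
        names.push_back(symbol.substr(pos, length));
        pos += length;
    }
    if (pos &gt;= symbol.size()) {
        return {};  // The closing "E" is missing.
    }
    return names;
}
</pre>
<p>
Calling <code>split_nested_name("_ZN5outer5innerEv")</code> gives the
two names <code>outer</code> and <code>inner</code>.
</p>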
<p>Nested namespaces follow the same structure.</p>
<pre class="code">
namespace a {
namespace b {
namespace c {
void inner() {}
}
}
}
</pre>
<p>
This function will mangle as <code>_ZN1a1b1c5innerEv</code>. We get
all the concatenated names as <code>1a1b1c5inner</code>, with the
previously mentioned characters around them.
</p>
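<p>
The same construction can be written as a short C++ sketch. The helper
<code>mangle_nested</code> is hypothetical and only handles what this
lesson has covered: functions without arguments inside at least one
ordinary namespace. Special cases that real compilers handle are
ignored.
</p>
<pre class="code">
#include &lt;string&gt;
#include &lt;vector&gt;

// Hypothetical helper: `scopes` lists the namespaces from outermost to
// innermost and is assumed to be non-empty.
std::string mangle_nested(const std::vector&lt;std::string&gt;&amp; scopes,
                          const std::string&amp; function) {
    std::string symbol = "_ZN";  // Prefix and start of nested name.
    for (const std::string&amp; scope : scopes) {
        symbol += std::to_string(scope.size()) + scope;
    }
    symbol += std::to_string(function.size()) + function;
    symbol += "Ev";  // End of nested name, then the function type.
    return symbol;
}
</pre>
<p>
With the namespaces <code>a</code>, <code>b</code> and <code>c</code>
and the function <code>inner</code>, this produces
<code>_ZN1a1b1c5innerEv</code>.
</p>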
<quiz-section>
<p>What is the mangling of the following identifier?</p>
<pre class="code">
namespace cats {
namespace like {
void meow() {}
}
}
</pre>
<form
data-challenge="3"
data-answer="_ZN4cats4like4meowEv"
data-hint="Remember the prefix and function type, and don't forget to wrap it in the nested start and end"
>
<input class="quiz-input" />
<button
data-challenge-submit="3"
class="submit-challenge"
type="submit"
>
Answer
</button>
<div class="error"></div>
</form>
</quiz-section>
</section>
<section data-step="3" class="step">
<p>
Good job! You have successfully answered all the questions and now
know the basic makeup of an Itanium-mangled C++ symbol.
</p>
<p>
In the next lesson, we will use this knowledge to look at basic
function types beyond <code>v</code>. Mangling function types is
important for function overloading, but I don't want to overload you
with information, so feel free to take a break and let the previous
knowledge sink in.
</p>
<p class="lesson-last-paragraph">
If you want to try out more code and look at its mangling, I
recommend using Compiler Explorer on
<a href="https://godbolt.org">godbolt.org</a>. Under "Output", you
can uncheck the box to demangle identifiers to see the mangled
identifiers for any C++ code you enter on the left.
</p>
<div class="center">
<a href="lesson-2.html" class="action-button">
Lesson 2: Arguments
</a>
</div>
</section>
</div>
</main>
<script>
window.LESSON = 1;
</script>
<script type="module" src="lessons.js"></script>
</body>
</html>