
Commit 2214b53

committed
decrypt then eval writeup
1 parent 8407bf4 commit 2214b53

File tree

13 files changed

+278
-2
lines changed

mkdocs.yml

+1-1
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,7 @@ nav:
1313
- Number mashing: ./number_mashing/index.md
1414
- Intercepted transmissions: ./transmissions/index.md
1515
- Vector overflow: ./vector/index.md
16-
# - Decrypt then eval: ./decrypt_eval/index.md
16+
- Decrypt then eval: ./decrypt_eval/index.md
1717
# - Yawa: ./yawa/index.md
1818
# - DNAdecay: ./dna/index.md
1919
# - Sign in: ./sign_in/index.md

solve/decrypt_eval/index.md

+256-1
@@ -17,8 +17,29 @@ Solved: 197
1717

1818
Input files:
1919

20-
??? info "encoding.txt"
20+
??? info "decrypt-then-eval.py"
21+
```py
22+
#!/usr/bin/env python3
2123

24+
from Crypto.Cipher import AES
25+
import os
26+
27+
KEY = os.urandom(16)
28+
IV = os.urandom(16)
29+
FLAG = os.getenv('FLAG', 'DUCTF{testflag}')
30+
31+
def main():
32+
    while True:
33+
        ct = bytes.fromhex(input('ct: '))
34+
        aes = AES.new(KEY, AES.MODE_CFB, IV, segment_size=128)
35+
        try:
36+
            print(eval(aes.decrypt(ct)))
37+
        except Exception:
38+
            print('invalid ct!')
39+
40+
if __name__ == '__main__':
41+
    main()
42+
```
2243

2344
NB:
2445

@@ -32,12 +53,246 @@ NB:
3253
Ie, first character of string `Hello World!` is `H`, fifth is `o`.
3354

3455
* Solution code was redacted for readability purposes. Due to time pressure during the competition I was using a lot of one-letter variables and questionable code structure.
56+
57+
* I am using gdb with [pwndbg](https://github.com/pwndbg/pwndbg) plugin
3558

3659
## My struggle
3760

61+
### Analysis
62+
63+
We got only one file to start with:
64+
65+
```py
66+
KEY = os.urandom(16) # AES params
67+
IV = os.urandom(16)
68+
FLAG = os.getenv('FLAG', 'DUCTF{testflag}')  # flag variable; this will be our target
69+
70+
def main():
71+
    while True:
72+
        ct = bytes.fromhex(input('ct: '))  # read input string
73+
        aes = AES.new(KEY, AES.MODE_CFB, IV, segment_size=128)  # create AES cipher
74+
        try:
75+
            print(eval(aes.decrypt(ct)))  # decrypt input string, evaluate the result
76+
        except Exception:
77+
            print('invalid ct!')
78+
```
79+
Our goal is to make `aes.decrypt` return the string `FLAG`; evaluating it will then print the value of the
80+
`FLAG` variable back to us.
81+
82+
AES is considered to be a secure algorithm. If it's used correctly, it's practically unbreakable. The key part of
83+
this statement is "if used correctly". The key issue with this implementation is that the cipher is created afresh with `AES.new` for every
84+
user input. Given that IV and KEY are the same every time, the same keystream is generated for each input (with 128-bit segments this holds for the first 16-byte block). Combined with the fact
85+
that CFB mode is used, we can control the result of decryption even though we will never know the values of KEY and IV.
86+
87+
Let's review the strategy for controlling the output of AES decryption in CFB mode when the same keystream is applied to every input
88+
that we provide. The program executes the following algorithm:
89+
90+
1. Read user input;
91+
2. Generate the same keystream every time for the given KEY and IV;
92+
3. XOR the input with the keystream;
93+
4. Evaluate the result;
94+
5. If the expression is evaluated successfully, print the result; otherwise print "invalid ct!".
95+
96+
If we know the keystream, we can easily construct an input that will give us any desired output. For example, if the first keystream byte
97+
is 0x67 and we want the first output byte to be 'F' (ASCII value 0x46), then the input we are looking for is `0x67 ^ 0x46 = 0x21`. The same calculation
98+
works for any other byte of the keystream.
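
The XOR relation above can be sketched in a few lines of Python. This is only an illustration, not part of the challenge code: the keystream bytes and the helper name `forge_input` are made up for the example.

```python
# Hypothetical keystream bytes, chosen only for illustration;
# the real keystream is unknown until we recover it.
keystream = bytes([0x67, 0x13, 0xA0, 0x5C])

def forge_input(keystream: bytes, desired: bytes) -> bytes:
    """Return the input that XORs with `keystream` to give `desired`."""
    return bytes(k ^ d for k, d in zip(keystream, desired))

payload = forge_input(keystream, b"FLAG")
print(payload.hex())  # first byte is 0x67 ^ 0x46 = 0x21

# XOR is its own inverse: applying the keystream again restores the target
assert forge_input(keystream, payload) == b"FLAG"
```

Because XOR is its own inverse, the same helper serves both directions: it forges an input from a known keystream, and it recovers the keystream from a known input/output pair.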
99+
100+
### Attempt 1
101+
102+
How would we find the keystream? My first idea was to loop through all possible inputs until the first and last characters of the output
103+
are double quotes; then everything in the middle is treated as a string literal and printed back.
104+
Once I have the bytes in the middle, I can calculate the keystream and, from it, any input.
105+
106+
Pseudocode that I used for this:
107+
108+
```py title="attempt_1.py"
109+
payload = [0] * 16  # our 16-byte input (the source code sets segment_size=128, i.e. 16 bytes)
110+
# when the same keystream byte is XOR-ed with all 256 possible input values,
111+
# the output also covers all 256 possible values;
112+
# one of them will be the double quote that I am looking for
113+
for i in range(256):  # try all possible first bytes
114+
    for j in range(256):  # try all possible last bytes
115+
        payload[0] = i  # set first byte to i, last byte to j, all others stay 0
116+
        payload[15] = j
117+
        io.sendline(binascii.hexlify(bytearray(payload)))  # send input to the decrypt-eval program
118+
        response = io.recvline().strip()  # read result line
119+
        # if we got something interesting, print it; I expect double-quoted and single-quoted strings
120+
        # and maybe some other inputs that are randomly valid
121+
        if b'invalid ct!' not in response:
122+
            print("We received response that is not error: ", response)
123+
```
124+
125+
I ran the program and, to my surprise, got nothing. There must be some other evaluation errors. I modified the source code of the decrypt-eval program
126+
to print more debug information while keeping the functionality intact and improving
127+
performance. With this version I can iterate much quicker:
128+
129+
```py title="modified decrypt-then-eval.py"
130+
aes = AES.new(KEY, AES.MODE_CFB, IV, segment_size=128)  # create the AES instance once, at the start of the program
131+
keystream = aes.decrypt(bytearray(16))  # decrypting the all-zero input [0, 0, ..., 0] yields the keystream
132+
133+
def main():
134+
    while True:
135+
        ct = bytes.fromhex(input('ct: '))
136+
        try:
137+
            print(eval(bytes(k ^ c for k, c in zip(keystream, ct))))  # XOR keystream with the provided input
138+
        except Exception as e:
139+
            print('invalid ct!', e)  # add exception details to the message
140+
```
141+
Once I reran my enumeration script, I saw the errors raised by `eval`:
142+
143+
1. source code string cannot contain null bytes;
144+
2. invalid UTF-8 encoding.
145+
146+
It looks like there is a bad sequence of bytes somewhere in the middle of the string. So far, all middle input bytes were 0. I think
147+
we should try different values to deal with the encoding problems. For the null-byte error we should try both 0 and 1 as input
148+
(at most one of them can produce a null byte, never both).
149+
150+
The pair `127, 128` should take care of the invalid UTF-8 sequences. UTF-8 encodes characters as variable-length byte sequences:
151+
frequently used characters such as the Latin alphabet and digits take only 1 byte, while less frequently used ones (emoji etc.)
152+
take 2-4 bytes. Decoding is quite straightforward: the leading bits of the first byte tell how long the sequence is, and every
153+
following byte of the sequence must start with the bits `10`. The remaining bits are concatenated to form the codepoint value. For example:
154+
```
155+
0XXXXXXX -> 1-byte sequence, codepoint is XXXXXXX
156+
110AAAAA 10BBBBBB -> 2-byte sequence, codepoint is AAAAABBBBBB
157+
1110AAAA 10BBBBBB 10CCCCCC -> 3-byte sequence, codepoint is AAAABBBBBBCCCCCC
158+
...
159+
```
160+
All 1-byte sequences are valid (they match the ASCII table for backwards compatibility), but a byte with the highest bit set is only
161+
valid in the right position of a multi-byte sequence, which random bytes rarely form. This is where we get encoding errors. If every byte were a 1-byte sequence, we would
162+
not have decoding errors at all. The pair `127, 128` should take care of it: the two values differ in the highest bit, so one of them
163+
gives an output byte that starts with 0, and we can reach an output where every byte starts with 0 and is not a null byte.
164+
165+
The resulting candidate values to try for input bytes 1..14 are `[0, 1, 127, 128]`.
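
To see why these four candidates help, here is a small sketch. The keystream byte values and the helper `ascii_candidates` are assumptions for illustration only. For a given keystream byte it lists the candidate inputs that produce a non-null byte with the highest bit cleared, i.e. a plain ASCII byte that always decodes.

```python
CANDIDATES = [0, 1, 127, 128]

def ascii_candidates(keystream_byte: int) -> list[int]:
    """Candidates whose XOR with the keystream byte is a non-null ASCII byte."""
    return [c for c in CANDIDATES if 0 < (keystream_byte ^ c) < 0x80]

# A stray byte with the highest bit set is not valid UTF-8 on its own
try:
    bytes([0x41, 0x80]).decode("utf-8")
    stray_ok = True
except UnicodeDecodeError:
    stray_ok = False

print(stray_ok)
print(ascii_candidates(0x67))  # keystream byte with the highest bit clear
print(ascii_candidates(0xE3))  # keystream byte with the highest bit set
print(ascii_candidates(0x80))  # the unlucky case: no candidate works
```

When the keystream byte has its highest bit set, only `128` can clear it, so a keystream byte of exactly `0x80` leaves no usable candidate. That is the "very unlucky" case.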
166+
167+
### Attempt 2
38168

169+
Here is a script that enumerates all values 0..255 for bytes 0 and 15, and the vocabulary [0, 1, 127, 128]
170+
for bytes 1..14, as discussed earlier, to deal with encoding errors.
171+
172+
Feel free to skip the implementation of the function `input_generator` as long as you understand the sequence that it's producing and
173+
the reasoning for why we want this sequence (I think the implementation is not too important for understanding the challenge).
174+
The loop at the end of the snippet is mostly the same as before.
175+
176+
```py title="attempt_2.py"
177+
# Generate sequence of states:
178+
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
179+
# [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
180+
# ...
181+
# [255, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
182+
# [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
183+
# [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
184+
# ...
185+
# [255, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
186+
# [0, 127, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
187+
# [1, 127, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
188+
# ...
189+
def input_generator():
190+
    state = [0] * 16
191+
    possible_values = [range(256) if i == 0 or i == 15 else [0, 1, 127, 128] for i in range(16)]
192+
    while True:
193+
        yield [possible_values[i][state[i]] for i in range(len(state))]
194+
        for i in range(len(state)):
195+
            if state[i] < len(possible_values[i]) - 1:
196+
                state[i] += 1
197+
                break
198+
            else:
199+
                if i == len(state) - 1:
200+
                    return
201+
                state[i] = 0
202+
203+
204+
for state in input_generator():
205+
    io.sendline(binascii.hexlify(bytearray(state)))  # send input to the local binary
206+
    response = io.recvline().strip()  # read result line
207+
    # if we got something interesting, print it; I expect double-quoted and single-quoted strings
208+
    # and maybe some other inputs that are randomly valid
209+
    if b'invalid ct!' not in response:
210+
        print("We received response that is not error: ", response)
211+
```
212+
213+
While the script was running I had some time (actually quite a lot of time) to calculate the total number of iterations. The formula
214+
is trivial: `number of possible values for byte 0` * `number of possible values for byte 1` * `number of possible values for byte 2` * ...
215+
216+
256 * (4 ** 14) * 256 = 17592186044416
217+
218+
No wonder it takes a lot of time!
219+
220+
My immediate thought was to reduce the number of states for bytes 1..14 from `[0, 1, 127, 128]` to `[0, 128]`; unless we are very unlucky,
221+
this should also work. It's also easy to update the script.
222+
223+
But it doesn't seem to be enough.
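
A quick back-of-the-envelope check of both search spaces; the rate of 1000 attempts per second is an assumed figure, used purely to get a sense of scale:

```python
full = 256 * 4**14 * 256     # bytes 0 and 15: 256 values, bytes 1..14: 4 values each
reduced = 256 * 2**14 * 256  # bytes 1..14 reduced to the two values [0, 128]

RATE = 1000  # assumed attempts per second, not a measured number
print(full, reduced)
print(f"full: ~{full / RATE / 86400 / 365:.0f} years")
print(f"reduced: ~{reduced / RATE / 86400:.1f} days")
```

Even the reduced space is about a billion attempts, which is why halving the vocabulary is still not enough.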
224+
225+
On second thought, I am running against a local application that performs no cryptography, only an XOR of two arrays.
226+
It's orders of magnitude faster than the remote script. Therefore, a proper solution should finish locally in under a few seconds.
227+
Besides that, all the heavy lifting of cryptography is done on the server side; it's very unlikely the author expects every team to run
228+
thousands of cryptography iterations each, as this would raise nontrivial scalability and cost questions.
229+
230+
### Attempt 3
231+
232+
My new idea for the desired output: the first byte is a digit, the second byte is `#`, and the other bytes don't matter as
233+
they will be treated as a comment and hence ignored. The complexity of a naive implementation of such an algorithm would be `256*256`,
234+
but I am sure that, given all digits are consecutive in the ASCII table and any of them works for us, there is a smart lookup of the first byte
235+
that will reduce the algorithm complexity to `25*256` iterations.
236+
Quick prototyping only got me a disappointing discovery: the value passed to `eval` must
237+
be valid Python source without null bytes and invalid codepoints, even inside a comment.
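
The discovery is easy to reproduce locally. The sketch below uses a hypothetical helper, `try_eval`, that evaluates raw bytes the way the challenge does and reports either the result or the exception type; the exact exception class for a null byte depends on the Python version.

```python
def try_eval(source: bytes) -> str:
    """Evaluate `source` like the challenge does and report the outcome."""
    try:
        return repr(eval(source))
    except Exception as exc:
        return type(exc).__name__

print(try_eval(b"7#abc"))   # digit followed by a plain ASCII comment
print(try_eval(b"7#\x00"))  # null byte inside the comment
print(try_eval(b"7#\xff"))  # byte that is not valid UTF-8 inside the comment
```

Only the first input evaluates to the digit, so the bytes after `#` cannot simply be ignored.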
238+
239+
At this moment it became clear that I was looking in the wrong direction. Therefore, I went back to the task description and
240+
the original source code.
241+
242+
Then it struck me: CFB works like a stream cipher, which means I can provide as little as 1 byte and the output will also be 1 byte (compared to
243+
block cipher modes that pad even a 1-byte input and produce fixed-size blocks of output).
244+
245+
### Attempt 4
246+
247+
Algorithm:
248+
249+
1. Iterate through all possible values of byte 0 (0..255) until we receive a digit as the output;
250+
2. Calculate the keystream byte using the formula `k[i] = input[i] ^ ord(output[i])`;
251+
3. Calculate the input for byte 0 that will produce a space as output using the formula `input[0] = k[0] ^ ord(' ')`;
252+
4. Use the value from step 3 as a prefix and repeat steps 1-3 to calculate the other keystream bytes;
253+
5. Now we can calculate the input that will produce `FLAG`: `input[0] = k[0] ^ ord('F')`, `input[1] = k[1] ^ ord('L')`, and so on.
254+
255+
The complexity of the algorithm is at most 4*256 iterations.
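
The steps above can be tried without a server or any cryptography. The sketch below is a local simulation under stated assumptions: `oracle` is a stand-in for the remote service, `SECRET_KEYSTREAM` holds made-up bytes that the attack never reads directly, and `recover_keystream` is a hypothetical helper name.

```python
FLAG = "DUCTF{testflag}"                        # placeholder flag from the challenge source
SECRET_KEYSTREAM = bytes.fromhex("67c41f80a5")  # made-up values, unknown to the attacker

def oracle(ct: bytes) -> str:
    """Local stand-in for the server: XOR with the keystream, then eval."""
    pt = bytes(c ^ k for c, k in zip(ct, SECRET_KEYSTREAM))
    try:
        return str(eval(pt, {"FLAG": FLAG}))
    except Exception:
        return "invalid ct!"

def recover_keystream(n: int) -> list[int]:
    """Recover the first `n` keystream bytes, one byte at a time."""
    keystream: list[int] = []
    for _ in range(n):
        # already-known bytes become leading spaces, which eval strips
        prefix = bytes(k ^ ord(" ") for k in keystream)
        for candidate in range(256):
            if oracle(prefix + bytes([candidate])) == "0":
                # candidate ^ k == ord('0'), so k == candidate ^ ord('0')
                keystream.append(candidate ^ ord("0"))
                break
    return keystream

recovered = recover_keystream(4)
payload = bytes(k ^ c for k, c in zip(recovered, b"FLAG"))
print(oracle(payload))  # the oracle evaluates the name FLAG and prints its value
```

Each keystream byte costs at most 256 oracle queries, which matches the 4*256 bound.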
256+
257+
??? success "solve.py"
258+
```py
259+
from pwn import *
260+
261+
if args['REMOTE']:
262+
    remote_server = '2024.ductf.dev'
263+
    remote_port = 30020
264+
    io = remote(remote_server, remote_port)
265+
else:
266+
    io = process(["python", "decrypt-then-eval.py"])
267+
268+
keystream = []  # keystream bytes we have identified so far
269+
for j in range(4):  # repeat for 4 bytes
270+
    for i in range(256):  # try all possible values for the next byte
271+
        io.recvuntil(b': ')  # wait for the `ct: ` prompt of decrypt-then-eval.py
272+
        # prefix: known keystream bytes XOR-ed with space (eval strips leading spaces),
273+
        # followed by the candidate value `i` for the next input byte
274+
        io.sendline((bytes(p ^ ord(' ') for p in keystream) + bytes([i])).hex().encode())
275+
        # read result
276+
        line = io.recvline().strip()
277+
        if b'invalid ct!' not in line:  # if not an error
278+
            if line == b'0':  # technically we could stop on any digit, but it makes no substantial difference
279+
                keystream.append(i ^ ord('0'))  # append the keystream value we found
280+
                break  # on to the next byte
281+
282+
io.recvuntil(b': ')
283+
# calculate the input that, once decrypted, produces FLAG
284+
payload = bytearray([keystream[0] ^ ord('F'), keystream[1] ^ ord('L'), keystream[2] ^ ord('A'), keystream[3] ^ ord('G')])
285+
io.sendline(payload.hex().encode())
286+
flag = io.recvline().strip()
287+
print(f'{flag=}')
288+
io.close()
289+
```
39290

40291
## Epilogue
41292

42293
* Official website: [https://downunderctf.com/](https://downunderctf.com/)
43294
* Official writeups: https://github.com/DownUnderCTF/Challenges_2024_Public
295+
296+
*[IV]: initialization vector
297+
*[AES]: Advanced Encryption Standard
298+
*[CFB]: Cipher Feedback mode: each byte of plaintext is XOR-ed with a byte of keystream.

solve/dna/index.md

+2
@@ -33,6 +33,8 @@ NB:
3333

3434
* Solution code was redacted for readability purposes. Due to time pressure during the competition I was using a lot of one-letter variables and questionable code structure.
3535

36+
* I am using gdb with [pwndbg](https://github.com/pwndbg/pwndbg) plugin
37+
3638
## My struggle
3739

3840

solve/jmp_flag/index.md

+2
@@ -33,6 +33,8 @@ NB:
3333

3434
* Solution code was redacted for readability purposes. Due to time pressure during the competition I was using a lot of one-letter variables and questionable code structure.
3535

36+
* I am using gdb with [pwndbg](https://github.com/pwndbg/pwndbg) plugin
37+
3638
## My struggle
3739

3840

solve/number_mashing/index.md

+2
@@ -34,6 +34,8 @@ NB:
3434

3535
* Solution code was redacted for readability purposes. Due to time pressure during the competition I was using a lot of one-letter variables and questionable code structure.
3636

37+
* I am using gdb with [pwndbg](https://github.com/pwndbg/pwndbg) plugin
38+
3739
## My struggle
3840

3941
Check what type of file we got:

solve/pac_shell/index.md

+2
@@ -33,6 +33,8 @@ NB:
3333

3434
* Solution code was redacted for readability purposes. Due to time pressure during the competition I was using a lot of one-letter variables and questionable code structure.
3535

36+
* I am using gdb with [pwndbg](https://github.com/pwndbg/pwndbg) plugin
37+
3638
## My struggle
3739

3840

solve/rusty/index.md

+2
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,8 @@ NB:
3333

3434
* Solution code was redacted for readability purposes. Due to time pressure during the competition I was using a lot of one-letter variables and questionable code structure.
3535

36+
* I am using gdb with [pwndbg](https://github.com/pwndbg/pwndbg) plugin
37+
3638
## My struggle
3739

3840

solve/shufflebox/index.md

+2
Original file line numberDiff line numberDiff line change
@@ -57,6 +57,8 @@ NB:
5757

5858
* Solution code was redacted for readability purposes. Due to time pressure during the competition I was using a lot of one-letter variables and questionable code structure.
5959

60+
* I am using gdb with [pwndbg](https://github.com/pwndbg/pwndbg) plugin
61+
6062
## My struggle
6163

6264
First things first - review source code of the script that ciphers data. Explanation for relevant parts of the code added as comments:

solve/sign_in/index.md

+2
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,8 @@ NB:
3333

3434
* Solution code was redacted for readability purposes. Due to time pressure during the competition I was using a lot of one-letter variables and questionable code structure.
3535

36+
* I am using gdb with [pwndbg](https://github.com/pwndbg/pwndbg) plugin
37+
3638
## My struggle
3739

3840

solve/sssshhhh/index.md

+2
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,8 @@ NB:
3333

3434
* Solution code was redacted for readability purposes. Due to time pressure during the competition I was using a lot of one-letter variables and questionable code structure.
3535

36+
* I am using gdb with [pwndbg](https://github.com/pwndbg/pwndbg) plugin
37+
3638
## My struggle
3739

3840

solve/transmissions/index.md

+2
Original file line numberDiff line numberDiff line change
@@ -38,6 +38,8 @@ NB:
3838

3939
* Solution code was redacted for readability purposes. Due to time pressure during the competition I was using a lot of one-letter variables and questionable code structure.
4040

41+
* I am using gdb with [pwndbg](https://github.com/pwndbg/pwndbg) plugin
42+
4143
## My struggle
4244

4345
A quick Google search for CCIR476 leads us to a wiki page that explains that CCIR476 is a character encoding used in a radio data protocol.

solve/vector/index.md

+1
Original file line numberDiff line numberDiff line change
@@ -70,6 +70,7 @@ NB:
7070
Ie, first character of string `Hello World!` is `H`, fifth is `o`.
7171

7272
* Solution code was redacted for readability purposes. Due to time pressure during the competition I was using a lot of one-letter variables and questionable code structure.
73+
7374
* I am using gdb with [pwndbg](https://github.com/pwndbg/pwndbg) plugin
7475

7576
## My struggle

solve/yawa/index.md

+2
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,8 @@ NB:
3333

3434
* Solution code was redacted for readability purposes. Due to time pressure during the competition I was using a lot of one-letter variables and questionable code structure.
3535

36+
* I am using gdb with [pwndbg](https://github.com/pwndbg/pwndbg) plugin
37+
3638
## My struggle
3739

3840
