Buffer Overflows and You

Modern defenses

Everything has been pretty cool up to this point. Sadly, it's time for reality. The attacks that we just talked about no longer work as-is on modern systems running with their default settings. There are three reasons for this...

NX and Exec Shield

Modern architectures provide a "No eXecute" bit, which allows you to mark certain regions of memory as non-executable. If the stack is marked in this way, it is impossible to run shellcode that has been injected into a buffer on the stack. That said, you may be able to get around this by overflowing a heap buffer (but heaps are almost always non-executable now too) or by using a return-to-libc-style attack.

On older architectures that do not provide the NX bit, there is something called "Exec Shield," found on many Red Hat systems. It emulates the NX bit on systems that do not have hardware support for it. Other systems accomplish the same thing, just with a different name. Even newer versions of Windows offer this kind of protection, called "Data Execution Prevention" (DEP).

If you recall from earlier, we actually turned this off for our programs. This is controlled by a flag in the binary itself (we can also shut it off system-wide, but there's really no reason to). You can use "execstack" to set the flag on existing binaries or, if you're compiling a new binary, you can pass "-z execstack" to gcc, which hands it on to the linker.

gcc StackGuard

On many distributions, gcc by default also adds extra code to programs to protect against stack buffer overflows in general. This extra code places a "special" value called a canary between a function's local buffers and its saved return address, and checks to make sure the canary hasn't been overwritten before proceeding to execute a return. An overflow that reaches the return address has to run over the canary first, so it gets caught. We disable it by including "-fno-stack-protector" when compiling.

Address space layout randomization (ASLR)

Let's take our sample program from the introduction, sample1, and run it a couple of times while looking at each memory map...

$ pmap 12662
12662:   ./sample1
0000000000400000      4K r-x--  /home/turkstra/src/cs526/sample1
0000000000600000      4K rw---  /home/turkstra/src/cs526/sample1
00000032e7000000    120K r-x--  /lib64/ld-2.11.1.so
00000032e721d000      4K r----  /lib64/ld-2.11.1.so
00000032e721e000      4K rw---  /lib64/ld-2.11.1.so
00000032e721f000      4K rw---    [ anon ]
00000032e7400000   1468K r-x--  /lib64/libc-2.11.1.so
00000032e756f000   2048K -----  /lib64/libc-2.11.1.so
00000032e776f000     16K r----  /lib64/libc-2.11.1.so
00000032e7773000      4K rw---  /lib64/libc-2.11.1.so
00000032e7774000     20K rw---    [ anon ]
00007f314ab04000     12K rw---    [ anon ]
00007f314ab28000     12K rw---    [ anon ]
00007fff06c3d000     84K rw---    [ stack ]
00007fff06d19000      4K r-x--    [ anon ]
ffffffffff600000      4K r-x--    [ anon ]
 total             3812K

$ pmap 12666
12666:   ./sample1
0000000000400000      4K r-x--  /home/turkstra/src/cs526/sample1
0000000000600000      4K rw---  /home/turkstra/src/cs526/sample1
00000032e7000000    120K r-x--  /lib64/ld-2.11.1.so
00000032e721d000      4K r----  /lib64/ld-2.11.1.so
00000032e721e000      4K rw---  /lib64/ld-2.11.1.so
00000032e721f000      4K rw---    [ anon ]
00000032e7400000   1468K r-x--  /lib64/libc-2.11.1.so
00000032e756f000   2048K -----  /lib64/libc-2.11.1.so
00000032e776f000     16K r----  /lib64/libc-2.11.1.so
00000032e7773000      4K rw---  /lib64/libc-2.11.1.so
00000032e7774000     20K rw---    [ anon ]
00007fbbe2954000     12K rw---    [ anon ]
00007fbbe2978000     12K rw---    [ anon ]
00007fff261f5000     84K rw---    [ stack ]
00007fff26282000      4K r-x--    [ anon ]
ffffffffff600000      4K r-x--    [ anon ]
 total             3812K

Notice how the address of the stack keeps changing? Well, this is the final nail in the coffin. If the stack is no longer executable, we must rely on return-to-libc style attacks, and those almost always rely on knowing where to find a particular function (like system()) as well as a string to pass to that function (e.g., "/bin/sh"). With the stack's location randomized, the location of that string changes every time the program executes. This makes attacks much more difficult, particularly if the randomization is being done properly.

It's worth noting that ASLR is only really effective when combined with a non-executable stack. If the stack is executable, it's usually possible (as we did in the previous example) to develop a payload that does not strictly rely on any fixed addresses.