In the previous section you saw that a stack buffer overflow can change the return address of a function. In this section, you are going to explicitly choose the value that gets written to the location where the return address is stored. The goal is to modify the behavior of the program without making it crash.
You will work through the simplest possible example of doing so.
The example from the previous section has been slightly modified, as shown below. Save the modified code in a file named redirect1.c:
#include <string.h>
#include <stdio.h>

__attribute__((noinline))
char f(char *src) {
  char buffer[8];

  strcpy(buffer, src);
  return buffer[2];
}

int main(int argc, char** argv) {
  char *chars = argv[1];
  f(chars);
  puts("The string on the command line was: ");
  puts(chars);
  return 0;
}
Compile it with the following command at your Docker prompt:
clang -g -O1 redirect1.c -o redirect1
Now run this program:
./redirect1 hello
The string on the command line was:
hello
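It produces the expected output as long as the string provided is 15 characters or fewer. Strings of 8 to 15 characters already overflow buffer, but in this build they only reach the saved frame pointer, which does not affect the output. When the string is longer, the overflow reaches the stored return address and overwrites it.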
Let’s check whether you still get the bus error from the previous section when you pass the string that was hard-coded in that program:
./redirect1 "These are 24 chars. yes"
Bus error
Yes, you can.
We assume that an attacker can control the input to the program, that is, the string argument on the command line. Could you, as an attacker, craft a string so that the program does not crash but still changes its behavior?
Let’s investigate.
You haven’t changed function f, and the assembly instructions that the compiler generates from it haven’t changed. That means that the function will return to the address that is encoded in bytes 16 to 23 of the string it processes, which is the string on the command line.
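To make the mapping between string bytes and stack slots concrete, here is a minimal C sketch. It assumes the frame layout that f has in this particular build: buffer at sp + 8, the saved x29 at sp + 16, and the saved x30 at sp + 24. The helper names slot_for_string_byte, bytes_written, and reaches_return_address are hypothetical and are not part of redirect1.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed frame layout of f (valid only for this exact build):
     sp + 8  .. sp + 15 : buffer[8]
     sp + 16 .. sp + 23 : saved x29 (frame pointer)
     sp + 24 .. sp + 31 : saved x30 (return address) */
#define BUFFER_OFFSET    ((size_t)8)
#define SAVED_X29_OFFSET ((size_t)16)
#define SAVED_X30_OFFSET ((size_t)24)
#define FRAME_SIZE       ((size_t)32)

/* Hypothetical helper: name the stack slot that byte `index` of the
   copied string lands in. strcpy writes byte i to sp + 8 + i. */
const char *slot_for_string_byte(size_t index) {
    size_t offset = BUFFER_OFFSET + index;
    if (offset < SAVED_X29_OFFSET) return "buffer";
    if (offset < SAVED_X30_OFFSET) return "saved x29";
    if (offset < FRAME_SIZE)       return "saved x30";
    return "caller's frame";
}

/* Number of bytes strcpy writes for `s`, including the terminating NUL. */
size_t bytes_written(const char *s) {
    assert(s != NULL);
    return strlen(s) + 1;
}

/* Nonzero when copying `s` modifies at least one byte of the saved x30. */
int reaches_return_address(const char *s) {
    return BUFFER_OFFSET + bytes_written(s) > SAVED_X30_OFFSET;
}
```

Under these assumptions, reaches_return_address("hello") is 0, while any 16-character string already makes it nonzero, because the terminating NUL lands on the lowest byte of the saved x30.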
Start by looking at the disassembly of the main function to see if you can find an interesting address to return to:
gdb -q ./redirect1
Reading symbols from ./redirect1...
(gdb) break main
__output__Breakpoint 1 at 0x808: file redirect1.c, line 13.
(gdb) run
__output__Starting program: /armlearningpaths/redirect1
__output__[Thread debugging using libthread_db enabled]
__output__Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
__output__
__output__Breakpoint 1, main (argc=1, argv=0xfffffffff748) at redirect1.c:13
__output__13 char *chars = argv[1];
(gdb) disass main
__output__Dump of assembler code for function main:
__output__ 0x0000aaaaaaaa07fc <+0>: stp x29, x30, [sp, #-32]!
__output__ 0x0000aaaaaaaa0800 <+4>: str x19, [sp, #16]
__output__ 0x0000aaaaaaaa0804 <+8>: mov x29, sp
__output__=> 0x0000aaaaaaaa0808 <+12>: ldr x19, [x1, #8]
__output__ 0x0000aaaaaaaa080c <+16>: mov x0, x19
__output__ 0x0000aaaaaaaa0810 <+20>: bl 0xaaaaaaaa07d4 <f>
__output__ 0x0000aaaaaaaa0814 <+24>: adrp x0, 0xaaaaaaaa0000
__output__ 0x0000aaaaaaaa0818 <+28>: add x0, x0, #0x850
__output__ 0x0000aaaaaaaa081c <+32>: bl 0xaaaaaaaa0670 <puts@plt>
__output__ 0x0000aaaaaaaa0820 <+36>: mov x0, x19
__output__ 0x0000aaaaaaaa0824 <+40>: bl 0xaaaaaaaa0670 <puts@plt>
__output__ 0x0000aaaaaaaa0828 <+44>: ldr x19, [sp, #16]
__output__ 0x0000aaaaaaaa082c <+48>: mov w0, wzr
__output__ 0x0000aaaaaaaa0830 <+52>: ldp x29, x30, [sp], #32
__output__ 0x0000aaaaaaaa0834 <+56>: ret
__output__End of assembler dump.
You see that function f is called by the bl 0xaaaaaaaa07d4 instruction at address 0xaaaaaaaa0810. The bl instruction stores the address of the instruction after it in register x30, so you expect that the return address seen in function f is 0xaaaaaaaa0814.
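The arithmetic behind that expectation is small enough to sketch in C. Every A64 instruction is 4 bytes long, so the address that bl places in x30 is the address of the bl itself plus 4. The helper name link_address_after_bl is hypothetical and only serves as an illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Every A64 instruction is 4 bytes long. */
#define A64_INSTRUCTION_SIZE 4u

/* Hypothetical helper: the value that a bl located at `bl_address`
   places in x30, that is, the address of the instruction after it. */
uint64_t link_address_after_bl(uint64_t bl_address) {
    /* A64 instructions are 4-byte aligned. */
    assert(bl_address % A64_INSTRUCTION_SIZE == 0);
    return bl_address + A64_INSTRUCTION_SIZE;
}
```

With the addresses from the disassembly above, link_address_after_bl(0xaaaaaaaa0810) evaluates to 0xaaaaaaaa0814.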
Let’s check to see if that is the case.
(gdb) break f
__output__Breakpoint 2 at 0xaaaaaaaa07e4: file redirect1.c, line 8.
(gdb) cont
__output__Continuing.
__output__
__output__Breakpoint 2, f (src=src@entry=0x0) at redirect1.c:8
__output__8 strcpy(buffer, src);
(gdb) disass f
__output__Dump of assembler code for function f:
__output__ 0x0000aaaaaaaa07d4 <+0>: sub sp, sp, #0x20
__output__ 0x0000aaaaaaaa07d8 <+4>: stp x29, x30, [sp, #16]
__output__ 0x0000aaaaaaaa07dc <+8>: add x29, sp, #0x10
__output__ 0x0000aaaaaaaa07e0 <+12>: mov x1, x0
__output__=> 0x0000aaaaaaaa07e4 <+16>: add x0, sp, #0x8
__output__ 0x0000aaaaaaaa07e8 <+20>: bl 0xaaaaaaaa0680 <strcpy@plt>
__output__ 0x0000aaaaaaaa07ec <+24>: ldp x29, x30, [sp, #16]
__output__ 0x0000aaaaaaaa07f0 <+28>: ldrb w0, [sp, #10]
__output__ 0x0000aaaaaaaa07f4 <+32>: add sp, sp, #0x20
__output__ 0x0000aaaaaaaa07f8 <+36>: ret
__output__End of assembler dump.
(gdb) info register x30
__output__x30 0xaaaaaaaa0814 187649984432148
Indeed, the return address in x30 is as expected.
Now you can try to change that return address by specifying a specially crafted string as the program argument, so that the program does not print the string "The string on the command line was: ". In the disassembly of function main printed earlier in this section, you can see two calls to the function puts (look for the bl 0xaaaaaaaa0670 <puts@plt> instructions).
If you can point the return address to just beyond the first puts call, then that first puts call will no longer be executed, because the program’s control flow never reaches that instruction.
The address of the instruction after that first puts call is 0x0000aaaaaaaa0820. Try to place that address in the right bytes of the input argument so that the return address stored on the stack in function f is overwritten with this value.
You saw previously that the return address is overwritten with bytes 16 to 23 of the string argument to redirect1.
Create a string that results in overwriting the return address in function f with the value 0x0000aaaaaaaa0820.
The answer to this exercise can be found in the Answers section.
You can use the syntax $'\xAB' to easily construct strings on the command line that contain specific byte values, written in hexadecimal.
For example:
echo $'hello \xaa' | od -t x1 -c
0000000  68  65  6c  6c  6f  20  aa  0a
          h   e   l   l   o     252  \n
0000010
You will also need to take into account that you are working on a little-endian target.
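As an illustration of what little-endian means for a 64-bit value, here is a minimal C sketch. The helper name encode_le64 and the sample value are assumptions chosen for the example; they are not taken from redirect1.c and are not the answer to the exercise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store `value` into out[0..7] in little-endian
   order, least significant byte first. This is the order in which a
   little-endian target keeps a 64-bit value in memory. */
void encode_le64(uint64_t value, unsigned char out[8]) {
    assert(out != NULL);
    for (size_t i = 0; i < 8; i++) {
        /* Shift byte i down to the low 8 bits and truncate. */
        out[i] = (unsigned char)(value >> (8 * i));
    }
}
```

For the sample value 0x1122334455667788, the bytes come out in memory order as 88 77 66 55 44 33 22 11. The bytes of an address must appear in the string in that same reversed order.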