Linux - First step with GDB


I am interested in pentesting, so I decided to write a series of articles on the subject. In this first article, I will show you how to start reverse-engineering a binary built from a simple C program, and we will get familiar with the GNU Debugger (GDB), an important tool for this kind of work.

How memory works

On a modern system, when an application is executed, the operating system allocates memory for it, and the application's variables, functions and other components are stored in main memory.

[Figure: Memory]

In the diagram above, two parts of memory are important: the stack and the heap. The stack holds the local variables declared in functions. Its default size here is 8192 KB (8 MB), which you can check with the ulimit command:

$ ulimit -a | grep stack
stack size                  (kbytes, -s) 8192

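The heap holds the dynamic variables allocated with malloc. Its size is not fixed in advance; it is bounded only by the memory the system can give to the process. In exchange, allocating on the heap is generally slower than using the stack.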

Before we start

For this article, we will use this code:

#include <stdlib.h>
#include <stdio.h>

int sum(int a, int b){
	return a + b;
}
int main(void){
	int a = 0;
	int b = 5;
	int c = 10;
	printf("%d\n", a);
	a = sum(a, b);
	printf("%d\n", a);
	printf("%d\n", c);
	c = sum(b, c);
	printf("%d\n", c);
}

As you can see, we declare the variables a, b and c, which are stored in stack memory. I also wrote a small function, sum, that adds two numbers and returns the result.

Now, we need to compile the program with the gcc compiler:

gcc -Wno-all -ggdb -O0 -o main main.c
  • The parameter -Wno-all turns off the warnings enabled by -Wall (to suppress every warning, use -w)
  • The parameter -ggdb produces debugging information for use by GDB
  • The option -O sets the optimization level; to be able to read the variables in the program, we disable all optimizations with -O0.

Now that we have the output file main, we are going to analyze it with the GNU Debugger, gdb. Run it with the output file of gcc as its argument: gdb -q main. The parameter -q means quiet: GDB starts without its introductory messages.

Assembly register

We are now in the debugger, where we can do a lot of things to debug the program. The help command shows the available commands.

The disassemble command (abbreviated disas) is very important: it disassembles a function into assembly language:

(gdb) disas main
Dump of assembler code for function main:
   0x0000555555555149 <+0>:	push   %rbp
   0x000055555555514a <+1>:	mov    %rsp,%rbp
   0x000055555555514d <+4>:	sub    $0x10,%rsp
   0x0000555555555151 <+8>:	movl   $0x0,-0x4(%rbp)
   0x0000555555555158 <+15>:	movl   $0x5,-0x8(%rbp)
   0x000055555555515f <+22>:	movl   $0xa,-0xc(%rbp)
   0x0000555555555166 <+29>:	mov    -0x4(%rbp),%eax
   0x0000555555555169 <+32>:	mov    %eax,%esi
   0x000055555555516b <+34>:	lea    0xe92(%rip),%rdi        # 0x555555556004
   0x0000555555555172 <+41>:	mov    $0x0,%eax
   0x0000555555555177 <+46>:	call   0x555555555030 <printf@plt>
   0x000055555555517c <+51>:	mov    -0x8(%rbp),%edx
   0x000055555555517f <+54>:	mov    -0x4(%rbp),%eax
   0x0000555555555182 <+57>:	mov    %edx,%esi
   0x0000555555555184 <+59>:	mov    %eax,%edi
   0x0000555555555186 <+61>:	call   0x555555555135 <sum>
   0x000055555555518b <+66>:	mov    %eax,-0x4(%rbp)
   0x000055555555518e <+69>:	mov    -0x4(%rbp),%eax
   0x0000555555555191 <+72>:	mov    %eax,%esi
   0x0000555555555193 <+74>:	lea    0xe6a(%rip),%rdi        # 0x555555556004
   0x000055555555519a <+81>:	mov    $0x0,%eax
   0x000055555555519f <+86>:	call   0x555555555030 <printf@plt>
   0x00005555555551a4 <+91>:	mov    -0xc(%rbp),%eax
   0x00005555555551a7 <+94>:	mov    %eax,%esi
   0x00005555555551a9 <+96>:	lea    0xe54(%rip),%rdi        # 0x555555556004
   0x00005555555551b0 <+103>:	mov    $0x0,%eax
   0x00005555555551b5 <+108>:	call   0x555555555030 <printf@plt>
   0x00005555555551ba <+113>:	mov    -0xc(%rbp),%edx
   0x00005555555551bd <+116>:	mov    -0x8(%rbp),%eax
   0x00005555555551c0 <+119>:	mov    %edx,%esi
   0x00005555555551c2 <+121>:	mov    %eax,%edi
   0x00005555555551c4 <+123>:	call   0x555555555135 <sum>
   0x00005555555551c9 <+128>:	mov    %eax,-0xc(%rbp)
   0x00005555555551cc <+131>:	mov    -0xc(%rbp),%eax
   0x00005555555551cf <+134>:	mov    %eax,%esi
   0x00005555555551d1 <+136>:	lea    0xe2c(%rip),%rdi        # 0x555555556004
   0x00005555555551d8 <+143>:	mov    $0x0,%eax
   0x00005555555551dd <+148>:	call   0x555555555030 <printf@plt>
   0x00005555555551e2 <+153>:	mov    $0x0,%eax
   0x00005555555551e7 <+158>:	leave  
   0x00005555555551e8 <+159>:	ret
End of assembler dump.

And we can do the same for the sum function: disas sum.

As you can see above, assembly language can be hard to digest. Each line shows the memory address (64-bit), the instruction (call, add, mov, push, etc.) and its operands, which often include registers (eax, rsp, rbp, etc.).

What are assembly registers?

A register is a small storage area inside the CPU. Most of the ones we will meet are called general-purpose registers. The table below lists the most common registers:

Register     32 bits  64 bits  Comment
Accumulator  EAX      RAX      Used for arithmetic, logical and I/O instructions
Counter      ECX      RCX      Counter for loops
Data         EDX      RDX      Also used for I/O instructions
Base         EBX      RBX      Base pointer for memory access
Pointer      ESP      RSP      Stack pointer: points to the top of the stack
Pointer      EBP      RBP      Pointer to the base of the current stack frame
Index        EDI      RDI      Pointer for manipulating strings
Instruction  EIP      RIP      Pointer to the next instruction to execute

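We will not study all these registers, as that is not the purpose of this article. If you want to learn more about them, look into how the x86 and x86-64 architectures work.

With GDB, you can print the registers. The output below comes from a 32-bit build, which is why the names start with e; in a 64-bit session they are rax, rsp, rbp and so on: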

(gdb) info register
eax            0x0                 0
ecx            0xffffd140          -11968
edx            0x1                 1
ebx            0x0                 0
esp            0xffffd12c          0xffffd12c
ebp            0x0                 0x0
esi            0xf7fa6000          -134586368
edi            0xf7fa6000          -134586368
eip            0x565561ec          0x565561ec <main+83>

If you want to print a specific register:

(gdb) info register $esp
esp            0xffffd12c          0xffffd12c

Working with assembly registers

To understand registers better, in this section we will work with RSP and RBP, because they are the ones that manage the stack.

First, we need to run the application with the run command:

(gdb) run
Starting program: /home/geoffrey/Documents/C/overflow/test/main 
0
5
10
15
[Inferior 1 (process 8114) exited normally]

The disassembly above contains these three lines. They are very interesting because they initialize our variables a, b and c:

   0x0000555555555151 <+8>:	movl   $0x0,-0x4(%rbp)
   0x0000555555555158 <+15>:	movl   $0x5,-0x8(%rbp)
   0x000055555555515f <+22>:	movl   $0xa,-0xc(%rbp)

When the compiler generates the program, it allocates the local variables of a function in stack memory. Here, the first variable gets the highest address and the last variable the lowest. The offsets in the listing are relative to RBP, and since sub $0x10,%rsp sets RSP to RBP - 0x10, the same slots can be addressed from RSP: the variable a is at RBP - 0x4, which is RSP + 0xc (high address), and the variable c is at RBP - 0xc, which is RSP + 0x4 (low address). The movl instructions store the initial values in these stack slots.

The purpose of a debugger is to help you find the cause of a problem in a program, so every debugger offers an essential tool called the breakpoint. A breakpoint stops the program at a specific location so you can inspect its state. With GDB, you can set breakpoints in your program.

To create a breakpoint at a hex address, we need to put a ‘*’ before the address. We will create two breakpoints: one before the sum that updates the c variable and one after it. For both, I used the address of a call to printf.

(gdb) break *0x00005555555551b5
Breakpoint 1 at 0x5555555551b5
(gdb) break *0x00005555555551dd
Breakpoint 2 at 0x5555555551dd

We run our program:

(gdb) run
Starting program: /home/geoffrey/Documents/C/overflow/test/main 
0
5

Breakpoint 1, 0x00005555555551b5 in main ()

The program is now stopped at the breakpoint we created, and we will print the value of the variable c, stored at RSP + 0x4. To do that, we use the x command, which means examine.

(gdb) x/x $rsp+0x4
0x7fffffffdf44:	0x0000000a

The output of the x command shows that the value of the variable c is 0xa (10).

And now, we can continue the program until the next breakpoint and display the value of c:

(gdb) continue
Continuing.
10

Breakpoint 2, 0x00005555555551dd in main ()
(gdb) x/x $rsp+0x4
0x7fffffffdf64:	0x0000000f

If we want to print the value in decimal:

(gdb) x/d $rsp+0x4
0x7fffffffdf64:	15

That works. It is our variable c, and we can see that its value has changed.

We can also print the address of our variable with the command print:

(gdb) print/x $rsp+0x4
$10 = 0x7fffffffdf64

As you can see, breakpoints are very useful for troubleshooting your program. To remove a breakpoint, use the delete command with the id of the breakpoint:

(gdb) del 1

There is another way to display the value of a variable:

(gdb) print c
$1 = 10
(gdb) print/x c
$2 = 0xa

If you see this error:

(gdb) print c
No symbol "c" in current context.

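That means GDB cannot find the variable, typically because the program was compiled without debugging information or with optimizations. Rebuild it with -ggdb and -O0 (capital O; the lowercase -o only names the output file).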

This concludes the first article on reverse-engineering binaries. In the next articles, we will use GDB for some pentesting and try to gain access to a Linux system. The section below lists some useful commands you can use in GDB.

References