好的,这是我之前回答的大幅修改版。我认为我现在找到了一种方法。
你(仍然:)面临这个具体问题:
(gdb) disas main
No symbol table is loaded. Use the "file" command.
现在,如果你编译这段代码(我在末尾添加了
return 0
),则使用
gcc -S
会得到以下结果:
pushq %rbp
movq %rsp, %rbp
movl $.LC0, %edi
call puts
movl $0, %eax
leave
ret
现在,你可以看到你的二进制文件给出了一些信息:
Striped:
(gdb) info files
Symbols from "/home/beco/Documents/fontes/cpp/teste/stackoverflow/distrip".
Local exec file:
`/home/beco/Documents/fontes/cpp/teste/stackoverflow/distrip', file type elf64-x86-64.
Entry point: 0x400440
0x0000000000400238 - 0x0000000000400254 is .interp
...
0x00000000004003a8 - 0x00000000004003c0 is .rela.dyn
0x00000000004003c0 - 0x00000000004003f0 is .rela.plt
0x00000000004003f0 - 0x0000000000400408 is .init
0x0000000000400408 - 0x0000000000400438 is .plt
0x0000000000400440 - 0x0000000000400618 is .text
...
0x0000000000601010 - 0x0000000000601020 is .data
0x0000000000601020 - 0x0000000000601030 is .bss
这里最重要的条目是.text
。它是汇编代码开始的常见名称,并且从下面我们对main的解释中,从它的大小可以看出它包含main。如果您反汇编它,您将看到对__libc_start_main的调用。最重要的是,您正在反汇编一个真正的代码入口点(您不会误导将DATA更改为CODE)。
disas 0x0000000000400440,0x0000000000400618
Dump of assembler code from 0x400440 to 0x400618:
0x0000000000400440: xor %ebp,%ebp
0x0000000000400442: mov %rdx,%r9
0x0000000000400445: pop %rsi
0x0000000000400446: mov %rsp,%rdx
0x0000000000400449: and $0xfffffffffffffff0,%rsp
0x000000000040044d: push %rax
0x000000000040044e: push %rsp
0x000000000040044f: mov $0x400540,%r8
0x0000000000400456: mov $0x400550,%rcx
0x000000000040045d: mov $0x400524,%rdi
0x0000000000400464: callq 0x400428 <__libc_start_main@plt>
0x0000000000400469: hlt
...
0x000000000040046c: sub $0x8,%rsp
...
0x0000000000400482: retq
0x0000000000400483: nop
...
0x0000000000400490: push %rbp
..
0x00000000004004f2: leaveq
0x00000000004004f3: retq
0x00000000004004f4: data32 data32 nopw %cs:0x0(%rax,%rax,1)
...
0x000000000040051d: leaveq
0x000000000040051e: jmpq *%rax
...
0x0000000000400520: leaveq
0x0000000000400521: retq
0x0000000000400522: nop
0x0000000000400523: nop
0x0000000000400524: push %rbp
0x0000000000400525: mov %rsp,%rbp
0x0000000000400528: mov $0x40062c,%edi
0x000000000040052d: callq 0x400418 <puts@plt>
0x0000000000400532: mov $0x0,%eax
0x0000000000400537: leaveq
0x0000000000400538: retq
调用__libc_start_main时,它的第一个参数是指向main()函数的指针。因此,在调用之前堆栈中的最后一个参数就是main()函数的地址。
0x000000000040045d: mov $0x400524,%rdi
0x0000000000400464: callq 0x400428 <__libc_start_main@plt>
这里是0x400524(正如我们已经知道的)。现在你可以设置一个断点并试一试:
(gdb) break *0x400524
Breakpoint 1 at 0x400524
(gdb) run
Starting program: /home/beco/Documents/fontes/cpp/teste/stackoverflow/disassembly/d2
Breakpoint 1, 0x0000000000400524 in main ()
(gdb) n
Single stepping until exit from function main,
which has no line number information.
hello 1
__libc_start_main (main=<value optimized out>, argc=<value optimized out>, ubp_av=<value optimized out>,
init=<value optimized out>, fini=<value optimized out>, rtld_fini=<value optimized out>,
stack_end=0x7fffffffdc38) at libc-start.c:258
258 libc-start.c: No such file or directory.
in libc-start.c
(gdb) n
Program exited normally.
(gdb)
现在您可以使用以下方式对其进行反汇编:
(gdb) disas 0x0000000000400524,0x0000000000400600
Dump of assembler code from 0x400524 to 0x400600:
0x0000000000400524: push %rbp
0x0000000000400525: mov %rsp,%rbp
0x0000000000400528: sub $0x10,%rsp
0x000000000040052c: movl $0x1,-0x4(%rbp)
0x0000000000400533: mov $0x40064c,%eax
0x0000000000400538: mov -0x4(%rbp),%edx
0x000000000040053b: mov %edx,%esi
0x000000000040053d: mov %rax,%rdi
0x0000000000400540: mov $0x0,%eax
0x0000000000400545: callq 0x400418 <printf@plt>
0x000000000040054a: mov $0x0,%eax
0x000000000040054f: leaveq
0x0000000000400550: retq
0x0000000000400551: nop
0x0000000000400552: nop
0x0000000000400553: nop
0x0000000000400554: nop
0x0000000000400555: nop
...
这主要是解决方案。
顺便说一下,这是不同的代码,看看它是否有效。这就是为什么上面的汇编有点不同。上面的代码来自于这个C文件:
#include <stdio.h>
int main(void)
{
int i=1;
printf("hello %d\n", i);
return 0;
}
但是!
如果这个方法不起作用,那么你仍然有一些提示:
从现在开始,您应该尝试在所有函数的开头设置断点。它们位于 ret
或 leave
的前面。第一个入口点是.text
本身。这是汇编的起始点,但不是主要的。
问题是,并不总是断点会让程序运行,例如在非常短的.text
中设置的断点就无法发挥作用:
(gdb) break *0x0000000000400440
Breakpoint 2 at 0x400440
(gdb) run
Starting program: /home/beco/Documents/fontes/cpp/teste/stackoverflow/disassembly/d2
Breakpoint 2, 0x0000000000400440 in _start ()
(gdb) n
Single stepping until exit from function _start,
which has no line number information.
0x0000000000400428 in __libc_start_main@plt ()
(gdb) n
Single stepping until exit from function __libc_start_main@plt,
which has no line number information.
0x0000000000400408 in ?? ()
(gdb) n
Cannot find bounds of current function
所以你需要不断尝试,直到找到正确的方法,在以下位置设置断点:
0x400440
0x40046c
0x400490
0x4004f4
0x40051e
0x400524
从另一个回答中,我们应该保留这些信息:
在文件的非条纹版本中,我们看到:
(gdb) disas main
Dump of assembler code for function main:
0x0000000000400524 <+0>: push %rbp
0x0000000000400525 <+1>: mov %rsp,%rbp
0x0000000000400528 <+4>: mov $0x40062c,%edi
0x000000000040052d <+9>: callq 0x400418 <puts@plt>
0x0000000000400532 <+14>: mov $0x0,%eax
0x0000000000400537 <+19>: leaveq
0x0000000000400538 <+20>: retq
End of assembler dump.
现在我们知道main函数的地址是0x0000000000400524,0x0000000000400539
。如果我们使用相同的偏移量来查看分条形式的二进制文件,我们会得到相同的结果:
(gdb) disas 0x0000000000400524,0x0000000000400539
Dump of assembler code from 0x400524 to 0x400539:
0x0000000000400524: push %rbp
0x0000000000400525: mov %rsp,%rbp
0x0000000000400528: mov $0x40062c,%edi
0x000000000040052d: callq 0x400418 <puts@plt>
0x0000000000400532: mov $0x0,%eax
0x0000000000400537: leaveq
0x0000000000400538: retq
End of assembler dump.
所以,除非你可以获得一些提示,确定主体从哪里开始(比如使用带符号的另一个代码),另一种方法是如果你可以获取一些有关最初汇编指令的信息,那么你就可以在特定位置进行反汇编,并查看是否匹配。如果你根本无法访问代码,则仍然可以阅读ELF定义以了解代码中应该出现多少个段,并尝试计算地址。不过,你还需要有关代码段的信息!
这是一项艰苦的工作,朋友!祝你好运!
Beco