Incorrect Disassembly Fix
For some common disassemblers, such as objdump
, or disassembler projects based on objdump
, there are some disassembly flaws. There are ways to make objdump
disassemble the code, not so Accurate.
Jump to the middle of an instruction¶
The easiest way is to use jmp
to jump to the middle of an instruction, that is, the real code starts from "inside" an instruction, but in disassembly it cannot be listed for the entire instruction. The actual assembly instruction code that is being run.
It sounds like a mouthful, it's hard to understand, let's look at an example, give the following assembly code.
start:
jmp label+1
label:
DB 0x90
mov eax, 0xf001
The first instruction in the code label
is DB 0x90
. Let's take a look at the result of disassembly of this code by objdump
:
08048080 <start>:
8048080: e9 01 00 00 00 jmp 8048086 <label+0x1>
08048085 <label>:
8048085: 90 nop
8048086: b8 01 f0 00 00 mov eax,0xf001
It seems that there is no problem, DB 0x90
is accurately disassembled into 90 nop
.
But if we change the nop
instruction to a command of more than 1 byte, then objdump will not follow our jump and correctly disassemble, but linearly continue to assemble from top to bottom (linear scan algorithm). For example, I Change DB 0x90
to DB 0xE9
to see the result of objdump disassembling again:
08048080 <start>:
8048080: e9 01 00 00 00 jmp 8048086 <label+0x1>
08048085 <label>:
8048085: e9 b8 01 f0 00 jmp 8f48242 <__bss_start+0xeff1b6>
Compared with the previous disassembly results, you can clearly see what is going on. DB 0xE9
is just a piece of data, it will not be executed, but the result of disassembly will be treated as an instruction. The result has also changed.
Objdump` ignores the code at the destination address of jmp and directly assembles the instructions after jmp, so that our real code is well "hidden"
Solution¶
How to solve this problem? The most straightforward way is to manually replace this useless 0xE9
with a hex editor with 0x90
. But if the program has file checksum, calculate the checksum value, then this The method will not work.
So a better solution is to use a disassembler such as IDA or similar control flow analysis, for programs that are also problematic. The disassembly results might look like this:
---- section .text ----:
08048080 E9 01 00 00 00 jmp Label_08048086
; (08048086)
; (near + 0x1)
08048085 DB E9
Label_08048086:
08048086 B8 01 F0 00 00 mov eax, 0xF001
; xref ( 08048080 )
Disassembly results look okay
Run time calculation jump address¶
This method can even counter the disassembler of the analysis control flow. We can look at a sample code to better understand:
; ----------------------------------------------------------------------------
call earth+1
Return:
; x instructions or random bytes here x byte(s)
earth: ; earth = Return + x
xor eax, eax ; align disassembly, using single byte opcode 1 byte
pop eax ; start of function: get return address ( Return ) 1 byte
; y instructions or random bytes here y byte(s)
add eax, x+2+y+2+1+1+z ; x+y+z+6 2 bytes
push eax ; 1 byte
right ; 1 byte
; z instructions or random bytes here z byte(s)
; Code:
; !! Code Continues Here !!
; ----------------------------------------------------------------------------
The program uses call+pop
to get the return address that the calling function saved to the stack at the time. It is actually the EIP
before the function is called. Then the garbage data is stuffed at the function return. But it will actually return when the function is running. The address is modified to Code. So the earth
function returns to jump to Code
and continues to run, instead of Return
.
Look at a simple demo
; ----------------------------------------------------------------------------
call earth+1
earth:
DB 0xE9 ; 1 <--- pushed return address,
; E9 is opcode for jmp to disalign disas-
; sembly
pop eax ; 1 hidden
NOP ; first
add eax, 9 ; 2 hidden
push eax ; 1 hidden
right ; 1 hidden
DB 0xE9 ; 1 opcode for jmp to misalign disassembly
Code: ; code continues here <--- pushed return address + 9
NOP
NOP
NOP
right
; ----------------------------------------------------------------------------
If you use objdump for disassembly, there will be problems with call earth+1
, as follows:
00000000 <earth-0x5>:
0: e8 01 00 00 00 call 6 <earth+0x1>
00000005 <earth>:
5: e9 58 90 05 09 jmp 9059062 <earth+0x905905d>
a: 00 00 add% al, (% eax)
c: 00 50 c3 add %dl,0xffffffc3(%eax)
f: e9 90 90 90 c3 jmp c39090a4 <earth+0xc390909f>
Let's take a look at the case of ida
text:08000000 ; Segment permissions: Read/Execute
.text:08000000 _text segment para public 'CODE' use32
.text:08000000 assume cs:_text
.text:08000000 ;org 8000000h
.text:08000000 assume es:nothing, ss:nothing, ds:_text,
.text:08000000 fs:nothing, gs:nothing
.text:08000000 dd 1E8h
.text:08000004 ; -------------------------------------------------------------
.text:08000004 add cl, ch
.text:08000006 pop eax
.text: 08000007 nop
.text:08000008 add eax, 9
.text:0800000D push eax
.text: 0800000E retn
.text:0800000E ; -------------------------------------------------------------
.text:0800000F dd 909090E9h
.text:08000013 ; -------------------------------------------------------------
.text: 08000013 retn
.text:08000013 _text ends
.text:08000013
.text:08000013
.text:08000013 end
We are very well hidden in the last three nop
. Not only that, but our process of calculating EIP
is also perfectly hidden. In fact, the entire disassembled code is completely different from the actual code.
How to solve this problem? There is actually no tool that can guarantee '100%' accurate disassembly. When the disassembler does code simulation, it may be able to do the correct assembly.
In reality, this is not a particularly big problem. Because it is for the interactive disassembler. You can specify the starting position of the code. And when debugging, you can also see the address of the actual jump of the program.
So at this point we need static analysis, but also need dynamic debugging.
Reference: Beginners Guide to Basic Linux Anti Anti Debugging Techniques
本页面的全部内容在 CC BY-NC-SA 4.0 协议之条款下提供,附加条款亦可能应用。