Software Reverse Engineering Introduction
Reverse engineering, also called back engineering, is the process by which a man-made object is deconstructed to reveal its designs, architecture, or to extract knowledge from the object. (from Wikipedia)
Software reverse engineering mainly refers to disassembling and analyzing a program's structure, control flow, algorithms, and code.
It is mainly used in software maintenance, software cracking, vulnerability discovery, and malicious code analysis.
Reverse engineering in CTF competitions¶
> Involves a variety of programming techniques on the Windows, Linux, and Android platforms. Contestants are required to reverse-analyze source code and binary files using common tools, and to master reverse analysis of Android APK files, encryption and decryption, kernel programming, algorithms, anti-debugging, and code obfuscation techniques.
>
> ------ "National College Student Information Security Competition Entry Guide"
- Familiarity with operating systems, assembly language, encryption and decryption, and related knowledge
- Experience programming in a variety of high-level languages
- Familiarity with the compilation principles behind multiple compilers
- Strong program comprehension and reverse analysis skills
General reversing workflow¶
- Collect information using static analysis tools such as strings/file/binwalk/IDA, and search Google/GitHub based on that static information.
- Study the program's protection methods, such as code obfuscation, packers (protective shells), and anti-debugging techniques, and work out how to break or bypass them
- Disassemble the target software and quickly locate the key code for analysis
- Use dynamic debugging to verify your initial guesses and clarify what the program does as the analysis proceeds
- Based on the program's functionality, write a script to solve for the flag
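The information-gathering step at the start of this workflow can be sketched in a few lines of Python. This is a minimal approximation of what the strings tool does, not a replacement for it. The function names, the keyword list, and the sample blob are all hypothetical and chosen only for illustration.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

def find_hints(data: bytes, keywords=("flag", "key", "password")) -> list:
    """Keep only the strings that contain a keyword worth searching for."""
    return [
        s for s in extract_strings(data)
        if any(k in s.lower() for k in keywords)
    ]

# Hypothetical binary blob standing in for a real challenge file.
sample = (
    b"\x7fELF\x02\x01\x00\x00Enter the flag:\x00"
    b"\x01\x02ok\x00Wrong password!\x00\xff"
)

if __name__ == "__main__":
    for hint in find_hints(sample):
        print(hint)
```

Strings found this way are good search terms for Google or GitHub, and good starting points for cross-references in a disassembler.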
Tips for locating key code¶
- Analyze control flow
The control flow can be read from the control flow graph (CFG) generated by IDA. Read the disassembly block by block, following branches, loops, and function calls.
- Use data and code cross-references
For example, starting from an output prompt string, data cross-references lead to the location that uses it, and from there to the key code. Code cross-references work similarly: a GUI program has to call the corresponding Windows API functions to obtain user input, so the call sites of those API functions can lead us to the key code.
- Coding style
Every programmer has a different coding style. Those familiar with development design patterns can work out what a function or module does more quickly.
- Principle of concentration
Programmers tend to write functionally related code and data close together, and this shows in the disassembly, so during analysis it pays to look at the functions and data near the key code.
- Code reuse
Code reuse is very common, and GitHub, the largest source code hosting site, is a primary source. During analysis, you can search GitHub for distinctive features (such as strings or code style); you may find similar code and use it to recover the symbol information missing from the binary.
- Seven parts reversing, three parts guessing
Reasonable guessing can often get twice the result with half the effort. If you run into a suspicious function whose internal logic you cannot make out, guess its purpose from the available clues and continue the analysis on that assumption. Repeated guessing and checking may bring you closer to the truth of the code.
- Distinguishing code
When reading the disassembly, you must be able to tell which code was written by hand and which was added automatically by the compiler. Within the hand-written code, you must further tell which parts are library functions, which parts the challenge author wrote, and how the compiler optimized the author's code. This matters because we do not want to spend time on code that is not the challenge author's. Spending half a day inside a library function is not only a miserable experience but also gets you nowhere.
In any case, given enough time, any program can be analyzed thoroughly, so do not give up too early. Trust that by patiently unraveling the program thread by thread, you will eventually break through the problem.
The purpose of dynamic analysis is to locate the key code, and to verify your inferences or understand the program's functionality through information observed while the program runs (registers, memory changes, program output).
The main methods are debugging, symbolic execution, and taint analysis.
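To make the idea of observing runtime state concrete, here is a small sketch for a Python-level target. It uses the standard sys.settrace hook to capture a function's local variables at the moment it returns. For native binaries, a debugger showing registers and memory plays the same role. The check function, its key, and its comparison values are hypothetical stand-ins for an unknown target.

```python
import sys

def check(candidate: str) -> bool:
    # Hypothetical target whose logic we pretend not to know.
    key = 0x42
    transformed = [ord(ch) ^ key for ch in candidate]
    return transformed == [0x24, 0x2E, 0x23, 0x25]

def trace_locals(func, *args):
    """Run func(*args) and capture its local variables when it returns."""
    captured = {}

    def tracer(frame, event, arg):
        if frame.f_code is not func.__code__:
            return None  # ignore frames that belong to other functions
        if event == "return":
            captured.update(frame.f_locals)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier trace hook
    return result, captured

if __name__ == "__main__":
    result, state = trace_locals(check, "test")
    print(result, hex(state["key"]), state["transformed"])
```

Seeing the key and the transformed input side by side suggests that the check is a single-byte XOR. That guess can then be confirmed against the disassembly and inverted in a solve script.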
Algorithm and data structure identification¶
- Common algorithm identification
TEA/XTEA/XXTEA/IDEA/RC4/RC5/RC6/AES/DES/MD5/SHA256/SHA1 and other cryptographic algorithms; big-number addition, subtraction, multiplication, and division; shortest path and other classic algorithms
- Common data structure identification
Identifying advanced data structures such as graphs, trees, and hash tables in assembly code.
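Many of the algorithms listed above can be recognized by characteristic constants that survive compilation. The sketch below scans a byte buffer for a few well-known values: the TEA-family delta, the first MD5/SHA1 state word, and the start of the AES S-box. The signature table is deliberately small, and the function name and sample blob are hypothetical. Real tools use far larger signature sets.

```python
import struct

# 32-bit constants that commonly appear as immediates or table entries.
SIGNATURES = {
    "TEA/XTEA/XXTEA delta": 0x9E3779B9,
    "MD5/SHA1 initial state": 0x67452301,
    "MD5 first round constant": 0xD76AA478,
    "SHA1 fifth state word": 0xC3D2E1F0,
    "SHA256 initial state": 0x6A09E667,
}

# First eight bytes of the AES S-box.
AES_SBOX_PREFIX = bytes([0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5])

def identify_algorithms(data: bytes) -> dict:
    """Map each matched signature name to the offset of its first occurrence."""
    hits = {}
    for name, value in SIGNATURES.items():
        # Try little-endian first, then big-endian byte order.
        for fmt in ("<I", ">I"):
            offset = data.find(struct.pack(fmt, value))
            if offset != -1:
                hits[name] = offset
                break
    offset = data.find(AES_SBOX_PREFIX)
    if offset != -1:
        hits["AES S-box"] = offset
    return hits

# Hypothetical blob: two filler bytes, the TEA delta, a zero byte, S-box start.
blob = b"\x90\x90" + struct.pack("<I", 0x9E3779B9) + b"\x00" + AES_SBOX_PREFIX

if __name__ == "__main__":
    for name, offset in identify_algorithms(blob).items():
        print(f"{name} at offset {offset:#x}")
```

A miss does not prove the algorithm is absent. Compilers may emit the negated delta 0x61C88647, for instance, and challenge authors often alter constants on purpose, so treat a hit as a lead to verify rather than a conclusion.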
For example, using tools and techniques such as SMC (self-modifying code) to obfuscate the code makes program analysis very difficult.
There are also deobfuscation techniques, whose main purpose is to restore the control flow. Emulated execution is one such technique, among others.
There are many types of protective shells (packers). Simple compression packers can be classified into the following types.
- unpack -> execute
Unpack the program code directly into memory, then continue by executing it.
- unpack -> execute -> unpack -> execute ...
Unpack part of the code and execute it, unpacking further as execution proceeds.
- unpack -> [decoder | encoded code] -> decode -> execute
The program code was encoded beforehand; after unpacking, a decoder function runs first to decode the real program code, which is then executed.
There are also established methods for unpacking, such as the single-step tracing method and the ESP law.
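The third layout listed above (unpack -> [decoder | encoded code] -> decode -> execute) can be modeled with a toy Python packer. This is only a sketch: zlib stands in for the compression stage, and a single-byte XOR stands in for the encoding. The key, the function names, and the payload are hypothetical, and a real packer works on machine code rather than Python source.

```python
import zlib

XOR_KEY = 0x5A  # hypothetical single-byte key used by the toy encoder

def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with the key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)

def pack(source: str) -> bytes:
    """Encode the payload, then compress it, as a packer would at build time."""
    encoded = xor_bytes(source.encode("utf-8"), XOR_KEY)
    return zlib.compress(encoded)

def unpack_and_run(packed: bytes) -> dict:
    """Stub executed at run time: unpack, decode, then execute the payload."""
    encoded = zlib.decompress(packed)                      # unpack
    source = xor_bytes(encoded, XOR_KEY).decode("utf-8")   # decode
    namespace = {}
    exec(source, namespace)                                # execute
    return namespace

# Hypothetical payload standing in for the real program code.
payload = "result = sum(range(5))"
packed = pack(payload)

if __name__ == "__main__":
    print(unpack_and_run(packed)["result"])
```

The practical point for the analyst is that the real code exists in plain form only after the decode step. Unpacking methods therefore aim to stop execution right there and dump memory.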
Anti-debugging aims to prevent a program from being debugged and analyzed, for example by detecting the debugger. Common approaches include calling API functions such as IsDebuggerPresent, SEH exception handling, and timing checks. A program can also protect itself by overwriting the debug port, debugging itself, and so on.
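The timing check mentioned above relies on a simple observation: code that is being single-stepped or paused at a breakpoint takes far longer than it does at full speed. A minimal Python sketch of the idea follows. The function names and thresholds are hypothetical, and a sleep stands in for the analyst's pause. Native programs typically read a tick counter or timestamp register instead.

```python
import time

def timing_check(work, threshold_seconds: float) -> bool:
    """Return True if work() took longer than the threshold allows."""
    start = time.perf_counter()
    work()
    elapsed = time.perf_counter() - start
    return elapsed > threshold_seconds

def fast_work():
    # Runs at full speed, as it would without a debugger attached.
    sum(range(1000))

def stepped_work():
    # The sleep stands in for an analyst pausing at a breakpoint.
    time.sleep(0.05)
    sum(range(1000))

if __name__ == "__main__":
    print("debugger suspected:", timing_check(stepped_work, 0.01))
    print("debugger suspected:", timing_check(fast_work, 5.0))
```

Bypassing such a check usually means patching the comparison, faking the timer value, or setting breakpoints only outside the timed region.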
Approach to unconventional reversing challenges¶
Unconventional reversing challenges cover a wide range: the target can be a file of any format for any architecture.
Reverse engineering methodology, however, is not daunted by unknown platforms and formats. For such unconventional challenges there is still a basic process that applies universally.
- Read the documentation. The quickest way to learn a platform or language is to read its official documentation.
- Official tools. The tools provided or recommended by the official vendor are usually the most suitable.
- Tutorials. Experienced reversers may already have written reversing tutorials for the specific platform or language; reading them lets you absorb this knowledge quickly.
Looking for tools¶
Mainly look for file parsing tools, disassemblers, debuggers, and decompilers. A disassembler is required, and a debugger also includes the corresponding disassembly functionality. As for a decompiler, you have to count on luck: if one exists, good; if not, you make do without it.
To sum up the tool hunt: Google is your friend. Using Google's search syntax and well-chosen keywords helps you find the right tool faster.
All content on this page is provided under the terms of the CC BY-NC-SA 4.0 license; additional terms may apply.