Summary: | Function recognition is a preliminary step in reverse analysis and a prerequisite for binary reuse, control flow graph generation, and semantic analysis. Current function identification tools still have shortcomings in cross-platform support, efficiency, and resource usage. To improve the accuracy and efficiency of function recognition in embedded system firmware, this paper proposes FRwithMBA, a new function recognition algorithm based on structural information. With the help of function call resolution, FRwithMBA identifies functions by determining where each function terminates, using branch instructions and function termination signatures. In the experiment, we evaluated FRwithMBA against IDA Pro, radare2, and angr on the router binary Cisco IOS and found that FRwithMBA is more effective than the other tools at identifying functions in stripped binaries under PPC and MIPS. In terms of recognition effectiveness, FRwithMBA performs slightly better than IDA Pro, while radare2 and angr produce poor results. For a 14-MB binary, IDA Pro takes 178 s to parse, radare2 takes 376 s, angr takes 1896 s, and FRwithMBA takes 25 s. When processing 220-MB Executable and Linkable Format (ELF) files, FRwithMBA identifies nearly 420 000 functions in about 240 s, IDA Pro takes 2754 s, and both radare2 and angr fail with a fault. In the experiment, FRwithMBA reached a recall of 99.99% and a precision of 99.7%, higher than the other tools. In other words, FRwithMBA is faster, more accurate, and uses fewer resources.
|