Function Recognition in Stripped Binary of Embedded Devices

Function recognition is a preliminary step in reverse analysis and a prerequisite for binary reuse, control flow graph generation, and semantic analysis. Current function identification tools still have many shortcomings in cross-platform support,...

Bibliographic Details
Main Authors: Xiaokang Yin, Shengli Liu, Long Liu, Da Xiao
Format: Article
Language: English
Published: IEEE 2018-01-01
Series: IEEE Access
Subjects: Reverse engineering, binary analysis, function recognition, embedded devices
Online Access: https://ieeexplore.ieee.org/document/8552415/
author Xiaokang Yin
Shengli Liu
Long Liu
Da Xiao
collection DOAJ
description Function recognition is a preliminary step in reverse analysis and a prerequisite for binary reuse, control flow graph generation, and semantic analysis. Current function identification tools still have many shortcomings in cross-platform support, efficiency, and resource usage. To improve the accuracy and efficiency of function recognition in embedded system firmware, this paper proposes FRwithMBA, a new function recognition algorithm based on structure information. With the help of function call resolution, FRwithMBA identifies functions by determining where each function terminates, using branch instructions and function termination signatures. In the experiments, we evaluated FRwithMBA against IDA Pro, radare2, and angr on the Cisco IOS router binary and found that FRwithMBA is more effective than the other tools at identifying functions in stripped PPC and MIPS binaries. In terms of recognition effectiveness, FRwithMBA performs slightly better than IDA Pro, while the results of radare2 and angr are poor. For a 14-MB binary, IDA Pro takes 178 s to parse, radare2 takes 376 s, angr takes 1896 s, and FRwithMBA takes 25 s. When processing 220-MB Executable and Linkable Format files, FRwithMBA identifies nearly 420 000 functions in about 240 s, IDA Pro takes 2754 s, and both radare2 and angr fail. In the experiments, the recall of FRwithMBA reached 99.99% and the precision reached 99.7%, better than the other tools. In other words, FRwithMBA is faster, more accurate, and uses fewer resources.
first_indexed 2024-12-19T07:54:53Z
format Article
id doaj.art-77200e2868fe4aee956e8db58c53e095
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-19T07:54:53Z
publishDate 2018-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-77200e2868fe4aee956e8db58c53e095; 2022-12-21T20:30:02Z; eng; IEEE; IEEE Access; 2169-3536; 2018-01-01; vol. 6, pp. 75682-75694; 10.1109/ACCESS.2018.2883973; 8552415; Function Recognition in Stripped Binary of Embedded Devices; Xiaokang Yin (https://orcid.org/0000-0002-1617-4561), Shengli Liu, Long Liu, Da Xiao (all: State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, China)
title Function Recognition in Stripped Binary of Embedded Devices
topic Reverse engineering
binary analysis
function recognition
embedded devices
url https://ieeexplore.ieee.org/document/8552415/