Genetic Analysis of Android Malware

  • Linna Wang, Tianjin University, Tianjin 300072, China
Keywords: Malware gene traceability, Malware analysis, Android

Abstract

As Android malware proliferates, tracing the lineage of new samples has become an important open problem in malware analysis. Linking newly discovered, unreported malware to prior knowledge in existing malware data pools helps security analysts understand how malware evolves and why. In practice, however, traceability analysis is complex and time-consuming: the volume of existing malware data is large, and much of the work is manual. The results of such analysis also often lack explanation. A comprehensive automated malware tracking system is therefore needed, one that gives detailed insight into how malware is tracked and how it evolves, and that can explain its conclusions. In this paper, we propose a knowledge graph-based approach that uses partial API call graphs comprising semantic and behavioral features to reveal traceability relations among malware samples and to provide explainable results for these relations. We apply our approach to a dataset of over 20,000 malware samples labeled with family information and spanning 10 years. To manage the complexity of the analysis, we leverage prior knowledge from existing malware research together with a branch pruning method on call graphs, which reduces computational cost and improves the precision of explanations when determining traceability relations.
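To make the idea of a pruned, partial API call graph more concrete, the following is a minimal Python sketch. It is not the authors' implementation. It assumes a call graph given as a caller-to-callees dictionary and a small, hypothetical set of sensitive APIs standing in for the prior knowledge mentioned above. The function names `prune_call_graph` and `edge_similarity`, the sample graphs, and the use of Jaccard similarity over retained edges are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical stand-in for prior knowledge about security-relevant APIs.
SENSITIVE_APIS = {
    "android.telephony.SmsManager.sendTextMessage",
    "android.telephony.TelephonyManager.getDeviceId",
}


def prune_call_graph(graph, sensitive):
    """Return the set of call edges that lie on a path to a sensitive API.

    graph maps each caller to a list of callees. Branches that never
    reach a sensitive API are dropped.
    """
    # Reverse adjacency lets us walk backwards from the sensitive APIs.
    reverse = defaultdict(set)
    for caller, callees in graph.items():
        for callee in callees:
            reverse[callee].add(caller)

    nodes = set(graph) | {c for cs in graph.values() for c in cs}
    keep = set()
    stack = [api for api in sensitive if api in nodes]
    while stack:
        node = stack.pop()
        if node in keep:
            continue
        keep.add(node)
        stack.extend(reverse[node] - keep)

    return {(u, v) for u, vs in graph.items() for v in vs
            if u in keep and v in keep}


def edge_similarity(edges_a, edges_b):
    """Jaccard similarity of two edge sets (0.0 if both are empty)."""
    if not edges_a and not edges_b:
        return 0.0
    return len(edges_a & edges_b) / len(edges_a | edges_b)


# Two made-up samples; the second adds a device-ID collection branch.
sample_a = {
    "com.a.Main.onCreate": ["com.a.Sms.send", "com.a.Ui.draw"],
    "com.a.Sms.send": ["android.telephony.SmsManager.sendTextMessage"],
    "com.a.Ui.draw": ["android.graphics.Canvas.drawText"],
}
sample_b = {
    "com.a.Main.onCreate": ["com.a.Sms.send", "com.a.Info.collect"],
    "com.a.Sms.send": ["android.telephony.SmsManager.sendTextMessage"],
    "com.a.Info.collect": [
        "android.telephony.TelephonyManager.getDeviceId",
        "java.lang.String.format",
    ],
}

pruned_a = prune_call_graph(sample_a, SENSITIVE_APIS)
pruned_b = prune_call_graph(sample_b, SENSITIVE_APIS)
shared = pruned_a & pruned_b  # edges that explain the link between samples
score = edge_similarity(pruned_a, pruned_b)
```

In this sketch the shared edges serve as a simple explanation of why two samples are considered related, and the pruning step keeps the comparison limited to branches that end in security-relevant behavior.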

Published
2025-08-08