![]() | Only 14 pages are available for public view |
Abstract The sharing of malicious code libraries and techniques over the Internet has driven the release of new malware instances to an unprecedented rate. These instances pose a significant threat because of their severe consequences for computer systems. Anti-malware software, often imprecisely called "anti-virus", has been the main tool for malware detection. However, malware authors continuously evolve obfuscation and evasion techniques that exploit the intrinsic weakness of traditional anti-malware programs: their reliance on a signature database to distinguish between benign and malicious instances. Consequently, anti-malware programs cannot effectively detect polymorphic or recently released malware, for which no signatures exist yet. Fortunately, the sharing of malicious code libraries and techniques has also led malware instances to share similar behaviors and features. Behavior-based detection techniques are therefore effective at catching malware instances that behave similarly yet have different signatures. In this thesis, two feature models and a classification module are proposed. The feature models are extracted by performing static analysis and dynamic analysis on a relatively recent malware dataset. The static analysis identifies the anomalies found in the malware image, while the dynamic analysis describes the malicious behavior exhibited by the malware at runtime. The classification module is built on the proposed feature models using genetic programming. Many experiments have been carried out to assess the viability of the proposed feature models and classification module. The two proposed feature models achieve improved classification accuracy compared with other state-of-the-art models in the literature. More precisely, the feature model extracted by static analysis achieves a 97.47% accuracy rate, while the feature model extracted by dynamic analysis achieves a 97.85% accuracy rate.
Compared to other models, the static and dynamic feature models increase the accuracy rate by 2.5% and 1.7%, respectively. |