All Published Papers Search Service

Title

Exploring the Performance of Feature Dimensionality Reduction Technique Using Malware Dataset.

Author

Azaabi Cletus, Alex Opoku (PhD), Benjamin Weyory (PhD)

Citation

Vol. 22  No. 6  pp. 690-696

Abstract

Features play a critical role in machine learning and predictive modelling in general, and in malware detection in particular. Machine learning models live and die on the basis of the features used in the training and testing phases. To ensure the optimum predictive capability of a model and to reduce computational cost, irrelevant, duplicate and unwanted features need to be transformed into a lower-dimensional space. This paper experimented with three feature dimensionality reduction strategies (data-dependent, data-independent and graph-based) for the prediction of malware. We implemented a data-dependent technique (Principal Component Analysis, PCA), a data-independent technique (the hashing trick, HT) and a graph-based technique (Uniform Manifold Approximation and Projection, UMAP) on a malware diagnostic dataset. After an initial experiment on the original dataset with all features using four classifiers, the best-performing classifier (SVM) was selected for the implementation. The features were then reduced to 16, and the performance of each dimensionality reduction technique on the dataset was evaluated using accuracy and false positive rate (FPR). The performances of the three techniques were compared. The results demonstrate that the data-independent technique (HT) outperformed the other two, with an accuracy of 99.1% and an FPR of 1.2%, against an accuracy of 98.7% and an FPR of 1.7% for the data-dependent and graph-based techniques. The paper concludes that the data-independent dimensionality reduction technique (HT) produces superior detection accuracy and a lower FPR on the malware dataset, and consequently shows high potential for malware detection and classification.
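
As a rough illustration of the data-independent approach and the two metrics named in the abstract, the following minimal sketch hashes named features into 16 buckets and computes accuracy and FPR from binary predictions. It is not the authors' implementation: the function names, the choice of MD5 as the hash, the signed-bucket variant and the sample feature names are all assumptions made for this example, and PCA, UMAP and the SVM classifier are not shown.

```python
import hashlib

N_BUCKETS = 16  # target dimensionality stated in the abstract


def _stable_hash(name: str) -> int:
    """Deterministic 64-bit hash of a feature name.

    Python's built-in hash() is salted per process, so MD5 is used here
    (an assumption of this sketch, not a choice taken from the paper).
    """
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def hash_features(features: dict, n_buckets: int = N_BUCKETS) -> list:
    """Project a {feature_name: value} mapping onto n_buckets dimensions."""
    vector = [0.0] * n_buckets
    for name, value in features.items():
        h = _stable_hash(name)
        index = h % n_buckets
        # One bit of the hash picks a sign so that collisions tend to cancel.
        sign = 1.0 if (h >> 32) & 1 else -1.0
        vector[index] += sign * value
    return vector


def accuracy_and_fpr(y_true, y_pred):
    """Return (accuracy, false positive rate) for binary labels, 1 = malware."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


if __name__ == "__main__":
    # Hypothetical feature names, used only to exercise the functions.
    sample = {"api_call_count": 12.0, "section_entropy": 7.1, "imports_kernel32": 1.0}
    print(len(hash_features(sample)))
    print(accuracy_and_fpr([1, 0, 0, 0, 0], [1, 1, 0, 0, 0]))
```

Because the bucket index depends only on the feature name, the mapping needs no fitting step on training data, which is what makes the technique data-independent; the price is that distinct features can collide in the same bucket.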

Keywords

Features, malware, transformation, accuracy, false positive.

URL

http://paper.ijcsns.org/07_book/202206/20220687.pdf