Optimizing Supervised Learning Methods Using Particle Swarm Optimization for Malware Detection (Optimasi Metode Supervised Learning Dengan Menggunakan Particle Swarm Optimization Untuk Deteksi Malware)

Authors

  • Mayadi - Universitas Bhayangkara Jakarta Raya
  • Ismaniah - Universitas Bhayangkara Jakarta Raya
  • Tyastuti Sri Lestari - Universitas Bhayangkara Jakarta Raya
  • Wowon Priatna - Universitas Bhayangkara Jakarta Raya

DOI:

https://doi.org/10.34012/jutikomp.v6i2.4281

Keywords:

Malware Detection, Particle Swarm Optimization, Supervised Learning

Abstract

This research addresses malware detection, a problem that arises when users access the internet and download files that have been infiltrated by malware. A popular solution today is to train machine learning models on distinguishing features so that they can predict whether a particular piece of software is malware or harmless. The dataset used is a malware detection dataset from Kaggle, which is classified with an ensemble classifier, an algorithm in the supervised learning category. Classification is then improved through feature optimization using Particle Swarm Optimization (PSO). The ensemble algorithm alone achieved an accuracy of 92% and an AUC of 0.94. After optimization with PSO, accuracy increased by 7.32% to 100%, and AUC increased by 0.059 to 1. Based on these results, feature selection is recommended before building a classification model for malware detection.
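
The idea of PSO-driven feature selection can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic data stands in for the Kaggle dataset, a nearest-centroid classifier stands in for the ensemble classifier, and every name and parameter value here (`make_data`, `centroid_accuracy`, `binary_pso`, `w`, `c1`, `c2`, the swarm size) is an assumption chosen for illustration. Each particle is a binary mask over the features, and its fitness is the validation accuracy obtained with only the selected features.

```python
import math
import random

def make_data(n=200, n_informative=3, n_noise=7, seed=0):
    """Synthetic two-class data (hypothetical stand-in for a real dataset).
    Only the first n_informative features carry class information."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 3.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X_train, y_train, X_val, y_val, mask):
    """Validation accuracy of a nearest-centroid classifier that only
    sees the features where mask[j] == 1 (stand-in for the ensemble)."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_val, y_val):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, mu))
                for c, mu in centroids.items()}
        if min(dist, key=dist.get) == t:
            correct += 1
    return correct / len(y_val)

def binary_pso(fitness, n_features, n_particles=12, n_iter=30,
               w=0.7, c1=1.5, c2=1.5, seed=1):
    """Binary PSO: velocities are real-valued, positions are 0/1 masks
    sampled through a sigmoid of the velocity."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    pos[0] = [1] * n_features  # include the "all features" baseline in the swarm
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][j]
                     + c1 * r1 * (pbest[i][j] - pos[i][j])
                     + c2 * r2 * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-6.0, min(6.0, v))  # clamp to keep the sigmoid stable
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit

X, y = make_data()
X_train, y_train, X_val, y_val = X[:140], y[:140], X[140:], y[140:]

def fitness(mask):
    return centroid_accuracy(X_train, y_train, X_val, y_val, mask)

baseline = fitness([1] * 10)             # accuracy with all features
best_mask, best_acc = binary_pso(fitness, 10)
print("all features:", baseline)
print("selected mask:", best_mask, "accuracy:", best_acc)
```

Because the mask is tuned against the validation split, the reported accuracy is optimistic; in practice a separate held-out test set would be needed to estimate how well the selected features generalize.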

Published

2023-10-31

How to Cite

Mayadi, Ismaniah, Lestari, T. S., & Priatna, W. (2023). Optimasi Metode Supervised Learning Dengan Menggunakan Particle Swarm Optimization Untuk Deteksi Malware. JURNAL TEKNOLOGI DAN ILMU KOMPUTER PRIMA (JUTIKOMP), 6(2), 150-155. https://doi.org/10.34012/jutikomp.v6i2.4281