Interpretable Ensemble-Based Intrusion Detection Using Feature Selection on the ToN_IoT Dataset

Vaman Shakir Sulaiman, Firas Mahmood Mustafa

Abstract


With the rapid growth of IoT, securing interconnected devices against cyber threats has become critical. IoT datasets such as ToN_IoT are often high-dimensional, which poses challenges for efficient and accurate intrusion detection. Moreover, interpretable models are essential to help security analysts understand and trust automated decisions. Intrusion Detection Systems (IDS) powered by machine learning offer promising solutions, especially when trained on realistic datasets such as ToN_IoT. However, balancing high accuracy, computational efficiency, and model interpretability remains a challenge. This study proposes an efficient and interpretable IDS framework for binary classification using the ToN_IoT dataset. It aims to identify the best-performing feature selection method and ensemble learning model, and it uses explainable artificial intelligence to interpret model decisions. A quantitative experimental approach was adopted: Principal Component Analysis (PCA) and Recursive Feature Elimination (RFE) were applied and compared for feature selection, and LightGBM, XGBoost, and Random Forest classifiers were evaluated using Accuracy, F1-score, Precision, Recall, and training time. RFE outperformed PCA, identifying 11 key features, and LightGBM emerged as the top-performing model with an accuracy of 99.72%, demonstrating both speed and strong generalization. SHAP (SHapley Additive exPlanations) was used to generate summary plots of global feature importance, enhancing the transparency and interpretability of IDS decisions. Overall, the combination of RFE and LightGBM resulted in a high-performing and explainable IDS framework, underscoring the importance of strategic feature selection and model choice. Compared to existing IDS approaches on the ToN_IoT dataset, our proposed framework not only achieves higher accuracy but also provides a rapid and lightweight solution.
Additionally, by incorporating SHAP for feature importance analysis, our approach ensures clear model interpretability, allowing security analysts to understand and trust the system’s decisions. This combination of high performance, efficiency, and explainability highlights the practical advantages of our method over previous work. Future research will extend this framework to support multiclass classification and online learning for real-time threat detection.
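
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes scikit-learn, uses a synthetic binary dataset from make_classification in place of ToN_IoT, and uses Random Forest (one of the three classifiers named above) for both the RFE ranking and the final model so that no extra packages are needed. The function names evaluate and compare_rfe_and_pca, the sample sizes, and the tree counts are hypothetical choices; only the figure of 11 retained features and the list of metrics come from the abstract.

```python
import time

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

N_KEPT = 11  # the abstract reports that RFE kept 11 features


def evaluate(X_train, X_test, y_train, y_test, seed=0):
    """Fit a classifier and return the metrics named in the abstract."""
    clf = RandomForestClassifier(n_estimators=50, random_state=seed)
    start = time.perf_counter()
    clf.fit(X_train, y_train)
    train_seconds = time.perf_counter() - start
    pred = clf.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, pred),
        "f1": f1_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "train_seconds": train_seconds,
    }


def compare_rfe_and_pca(seed=0):
    """Reduce the same data with RFE and with PCA, then score both."""
    # Synthetic stand-in for a binary (normal vs. attack) table; NOT ToN_IoT.
    X, y = make_classification(
        n_samples=1000, n_features=30, n_informative=8, random_state=seed
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # Fit the scaler on the training split only, to avoid leakage.
    scaler = StandardScaler().fit(X_train)
    X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

    # RFE: refit the ranker and drop the least important feature
    # each round until N_KEPT original features remain.
    rfe = RFE(
        RandomForestClassifier(n_estimators=25, random_state=seed),
        n_features_to_select=N_KEPT,
    ).fit(X_train, y_train)

    # PCA: project onto N_KEPT principal components (new, mixed features).
    pca = PCA(n_components=N_KEPT, random_state=seed).fit(X_train)

    results = {
        "rfe": evaluate(rfe.transform(X_train), rfe.transform(X_test),
                        y_train, y_test, seed),
        "pca": evaluate(pca.transform(X_train), pca.transform(X_test),
                        y_train, y_test, seed),
    }
    return results, rfe.support_


if __name__ == "__main__":
    results, kept_mask = compare_rfe_and_pca()
    for name, metrics in results.items():
        print(name, {k: round(v, 4) for k, v in metrics.items()})
    print("features kept by RFE:", int(kept_mask.sum()))
```

One design point the sketch makes visible: RFE returns a mask over the original columns, so the retained features keep their names and can be passed directly to an explainer such as SHAP, whereas PCA components are linear mixtures of all inputs and are harder to interpret. Scores on synthetic data say nothing about which method wins on ToN_IoT.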


Keywords


LightGBM; Intrusion Detection System; XAI; SHAP; Ensemble Learning; RFE


DOI: https://doi.org/10.31326/jisa.v8i2.2487



Copyright (c) 2025 Vaman Shakir Sulaiman, Firas Mahmood Mustafa

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


JOURNAL IDENTITY

Journal Name: JISA (Jurnal Informatika dan Sains)
e-ISSN: 2614-8404, p-ISSN: 2776-3234
Publisher: Program Studi Teknik Informatika Universitas Trilogi
Publication Schedule: June and December 
Language: English
APC: The Journal Charges Fees for Publishing 
Indexing: EBSCO, DOAJ, Google Scholar, Arsip Relawan Jurnal Indonesia, Directory of Research Journals Indexing, Index Copernicus International, PKP Index, Science and Technology Index (SINTA, S4), Garuda Index
OAI address: http://trilogi.ac.id/journal/ks/index.php/JISA/oai
Contact: jisa@trilogi.ac.id
Sponsored by: DOI – Digital Object Identifier Crossref, Universitas Trilogi

In Collaboration With: Indonesian Artificial Intelligent Ecosystem (IAIE), Relawan Jurnal Indonesia, Jurnal Teknologi dan Sistem Komputer (JTSiskom)

 

 

