A Hybrid Multi-factor Forensic Data Analytics Framework for Suspicious Cyber Activity Detection

Himanshu Shukla *

Department of Information Technology, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, Uttarakhand – 263145, India.

Shikha Goswami

Department of Information Technology, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, Uttarakhand – 263145, India.

Rajeev Singh

Department of Information Technology, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, Uttarakhand – 263145, India.

Govind Verma

Department of Information Technology, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, Uttarakhand – 263145, India.

*Author to whom correspondence should be addressed.


Abstract

Background: The convergence of cloud, mobile, and enterprise networks improves operational efficiency and connectivity for organisations. However, it also increases cybersecurity risks, making multi-layered defences essential against threats such as credential attacks, data exfiltration, DDoS attacks, and insider threats.

Aims: This study examines whether integrating domain-expert rule logic with ensemble machine learning classifiers can produce a more dependable and operationally robust mechanism for identifying suspicious cyber activities within authentication-intensive environments than any single detection strategy alone.

Study Design: This comparative experimental study evaluated four detection configurations—rule-based scoring, Decision Tree (DT), Random Forest (RF), and a hybrid fusion model—against a binary-labelled cybercrime forensic dataset sourced from Kaggle.

Place and Duration of Study: Department of Information Technology, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, India; January 2025 – May 2026.

Methodology: A dataset of 7,400 records with eleven attributes was pre-processed through missing-value imputation, label encoding, and SMOTE + Tomek Links resampling to address class imbalance. Login_Attempts and the hour of the day were retained as primary predictors. DT and RF classifiers were trained alongside a rule-based multi-factor scoring model, and their outputs were fused through a logical-OR strategy to form the hybrid model. Performance was assessed using Accuracy, Precision, Recall, and confusion-matrix statistics, with Recall treated as the governing metric given the asymmetric cost of undetected attacks.
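The logical-OR fusion described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the rules, so the thresholds (five login attempts, an off-hours window of 22:00 to 06:00), the one-point-per-rule scoring, and the names `LoginEvent`, `rule_score`, `rule_flag`, and `hybrid_flag` are all assumptions introduced here. The classifier output is passed in as a plain boolean so the sketch stays independent of any particular DT or RF library.

```python
from dataclasses import dataclass


@dataclass
class LoginEvent:
    login_attempts: int  # number of login attempts recorded for the event
    hour: int            # hour of day, 0-23


def rule_score(event: LoginEvent) -> int:
    """Hypothetical multi-factor score: one point per triggered rule."""
    score = 0
    if event.login_attempts >= 5:            # assumed threshold
        score += 1
    if event.hour < 6 or event.hour >= 22:   # assumed off-hours window
        score += 1
    return score


def rule_flag(event: LoginEvent, threshold: int = 1) -> bool:
    """Flag the event when the rule score reaches the threshold."""
    return rule_score(event) >= threshold


def hybrid_flag(event: LoginEvent, ml_flag: bool, threshold: int = 1) -> bool:
    """Logical-OR fusion: suspicious if either the rules or the classifier say so."""
    return rule_flag(event, threshold) or ml_flag


if __name__ == "__main__":
    # The classifier missed this event (ml_flag=False), but the rules catch it.
    print(hybrid_flag(LoginEvent(login_attempts=7, hour=14), ml_flag=False))
```

Because an event is flagged when either detector fires, OR fusion can only add alerts relative to each detector alone. Under that design, missed attacks cannot increase, while false positives can.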

Results: The Hybrid Model achieved the highest Recall (97.20%), substantially outperforming the Rule-Based model (88.80%), Decision Tree (5.88%), and Random Forest (5.88%). The Decision Tree and Random Forest recorded the highest Accuracy (95.20% and 95.00%, respectively) and Precision (100% and 98.35%, respectively), but their Recall of 5.88% means that both missed the large majority of attack records. The Hybrid Model produced only 13 False Negatives, the lowest count among all configurations. These findings suggest that recall-optimised fusion is an appropriate detection paradigm for security-critical applications.
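The gap between Accuracy and Recall reported above follows from how the metrics are defined on an imbalanced dataset. The sketch below computes them from confusion-matrix counts. The counts are invented for illustration only (1,000 records with 50 attacks) and are not the study's confusion matrices; the function name `detection_metrics` and the two example configurations are likewise assumptions.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, Precision, and Recall from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        # Guard against division by zero when nothing is predicted positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: share of actual attacks that were detected.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Illustrative counts only: 1,000 records, 50 of them attacks.
# A conservative classifier that rarely raises an alert.
conservative = detection_metrics(tp=3, fp=0, tn=950, fn=47)
# A recall-oriented detector that raises many more alerts.
recall_oriented = detection_metrics(tp=48, fp=120, tn=830, fn=2)

if __name__ == "__main__":
    for name, m in [("conservative", conservative),
                    ("recall-oriented", recall_oriented)]:
        print(name, {k: round(v, 4) for k, v in m.items()})
```

With these illustrative counts, the conservative detector scores higher on Accuracy and Precision while missing 47 of the 50 attacks. This is why Recall is the more informative metric when the cost of an undetected attack dominates.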

Conclusion: Fusing domain-driven rule logic with supervised ensemble learning through a logical-OR strategy substantially improves minority-class attack detection. The proposed framework reduces missed attacks and demonstrates potential for deployment within real-world forensic cybersecurity pipelines, notwithstanding an elevated false-positive count that may be addressed through alert prioritisation in operational environments.

Keywords: Intrusion detection systems, digital forensics, forensic data analytics, hybrid machine learning, rule-based detection, Random Forest, Decision Tree, class imbalance, SMOTE–Tomek Links, cyber threat detection, suspicious activity detection


How to Cite

Shukla, Himanshu, Shikha Goswami, Rajeev Singh, and Govind Verma. 2026. “A Hybrid Multi-Factor Forensic Data Analytics Framework for Suspicious Cyber Activity Detection”. Journal of Advances in Mathematics and Computer Science 41 (7):176-88. https://doi.org/10.9734/jamcs/2026/v41i72174.
