|
Abstract
|
Preprocessing is the preparation of data for effective use by machine learning algorithms. It is a crucial step in model training because it can directly affect model performance. This is especially true for zero-day network intrusion detection systems (NIDS), where network traffic is increasingly complex: diverse and broad network services produce diverse packet features, including redundant information generated by the many types of network protocols.
Such features increase computational complexity and reduce accuracy. Effective preprocessing includes data cleaning to manage missing values and outliers, data denoising, and dimensionality reduction.
Combining multiple techniques, such as the wavelet transform and principal component analysis (PCA), in the preprocessing stage has shown great promise for improving model accuracy in zero-day intrusion detection. The wavelet transform has been shown to be effective for noise reduction: it decomposes high-dimensional data into different frequency components, which allows the high-frequency noise to be separated from the low-frequency signal, and thresholding the wavelet coefficients then suppresses the noise more precisely while retaining critical data. . . .
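
The thresholding idea described above can be illustrated with a minimal sketch. The source does not specify a wavelet family, decomposition depth, or threshold rule, so the choices here (a single-level Haar transform, soft thresholding, and the function names `haar_forward`, `haar_inverse`, `soft_threshold`, and `denoise`) are assumptions made only for illustration, and the PCA stage is not shown.

```python
import math


def haar_forward(signal):
    """Single-level Haar transform (an assumed wavelet choice).

    Returns (approximation, detail): the low-frequency and
    high-frequency coefficients of an even-length signal.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the signal from single-level Haar coefficients."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`.

    Coefficients with magnitude below the threshold become zero.
    """
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def denoise(signal, threshold):
    """Threshold only the high-frequency detail coefficients."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))


if __name__ == "__main__":
    # Hypothetical feature values with small high-frequency jitter.
    print(denoise([4.0, 4.2, 8.0, 7.8], threshold=0.5))
```

In this sketch the small differences between neighbouring samples fall below the threshold and are removed, while the low-frequency level of each pair is retained. A practical pipeline would likely use a multi-level decomposition and a data-driven threshold.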
|