Abstract
|
The need to derive metadata from unstructured data is growing along with the volume of textual data on the Web and in digital libraries. Manually extracting such information from an ever-expanding collection of papers is time-consuming and costly. Consequently, automatic and efficient text processing techniques are required [5]. Text classification is one of the most widely used applications of natural language processing; common uses include information retrieval, sentiment analysis, intent detection, spam detection, and news categorization. Conventional text classification techniques rely on manually engineered features that are then passed to a classifier for training. These techniques, often founded on statistical principles, train classifiers on manually labeled datasets and then use them to classify new data. However, they generalize poorly to new settings because, in short-text classification tasks, their performance depends heavily on the particular dataset. With the development of deep learning, researchers have found that convolutional neural networks (CNNs) or recurrent neural networks (RNNs) can address these issues while also improving accuracy and performance. In this work, we combine a CNN with Word2Vec, a classic word-embedding method, for feature extraction from text.
|