Published Papers Search Service
|
Title
|
Evaluation of Similarity Analysis of Newspaper Article
Using Natural Language Processing
|
Author
|
Ayako Ohshiro, Takeo Okazaki, Takashi Kano, and Shinichiro Ueda
|
Citation
|
Vol. 24 No. 6 pp. 1-7
|
Abstract
|
Comparing text features involves evaluating the "similarity" between texts. It is crucial to use appropriate similarity measures when comparing similarities. This study utilized various techniques to assess the similarities between newspaper articles, including deep learning and a previously proposed method: a combination of Pointwise Mutual Information (PMI) and Word Pair Matching (WPM), denoted as PMI+WPM. For performance comparison, law data from medical research in Japan were utilized as validation data in evaluating the PMI+WPM method. The distribution of similarities in text data varies depending on the evaluation technique and genre, as revealed by the comparative analysis. For newspaper data, non-deep learning methods demonstrated better similarity evaluation accuracy than deep learning methods. Additionally, evaluating similarities in law data is more challenging than in newspaper articles. Despite deep learning being the prevalent method for evaluating textual similarities, this study demonstrates that non-deep learning methods can be effective regarding Japanese-based texts.
|
Keywords
|
Pointwise Mutual Information, Simpson coefficient, Doc2vec, BERT, Newspaper.
|
URL
|
http://paper.ijcsns.org/07_book/202406/20240601.pdf
|