摘要: | 近年來,隨著網路的普及和社群媒體的興起,大量的用戶將自己對商品或服務的評價發布在網路上,這些評價包含了豐富的訊息,能夠為消費者提供有價值的參考。然而,由於評價數據龐大,手工分析是不現實的,因此需要使用自然語言處理技術,例如深度學習之CNN、RNN、BiRNN、LSTM、Transformer方法,對評價進行自動化分析。情感分析的目的是自動分類評論對某一特定方面的情緒,情緒可以是積極的、消極的或中性的。
本研究以Yahoo電影版之電影評價為例,使用深度學習的方法對網路上的正負評論進行分類。評價的文本經過預處理,包括斷詞、去除停用詞等,轉化為數據。接著,使用CNN、RNN、BiRNN、LSTM、Transformer模型進行訓練和預測,通過對已知正負標籤的評價進行訓練,提高模型的分類能力。
模擬結果顯示, CNN、RNN、BiRNN、LSTM和Transformer方法在電影評價的分類上表現良好,訓練集、驗證集、與測試集準確率皆達到84 %以上,驗證集和測試集上的準確率相對接近,這表示模型在泛化到未見過的數據時表現穩定。
總體而言,本研究證明了CNN、RNN、BiRNN、LSTM、Transformer方法在網路評價分析中的應用價值,並為進一步提高評價分類的準確率提供了一定的參考。
In recent years, with the prevalence of the Internet and the rise of social media, many people have been posted their own reviews of products or services on the Internet. These reviews contain rich information and might offer consumers useful resources. However, manual analysis is impractical, given the large volume of review data. Therefore, the use of natural language processing technologies, such as deep learning techniques like CNN, RNN, BiRNN, LSTM,and Transformer, is necessary for auto-mated analysis of reviews. Sentiment analysis is used to automatically classify a re-view’s sentiment--which might be neutral, positive, or negative--towards a particular aspect.
This study takes movie reviews from Yahoo Movies as an example, classifying positive and negative reviews on the Internet using deep learning techniques. The text of the review undergoes preprocessing, including tokenization, removing stop words, and data conversion. Subsequently, CNN, RNN, BiRNN, LSTM, and Transformer models are used for training and prediction. By training the models with reviews la-beled as positive and negative, the classification ability of the model is enhanced.
The simulation results show that the CNN, RNN, BiRNN, LSTM, and Transform-er methods perform well in the classification of movie reviews. The accuracy of the training set, verification set, and test set all exceeds 84%. The accuracy on the verifi-cation set and test set is relatively close, indicating that the model exhibits stable per-formance when generalizing to unseen data. |