S. Daisy Fatima Mary, G. Mageswary
Sentiment analysis is an active field of study that focuses on discerning the feelings, viewpoints, and emotions conveyed in a piece of text. It enables the extraction of valuable insights from textual data sourced from platforms such as Facebook, Twitter, and Amazon. Social media data are frequently unstructured and difficult to manage because of their diverse formats and complex nature. This paper uses the MOOC dataset for sentiment analysis. It aims to show how reviews can be preprocessed so that their sentiment can be determined and classified as positive, negative, or neutral. Several text preprocessing techniques, including stemming, lemmatization, tokenization, emoticon removal, stopword removal, and spelling correction, are applied to the unstructured text to improve the efficiency of text classification. Preprocessing is carried out with Natural Language Processing (NLP), a vital component of sentiment analysis that helps in preprocessing text data, extracting features, classifying sentiment, understanding language nuances, and presenting results in an interpretable manner. Classification accuracy is measured before and after preprocessing. The results show that the accuracy of the algorithm improved significantly after the preprocessing steps were applied. This work demonstrates the impact of text preprocessing on the accuracy of machine learning algorithms. Proper preprocessing contributes to improved prediction accuracy and reduced computational time, ultimately leading to better outcomes in various applications.
Machine learning, sentiment analysis, data preprocessing, natural language processing.
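
As a rough illustration of the kind of preprocessing pipeline described in the abstract, the following is a minimal Python sketch covering four of the listed steps: emoticon removal, tokenization, stopword removal, and stemming. It is not the authors' implementation. The stopword list, the emoticon pattern, the function names `naive_stem` and `preprocess`, and the sample review are all assumptions made for illustration, and the suffix-stripping stemmer is a crude stand-in for an established stemmer such as Porter's. Lemmatization and spelling correction are omitted because they normally rely on external language resources.

```python
import re
from typing import List

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "the", "is", "was", "were", "and", "or", "of", "to", "in", "it", "this", "i"}

# Matches a few common ASCII emoticons such as :) ;-( =D when they are
# not immediately followed by a word character (assumed pattern, not exhaustive).
EMOTICON_RE = re.compile(r"[:;=]-?[)(DPp](?!\w)")

# Suffixes tried in this order by the naive stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(token: str) -> str:
    """Strip at most one common suffix, keeping a stem of at least 3 letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(review: str) -> List[str]:
    """Remove emoticons, lowercase, tokenize, drop stopwords, then stem."""
    text = EMOTICON_RE.sub(" ", review).lower()
    tokens = re.findall(r"[a-z]+", text)  # keep alphabetic tokens only
    return [naive_stem(t) for t in tokens if t not in STOPWORDS]


if __name__ == "__main__":
    # Hypothetical review text, not taken from the MOOC dataset.
    print(preprocess("The lectures were amazing :) and I enjoyed the quizzes!"))
```

The cleaned token list produced by `preprocess` is the kind of input that would then be passed to feature extraction and a classifier, so that accuracy can be compared with and without these steps.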