Using Apache Spark for Analysing the Sentiments of Unstructured Data with Logistic Regression Algorithm



Chetan Balaji

Chetan Balaji, "Using Apache Spark for Analysing the Sentiments of Unstructured Data with Logistic Regression Algorithm", published in International Journal of Trend in Research and Development (IJTRD), ISSN: 2394-9333, Volume-7 | Issue-5, October 2020, URL: http://www.ijtrd.com/papers/IJTRD22297.pdf

Sentiment analysis has become an active field in both research and industry. The term sentiment refers to a person's feelings or thoughts about a particular issue, and sentiment analysis is regarded as a direct application of opinion mining. The enormous amount of unstructured data is a major source of textual data and one of the most significant data volumes; this data serves various purposes, such as business, industrial, or social ones, depending on the data requirements and the processing needed. The amount of data is huge and grows rapidly every second; this is called big data, and it requires special processing techniques and high computational power to perform the required mining tasks. Here we propose performing sentiment analysis with the Apache Spark framework, an open-source distributed data processing platform that uses a distributed memory abstraction. The goal of using Apache Spark's machine learning library (MLlib) is to handle very large amounts of data effectively. We recommend several pre-processing and machine learning text feature extraction steps to obtain better results in sentiment analysis classification. The effectiveness of the proposed approach is demonstrated against other approaches, achieving better classification results when using Naïve Bayes and Decision Tree classification algorithms. Finally, our solution evaluates the performance of Apache Spark with respect to its scalability.
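The stages named in the abstract (pre-processing, text feature extraction, classification) can be illustrated with a minimal single-machine sketch. This is not Spark code and not the paper's implementation: it uses only the Python standard library to show what tokenization, hashed term-frequency features, and the logistic regression of the title compute, which are the roles that MLlib's Tokenizer, HashingTF, and LogisticRegression play in a distributed pipeline. The toy sentences, labels, feature-space size, learning rate, and all function names below are assumptions made for illustration.

```python
import math
import re
import zlib

NUM_FEATURES = 1 << 12  # assumed size of the hashed feature space

# Toy labelled sentences (1 = positive, 0 = negative); illustrative only.
TRAIN = [
    ("great movie, loved it", 1),
    ("wonderful and happy experience", 1),
    ("terrible service, hated it", 0),
    ("awful and sad experience", 0),
]


def tokenize(text):
    """Pre-processing: lowercase the text and keep alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def hashing_tf(tokens, num_features=NUM_FEATURES):
    """Feature extraction: sparse term-frequency vector via the hashing trick."""
    vec = {}
    for tok in tokens:
        idx = zlib.crc32(tok.encode("utf-8")) % num_features
        vec[idx] = vec.get(idx, 0.0) + 1.0
    return vec


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, vec):
    """Probability that a sparse feature vector belongs to the positive class."""
    z = bias + sum(weights[i] * v for i, v in vec.items())
    return sigmoid(z)


def train(samples, epochs=200, lr=0.5, num_features=NUM_FEATURES):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * num_features
    bias = 0.0
    data = [(hashing_tf(tokenize(t), num_features), y) for t, y in samples]
    for _ in range(epochs):
        for vec, y in data:
            err = predict_proba(weights, bias, vec) - y  # gradient of log loss w.r.t. z
            for i, v in vec.items():
                weights[i] -= lr * err * v
            bias -= lr * err
    return weights, bias


WEIGHTS, BIAS = train(TRAIN)


def classify(text):
    """Return 1 for predicted positive sentiment, 0 for negative."""
    return 1 if predict_proba(WEIGHTS, BIAS, hashing_tf(tokenize(text))) > 0.5 else 0
```

Calling `classify` on any of the four training sentences returns its label. A real deployment would replace the Python loops with Spark's distributed equivalents so that the same stages scale across a cluster.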

Apache Spark, Unstructured Data


Volume-7 | Issue-5, October 2020

2394-9333

IJTRD22297