IJTRD

Title of Paper:
Evaluating Cassandra Data-sets with Hadoop Approaches

Authors:
Ruchira A. Kulkarni

Cite This Article :

Ruchira A. Kulkarni, "Evaluating Cassandra Data-sets with Hadoop Approaches", published in International Journal of Trend in Research and Development (IJTRD), ISSN: 2394-9333, Volume-2 | Issue-5, October 2015, URL: http://www.ijtrd.com/papers/IJTRD48.pdf

Abstract :
The progressive transition in both scientific and industrial datasets has been the driving force behind the development of, and research interest in, the NoSQL model. Loosely structured data poses a challenge to traditional data store systems, and when working with the NoSQL model, these systems are often considered impractical and costly. As the quantity and quality of less structured data grow, so does the demand for a processing pipeline capable of seamlessly binding the NoSQL storage model to MapReduce, a “Big Data” processing platform. Although MapReduce is the paradigm of choice for data-intensive computing, Java-based frameworks like Hadoop require users to write MapReduce code in Java, while the Hadoop Streaming module lets users define non-Java executables as map and reduce operations. When faced with legacy C/C++ applications and other non-Java executables, there arises a further need to give NoSQL data stores access to the features of Hadoop Streaming. We present approaches to solving the problem of integrating NoSQL data stores with MapReduce in non-Java application scenarios, along with the benefits and drawbacks of each approach. We compare Hadoop Streaming with our own streaming framework, MARISSA, to show the performance implications of coupling NoSQL data stores like Cassandra with MapReduce frameworks that normally rely on file-system-based data stores. Our experiments also include Hadoop-C*, a configuration in which a Hadoop cluster is co-located with a Cassandra cluster in order to process data using Hadoop with non-Java executables.
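
As a rough illustration of the streaming model mentioned in the abstract, the sketch below mimics the contract that a Hadoop Streaming job places on non-Java map and reduce steps: each step consumes text lines and emits tab-separated key/value records, and the reducer sees records sorted by key. The word-count task and the names map_lines, reduce_lines and run_locally are assumptions made for this example only; they are not taken from the paper, from MARISSA, or from Hadoop-C*. In a real job each step would be a separate executable that reads standard input and writes standard output.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper step: emit one tab-separated (word, 1) record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(sorted_records):
    """Reducer step: sum the counts for each key.

    Assumes the records arrive sorted by key, as they would after
    the framework's shuffle and sort phase.
    """
    parsed = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for key, group in groupby(parsed, key=lambda pair: pair[0]):
        total = sum(int(value) for _, value in group)
        yield f"{key}\t{total}"


def run_locally(lines):
    """Hypothetical local stand-in for a job: map, sort, then reduce."""
    return list(reduce_lines(sorted(map_lines(lines))))
```

Because both steps only exchange lines of text, the same contract could be met by a legacy C/C++ binary, which is the scenario the abstract is concerned with. How the input records are pulled out of Cassandra rather than a file system is not covered by this sketch.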

Keywords :
Hadoop, Cassandra, NoSQL, Pipelines, Map Reduce

Publication Details:

Published In :
Volume-2 | Issue-5, October 2015

e-ISSN Number :
2394-9333

Unique Identification Number :
IJTRD48

International Journal of Trend in Research and Development

International Peer Reviewed, Open Access Journal ISSN: 2394-9333
