Monday, April 25, 2016

Basic Apache Spark Utilization

A good, basic, mostly nontechnical introduction:

Basic Spark Utilization for Analytics in Big Data
by Atif Farid Mohammad

"Apache Spark is a Big Data analytics engine that runs both in memory and on disk. With in-memory processing it can be up to 100 times faster than standard MapReduce, and it is still up to 10 times faster when running on disk. Spark supports interactive processing through Spark SQL, which handles real-time queries. Spark is also helpful for data scientists running machine learning algorithms, because it supports iterative processing. ... "
