1.

How does Spark Core fit into the picture for solving big data use cases?

Answer»

Spark Core exposes APIs such as reduce, collection and aggregation operations, and parallel transformations, which can easily handle use cases that deal with large volumes of data.

The key points are as follows:

  • Spark Core is the distributed execution engine for large-scale parallel and distributed data processing.
  • It provides fast, in-memory processing for large data sets.
  • It handles memory management and fault recovery.
  • It schedules, distributes, and monitors jobs on a cluster.
  • Spark Core comes with transformations such as map, flatMap, reduce, reduceByKey, and groupByKey, which handle key-value pair-based data processing for large data sets.
  • Spark Core also supports aggregation operations.
  • Spark Core supports Java, Scala, and Python.
  • Code snippet: val counts = textReader.flatMap(line => line.split(",")).map(word => (word, 1)).reduceByKey(_ + _) (an expanded, runnable sketch follows this list).
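To make the one-line snippet above concrete, here is a minimal runnable word-count sketch in Scala. It assumes a local Spark context and a hypothetical input file "input.txt"; textReader is simply the RDD of lines read from that file.

    import org.apache.spark.{SparkConf, SparkContext}

    object WordCount {
      def main(args: Array[String]): Unit = {
        // Local Spark context for illustration; on a real cluster the master URL would differ.
        val conf = new SparkConf().setAppName("WordCount").setMaster("local[*]")
        val sc = new SparkContext(conf)

        // "input.txt" is a placeholder path; textFile returns an RDD[String] of lines.
        val textReader = sc.textFile("input.txt")

        // Split each comma-separated line into words, pair each word with 1,
        // then sum the counts per word across the cluster.
        val counts = textReader
          .flatMap(line => line.split(","))
          .map(word => (word, 1))
          .reduceByKey(_ + _)

        counts.collect().foreach(println)
        sc.stop()
      }
    }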

Spark is primarily used as a data processing framework; however, it can also be used for data analysis and data science.
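As a small illustration of Spark for analysis, the sketch below aggregates sales per region using the higher-level DataFrame API that is built on top of Spark Core. The file "sales.csv" and its columns (region, amount) are hypothetical and only for illustration.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object SalesAnalysis {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("SalesAnalysis")
          .master("local[*]")
          .getOrCreate()

        // Read a hypothetical CSV with columns: region, amount.
        val sales = spark.read
          .option("header", "true")
          .option("inferSchema", "true")
          .csv("sales.csv")

        // Compute total and average sales amount per region.
        sales.groupBy("region")
          .agg(sum("amount").as("total_amount"), avg("amount").as("avg_amount"))
          .show()

        spark.stop()
      }
    }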


