1.

What are the benefits of using Spark Streaming for real-time processing instead of other frameworks and tools?

Answer»

Spark Streaming is a micro-batch-oriented stream processing engine. Data can be ingested from many sources, such as Kafka, Flume, Kinesis, or TCP sockets, and processed using complex algorithms expressed with high-level functions like map, reduce, join, and window.
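
To make the map, reduce, and window functions concrete, here is a minimal plain-Python sketch of a windowed word count over a sequence of batches. It is not the Spark API; the function name windowed_counts and the parameter window_length are hypothetical, and the window here is measured in number of batches rather than in seconds.

```python
from collections import Counter, deque
from functools import reduce
from typing import Dict, Iterable, List


def windowed_counts(batches: Iterable[List[str]], window_length: int) -> List[Dict[str, int]]:
    """Return, after each batch, the word counts over the last `window_length` batches."""
    # The deque drops the oldest batch automatically once the window is full.
    window = deque(maxlen=window_length)
    results = []
    for batch in batches:
        # "map" step: split every line into words and count them for this batch.
        window.append(Counter(word for line in batch for word in line.split()))
        # "reduce" step: merge the per-batch counts that are still inside the window.
        merged = reduce(lambda left, right: left + right, window, Counter())
        results.append(dict(merged))
    return results


# Hypothetical input: three batches of text lines, window of two batches.
batches = [["a b"], ["b"], ["c"]]
window_results = windowed_counts(batches, window_length=2)
```

In this sketch the third result no longer contains the word from the first batch, because that batch has slid out of the window.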

Other key benefits of Spark Streaming are listed below.

  • Spark Streaming is a component of Spark for processing real-time data efficiently.
  • It integrates with Kafka, a fault-tolerant messaging system that can run as a cluster and has traditionally relied on ZooKeeper for coordination.
  • It provides high-throughput, fault-tolerant stream processing.
  • It provides the DStream (discretized stream) abstraction, which is essentially a sequence of RDDs, for processing real-time data.
  • It fits scenarios where data flows from Kafka to a database or from Kafka to a data science model.

Spark Streaming works on batches: it receives an input data stream and divides it into micro-batches, which are then processed by the Spark engine to generate the final stream of results, also in batches.
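
The micro-batch idea can be sketched in plain Python as follows. This is only a conceptual illustration under stated assumptions, not Spark code: micro_batches and process_batch are hypothetical names, and batches are cut by record count here, whereas Spark Streaming cuts them by a time interval.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def micro_batches(stream: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Cut a (possibly unbounded) stream into consecutive batches of `batch_size` records."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            # The stream is exhausted, so stop producing batches.
            return
        yield batch


def process_batch(batch: List[str]) -> Dict[str, int]:
    """Stand-in for the processing engine: a word count over one batch."""
    counts: Dict[str, int] = {}
    for line in batch:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


# Hypothetical input stream of text lines, processed two records at a time.
lines = ["a b", "b c", "c c", "a"]
results = [process_batch(batch) for batch in micro_batches(lines, batch_size=2)]
```

Each element of results corresponds to one micro-batch, so the output is itself a stream of per-batch results, as described above.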

The diagram below illustrates the workflow of Spark Streaming.


