1.

What is SparkSession in PySpark?

Answer»

SparkSession is the entry point to PySpark and has been the replacement for SparkContext since PySpark version 2.0. It acts as the starting point for accessing all PySpark functionality related to RDDs, DataFrames, Datasets, etc. It is also a unified API that replaces SQLContext, StreamingContext, HiveContext, and the other contexts.

SparkSession internally creates a SparkContext and a SparkConf based on the details provided to it. A SparkSession is created using the builder pattern.
