Spark Interview Questions
1. What is the difference between repartition and coalesce?
Answer»
Repartition vs. Coalesce

Partition count
Repartition: can increase or decrease the number of data partitions.
Coalesce: by default can only reduce the number of data partitions.

Data movement
Repartition: creates new data partitions and performs a full shuffle, so the data ends up evenly distributed.
Coalesce: merges already existing partitions, which reduces the amount of data shuffled but can leave the resulting partitions unevenly sized.

Speed
Repartition: on the RDD API, repartition(n) internally calls coalesce(n, shuffle = true); the full shuffle makes it slower than a plain coalesce.
Coalesce: faster than repartition because it avoids the full shuffle. However, if the resulting partitions are unequal in size, processing them might be slightly slower.
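The contrast between merging existing partitions and doing a full shuffle can be sketched with a toy model in plain Python. This is not Spark code and not Spark's actual algorithm: the functions `coalesce` and `repartition` below are hypothetical stand-ins that only track which records land in which partition, assuming contiguous merging for coalesce and round-robin redistribution for repartition.

```python
from typing import List

Partition = List[int]


def coalesce(partitions: List[Partition], n: int) -> List[Partition]:
    """Toy coalesce: merge whole existing partitions into at most n groups.

    Records never leave the group their original partition is assigned to,
    which stands in for "no full shuffle". Assumption: contiguous partitions
    are merged together; real Spark also considers data locality.
    """
    if n >= len(partitions):
        # Without a shuffle the partition count cannot grow.
        return [list(p) for p in partitions]
    merged: List[Partition] = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        merged[i * n // len(partitions)].extend(part)
    return merged


def repartition(partitions: List[Partition], n: int) -> List[Partition]:
    """Toy repartition: redistribute every record round-robin (full shuffle)."""
    out: List[Partition] = [[] for _ in range(n)]
    records = [r for p in partitions for r in p]
    for i, record in enumerate(records):
        out[i % n].append(record)
    return out


# Four skewed partitions holding 12, 2, 1 and 1 records (16 in total).
sizes = [12, 2, 1, 1]
data: List[Partition] = []
start = 0
for s in sizes:
    data.append(list(range(start, start + s)))
    start += s

print([len(p) for p in coalesce(data, 2)])     # sizes after merging
print([len(p) for p in repartition(data, 2)])  # sizes after full shuffle
print(len(coalesce(data, 8)))                  # count cannot increase
print(len(repartition(data, 8)))               # count can increase
```

In this model, coalescing to two partitions keeps the skew of the input, while repartitioning spreads the records evenly at the cost of moving every record. In PySpark the corresponding calls are `df.repartition(n)` and `df.coalesce(n)`, and `df.rdd.glom().map(len).collect()` reports the per-partition record counts.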