Balaji Reddy's Blog

This blog shares my perspective on Spark and Scala with my fellow developers.

Posts

Featured

November 18, 2019

zip()

zip(...) is a simple, general-purpose operation. It returns an RDD of key-value pairs (a ZippedPartitionsRDD2) that pairs the first element of each RDD, the second element of each RDD, and so on. It is an element-wise, narrow transformation. The following example demonstrates this.

    val rdd1 = spark.sparkContext.parallelize(Seq(1, 2, 3, 4, 5))
    val rdd2 = spark.sparkContext.parallelize(Seq(6, 7, 8, 9, 10))
    rdd1.zip(rdd2).foreach(println(_))

Result

    (1,6)
    (2,7)
    (4,9)
    (5,10)
    (3,8)

[Figure: DAG]

The result shows that the first element of rdd1 is zipped with the first element of rdd2, and so on. The DAG also confirms that zip(...) is a narrow operation. In the code above, rdd1 and rdd2 have the same number of records, but the interesting point is that the following error will be thrown if your rdds ...
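The printed result in the excerpt is out of order ((3,8) comes last) because foreach runs on each partition independently, so the print order across partitions is not guaranteed. A minimal sketch of the same zip that returns the pairs in RDD order uses collect() instead. The object name ZipExample, the helper zipPairs, the local[2] master, and the explicit numSlices value are assumptions made for illustration and are not from the original post.

```scala
import org.apache.spark.sql.SparkSession

object ZipExample {
  // Hypothetical helper: zips the two RDDs from the post and returns the pairs to the driver.
  def zipPairs(spark: SparkSession): Array[(Int, Int)] = {
    val sc = spark.sparkContext
    // Assumption: both RDDs get the same explicit partition count. The Spark API docs
    // state that zip assumes the same number of partitions and the same number of
    // elements in each partition on both sides.
    val rdd1 = sc.parallelize(Seq(1, 2, 3, 4, 5), numSlices = 2)
    val rdd2 = sc.parallelize(Seq(6, 7, 8, 9, 10), numSlices = 2)
    // collect() gathers the results in partition order, unlike foreach(println).
    rdd1.zip(rdd2).collect()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for experimentation; the post uses an existing `spark` value.
    val spark = SparkSession.builder()
      .appName("zip-example")
      .master("local[2]")
      .getOrCreate()
    try {
      zipPairs(spark).foreach(println) // prints (1,6) (2,7) (3,8) (4,9) (5,10), one per line
    } finally {
      spark.stop()
    }
  }
}
```

Calling collect() is only sensible for small RDDs like this one, since it brings every element back to the driver.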

Latest posts

November 01, 2019

filter()

October 16, 2019

mapPartitions()

October 11, 2019

Intersection()

October 08, 2019

Subtract()

September 28, 2019

Collect vs Map Operation

August 19, 2017

Union Operation

Reduce() Operation

February 18, 2017

Map Operation

January 17, 2017

Spark Execution

January 02, 2017

map(), flatMap() vs mapValues(),flatMapValues()

Balaji Reddy
