1. What is a jitter plot? Explain with an example.

Answer»

A jitter plot is a variant of the scatter plot, used to examine the correlation between two variables. It adds a small random offset to each point, so that points an ordinary scatter plot would draw on top of one another become visible.

Consider the mpg dataset, with city mileage (cty) and highway mileage (hwy). The data has 234 observations, but a typical scatter plot appears to display far fewer points.

This is because many points overlap and appear as a single dot. Both cty and hwy are stored as integers in the source dataset, so many cars share exactly the same coordinates, which makes the overlap easy to miss.
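The overlap can be checked directly by comparing the number of rows with the number of distinct (cty, hwy) pairs. This is a minimal sketch; the variable names n_rows and n_positions are chosen here for illustration.

```r
library(ggplot2)  # provides the mpg dataset

# Total number of observations in mpg
n_rows <- nrow(mpg)

# Number of distinct (cty, hwy) combinations, i.e. distinct dot positions
positions <- unique(as.data.frame(mpg)[, c("cty", "hwy")])
n_positions <- nrow(positions)

cat("rows:", n_rows, " distinct positions:", n_positions, "\n")
```

If n_positions is smaller than n_rows, a plain scatter plot cannot show every observation as a separate dot.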

  • Load package and data

library(ggplot2)
data(mpg, package="ggplot2")
theme_set(theme_bw())  # pre-set the bw theme
g <- ggplot(mpg, aes(cty, hwy))

  • Scatterplot

g + geom_point() +
  geom_smooth(method="lm", se=FALSE) +
  labs(subtitle="mpg: city vs highway mileage",
       y="hwy",
       x="cty",
       title="Scatterplot with overlapping points",
       caption="Source: mpg")

A jitter plot handles this.

We can make a jitter plot with geom_jitter(). As the name suggests, overlapping points are randomly shifted around their original positions, by an amount controlled by the width argument (horizontal) and, optionally, the height argument (vertical).
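To make the role of the width concrete, here is a rough sketch of the idea in base R: each value gets a uniform random offset no larger than the chosen width. This only illustrates the concept and is not claimed to be ggplot2's exact internals; the names offsets and cty_jittered are made up for this example.

```r
library(ggplot2)  # only needed for the mpg dataset

set.seed(1)    # fix the random offsets so the result is repeatable
width <- 0.5   # plays the same role as geom_jitter(width = 0.5)

# Shift each integer cty value by a uniform random amount in [-width, width]
offsets <- runif(nrow(mpg), min = -width, max = width)
cty_jittered <- mpg$cty + offsets

# No jittered value moves further than `width` from its original position
max_shift <- max(abs(cty_jittered - mpg$cty))
print(max_shift <= width)
```

A larger width separates overlapping points more, but also moves each point further from its true value, so it is usually kept below half the spacing between neighbouring values.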

  • Load package and data

library(ggplot2)
data(mpg, package="ggplot2")

  • Jitter plot

theme_set(theme_bw())  # pre-set the bw theme
g <- ggplot(mpg, aes(cty, hwy))
g + geom_jitter(width = .5, size=1) +
  labs(subtitle="mpg: city vs highway mileage",
       y="hwy",
       x="cty",
       title="Jittered Points")
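Because the offsets are random, the plot looks slightly different each time it is drawn. One way to get a repeatable figure, sketched here assuming a ggplot2 version whose position_jitter() accepts a seed argument (3.0.0 or later), is to pass an explicit position to geom_point(). The seed value 123 and the height of 0.5 are arbitrary choices for illustration.

```r
library(ggplot2)

g <- ggplot(mpg, aes(cty, hwy))

# Same idea as geom_jitter(), but with a fixed seed so the
# random offsets are identical on every render
p <- g + geom_point(
  position = position_jitter(width = 0.5, height = 0.5, seed = 123),
  size = 1
)
p
```

Setting height explicitly also makes the vertical jitter visible in the code rather than leaving it at its default.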


