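This practical brings together what we’ve covered so far:
- Data frames (Practical 1)
- R Markdown documents and chunk options (Practical 2)
- Visualisation with ggplot2 (lectures)

The goal is to create a complete, reproducible analysis document that combines data exploration with professional visualisations.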
In Practical 1, you learned about data frames — the fundamental structure
for storing tabular data in R. You accessed columns using $ notation
(e.g., mtcars$mpg) and explored built-in datasets.
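As a quick refresher, the sketch below extracts one column from a built-in data frame with $ and summarises it. The variable name mpg_values is only illustrative.

```{r}
# mtcars is a built-in data frame; $ extracts a single column as a vector
mpg_values <- mtcars$mpg

# Inspect the structure of the data frame, then summarise the column
str(mtcars)
mean(mpg_values)
```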
In Practical 2, you learned how to create R Markdown documents that weave together code, output, and narrative text. You used chunk options to control what appears in your final document:
| Goal | Chunk option |
|---|---|
| Hide code, show output | `#\| echo: false` |
| Hide everything | `#\| include: false` |
| Figure caption | `#\| fig.cap: "..."` |
| Figure size | `#\| fig.width`, `#\| fig.height` |
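A minimal sketch of a chunk that uses one of these options is shown below. It assumes a version of knitr recent enough to read options written as #| comments (1.35 or later); the object name mpg_summary is illustrative. With echo: false, the knitted document would show the printed summary but not the code that produced it.

```{r}
#| echo: false
# The code still runs when the document is knitted; only its display is suppressed
mpg_summary <- summary(mtcars$mpg)
mpg_summary
```

Changing the option to include: false would run the same code but show neither the code nor the summary.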
In lectures, you learned about ggplot2 and the grammar of graphics:
ggplot(data, aes(x = var1, y = var2)) +
geom_*() +
labs(x = "X Label", y = "Y Label", title = "Title")
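To make the template concrete, here is one possible way to fill it in, using mtcars and a boxplot as the geom. The object names p_base and p_box are illustrative; the point of the sketch is that the data and aesthetic mappings are declared once and layers are then added with +.

```{r}
library(ggplot2)

# Data and aesthetic mappings: which variables go on which axes
p_base <- ggplot(mtcars, aes(x = factor(cyl), y = mpg))

# Add a geometry layer and labels; swapping geom_boxplot() for another
# geom changes the plot type without touching the mappings
p_box <- p_base +
  geom_boxplot() +
  labs(x = "Cylinders", y = "Miles per Gallon", title = "Fuel efficiency by cylinder count")

p_box
```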
Now we combine all three: create ggplot2 visualisations inside an R Markdown
document, using chunk options to produce professional output.
When you create a ggplot in an R Markdown code chunk, the plot is
automatically included in your output document. Here’s an example:
ggplot(mtcars, aes(x = wt, y = mpg)) +
geom_point() +
labs(x = "Weight (1000 lbs)", y = "Miles per Gallon")
Figure 2.1: Scatterplot of weight vs fuel efficiency.
A typical analysis workflow in R Markdown involves:
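- Loading packages in a setup chunk (hidden with #| include: false)
- Preparing and summarising the data (functions from dplyr are useful here)
- Creating visualisations and interpreting them

Tips for professional output:
- Label your axes: use labs() to make plots self-explanatory.
- Add a caption: add #| fig.cap: "..." to your chunk options.
- Control figure size: use #| fig.width and #| fig.height for the actual dimensions, and #| out.width for how much space it takes.
- Hide the code: use #| echo: false so readers see only the visualisation, not the code.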
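The tips above could be combined in a single chunk along the following lines. This is a sketch rather than a required layout: the caption text, the sizes, and the object name p_tips are all illustrative choices. fig.width and fig.height are given in inches, and out.width is given here as a percentage of the available text width.

```{r}
#| echo: false
#| fig.cap: "Highway fuel efficiency against engine displacement."
#| fig.width: 6
#| fig.height: 4
#| out.width: "80%"
library(ggplot2)

# Axis labels make the plot readable without the surrounding text
p_tips <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  labs(x = "Engine displacement (litres)", y = "Highway MPG")

p_tips
```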
Note: in these practicals, especially the solutions, the code is sometimes included for illustration. In the assignments and exam, it will be specified whether you need to hide the code.
In this exercise, you will create an R Markdown document that contains
7 different types of visualisations, mostly using the mpg dataset (fuel economy
data for 234 vehicles). Your final document should include all 7 plots, each with a caption and, where the question asks for one, a short interpretation.
Getting started:
- Create a new R Markdown file called practical03_plots.Rmd
- Add a setup chunk that loads ggplot2:
```{r setup}
#| include: false
library(ggplot2)
```
For each question:
- Add a figure caption with #| fig.cap: "..."
- Hide the code with #| echo: false

When you have completed all 7 plots, knit your document to PDF.
Histogram: Create a histogram of cty using geom_histogram().
Experiment with different binwidth values (try 1, 2, and 5).
Add appropriate axis labels using labs().
In your interpretation, describe whether the distribution is symmetric
or skewed.
Answer: The distribution is right-skewed with most vehicles getting 15-20 city MPG.
ggplot(mpg, aes(x = cty)) +
geom_histogram(binwidth = 2, fill = "steelblue", colour = "white") +
labs(x = "City MPG", y = "Count")
Figure 3.1: Distribution of city fuel efficiency.
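One way to run the bin width experiment without copying the plot code three times is sketched below. The names binwidths and histograms are illustrative, and the title is added only so that the three versions can be told apart.

```{r}
library(ggplot2)

# Build one histogram per bin width so the three versions can be compared
binwidths <- c(1, 2, 5)
histograms <- lapply(binwidths, function(bw) {
  ggplot(mpg, aes(x = cty)) +
    geom_histogram(binwidth = bw, fill = "steelblue", colour = "white") +
    labs(x = "City MPG", y = "Count", title = paste("binwidth =", bw))
})

# Print each plot in turn; in R Markdown each one appears as a separate figure
for (h in histograms) print(h)
```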
Density plot: Create a density plot of cty using geom_density().
Add a rug plot underneath using geom_rug(). In your interpretation,
compare this to the histogram — what does the density plot show that
the histogram doesn’t (and vice versa)?
Answer: The density plot gives a smoother view of the shape of the distribution. The histogram shows the actual counts, but its appearance is affected by the choice of bin width.
ggplot(mpg, aes(x = cty)) +
geom_density(fill = "lightblue", alpha = 0.5) +
geom_rug() +
labs(x = "City MPG", y = "Density")
Figure 3.2: Density of city fuel efficiency.
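To compare the two views directly, the histogram and the density curve can be drawn on the same axes. The sketch below is not part of the exercise; it assumes ggplot2 3.3.0 or later, where after_stat() is available, and the name p_overlay is illustrative.

```{r}
library(ggplot2)

# after_stat(density) rescales the histogram bars so that their total area is 1,
# which puts them on the same vertical scale as the density curve
p_overlay <- ggplot(mpg, aes(x = cty)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 2,
                 fill = "steelblue", colour = "white") +
  geom_density(fill = "lightblue", alpha = 0.5) +
  geom_rug() +
  labs(x = "City MPG", y = "Density")

p_overlay
```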
Bar chart: Create a bar chart of class (vehicle type) using
geom_bar(). Add appropriate axis labels. In your interpretation,
state which vehicle class is most common in the dataset.
Answer: SUV is the most common vehicle class in the dataset.
ggplot(mpg, aes(x = class)) +
geom_bar(fill = "steelblue") +
labs(x = "Vehicle Class", y = "Count")
Figure 3.3: Bar chart of vehicle type.
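The heights of the bars are simply counts of rows per class, so the reading from the chart can be checked with a frequency table. A short sketch, with class_counts as an illustrative name:

```{r}
library(ggplot2)  # provides the mpg dataset

# Count the vehicles in each class and sort from most to least common
class_counts <- sort(table(mpg$class), decreasing = TRUE)
class_counts

# The first name in the sorted table is the most common class
names(class_counts)[1]
```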
Scatterplot: Explore the relationship between cty and hwy.
Create a scatterplot of cty (\(y\)-axis) vs hwy (\(x\)-axis) using
geom_point(). Add a trend line using geom_smooth(method = "lm").
In your interpretation, describe the relationship you observe.
Answer: There is a strong positive linear relationship: vehicles with higher highway MPG also tend to have higher city MPG.
ggplot(mpg, aes(x = hwy, y = cty)) +
geom_point() +
geom_smooth(method = "lm") +
labs(x = "Highway MPG", y = "City MPG")
Figure 3.4: City fuel efficiency against highway fuel efficiency.
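The visual impression of a strong linear relationship can be backed up numerically. The sketch below computes the correlation and fits the same straight line that geom_smooth(method = "lm") draws; the names r_cty_hwy and fit are illustrative.

```{r}
library(ggplot2)  # provides the mpg dataset

# Pearson correlation between highway and city MPG
r_cty_hwy <- cor(mpg$hwy, mpg$cty)
r_cty_hwy

# Least-squares line of city MPG on highway MPG
fit <- lm(cty ~ hwy, data = mpg)
coef(fit)
```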
Line plot: Create a line plot using the economics dataset.
Plot psavert (personal savings rate) over time using geom_line().
Add appropriate axis labels. In your interpretation, describe the
trend you observe.
Answer: There is a general declining trend from the 1970s to around 2005, followed by an increase after the 2008 financial crisis.
ggplot(economics, aes(x = date, y = psavert)) +
geom_line() +
labs(x = "Date", y = "Personal Savings Rate (%)")
Figure 3.5: Personal savings rate over time.
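Month-to-month fluctuations can make the long-run trend harder to see. One option, sketched here as an optional extra, is to overlay a smoothed curve on the raw series. The choice of a loess smoother and the name p_trend are assumptions made for illustration.

```{r}
library(ggplot2)

# Raw monthly series in grey, with a loess smoother on top to highlight the trend
p_trend <- ggplot(economics, aes(x = date, y = psavert)) +
  geom_line(colour = "grey50") +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE) +
  labs(x = "Date", y = "Personal Savings Rate (%)")

p_trend
```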
Stripchart: Create a stripchart of cty by class using
geom_jitter() with width = 0.2. Add appropriate axis labels.
ggplot(mpg, aes(x = class, y = cty)) +
geom_jitter(width = 0.2) +
labs(x = "Vehicle Class", y = "City MPG")
Figure 3.6: Stripchart of city fuel efficiency by vehicle class.
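By default, geom_jitter() adds a small amount of random noise vertically as well as horizontally, and the noise changes each time the document is knitted. If you want the points to stay at their true cty values and the figure to be reproducible, one possibility is sketched below. The seed value and the name p_strip are arbitrary illustrative choices.

```{r}
library(ggplot2)

# height = 0 keeps each point at its true cty value;
# seed fixes the horizontal jitter so the plot looks the same on every knit
p_strip <- ggplot(mpg, aes(x = class, y = cty)) +
  geom_point(position = position_jitter(width = 0.2, height = 0, seed = 1)) +
  labs(x = "Vehicle Class", y = "City MPG")

p_strip
```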
Boxplot: Compare the distribution of cty across vehicle classes
using geom_boxplot() with aes(x = class, y = cty). Add appropriate
axis labels. Combine the boxplot with a jitter plot (use
outlier.shape = NA in the boxplot to avoid duplicate points).
In your interpretation, state which vehicle class has the best city
fuel efficiency.
Answer: Subcompact has the best median city fuel efficiency
ggplot(mpg, aes(x = class, y = cty)) +
geom_boxplot(outlier.shape = NA) +
geom_jitter(width = 0.2, alpha = 0.5) +
labs(x = "Vehicle Class", y = "City MPG")
Figure 3.7: Boxplot of city fuel efficiency by vehicle class, overlaid with a stripchart (jitter plot).
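When several boxes sit at similar heights, it can be hard to rank the classes by eye. The medians drawn in the boxplot can be computed directly to check your interpretation, as in this sketch (median_cty is an illustrative name):

```{r}
library(ggplot2)  # provides the mpg dataset

# Median city MPG for each vehicle class
median_cty <- aggregate(cty ~ class, data = mpg, FUN = median)

# Sort so that the class with the highest median appears first
median_cty[order(median_cty$cty, decreasing = TRUE), ]
```

Replacing median with mean gives the class averages, which need not rank the classes in the same order.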
Create a complete analysis document
Using another dataset such as mtcars or airquality, create a
complete R Markdown document that includes:
- Hidden code (#| echo: false)

Answer: This is left as an exercise.
This practical demonstrated how to:
- Create visualisations inside an R Markdown document (using ggplot2)

In future practicals, we will build on these skills to create more sophisticated visualisations using colours, scales, faceting, and themes.