1 Instructions

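This practical focuses on building interactive Shiny apps in R. You will produce static plots of the data (Part A), build a Shiny app around them step by step (Part B), and reflect on what the plots show (Part C).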

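The vehicle_pts dataset contains 2,635 vehicle-related records from Chicago in September 2016. Each row is a single record with columns including CreationDt (the record's creation date), Address, ZIPCode, Ward, PoliceD (police district), Latitude and Longitude.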

You will need the following packages. Install any that are missing using install.packages():

library(ggplot2)
library(dplyr)
library(sf)
library(shiny)
library(geodaData)
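
A quick way to see which of these packages still need installing is sketched below. It uses only base R; the object names `needed`, `is_installed` and `missing_pkgs` are illustrative, and the commented-out install step assumes each package is available from your configured repositories, which may not hold for every package (check the package's own installation notes if it fails).

```r
# Packages used in this practical (taken from the library() calls above).
needed <- c("ggplot2", "dplyr", "sf", "shiny", "geodaData")

# requireNamespace() returns FALSE when a package is not installed;
# it loads the namespace but does not attach the package.
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Not yet installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install
} else {
  message("All packages are installed.")
}
```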

Load the dataset:

data(vehicle_pts)

2 Part A: The static plots

2.1 Exercise 1

  1. Use head() or str() to inspect vehicle_pts. Confirm that CreationDt is a Date column and identify the range of dates covered.

    head(vehicle_pts)
    ## old-style crs object detected; please recreate object with a recent sf::st_crs()
    ## old-style crs object detected; please recreate object with a recent sf::st_crs()
    ## Simple feature collection with 6 features and 10 fields
    ## Geometry type: POINT
    ## Dimension:     XY
    ## Bounding box:  xmin: -87.76067 ymin: 41.65195 xmax: -87.53995 ymax: 41.98211
    ## Geodetic CRS:  WGS 84
    ##   CreationDt             Address ZIPCode       X       Y Ward PoliceD Comm
    ## 1 2016-09-26      4137 W 62ND ST   60629 1149744 1863164   23       8   65
    ## 2 2016-09-27     3428 E 134TH ST   60633 1201060 1816722   10       4   55
    ## 3 2016-09-19      2015 W 83RD ST   60620 1164308 1849607   18       6   71
    ## 4 2016-09-04   5322 W WARNER AVE   60641 1140129 1927077   38      16   15
    ## 5 2016-09-13 9811 S MICHIGAN AVE   60628 1178876 1839902    9       5   49
    ## 6 2016-09-16   2649 W GREGORY ST   60625 1157677 1936759   40      20    4
    ##   Latitude Longitude                   geometry
    ## 1 41.78032 -87.72670  POINT (-87.7267 41.78032)
    ## 2 41.65195 -87.53995 POINT (-87.53995 41.65195)
    ## 3 41.74283 -87.67361 POINT (-87.67361 41.74283)
    ## 4 41.95610 -87.76067  POINT (-87.76067 41.9561)
    ## 5 41.71598 -87.62032 POINT (-87.62032 41.71598)
    ## 6 41.98211 -87.69620  POINT (-87.6962 41.98211)
    str(sf::st_drop_geometry(vehicle_pts))
    ## 'data.frame': 2635 obs. of  10 variables:
    ##  $ CreationDt: Date, format: "2016-09-26" "2016-09-27" ...
    ##  $ Address   : chr  "4137 W 62ND ST" "3428 E 134TH ST" "2015 W 83RD ST" "5322 W WARNER AVE" ...
    ##  $ ZIPCode   : num  60629 60633 60620 60641 60628 ...
    ##  $ X         : num  1149744 1201060 1164308 1140129 1178876 ...
    ##  $ Y         : num  1863164 1816722 1849607 1927077 1839902 ...
    ##  $ Ward      : num  23 10 18 38 9 40 18 6 50 28 ...
    ##  $ PoliceD   : num  8 4 6 16 5 20 8 7 24 12 ...
    ##  $ Comm      : num  65 55 71 15 49 4 70 68 2 28 ...
    ##  $ Latitude  : num  41.8 41.7 41.7 42 41.7 ...
    ##  $ Longitude : num  -87.7 -87.5 -87.7 -87.8 -87.6 ...

    Answer: CreationDt is of class Date, covering 2016-09-01 to 2016-09-30 (30 unique dates). The data is an sf object with POINT geometry in EPSG:4326.

  2. Create a static geospatial plot of all vehicle record locations using geom_sf(). Use coord_sf() with xlim = c(-87.9, -87.5) and ylim = c(41.65, 42.05) (and crs = 4326) to zoom to the Chicago area.

    ggplot(vehicle_pts) +
      geom_sf(size = 0.5) +
      coord_sf(xlim = c(-87.9, -87.5), ylim = c(41.65, 42.05), crs = 4326)
    ## old-style crs object detected; please recreate object with a recent sf::st_crs()
    ## old-style crs object detected; please recreate object with a recent sf::st_crs()

    Figure 2.1: All vehicle record locations in Chicago, September 2016.

  3. Create a static scatterplot of PoliceD (y-axis) against Ward (x-axis) for all records.

    vehicle_pts |>
      ggplot(aes(x = Ward, y = PoliceD)) +
      geom_point()

    Figure 2.2: Police district against ward for all vehicle records.
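
For item 1 of this exercise, head() and str() show only the first few values, so the date range is easier to confirm by summarising the column directly. The sketch below is one way to do that; `summarise_dates()` is a hypothetical helper, not part of any package, and the final step runs only if geodaData is installed.

```r
# Hypothetical helper: first date, last date and number of distinct dates.
summarise_dates <- function(dates) {
  stopifnot(inherits(dates, "Date"))
  list(
    first  = min(dates),
    last   = max(dates),
    n_days = length(unique(dates))
  )
}

# Apply it to the practical's data when the package is available.
if (requireNamespace("geodaData", quietly = TRUE)) {
  data("vehicle_pts", package = "geodaData")
  str(summarise_dates(vehicle_pts$CreationDt))
}
```

The result should agree with the range and the number of unique dates stated in the answer to item 1.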

3 Part B: Building the Shiny app

3.1 Exercise 2: A minimal app with a date slider

  1. Create a new file called app.R in your working directory. Copy and paste the template below, fill in every ___, and run the app.

    library(shiny)
    library(ggplot2)
    library(dplyr)
    library(sf)
    library(geodaData)
    
    data(vehicle_pts)
    
    ui <- fluidPage(
      sliderInput(
        inputId = "date",
        label   = "Select date:",
        min     = as.Date("2016-09-01"),
        max     = as.Date("2016-09-30"),
        value   = as.Date("2016-09-15")
      ),
      plotOutput("map_plot")
    )
    
    server <- function(input, output, session) {
      output$map_plot <- renderPlot({
        pts_day <- vehicle_pts |> filter(CreationDt == ___)
        ggplot(pts_day) +
          geom_sf(size = ___) +
          coord_sf(xlim = c(-87.9, -87.5), ylim = c(41.65, 42.05),
                   crs = 4326)
      })
    }
    
    shinyApp(ui = ui, server = server)

    Answer: The completed app uses input$date in filter() and a point size of 1:

    library(shiny)
    library(ggplot2)
    library(dplyr)
    library(sf)
    library(geodaData)
    
    data(vehicle_pts)
    
    ui <- fluidPage(
      sliderInput(
        inputId = "date",
        label   = "Select date:",
        min     = as.Date("2016-09-01"),
        max     = as.Date("2016-09-30"),
        value   = as.Date("2016-09-15")
      ),
      plotOutput("map_plot")
    )
    
    server <- function(input, output, session) {
      output$map_plot <- renderPlot({
        pts_day <- vehicle_pts |> filter(CreationDt == input$date)
        ggplot(pts_day) +
          geom_sf(size = 1) +
          coord_sf(xlim = c(-87.9, -87.5), ylim = c(41.65, 42.05),
                   crs = 4326)
      })
    }
    
    shinyApp(ui = ui, server = server)

  2. Move the slider across several dates. Does the number of points on the map change noticeably from day to day?

    Answer: Yes, the number of records varies from day to day. Weekdays tend to show more records than weekends, and the spatial distribution also shifts slightly.
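
The day-to-day variation can also be checked numerically rather than by eye. The following is a minimal sketch, assuming a hypothetical helper `records_per_day()` that takes the CreationDt column; the weekday names it returns depend on your locale, and the final step runs only if geodaData is installed.

```r
# Hypothetical helper: one row per date with its weekday and record count.
records_per_day <- function(dates) {
  stopifnot(inherits(dates, "Date"))
  counts <- table(dates)            # names are the dates, in chronological order
  days <- as.Date(names(counts))
  data.frame(
    date    = days,
    weekday = weekdays(days),
    n       = as.integer(counts)
  )
}

# Apply it to the practical's data when the package is available.
if (requireNamespace("geodaData", quietly = TRUE)) {
  data("vehicle_pts", package = "geodaData")
  print(records_per_day(vehicle_pts$CreationDt))
}
```

Comparing n across the weekday column is a direct way to check whether weekdays really do show more records than weekends.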

3.2 Exercise 3: Two plots in a fluidRow

  1. Add a second renderPlot() to the server that produces a scatterplot of PoliceD (y-axis) against Ward (x-axis) for the records on the selected date. Place the two plots side by side using fluidRow() and column() (each with width = 6).

    library(shiny)
    library(ggplot2)
    library(dplyr)
    library(sf)
    library(geodaData)
    
    data(vehicle_pts)
    
    ui <- fluidPage(
      sliderInput(
        inputId = "date",
        label   = "Select date:",
        min     = as.Date("2016-09-01"),
        max     = as.Date("2016-09-30"),
        value   = as.Date("2016-09-15")
      ),
      fluidRow(
        column(width = 6, plotOutput("map_plot")),
        column(width = 6, plotOutput("scatter_plot"))
      )
    )
    
    server <- function(input, output, session) {
      output$map_plot <- renderPlot({
        pts_day <- vehicle_pts |> filter(CreationDt == input$date)
        ggplot(pts_day) +
          geom_sf(size = 1) +
          coord_sf(xlim = c(-87.9, -87.5), ylim = c(41.65, 42.05),
                   crs = 4326) +
          labs(title = format(input$date, "%d %B %Y"))
      })
      output$scatter_plot <- renderPlot({
        pts_day <- vehicle_pts |>
          sf::st_drop_geometry() |>
          filter(CreationDt == input$date)
        ggplot(pts_day, aes(x = Ward, y = PoliceD)) +
          geom_point() +
          labs(x = "Ward", y = "Police District",
               title = format(input$date, "%d %B %Y"))
      })
    }
    
    shinyApp(ui = ui, server = server)

  2. Both plots share the same slider. Why is data(vehicle_pts) placed above the ui and server definitions rather than inside renderPlot()?

    Answer: vehicle_pts is a fixed dataset that does not change between renders. Loading it once outside the server means it is read into memory when the app starts. Each renderPlot() then filters the same in-memory object. If it were inside renderPlot(), the full dataset would be reloaded from the package every time the slider moved, which is unnecessarily slow.
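
Because both plots filter the same in-memory object by the same date, the filtering step itself can also be shared. The sketch below is one possible refactoring, not the practical's solution: it wraps the filter in a `reactive()` so that it runs once per slider change. The `n_records` output is a hypothetical addition for illustration, and the code assumes vehicle_pts has already been loaded at the top of app.R.

```r
library(shiny)

# Assumes vehicle_pts was loaded once at the top of app.R, as in the practical.
server <- function(input, output, session) {
  # Filter once per change of input$date; every output that calls
  # pts_day() reuses the same cached result.
  pts_day <- reactive({
    req(input$date)  # wait until the input has a value
    vehicle_pts[vehicle_pts$CreationDt == as.Date(input$date), , drop = FALSE]
  })

  # Illustrative output (not in the practical): records on the selected date.
  output$n_records <- renderText(nrow(pts_day()))
}
```

In the full app, each renderPlot() would call pts_day() instead of repeating the filter() step, and the scatterplot could still apply sf::st_drop_geometry() to the result.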

3.3 Exercise 4: selectInput as an alternative

  1. Replace the sliderInput in your app with a selectInput whose choices are all 30 dates in September 2016 (2016-09-01 to 2016-09-30 inclusive). Use seq() to generate the sequence of dates and as.character() to convert them to the character strings that selectInput requires.

    Hint: selectInput always passes its selected value to the server as a character string, so you will need as.Date(input$date) inside filter() to convert it back to a Date before comparing with CreationDt. The rest of the server code does not need to change.

    library(shiny)
    library(ggplot2)
    library(dplyr)
    library(sf)
    library(geodaData)
    
    data(vehicle_pts)
    
    ui <- fluidPage(
      selectInput(
        inputId  = "date",
        label    = "Select date:",
        choices  = as.character(
          seq(as.Date("2016-09-01"), as.Date("2016-09-30"), by = "day")
        ),
        selected = "2016-09-15"
      ),
      fluidRow(
        column(width = 6, plotOutput("map_plot")),
        column(width = 6, plotOutput("scatter_plot"))
      )
    )
    
    server <- function(input, output, session) {
      output$map_plot <- renderPlot({
        pts_day <- vehicle_pts |>
          filter(CreationDt == as.Date(input$date))
        ggplot(pts_day) +
          geom_sf(size = 1) +
          coord_sf(xlim = c(-87.9, -87.5), ylim = c(41.65, 42.05),
                   crs = 4326) +
          labs(title = input$date)
      })
      output$scatter_plot <- renderPlot({
        pts_day <- vehicle_pts |>
          filter(CreationDt == as.Date(input$date))
        ggplot(pts_day, aes(x = Ward, y = PoliceD)) +
          geom_point() +
          labs(x = "Ward", y = "Police District",
               title = input$date)
      })
    }
    
    shinyApp(ui = ui, server = server)

  2. Both the sliderInput version (Exercise 2) and the selectInput version (Exercise 4) let the user pick one of 30 dates and produce identical plots. Which input widget do you find more appropriate for this dataset, and why?

    Answer: Either is defensible, but sliderInput is arguably more natural here: the 30 dates form a continuous ordered sequence and the slider conveys that ordering visually, making it easy to sweep through the month. selectInput is more suitable when choices are an unordered list of labels (e.g., district names) or when there are too many values for a slider to be readable. With only 30 options selectInput also works, but the dropdown hides the ordering and requires more clicks to step through the month day by day.
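
A practical difference between the two widgets is the type of value each hands to the server: the date slider supplies a Date (which is why Exercise 2 compares input$date with CreationDt directly), whereas the dropdown supplies a character string that must be converted back. The sketch below builds both inputs from one date sequence and checks that the character round trip loses nothing. The object names are illustrative, and the `timeFormat` and `animate` settings are optional extras rather than part of the exercise.

```r
library(shiny)

# All 30 days of September 2016.
sept_days <- seq(as.Date("2016-09-01"), as.Date("2016-09-30"), by = "day")

# Slider: keeps the ordering visible; animate = TRUE adds a play button
# that steps through the dates automatically.
date_slider <- sliderInput(
  inputId    = "date",
  label      = "Select date:",
  min        = min(sept_days),
  max        = max(sept_days),
  value      = as.Date("2016-09-15"),
  timeFormat = "%d %b",
  animate    = TRUE
)

# Dropdown: the choices are supplied as character strings.
date_select <- selectInput(
  inputId  = "date",
  label    = "Select date:",
  choices  = as.character(sept_days),
  selected = "2016-09-15"
)

# Converting to character and back recovers the original dates, which is
# what as.Date(input$date) relies on in the selectInput version.
restored <- as.Date(as.character(sept_days))
stopifnot(all(restored == sept_days))
```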

4 Part C: Reflection

4.1 Exercise 5

  1. Look at the scatterplot of PoliceD against Ward for several different dates. Does the scatterplot reveal a meaningful relationship between the two variables? Explain why or why not.

    Answer: The scatterplot does not reveal a meaningful relationship. Both Ward and PoliceD are identifiers (administrative labels), not quantities or measurements. Plotting one identifier against another shows which ward–district combinations appear in the data on that day, but the numeric values of the identifiers carry no inherent meaning: ward 10 is not ‘twice’ ward 5 in any real sense. A more meaningful plot would use counts (e.g., number of records per ward per day) or a continuous measurement.

  2. What would make a more informative plot using these two columns? Suggest one alternative visualisation that would be more meaningful.

    Answer: One alternative is a bar chart of the number of records per ward for the selected date, using count(Ward) before passing to ggplot() with geom_col(). This would show which wards have the most vehicle records on a given day, which is a genuine count rather than an identifier-vs-identifier comparison.
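
The bar chart suggested in the last answer could be sketched as follows. `plot_ward_counts()` is a hypothetical helper, not part of the practical, and it assumes the input has the CreationDt and Ward columns shown in Exercise 1.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: bar chart of records per ward on one date.
plot_ward_counts <- function(pts, date) {
  # Drop the geometry first so count() works on a plain data frame.
  if (inherits(pts, "sf")) {
    pts <- sf::st_drop_geometry(pts)
  }
  pts |>
    filter(CreationDt == as.Date(date)) |>
    count(Ward) |>
    ggplot(aes(x = factor(Ward), y = n)) +
    geom_col() +
    labs(x = "Ward", y = "Number of records",
         title = format(as.Date(date), "%d %B %Y"))
}
```

Inside the app, a renderPlot() call could return plot_ward_counts(vehicle_pts, input$date). Treating Ward as a factor keeps the x-axis categorical, in line with the point that ward numbers are labels rather than quantities.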