STAA 566 HW 3 - Map

Author

Tiana Stastny

2022 Thailand District Population Map

The purpose of this map is to display how population varies across the district regions in Thailand. The data used is from 2022.

Thailand shape data

The Thailand shape data, which gives the boundary information for various spatial units (province, district, sub-district, etc) can be downloaded from here:

https://data.humdata.org/dataset/cod-ab-tha

Specifically, I used the ‘tha_adm_rtsd_itos_20210121_shp.zip’ file.

District is a subset of province, and it was more interesting to plot the data at the district level, which is found in the ‘tha_admbnda_adm2_rtsd_20220121.shp’ file. The spatial units being displayed in the maps are regional districts.

Thailand population data for 2022

The Thailand population data came from the same ‘Humanitarian Data Exchange’ (HDX) source:

https://data.humdata.org/dataset/cod-ps-tha

Specifically, I used the ‘tha_admpop_adm2_2022.csv’ file.

Load libraries

library(tidyverse)
-- Attaching packages --------------------------------------- tidyverse 1.3.1 --
v ggplot2 3.3.5     v purrr   0.3.4
v tibble  3.1.6     v dplyr   1.0.7
v tidyr   1.1.4     v stringr 1.4.0
v readr   2.1.1     v forcats 0.5.1
-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()
library(rgeos)
Warning: package 'rgeos' was built under R version 4.1.3
Loading required package: sp
rgeos version: 0.5-9, (SVN revision 684)
 GEOS runtime version: 3.9.1-CAPI-1.14.2 
 Please note that rgeos will be retired by the end of 2023,
plan transition to sf functions using GEOS at your earliest convenience.
 GEOS using OverlayNG
 Linking to sp version: 1.5-0 
 Polygon checking: TRUE 
library(rgdal)
Warning: package 'rgdal' was built under R version 4.1.3
Please note that rgdal will be retired by the end of 2023,
plan transition to sf/stars/terra functions using GDAL and PROJ
at your earliest convenience.

rgdal: version: 1.5-32, (SVN revision 1176)
Geospatial Data Abstraction Library extensions to R successfully loaded
Loaded GDAL runtime: GDAL 3.4.1, released 2021/12/27
Path to GDAL shared files: C:/Users/tiana/OneDrive/Documents/R/win-library/4.1/rgdal/gdal
GDAL binary built with GEOS: TRUE 
Loaded PROJ runtime: Rel. 7.2.1, January 1st, 2021, [PJ_VERSION: 721]
Path to PROJ shared files: C:/Users/tiana/OneDrive/Documents/R/win-library/4.1/rgdal/proj
PROJ CDN enabled: FALSE
Linking to sp version:1.5-0
To mute warnings of possible GDAL/OSR exportToProj4() degradation,
use options("rgdal_show_exportToProj4_warnings"="none") before loading sp or rgdal.
library(maptools)
Checking rgeos availability: TRUE
Please note that 'maptools' will be retired by the end of 2023,
plan transition at your earliest convenience;
some functionality will be moved to 'sp'.
library(broom)
Warning: package 'broom' was built under R version 4.1.3
library(dplyr)
library(viridis)
Warning: package 'viridis' was built under R version 4.1.3
Loading required package: viridisLite
library(leaflet)
Warning: package 'leaflet' was built under R version 4.1.3
library(mapview)
Warning: package 'mapview' was built under R version 4.1.3
library(htmlwidgets)
Warning: package 'htmlwidgets' was built under R version 4.1.3

Load shape data

# load data
# thailand_shape_data <- readOGR("C:/Users/tiana/OneDrive/Documents/CSU_STAA_566_Data_Viz/maps-TStas/tha_admbnda_adm2_rtsd_20220121.shp") 

thailand_shape_data <- readOGR("C:/Users/tiana/OneDrive/Documents/CSU_STAA_566_Data_Viz/maps-TStas/tha_adm_rtsd_itos_20210121_shp/tha_admbnda_adm2_rtsd_20220121.shp") 
OGR data source with driver: ESRI Shapefile 
Source: "C:\Users\tiana\OneDrive\Documents\CSU_STAA_566_Data_Viz\maps-TStas\tha_adm_rtsd_itos_20210121_shp\tha_admbnda_adm2_rtsd_20220121.shp", layer: "tha_admbnda_adm2_rtsd_20220121"
with 928 features
It has 19 fields
# tidying the shape data gets it into a data frame with longitude and latitude
thai_tidy <- tidy(thailand_shape_data, region="ADM2_PCODE")

# plot just districts to begin with
district_plot <- ggplot() + geom_polygon(data = thai_tidy, aes(x = long, y = lat, group = group), fill="white", color="grey")+
  theme_void() +
  coord_map()
district_plot

Load population data

Load the population data.

# read in population data from csv file

thailand_population <- read.csv("C:/Users/tiana/OneDrive/Documents/CSU_STAA_566_Data_Viz/maps-TStas/tha_admpop_adm2_2022.csv")

Important columns include ADM2_EN (district name) and T_TL (total population for a district). We will also need to keep track of ADM2_PCODE, the code value for each district. This will be used to combine the population data with the shape data.

# want the ADM2_EN, ADM2_PCODE and T_TL columns
# Notes on levels of data:
# 0 = country
# 1 = province
# 2 = district  ** we want level 2
# 3 = sub-district

Make a new dataframe comprised of the desired columns.

# make new dataframe
district_populations <- data.frame(thailand_population$ADM2_NAME, thailand_population$ADM2_PCODE, thailand_population$T_TL)

# take a look
head(district_populations)
  thailand_population.ADM2_NAME thailand_population.ADM2_PCODE
1                      Chanuman                         TH3702
2                    Hua Taphan                         TH3706
3                     Lue Amnat                         TH3707
4          Mueang Amnat Charoen                         TH3701
5           Pathum Ratchawongsa                         TH3703
6                         Phana                         TH3704
  thailand_population.T_TL
1                    25704
2                    23228
3                    18235
4                    67475
5                    24735
6                    10830
# get dimensions
dim(district_populations)
[1] 928   3

Make a basic bar plot to get high level overview of population values.

p_bar <- ggplot(data = district_populations, aes(x=district_populations$thailand_population.ADM2_NAME, y=district_populations$thailand_population.T_TL)) + 
  geom_bar(stat="identity") + 
  theme(axis.text.x=element_blank())  # remove x-axis labels - too messy for now
p_bar
Warning: Use of `district_populations$thailand_population.ADM2_NAME` is
discouraged. Use `thailand_population.ADM2_NAME` instead.
Warning: Use of `district_populations$thailand_population.T_TL` is discouraged.
Use `thailand_population.T_TL` instead.

Combine the geospatial and population data

Use the value of the district code (under ‘ADM2_PCODE’ in the population data and ‘id’ in the tidied spatial data) to combine the shape and population data.

# combine geospatial and population data
# geospatial data is in: thai_tidy
# population data is in: district_populations

districts_and_populations <- thai_tidy %>% 
  left_join(., district_populations, by=c("id"="thailand_population.ADM2_PCODE"))

# review
head(districts_and_populations)
# A tibble: 6 x 9
   long   lat order hole  piece group    id    thailand_popula~ thailand_popula~
  <dbl> <dbl> <int> <lgl> <fct> <fct>    <chr> <chr>                       <int>
1  101.  13.7     1 FALSE 1     TH1001.1 TH10~ Phra Nakhon                 27890
2  100.  13.7     2 FALSE 1     TH1001.1 TH10~ Phra Nakhon                 27890
3  100.  13.7     3 FALSE 1     TH1001.1 TH10~ Phra Nakhon                 27890
4  100.  13.7     4 FALSE 1     TH1001.1 TH10~ Phra Nakhon                 27890
5  100.  13.7     5 FALSE 1     TH1001.1 TH10~ Phra Nakhon                 27890
6  100.  13.7     6 FALSE 1     TH1001.1 TH10~ Phra Nakhon                 27890

Make choropleth of 2022 Thailand district populations

First, try a basic version of a choropleth.

# basic choropleth 
thailand_pop <- ggplot() +
  geom_polygon(data =  districts_and_populations, aes(fill = districts_and_populations$thailand_population.T_TL, x = long, y = lat, group = group)) +
  theme_void() +
  coord_map()

# thailand_pop

Now, improve the basic plot by updating color, legend, and formatting. I found using a log scale helped improve the color differentiation, which this post suggests:

https://r-graph-gallery.com/327-chloropleth-map-from-geojson-with-ggplot2.html

# improve coloring, legend, etc
thailand_pop_2 <- ggplot() + 
  geom_polygon(data =  districts_and_populations, aes(fill = districts_and_populations$thailand_population.T_TL, x = long, y = lat, group = group)) +
  theme_void() +
  scale_fill_viridis(trans = "log", breaks=c(1000, 10000, 50000, 100000, 500000, 1000000), name="District population", guide = guide_legend(keyheight = unit(3, units="mm"), keywidth=unit(12, units="mm"), label.position = "bottom", title.position = 'top', nrow=1))+
  labs(
    title="2022 Thailand district population",
  ) + 
  coord_map()

# thailand_pop_2

# save ggplot object as png
# ggsave("thailand_population_choropleth.png", bg="white")

Use leaflet to build an interactive plot

It would be useful to be able to zoom in, click on a district, and be able to see both the name and population of that district.

# set colors for leaflet
colors_for_districts <- colorQuantile("viridis", NULL, n = 6)

# create popup of both district name and population value, each on one line
popup_district_names <- paste0("<strong>District: </strong>", district_populations$thailand_population.ADM2_NAME, "<br/>", "<strong>Population: </strong>", district_populations$thailand_population.T_TL)

# build leaflet
thailand_leaflet <- leaflet(thailand_shape_data) %>%
  addTiles() %>%
  setView(lat = 13.7563, lng = 100.5018, zoom = 5) %>%  # start with lat, lng of Bangkok
  addPolygons(
    fillColor = ~colors_for_districts(district_populations$thailand_population.T_TL),
    fillOpacity = 0.8,
    smoothFactor = 0.5,
    popup = popup_district_names
  )

# thailand_leaflet

# save
#saveWidget(thailand_leaflet, file="thailand_population_leaflet.html")