The purpose of this map is to display how population varies across the district regions in Thailand. The data used is from 2022.
Thailand shape data
The Thailand shape data, which gives the boundary information for various spatial units (province, district, sub-district, etc) can be downloaded from here:
Specifically, I used the ‘tha_adm_rtsd_itos_20210121_shp.zip’ file.
District is a subset of province, and it was more interesting to plot the data at the district level, which is found in the ‘tha_admbnda_adm2_rtsd_20220121.shp’ file. The spatial units being displayed in the maps are regional districts.
Thailand population data for 2022
The Thailand population data came from the same ‘Humanitarian Data Exchange’ (HDX) source:
v ggplot2 3.3.5 v purrr 0.3.4
v tibble 3.1.6 v dplyr 1.0.7
v tidyr 1.1.4 v stringr 1.4.0
v readr 2.1.1 v forcats 0.5.1
-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag() masks stats::lag()
library(rgeos)
Warning: package 'rgeos' was built under R version 4.1.3
Loading required package: sp
rgeos version: 0.5-9, (SVN revision 684)
GEOS runtime version: 3.9.1-CAPI-1.14.2
Please note that rgeos will be retired by the end of 2023,
plan transition to sf functions using GEOS at your earliest convenience.
GEOS using OverlayNG
Linking to sp version: 1.5-0
Polygon checking: TRUE
library(rgdal)
Warning: package 'rgdal' was built under R version 4.1.3
Please note that rgdal will be retired by the end of 2023,
plan transition to sf/stars/terra functions using GDAL and PROJ
at your earliest convenience.
rgdal: version: 1.5-32, (SVN revision 1176)
Geospatial Data Abstraction Library extensions to R successfully loaded
Loaded GDAL runtime: GDAL 3.4.1, released 2021/12/27
Path to GDAL shared files: C:/Users/tiana/OneDrive/Documents/R/win-library/4.1/rgdal/gdal
GDAL binary built with GEOS: TRUE
Loaded PROJ runtime: Rel. 7.2.1, January 1st, 2021, [PJ_VERSION: 721]
Path to PROJ shared files: C:/Users/tiana/OneDrive/Documents/R/win-library/4.1/rgdal/proj
PROJ CDN enabled: FALSE
Linking to sp version:1.5-0
To mute warnings of possible GDAL/OSR exportToProj4() degradation,
use options("rgdal_show_exportToProj4_warnings"="none") before loading sp or rgdal.
library(maptools)
Checking rgeos availability: TRUE
Please note that 'maptools' will be retired by the end of 2023,
plan transition at your earliest convenience;
some functionality will be moved to 'sp'.
library(broom)
Warning: package 'broom' was built under R version 4.1.3
library(dplyr)library(viridis)
Warning: package 'viridis' was built under R version 4.1.3
Loading required package: viridisLite
library(leaflet)
Warning: package 'leaflet' was built under R version 4.1.3
library(mapview)
Warning: package 'mapview' was built under R version 4.1.3
library(htmlwidgets)
Warning: package 'htmlwidgets' was built under R version 4.1.3
OGR data source with driver: ESRI Shapefile
Source: "C:\Users\tiana\OneDrive\Documents\CSU_STAA_566_Data_Viz\maps-TStas\tha_adm_rtsd_itos_20210121_shp\tha_admbnda_adm2_rtsd_20220121.shp", layer: "tha_admbnda_adm2_rtsd_20220121"
with 928 features
It has 19 fields
# tidying the shape data gets it into a data frame with longitude and latitudethai_tidy <-tidy(thailand_shape_data, region="ADM2_PCODE")# plot just districts to begin withdistrict_plot <-ggplot() +geom_polygon(data = thai_tidy, aes(x = long, y = lat, group = group), fill="white", color="grey")+theme_void() +coord_map()district_plot
Load population data
Load the population data.
# read in population data from csv filethailand_population <-read.csv("C:/Users/tiana/OneDrive/Documents/CSU_STAA_566_Data_Viz/maps-TStas/tha_admpop_adm2_2022.csv")
Important columns include ADM2_EN (district name) and T_TL (total population for a district). We will also need to keep track of ADM2_PCODE, the code value for each district. This will be used to combine the population data with the shape data.
# want the ADM2_EN, ADM2_PCODE and T_TL columns# Notes on levels of data:# 0 = country# 1 = province# 2 = district ** we want level 2# 3 = sub-district
Make a new dataframe comprised of the desired columns.
# make new dataframedistrict_populations <-data.frame(thailand_population$ADM2_NAME, thailand_population$ADM2_PCODE, thailand_population$T_TL)# take a lookhead(district_populations)
Make a basic bar plot to get high level overview of population values.
p_bar <-ggplot(data = district_populations, aes(x=district_populations$thailand_population.ADM2_NAME, y=district_populations$thailand_population.T_TL)) +geom_bar(stat="identity") +theme(axis.text.x=element_blank()) # remove x-axis labels - too messy for nowp_bar
Warning: Use of `district_populations$thailand_population.ADM2_NAME` is
discouraged. Use `thailand_population.ADM2_NAME` instead.
Warning: Use of `district_populations$thailand_population.T_TL` is discouraged.
Use `thailand_population.T_TL` instead.
Combine the geospatial and population data
Use the value of the district code (under ‘ADM2_PCODE’ in the population data and ‘id’ in the tidied spatial data) to combine the shape and population data.
# combine geospatial and population data# geospatial data is in: thai_tidy# population data is in: district_populationsdistricts_and_populations <- thai_tidy %>%left_join(., district_populations, by=c("id"="thailand_population.ADM2_PCODE"))# reviewhead(districts_and_populations)
Make choropleth of 2022 Thailand district populations
First, try a basic version of a choropleth.
# basic choropleth thailand_pop <-ggplot() +geom_polygon(data = districts_and_populations, aes(fill = districts_and_populations$thailand_population.T_TL, x = long, y = lat, group = group)) +theme_void() +coord_map()# thailand_pop
Now, improve the basic plot by updating color, legend, and formatting. I found using a log scale helped improve the color differentiation, which this post suggests:
# improve coloring, legend, etcthailand_pop_2 <-ggplot() +geom_polygon(data = districts_and_populations, aes(fill = districts_and_populations$thailand_population.T_TL, x = long, y = lat, group = group)) +theme_void() +scale_fill_viridis(trans ="log", breaks=c(1000, 10000, 50000, 100000, 500000, 1000000), name="District population", guide =guide_legend(keyheight =unit(3, units="mm"), keywidth=unit(12, units="mm"), label.position ="bottom", title.position ='top', nrow=1))+labs(title="2022 Thailand district population", ) +coord_map()# thailand_pop_2# save ggplot object as png# ggsave("thailand_population_choropleth.png", bg="white")
Use leaflet to build an interactive plot
It would be useful to be able to zoom in, click on a district, and be able to see both the name and population of that district.
# set colors for leafletcolors_for_districts <-colorQuantile("viridis", NULL, n =6)# create popup of both district name and population value, each on one linepopup_district_names <-paste0("<strong>District: </strong>", district_populations$thailand_population.ADM2_NAME, "<br/>", "<strong>Population: </strong>", district_populations$thailand_population.T_TL)# build leafletthailand_leaflet <-leaflet(thailand_shape_data) %>%addTiles() %>%setView(lat =13.7563, lng =100.5018, zoom =5) %>%# start with lat, lng of BangkokaddPolygons(fillColor =~colors_for_districts(district_populations$thailand_population.T_TL),fillOpacity =0.8,smoothFactor =0.5,popup = popup_district_names )# thailand_leaflet# save#saveWidget(thailand_leaflet, file="thailand_population_leaflet.html")