This code and graphic attempt to show how the insurance type of countries is related to how much they spend on health insurance per person and how long of a healthy life their citizens are expected to live. It also shows how inefficient the U.S. is with its expenditures.

1. Install Packages

library(readxl)
library(dplyr)
library(ggplot2)
library(colorspace)
library(ggrepel)
library(ggthemes)
library(plotly)
library(htmlwidgets)

2. Import Data

Healthcare expenditure was scraped from the world health organization’s online database. Current health expenditure per capita in purchasing power parity (a way to account for how much you can get with a dollar in different countries) was used in order to try to control for the value of a dollar in different parts of the world. Insurance types and obesity values were scraped from wikipedia.

https://apps.who.int/nha/database/Select/Indicators/en

https://en.wikipedia.org/wiki/Health_care_systems_by_country

https://en.wikipedia.org/wiki/List_of_countries_by_obesity_rate

healthyLifeExpectancy <- read_excel('C:/Users/Jeffrey/Desktop/My Documents/Grad School/Classes/Fall 2022/Staa 566/ExpectedHealthyYearsByCountry.xlsx')  
healthyLifeExpectancy <- data.frame(healthyLifeExpectancy)

healthExpenditure <- read_excel('C:/Users/Jeffrey/Desktop/My Documents/Grad School/Classes/Fall 2022/Staa 566/HealthExpenditureByCountry.xlsx') 
healthExpenditure <- data.frame(healthExpenditure)

insuranceType <- read_excel('C:/Users/Jeffrey/Desktop/My Documents/Grad School/Classes/Fall 2022/Staa 566/InsuranceTypeByCountry.xlsx') 
insuranceType <- data.frame(insuranceType)

obesityIndex <- read_excel('C:/Users/Jeffrey/Desktop/My Documents/Grad School/Classes/Fall 2022/Staa 566/ObesityIndexByCountry.xlsx')   
obesityIndex <- data.frame(obesityIndex)

3. Create Data for Graph

dfHle <- healthyLifeExpectancy %>% 
  select('Country', 'Year', 'ExpectedHealthyYears') %>%
  filter(Year == 2019)

dfHe <- healthExpenditure %>% 
  select('Country', 'Indicator', 'Y2019') %>%
  filter(Indicator == 'GDP per Capita in US Dollar')

dfHePPP <- healthExpenditure %>% 
  select('Country', 'Indicator', 'Y2019') %>%
  filter(Indicator == 'CHC per Capita in PPP')

dfIt <- insuranceType
dfOi <- obesityIndex

df <- dfHle %>%
              inner_join(dfHe, by='Country') %>%
              inner_join(dfHePPP, by='Country') %>%
              inner_join(dfIt, by='Country') %>%
              inner_join(dfOi, by='Country')
  
colnames(df)[which(names(df) == 'Y2019.x')] <- 'GdpPerCapita'
colnames(df)[which(names(df) == 'Y2019.y')] <- 'PPP'
colnames(df)[which(names(df) == 'ObesityIndex.y')] <- 'ObesityIndex' 

df <- df %>% select('Country', 'ExpectedHealthyYears', 'GdpPerCapita', 'PPP', 'ObesityIndex','InsuranceType')

4. Create Plot

myPlot <- ggplot(df, aes(PPP, ExpectedHealthyYears)) +
  geom_smooth(method = "lm", formula = y~log(x), color="black" 
  ) +
  geom_point(aes(color = InsuranceType, fill = InsuranceType, size=ObesityIndex),alpha = 0.9, shape = 21
  ) +
scale_color_manual(name = NULL,values = c("black","black", "black", "black", "black", "black")
  ) +
scale_fill_manual(name = NULL,values = c("darkred", "darkorange", "darkblue", "darkgreen", "purple", "black")
  ) + 
   scale_x_continuous(name = "PPP (purchasing power parity) per Capita"
  ) +
  scale_y_continuous(name = "Expected Healthy Years of Life at Birth"
  ) +
  annotate('text', x = 10100, y = 65.2, label = 'U.S.A.') +
  ggtitle("Country's Cost-effectiveness of Healthcare by Insurance Type") +
guides(color = guide_legend(override.aes = list(size=5)))

myPlot

Besides instantly seeing that the U.S.A. spends a lot more for less, we also see that the U.S.A. spends three times more on healhtcare per person compared to similar non-universally insured countries, yet has a population with a similar healthy life expectancy.